We’ve already reviewed the Rockchip RK3568-powered Youyeetoo YY3568 SBC with Android 11, listing the specifications and checking out the hardware kit, in the first part of the review. We have now had time to switch to Lubuntu 20.04, perform some basic tests, and take a closer look at the RKNPU2 AI SDK for the built-in 0.8 TOPS AI accelerator found in the Rockchip RK3568 SoC.
Installing Ubuntu or Debian on YY3568 SBC
The company provides both Debian and Ubuntu images for the YY3568 SBC with different images depending on the boot device (SD card or eMMC flash) and video interface used (DSI, eDP, HDMI).
Our YY3568 “Bundle 5” kit comes with an 11.6-inch eDP display so we’ll select the “Ubuntu 20” image with edp in the file name. The RKDevTool program is used to flash Linux images and it’s the same procedure as we used with Android 11.
After the installation is complete, we can press the Reset button and boot into Ubuntu 20.04, or more exactly Lubuntu 20.04 with the LXQt desktop environment.
Benchmarking YY3568 SBC in Linux
Now that we have Lubuntu 20.04 properly installed, we can run some Linux benchmarks on the YY3568 SBC.
We’ll start with the sbc-bench.sh script from Thomas Kaiser:
    tinymembench,ramlat, mhz. Done.
    Checking cpufreq OPP. Done (results will be available in 10-16 minutes).
    Executing tinymembench. Done.
    Executing RAM latency tester. Done.
    Executing OpenSSL benchmark. Done.
    Executing 7-zip benchmark. Done.
    Checking cpufreq OPP again. Done (13 minutes elapsed).

    Results validation:

      * Advertised vs. measured max CPU clockspeed: -5.4% before, -6.4% after
      * Background activity (%system) OK
      * No throttling

    Memory performance
    memcpy: 3074.9 MB/s
    memset: 5942.3 MB/s

    7-zip total scores (3 consecutive runs): 4496,4437,4431, single-threaded: 1211

    OpenSSL results:
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
    aes-128-cbc     178431.77k   520431.64k   992779.52k  1288375.30k  1411738.28k  1420580.18k
    aes-128-cbc     181153.17k   521213.70k   994408.62k  1288928.94k  1410916.35k  1421082.62k
    aes-192-cbc     172219.92k   460495.64k   802809.26k   990434.99k  1061336.41k  1067149.99k
    aes-192-cbc     172259.37k   460785.32k   802708.99k   990319.62k  1060656.47k  1066183.34k
    aes-256-cbc     165998.18k   424880.55k   694931.71k   826145.11k   872235.01k   875752.11k
    aes-256-cbc     165966.50k   424769.43k   694433.88k   826590.55k   872420.69k   875724.80k
You can find the full details @ http://ix.io/4Ga2
We then used iozone from the Phoronix Test Suite to test the eMMC flash read and write speeds. With a file size of 512MB and a block size of 1MB, iozone reported a 1190.83 MB/s read speed, which does not seem realistic:
    IOzone 3.465:
        pts/iozone-1.9.6 [Record Size: 1MB - File Size: 512MB - Disk Test: Read Performance]
        Test 1 of 8
        Test Profile Status: Deprecated
        Estimated Trial Run Count:    3
        Estimated Test Run-Time:      20 Minutes
        Record Size: 1MB - File Size: 512MB - Disk Test: Read Performance:
            1068.0517578125
            1322.853515625
            1323.724609375
            1251.857421875
            1277.0048828125
            1307.0283203125
            1366.4228515625
            1238.892578125
            1197.197265625
            999.79296875
            1297.1181640625
            1060.5302734375
            979.6884765625
            1058.6123046875
            1113.677734375
        Average: 1190.83 MB/s
        Deviation: 11.01%
        Samples: 15

        Comparison of 1,716 OpenBenchmarking.org samples since 26 February 2011 to 2 August; median result: 4716 MB/s. Box plot of samples:
        [ |---*--------###########!######*####---*----*-------------*----*--| ]
        ^ This Result (6th Percentile): 1191
        512GB database: 5833 ^
        240GB DELLBOSS VD: 11355 ^
        480GB SAMSUNG MZ7LH480: 10579 ^
        2 x 250GB Samsung SSD 850: 8195 ^
        5 x 500GB Crucial_CT500MX2: 7346 ^
That result is clearly due to caching, so we repeated the test directly with iozone using the -I parameter to enable “DIRECT IO”, which bypasses the buffer cache and goes directly to disk:
    root@smartfly:/# iozone -e -I -a -s 512M -r 1024k -r 16384k -i 0 -i 1 -i 2
        Iozone: Performance Test of File I/O
                Version $Revision: 3.489 $
            Compiled for 64 bit mode.
            Build: linux

        Run began: Wed Sep 20 15:24:27 2023

        Include fsync in write timing
        O_DIRECT feature enabled
        Auto Mode
        File size set to 524288 kB
        Record Size 1024 kB
        Record Size 16384 kB
        Command line used: iozone -e -I -a -s 512M -r 1024k -r 16384k -i 0 -i 1 -i 2
        Output is in kBytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 kBytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                          random    random     bkwd    record    stride
              kB  reclen    write  rewrite    read    reread    read     write     read   rewrite      read   fwrite frewrite    fread  freread
          524288    1024    87888    89372   112574   113474   113612    85778
          524288   16384    98887    95894   152500   153558   153118   100792

    iozone test complete.
With 16384 kB records, that’s about 153.6 MB/s read speed and 98.9 MB/s write speed, so the 64GB eMMC flash used in the board is rather fast.
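As a quick sanity check on the figures above, the sketch below converts the iozone values from the 16384 kB record row into MB/s. The dictionary and function names are ours, and it assumes decimal units (1 MB = 1000 kB); if iozone's kB are read as KiB instead, the results shift by a few percent.

```python
# Values copied from the 16384 kB record row of the iozone output (kBytes/sec).
results_kb_per_s = {
    "write": 98887,
    "rewrite": 95894,
    "read": 152500,
    "reread": 153558,
    "random read": 153118,
    "random write": 100792,
}


def kb_to_mb(kb_per_s: int) -> float:
    """Convert kB/s to MB/s, assuming decimal units (1 MB = 1000 kB)."""
    return kb_per_s / 1000


for name, value in results_kb_per_s.items():
    print(f"{name:>12}: {kb_to_mb(value):6.1f} MB/s")
```

Sequential and random reads land within a few MB/s of each other here, which is typical of flash storage with large record sizes.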
Networking performance
We will test the Ethernet and WiFi networking performance with iperf3 using the router provided by AIS (a telecom operator in Thailand).
The YY3568 board has two Ethernet interfaces. Let’s start with eth0:
    root@smartfly:/# iperf3 -c 192.168.1.162 -f g
    Connecting to host 192.168.1.162, port 5201
    [  5] local 192.168.1.131 port 39516 connected to 192.168.1.162 port 5201
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.00   sec   113 MBytes  0.95 Gbits/sec    0    451 KBytes
    [  5]   1.00-2.00   sec   113 MBytes  0.95 Gbits/sec    0    477 KBytes
    [  5]   2.00-3.00   sec   112 MBytes  0.94 Gbits/sec    0    477 KBytes
    [  5]   3.00-4.00   sec   112 MBytes  0.94 Gbits/sec    0    499 KBytes
    [  5]   4.00-5.00   sec   112 MBytes  0.94 Gbits/sec    0    525 KBytes
    [  5]   5.00-6.00   sec   113 MBytes  0.94 Gbits/sec    0    525 KBytes
    [  5]   6.00-7.00   sec   112 MBytes  0.94 Gbits/sec    0    525 KBytes
    [  5]   7.00-8.00   sec   113 MBytes  0.95 Gbits/sec    0    549 KBytes
    [  5]   8.00-9.00   sec   112 MBytes  0.94 Gbits/sec    0    549 KBytes
    [  5]   9.00-10.00  sec   112 MBytes  0.94 Gbits/sec    0    549 KBytes
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate         Retr
    [  5]   0.00-10.00  sec  1.10 GBytes  0.94 Gbits/sec    0             sender
    [  5]   0.00-10.04  sec  1.10 GBytes  0.94 Gbits/sec                  receiver

    iperf Done.
All good, and the same can be said when running the test on eth1:
    root@smartfly:/# iperf3 -c 192.168.1.162 -f g
    Connecting to host 192.168.1.162, port 5201
    [  5] local 192.168.1.134 port 54100 connected to 192.168.1.162 port 5201
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.00   sec   113 MBytes  0.95 Gbits/sec    0    451 KBytes
    [  5]   1.00-2.00   sec   112 MBytes  0.94 Gbits/sec    0    520 KBytes
    [  5]   2.00-3.00   sec   113 MBytes  0.95 Gbits/sec    0    520 KBytes
    [  5]   3.00-4.00   sec   112 MBytes  0.94 Gbits/sec    0    547 KBytes
    [  5]   4.00-5.00   sec   113 MBytes  0.94 Gbits/sec    0    547 KBytes
    [  5]   5.00-6.00   sec   112 MBytes  0.94 Gbits/sec    0    547 KBytes
    [  5]   6.00-7.00   sec   112 MBytes  0.94 Gbits/sec    0    547 KBytes
    [  5]   7.00-8.00   sec   112 MBytes  0.94 Gbits/sec    0    547 KBytes
    [  5]   8.00-9.00   sec   112 MBytes  0.94 Gbits/sec    0    547 KBytes
    [  5]   9.00-10.00  sec   112 MBytes  0.94 Gbits/sec    0    547 KBytes
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate         Retr
    [  5]   0.00-10.00  sec  1.10 GBytes  0.94 Gbits/sec    0             sender
    [  5]   0.00-10.04  sec  1.10 GBytes  0.94 Gbits/sec                  receiver
We used the AIS router’s 5GHz network to test WiFi 5 (RTL8822CE module) and the average data transmission speed was a respectable 575 Mbps.
    root@smartfly:/# iperf3 -c 192.168.1.162 -f m
    Connecting to host 192.168.1.162, port 5201
    [  5] local 192.168.1.124 port 39890 connected to 192.168.1.162 port 5201
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.01   sec  51.5 MBytes   427 Mbits/sec    0   2.67 MBytes
    [  5]   1.01-2.00   sec  72.5 MBytes   614 Mbits/sec    0   3.13 MBytes
    [  5]   2.00-3.00   sec  73.8 MBytes   620 Mbits/sec    0   3.13 MBytes
    [  5]   3.00-4.00   sec  73.8 MBytes   619 Mbits/sec    0   3.13 MBytes
    [  5]   4.00-5.01   sec  71.2 MBytes   591 Mbits/sec    0   3.13 MBytes
    [  5]   5.01-6.01   sec  70.0 MBytes   590 Mbits/sec    0   3.13 MBytes
    [  5]   6.01-7.00   sec  71.2 MBytes   602 Mbits/sec    0   3.13 MBytes
    [  5]   7.00-8.00   sec  72.5 MBytes   608 Mbits/sec    0   3.13 MBytes
    [  5]   8.00-9.00   sec  73.8 MBytes   616 Mbits/sec    0   3.13 MBytes
    [  5]   9.00-10.01  sec  61.2 MBytes   509 Mbits/sec    0   3.13 MBytes
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate         Retr
    [  5]   0.00-10.01  sec   692 MBytes   579 Mbits/sec    0             sender
    [  5]   0.00-10.08  sec   691 MBytes   575 Mbits/sec                  receiver
So networking works well on the YY3568 SBC with either Ethernet or WiFi 5.
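For readers who want to summarize such runs automatically, here is a minimal sketch that extracts the per-interval bitrates from iperf3's text output and averages them. The sample lines are copied from the WiFi 5 run above; the regular expression and function name are our own, and the pattern only targets the `-f m` text layout shown here. iperf3 can also emit JSON with `-J`, which is probably a more robust route for scripting.

```python
import re
from statistics import mean

# Per-interval lines plus the sender summary, copied from the WiFi 5 run.
SAMPLE = """\
[  5]   0.00-1.01   sec  51.5 MBytes   427 Mbits/sec    0   2.67 MBytes
[  5]   1.01-2.00   sec  72.5 MBytes   614 Mbits/sec    0   3.13 MBytes
[  5]   2.00-3.00   sec  73.8 MBytes   620 Mbits/sec    0   3.13 MBytes
[  5]   3.00-4.00   sec  73.8 MBytes   619 Mbits/sec    0   3.13 MBytes
[  5]   4.00-5.01   sec  71.2 MBytes   591 Mbits/sec    0   3.13 MBytes
[  5]   5.01-6.01   sec  70.0 MBytes   590 Mbits/sec    0   3.13 MBytes
[  5]   6.01-7.00   sec  71.2 MBytes   602 Mbits/sec    0   3.13 MBytes
[  5]   7.00-8.00   sec  72.5 MBytes   608 Mbits/sec    0   3.13 MBytes
[  5]   8.00-9.00   sec  73.8 MBytes   616 Mbits/sec    0   3.13 MBytes
[  5]   9.00-10.01  sec  61.2 MBytes   509 Mbits/sec    0   3.13 MBytes
[  5]   0.00-10.01  sec   692 MBytes   579 Mbits/sec    0             sender
"""

# Interval lines end with a congestion window size ("Cwnd"); the summary
# lines end with "sender"/"receiver" instead, so they do not match.
INTERVAL = re.compile(
    r"\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w+\s+"
    r"([\d.]+)\s+Mbits/sec\s+\d+\s+[\d.]+\s+[KM]Bytes"
)


def interval_bitrates(text: str) -> list[float]:
    """Return the per-interval bitrates (Mbits/sec) found in iperf3 text output."""
    return [float(m.group(1)) for m in INTERVAL.finditer(text)]


rates = interval_bitrates(SAMPLE)
print(
    f"{len(rates)} intervals, mean {mean(rates):.1f}, "
    f"min {min(rates):.0f}, max {max(rates):.0f} Mbits/sec"
)
```

The mean of the ten intervals matches the sender-side summary reported by iperf3, and the min/max spread shows how much the first and last seconds pull the WiFi average down.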
3D graphics acceleration on Rockchip RK3568
We tested the Mali-G52 GPU performance with the glmark2 benchmark, and the system scored 115 points.
    =======================================================
        glmark2 2021.02
    =======================================================
        OpenGL Information
        GL_VENDOR:     ARM
        GL_RENDERER:   Mali-G52
        GL_VERSION:    OpenGL ES 3.2 v1.g2p0-01eac0.327c41db9c110a33ae6f67b4cc0581c7
    =======================================================
    [build] use-vbo=false: FPS: 116 FrameTime: 8.621 ms
    [build] use-vbo=true: FPS: 126 FrameTime: 7.937 ms
    [texture] texture-filter=nearest: FPS: 156 FrameTime: 6.410 ms
    [texture] texture-filter=linear: FPS: 150 FrameTime: 6.667 ms
    [texture] texture-filter=mipmap: FPS: 157 FrameTime: 6.369 ms
    [shading] shading=gouraud: FPS: 123 FrameTime: 8.130 ms
    [shading] shading=blinn-phong-inf: FPS: 119 FrameTime: 8.403 ms
    [shading] shading=phong: FPS: 122 FrameTime: 8.197 ms
    [shading] shading=cel: FPS: 122 FrameTime: 8.197 ms
    [bump] bump-render=high-poly: FPS: 91 FrameTime: 10.989 ms
    [bump] bump-render=normals: FPS: 151 FrameTime: 6.623 ms
    [bump] bump-render=height: FPS: 147 FrameTime: 6.803 ms
    [effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 125 FrameTime: 8.000 ms
    [effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 70 FrameTime: 14.286 ms
    [pulsar] light=false:quads=5:texture=false: FPS: 147 FrameTime: 6.803 ms
    [desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 78 FrameTime: 12.821 ms
    [desktop] effect=shadow:windows=4: FPS: 125 FrameTime: 8.000 ms
    [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 55 FrameTime: 18.182 ms
    [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 52 FrameTime: 19.231 ms
    [buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 73 FrameTime: 13.699 ms
    [ideas] speed=duration: FPS: 74 FrameTime: 13.514 ms
    [jellyfish] <default>: FPS: 106 FrameTime: 9.434 ms
    [terrain] <default>: FPS: 28 FrameTime: 35.714 ms
    [shadow] <default>: FPS: 97 FrameTime: 10.309 ms
    [refract] <default>: FPS: 48 FrameTime: 20.833 ms
    [conditionals] fragment-steps=0:vertex-steps=0: FPS: 147 FrameTime: 6.803 ms
    [conditionals] fragment-steps=5:vertex-steps=0: FPS: 144 FrameTime: 6.944 ms
    [conditionals] fragment-steps=0:vertex-steps=5: FPS: 149 FrameTime: 6.711 ms
    [function] fragment-complexity=low:fragment-steps=5: FPS: 147 FrameTime: 6.803 ms
    [function] fragment-complexity=medium:fragment-steps=5: FPS: 129 FrameTime: 7.752 ms
    [loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 144 FrameTime: 6.944 ms
    [loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 146 FrameTime: 6.849 ms
    [loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 131 FrameTime: 7.634 ms
    =======================================================
                                      glmark2 Score: 115
    =======================================================
The score is a bit on the low side, but 3D graphics hardware acceleration is enabled.
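To see where the final number comes from, the sketch below averages the per-scene FPS values from the run above. As far as we can tell the glmark2 score is simply the mean FPS across all scenes, and the figures here are consistent with that reading, but treat it as an observation about this run and not a statement about glmark2's internals. The list and helper names are ours.

```python
# FPS values for the 33 scenes, in the order they appear in the glmark2 output.
fps_per_scene = [
    116, 126,                     # build
    156, 150, 157,                # texture
    123, 119, 122, 122,           # shading
    91, 151, 147,                 # bump
    125, 70,                      # effect2d
    147,                          # pulsar
    78, 125,                      # desktop
    55, 52, 73,                   # buffer
    74, 106, 28, 97, 48,          # ideas, jellyfish, terrain, shadow, refract
    147, 144, 149,                # conditionals
    147, 129,                     # function
    144, 146, 131,                # loop
]


def frame_time_ms(fps: float) -> float:
    """Frame time in milliseconds for a given frame rate."""
    return 1000.0 / fps


score = sum(fps_per_scene) / len(fps_per_scene)
print(f"{len(fps_per_scene)} scenes, mean FPS = {score:.1f}")
print(f"slowest scene: {min(fps_per_scene)} FPS ({frame_time_ms(min(fps_per_scene)):.3f} ms per frame)")
```

The [terrain] and [buffer] scenes are the ones dragging the average down the most in this run.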
Video and audio playback
We played YouTube videos to test both video and audio playback. The Youyeetoo YY3568 board has multiple audio output options, including a 3.5mm audio jack and a connector with a mono class D power amplifier to connect speakers directly. We had no issues with video and audio playback, and everything worked well. This should be a suitable platform for digital signage applications.
The short video clip above shows a YouTube video played in the Chromium web browser.
Checking out the RKNPU2 SDK for AI workload on Rockchip RK3568 SoC
RKNN-Toolkit2 is a software development kit (SDK) for AI workloads running on recent Rockchip SoCs with an NPU, namely RK3566, RK3568, RK3588, RK3588S, RV1103, RV1106, and RK3562.
There are two parts to get started:
- Taking pre-trained models and converting them to the RKNN models using the tools at https://github.com/rockchip-linux/rknn-toolkit2
- Using the transformed model through the RKNPU2 available at https://github.com/rockchip-linux/rknpu2
Installation
In this review, we will show how to deploy the YOLO5 model converted through rknn-toolkit2. Let’s build the RKNPU2 sample for the Rockchip RK3568 processor:
    root@smartfly:/# cd /home/youyeetoo/rknpu2/examples/rknn_yolov5_demo
    root@smartfly:/home/youyeetoo/rknpu2/examples/rknn_yolov5_demo# ls
    build-android_RK3562.sh         build-linux_RK3588.sh  README_CN.md
    build-android_RK3566_RK3568.sh  CMakeLists.txt         README.md
    build-android_RK3588.sh         convert_rknn_demo      src
    build-linux_RK3562.sh           include                utils
    build-linux_RK3566_RK3568.sh    model
    <pu2/examples/rknn_yolov5_demo# chmod +x build-linux_RK3566_RK3568.sh
    <2/examples/rknn_yolov5_demo# ./build-linux_RK3566_RK3568.sh
    -- The C compiler identification is GNU 9.4.0
    -- The CXX compiler identification is GNU 9.4.0
    -- Check for working C compiler: /usr/bin/aarch64-linux-gnu-gcc
    -- Check for working C compiler: /usr/bin/aarch64-linux-gnu-gcc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /usr/bin/aarch64-linux-gnu-g++
    -- Check for working CXX compiler: /usr/bin/aarch64-linux-gnu-g++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Found OpenCV: /home/youyeetoo/rknpu2/examples/3rdparty/opencv/opencv-linux-aarch64 (found version "3.4.5")
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/youyeetoo/rknpu2/examples/rknn_yolov5_demo/build/build_linux_aarch64
    rknn_benchmark/install/rknn_benchmark_Linux/lib/librknnrt.so /home/youyeetoo/rknpu2/examples/rknn_benchmark
When the build is complete, we will get a binary file named rknn_yolov5_demo.
Running YOLO5 on the YY3568 board
In order to test the sample, we will run rknn_yolov5_demo with two parameters: a model to use and an input image.
    cd /home/youyeetoo/rknpu2/examples/rknn_yolov5_demo
    ./rknn_yolov5_demo ./model/RK3566_RK3568/yolov5s-640-640.rknn ./bus.jpg
The output will be the out.jpg image with boxes drawn with labels around objects detected in the image. The model is trained for 640×640, so don’t be surprised if nothing is detected when using images of different sizes. Besides the bus image, we also tested the man.jpg image with both having multiple objects detected.
    post process config: box_conf_threshold = 0.25, nms_threshold = 0.45
    Read ./model/man.jpg ...
    img width = 640, img height = 640
    Loading mode...
    sdk version: 1.5.2 (c6b7b351a@2023-08-23T15:28:22) driver version: 0.8.8
    model input num: 1, output num: 3
      index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, w_stride = 640, size_with_stride=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
      index=0, name=output, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, w_stride = 0, size_with_stride=1638400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003860
      index=1, name=283, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, w_stride = 0, size_with_stride=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
      index=2, name=285, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, w_stride = 0, size_with_stride=122880, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003915
    model is NHWC input fmt
    model input height=640, width=640, channel=3
    once run use 78.917000 ms
    loadLabelName ./model/coco_80_labels_list.txt
    person @ (89 157 258 631) 0.895037
    bowl @ (483 221 506 240) 0.679969
    bowl @ (395 322 444 343) 0.659576
    wine glass @ (570 200 588 241) 0.544585
    bowl @ (505 221 527 239) 0.477606
    bowl @ (482 322 532 338) 0.458121
    wine glass @ (543 199 564 239) 0.452579
    cup @ (418 215 437 238) 0.410092
    cup @ (385 204 402 240) 0.374592
    cup @ (435 212 451 238) 0.371657
    bowl @ (613 215 639 239) 0.359605
    wine glass @ (557 200 575 240) 0.359143
    cup @ (446 211 461 238) 0.358369
    spoon @ (255 257 271 313) 0.340807
    bottle @ (412 84 432 119) 0.338540
    spoon @ (307 267 322 326) 0.318563
    spoon @ (324 265 340 332) 0.315867
    bottle @ (453 305 466 340) 0.308927
    cup @ (526 210 544 239) 0.290318
    bottle @ (389 83 411 119) 0.277804
    wine glass @ (583 198 602 239) 0.277093
    bowl @ (24 359 101 383) 0.275663
    oven @ (4 370 168 632) 0.256395
    spoon @ (268 262 282 322) 0.252866
    bottle @ (434 85 454 118) 0.250721
    loop count = 10 , average run 69.709700 ms
The sample detected 25 objects, and inference took around 70 ms on average, or about 14 FPS. That can be considered quite fast, since a Raspberry Pi 4 running YOLO5 on the same 640×640 images only manages about 4 FPS (on the CPU).
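The numbers above can be checked with a short script. The sketch below parses the detection lines printed by rknn_yolov5_demo, counts them per label, and converts the reported average latency to FPS. The regular expression is our own and assumes the `label @ (left top right bottom) confidence` layout seen in this output; the coordinate order is an assumption on our part.

```python
import re
from collections import Counter

# Detection lines copied from the rknn_yolov5_demo output for man.jpg.
OUTPUT = """\
person @ (89 157 258 631) 0.895037
bowl @ (483 221 506 240) 0.679969
bowl @ (395 322 444 343) 0.659576
wine glass @ (570 200 588 241) 0.544585
bowl @ (505 221 527 239) 0.477606
bowl @ (482 322 532 338) 0.458121
wine glass @ (543 199 564 239) 0.452579
cup @ (418 215 437 238) 0.410092
cup @ (385 204 402 240) 0.374592
cup @ (435 212 451 238) 0.371657
bowl @ (613 215 639 239) 0.359605
wine glass @ (557 200 575 240) 0.359143
cup @ (446 211 461 238) 0.358369
spoon @ (255 257 271 313) 0.340807
bottle @ (412 84 432 119) 0.338540
spoon @ (307 267 322 326) 0.318563
spoon @ (324 265 340 332) 0.315867
bottle @ (453 305 466 340) 0.308927
cup @ (526 210 544 239) 0.290318
bottle @ (389 83 411 119) 0.277804
wine glass @ (583 198 602 239) 0.277093
bowl @ (24 359 101 383) 0.275663
oven @ (4 370 168 632) 0.256395
spoon @ (268 262 282 322) 0.252866
bottle @ (434 85 454 118) 0.250721
"""

# Labels may contain spaces ("wine glass"), hence the lazy match up to " @ ".
LINE = re.compile(r"^(.+?) @ \((\d+) (\d+) (\d+) (\d+)\) ([\d.]+)$", re.MULTILINE)

detections = [
    (m.group(1), tuple(int(m.group(i)) for i in range(2, 6)), float(m.group(6)))
    for m in LINE.finditer(OUTPUT)
]
counts = Counter(label for label, _box, _conf in detections)

avg_ms = 69.7097  # "average run" value reported by the demo over 10 loops
fps = 1000.0 / avg_ms

print(f"{len(detections)} detections: {dict(counts)}")
print(f"{avg_ms:.1f} ms per inference is about {fps:.1f} FPS")
```

Note that many of the detections sit just above the 0.25 confidence threshold shown in the "post process config" line, so raising that threshold would shorten the list considerably.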
RKNN Benchmark
We can evaluate the NPU performance on the YY3568 with the RKNN Benchmark provided in the SDK running the test 10 times to get an average FPS value.
    ./rknn_benchmark /home/youyeetoo/rknpu2/examples/rknn_yolov5_demo/model/RK3566_RK3568/yolov5s-640-640.rknn man.jpg 10
Output:
    rknn_api/rknnrt version: 1.5.2 (c6b7b351a@2023-08-23T15:28:22), driver version: 0.8.8
    total weight size: 7308672, total internal size: 6144000
    total dma used size: 21528576
    model input num: 1, output num: 3
    input tensors:
      index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, w_stride = 640, size_with_stride=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
    output tensors:
      index=0, name=output, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, w_stride = 0, size_with_stride=1638400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003860
      index=1, name=283, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, w_stride = 0, size_with_stride=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
      index=2, name=285, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, w_stride = 0, size_with_stride=122880, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003915
    custom string:
    Warmup ...
       0: Elapse Time = 58.82ms, FPS = 17.00
       1: Elapse Time = 59.41ms, FPS = 16.83
       2: Elapse Time = 59.29ms, FPS = 16.87
       3: Elapse Time = 59.30ms, FPS = 16.86
       4: Elapse Time = 59.30ms, FPS = 16.86
    Begin perf ...
       0: Elapse Time = 59.52ms, FPS = 16.80
       1: Elapse Time = 49.05ms, FPS = 20.39
       2: Elapse Time = 44.44ms, FPS = 22.50
       3: Elapse Time = 44.29ms, FPS = 22.58
       4: Elapse Time = 44.33ms, FPS = 22.56
       5: Elapse Time = 44.48ms, FPS = 22.48
       6: Elapse Time = 44.29ms, FPS = 22.58
       7: Elapse Time = 44.28ms, FPS = 22.58
       8: Elapse Time = 44.46ms, FPS = 22.49
       9: Elapse Time = 44.29ms, FPS = 22.58

    Avg Time 46.34ms, Avg FPS = 21.577
The average of about 21.6 FPS is even higher than in our single run. It looks like continuously running the model a few times improves performance, possibly because some of the code or data is being cached. In any case, it shows the performance of the YY3568 board is suitable for computer vision applications, as long as real-time processing (30+ FPS) is not needed.
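To separate the ramp-up from the steady state, the sketch below recomputes the benchmark averages from the ten "Begin perf" timings above, once over all runs and once skipping the first two slower ones. Dropping exactly two runs is our own choice for illustration, and the variable names are ours.

```python
from statistics import mean

# "Elapse Time" values (ms) from the ten runs after "Begin perf ...".
perf_ms = [59.52, 49.05, 44.44, 44.29, 44.33, 44.48, 44.29, 44.28, 44.46, 44.29]

overall_ms = mean(perf_ms)
steady_ms = mean(perf_ms[2:])  # skip the two runs where the time is still dropping

print(f"all runs:     {overall_ms:.2f} ms -> {1000 / overall_ms:.2f} FPS")
print(f"steady state: {steady_ms:.2f} ms -> {1000 / steady_ms:.2f} FPS")
```

The overall figure agrees with the "Avg Time 46.34ms" line printed by rknn_benchmark, while the steady-state value suggests sustained throughput is about one frame per second higher than the reported average.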
YOLO5 with a USB camera
The YY3568 “Bundle 5” development kit comes with a MIPI CSI camera, but sadly we’ve been informed it only works in Android 11, and the Linux drivers for the camera are not ready. So we used a USB camera for a quick test that captures an image and runs YOLO5 with a single command line:
    # Capture one frame to me.jpg, then run the demo on it if the capture succeeded
    fswebcam -r 1280x720 -S 1 --no-banner --no-timestamp me.jpg && ./rknn_yolov5_demo ./model/RK3566_RK3568/yolov5s-640-640.rknn me.jpg
We could record an image from a USB webcam and send it to the rknn_yolov5_demo program. In order to draw a frame around each object and make the label text clearer, we can edit lines 301 and 302 of ~/rknpu2/examples/rknn_yolov5_demo/src/main.cc as follows:

    // main.cc line 301: bounding box drawn with a 3-pixel line
    rectangle(orig_img, cv::Point(x1, y1), cv::Point(x2, y2), cv::Scalar(0, 0, 255, 0), 3);
    // main.cc line 302: label text placed just inside the top-left corner of the box
    putText(orig_img, text, cv::Point(x1, y1 + 12), cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 0));
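Since the webcam capture is 1280x720 while the model expects a 640×640 input, it can help to know how a frame would map onto the model input. The sketch below computes generic "letterbox" geometry (scale to fit, then pad) and maps a box back to source coordinates. This is the approach commonly used with YOLO models, not something we verified in the RKNPU2 demo sources, and the function names are hypothetical.

```python
def letterbox_geometry(src_w: int, src_h: int, dst_w: int = 640, dst_h: int = 640):
    """Return (scale, resized size, padding) to fit a source image into the
    model input while keeping its aspect ratio."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = dst_w - new_w, dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    # Padding is returned as (left, top, right, bottom).
    return scale, (new_w, new_h), (left, top, pad_x - left, pad_y - top)


def to_source_coords(box, scale, padding):
    """Map a (left, top, right, bottom) box from model-input coordinates back
    to coordinates in the original image."""
    pad_left, pad_top = padding[0], padding[1]
    x1, y1, x2, y2 = box
    return (
        round((x1 - pad_left) / scale),
        round((y1 - pad_top) / scale),
        round((x2 - pad_left) / scale),
        round((y2 - pad_top) / scale),
    )


scale, size, padding = letterbox_geometry(1280, 720)
print(f"scale={scale}, resized={size}, padding(l,t,r,b)={padding}")
print(to_source_coords((100, 200, 300, 400), scale, padding))
```

For a 1280x720 frame this gives a 640x360 image with 140 pixels of padding above and below, so any box reported in model coordinates needs the padding removed and the scale undone before drawing it on the original frame.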
Conclusion
After more than one week of testing, we found that the main issue with the board is that it freezes and stops working altogether from time to time, due to the small heatsink that is not enough to cool the system and keep it running reliably at all times. It would be good if Youyeetoo offered an additional heatsink and fan set, especially because the board is designed to connect a cooling fan.
Apart from this issue, usability is good and the user guide on Youyeetoo’s wiki is well done. You can read the manual without having to be an expert at all. In terms of AI usage, the Rockchip platform is continuously evolving and improving as can be seen from the latest release date for the RKNPU2 SDK on GitHub (three weeks ago at the time of publication). There’s some learning curve to using Rockchip boards, but the performance/price ratio typically makes them interesting choices.
We’d like to thank Youyeetoo for sending the YY3568 “Bundle 5” devkit for review. The YY3568-Core CPU module, YY3568 SBC, and the Bundle 5 kit reviewed here can all be purchased on Aliexpress, Amazon, or Youyeetoo store with prices starting at $36.99 for the module only. The “Bundle 5” kit reviewed here with a module equipped with 8GB RAM and 64GB eMMC flash sells for $206.15 plus shipping.
CNXSoft: This article is a translation – with some edits – of the original review on CNX Software Thailand by Arnon Thongtem, and edited by Suthinee Kerdkaew.
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.
Always thought its a shame that the RK3566/RK3568 all suffer from cost bloat of all function technology demonstrator SBC that maybe focus on what they have some unique selling points. The RK3568 pcie3.0 x2 lanes / 2x ethernet router with maybe not so much bloat in that 80/20 rule could cover a wide array of low cost solutions. Even a sata only as the paralels with networked CCTV are strong. The ones we do get with all bells and whistles have a problem that maybe the cost is just a tad too high. Would love to see a cutdown RK3566…
Well, I think the reason why this dev kit with such all functions is to show people what applications that they could use to for, especially for those who want to build their own product for businees with YY3568 core board and customized carrier board. Such as Intelligent NVR, cloud terminal, IoT gateway, industrial control, edge computing, face gate, NAS, vehicle center control and other scenarios.
Their up-coming youyeetoo R1 SBC would be the one for specific function you metioned.
What?! No power usage measurements?!
The urvanov syntax highlighting creates a scrollable div 1px smaller than its content, so every single source on the page grabs the mouse scroll, preventing the page from scrolling naturally until you allow the page to come to a complete stop, scroll it down 1px, wait a couple seconds, then you’re allowed to scroll again. To avoid this papercut, visitors can move their mouse left or right of the syntax highlighter boxes… Of course I forget to do that every single time I visit, so instead I wrote a fix for cnx-software in Stylish extension… /* 1px error lock page…
I can’t reproduce it myself (or does it only happen on mobile?), but I’ve reported the bug to the developers. We’ll see if they fix it.