NVIDIA created a lot of buzz when it released the $99 Jetson Nano SBC, which features a 128-core Maxwell GPU and is said to deliver 472 GFLOPS of compute performance for modern AI workloads at a power consumption of around 5 watts.
But Jetson Nano is not the only low-cost platform to deliver high performance at low power for AI workloads. For example, the Rockchip RK3399Pro (with the RK1808 NPU) found in boards such as the Toybrick RK3399Pro is said to deliver 3 TOPS for INT8, 300 GOPS for INT16, and 100 GOPS for FP16 inference.
Those operations-per-second numbers can be confusing and misleading, so it’s important to check the performance of actual neural network models. Rockchip did provide some RK3399Pro benchmarks last year for the Inception V3, ResNet34 and VGG16 models, comparing the results to Apple A11, Huawei Kirin 970, and NVIDIA Jetson TX2. Ideally, however, you’d want results from third parties. Chengwei Zhang got hold of a Toybrick board, and explains in detail how to run an Inception V3 Keras model on the board in his blog.
There are basically two main steps:
- Freeze the Keras model to a TensorFlow graph and create the inference model with RKNN Toolkit. This is done on a powerful Linux computer rather than on the target board, for performance reasons.
- Load the RKNN model on an RK3399Pro dev board and make predictions.
As a side note, the Toybrick team put the Android source code on GitHub using a manifest file, instead of just providing the usual 6GB tarballs. That means you can retrieve the code as follows:
repo init --repo-url http://github.com/aosp-mirror/tools_repo.git -u http://github.com/rockchip-toybrick/manifest.git -b master -m rk3399pro.xml
The tarball is also available on Baidu in case you run into problems with GitHub.
Chengwei goes into detail about the two steps described above, so I’ll skip right to the final results. The Toybrick RK3399Pro board achieves an average of 28.94 FPS, which is even faster than the 27.18 FPS Jetson Nano achieves with a much smaller MobileNetV2 model. Inception V3 is far more complex than MobileNetV2, so we can expect a larger gap between the two boards when running identical models.
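An average FPS figure like the ones above is typically obtained by timing a batch of repeated inferences. The snippet below is only a minimal sketch of that idea, not the code used in Chengwei’s post: `average_fps`, its `warmup` and `runs` parameters, and the `fake_inference` stand-in are all hypothetical names introduced for illustration. On a real board, the callable would wrap a single inference call of the loaded model.

```python
import time


def average_fps(infer, warmup=5, runs=50):
    """Time `runs` calls to `infer` and return calls per second.

    `infer` is any zero-argument callable performing one inference.
    A few warm-up calls are made first so one-time setup costs
    do not skew the average.
    """
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    elapsed = time.perf_counter() - start
    return runs / elapsed


def fake_inference():
    # Hypothetical stand-in for one forward pass (sleeps about 10 ms).
    time.sleep(0.01)


if __name__ == "__main__":
    fps = average_fps(fake_inference, warmup=2, runs=20)
    print(f"Average FPS: {fps:.2f}")
```

Averaging over many runs after a warm-up matters because the first few inferences often include model loading or cache effects that would otherwise lower the reported number.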
For reference, Rockchip reported that VGG16 ran at 50 fps on RK3399Pro versus 32 fps on the Jetson TX2 with its 256-core Pascal GPU, and ResNet50 at 86 fps versus 82 fps, as shown in the older chart below.
The downside is that the Toybrick RK3399Pro is much more expensive than Jetson Nano, since the 3GB RAM version sells for $249, and the 6GB RAM model for $299 on the VAMRS website. Hopefully, a vendor will come up with a cheap RK3399Pro board, or better, with a Rockchip RK1808 board that should offer similar inference performance at a much lower cost.
Thanks to Jon for the tip.
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager and starting to write daily news and reviews full time later in 2011.
Comparing the latest RK3399 with a 2-year-old Apple chip is a little unfair 😉 If I remember correctly, that was one of the first CPUs with an NPU, released when nobody else offered one in a mobile phone CPU.
The A12 scores 147 for ResNet50 and 84 for Inception V3, and there we are talking about a phone chip. The RK3399 under heavy load could eat 10W, so it would be better to compare it with the same class of CPU, like the A12X from the iPad… but then it will be smashed.
A like-for-like comparison is what is required. How much does an A12 or A12X device cost?
The interesting point here is mostly about RK1808 NPU. It’s the AI accelerator in RK3399Pro. With a dual core Cortex-A35 it will cost much less, and people have been quoted $12 for RK1808 + PMIC (sample) or around $6 for the same combo in larger quantities. So that means a board could be made for well under $100 with about the same performance as RK3399Pro, and faster than Jetson Nano provided there’s no bottleneck related to the CPU.
These RK3399Pro boards should come down in price. It should be possible to make one for $100 depending on the RAM/flash configuration. An RK1808 USB stick is also an alternative. I think we’ll see RK1808 USB sticks for under $25, and RK3399 (non-Pro) boards are already available for under $70, so the total is under $100. The current boards are expensive because they are new, and because they are loaded with 6GB RAM and flash.
Note that the AI core is in the RK1808 and the RK1808 is $6. You are going to see $30 SBCs shortly that outperform the Nvidia Jetson. It is pretty amazing that a $6 part can do Inception V2 at 30FPS.
The AI core in the RK1808 can only do CNN type operations. The silicon is customized for running CNN. The Jetson is still a GPU so it can do graphics stuff in addition to CNN. This is probably the explanation for the RK1808 needing less power and being cheaper.
The ROCKPro64 2GB single board computer is $60, but it comes with no storage.
An 8GB eMMC module for the ROCKPro64 is $12.
Jetson Nano Module is $129. https://developer.nvidia.com/embedded/buy/jetson-nano
TB-96AIoT RK1808 module with same RAM/flash is $119.
But there is a lot of room for price cuts in the TB-96AIoT module. The BOM cost for that module is under $50.
Everyone wants their cut, the bigger the better…
Toybrick’s not the only one with a 3399Pro offered. They’re just more expensive than some of the other offerings.
Just a correction: Jetson nano is 5W only in conservative-power mode; at full power it’s 10W or a tad above.
The new Asus Tinker Edge T and CR1S-CM-A SBC will both be equipped with an NXP i.MX 8M and Google’s Coral Edge TPU.
$199?