Firefly ROC-RK3576-PC low-profile Rockchip RK3576 SBC supports AI models like Gemma-2B, Llama2-7B, ChatGLM3-6B

Firefly ROC-RK3576-PC is a low-power, low-profile SBC built around the Rockchip RK3576 octa-core Cortex-A72/A53 SoC, which we also find in the Forlinx FET3576-C, the Banana Pi BPI-M5, and the Mekotronics R57 Mini PC. In terms of power and performance, this SoC sits between the Rockchip RK3399 and RK3588 SoCs and can be used for AIoT applications thanks to its 6 TOPS NPU.

Termed a “mini computer” by Firefly, this SBC supports up to 8GB of LPDDR4/LPDDR4X memory and up to 256GB of eMMC storage. Additionally, it offers Gigabit Ethernet, WiFi 5, and Bluetooth 5.0 for connectivity. An M.2 2242 PCIe/SATA socket and a microSD card slot can be used for additional storage, and the board also offers HDMI and MIPI DSI display interfaces, a MIPI CSI camera interface supporting up to two 2-lane cameras, a few USB ports, and a 40-pin GPIO header.

Firefly ROC RK3576 PC SBC image

Firefly ROC-RK3576-PC specifications

  • SoC – Rockchip RK3576
    • CPU
      • 4x Cortex-A72 cores at 2.2GHz, 4x Cortex-A53 cores at 1.8GHz
      • Arm Cortex-M0 MCU at 400MHz
    • GPU – Arm Mali-G52 MC3 GPU clocked at 1GHz with support for OpenGL ES 1.1, 2.0, and 3.2, OpenCL up to 2.0, and Vulkan 1.1; embedded 2D acceleration
    • NPU – 6 TOPS (INT8) AI accelerator with support for INT4/INT8/INT16/BF16/TF32 mixed operations.
    • VPU
      • Video Decoder: H.264, H.265, VP9, AV1, and AVS2 up to 8K at 30fps or 4K at 120fps.
      • Video Encoder: H.264 and H.265 up to 4K at 60fps, (M)JPEG encoder/decoder up to 4K at 60fps.
  • System Memory – 4GB or 8GB 32-bit LPDDR4/LPDDR4X
  • Storage
    • 16GB to 256GB eMMC flash options
    • MicroSD card slot
    • M.2 (2242 PCIe NVMe/SATA SSD)
    • Footprint for UFS 2.0 storage
  • Video Output
    • HDMI 2.0 port up to 4Kp120
    • MIPI DSI connector up to 2Kp60
    • DisplayPort 1.4 via USB-C up to 4Kp120
  • Audio
    • 3.5mm audio jack (American standard CTIA, supports microphone recording)
    • Line out
  • Camera I/F – 1x MIPI CSI DPHY connector (30-pin, 0.5mm pitch) configurable as 1x 4-lane or 2x 2-lane
  • Networking
    • Low-profile Gigabit Ethernet RJ45 port with Motorcomm YT8531
    • WiFi 5 and Bluetooth 5.0 via AMPAK AP6256 module
  • USB – 1x USB 3.0 port, 1x USB 2.0 port, 1x USB Type-C port
  • Expansion
    • 40-pin GPIO header
    • M.2 socket for PCIe
  • Misc
    • External watchdog
    • 4-pin fan connector
    • 1x Debug port
    • I2C, SPI, USART
    • SARADC
  • Power
    • Supply voltage – 12V DC via 5.5 x 2.1mm barrel jack (supports 12V to 24V wide voltage input)
    • Power consumption – Typical: 1.2W (12V/100mA); Max: 6W (12V/500mA); Min: 0.096W (12V/8mA)
  • Dimensions – 93.00 x 60.15 x 12.49mm
  • Weight – 50 grams
  • Environment
    • Temperature range – -20°C to 60°C
    • Humidity – 10% to 90% RH (non-condensing)

Firefly ROC RK3576 PC Interface description

The Firefly ROC-RK3576-PC SBC supports Android 14 and Ubuntu, while Buildroot and Qt are supported through the official Rockchip SDK. Third-party Debian images may become available later on. More information about the SBC can be found on the product page and the Wiki, but at the time of writing, there is no information available on the latter page.

Since Firefly positions the SBC for AI workloads, it supports large language models such as Gemma-2B, Llama2-7B, ChatGLM3-6B, and Qwen1.5-1.8B, which are often used for language processing and understanding. It also supports traditional neural network architectures such as CNNs, RNNs, and LSTMs for added flexibility. Additionally, you can use popular AI development tools like TensorFlow, PyTorch, and others, and even create custom functions for your needs.
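As a rough sanity check (an illustrative back-of-the-envelope estimate, not a Firefly figure), the weight memory a model needs can be approximated from its parameter count and quantization bit width, which shows why 7B-class models like Llama2-7B only realistically fit on the 8GB variant once quantized down to about 4 bits per weight:

```python
def model_weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight-only memory footprint in GB.
    Ignores KV cache, activations, and runtime overhead, so real usage is higher."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# Llama2-7B at common quantization levels (illustrative)
for bits in (16, 8, 4):
    print(f"Llama2-7B at {bits}-bit weights: ~{model_weight_footprint_gb(7, bits):.1f} GB")
# 16-bit weights alone (~14 GB) exceed even the 8GB board;
# 4-bit weights (~3.5 GB) leave headroom for the OS and runtime.
```

The same arithmetic explains why the smaller models in Firefly's list (Gemma-2B, Qwen1.5-1.8B) are a more comfortable fit for the 4GB variant.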

Firefly ROC RK3576 PC Applications

The ROC-RK3576-PC SBC is priced at around $159.00 for the 4GB+32GB variant and $189.00 for the 8GB+64GB variant on the official Firefly store, but you’ll also find both variants on AliExpress for $199 to $229 including shipping.

 

Firefly ROC RK3576 PC SBC

9 Comments
Jean-Luc Aufranc (CNXSoft)
Admin

It looks to be the same design as the ROC-RK3588S-PC, but with a different processor:
https://www.cnx-software.com/2022/04/19/roc-rk3588s-pc-first-rockchip-rk3588s-sbc-32gb-ram/

BattleKing
1 month ago

Please, don’t fall for that shitty “AI supported” bullshit.
All those chips come with min. 6 TOPS today. It means nothing for LLMs. With the right software they run on the CPU, so even a RPi “supports” those. The 6 TOPS makes it a little faster, but not usable, because no one wants to wait 5 min (more or less) for a word.

RK
1 month ago

6 TOPS NPUs are used for vision object detection and classifications model inference.

Willy
1 month ago

Agreed, especially with a *32-bit* memory bus when you know that DRAM bandwidth is always *the* bottleneck with LLMs. A quick calculation: assuming that the processing time would be reduced to zero thanks to the NPU, and that they’re using LPDDR4X-4224 like on the ROCK 5B, you’d max out at 16 GB/s, which is *at the very best* 3 tokens/s for a compact 7B model fitting in 5 GB. And we know that reality will be far from this, since you’ll never have zero-time processing, and there will also be the context to update, hence memory writes to perform.
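This upper bound can be sketched as a quick calculation (illustrative, assuming a purely memory-bandwidth-bound decoder that streams all weights through DRAM once per generated token):

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on LLM decode speed when memory-bandwidth-bound:
    each token requires reading every weight from DRAM once,
    so tokens/s <= bandwidth / model size. Compute time and
    KV-cache writes only lower this further."""
    return bandwidth_gb_s / model_size_gb

# 32-bit LPDDR4X-4224 bus: 4224 MT/s * 4 bytes ~= 16.9 GB/s, ~16 GB/s usable
print(max_tokens_per_second(16, 5))  # ~3.2 tokens/s for a 5 GB 7B model
```

Doubling the bus width to 64-bit doubles this bound, which is consistent with the observed gap between 32-bit and 64-bit RK3576/RK3588 boards.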

Jean-Luc Aufranc (CNXSoft)
Admin

With 7B parameter models, it will be slow (albeit not 5 minutes per word), but smaller LLMs are processed much faster. TinyLlama 1.1B has been measured at 17 tokens/s.

See the following post for more benchmarks on the 6 TOPS NPU: https://www.cnx-software.com/2024/07/15/rockchip-rkllm-toolkit-npu-accelerated-large-language-models-rk3588-rk3588s-rk3576/

Willy
1 month ago

> TinyLlama 1.1B has been measured at 17 tokens/s

But that was on the Rock 5C with 64-bit memory, hence twice the RAM bandwidth.

RK
1 month ago

The general purpose models you’re both referencing aren’t optimized for specific industrial tasks. A factory floor assembly line sorter made to tell apart blemishes on fruits or identify cans when separating recyclables uses different picking arms and appendages (claws… suction… magnets…) for different objects so you end up lining up multiple specialized and independent robots in a series that are managing with far less RAM and compute than the kind of robots we usually see being used in western markets.

Willy
1 month ago
In reply to RK

I’m perfectly fine with that, it’s just that “AI” gets slapped onto everything these days, and it has become impossible to find a product without it in one form or another. I’m fine with image recognition etc. I remember that the K210-based MaixPy board was able to instantly spot and classify objects in an image despite being battery powered, so I’m fairly certain that there are plenty of valid cases where DRAM bandwidth is not a problem. But here, you’ll note that the photo at the beginning of the article displays “Llama2”, “ChatGLM”, “Qwen”. And Llama2 doesn’t exist in less…

RK
1 month ago

> while we all agree it just realistically cannot.

You can prune those specific models down quite a bit without losing too much accuracy: https://www.engr.washington.edu/industry/capstone-projects/images/Amazon-Edge-LLM-Reducing-LLM-Memory-Footprint-to-2GB-6670bfeeadd6e.pdf

I guess it doesn’t take much to run a reddit bot farm.
