NXP i.MX RT crossover processors combine the real-time capabilities of microcontrollers with the performance of application processors thanks to an Arm Cortex-M7 core clocked at 528 MHz or more.
The performance is indeed impressive, as shown by Teensy 4.0 benchmarks, but so far NXP i.MX RT processors have targeted general-purpose applications. The company has now introduced three new crossover processors designed for AI applications. NXP i.MX RT106F is designed for offline face recognition and expression identification, while RT106L and RT106A are made for local and cloud-based embedded voice applications.
NXP i.MX RT106F Processor
- CPU – Arm Cortex-M7 @ 600 MHz (3020 CoreMark/1284 DMIPS)
- Memory – 1 MB On-Chip SRAM plus up to 512 KB configurable as Tightly Coupled Memory (TCM)
- External memory interface options – NAND, eMMC, QuadSPI NOR Flash, and Parallel NOR Flash
- Real-time, low-latency response as low as 20 ns
- Industry’s lowest dynamic power with an integrated DC-DC converter
- Low-power run modes at 24 MHz
- Advanced multimedia for GUI and enhanced HMI
- 2D graphics acceleration engine
- Parallel camera sensor interface
- LCD display controller (up to WXGA 1366×768)
- 3x I2S for high-performance, multi-channel audio
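To put the headline numbers in perspective, the benchmark scores can be normalized per MHz and the 20 ns latency converted into core clock cycles. The sketch below uses only the figures quoted in the list above; the function names are illustrative and not part of any NXP tool.

```python
# Figures quoted in the highlights list above.
CLOCK_MHZ = 600
COREMARK = 3020
DMIPS = 1284
LATENCY_NS = 20


def per_mhz(score, clock_mhz):
    """Normalize a benchmark score by the core clock in MHz."""
    return score / clock_mhz


def ns_to_cycles(latency_ns, clock_mhz):
    """Convert a time in nanoseconds to core clock cycles.

    A clock of f MHz completes f / 1000 cycles per nanosecond.
    """
    return latency_ns * clock_mhz / 1000


print(f"{per_mhz(COREMARK, CLOCK_MHZ):.2f} CoreMark/MHz")   # 5.03 CoreMark/MHz
print(f"{per_mhz(DMIPS, CLOCK_MHZ):.2f} DMIPS/MHz")         # 2.14 DMIPS/MHz
print(f"{ns_to_cycles(LATENCY_NS, CLOCK_MHZ):.0f} cycles")  # 12 cycles
```

In other words, the quoted 20 ns response corresponds to 12 core clock cycles at 600 MHz.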
NXP provides FreeRTOS for the microcontroller/processor, and software development can be performed with MCUXpresso SDK, IDE and Config Tools. NXP claims its OASIS face processing engine enables face detection, recognition, and anti-spoofing without cloud connectivity at a much lower price than competing Linux-based solutions. More details about the processor itself can be found on the product page.
Face Recognition Devkit
The company is working with an OEM to develop an i.MX RT106F development kit for face recognition applications such as the one pictured above.
The devkit features an ultra-small form-factor, production-ready hardware design running FreeRTOS that allows for a quick out-of-the-box implementation. It can perform face detection, face tracking, face alignment, and face recognition without Wi-Fi and cloud connectivity in order to address potential privacy concerns.
Devkit specifications:
- MCU – NXP i.MX RT106F Crossover Processor with Arm Cortex-M7 core @ 600 MHz, 32 kB I-cache, 32 kB D-cache, FPU, 1 MB on-chip SRAM
- System Memory – 32 MB SDRAM
- Storage – 32 MB Hyperflash
- ADC/DAC Conversion – 2x ADC (20-ch), 2 x ACMP
- System Control – Secure JTAG, PLL OSC, eDMA, 4x Watch Dog, 6x GP Timer, 4x Quadrature ENC, 4x QuadTimer, 4x FlexPWM, IOMUX
- Security
- Hardware – HAB, TRNG, Encrypted XIP out of Flash
- Software – Ciphers & RNG, Secure RTC, Fuse, HAB
- Misc – Optional display and keypad, supports RGB and IR, interface for temperature monitoring
- Power Supply – 5V USB Type-C port; Low-dropout regulation via DC-to-DC & LDO
- Dimensions – 50 x 40 mm
The kit will come with a full source stack including the FreeRTOS operating system and drivers, the face recognition inference engine, GUI API, Bluetooth & WiFi connectivity manager, and more.
Some of the applications are somewhat worrying (does my washing machine really need a camera?) but here they are:
- Smart appliances – Washing machines, dryers, ovens, refrigerators, stoves, and dishwashers
- Home comfort devices – Thermostats, HVAC and lighting control
- Countertop appliances – Microwaves, coffee machines, and rice cookers
- Safety/Security/Alarm devices – Alarm panels and automated access
- Smart industrial devices – Power tools, ergonomic stations, industrial workstations
The processor and development kits will become available in Q1 2020, but if your company has a project that could benefit from the solution, you can request early access on the devkit page.
NXP i.MX RT106A & RT106L
I challenge you to find any differences between NXP i.MX RT106A block diagram above and the one for RT106F above, because unless I’m really tired (Friday evening here), there aren’t any.
The highlights for the RT106A and RT106L voice processors are the same, and differ from those of the RT106F only in the wireless connectivity interfaces:
- CPU – Arm Cortex-M7 @ 600 MHz (3020 CoreMark/1284 DMIPS)
- Memory – 1 MB On-Chip SRAM plus up to 512 KB configurable as Tightly Coupled Memory (TCM)
- External memory interface options – NAND, eMMC, QuadSPI NOR Flash, and Parallel NOR Flash
- Real-time, low-latency response as low as 20 ns
- Industry’s lowest dynamic power with an integrated DC-DC converter
- Low-power run modes at 24 MHz
- Advanced multimedia for GUI and enhanced HMI
- 2D graphics acceleration engine
- Parallel camera sensor interface
- LCD display controller (up to WXGA 1366×768)
- 3x I2S for high-performance, multi-channel audio
- Wireless connectivity interface for WiFi, Bluetooth, Bluetooth Low Energy, ZigBee and Thread
So as I understand it, RT106A, RT106L and RT106F are the same hardware, and only the software/licenses vary.
NXP i.MX RT106L is designed for local (offline) voice control solutions using Snips with the following software features:
- Far-field audio front end
- Acoustic echo cancellation (barge-in)
- Ambient noise reduction
- Beamforming
- Playback processing
- Codecs
- Automatic speech recognition engine
- Media player/streamer
- MQTT, lwIP, TLS
- All drivers, including Wi-Fi and Bluetooth
NXP i.MX RT106A is licensed to run NXP’s turnkey voice assistant software solutions with the following features:
- Far-field audio front end softDSP
- Acoustic echo cancellation
- Ambient noise reduction
- Beamforming
- Barge-in
- Playback processing
- Codecs
- Wake-word inference engine
- Media player/streamer
- MQTT, lwIP, TLS
- Discovery and onboarding
- All drivers, including Wi-Fi and Bluetooth
You’ll find more details on their respective product pages here and there.
NXP i.MX RT MCU Alexa Voice Service Solution
There’s also an i.MX RT106A based system reference design for creating products with Alexa built-in called “NXP i.MX RT MCU Alexa Voice Service Solution“.
Specifications:
- MCU – NXP i.MX RT106A Crossover Processor with Arm Cortex-M7 core @ 600 MHz, 32 kB I-cache, 32 kB D-cache, FPU, 1MB on-chip SRAM
- Storage – 32 MB Hyperflash
- Audio – TFA9894 embedded DSP class-D audio amplifier solution associated with speaker protection and boost algorithm; low power: 91% peak efficiency for 600mW sine wave, low battery consumption: <125 mA (Po = 380 mW, average music power)
- Wireless Connectivity – WiFi 4 (802.11 b/g/n) and Bluetooth + BLE 4.1
- System Control – Secure JTAG, PLL OSC, eDMA, 4x Watch Dog, 6x GP Timer, 4x Quadrature ENC, 4x QuadTimer, 4x FlexPWM, IOMUX
- ADC/DAC Conversion – 2x ADC (20-ch), 2 x ACMP
- Security
- Hardware – Optional secure element A71CH
- Software – Ciphers & RNG, Secure RTC, Fuse, HAB
- Protocols – MQTT, mBedTLS, Alexa for MCU, LWIP
- Power Supply – 5V USB Type-C port; Low-dropout regulation via DC-to-DC & LDO
- Dimensions – 40 x 30 mm
- Qualifications – Amazon AVS Qualified
The kit also runs Amazon FreeRTOS and supports up to three integrated low-cost MEMS microphones and two external digital microphone lines. The software architecture is similar to the face detection kit, but obviously, it removes the computer vision part and replaces it with audio stacks such as Opus, MP3, G.711 and WMA audio codecs, the wake word inference engine, audio DSP, and more. The software also supports “IoT” communication protocols like MQTT and includes the lwIP lightweight TCP/IP stack.
You’ll find more details on the product page, and it will soon be sold on Mouser for $49. It’s an official Amazon AVS development kit and as such is listed on the Amazon Developer site together with the MediaTek MT8516 devkit we covered earlier today, and other compatible platforms.
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.
I think the hardware is identical between RT1062, RT106A, RT106L, RT106F. The A, L, F variations include software licenses. Those software licenses vary between models and have different prices.
The M7 will NOT do IR facial recognition since it cannot handle IR image processing in a timely (<1s) manner. You will need a minimum of 4 cores at 1.4 GHz. Near IR (500nm) is the light range used by professionals since white light is easily fooled.
No, the M7 MCU-based solution really does support RGB/IR dual-camera image processing, with a total time under 500 ms including the detection, liveness, and recognition algorithms.
Seeing real-time MCUs run at half a gigahertz looks impressive to me. I’m wondering how accurate the realtime processing remains though, when entering a domain where processing performance starts to heavily depend on cache hits and misses.
I guess that for hard-RT scenarios one’d turn off the caches.
If this meets your real time constraints in the worst case it will still meet your constraints in the best case.
Determining the worst case with a fair degree of certainty happens to be the crux of the problem. Otherwise caching can be slower than non-caching — all it takes is a high-enough miss rate and an average fetch size smaller than a cacheline.
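The claim above can be illustrated with a toy cost model. This is only a sketch under stated assumptions: every cycle count is hypothetical (not measured on an i.MX RT or taken from Arm documentation), and the model assumes the core stalls for the whole line fill on a miss, with no critical-word-first forwarding.

```python
def avg_cached_cycles(miss_rate, hit_cycles, linefill_cycles):
    """Average cycles per access with the cache enabled.

    Hits cost hit_cycles; misses pay a full line fill and then the hit.
    """
    hit_part = (1.0 - miss_rate) * hit_cycles
    miss_part = miss_rate * (hit_cycles + linefill_cycles)
    return hit_part + miss_part


def break_even_miss_rate(hit_cycles, linefill_cycles, uncached_cycles):
    """Miss rate above which caching is slower than uncached word fetches."""
    return (uncached_cycles - hit_cycles) / linefill_cycles


# Hypothetical numbers: 1-cycle hit, 10-cycle uncached single-word fetch,
# and a line fill of 8 words costing 10 cycles plus 2 cycles per extra word.
HIT = 1
UNCACHED = 10
LINEFILL = 10 + 7 * 2  # 24 cycles

print(f"break-even miss rate: {break_even_miss_rate(HIT, LINEFILL, UNCACHED):.3f}")
for miss_rate in (0.1, 0.5):
    cached = avg_cached_cycles(miss_rate, HIT, LINEFILL)
    print(f"miss rate {miss_rate:.1f}: cached {cached:.1f} vs uncached {UNCACHED} cycles")
```

With these made-up numbers, the cache stops paying off above a miss rate of 0.375: at a 10% miss rate the cached path averages about 3.4 cycles per access, while at 50% it averages 13 cycles against 10 uncached. A memory system that forwards the requested word before the line fill completes would push that break-even point higher.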
Most CPUs designed to work with caches cannot fetch less than a cache line anyway. Some memory controllers may at least start to fetch from the requested word. I think that if you disable write-back (or at least write-allocate) leaving the cache enabled cannot degrade performance compared to it being completely disabled.
That’s a possibility. We need a volunteer with a Teensy 4 : )
Since that got my curiosity, I had to track down the relevant interfaces and their docs. M7’s TRM is quite clear on the AXIM capabilities: https://static.docs.arm.com/ddi0489/d/DDI0489D_cortex_m7_trm.pdf — 5.4.1 AXI attributes and transactions. The AXIM interface can do way more than linefills at fetches — it’s capable of various combinations of outstanding linefills and (multi-) word fetches.