Espressif introduces ESP32-S3-BOX AI development kit for online and offline voice applications

ESP32-S3-Box

Espressif Systems has recently introduced the ESP32-S3-BOX AI voice devkit, designed for developing applications with offline and online voice assistants. I find its design similar to that of the M5Stack Core2 devkit, but the applications will be different. The ESP32-S3-BOX features the latest ESP32-S3 processor with WiFi and BLE connectivity and AI capabilities, as well as a 2.4-inch capacitive touchscreen display, a 2-mic microphone array, a speaker, and I/O connectors, with everything housed in a plastic enclosure with a stand.

ESP32-S3-BOX specifications:

WiSoC – ESP32-S3 dual-core Tensilica LX7 up to 240 MHz with Wi-Fi & Bluetooth 5, AI instructions, 512KB SRAM
Memory and Storage – 8MB octal PSRAM and 16MB QSPI flash
Display – 2.4-inch capacitive touchscreen display with 320×240 resolution
Audio – Dual microphone, speaker
USB – 1x USB Type-C port for power and debugging (JTAG/serial)
Expansion – 2x Pmod-compatible headers for up to 16x GPIOs
Misc Power […]

Picovoice Cobra Voice Activity Detection Engine shown to outperform Google WebRTC VAD

PicoVoice Cobra VAD

The Picovoice Cobra Voice Activity Detection (VAD) engine has just been publicly released with support for Raspberry Pi, BeagleBone, NVIDIA Jetson Nano, Linux 64-bit, macOS 64-bit, Windows 64-bit, Android, iOS, and web browsers that support WebAssembly. Support for other Cortex-M and Cortex-A based SoCs can also be made available, but only to enterprise customers. Picovoice already offered custom wake word detection with easy and quick web-based training, as well as offline voice recognition for Raspberry Pi, and later ported its voice engine to Arduino. Cobra VAD is a new release and, like other VADs, aims to detect the presence of a human voice within an audio stream. Picovoice Cobra can be found on GitHub, but note that it is not an open-source solution. Instead, the libpv_cobra.so dynamic library is provided for various targets, together with header files and demos in C, Python, Rust, and WebAssembly, as well as demo apps for iOS […]

Raspberry Pi smart audio devkit features AISonic IA8201 DSP, microphone array

AISonic-Raspberry Pi Development Kit

The Knowles AISonic IA8201 Raspberry Pi development kit is designed to bring voice, audio edge processing, and machine learning (ML) listening capabilities to various systems, and can be used to evaluate the company’s AISonic IA8201 DSP that was introduced about two years ago. The kit comprises three boards: an adapter board with three buttons that connects to the Raspberry Pi, and the AISonic IA8201 DSP board itself, which is connected via a flat cable to a microphone array. Knowles did not provide full details about the kit, but says it enables wake-on-voice processing for low latency voice UI, noise reduction, context awareness, and accelerated machine learning inferencing for edge processing of sensor inputs. Some of the use cases include Low Power Voice Wake to listen for specific OEM keywords to wake the host processor, Proximity Detection when combined with an ultrasonic capable […]

Offline speech recognition MCU module comes with speaker, microphone, and UART connectors

offline voice recognition mcu module

We found out about the Unisound US516P6 RISC microcontroller inside an offline voice assistant module last May. The module offers offline speech recognition for just $2 to $4, with good performance and excellent privacy, since no cloud service or Internet connection is needed. That module requires some soldering, but if you’d prefer something easier to connect, the “SU-10A” offline speech recognition MCU module comes with connectors for a speaker and a microphone, as well as UART connectivity to a host MCU if needed.

“SU-10A” module specifications:

MCU – Unisound US516P6 RISC microcontroller @ 240 MHz with FPU, DSP instruction, FFT accelerator, 242KB SRAM, 2MB flash
Audio
  Built-in 3W mono Class AB power amplifier
  2mm pitch connector for speaker (4 Ohms up to 2.9W, 8 Ohms up to 1.8W)
  2mm pitch connector for electret microphone
Debugging/programming – UART port for serial console (5V or 3.3V supported)
Host interface – 2mm pitch 4-pin connector […]

STEVAL-VOICE-UI Amazon qualified Alexa Smart Home evaluation kit is based on STM32H7 MCU

STEVAL-VOICE-UI

We’ve already covered plenty of Amazon-qualified development kits working with Alexa Voice Services. But here’s another one, with the STEVAL-VOICE-UI evaluation kit making it to the list of Smart Home Dev Kits, which Amazon describes as “reference designs for creating smart home products such as light switches, thermostats, or Wi-Fi routers”. The STEVAL-VOICE-UI voice user interface (VUI) evaluation kit features an STMicro STM32H7 Arm Cortex-M7 microcontroller with 2 MB embedded flash, 1 MB embedded SRAM, 2.4 GHz Wi-Fi, and a microphone array with three MEMS microphones, as well as a loudspeaker, and some buttons and LEDs.

STEVAL-VOICE-UI key features and specifications:

Microcontroller – STMicro STM32H753VIT6E Cortex-M7 MCU @ up to 550 MHz with 2 MB flash, 1 MB SRAM
Connectivity – 2.4 GHz Wi-Fi subsystem (Murata 1DX module) coupled to 2MB NOR flash (ISSI IS25LP016D)
Audio
  3x MP23DB01HP MEMS microphones with 36 and 30 mm spacing
  FDA903D class D digital input automotive audio […]

MAIX-II A AI camera board combines Allwinner R329 smart audio processor with USB-C camera

MAIX-II A R329 processor USB camera

Earlier this year, we wrote about the Sipeed MAIX-II Dock AIoT vision devkit with an Allwinner V831 camera processor with a small 200 MOPS NPU, an Omnivision SP2305 2MP camera sensor, and a 1.3-inch display. But for some reason, possibly supply issues, Sipeed has designed a quite different variant called MAIX-II A, with a board based on the Allwinner R329 smart audio processor, a 720p30 USB camera module, and a 1.5-inch display.

MAIX-II A board specifications:

Main M.2 module – Maix-II A module with Allwinner R329 dual-core Cortex-A53 processor @ 1.5 GHz, 256MB DDR3 on-chip, a dual-core HIFI4 DSP @ 400 MHz, and Arm China AIPU AI accelerator for up to 256 MOPS, plus Wi-Fi & BLE and a footprint for an SPI Flash
Storage – MicroSD card socket
Display – 1.5-inch LCD display with 240×240 resolution
Audio – Dual microphones, 3W speakers
Camera – 720p USB-C camera module based […]

Picovoice offline Voice AI engine now works on Arduino

PicoVoice Arduino

Last year, I wrote about Picovoice support for Raspberry Pi, enabling custom wake-word and offline voice recognition to control the board with voice commands without relying on the cloud. They used a ReSpeaker 4-mic array HAT to add four “ears” to the Raspberry Pi SBC. I also tried to generate a custom wake-word using the “Picovoice Console” web interface, and I was able to use “Dear Master” within a few minutes on my computer. No need to provide thousands of samples, or wait weeks before getting a custom wake-word. It’s free for personal projects. But the company has now brought Picovoice to Arduino, or more precisely to the Arduino Nano 33 BLE Sense, which is powered by a Nordic Semi nRF52840 Arm Cortex-M4F microcontroller and already equipped with a digital microphone, so no additional hardware is required for audio capture. To get started, you’d just need to install the Picovoice Arduino library, load the sample […]

US516P6 RISC microcontroller powers offline voice assistant modules

US516P6 offline voice module

I recently wrote about a Linux microwave oven with a built-in voice assistant, and somebody mentioned that a quad-core SoC was overkill, and that the US516P6 microcontroller designed for offline voice commands would be a better fit instead. That’s all good, but finding information about the Unisound US516P6 proved to be quite a challenge, with little public information, most of it in Chinese. But then I noticed the Wireless Tag WT516P6Core offline voice module, and since I am in contact with the company, I managed to get a few more details, notably with regard to the development tools.

US516P6 module specifications:

MCU – Unisound US516P6 RISC microcontroller (likely Andes NDS32 based) @ 240 MHz with FPU, DSP instruction, FFT accelerator, 242KB SRAM, 2MB flash
Audio – Built-in power amplifier
I/Os – 12 castellated holes with UART, GPIO, microphone input, speaker output, VCC, and GND
Power Supply – Built-in 5V to 3.3V, 3.3V to 1.2V LDO to […]
