Arm Cortex-A320 low-power CPU is the smallest Armv9 core, optimized for Edge AI and IoT SoCs

Arm Cortex-A320 is a low-power Armv9 CPU core optimized for Edge AI and IoT applications, with up to 50% efficiency improvements over the Cortex-A520 CPU core. It is the smallest Armv9 core unveiled so far.

The Armv9 architecture was first introduced in 2021 with a focus on AI and specialized cores, followed by the first Armv9 cores – Cortex-A510, Cortex-A710, Cortex-X2 – unveiled later that year and targeting flagship mobile devices. Since then, we’ve seen Armv9 cores in a wider range of smartphones, high-end Armv9 motherboards, and TV boxes. The upcoming Rockchip RK3688 AIoT SoC also features Armv9 but targets high-end applications. The new Arm Cortex-A320 will expand Armv9 usage to a much wider range of IoT devices, including power-constrained Edge AI devices.

Arm Cortex-A320

Arm Cortex-A320 highlights:

  • Architecture – Armv9.2-A (Harvard)
  • Extensions
    • Up to Armv8.7 extensions
    • QARMA3 extensions
    • SVE2 extensions
    • Memory Tagging Extensions (MTE) (including Asymmetric MTE)
    • Cryptography extensions
    • RAS extensions
  • Microarchitecture
    • In-order pipeline
    • Partial superscalar support
    • NEON/Floating Point Unit
    • Optional Cryptography Unit
    • Up to 4x CPUs in cluster
    • 40-bit Physical Addressing (PA)
  • Memory system and external interfaces
    • 32KB or 64KB L1 I-Cache / D-Cache
    • Optional L2 Cache – 128KB, 192KB, 256KB, 384KB, or 512KB
    • No L3 Cache
    • ECC Support
    • Bus interfaces – AMBA AXI5
    • No ACP, No Peripheral Port
  • Security – TrustZone, Secure EL2, MTE, PAC/BTI
  • Debugging
    • Debug – Armv9.2-A features
    • CoreSightv3
    • Embedded Trace Extension (ETEv1.1)
    • Trace Buffer Extension
  • Misc
    • Interrupts – GIC interface, GICv4.1
    • Generic timer – Armv9.2-A
    • PMUv3.7
Cortex-M85 upgrade to Cortex-A320 with Ethos U85 for Edge AI
Slides from Arm’s presentation

The Cortex-A320 can be combined with the Ethos-U85 NPU for Edge AI, providing an upgrade path for Cortex-M85+Ethos-U85-based Endpoint AI devices, with support for LLMs with up to one billion parameters, and for Linux or Android operating systems, besides RTOSes like FreeRTOS or Zephyr OS. We’re also told a quad-core Cortex-A320 can execute up to 256 GOPS when running at 2GHz, a figure based on 8-bit MACs per cycle.
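As a rough sanity check on that 256 GOPS figure, the short Python sketch below works backwards to the per-core throughput it implies. It assumes the common convention that one multiply-accumulate counts as two operations, and that the four cores contribute equally. Neither assumption is stated in Arm's announcement, and the function name is ours.

```python
def macs_per_cycle_per_core(total_gops: float, cores: int, freq_ghz: float,
                            ops_per_mac: int = 2) -> float:
    """Back out the 8-bit MACs per cycle per core implied by a peak GOPS figure.

    ops_per_mac=2 is an assumption (1 multiply + 1 add), not an Arm-provided number.
    """
    # GOPS divided by GHz gives operations per clock cycle for the whole cluster
    ops_per_cycle_cluster = total_gops / freq_ghz
    # Split evenly across cores (assumption), then convert operations to MACs
    return ops_per_cycle_cluster / cores / ops_per_mac


# Quad-core Cortex-A320 at 2GHz, 256 GOPS as quoted by Arm
print(macs_per_cycle_per_core(total_gops=256, cores=4, freq_ghz=2.0))  # 16.0
```

Under those assumptions, each core would sustain 16 8-bit MACs per cycle. That happens to match the number of 8-bit lanes in a 128-bit vector, but that reading is our inference rather than something the announcement states.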

Besides the 50% efficiency improvement over the Cortex-A520, Arm says the Cortex-A320 delivers over 30% higher performance in SPECINT2K6 than its Armv8 predecessor, the Cortex-A35, thanks to efficient branch predictors and pre-fetchers, as well as memory system improvements.

The Cortex-A320 also makes use of NEON and SVE2 improvements in the Armv9 architecture to deliver up to 10x better machine learning (ML) performance compared to Cortex-A35, or up to 6x higher ML performance than the Cortex-A53. With these ML improvements and high area and energy efficiencies, Arm claims that the Arm Cortex-A320 is the most efficient core in ML applications across all Arm Cortex-A CPUs.

Arm Cortex-A53/Cortex-A35 vs Arm Cortex-A320

Renesas may be one of the first companies to launch an Arm Cortex-A320 SoC, likely in 2026, as they are one of the few partners mentioned in the press release, and they were the first to introduce an Arm Cortex-M85 microcontroller, over a year after the core was unveiled. More details about the Cortex-A320 CPU core can also be found in a blog post and on Arm’s developer website.
