Armv9 architecture to focus on AI, security, and "specialized compute"

Armv8 was announced in October 2011 as the first 64-bit architecture from Arm, while keeping compatibility with 32-bit Armv7 code. Since then, we’ve seen plenty of Armv8 cores, from the energy-efficient Cortex-A35 to the powerful Cortex-X1, as well as some custom cores from Arm partners.

But Arm has now announced its first new architecture in nearly ten years: Armv9 builds upon Armv8 but adds blocks for artificial intelligence, security, and “specialized compute”, which basically means hardware accelerators or instructions optimized for specific tasks.

Armv9 still supports AArch32 and AArch64 instructions, NEON, Crypto Extensions, TrustZone, etc., and is more an evolution of Armv8 than a completely new architecture.

Some of the new features brought about by Armv9-A include:

Scalable Vector Extension v2 (SVE2) is a superset of the Armv8-A SVE found in some Arm supercomputer cores. It adds fixed-point arithmetic support and works with vector lengths in multiples of 128 bits, up to 2048 bits. It is useful for specialized DSP and XR (augmented and virtual reality) workloads, from 5G to genomics to computer vision.
Arm Confidential Compute Architecture (CCA)
- The Realm Management Extension (RME) establishes a new hardware-backed secure environment that extends Confidential Compute on Arm platforms to all developers and all workloads. Typical use case: a public cloud that processes sensitive or valuable data.
- Arm Confidential Compute Firmware Architecture – A standard platform software framework for the Arm Confidential Compute Architecture that simplifies hardware design and encourages reuse and portability. Typical use case: protection of sensitive personal healthcare data on mobile devices.

Tracing & Debugging
- Branch Record Buffer Extensions (BRBE) provide branch profiling information for debugging and optimization, with uses such as hot-spot analysis and AutoFDO. To be implemented in Armv9.2-A, scheduled for release in Q3/Q4 2021.
- Embedded Trace Extension (ETE) and Trace Buffer Extension (TRBE) for improved trace capabilities in Armv9.
The Transactional Memory Extension (TME) brings Hardware Transactional Memory (HTM) support to the Arm architecture. It aims to make highly concurrent, multi-threaded programs easier to write, and to let coarse-grained, thread-level parallelism scale better with the number of CPUs by reducing serialization due to lock contention.
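
To give a feel for what "reducing serialization due to lock contention" means for TME, here is a toy software emulation of the optimistic pattern that hardware transactional memory enables: do the work speculatively, commit only if nobody else changed the data, and fall back to a plain lock after a few failed attempts. This is only a sketch in Python. It does not use any TME instructions, and the names `VersionedCell`, `atomic_update`, and `run_demo` are made up for illustration. Real HTM tracks conflicts in hardware instead of using a version counter.

```python
import threading


class VersionedCell:
    """Toy shared value with a version stamp used to detect conflicts."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # stands in for the atomic commit step

    def try_transaction(self, update):
        """Run `update` speculatively; commit only if nobody else committed."""
        seen_version = self.version     # read the version first...
        new_value = update(self.value)  # ...then do the work without the lock
        with self._lock:
            if self.version != seen_version:
                return False            # conflict detected: abort
            self.value = new_value      # commit the value, then bump the version
            self.version += 1
            return True

    def locked_update(self, update):
        """Fallback path: classic lock-protected read-modify-write."""
        with self._lock:
            self.value = update(self.value)
            self.version += 1


def atomic_update(cell, update, max_retries=3):
    """Try the optimistic path a few times, then fall back to the lock."""
    for _ in range(max_retries):
        if cell.try_transaction(update):
            return "optimistic"
    cell.locked_update(update)
    return "fallback"


def run_demo(num_threads=4, increments=1000):
    """Increment one shared cell from several threads; return the final value."""
    cell = VersionedCell()

    def worker():
        for _ in range(increments):
            atomic_update(cell, lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value
```

In this sketch the lock is only held for the short commit check, so the (possibly expensive) `update` work from different threads can overlap. No increment is lost, because a stale attempt is aborted and retried.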

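The SVE2 item in the feature list above mentions fixed-point arithmetic and vector lengths in multiples of 128 bits, up to 2048 bits. The Python sketch below models both ideas in plain software: it lists the vector lengths described in the article, and runs the same saturating 16-bit addition loop for each of them to show that code written without a hard-coded vector length gives the same result on any implementation. The function names are hypothetical and nothing here uses actual SVE2 instructions or intrinsics.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def valid_vector_lengths():
    """Vector lengths in bits: multiples of 128 up to 2048, as stated in the article."""
    return list(range(128, 2048 + 1, 128))


def saturating_add_q15(a, b):
    """Add two 16-bit fixed-point values, clamping instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def vla_saturating_add(xs, ys, vector_bits):
    """Process the inputs one 'vector' at a time, whatever the vector length is."""
    lanes = vector_bits // 16  # number of 16-bit elements per vector
    out = []
    for start in range(0, len(xs), lanes):
        chunk_x = xs[start:start + lanes]  # the last chunk may be shorter
        chunk_y = ys[start:start + lanes]
        out.extend(saturating_add_q15(a, b) for a, b in zip(chunk_x, chunk_y))
    return out


if __name__ == "__main__":
    xs = [30000, -30000, 100, 7] * 25
    ys = [10000, -10000, 200, -7] * 25
    reference = vla_saturating_add(xs, ys, 128)
    for bits in valid_vector_lengths():
        assert vla_saturating_add(xs, ys, bits) == reference
    print(len(valid_vector_lengths()), "vector lengths give identical results")
```

The loop never assumes a particular number of lanes, which is the point of a scalable vector extension: the same program can run on a 128-bit mobile core and a wider server or supercomputer core.
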
The CPU performance of Armv9 SoCs is expected to increase by more than 30% over the next two generations of mobile and infrastructure CPUs, but Arm also highlights its Total Compute design principles, where SoCs will be optimized for specific tasks.

A few more details can be found in the press release and Total Compute page, but for technical details, I’d recommend checking out the A-profile and security features pages from Arm’s developer website. The “Arm Vision” section of the website also has some more details, including a video by Arm SVP, Chief Architect and Fellow, Richard Grisenthwaite, that gives a more practical overview of Armv9 and the use cases made possible by the new features. All the screenshots above come from this video as well.

Jean-Luc Aufranc (CNXSoft)

Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager and starting to write daily news and reviews full time later in 2011.