Qualcomm had officially announced that it started sampling the Centriq 2400 SoC, with 48 ARMv8 cores manufactured on a 10nm process, for datacenter and cloud workloads, but at the time the company did not provide many details about the solution or the customizations made to the CPU cores.
Qualcomm has now announced that Falkor is the custom CPU design in the Centriq 2400 SoC, and listed the following key features:
- Fully custom core design – Designed specifically for the cloud datacenter server market, with a 64-bit only micro-architecture based on ARMv8 (AArch64).
- Scalable building block – The Falkor core duplex includes two custom Falkor CPUs, a shared L2 cache and a shared bus interface to the Qualcomm System Bus (QSB) ring interconnect.
- Designed for performance, optimized for power
  - 4-issue, 8-dispatch heterogeneous pipeline designed to optimize performance per unit of power, with variable length pipelines that are tuned per function to maximize throughput and minimize idle hardware.
  - Power management techniques: independent p-state control for each of the CPUs and L2, with entry to and exit from low-power states controlled by hardware state machines, and hardware state retention for power-collapsed sleep states with ultra-fast recovery.
- Performance under memory-intensive workloads – Falkor is designed to fulfill the demand for larger instruction footprints using an innovative split instruction cache comprised of a single-cycle, low-power 24KB L0 I-cache complementing its 64KB L1 I-cache. The core also supports a 32KB L1 D-cache with a 3-cycle load-use latency. The L1 D-cache is augmented by a sophisticated multi-level hardware prefetch engine that dynamically adapts to system conditions.
- Datacenter features
  - ARM Exception Levels (EL0-EL3) and TrustZone secure execution environment.
  - ARMv8 instruction extensions to accelerate cryptographic transform and secure hash operations such as AES, SHA1, and SHA2-256.
  - RAS mechanisms needed to keep a datacenter running, such as fault isolation, reporting, and handling techniques.
- System on a chip – The 48 Falkor CPUs are brought together in a fully-integrated SoC with a high-bandwidth, low-latency ring interconnect, a large L3 cache and multiple memory controllers. It also includes an on-die hardware-based immutable root of trust that authenticates firmware before the first line of firmware is ever executed.
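The ARMv8 cryptographic extensions mentioned in the list above (AES, SHA1, SHA2-256) are normally visible to software on Linux through the "Features" line of /proc/cpuinfo on arm64 kernels, where they show up as the flags aes, sha1 and sha2. The short Python sketch below is only an illustration of how one might check for them. It is not a Qualcomm tool, the helper names parse_features and crypto_support are made up for this example, and nothing here has been run on a Centriq 2400.

```python
from pathlib import Path

# Flag names as reported on the "Features" line of /proc/cpuinfo
# by arm64 Linux kernels (assumption: the system runs Linux).
CRYPTO_FLAGS = ("aes", "sha1", "sha2")


def parse_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def crypto_support(cpuinfo_text):
    """Map each crypto flag to True/False depending on whether it is listed."""
    features = parse_features(cpuinfo_text)
    return {flag: flag in features for flag in CRYPTO_FLAGS}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # On non-Linux systems the file is absent; every flag is then reported as missing.
    text = path.read_text() if path.exists() else ""
    for flag, present in crypto_support(text).items():
        print(f"{flag}: {'yes' if present else 'no'}")
```

On an x86 machine the script reports all three flags as missing, since x86 kernels use a "flags" line with different names.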
The Centriq 2400 SoC is scheduled to start shipping later this year. You’ll find an in-depth overview of the Falkor micro-architecture, and more slides, on AnandTech.
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.
“Segmented ring bus” sounds like a latency optimisation over a classic ring bus. I’d love to see more details on the subject, though. I mean, a torus bus is a ‘segmented ring bus’ in a way.
That’s a machine! I’m really interested in the price. I guess it’s around 200 Dollar?!
@Philipp
I think you can reasonably add a “1” in front of your guessed price 🙂
@willy
Only a “1”?
no info yet on its power consumption…
it also seems qualcomm will support windows server with this soc
Any sources or information on this?
@tkaiser
the anandtech link provided by cnxsoft
see last page, called “closing thoughts” : Qualcomm is going to be supporting Windows Server on Centriq 2400-series SoCs
@tkaiser
MS/QCOMM talks of cloud partnership predate their talks of notebook partnership.