MediaTek drops efficiency cores in Dimensity 9300 Cortex-X4/A720 mobile SoC

MediaTek Dimensity 9300 is a premium octa-core 5G mobile SoC with two clusters of four Cortex-X4 cores and four Cortex-A720 cores, but no Cortex-A520 efficiency cores. It also features the latest Arm Immortalis-G720 GPU and a MediaTek APU 790 neural processing unit (NPU) capable of supporting generative AI and large language models (LLM) with up to 33 billion parameters.

Arm invented big.LITTLE and then DynamIQ technologies to mix cores with different power efficiency and performance characteristics in order to reduce power consumption. Its latest launches included the Cortex-X4 premium core, Cortex-A720 performance/big core, and Cortex-A520 efficient/LITTLE core, but MediaTek decided to do without the Cortex-A520 in the Dimensity 9300, which strikes me as odd for a mobile SoC where power efficiency is important for long battery life.

MediaTek Dimensity 9300

MediaTek Dimensity 9300 specifications:

  • Octa-core CPU with DynamIQ
    • 4x Arm Cortex-X4 at up to 3.25GHz
    • 4x Arm Cortex-A720 at up to 2.0GHz
  • GPU – Arm Immortalis-G720 MC12
  • VPU
    • Video Encoding – H.264, HEVC
    • Video Playback – H.264, HEVC, VP9, AV1
  • AI accelerator – MediaTek APU 790 (Generative AI) accelerator supporting large language models with 1B, 7B, and 13B parameters and scalability up to 33B. This includes support for Meta Llama 2, Baichuan 2, and Baidu AI LLM.
  • Memory – LPDDR5T up to 9,600 Mbps
  • Storage – UFS 4.0 + MCQ
  • Display – 4K up to 120Hz, WQHD up to 180Hz
  • Camera
    • Up to 320MP camera
    • Video capture resolution – Up to 8K30 (7680 x 4320) or 4K60 (3840 x 2160)
  • Wireless connectivity
    • Cellular
      • Sub-6GHz (FR1), mmWave (FR2), 2G-5G multi-mode, 5G-CA, 4G-CA, 5G FDD / TDD, 4G FDD / TDD, TD-SCDMA, WCDMA, EDGE, GSM
      • Functions – 5G/4G Dual SIM Dual Active, SA & NSA modes; SA Option2, NSA Option3 / 3a / 3x, NR FR1 TDD+FDD, DSS, FR1 DL 4CC up to 300 MHz 4×4 MIMO, FR2 DL 4CC up to 400MHz, 256QAM FR1 UL 2CC 2×2 MIMO, 256QAM NR UL 2CC, R16 UL Enhancement, 256QAM VoNR / EPS fallback
    • Wi-Fi 7 (a/b/g/n/ac/ax/be) up to 6.5 Gbps with 2T2R antenna support, MediaTek Xtra Range for a longer range and Multi-Link Hotspot to triple the performance in hotspot mode against unnamed competitors
    • Bluetooth 5.4
    • GNSS – GPS, BeiDou, Glonass, Galileo, QZSS, NavIC
  • Security
    • Secure Processor, HWRoT
    • Arm Memory Tagging Extension (MTE) Technology
    • CC EAL4+ Capable, FIPS 140-3, China DRM
  • Process – 3rd generation TSMC 4nm
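
To put the LLM parameter counts listed for the APU 790 in perspective, here is a rough back-of-the-envelope sketch of the memory needed for model weights alone. The 4-bit and 8-bit weight widths are assumptions chosen for illustration (the specifications above do not state which format is used), and the helper name is hypothetical.

```python
# Rough estimate of LLM weight storage.
# Ignores activations, KV cache, and runtime overhead.
# The bit widths are illustrative assumptions, not MediaTek figures.

def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# Parameter counts from the specification list, in billions.
MODEL_SIZES_B = [1, 7, 13, 33]

if __name__ == "__main__":
    for size in MODEL_SIZES_B:
        int4 = weight_footprint_gb(size, 4)
        int8 = weight_footprint_gb(size, 8)
        print(f"{size:>2}B parameters: ~{int4:.1f} GB at 4-bit, ~{int8:.1f} GB at 8-bit")
```

Under these assumptions, a 33B model would need about 16.5 GB for its weights even at 4-bit, which may explain why 33B is described as the upper end of scalability rather than a typical target.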

MediaTek does claim power efficiency improvements for the Dimensity 9300, although those are all under load and not in standby mode, which may not matter that much in a premium smartphone that people charge every day. That may be why the company didn’t include any efficiency cores: the benefits may have been limited or simply not worth it.

The company notably highlights that the APU 790 AI processor doubles integer and floating-point performance while reducing power consumption by 45%, and thanks to tweaks to the Transformer model can deliver up to eight times the performance of the previous generation, with image generation within one second using Stable Diffusion. On the graphics front, the Dimensity 9300’s Arm Immortalis-G720 GPU provides a 40% reduction in GPU power consumption at the same level of performance as the previous generation Dimensity 9200, or an almost 46% boost in GPU performance at the same level of power consumption.
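
The percentages above can be turned into performance-per-watt ratios with a little arithmetic. The sketch below normalizes the previous generation to 1.0 for both performance and power; the function name is hypothetical, and the APU line assumes the 2x performance and 45% power reduction apply at the same operating point, which the announcement does not confirm.

```python
# Convert vendor-style claims into perf/W ratios versus the previous generation.
# Baseline performance and power are both normalized to 1.0.

def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Return new perf/W divided by old perf/W."""
    return perf_ratio / power_ratio

# GPU: same performance at 40% less power.
gpu_iso_perf = perf_per_watt_gain(1.0, 1.0 - 0.40)
# GPU: 46% more performance at the same power.
gpu_iso_power = perf_per_watt_gain(1.46, 1.0)
# APU: 2x performance with 45% less power (assumed to hold simultaneously).
apu = perf_per_watt_gain(2.0, 1.0 - 0.45)

if __name__ == "__main__":
    print(f"GPU, iso-performance: {gpu_iso_perf:.2f}x perf/W")
    print(f"GPU, iso-power:       {gpu_iso_power:.2f}x perf/W")
    print(f"APU (assumed):        {apu:.2f}x perf/W")
```

The two GPU figures differ because they are taken at different points on the power/performance curve, so neither should be read as a single efficiency number.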

MediaTek also promises efficiency improvements for the CPU, but only for multi-core workloads with up to 33% power savings in that case. The company also noted a 15% increase in single-core performance and 40% in multi-core performance. The first Dimensity 9300-powered Android smartphones are expected by the end of 2023. Further details may be found on the product page and in the press release.

18 Replies to “MediaTek drops efficiency cores in Dimensity 9300 Cortex-X4/A720 mobile SoC”

  1. MediaTek decided to do without the Cortex-A520 in the Dimensity 9300 which strikes me as odd for a mobile SoC where power efficiency is important for a long battery life.

    MediaTek does claim power efficiency improvements for the Dimensity 9300 although those are all under load, and not in standby mode which may not be that important in a premium smartphone that people charge every day. So that must be why they didn’t include any efficiency cores as the benefits may have been limited or simply not worth it.

    Reviewers will have to determine if this was a good course of action, but it’s not like A7xx cores are particularly inefficient. According to ARM, A720 cores are up to 20% more efficient than A715, and MediaTek has limited the clock speeds and used what might be a slightly better N4 node.

    As for the X4 cores, only one will clock up to 3.25 GHz, while the other three can do 2.85 GHz.

    1. [ they avoid in-order execution from A520 cores and adjust/underclock A720 cores to a top clock speed of A520 cores, ~2GHz; compared to Cortex-X4 cores, A720 cores are the ‘efficient’ cores now (and efficiency gains for already optimized cores (node size, circuits design, increased L2 L3 cache sizes) come from faster adjusting between work loads and idling instead of low consumption for longer periods and even mobile cores are towards ~4.3GHz boosted clock speeds _ Snapdragon X Elite; while fully powered X4 cores can ‘disable the clock at the top of the clock tree. The core remains fully powered and retains the state’ ‘The WFI and WFE instructions put the core into a low-power mode.’ (clock gating modes), instead of ramp up through several steps in clock speed(?) )

      A720 ‘It can be used as either “big” or “LITTLE”.’

      seems it’s not that much about absolute ‘cost efficiency’ but performance (?) ]

  2. They probably figured that there’s never anything useful to run on such cores and that for background stuff, the race to idle works as well if not better. With moderately efficient cores running at a lower frequency for less time it’s not unreasonable to achieve as good savings IMHO.

    1. Exactly – if you look at the power/performance graphs of the performance and efficiency cores, there’s only a small portion of the graph where the efficiency cores are actually more efficient, and that’s down toward the bottom of the efficiency cores’ frequency, where they take an extended amount of time to complete their work. The performance cores are actually more efficient when they are down-clocked than the efficiency cores are when they are up-clocked. So in the race to idle, when the efficiency cores would be clocked up, the performance cores would not only get the job done faster, but also with less total power consumed.
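
The race-to-idle argument above comes down to energy being power multiplied by time. The following sketch compares a down-clocked big core with an up-clocked little core over a fixed time window. All throughput and power numbers are made up for illustration and are not measurements of any real core; the class and function names are hypothetical.

```python
# Race-to-idle illustration with made-up numbers (not real core measurements).
from dataclasses import dataclass


@dataclass
class OperatingPoint:
    name: str
    throughput: float    # work units per second
    active_power: float  # watts while running
    idle_power: float    # watts while idle


def energy_joules(op: OperatingPoint, work: float, window_s: float) -> float:
    """Energy used to finish `work` units within a window, idling once done."""
    busy_s = work / op.throughput
    if busy_s > window_s:
        raise ValueError(f"{op.name} cannot finish within the window")
    return op.active_power * busy_s + op.idle_power * (window_s - busy_s)


# Hypothetical operating points.
big_downclocked = OperatingPoint("big @ low clock", 4.0, 0.8, 0.01)
little_upclocked = OperatingPoint("little @ high clock", 2.0, 0.5, 0.01)

if __name__ == "__main__":
    for op in (big_downclocked, little_upclocked):
        joules = energy_joules(op, work=2.0, window_s=1.0)
        print(f"{op.name}: {joules:.3f} J")
```

With these assumed numbers the big core draws more power while active but finishes in half the time, so it uses less total energy. Different numbers can reverse the result, which is the small region of the graph the comment refers to.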

      The efficiency cores are useful for products optimising licensing costs, not power usage.

      1. > The efficiency cores are useful for products optimising licensing costs, not power usage.

        I think it’s well summarized like this. Basically they allow a SoC vendor to advertise more cores for the same price in a market where clueless consumers believe that “8 cores is better than 4” for a smartphone without realizing they’re not operating a server…

        1. [ (Thx), while we probably cannot really compare, it would be an interesting question on to what extent an OS (including all background processes and applications) influences power consumption on a mobile phone;
          where’s the progress happening?
          peripherals power consumption, higher clock speeds and resolutions, more efficient/capable batteries, memory closer to SoC (or integrated), logic circuits node shrinking (efficiency gain)
          What’s left for cpu part on a SoC to be improved?
          Why is there no mention of the PCIe version (v3, v4, v5?)? ]

          1. What’s left to be improved? Density, parallelism, performance/power, cost.. the usual. Why is there no mention of PCIe in a mobile product announcement? Because the storage and other peripherals aren’t using it in mobile devices. When we get a mobile device with Thunderbolt 4 then we’ll see encapsulated PCIe over that port, and maybe in 2025 we’ll finally see a mobile device support the new MicroSDXpress cards, but I’m starting to wonder whether CXL might beat PCIe to mobile phone chipset marketing language.

          2. [ armv9.2 SoCs ~Mediatek
            ‘has Interface and Security IP for PCI Express 6.0 with Integrity and Data Encryption (IDE), CXL 3.0 with IDE, DDR5 with Inline Memory Encryption (IME) and UCIe, all of which are optimized for performance with ARM-specific features’

            maybe used in parallel for PCIe and CXL3.0 (based on PCIe physical layer or combined PHY); the Cortex CPU cores (CHI, sufficient bandwidth for PCIe versions?) access a PCIe controller (~AMBA5(?) AXI) through system memory (controller)

            A 2 cluster 4x Cortex-A75 (on CoreLink CMN600 Interconnect) is capable of supporting PCIe4 bandwidth.

            one example for mobile computing will be Snapdragon X Elite (maybe not on phones, yet)
            ‘NVMe SSD over PCIe Gen 4, UFS 4.0, SD v3.0’ ]

        2. Yeah when I see a selection of more application cores vs a selection of higher single-thread performance cores for the same total workload, I still shake my head. Unless it’s a cost-optimised part. Then the licensing and fab costs go down with the efficiency cores for a given power/performance target.  But – that may change when we actually start seeing memory integrated with compute… and I don’t mean like the M2. I mean DRAM wafers with compute cores in them, in mainstream products.

    1. No, it’s not a lie. Having 4 big and 1 little does make a lot of sense: anything that fits on the little one can run there. The A311D/S922X with 4xA73 and 2xA53 match this quite well. What is stupid is what we often see with few big and many little, e.g. RK3399. The total capacity of the 2 A72 is roughly equivalent to the 4 A53 there, except that you need a seriously complicated workload to make optimal use of such 6 cores at once. Even the bcm2711 in the rpi4b with 4 A72 is more suited for most workloads! (However, its I/O is pathetic and the RAM bandwidth is laughable.)

      This mostly serves marketing. How many times have we heard “with more cores you can start more applications because you know, each background application requires one core”? So SoC vendors are encouraged to provide many useless cores that will remain idle forever. But having just a few remains useful.

    2. > Big. Little has always been a lie?

      Nope since in the beginning (2011 with A7/A15, later then A53/A57) there was no sophisticated software support and only ‘clustered switching’ existed: on a SoC with an equal amount of big and little cores (4/4 or 2/2) apps only saw half the cores and the whole working set was moved between big and little cluster based on CPU utilization.

      Now we have fine-grained software control, all cores are used simultaneously (HMP) and those ‘middle’ cores are not only designed for raw power.
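
The difference between the two schemes described in this comment can be sketched as a toy model. The utilization threshold and the function name are hypothetical, and real schedulers are far more involved (hysteresis, per-task placement, and so on).

```python
# Toy model of clustered switching versus HMP; the threshold is illustrative.
from typing import List


def schedulable_cores(mode: str, big: int, little: int,
                      utilization: float, up_threshold: float = 0.6) -> List[str]:
    """Return the cores the OS can place tasks on at a given CPU utilization."""
    bigs = [f"big{i}" for i in range(big)]
    littles = [f"little{i}" for i in range(little)]
    if mode == "clustered":
        # Only one cluster is active at a time; the whole working set
        # moves between clusters based on CPU utilization.
        return bigs if utilization >= up_threshold else littles
    if mode == "hmp":
        # All cores are visible and usable simultaneously.
        return bigs + littles
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(schedulable_cores("clustered", 4, 4, utilization=0.2))
    print(schedulable_cores("clustered", 4, 4, utilization=0.9))
    print(schedulable_cores("hmp", 4, 4, utilization=0.2))
```

On a 4/4 SoC the clustered mode never exposes more than four cores, which matches the "apps only saw half the cores" remark, while the HMP mode exposes all eight.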

      1. The old big.LITTLE concept made sense insofar as the whole galore was hidden from the OS, and was just an evolution of the SpeedStep approach of the late ’90s. As soon as HMP came on board, some morons (me included) were enthusiastic about using all the silicon for a power boost. Until we realised (some of us at least) that 10 A53 cores still make no good CPU…

        1. [ yes, absolutely, one needs about 24 A53s (~5W_1GHz, (one of) longest lasting and most widespread ARM platform core(s) on armv8 ISA)
          ‘http://www.socionext.com/en/download/catalog/AD04-00111-2E.pdf’ ]

    3. Also, at the same time as CPU cores became much faster (a ‘middle’ core like the A720 is magnitudes faster at the same clock compared to early big cores like A15 or A57), the SoCs got way more complex, now containing blocks like a ‘neural engine’ and so on.

      As such, the ‘race to idle’ concept became even more important, since today many more SoC engines need to be sent to deep sleep when idle than before to keep the whole SoC’s consumption down.

    4. [ maybe it advanced to DynamIQ (optimized big.Little concept, more freely core cluster configuration/combinations options), with Little cores being more capable/performant with high clock speeds (if available) than big cores with lowered clock speeds; it became somewhat more interchangeable compared to previous Cortex core generations(?, more or less ‘For a given library of CMOS logic, active power increases as the logic switches more per second, while leakage increases with the number of transistors. So, CPUs designed to run fast are different from CPUs designed to save power.’ maybe in meantime node shrinking combined with snappy frequency adjusting somewhat outweighs these advantages (depending on usage profiles), although there’s again a successor to A520 with A530 and it seems a trade off between out-of-order advantage of designs for ‘big’ cores compared to lower transistor count for ‘Little’ cores )

      maybe OS don’t include multiple small cores well into a distributed ‘background processes’/’demanding apps startup’ distribution, yet? ]

  3. This big.LITTLE thing is all about marketing.
    Small cores are cheaper and occupy less space.
    And it is all about selling more cores!

    I have an LG TV with webOS.
    The SoC has 1 big core and 4 small cores,
    and the TV’s power management doesn’t scale frequencies up and down.
    It simply disables the big core!
    So when I run top/htop on it, I clearly see the core going off/on.
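
Besides top/htop, core hot-plugging of this kind can usually be observed on Linux through the sysfs file `/sys/devices/system/cpu/online`, which holds a range list such as `0-3,5`. The sketch below parses that format. Whether the TV in question exposes this file is an assumption, and the function names are hypothetical.

```python
# Parse the Linux "online CPUs" range list, e.g. "0-3,5".
from pathlib import Path
from typing import List


def parse_cpu_list(text: str) -> List[int]:
    """Expand a kernel CPU list string such as '0-3,5' into [0, 1, 2, 3, 5]."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def online_cpus(path: str = "/sys/devices/system/cpu/online") -> List[int]:
    """Read and parse the list of currently online CPUs."""
    return parse_cpu_list(Path(path).read_text())


if __name__ == "__main__":
    try:
        print("Online CPUs:", online_cpus())
    except OSError as exc:
        print("Could not read CPU list:", exc)
```

Polling `online_cpus()` once a second would show a CPU number disappearing and reappearing whenever the power management switches a core off and on.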
