MediaTek Announces Dimensity 9000: Supercharged Flagship SoC on 4nm
by Andrei Frumusanu on November 18, 2021 6:00 PM EST
Over the last few years, MediaTek has generally been regarded as the “other” SoC vendor in the mobile industry, with most media and consumer attention going to the flagship SoC products of Apple, Qualcomm, Samsung and HiSilicon. Indeed, MediaTek’s last attempt at a true flagship SoC came several years ago with the Helio X20 and X30; those chips saw very little success in the market, and the company refocused on the mid-range and “premium” segments.
Today, MediaTek is looking to change this positioning. The Taiwanese vendor has seen new-found success in the market, with a fantastic 2020 and 2021 in which it is now able to claim the #1 spot with 40% market share, as well as a growing 28% share of the 5G SoC market. The company is now also aiming for recognition and a leadership position in the flagship SoC market, and this is where the new Dimensity 9000 comes in.
The Dimensity 9000 is MediaTek’s latest effort at creating a no-compromise flagship SoC, with the designers throwing everything but the kitchen sink at it in terms of specifications. It represents a lot of industry firsts: the first Armv9 SoC with Cortex-X2, Cortex-A710 and Cortex-A510 CPUs, a new Mali-G710 GPU, the first LPDDR5X-compatible SoC, astounding camera ISP claims, and the first publicly announced TSMC N4 silicon design in the industry. The list of features and capabilities is extensive, and today’s announcement definitely represents MediaTek’s largest effort in years.
Starting off with the process node, MediaTek is able to claim an industry first, with the Dimensity 9000 being the world’s first TSMC N4 chip. Over the past few years, we’ve become accustomed to either Apple or HiSilicon being the very first customers on TSMC’s latest leading-edge nodes. With HiSilicon cut off from TSMC, that left Apple as the obvious lead partner for TSMC’s new-generation process nodes. However, the timing didn’t work out for the A15, as the N4 node simply wasn’t ready yet. With Qualcomm currently tied to Samsung Foundry for its flagships (arguably without great success), this left a vacuum where HiSilicon used to be, which MediaTek is now looking to fill. In fact, I think this is the first time the company has truly been on a leading-edge node since the 20nm days.
TSMC’s N4 node is supposed to be a small optical shrink of the N5 node, resulting in 6% more density, with similar single-digit improvements in performance and efficiency. TSMC had announced that risk production for N4 would start in 3Q21, and with the Dimensity 9000 planned to hit commercial devices in 1Q22, the chip is likely the lead product for the process node.
New MediaTek Flagship SoC 2022
SoC | Dimensity 9000 |
CPU | 1x Cortex-X2 @ 3.05GHz, 1x 1024KB pL2; 3x Cortex-A710 @ 2.85GHz, 3x 512KB pL2; 4x Cortex-A510 @ 1.80GHz, 4x 256KB pL2; 8MB sL3 |
GPU | Mali-G710MP10 @ ~850MHz |
Memory Controller | 4x 16-bit CH; @ 3200MHz LPDDR5 / 51.2GB/s; @ 3750MHz LPDDR5X / 60.0GB/s; 6MB System Cache |
ISP | Imagiq790; New-gen Triple 18-bit ISP; 9GPix/s processing throughput; Single Sensor up to 320MP; Triple Sensor 32+32+32MP |
NPU | 5th Gen 4+2 core APU |
Media | 8K30 & 4K120 encode & 8K60 decode; H.265/HEVC, H.264, VP9; 8K30 AV1 Decode |
Modem | (LTE Category 24/18); (5G NR Sub-6) |
Mfc. Process | TSMC N4 |
There’s a lot to talk about with the Dimensity 9000. Since MediaTek advertises it as the first Armv9 SoC, let’s start off with the CPU configuration and the various IPs employed here.
No-Compromise CPU Setup
Being an Armv9 SoC means that the company is refreshing all the CPU IPs, employing the new Cortex-X2, Cortex-A710, and Cortex-A510 IPs from Arm. We covered the new-generation CPUs extensively earlier this year, so be sure to read up on those articles.
The Dimensity 9000 goes with a 1+3+4 CPU setup that has seen popularity in the market ever since Qualcomm first adopted it in the Snapdragon 855. For the performance core, MediaTek uses the new Cortex-X2, equipping it with the full 1MB of L2 cache and clocking it at up to 3.05GHz. The clock frequency is higher than what we’re seeing today from X1-based designs such as the Snapdragon 888 or the Exynos 2100, at 2.84 and 2.9GHz respectively, but those competing SoCs were also on an inferior Samsung 5LPE process node. We don’t yet know exactly where the next-gen Snapdragon and Exynos chips will end up in terms of clocks, but I think it’s unlikely they will exceed the 3GHz mark, leaving the new Dimensity 9000 with a likely frequency advantage, and thus also a likely single-threaded performance leadership position amongst the Android SoC vendors.
MediaTek quotes a +35% performance leap over current-generation Android flagship chips, which we assume means the Snapdragon 888, and also states that efficiency is +37% better. This would mean that peak absolute power levels for the Dimensity 9000’s X2 core would be similar to what we’re seeing from the X1 cores in a Snapdragon 888 today, which generally is a good position to be in, and the figures generally line up with what we expect from the IPC and process node differences between the designs.
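The "similar peak power" reading follows from simple arithmetic, sketched below. It assumes that "efficiency" means performance per watt and that both percentages refer to the same operating point, neither of which MediaTek's materials confirm; the function name is purely illustrative.

```python
def implied_power_ratio(perf_gain: float, efficiency_gain: float) -> float:
    """Relative power draw implied by relative performance and perf-per-watt gains.

    Assumption: efficiency is performance per watt, so
    power = performance / efficiency.
    """
    return (1 + perf_gain) / (1 + efficiency_gain)


# CPU claims quoted above: +35% performance, +37% efficiency.
cpu_ratio = implied_power_ratio(0.35, 0.37)
print(f"implied CPU power vs. baseline: {cpu_ratio:.3f}x")  # about 0.985x
```

A ratio just under 1.0 means the new core would draw roughly the same peak power as the baseline it is compared against.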
MediaTek did note that the performance leap in more memory-bound workloads is much higher than in more core-local workloads: for example, SPECint2006 sees a +35% increase, while GeekBench 5 will see only a +10.5% increase over the competition. This generally also lines up with our understanding of the Cortex-X2, pointing to low IPC improvements in anything that doesn’t take advantage of the increased caches of the CPU cluster.
The middle cores of the Dimensity 9000 are 3x Cortex-A710 cores, equipped with 512KB L2’s, and clocked up to 2.85GHz. In this regard, MediaTek’s approach here is more similar to the Exynos 2100 in that it’s using quite high frequency mid-cores, in contrast to the lower 2.4GHz design point Qualcomm employs.
Alongside the middle cores, we also see the new Cortex-A510 little cores, and here MediaTek is doing things quite differently compared to what we expected from the first iterations of the IP. Instead of using Arm’s new “merged-core” approach, where a Cortex-A510 complex can consist of two cores sharing a SIMD/FP pipeline as well as a shared L2, MediaTek completely ignores this design aspect of the IP and instead goes the traditional route of only using one core per complex, with each core thus having its own SIMD/FP pipeline and private L2 cache. The cache here comes in at 256KB, which is quite large, though short of the 512KB maximum. In effect, what MediaTek has done here is to configure the A510 cores with a near-maximum performance setup. While we still have our reservations about the cores, it’s good to see MediaTek not skimping on the new designs.
Due to the strongly configured middle cores, as well as the well-equipped little cores, the multi-core performance of the Dimensity 9000 is advertised as well exceeding the current Android competition, and falling in line with what Apple achieves on the A15.
At the cluster level, MediaTek also equips the DSU with 8MB of L3 – this is likely the new generation DSU-110 as well.
On the CPU side, the Dimensity 9000 is essentially configured in the most optimal way – MediaTek went all-out in terms of frequencies and caches, and it’s generally hard to imagine a more performant configuration than what the chip is currently set up with, at least in the context of Arm Cortex CPU IP.
First LPDDR5X, Large System Cache
Another world first for the Dimensity 9000 is that it’s the first chip announced to be compatible with LPDDR5X. The standard had only been published by JEDEC in July of this year, so the fact that the chip already supports it means that MediaTek was working off a draft and should be fully compatible with the new standard. While the full standard is advertised to go up to 8533Mbps, the chip here limits itself to 7500Mbps, which means +17% bandwidth compared to current-generation LPDDR5-6400 solutions. Still, I hadn’t expected LP5X SoCs until late next year, so this was definitely a surprise. Naturally, the memory controller still fully supports LPDDR5 at up to 6400Mbps in case a vendor chooses to employ different memory modules.
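The bandwidth figures in the spec table can be reproduced from the bus width and the per-pin data rate. The sketch below assumes the 4x 16-bit channel layout listed in the table and decimal gigabytes (1GB/s = 1000MB/s); the helper name is made up for illustration.

```python
def peak_bandwidth_gbs(data_rate_mbps: float,
                       channels: int = 4,
                       channel_width_bits: int = 16) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal)."""
    bus_width_bytes = channels * channel_width_bits / 8  # 64-bit bus -> 8 bytes
    return data_rate_mbps * bus_width_bytes / 1000       # MB/s -> GB/s


lpddr5 = peak_bandwidth_gbs(6400)   # LPDDR5-6400
lpddr5x = peak_bandwidth_gbs(7500)  # LPDDR5X at the chip's 7500Mbps limit
uplift = lpddr5x / lpddr5 - 1

print(f"LPDDR5:  {lpddr5:.1f} GB/s")   # 51.2 GB/s
print(f"LPDDR5X: {lpddr5x:.1f} GB/s")  # 60.0 GB/s
print(f"uplift:  {uplift:.1%}")        # 17.2%
```

These are theoretical peaks; achievable bandwidth in practice is lower and depends on the memory controller and access patterns.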
The Dimensity 9000 is also MediaTek’s first SoC to employ a system cache, at 6MB. During the briefing, MediaTek noted that larger caches and SoC designs with system caches are definitely the way forward, and are where everybody will be aiming in the future. System level caches, or SLCs as we like to abbreviate them, are able to amplify the performance of SoC blocks other than just the CPU, as well as reduce memory traffic to DRAM, which also benefits power efficiency.
GPU: Mali-G710MP10
On the GPU side of things, the MediaTek Dimensity 9000 is also the first SoC to see the deployment of the new Mali-G710 GPU. Earlier this year when we talked about the IP we had mentioned that MediaTek was the only remaining vendor that was expected to release an SoC with a larger Mali GPU implementation, given HiSilicon’s troubles and Samsung’s adoption of AMD RDNA GPUs.
The configuration on the Dimensity 9000 is a 10-core implementation. We have to remember here that in terms of per-core performance, one new G710 core is roughly equivalent to two G78 cores, so in terms of size and performance the new chip’s GPU is roughly comparable to the Google Tensor’s G78MP20 GPU, plus maybe an expected 20% performance boost due to generational IP improvements. MediaTek noted the peak frequencies to be at around 850MHz (exact clock to be confirmed).
In terms of performance figures, the company’s materials advertised +35% versus the current Android flagships, with efficiency being +60% better. All of this year’s flagships had been rather disappointing in gaming efficiency, and we saw absolute power figures reaching 7.5-9W on the leading Exynos, Tensor and Snapdragon chips. MediaTek noting that their efficiency advantage is significantly larger than their performance leap also suggests they’re using lower peak power levels than what we see today, which is definitely a welcome change.
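To put a rough number on "lower peak power levels", the quoted GPU claims can be applied to the 7.5-9W range measured on this year's chips. This is only a back-of-the-envelope sketch: it assumes efficiency means performance per watt and that MediaTek's percentages were taken against a baseline in that power range at peak, which the materials do not state. The function name is hypothetical.

```python
def implied_power_range(power_range_w, perf_gain, efficiency_gain):
    """Scale a (low, high) power range by relative perf / relative perf-per-watt."""
    ratio = (1 + perf_gain) / (1 + efficiency_gain)
    low, high = power_range_w
    return ratio, (low * ratio, high * ratio)


# GPU claims quoted above: +35% performance, +60% efficiency,
# applied to the 7.5-9W peak figures seen on current flagships.
ratio, (low_w, high_w) = implied_power_range((7.5, 9.0),
                                             perf_gain=0.35,
                                             efficiency_gain=0.60)
print(f"power ratio: {ratio:.2f}")
print(f"implied peak power: {low_w:.1f}-{high_w:.1f} W")
```

Under these assumptions the ratio comes out at roughly 0.84, i.e. a peak somewhere in the 6-8W region rather than 7.5-9W; actual figures will have to wait for measurements on real devices.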
The company makes note of Ray Tracing capability, but this is simply a software API implementation rather than hardware, as the G710 doesn’t yet support this.
MediaTek had a slide showcasing longer term performance versus an iPhone 13 with the A15, with the Dimensity 9000 being able to slightly exceed the performance of the iPhone. We saw that the new iPhones throttle to around 3-3.5W, and that under cellular conditions the phones are reported to perform even worse due to the bad thermals. MediaTek notes the comparison is made under a similar thermal budget, so hopefully the comparison is valid here. It's to be noted, as we wrote in our A15 review, comparing real-world games such as Genshin Impact for GPU analysis isn’t great as the game always runs at different internal resolutions or detail levels, especially between Android and iOS.
That being said, MediaTek’s efficiency claims for the GPU do position it extremely well, and would likely allow it to effectively compete against the upcoming Snapdragon and Exynos chips which are still projected to arrive on less efficient process nodes.
Low-Power Leadership Claims
An interesting claim from MediaTek is that they are achieving low-power leadership, thanks to the new TSMC N4 node as well as the smart power management designed into both the SoC and the platform.
The above figures are comparisons of total platform power, excluding power supplied to the display panel. This means we’re seeing a power comparison of the SoC, DRAM, PMICs, cellular RF and Wi-Fi systems: essentially the “platform” components which the SoC vendors are generally responsible for and which they bundle their offerings with.
Notable figures here are the media playback and recording numbers, where the Dimensity 9000 is said to have much lower power consumption than the competition. Gaming power is also said to be lower, but this is to be expected given the GPU efficiency and lower power claims.
The one data point I find most interesting is the home idle power. One of the hardest things to achieve in a silicon design is doing nothing in an efficient manner. Idle actually represents a large percentage of energy consumption and affects the baseline power of a device, and thus your everyday battery life. Coming in 20% below the competition here is quite respectable.
150 Comments
ChrisGX - Thursday, November 25, 2021 - link
The two statements that you take as contradicting one another are actually consistent - they refer to different matters. For an implementation of the G710 on the same silicon process as the predecessor G78 (TSMC N5) if power is held constant the Mali-G710 will offer a 20% performance improvement. However, the structure of the Mali cores change with the G710. You are only going to need roughly half of the number of cores to achieve equivalent performance to the earlier generation G78. The cores of the G710 are faster, bigger and more power hungry but in a sensible GPU configuration in an ARM SoC they offer a 20% energy efficiency boost in an ISO Process implementation.
If the Dimensity 9000 is anything to go by it looks like ARM premium SoC manufacturers will be looking for big gains in GPU performance (much more than 20%) and they will be willing to pay a power consumption price to get it. So, with ARM SoCs in flagship Android phones already operating at the power limit for such devices power savings will have to be made elsewhere if sustainable performance is to see solid gains in 2022 phones.
tobichii6 - Friday, November 19, 2021 - link
I will not belive until they be friendly with open-source community as GNU GPL v2. Look at Qualcomm, they're still superior, of course their hardware decoder like AV1 Is I think still behind (update on me for this) but they can still whip your butt because of their soon snapdragon cpu. This is just talking of new architecture cpu meanwhile does not represent average software update, and plus it's a pain in the arse to mod a MediaTek device. And that's not funny at all. Because devices tend to go old, We need a source code so our devices would last long.
Change your policy about the source code to getting open then we're talking (all android chipset in mtk family must be open)
dwillmore - Friday, November 19, 2021 - link
Please don't use SLC for System Level Cache as SLC is already used for flash memories. It's going to cause confusion.
FunBunny2 - Friday, November 19, 2021 - link
"SLC is already used for flash memories"
SLC was used for flash memories
it's been a while since there was any SLC on sale anywhere.
Wereweeb - Friday, November 19, 2021 - link
There's plenty in servers, and almost all SSD's use their cells in pSLC mode for caching. The FuzeDrive uses it for tiering.
mode_13h - Saturday, November 20, 2021 - link
I think it's the nomenclature ARM is using, though. It's reasonable to have a distinct term for cache that's shared among all the CPU cores *and* accelerator blocks, so I don't mind them not simply calling it L3 or LLC.
Boss Win - Friday, November 19, 2021 - link
Too many Qualcomm employee attacking a chinese city taipei company MediaTek.
Biggest excuse is communism stole tech. While capitalism stole Hawaii, California, New Mexico, and the whole USA.
And yeah, happy thanksgiving.. where people saved you from your winter deaths only to be killed by you after.
Stupid white men who want to rape 13 year old Pocahontas.
Raqia - Friday, November 19, 2021 - link
Their issue has always been software support which lags other vendors:
https://www.androidauthority.com/qualcomm-mediatek...
Not to mention driver cheats. The part does look like the ARM presentation's '22 reference high end implementation and good on paper so hopefully it spurs more competition and innovation. HBM stacks on phones maybe? Split, mammoth sized CPU and GPU chiplets?
mode_13h - Saturday, November 20, 2021 - link
Cost is probably a big issue with both of those. We're likely to see stacked memory in laptops first (like what Apple did), before it ever reaches phones. I doubt if HBM can be power-efficient enough, which is probably why Apple went with LPDDR5, in the M1 Pro and Max.
And a dedicated GPU chiplet in phones? Doesn't seem likely. Maybe if yields of even newer processes are poor.
wjohnas - Saturday, November 20, 2021 - link
Samsung's Exynows 2100 already supports 8K@60fps AV1 Decoding, I don't know what Mediatek talking about