The AMD Ryzen 9 9950X and Ryzen 9 9900X Review: Flagship Zen 5 Soars - and Stalls
by Gavin Bonshor on August 14, 2024 9:00 AM EST - Posted in:
- CPUs
- AMD
- Desktop
- Zen 5
- AM5
- Ryzen 9000
- Ryzen 9 9950X
- Ryzen 9 9900X
Core-to-Core Latency: Zen 5 Gets Weird
As the core count of modern CPUs grows, the time it takes to access one core from another is no longer constant. Even before the advent of heterogeneous SoC designs, processors built on large rings or meshes could have different latencies when accessing the nearest core compared to the furthest core. This holds especially true in multi-socket server environments.
But modern CPUs, even desktop and consumer CPUs, can have variable access latency to get to another core. For example, in the first-generation Threadripper CPUs, we had four chips on the package, each with 8 threads, and each with a different core-to-core latency depending on whether it was on-die or off-die. This gets more complex with products like Lakefield, which has two different communication buses depending on which core is talking to which.
If you are a regular reader of AnandTech’s CPU reviews, you will recognize our Core-to-Core latency test. It’s a great way to show exactly how groups of cores are laid out on the silicon. This is a custom in-house test, and we know there are competing tests out there, but we feel ours most accurately reflects how quickly an access between two cores can happen.
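As a rough illustration of how a latency matrix exposes the core layout, the Python sketch below groups cores by whether their pairwise latency falls under a threshold. This is not our in-house test: the matrix is synthetic, the 20ns and 180ns values are simply the round figures quoted in this article, and the function names and the 100ns threshold are assumptions made for the example.

```python
# Illustrative only: synthetic latencies, not measured data.
INTRA_NS = 20.0   # roughly the within-CCX figure quoted in this article
INTER_NS = 180.0  # roughly the cross-CCD figure quoted in this article


def synthetic_matrix(n_cores, group_size):
    """Build an n x n matrix where cores in the same group talk at INTRA_NS."""
    return [
        [
            0.0 if a == b
            else INTRA_NS if a // group_size == b // group_size
            else INTER_NS
            for b in range(n_cores)
        ]
        for a in range(n_cores)
    ]


def infer_groups(matrix, threshold_ns):
    """Cluster cores: two cores share a group if linked by sub-threshold latency."""
    n = len(matrix)
    group_of = [None] * n
    groups = []
    for core in range(n):
        if group_of[core] is not None:
            continue
        # Flood-fill from this core across all "fast" links.
        members, stack = [], [core]
        group_of[core] = len(groups)
        while stack:
            cur = stack.pop()
            members.append(cur)
            for other in range(n):
                if group_of[other] is None and matrix[cur][other] < threshold_ns:
                    group_of[other] = len(groups)
                    stack.append(other)
        groups.append(sorted(members))
    return groups


matrix = synthetic_matrix(n_cores=8, group_size=4)
groups = infer_groups(matrix, threshold_ns=100.0)
print(groups)
```

On real data the threshold would need to be chosen from the measured distribution, since real matrices are noisier than this two-level toy.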
Looking at the latency matrix of the Ryzen 9 9950X, we observe that the lowest latencies naturally occur between adjacent cores on the same CCX. Core pairs such as 0-1, 1-2, and 2-3 consistently show latencies in the 18.6 to 20.5 nanosecond range. This is indicative of the fast L3 cache shared within the CCX, which ensures rapid communication between cores on the same complex.
Compared to the Ryzen 9 7950X, we are seeing a slight increase in latencies within a single CCX. The SMT "advantage", where two logical cores sharing a single physical core have a lower latency, appears to be gone. Instead, latencies are consistently around 20ns from any logical core to any other logical core within a single CCX. That average is slightly up from 18ns on the 7950X, though it's not clear what the chief contributing factor is.
More significant, and more worrying, are the inter-CCD latencies: the latency to go from a core on one CCD to a core on the other CCD. AMD's multi-CCD Ryzen designs have always taken a penalty here, as communicating between different CCDs means taking a long trek through AMD's Infinity Fabric to the IOD and back out to the other CCD. But the inter-CCD latencies are much higher here than we were expecting.
For reference, on the Ryzen 9 7950X, going to another CCD takes around 76ns. But on the Ryzen 9 9950X, we're seeing an average latency of 180ns, over twice the cost of the previous generation of Ryzen. Making this all the more confusing, Granite Ridge (desktop Ryzen 9000) reuses the same IOD and Infinity Fabric configuration as Raphael (Ryzen 7000); all AMD has done is swap out the Zen 4 CCDs for Zen 5 CCDs. So by all expectations, we should not be seeing significantly higher inter-CCD latency here.
Our current working theory is that this is a side effect of AMD's core parking changes for Ryzen 9000: cores are being aggressively put to sleep and, as a result, it's taking an extra 100ns to wake them up. If that is correct, then our core-to-core latency test is just about the worst-case scenario for that strategy, as it sends data between cores in short bursts rather than running a sustained workload that keeps the cores awake over the long haul.
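For a quick sanity check on the size of that gap, the arithmetic can be worked through in a few lines of Python. The inputs are only the approximate averages quoted above (76ns and 180ns), and the variable names are ours for illustration.

```python
# Approximate inter-CCD averages quoted in the text, in nanoseconds.
inter_ccd_7950x_ns = 76
inter_ccd_9950x_ns = 180

# Relative cost and absolute penalty of crossing CCDs, gen over gen.
ratio = inter_ccd_9950x_ns / inter_ccd_7950x_ns
extra_ns = inter_ccd_9950x_ns - inter_ccd_7950x_ns

print(f"ratio: {ratio:.2f}x, extra: {extra_ns} ns")
# prints "ratio: 2.37x, extra: 104 ns"
```

That works out to a little over 100ns of added latency per cross-CCD access.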
At this point, we're running some additional tests on the 9950X without AMD's PPM provisioning driver installed, to see if that's having an impact. Otherwise, these high latencies, if accurate for all workloads, would represent a significant problem for multi-threaded workloads that straddle the Infinity Fabric.
123 Comments
Silver5urfer - Saturday, August 24, 2024 - link
So looking at the response, seems like a Windows OS dependent update. That is ok but it's not going to save the Zen 5 flaws. Plus more over what about Windows 10 ? I honestly have no idea why the damn reviewers do not even care as if Win10 vanished. That OS is far more robust over the garbage Win11 which did regression in CPU performance due (VBS) to some Kernel level changes and the Shell32 / Win32 downgrades plus explorer.exe downgrades AND QA went into sewage. Microsoft is already pathetic in Windows ever since they sacked Terry, Chief of Windows for 20+ years and they dissolved Windows dept exclusivity to some Cloud department and that Panos Panay ruined whatever left of it and left the company. The mismanagement at Microsoft is astounding and these HW companies still lick the boot of the company so badly so do the dummy users who are braindead.
GeoffreyA - Saturday, August 24, 2024 - link
I think AMD's tone in that blog isn't right. Sure, mistakes are made: that's no problem. Apologise and fix it. Here, they've spun it in such a way that no fault lies with them. AMD of early Zen would apologise and take accountability. Seems they're changing as their coffers get loaded.
Regarding Windows 10, who knows if it'll get the update. Maybe not. Windows 10, in itself, works well. I've never had an issue in the five years of using it. But it is slowly on the way out. There are no more feature updates, it's stuck at Build 19045, and all the development effort is going into 11.
Oxford Guy - Saturday, August 24, 2024 - link
AMD is no better than Intel, which is to be expected since it's a corporation.
Remember the FX 9000 series (leaky, too demanding for most AM3+ boards, and massively overpriced on the basis of the deceptive 8-core claim*)? The Radeon VII (unnecessarily small die clocked way too high)?
*Deceptive, not because it didn't have 8 cores but mainly because its cores were weak, even without considering that there were only 4 FPU units, although that definitely did not help.
These companies will milk people for all they can, like Intel is with the ongoing scam vis-à-vis its time bomb high-voltage CPUs.
Oxford Guy - Saturday, August 24, 2024 - link
Another favourite AMD anecdote is how Su gave a presentation in which she unveiled the roadmap for Polaris and Vega. Oh, Vega will come out shortly. But, it did not. Instead, AMD milked customers with its "Polaris Forever" campaign. When Vega did arrive it had inadequate cooling and the same IPC as Fury X. All that time waiting for a part that only had a higher clock (thanks to the process shrink) and more VRAM to provide the illusion of significant progress.
GeoffreyA - Sunday, August 25, 2024 - link
In the Bulldozer era, AMD was well known for promising and not delivering, and even when there were improvements, such as across Bulldozer's four iterations, one was sceptical, and the Radeon story was little better. From Zen, they fulfilled their promises or over-delivered. Slowly, they earned our trust, and one could believe the general picture of AMD's marketing. Making mistakes, they took accountability and fixed it. So, it was somewhat surprising to see the tone of that blog, when there are clear issues with Zen 5. If they go on like this, the trust they've earned will be lost. As a plus, they've become greedier since the early days of Zen.
jcc5169 - Sunday, August 25, 2024 - link
What's interesting is the difference between performance on Windows and Linux.
jcc5169 - Sunday, August 25, 2024 - link
Described here ... https://videocardz.com/newz/amd-promises-windows-1...
cryosx - Monday, August 26, 2024 - link
Might need to retest with 24h2 windows update, cause gaming is seeing an average 10% uplift for both zen4 and zen5
evanh - Monday, August 26, 2024 - link
And apparently Intel parts too. That now begs the question as to what M$ has done to Windoze to allow branch prediction to magically get better?
GeoffreyA - Tuesday, August 27, 2024 - link
Possibly, it's some mitigation related to speculative execution or indirect branch prediction being enabled on these CPUs, slowing down performance. As can be seen, there are many settings in Win32 concerning speculative execution.
https://learn.microsoft.com/en-us/windows/win32/ap...