A New Architecture

This is a first. Usually when we go into these performance previews we’re aware of the architecture we’re reviewing; all we’re missing are the intimate details of how well it performs. That was the case for Conroe, Nehalem and Lynnfield (we sat Westmere out until final hardware was ready). Sandy Bridge is a different story entirely.

Here’s what we do know.

Sandy Bridge is a 32nm CPU with an on-die GPU. While Clarkdale/Arrandale pair a 45nm GPU with the CPU on the same package, Sandy Bridge moves the GPU transistors onto the CPU die itself. Not only is the GPU on die, it also shares the CPU’s L3 cache.

There are two different GPU configurations, referred to internally as 1 core or 2 cores. A single GPU core in this case is made up of 6 EUs, Intel’s equivalent of a graphics shader processor (NVIDIA would call them CUDA cores). Sandy Bridge will be offered in configurations with 6 or 12 EUs.

While those numbers may not sound like much, the Sandy Bridge GPU is significantly redesigned compared to what’s shipping today. Intel has already announced a ~2x performance improvement over Clarkdale/Arrandale, and after testing Sandy Bridge I can say Intel has achieved at least that.

Both the CPU and GPU on SB will be able to turbo independently of one another. If you’re playing a game that uses more GPU than CPU, the CPU may run at stock speed (or lower) and the GPU can use the additional thermal headroom to clock up. The same applies in reverse if you’re running something computationally intensive.
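
To make that sharing concrete, here’s a toy model in C. This is my own illustration of the idea, not Intel’s actual turbo algorithm, and every number in it is made up: one package power budget, with unused headroom handed to whichever side is asking for more.

    /* Toy model of shared-budget turbo (not Intel's actual algorithm).
     * One package power budget; whichever domain (CPU or GPU) is busier
     * picks up the headroom the other one isn't using. */
    #include <stdio.h>

    struct split { double cpu_w, gpu_w; };

    static struct split share_budget(double package_w,
                                     double cpu_demand_w,
                                     double gpu_demand_w)
    {
        struct split s = { cpu_demand_w, gpu_demand_w };
        double spare = package_w - cpu_demand_w - gpu_demand_w;
        if (spare > 0) {
            /* Unused headroom goes to the domain asking for more power,
             * letting it clock above its nominal frequency. */
            if (gpu_demand_w > cpu_demand_w) s.gpu_w += spare;
            else                             s.cpu_w += spare;
        } else {
            /* Over budget: scale both back proportionally. */
            double scale = package_w / (cpu_demand_w + gpu_demand_w);
            s.cpu_w *= scale;
            s.gpu_w *= scale;
        }
        return s;
    }

    int main(void)
    {
        /* GPU-heavy game: the CPU only wants 12W of a 45W budget, the GPU
         * wants 25W, so the leftover 8W lets the GPU turbo higher. */
        struct split s = share_budget(45.0, 12.0, 25.0);
        printf("CPU gets %.1fW, GPU gets %.1fW\n", s.cpu_w, s.gpu_w);
        return 0;
    }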

On the CPU side little is known about the execution pipeline. Sandy Bridge adds support for AVX instructions, just like Bulldozer. The CPU will also have dedicated video transcoding hardware to fend off advances by GPUs in the transcoding space.

Caches remain mostly unchanged. The L1 cache is still 64KB (32KB instruction + 32KB data) and the L2 is still a low-latency 256KB. I measured their latencies as unchanged at 4 and 10 cycles, respectively. The L3 cache has changed, however.

Only the Core i7 2600 has an 8MB L3 cache; the 2400 and 2500 have a 6MB L3 and the 2100 has a 3MB L3. L3 size should matter more with Sandy Bridge because the cache is shared by the GPU whenever the integrated graphics is active. I’m a bit puzzled why Intel strayed from the 2MB of L3 per core that Nehalem’s lead architect was so committed to. I guess I’ll find out more from him at IDF :)

The other change appears to be either L3 cache latency or prefetcher aggressiveness, or both. Although most third-party tools don’t accurately measure L3 latency, they can usually give you a rough idea of latency changes between similar architectures. In this case I turned to cachemem, which reported Sandy Bridge’s L3 latency as 26 cycles, down from ~35 on Lynnfield (Lynnfield’s actual L3 latency is 42 clocks).

As I mentioned before, I’m not sure whether this is the result of a lower latency L3 cache or more aggressive prefetchers, or both. I had limited time with the system and was unfortunately unable to do much more.
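
A common way tools estimate numbers like these is to chase a chain of dependent pointers laid out as a random cycle through buffers of increasing size and time the average load. Below is a minimal sketch of that technique in C; the 64-byte line size, buffer sizes and iteration count are illustrative assumptions, and production tools are far more careful about defeating prefetchers and controlling TLB effects.

    /* Minimal pointer-chasing latency sketch (illustrative only).
     * Compile with: gcc -O2 chase.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define LINE 64                      /* assumed cache line size in bytes */

    /* Average nanoseconds per dependent load while chasing a random cycle
     * of cache lines inside a buffer of 'bytes' bytes. */
    static double chase_ns(size_t bytes, size_t iters)
    {
        size_t n = bytes / LINE, i, idx = 0;
        size_t *next = malloc(n * sizeof *next);
        char *buf = malloc(n * LINE);

        /* Sattolo's algorithm: a random permutation that is one single cycle,
         * so the chase touches every line and defeats simple stride prefetch. */
        for (i = 0; i < n; i++) next[i] = i;
        for (i = n - 1; i > 0; i--) {
            size_t j = rand() % i, t = next[i];
            next[i] = next[j];
            next[j] = t;
        }
        /* Store each line's successor index at the start of the line itself. */
        for (i = 0; i < n; i++)
            *(size_t *)(buf + i * LINE) = next[i];

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < iters; i++)                 /* each load depends on */
            idx = *(size_t *)(buf + idx * LINE);    /* the previous one     */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        volatile size_t sink = idx; (void)sink;     /* keep the loop alive */
        free(next); free(buf);
        return ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / iters;
    }

    int main(void)
    {
        /* Sizes straddling a 32KB L1, 256KB L2 and a multi-MB L3; multiply
         * ns/load by the core clock in GHz for an approximate cycle count. */
        size_t sizes[] = { 16 << 10, 128 << 10, 1 << 20, 8 << 20 };
        for (int s = 0; s < 4; s++)
            printf("%7zu KB: %.2f ns/load\n", sizes[s] >> 10,
                   chase_ns(sizes[s], 10000000));
        return 0;
    }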

And that’s about it. I can fit everything I know about Sandy Bridge onto a single page and even then it’s not telling us much. We’ll certainly find out more at IDF next month. What I will say is this: Sandy Bridge is not a minor update. As you’ll soon see, the performance improvements the CPU will offer across the board will make most anyone want to upgrade.

Comments

  • iwodo - Sunday, August 29, 2010 - link

    The GPU is on the same die, so depending on what you mean by a true "Fusion" product, it is, by AMD's definition (the creator of the term "Fusion"), a fusion product.
  • iwodo - Sunday, August 29, 2010 - link

    You get about a 10% IPC improvement on average. It varies widely, from 5% to ~30%, clock for clock.

    None of these tests were coded with AVX. I am not sure if you need to recompile to take advantage of the additional width for faster SSE code. (I am thinking such a change in instruction coding should require one.) AVX should offer some more improvement in many areas.

    So much performance is here with even lower peak power usage. If you factor in Turbo Mode, Sandy Bridge actually gives you a huge boost in performance per watt!

    So I don't understand why people are complaining.
  • yuhong - Sunday, August 29, 2010 - link

    Yes, AVX requires software changes, as well as OS support for using XSAVE to save the AVX state.
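
    For illustration, here is a minimal sketch in C of what that runtime check looks like before running 256-bit code. The CPUID/XGETBV bits are architectural, but the program itself is just an assumed example, not any particular library's API:

        /* Illustrative only: check that both the CPU and the OS support AVX
         * before executing 256-bit code. Compile with: gcc -mavx avx_check.c */
        #include <stdio.h>
        #include <cpuid.h>        /* __get_cpuid (GCC/Clang) */
        #include <immintrin.h>    /* _mm256_* intrinsics */

        static unsigned long long xgetbv0(void)
        {
            unsigned int lo, hi;
            /* XGETBV with ECX=0 reads XCR0: which register state the OS saves. */
            __asm__ volatile ("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
            return ((unsigned long long)hi << 32) | lo;
        }

        static int os_and_cpu_support_avx(void)
        {
            unsigned int eax, ebx, ecx, edx;
            if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
                return 0;
            int osxsave = (ecx >> 27) & 1;   /* OS has enabled XSAVE/XGETBV     */
            int avx     = (ecx >> 28) & 1;   /* CPU implements AVX instructions */
            if (!osxsave || !avx)
                return 0;
            /* XCR0 bits 1 and 2: OS saves both SSE and AVX (upper 128b) state. */
            return (xgetbv0() & 0x6) == 0x6;
        }

        int main(void)
        {
            if (os_and_cpu_support_avx()) {
                __m256 a = _mm256_set1_ps(1.0f);
                __m256 b = _mm256_set1_ps(2.0f);
                __m256 c = _mm256_add_ps(a, b);   /* one 256-bit wide add */
                float out[8];
                _mm256_storeu_ps(out, c);
                printf("AVX path: %f\n", out[0]);
            } else {
                printf("No AVX (or the OS doesn't save AVX state); use the SSE path.\n");
            }
            return 0;
        }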
  • BD2003 - Sunday, August 29, 2010 - link

    It sounds like Intel has a home run here. At least for my needs. Right now I'm running entirely on Core 2 chips, but I can definitely find a use for all of these.

    For my main/gaming desktop, the quad-core i5s seem like they'll be the first upgrade that's big enough to move me away from my E6300 from 4 years ago.

    For my HTPC, the integrated graphics seems like it's getting to the point where I can move past my E2180 + 9400 IGP. I need at least some 3D graphics, and the current i3/i5 don't cut it. Even lower power consumption + a faster CPU, all in a presumably smaller package - win.

    For my home server, I'd love to put the lowest-end i3 in there for great idle power consumption, but with the speed to make things happen when it needs to. I'd been contemplating throwing in a quad core, but if the on-die video transcoding engine is legitimate there will be no need for that.

    That's still my main unanswered question: what's the deal with the video encoder/transcoder? Does it require explicit software support, or is it compatible with anything that's already out there? I'm mainly interested in real-time streaming over LAN/internet to devices such as an iPad or even a laptop - if it can put out good-quality 720-1080p H.264 at decent bitrates in real time, especially on a low-end chip, I'll be absolutely blown away. Any more info on this?
  • _Q_ - Sunday, August 29, 2010 - link

    I do understand some of the complaints, but Intel is running a business, so they do what is in their best interest.

    Yet, concerning USB 3, it seems too much of a disservice to customers that it isn't built in, without needing any third-party add-on chip!

    I think it is shameful of them to delay this further just so they can get their LightPeak thing onto the market. I read nothing about it in this review, so I wonder: when will even that arrive?!

    I can only hope AMD does support it (I haven't read about it) and that they start gaining more market share; maybe that will show these near-sighted Intel guys.
  • tatertot - Sunday, August 29, 2010 - link

    Lightpeak would be chipset functionality, at least at first.

    Also, Lightpeak is not a protocol; it's protocol-agnostic, and can in fact carry USB 3.0.

    But, rant away if you want...
  • Guimar - Sunday, August 29, 2010 - link

    Really need one
  • Triple Omega - Sunday, August 29, 2010 - link

    I'm really interested to see how Intel is going to price the higher of these new CPU's, as there are several hurdles:

    1) The non-K's are going up against highly overclockable 1366 and 1156 parts. So pricing the K-models too high could mean trouble.

    2) The LGA-1356 platform housing the new consumer high-end (LGA-2011 will be server-only) will also arrive later in 2011. Since those are expected to have up to 8 cores, pricing the higher 1155 CPUs too high will force a massive price drop when 1356 arrives. (Or the P67 platform will collapse.) And 1366 has shown that such a high-end platform needs the equivalent of an i7 920 to be successful. So pricing the 2600K @ $500 seems impossible. Even $300 would not leave room for a $300 1356 part, as that will, with 6-8 cores, easily outperform the 2600K.

    It will also be quite interesting to see how those overclocking limits develop when 1356 comes out, as imposing limits there too could make the entire platform fail (an OCed 2600K better than a 6-core 1356 CPU, for example). And of course there's AMD's response to all this. Will they profit from Intel's overclocking limits? Will they grab back some of the high-end? Will they force Intel to change their pricing on 1155/1356?

    @Anand:

    It would be nice to see another PCIe 2.0 x8 SLI/CF bottleneck test with the new HD 6xxx series when the time comes. I'm interested to see if the GPUs will catch up with Intel's limited platform choice.
  • thewhat - Sunday, August 29, 2010 - link

    I'm disappointed that you didn't test it against 1366 quads. The triple channel memory and a more powerful platform in general have a significant advantage over 1156, so a lot of us are looking at those CPUs. Especially since the i7 950 is about to have its price reduced.

    A $1000 six-core 980X doesn't really fit in there, since it's at a totally different price point.

    I was all for the 1366 as my next upgrade, but the low power consumption of Sandy Bridge looks very promising in terms of silent computing (less heat).
  • SteelCity1981 - Sunday, August 29, 2010 - link

    What do you think the Core i7 980X uses? An LGA 1366 socket with triple-channel memory support. So what makes you think the Core i7 950 is going to perform any differently?
