Google's IP: Tensor TPU/NPU

At the heart of the Google Tensor, we find the TPU, which actually gives the chip its marketing name. Developed by Google with input and feedback from the company’s research teams, and taking advantage of years of extensive experience in the field of machine learning, the new TPU enables experiences on the Pixel 6 phones that Google puts a lot of value into. There’s a lot to talk about here, but let’s first break down some numbers to see where the performance of the Tensor ends up relative to the competition.

We start off with MLCommons’ MLPerf. The team behind the benchmark suite works closely with all industry vendors in designing something that is representative of actual workloads that run on devices. We also run variants of the benchmark which are able to take advantage of the various vendors’ SDKs and acceleration frameworks. Google sent us a variant of the MLPerf app to test the Pixel 6 phones with. It’s to be noted that the workloads on the Tensor run via NNAPI, while other phones are optimised to run through the respective chip vendor’s libraries, such as Qualcomm’s SNPE, Samsung’s EDEN, or MediaTek’s Neuron. Unfortunately, the Apple variant is the only one lacking CoreML acceleration, thus we should expect lower scores on the A15.

MLPerf 1.0.1 - Image Classification
MLPerf 1.0.1 - Object Detection
MLPerf 1.0.1 - Image Segmentation
MLPerf 1.0.1 - Image Classification (Offline)

Starting off with the Image Classification, Object Detection, and Image Segmentation workloads, the Pixel 6 Pro and the Google Tensor showcase good performance, and the phone is able to outperform the Exynos 2100’s NPU and software stack. More recently, Qualcomm optimised its software implementation for MLPerf 1.1 and now achieves higher scores than a few months ago. This allows the Snapdragon 888 to post significantly better scores than what we’re seeing on the Google Tensor and the TPU, at least for those workloads, in the current software releases and optimisations.

MLPerf 1.0.1 - Language Processing 

The Language Processing test of MLPerf is a MobileBERT model, and here for either architectural reasons of the TPU, or just a vastly superior software implementation, the Google Tensor is able to obliterate the competition in terms of inference speed.

In Google’s marketing, language processing tasks such as live transcription and live translation are major parts of the differentiating features that the new Google Tensor enables for the Pixel 6 series devices. In fact, when talking about the TPU performance, it’s exactly these workloads that the company highlights as being the killer use-cases and what it calls state-of-the-art.

If the scores here are indeed a direct representation of Google’s design focus of the TPU, then that’s a massively impressive competitive advantage over other platforms, as it represents a giant leap in performance.

GeekBench ML 0.5.0

Another benchmark we have available is GeekBench ML, which is currently still in a pre-release state, meaning the models and acceleration can still change in further updates.

The performance here depends on the APIs used, with the test either allowing TensorFlow delegates for the GPU or CPU, or using NNAPI on Android devices (and CoreML on iOS). The GPU results should only represent the GPU ML performance, which is surprisingly not that great on the Tensor, as it somehow lands below the Exynos 2100’s GPU.

In NNAPI mode, the Tensor is able to more clearly distinguish itself from the other SoCs, showcasing a 44% lead over the Snapdragon 888. It’s likely this represents the TPU’s performance lead, however it’s very hard to come to conclusions when it comes to such abstraction-layer APIs.

AI Benchmark 4 - NNAPI (CPU+GPU+NPU)

In AI Benchmark 4, when running the benchmark in pure NNAPI mode, the Google Tensor again showcases a very large performance advantage over the competition. Again, it’s hard to come to conclusions as to what’s driving the performance here, as the test makes use of the CPU, GPU, and NPU.

I briefly looked at the power profile of the Pixel 6 Pro when running the test, and it showcased similar power figures to the Exynos 2100, with extremely high burst power figures of up to 14W when doing individual inferences. Because the Tensor showcases much higher performance at similar power, it is also that much more efficient. The Snapdragon 888 peaked around 12W in the same workloads, so the efficiency gap here isn’t as large, however it’s still in favour of Google’s chip.
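The efficiency reasoning above can be made concrete with a little arithmetic: energy per inference is power multiplied by the time the inference takes, so a chip that draws more power still comes out ahead if it finishes proportionally sooner. The sketch below is only illustrative. It treats the quoted 14W and 12W peak figures as if they were constant draw for the whole inference, which is a simplifying assumption, and the latency values and the 1.5x speedup are hypothetical placeholders rather than measured results.

```python
def energy_per_inference(power_watts: float, latency_seconds: float) -> float:
    """Energy in joules for one inference, assuming constant power draw."""
    return power_watts * latency_seconds


def break_even_speedup(power_a_watts: float, power_b_watts: float) -> float:
    """Speedup chip A needs over chip B to spend the same energy per inference."""
    return power_a_watts / power_b_watts


# Peak power figures quoted in the text (treated as constant draw: an assumption).
TENSOR_PEAK_W = 14.0
SNAPDRAGON_PEAK_W = 12.0

# How much faster the 14W chip must be just to match the 12W chip on energy.
needed = break_even_speedup(TENSOR_PEAK_W, SNAPDRAGON_PEAK_W)
print(f"Break-even speedup: {needed:.3f}x")

# Hypothetical latencies, for illustration only (not measured values).
snapdragon_latency_s = 0.010
assumed_speedup = 1.5
tensor_latency_s = snapdragon_latency_s / assumed_speedup

tensor_energy_j = energy_per_inference(TENSOR_PEAK_W, tensor_latency_s)
snapdragon_energy_j = energy_per_inference(SNAPDRAGON_PEAK_W, snapdragon_latency_s)
print(f"Energy ratio (Tensor / Snapdragon): {tensor_energy_j / snapdragon_energy_j:.3f}")
```

Under these assumptions, any speedup beyond roughly 14/12 tips the energy-per-inference comparison in favour of the higher-power chip, which is why a large performance lead can outweigh a 2W higher peak.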

All in all, the ML performance of the Tensor has been Google’s main marketing point, and the company doesn’t disappoint in that regard, as the chip and the TPU are seemingly able to showcase extremely large performance advantages over the competition. While power is still very high, completing an inference faster means that energy efficiency is also much better.

I asked Google what their plans are in regards to the software side of things for the TPU: whether they’ll be releasing a public SDK for developers to tap into the TPU, or whether things will remain more NNAPI-centric like they are today on the Pixels. The company wouldn’t commit yet to any plans as it’s still very early. In general, that’s the same tone we’ve heard from other companies, as even Samsung, 2 years after the release of their first-gen NPU, doesn’t make their Eden SDK publicly available. Google notes that there is massive performance potential in the TPU and that the Pixel 6 phones are able to use it in first-party software, which enables the many ML features for the camera, and many translation features on the phone.


108 Comments


  • Alistair - Wednesday, November 3, 2021 - link

    It's the opposite, the iPhone is massively ahead in performance, but every high end phone takes the same high end photos... you got the same photos but a lot less performance...
  • aclos3 - Saturday, November 6, 2021 - link

    I took some time to really test the camera and you are simply wrong. I have been photographing with it heavily for the last couple of days and the camera is incredible. Call it a gimmick or whatever, but the way they do their photo stacking puts this phone in a league of its own. If your main use case for a phone is benchmarking, I guess this is not your device.
  • Lavkesh - Thursday, November 11, 2021 - link

    Everyone and their grand mother do image stacking. iPhone is almost as good if not better even with a smaller sensor when compared to the latest Pixel. How's that for "in a league of its own"?
  • Amandtec - Wednesday, November 3, 2021 - link

    I don't doubt the veracity of your comment but I find the hostile undertone somewhat curious.
  • damianrobertjones - Wednesday, November 3, 2021 - link

    But... but... they said that it's amazing!! Who do I believe? /s
  • Zoolook - Saturday, November 6, 2021 - link

    As long as they use Samsung process they will be hopelessly behind Apples Socs in efficiency unfortunately, would be interesting to see SD back on TSMC process for a direct comparison with Apple silicon.
  • Tigran - Tuesday, November 2, 2021 - link

    Performance looks very disappointing. Google promised 4.7x GPU performance improvement vs Pixel 5.
  • singular9 - Tuesday, November 2, 2021 - link

    I was enjoying how the speculation about the GS101 were claiming its "not far behind" the SD888. I was never expecting google to make another high end device, let alone one that undercuts most of the competition, as its just not what trends would say.

    I am not impressed. As someone who was rather hopeful that google would take control and bring us android users a true apple chip equivalent some day, this is definitely not the case with google silicon.

    Considering how cookie cutter this design is, and how google made some major amateur decisions, I do not see google breaking away from the typical android SOC mold next generation.

    Looking back at how long it took apple to design a near 100% solo design for the iPhone (A8X was the first A chip to use a complete inhouse GPU and etc design, other than ARM cores), that is a whopping 4 and a half years. Suppose this first google "designed" chip is following the same trend, an initial "brand name" break away yet still using a lot of help from other designs, and then slowly fixing one part at a time till its all fixed, while also improving what is already good, I could see google getting there by the Pixel X (10?). But as it stands, unless google dedicates a lot of time to actually altering Arm's own designs and simply having samsung make it, I don't see Tensor every surpassing qualcomm (unless samsung has some big breakthrough in their own CPU/GPU IP which may or may not come with AMD's help).

    As the chip stands today, its "passable", but not impressive. Considering Google can get android to run really well on a SD765G, this isn't at all surprising. The TPU seems like a nice touch, since honestly, focusing on voice is more important than on "raw" cpu performance or something. I have always been frustrated with speech to text not being "perfect" and constantly having to correct it manually and "working around" its limitations. As for my own experience with the 6 Pro, its bloody good.

    Now to specifics.
    The X1 chips do get hot, as does the 5G modem. I switched the device to LTE for now. I do get 5G at home and pretty much most places I go, and it is fast, its not something I need right now. I even had a call drop over 5G because I walked around a buildings corner. Not fun.

    The A76 excuse I have heard floating around, is that it takes up less physical die space, by A LOT. And apparently, there was simply no room for an A77 or A78 because the TPU and GPU took up so much room. I don't understand this compromise, when the GPU performance is this mediocre. Why not simply use the same GPU size as the S21 (Ex2100) and give the A78's more room? Don't know, but an odd choice for sure.

    The A55 efficiency issues are noticeable. Try playing spotify over bluetooth for an hour, and watch the battery drain. I get consistently great standby time, and very good battery life when heavily using my device, but its these background screen off tasks that really chug the battery more than expected.

    Over all though I haven't noticed any serious issues with my unit. The finger print scanner works as intended, and is better than my 8T. The camera does just as well if not better than the previous pixels. And over all...no complaints. But I wonder how much of this UX comes from google literally brute forcing their way with 2 X1 cores and a overkill GPU, and how much of it is them actually trying.

    As for recommendations to google for Tensor V2, they need to not compromise efficiency for performance. This phone isn't designed to game, cut the GPU down, or heck, partner with AMD (who is working with samsung) to bring competitive graphics to mobile to compete with Adreno from QComm. 2 X1 cores, if necessary, can stay, but at that point, might as well just have 4 of them and get rid of all the other cores entirely and simply build a very good kernel to modulate the frequency. Or make it a 2+6 design with A57 cores. As someone who codes kernels for pixels and nexus devices for a long time, trying to optimize the software to really get efficiency out of the big.LITTLE system is near impossible, and in my opinion, worthless unless your entire scheduler is "screen on/off" based, which is literally BS. I doubt google has any idea how to build a good CPU governor nor scheduler to truly make this X+X+X system even work properly, since I have yet see qcomm or samsung do it "well" to call commendable.

    The rest of the phone is fine. YoY improvements are always welcome, but I think the pixel 6/pro just really show how current mobile chips are so far behind apple that you might as well give up. YoY improvements have imo halted, and honestly no one seems to be having the thought that maybe we should cut power consumption in half WITHOUT increasing performance. I mean...the phones are fast enough.

    Who knows. We will see next year.

    PS: I also am curious what google will do with the Pixel 6A (if they make one at all). Will it use a cut down GS101 or will it get the whole chip? It would seem overkill to shove this into a 399$ phone. Wonder what cut downs will be made, or if there will be improvements as well.
  • sharath.naik - Tuesday, November 2, 2021 - link

    Good thoughts, there is one big issue you missed. Pixel camera sensors 50mp/48mp being binned to 12mp yet Google labeled them as 50mp/48mp. Every shot outside the native 1x,4x is just a crop of the 12mp image including portrait (3mp crop) and 10x (2.5mp crop).
  • teldar - Thursday, November 4, 2021 - link

    You are absolutely a clueless troll and should go back to your cave. Your stupidity is unwanted.
