Nvidia Pascal over a year ahead of 14/16nm competition

CES 2016: A commanding lead over the entire industry

Nvidia is more than a year ahead of any competition in the mobile space, as their Pascal-based Drive PX 2 module proved. With no less than CEO Jen-Hsun Huang holding it up, SemiAccurate felt it was an automotive tour de force.

The losses of Audi/VAG were a thing of the distant past at Nvidia because of the new Drive PX 2 module and specifically its twin 16nm Pascal GPUs. These are teamed up with a total of eight ARM Cortex-A57 cores and four Nvidia Denver cores; full specs are at Anandtech here. Together they provide everything a modern automobile could need for compute in a mere 250W package. That number is more than 10x the power draw of competitive solutions, so it must be fast!

Getting back to their lead, Nvidia is so far ahead of the industry it is, well, nigh on incredible. Until the CES unveiling, AMD was widely believed to be 2+ quarters ahead of Nvidia on the process front, and their 14nm Polaris GPUs had all the mindshare. On Monday, January 4th, 2016, Nvidia showed undeniable proof that they were more than a year ahead of AMD in the race to 14/16nm devices. Not only that, they were ahead of assumed leaders Apple and Samsung in getting devices out on this node, way, way ahead.

Why do we say this? Take a close look at the picture below, from the official Nvidia Flickr stream, of the Drive PX 2 board held up by Nvidia CEO Jen-Hsun Huang. This was taken at the CES unveiling of the PX 2, and you can see the full picture here. The room was full of press and analysts, so SEC rules mandate that everything said is absolutely true; it would be illegal to do otherwise. With that in mind, you know this killer device is both real and what it is said to be.

The picture below is a closeup of one of the two Pascal GPUs on the PX 2 board that Jen-Hsun was holding. These 16nm FinFET devices are said to push out a combined 16TF of SP compute, a massive number for only 250W. That is only possible because of the process tech it is based on; 28nm devices would struggle to have half that performance in the same power envelope. But showing off 16nm pre-production parts is not a big deal; SemiAccurate held an AMD 14nm device in December. Why do we say that Nvidia has a massive lead over AMD then?
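For a sense of scale, the efficiency implied by those figures is easy to work out. The sketch below uses only the numbers quoted above (16TF at 250W, and "half that" as the stated 28nm ceiling); the function and variable names are our own and purely illustrative.

```python
def gflops_per_watt(tflops: float, watts: float) -> float:
    """Convert a TFLOPS figure and a power budget into GFLOPS per watt."""
    return tflops * 1000.0 / watts


# Figures as quoted in the article, not independently measured.
QUOTED_TFLOPS = 16.0   # combined single-precision compute claimed for the module
QUOTED_WATTS = 250.0   # claimed power envelope

claimed_efficiency = gflops_per_watt(QUOTED_TFLOPS, QUOTED_WATTS)
# "Half that performance in the same power envelope" for 28nm parts.
ceiling_28nm = gflops_per_watt(QUOTED_TFLOPS / 2, QUOTED_WATTS)

print(f"Claimed:      {claimed_efficiency:.0f} GFLOPS/W")
print(f"28nm ceiling: {ceiling_28nm:.0f} GFLOPS/W")
```

By the article's own numbers, that works out to 64 GFLOPS/W claimed against roughly 32 GFLOPS/W for a 28nm part.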

Closeup of one of the two Pascal chips on the PX 2 board

If you look closely at the chips you can see the date codes on them are 1503A1 and they are from TSMC. TSMC date codes are formatted as year of manufacture, followed by work week of manufacture, followed by stepping. For example, the GTX 980 that was sent to the press in September of 2014 had a date code of 1421A1, the press GTX 960 from January 2015 had a date code of 1442A1, and the GTX 980Ti sampled in May of 2015 had a date code of 1436A1. You can work out the details of the press samples for yourself when you are bored.
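If you would rather not decode these by hand, the format described above (two-digit year, two-digit work week, stepping) can be sketched in a few lines of Python. This is a minimal illustration, not any official TSMC tool: the names `DateCode`, `parse_date_code` and `week_start` are hypothetical, and mapping the work week onto the ISO 8601 week calendar is an assumption on our part.

```python
import datetime
import re
from typing import NamedTuple


class DateCode(NamedTuple):
    year: int       # four-digit year of manufacture
    week: int       # work week of manufacture
    stepping: str   # silicon stepping, e.g. "A1"


def parse_date_code(code: str) -> DateCode:
    """Split a YYWW + stepping code such as '1503A1' into its fields."""
    match = re.fullmatch(r"(\d{2})(\d{2})([A-Z]\d)", code)
    if match is None:
        raise ValueError(f"not a YYWW+stepping date code: {code!r}")
    yy, ww, stepping = match.groups()
    week = int(ww)
    if not 1 <= week <= 53:
        raise ValueError(f"work week out of range in {code!r}")
    # Assumption: two-digit years are all in the 2000s.
    return DateCode(2000 + int(yy), week, stepping)


def week_start(code: DateCode) -> datetime.date:
    """Monday of the manufacturing week, assuming ISO 8601 week numbering."""
    return datetime.date.fromisocalendar(code.year, code.week, 1)


pascal = parse_date_code("1503A1")
print(pascal, week_start(pascal))
```

Under those assumptions, 1503A1 decodes to work week 3 of 2015 on the A1 stepping, a week starting Monday, January 12th, 2015.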

What this means is that Nvidia has had 16nm FinFET based Pascal chips sitting on the shelf for almost a year, barely a quarter after Maxwell was produced. That silicon was manufactured in mid-January of 2015!!! Everyone, including the author, was assuming that Pascal didn’t even tape out until Q3 of 2015; we were off by at least nine months, more likely a year. More importantly, this version of Pascal somehow doesn’t need HBM memory to reach those heady numbers; it uses vanilla GDDR5 chips, as you can see.
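The "almost a year" figure can be checked with a little date arithmetic. The sketch below takes the 1503 date code as work week 3 of 2015 and the Monday, January 4th, 2016 unveiling from earlier in the article; treating the work week as an ISO 8601 week is our assumption, and the function name is illustrative.

```python
import datetime


def shelf_time(year: int, work_week: int, shown: datetime.date) -> datetime.timedelta:
    """Time between the Monday of the manufacturing week and the public showing.

    Assumes the date code's work week follows ISO 8601 week numbering.
    """
    manufactured = datetime.date.fromisocalendar(year, work_week, 1)
    return shown - manufactured


ces_unveiling = datetime.date(2016, 1, 4)        # date given in the article
elapsed = shelf_time(2015, 3, ces_unveiling)     # date code 1503 -> week 3 of 2015

print(f"{elapsed.days} days, or {elapsed.days // 7} weeks on the shelf")
```

That comes to 357 days, or 51 weeks, between manufacture and the keynote.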

That means the consumer versions of Pascal that have HBM will be so much faster than the quoted numbers for the Drive PX 2 module with its GDDR5 memory that it is nigh unbelievable. Nvidia’s Pascal is so efficient that it can support compute levels that everyone believes are impossible with the bandwidth offered by the memory pictured. 16TF with GDDR5 is patently impossible for other silicon providers for anything but trivial benchmarks.

How Nvidia made the big Pascal chip in such a way that both GDDR5 and interposer-based HBM memory will work with the same package is going to be very interesting to hear; every other company using HBM has found this impossible to implement. More to the point, such a memory configuration shouldn’t be possible using even bleeding edge technologies, but Nvidia has had this in their pocket, literally sitting on the shelf since January of 2015!!! That is technological leadership if it ever existed, quite the commanding lead.

But wait, the Nvidia leadership position is even more dominant than that. Those WW3/2015 date codes mean Nvidia had to tape their Pascal designs out in late 2014, even before the TSMC process was ready for such tapeouts, much less stable. The Nvidia designers were obviously so good and so far ahead of the game that they were able to put out a design that worked on the A1 stepping, right out of the box, even on an unstable, some go so far as to say non-existent, process. Not only that, it worked so well that once TSMC stabilized the 16FF+ process months after Nvidia produced their Pascals, no update to the GPUs was even needed! Imagine that, so many critical PDK updates from TSMC and none managed to do any better than those plucky geniuses at Nvidia could get quarters before. That is a commanding lead if there ever was one, simply staggering process tech.

So to wrap it up, Nvidia CEO Jen-Hsun Huang showed off what he said were two Pascal GPUs in an Nvidia Drive PX 2 module. The date codes on the devices he held up made it very clear that they had been sitting on these devices for a week or two short of a year; no one else was even close to large 16nm FinFET GPUs at the time. Even TSMC could not produce such parts for months afterwards because their process wasn’t actually ready at the time. And the performance is more than 2x that of 28nm AMD devices, even without the HBM memory that Pascal was supposed to require.

It all has to be true because as an executive in a publicly traded company with press and analysts who cover and trade Nvidia in the room, he is obliged by the SEC to be truthful. The only other explanation is that he knowingly lied to the press and analysts and showed a fake card in his keynote, something that would clearly be illegal. Jen-Hsun would never do something like that, right? S|A

Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate.com. SemiAccurate.com is a technology news site addressing hardware design, software selection, customization, security and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also a council member with Gerson Lehman Group.
