Nvidia Pascal

Jen-Hsun Huang, chief executive officer of Nvidia Corp., said this week that he is excited about the company’s next-generation graphics processing units, code-named “Pascal”, as well as next-generation process technologies. Nevertheless, while the CEO of Nvidia is confident in the company’s roadmap, he notes that the current-generation “Maxwell” family of GPUs is only beginning its journey.
“We have got a lot of great surprises for you guys and I am excited about our next generation GPUs,” said Jen-Hsun Huang during a quarterly conference call with investors and financial analysts. “But right now we are enjoying ramping Maxwell. This is a brand new product cycle.”
Right now Maxwell mainly addresses the market of consumer gaming PCs with GeForce graphics cards. Nvidia plans to introduce professional-grade Quadro graphics cards as well as Tesla accelerators for high-performance computing applications sometime in 2015.
At present there are only two graphics processing units – GM107 and GM204 – based on the Maxwell architecture. Nvidia is expected to unveil two more chips based on the latest graphics architecture over the next quarter or two. It is believed that one of the forthcoming Maxwell GPUs – code-named GM200 – will address the high-performance computing, professional graphics and ultra-high-end gaming PC markets.
Nvidia’s next-generation graphics processors are code-named “Pascal”. Based on the company’s roadmap that it demonstrated back in March, Pascal GPUs are due sometime in 2016. Next-generation graphics chips from Nvidia will support stacked high-bandwidth dynamic random access memory (DRAM) (SK Hynix’s high-bandwidth memory (HBM) or Micron’s hybrid memory cube (HMC)), unified memory addressing for CPU and GPU, NVLink interconnection for high-performance computing platforms as well as new graphics, compute and multimedia features that will be a part of DirectX 12, OpenGL 5.0 and other forthcoming application programming interfaces.
In 2016, displays with ultra-high-definition (UHD) resolutions like 4K (3840×2160, 4096×2160) or 5K (5120×2880) will become much more popular than they are today. Therefore, the dramatically improved graphics horsepower of Pascal GPUs, as well as the extreme bandwidth provided by stacked HMC or HBM DRAM devices (we are talking about 1TB/s – 2TB/s of bandwidth here), will be appreciated by the market.
Given the availability timeframe of the “Pascal” family of graphics processors, it is very likely that they will be manufactured using 16nm FinFET+ process technology at Taiwan Semiconductor Manufacturing Co. Although Jen-Hsun Huang has not confirmed anything about Pascal, he did indicate that he is happy with the forthcoming fabrication processes.
“We are excited about the next generation FinFET [manufacturing technologies],” said Mr. Huang. “I can tell you that for the next couple of nodes, I feel pretty good about [them].”

KitGuru Says: While Nvidia’s Pascal architecture looks like another major step in the evolution of graphics processors, it is at least 1.5 years away. Therefore, from an end-user point of view it is more interesting to know what the GM200 is and how fast it is. The GM200 will power Nvidia’s next-generation Titan graphics cards, and it is no secret that they will be performance monsters.

NVIDIA has updated its next-generation graphics roadmap at the GTC 2015 conference with the upcoming Pascal and Volta GPUs. While we are still a few years away from knowing how the Volta GPU will look and perform, NVIDIA CEO Jen-Hsun Huang did confirm three key aspects of the Pascal GPU which will make it up to 10 times faster than current-generation Maxwell-based chips when it launches in 2016.

NVIDIA Pascal Gets FP16 Mixed Precision, NVLINK and 1 TB/s 3D Memory in 2016

The details shown by NVIDIA on the Pascal GPU are largely the same things we heard last year at GTC 2014, with a few updates on the performance and efficiency bits. We know that NVIDIA’s Pascal GPU will replace Maxwell in 2016 and will feature the company’s latest core architecture, using 3D stacked memory that sits in the same package as the GPU and enables bandwidth of up to 1 TB/s. This 3D chip-on-wafer integration will not only enable much more bandwidth but will also deliver up to 4 times the efficiency and 2.5 times the VRAM capacity, for amazing performance on higher-resolution screens. AMD is already going for 2.5D memory stacking with its upcoming cards, which will have up to 640 GB/s of bandwidth, while NVIDIA’s 3D HBM integration will allow many memory dies to be stacked, with bandwidth exceeding 1 TB/s.
Compared to the GeForce GTX Titan X, 3D HBM memory will allow roughly three times more bandwidth; the Titan X already uses the fastest standard GDDR5 memory chips, rated at 7 GHz effective. This limitation will end once HBM becomes common on discrete graphics cards. NVIDIA also mentioned Pascal having 2.7 times more memory available, which points to 32 GB of VRAM for users with higher demands. By comparison, the Titan X has only 12 GB of GDDR5 memory, which is already considered a lot.
The Pascal GPU will also introduce NVLINK, the next-generation unified virtual memory link with Gen 2.0 cache-coherency features and 5–12 times the bandwidth of a regular PCIe connection. This will solve many of the bandwidth issues that high-performance GPUs currently face. One of the latest things we learned about NVLINK is that it will allow several GPUs to be connected in parallel, whether in SLI for gaming or for professional usage. Jen-Hsun specifically mentioned that instead of 4 cards, users will be able to use 8 GPUs in their PCs for gaming and professional purposes.

NVLink is an energy-efficient, high-bandwidth communications channel that uses up to three times less energy to move data on the node at speeds 5-12 times those of conventional PCIe Gen3 x16. First available in the NVIDIA Pascal GPU architecture, NVLink enables fast communication between the CPU and the GPU, or between multiple GPUs. (Figure 3: NVLink is a key building block in the compute node of Summit and Sierra supercomputers.)
Volta GPU featuring NVLINK and stacked memory:
- NVLINK: GPU high-speed interconnect, 80-200 GB/s
- 3D stacked memory: 4x higher bandwidth (~1 TB/s), 3x larger capacity, 4x more energy efficient per bit
NVLink is a key technology in Summit’s and Sierra’s server node architecture, enabling IBM POWER CPUs and NVIDIA GPUs to access each other’s memory fast and seamlessly. From a programmer’s perspective, NVLink erases the visible distinctions of data separately attached to the CPU and the GPU by “merging” the memory systems of the CPU and the GPU with a high-speed interconnect. Because both CPU and GPU have their own memory controllers, the underlying memory systems can be optimized differently (the GPU’s for bandwidth, the CPU’s for latency) while still presenting as a unified memory system to both processors. NVLink offers two distinct benefits for HPC customers. First, it delivers improved application performance, simply by virtue of greatly increased bandwidth between elements of the node. Second, NVLink with Unified Memory technology allows developers to write code much more seamlessly and still achieve high performance. via NVIDIA News

The third thing Jen-Hsun mentioned is how he believes the Pascal GPU will be able to achieve 10x better performance compared to Maxwell. The key to this improvement is mixed precision, or FP16 compute, which NVIDIA recently introduced in its Tegra X1 SoC.

NVIDIA GTC 2015 Pascal GPU Slides:

3D Memory: Stacks DRAM chips into dense modules with wide interfaces, and brings them inside the same package as the GPU. This lets GPUs get data from memory more quickly – boosting throughput and efficiency – allowing us to build more compact GPUs that put more power into smaller devices. The result: several times greater bandwidth, more than twice the memory capacity and quadrupled energy efficiency.

Unified Memory: This will make building applications that take advantage of what both GPUs and CPUs can do quicker and easier by allowing the CPU to access the GPU’s memory, and the GPU to access the CPU’s memory, so developers don’t have to allocate resources between the two.

NVLink: Today’s computers are constrained by the speed at which data can move between the CPU and GPU. NVLink puts a fatter pipe between the CPU and GPU, allowing data to flow at more than 80GB per second, compared to the 16GB per second available now.

Pascal Module: NVIDIA has designed a module to house Pascal GPUs with NVLink. At one-third the size of the standard boards used today, they’ll put the power of GPUs into more compact form factors than ever before.

Pascal will feature 4X the mixed precision performance, 2X the performance per watt, 2.7X memory capacity & 3X the bandwidth of Maxwell.
Those are a lot of numbers to digest, so let’s break them down. Nvidia states that Pascal will be the company’s first high-performance GPU to feature mixed-precision FP16 floating-point compute, which is essential for low-power devices such as tablets and mobile phones. Mixed precision is also very beneficial from a power-efficiency standpoint for the many compute applications that don’t strictly require higher-precision FP32 or FP64 arithmetic.
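As a minimal illustration of the storage side of this trade-off (using NumPy on the CPU; this only shows the memory footprint and rounding behaviour, not GPU throughput):

```python
import numpy as np

# Each FP16 value takes half the storage of FP32, so the same memory
# bandwidth moves twice as many operands.
values = np.linspace(0.0, 1.0, 1_000_000)

fp32 = values.astype(np.float32)
fp16 = values.astype(np.float16)

print(fp32.nbytes)  # 4000000 bytes
print(fp16.nbytes)  # 2000000 bytes, half the footprint

# The trade-off is precision: FP16 has only ~3 decimal digits, fine for
# many mobile and imaging workloads but not for scientific FP64 work.
print(np.float16(1.0) + np.float16(0.0001))  # the small term rounds away
```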

Nvidia’s CEO went on to state that Pascal will be 10 times faster than Maxwell, a conclusion he arrived at via what he calls “CEO math”. Obviously this was just a humorous way to impress the crowd at GTC 2015 and is based on “very rough estimates”.
Pascal will feature three distinct new technologies.

#1 HBM: Stacked memory will debut on the green side with Pascal – HBM Gen2 more precisely, the second generation of the high-bandwidth JEDEC memory standard co-developed by SK Hynix and AMD. The new memory will enable memory bandwidth to exceed 1 terabyte/s, which is 3X the bandwidth of the Titan X. The new memory standard will also allow for a huge increase in memory capacities – 2.7X the memory capacity of Maxwell, to be precise – which indicates that the new Pascal flagship will feature 32GB of video memory, a mind-bogglingly huge number.
We’ve already seen AMD take advantage of this memory technology with its R9 390X GPU, which will reportedly feature up to 8GB of HBM delivering 640GB/s of memory bandwidth – nearly three times that of the GTX 980 and twice that of the GTX Titan X. AMD has also stated that it plans to use the second generation of this memory technology in future graphics cards, so we’re likely to see both red and green rocking second-generation stacked HBM next year.

HBM achieves this amazing improvement in memory bandwidth and capacity by employing a very wide through-silicon-via memory interface. Each HBM stack is connected to the GPU with a 1024-bit-wide memory bus. HBM modules actually operate at low frequencies compared to GDDR5, but thanks to the significantly wider memory interface they manage to be up to 9 times faster than standard GDDR5 memory modules. We’ve already covered this revolutionary new memory technology exclusively and in depth last year. HBM will quickly replace GDDR5 as the standard memory technology for high-performance graphics solutions. It’s fair to say that HBM is the future.

#2 NV-Link: Pascal will also be the first Nvidia GPU to feature the company’s new NV-Link technology, which Nvidia states is 5 to 12 times faster than PCIe 3.0.
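The wide-bus/low-frequency arithmetic can be sketched in a few lines of Python. The per-pin data rates below are the commonly quoted figures for each technology, not official specifications:

```python
# Peak theoretical bandwidth = (bus width in bits / 8) * per-pin data rate.

def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s for one memory interface."""
    return bus_width_bits / 8 * data_rate_gbps

# GDDR5 on the GTX Titan X: 384-bit bus at 7 Gbps effective per pin.
gddr5_titan_x = peak_bandwidth_gb_s(384, 7.0)    # 336.0 GB/s

# One first-generation HBM stack: 1024-bit interface at a modest 1 Gbps.
hbm1_stack = peak_bandwidth_gb_s(1024, 1.0)      # 128.0 GB/s per stack

# Four HBM1 stacks (as on AMD's Fiji) vs. four HBM2 stacks at 2 Gbps.
hbm1_total = 4 * hbm1_stack                      # 512.0 GB/s
hbm2_total = 4 * peak_bandwidth_gb_s(1024, 2.0)  # 1024.0 GB/s, the ~1 TB/s figure

print(gddr5_titan_x, hbm1_total, hbm2_total)
```

Despite running at a fraction of GDDR5's clock, the 1024-bit interface per stack is what pushes the aggregate past a terabyte per second.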

NVIDIA® NVLink™ is a high-bandwidth, energy-efficient interconnect that enables ultra-fast communication between the CPU and GPU, and between GPUs. The technology allows data sharing at rates 5 to 12 times faster than the traditional PCIe Gen3 interconnect, resulting in dramatic speed-ups in application performance and creating a new breed of high-density, flexible servers for accelerated computing.

#3 16nm manufacturing process: Pascal will be the first Nvidia GPU to be built on TSMC’s 16nm FinFET manufacturing process. The new process promises to be significantly more power-efficient and denser than 28nm, which will enable Nvidia to build significantly more complex and powerful GPUs while improving power efficiency.

TSMC’s 16FF+ (FinFET Plus) technology can provide over 65 percent higher speed, around 2 times the density, or 70 percent less power than its 28HPM technology. Compared with 20SoC technology, 16FF+ provides an extra 40% higher speed and 60% power saving. By leveraging the experience of 20SoC technology, TSMC 16FF+ shares the same metal backend process in order to quickly improve yield and demonstrate process maturity for time-to-market value.

Pascal is still scheduled for a 2016 release with Volta coming along sometime after that.

NVIDIA briefed the crowd at its GPU Technology Conference here in San Jose, California, displaying a slide with the amount of VRAM per GPU architecture. For Kepler in 2012, the last flagship cards were the GeForce GTX 780 and GTX Titan Black, featuring 3GB and 6GB of RAM respectively, while the Maxwell architecture provides between 4GB (on the GTX 980) and 12GB (on the GTX Titan X). The slide teases that Pascal will feature 32GB of RAM, and Volta will rock up to 72GB of RAM in 2018.

The way NVIDIA will do this is thanks to SK Hynix's High Bandwidth Memory (HBM), which allows for four-layer stacks, also known as 4-Hi. These will come in 1GB and 2GB varieties, but eight-layer stacks will eventually arrive, which should see a huge increase in the amount of framebuffer on the next generation of GPUs.

Not only will Pascal deliver more VRAM on the card, but it will have far more memory bandwidth. The Maxwell-based GeForce GTX Titan X gets 336GB/sec of memory bandwidth from the 384-bit memory bus feeding its GDDR5 RAM, but the Pascal architecture will be capable of a huge 750GB/sec or more. Pascal will use a variety of technologies to achieve this lofty height of memory bandwidth, including mixed precision, 3D memory and NVLink.

We should expect NVIDIA to talk more about Pascal later in the year, or at GTC 2016 this time next year.

Compute performance of modern graphics processing units (GPUs) is tremendous, but so are the needs of modern applications that use such chips to display beautiful images or perform complex scientific calculations. Nowadays it is practically impossible to install more than four GPUs into a computer and get adequate performance scaling. But brace yourself: Nvidia is working on eight-way multi-GPU technology.
The vast majority of personal computers today have only one graphics processor, but many gaming PCs integrate two graphics cards for increased framerates. Enthusiasts who want unbeatable performance in select games and benchmarks opt for three-way or four-way multi-GPU setups, but these are pretty rare because scaling beyond two GPUs is poor. Professionals who need high-performance GPUs for simulations, deep learning and other applications also benefit from four graphics processors and could use even more GPUs per box. Unfortunately, that is virtually impossible because of limitations imposed by today’s PCI Express and SLI technologies. However, Nvidia hopes that with the emergence of the code-named “Pascal” GPUs and the NVLink bus, it will be considerably easier to build multi-GPU machines.
Today even the top-of-the-range Intel Core i7-5960X processor has only 40 PCI Express 3.0 lanes (up to 40GB/s of bandwidth) and thus can connect up to two graphics cards using PCIe 3.0 x16 links or up to four cards using PCIe 3.0 x8. In both cases, the maximum bandwidth available for GPU-to-GPU communications will be limited to 16GB/s or 8GB/s (useful bandwidth will be around 12GB/s and 6GB/s) in the best-case scenarios, since the GPUs need to communicate with the CPU too.
In a bid to considerably improve communication speed between GPUs, Nvidia will implement support for its proprietary NVLink bus in its next-generation “Pascal” GPUs. Each NVLink point-to-point connection will support 20GB/s of bandwidth in each direction simultaneously (16GB/s of effective bandwidth per direction), and each high-end “Pascal” GPU will support at least four such links. In the case of a system with NVLink, two GPUs would get a total peak bandwidth of 80GB/s (64GB/s effective) per direction between them. Moreover, PCI Express bandwidth would be preserved for CPU-to-GPU communications. In the case of a four-GPU sub-system, graphics processors would get up to 40GB/s of bandwidth to communicate with each other.
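The link arithmetic above works out as follows; this is illustrative only, using the figures the article quotes (20GB/s peak and 16GB/s effective per link per direction, four links per GPU):

```python
# NVLink figures as quoted: per-link, per-direction bandwidth.
NVLINK_PEAK = 20        # GB/s per link
NVLINK_EFFECTIVE = 16   # GB/s per link
LINKS_PER_GPU = 4

# Two GPUs with all four links ganged between them:
peak_pair = LINKS_PER_GPU * NVLINK_PEAK            # 80 GB/s per direction
effective_pair = LINKS_PER_GPU * NVLINK_EFFECTIVE  # 64 GB/s per direction

# Four GPUs: one possible split dedicates two links per peer pair,
# giving 40 GB/s between any two processors, matching the figure above.
four_gpu_peer = 2 * NVLINK_PEAK                    # 40 GB/s

# For comparison, PCIe 3.0 x16 delivers roughly 16 GB/s per direction.
PCIE3_X16 = 16
print(peak_pair / PCIE3_X16)  # 5.0, the low end of Nvidia's "5-12x" claim
```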
According to Nvidia, NVLink is projected to deliver up to two times higher performance in many applications simply by replacing the PCIe interconnect for communication among peer GPUs. It should be noted that in an NVLink-enabled system, CPU-initiated transactions such as control and configuration are still directed over a PCIe connection, while GPU-initiated transactions use NVLink, which preserves the PCIe programming model.
Additional bandwidth provided by NVLink could allow one to build a personal computer with up to eight GPUs. However, to make it useful in applications beyond technical computing, Nvidia will have to find a way to use eight graphics cards efficiently for rendering. Since performance scaling beyond two GPUs is generally poor, it is unlikely that eight-way multi-GPU technology will actually make it to the market. However, if Nvidia manages to improve the efficiency of current multi-GPU technologies in general by replacing SLI [scalable link interface] with NVLink, that could further boost the popularity of the company’s graphics cards among gamers.
Performance improvement could be even more significant in systems that completely rely on NVLink instead of PCI Express. IBM plans to add NVLink to select Power microprocessors for supercomputers and the technology will be extremely useful for high-performance servers powered by Nvidia Tesla accelerators.

NVLink is a new feature for Nvidia GPUs that aims to drastically improve performance by increasing the total bandwidth between the GPU and other parts of the system.
In modern PCs, GPUs and numerous other devices are connected by PCI-E lanes to the CPU or to the motherboard's chipset. For some GPUs, the available PCI-E lanes provide sufficient bandwidth that no bottleneck occurs, but for high-end GPUs and multi-GPU setups, the number of PCI-E lanes and the total bandwidth available are insufficient to meet the needs of the GPU(s) and can cause a bottleneck.

Nvidia NVLink

In an attempt to improve this situation, some motherboard manufacturers opt to use PLX chips, which can help better utilize the bandwidth of the PCI-E lanes coming from the CPU, although overall bandwidth does not really increase. Nvidia's solution to this problem is called NVLink.
According to Nvidia, NVLink is the world's first high-speed interconnect technology for GPUs, and it allows data to be transferred between the GPU and CPU five to 12 times faster than PCI-E. Nvidia also claimed that application performance using NVLink can be up to twice as fast, relative to PCI-E.

Programs that utilize the Fast Fourier Transform (FFT) algorithm, which is heavily used in seismic processing, signal processing, image processing and partial differential equations, see the greatest performance increase. These types of applications are heavily used inside of servers and are typically bottlenecked by the PCI-E bus.
Other applications used in various fields of research see performance increases, too. According to Nvidia, one application used to study the behavior of matter by simulating molecular structures, called AMBER, gains up to a 50 percent performance increase using NVLink.

When two GPUs are utilized inside the same system, they can be joined by four NVLink links, each providing 20 GB/s of transfer, for a total of 80 GB/s between the two cards. Because the cards no longer need to communicate using the scarce PCI-E bandwidth, additional bandwidth is freed up for the CPU to send data to the GPUs.
Nvidia claimed that IBM is currently integrating it into future POWER CPUs, and the U.S. Department of Energy announced that it will utilize NVLink in its next flagship supercomputer.

While AMD is about to launch its Fiji XT-based Radeon R9 Fury X and the respin that will arrive as the Radeon R9 390X, it looks like NVIDIA is already playing around with its next-gen GPU: GP100. GP100 will reportedly rock between 4500 and 6000 CUDA cores, making it NVIDIA's biggest GPU yet.

Right now we have the GM200, with the 'M' standing for Maxwell; in GP100, the 'P' stands for Pascal. Pascal is NVIDIA's next-generation architecture, with the GPU built on a 16nm process. Not only will Pascal be baked onto 16nm, but it will arrive with support for HBM2 memory, which should see memory bandwidth scaling up to an insane 1.2TB/sec (1200GB/sec). Considering the GeForce GTX 980 Ti has 336GB/sec, the GTX 1080 Ti (or whatever NVIDIA calls it) could have up to 1.2TB/sec of bandwidth, a nearly fourfold increase in memory bandwidth alone.

The news comes from a source on the Beyond3D forums who says that the 'big Pascal' chip (GP100) has been taped out on TSMC's 16nm process, with a 'target release' window of Q1 2016. We don't know if this is true or not, but I would be pretty sure that NVIDIA is playing around with Pascal right now. I've asked many of my NVIDIA sources about Pascal, 16nm and HBM2 and all I get back are smiles... we should be more excited about this next-gen GPU from NVIDIA than any other release from the company, ever.

16nm is going to really let NVIDIA stretch its legs, and HBM2 is going to usher in the largest jump in memory bandwidth NVIDIA has ever had, on the order of 3-4x. As for the Pascal architecture, we don't even know what to expect. Maxwell introduced many new technologies when it was revealed, but its performance and power consumption were the best work NVIDIA has ever done, so I have very high hopes for Pascal.

We should also expect to see between 16GB and 32GB of HBM2-based VRAM offered on the new Pascal-based cards, where I'm sure NVIDIA will up the Titan X 2 (that's what I'm calling it for now) to 32GB from the 12GB of VRAM found on the Titan X. I think we'll see the GP104 (the smaller chip) arrive as the GeForce GTX 1080, but I'm going to throw this rumor out there: the numbering system will change for this generation. I've just explained above why this is the most important release from NVIDIA ever, with 16nm, HBM2, a new architecture and more - so start getting excited, folks!

AMD made history earlier this month by being the first major GPU vendor to ship HBM, with its top-end Fury and Fury X graphics cards. Nvidia, however, has been absent so far, waiting on HBM2, a more advanced version of the HBM1 shipping with the Fury (X), before getting into the new tech. According to a report, though, AMD is leveraging its deal with SK Hynix to get priority access to HBM2 in time for its upcoming Arctic Islands GPUs.
While HBM1 is limited to 4GB of capacity and 512GB/s of bandwidth, HBM2 increases those numbers significantly, with up to 16/32GB of VRAM and over 1TB/s. Like HBM1, HBM2 is expected to be in limited supply at launch. If AMD has priority for HBM2 and stocks are low, Nvidia practically won’t be able to use HBM2 until the supply improves beyond what AMD can absorb. This might create a de facto exclusivity for AMD, offering a chance for the underdog to dominate with HBM2 GPUs.
If the supply of HBM2 is limited, it could complicate things for Nvidia. Its Pascal architecture is set for 2016 and could be designed for either GDDR5 or HBM2, which vary widely in implementation. Nvidia could go with GDDR5 but risk losing its lead over AMD and the ability to refresh with HBM2 later on. If Nvidia does go with HBM2, supply might be heavily constrained, allowing AMD a chance to grab market share. It will be interesting to see both sides’ offerings in early 2016 and the choices they make for their lineups.

Thank you WCCFTech for providing us with this information.

Nvidia’s big Pascal GPU, code-named GP100, will feature a massive 4096-bit bus and four HBM2 stacks, each up to 8-Hi. The upcoming flagship Pascal chip is set to debut on TSMC’s 16nm FinFET process later next year. We have confirmed with our sources that the GPU will be made with two different variations of stacked HBM2 solutions; however, both will feature a massive 4096-bit memory interface, just like AMD’s flagship Fiji GPU launched last month.
The first variation will pack four HBM2 stacks, each 4-Hi and clocked at 1GHz. This will go into the traditional consumer GeForce line of GP100-based products. The second variation is also equipped with four HBM2 stacks clocked at 1GHz; however, each will be 8-Hi.
In HBM stacking, #-Hi denotes the number of stacked DRAM dies; this count does not include the additional base die, which incorporates logic and the memory PHY. So it only specifies how many DRAM dies are in the stack rather than the total number of chips in the stack. GP100 packages with 8-Hi HBM stacks will be limited to professional products, including Quadro and Tesla GPUs, where huge memory capacities are essential.

Nvidia’s Big Pascal Slated For Release Next Year
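As a rough sketch, the capacities implied by these two configurations can be computed. The 1 GB (8 Gb) per-DRAM-die density is an assumption for illustration, not something stated here:

```python
# Capacity arithmetic for the two rumored GP100 memory configurations.
DIE_GB = 1    # capacity of one HBM2 DRAM die (assumed 8 Gb = 1 GB)
STACKS = 4    # four stacks of 1024 bits each make up the 4096-bit interface

consumer_gb = STACKS * 4 * DIE_GB      # four 4-Hi stacks -> 16 GB
professional_gb = STACKS * 8 * DIE_GB  # four 8-Hi stacks -> 32 GB

# The base logic die in each stack adds no capacity, which is why #-Hi
# counts only the DRAM dies.
print(consumer_gb, professional_gb)  # 16 32
```

These totals line up with the 16GB/32GB range reported elsewhere for Pascal cards.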

Notably, unlike Nvidia, which has confirmed that its Pascal GPUs will be manufactured using TSMC’s 16nm FinFET process, AMD has yet to announce whether the Arctic Islands family of GPUs will be made on TSMC’s 16nm or Samsung’s 14nm process. Both nodes are very similar, so which process AMD ends up using will be dictated primarily by yields and time-to-market.

Also unlike Nvidia, AMD has a much more powerful incentive to launch its next generation of FinFET GPUs first. This is because the company has priority access to HBM2 capacity – which is going to be limited initially – as a result of co-developing the technology with SK Hynix. By pushing its graphics products to launch first, AMD can establish two competitive advantages over its rival. The first, obvious advantage is simply being first to market. But more importantly, this enables AMD to capture much of the initial HBM2 capacity away from Nvidia and extend its time-to-market lead substantially. This could create an interesting market dynamic, but whether the strategy can succeed remains to be seen.
Obviously Nvidia realizes that this play is in the cards, no pun intended, and will undoubtedly bide its time wisely, honing its chips. GP100 has already been taped out, so there is not much that can be done to the chip’s floorplan; however, Nvidia can still use the extra time on post-silicon work. Nvidia can also spend more time working on its smaller Pascal chips, which have not been taped out yet.
Apart from HBM2 and 16nm, there is one big compute-centric feature that Nvidia will debut with Pascal: NVLink. Pascal will be the first GPU from the company to support this new proprietary server interconnect.

The technology is aimed at GPU-accelerated servers where cross-chip communication is extremely bandwidth-limited and a major system bottleneck. Nvidia states that NVLink will be 5 to 12 times faster than traditional PCIe 3.0, making it a major step forward for platform interconnects. Earlier this year Nvidia announced that IBM will be integrating this new interconnect into its upcoming POWER server CPUs.

Unlike with Maxwell, Nvidia has placed a major focus on compute and GPGPU acceleration with Pascal. The slew of features and new technologies that Nvidia will debut with Pascal emphasizes this focus, including next-generation stacked high-bandwidth memory, the high-speed NVLink GPU interconnect, and support for mixed precision to accelerate mobile applications and improve mobile performance per watt. We can’t wait to see Pascal in action next year; until then, stay tuned.

We have known for a while now that Nvidia wants to get onto the HBM wagon too, and that is pretty much a given. It is the next generation of video memory and it looks extremely promising so far. The next-gen GPU should be the GP100, which will reportedly rock somewhere between 4500 and 6000 CUDA cores coupled with HBM2 memory.
The latest news now points towards two new graphics cards: one for the consumer market and one for servers and workstations, which is a logical split in any case. The first new GPU will use 4-Hi stacks of HBM2 memory and be aimed at the consumer market, while the business-oriented GPU will feature 8-Hi stacks of HBM2 memory. Effectively, that means HBM-powered graphics cards from Nvidia with up to 32GB of very fast VRAM.
Whether the new consumer model will be named the GeForce GTX 1080 is currently unknown. It could very well be that Nvidia is working on new naming too, not only to avoid four-digit model numbers, but also to underline that it truly is a next-generation graphics card, much like AMD has done with its Fury branding.
We’ll make sure to keep you updated as more news and leaks emerge on these new Nvidia 16nm-based graphics cards.

Thank you TweakTown for providing us with this information.