Though information about the technology has been making the rounds over the last several weeks, GDDR5X finally became official with an announcement from JEDEC this morning. The JEDEC Solid State Technology Association is, as Wikipedia tells us, an "independent semiconductor engineering trade organization and standardization body" that is responsible for creating memory standards. Getting the official nod from the org means we are likely to see implementations of GDDR5X in the near future.

The press release is short and sweet. Take a look.

ARLINGTON, Va., USA – JANUARY 21, 2016 – JEDEC Solid State Technology Association, the global leader in the development of standards for the microelectronics industry, today announced the publication of JESD232 Graphics Double Data Rate (GDDR5X) SGRAM. Available for free download from the JEDEC website, the new memory standard is designed to satisfy the increasing need for more memory bandwidth in graphics, gaming, compute, and networking applications.

Derived from the widely adopted GDDR5 SGRAM JEDEC standard, GDDR5X specifies key elements related to the design and operability of memory chips for applications requiring very high memory bandwidth. With the intent to address the needs of high-performance applications demanding ever higher data rates, GDDR5X is targeting data rates of 10 to 14 Gb/s, a 2X increase over GDDR5. In order to allow a smooth transition from GDDR5, GDDR5X utilizes the same, proven pseudo open drain (POD) signaling as GDDR5.

“GDDR5X represents a significant leap forward for high end GPU design,” said Mian Quddus, JEDEC Board of Directors Chairman. “Its performance improvements over the prior standard will help enable the next generation of graphics and other high-performance applications.”

JEDEC claims that, by using the same signaling type as GDDR5, it is able to double the per-pin data rate to 10-14 Gb/s. In fact, based on leaked slides about GDDR5X from October, JEDEC actually calls GDDR5X an extension to GDDR5, not a new standard. How does GDDR5X reach these new speeds? By doubling the prefetch from 32 bytes to 64 bytes. This will require a redesign of the memory controller for any processor that wants to integrate it.

As for usable bandwidth, no figures are quoted directly, but the increase would likely be much smaller than the per-pin numbers in the press release suggest. Because the memory bus width remains unchanged and GDDR5X simply grabs twice the chunk size on each prefetch, we should expect an incremental change. Power efficiency isn't mentioned either, and that was one of the driving factors in the development of HBM.
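To put the per-pin numbers in perspective, peak theoretical bandwidth is simply the per-pin data rate multiplied by the bus width. A quick sketch of that arithmetic (the 7 Gb/s GDDR5 rate and 256-bit bus are my own illustrative values, not figures from the press release):

```python
def peak_bandwidth_gbs(per_pin_gbps: float, bus_width_bits: int) -> float:
    """Peak theoretical memory bandwidth in GB/s: per-pin rate times pin count, in bytes."""
    return per_pin_gbps * bus_width_bits / 8

# GDDR5 at 7 Gb/s per pin on a 256-bit bus (GTX 980-class)
print(peak_bandwidth_gbs(7, 256))   # 224.0 GB/s

# GDDR5X at the low end of its 10-14 Gb/s target, same 256-bit bus
print(peak_bandwidth_gbs(10, 256))  # 320.0 GB/s
```

The per-pin jump translates directly into peak bandwidth, but real-world gains depend on how well the controller keeps those wider 64-byte prefetches fed.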

Performance efficiency graph from AMD's HBM presentation

I am excited about any improvement in memory technology that will increase GPU performance, but I can tell you that from my conversations with both AMD and NVIDIA, no one appears to be jumping at the chance to integrate GDDR5X into upcoming graphics cards. That doesn't mean it won't happen with some version of Polaris or Pascal, but it seems that there may be concerns other than bandwidth that keep it from taking hold.

The 30th Game Developers Conference (GDC) will take place from March 14th through March 18th, with the expo itself starting on March 16th. The session list has now been published, with DX12 and Vulkan prominently featured. While those technologies have not been adopted as quickly as advertised, the direction is definitely forward. In fact, NVIDIA, Khronos Group, and Valve have just finished hosting a developer day for Vulkan. It is coming.

One interesting session will be hosted by Codemasters and Intel, which discusses bringing the F1 2015 engine to DirectX 12. It will highlight a few features they implemented, such as voxel-based ray tracing using conservative rasterization, which overestimates the size of individual triangles so you don't get edge effects on pixels that are only partially covered by a triangle cutting through a tiny, but not negligible, portion of them. Sites like Game Debate (Update: Whoops, forgot the link) wonder if these features will be patched into older titles, like F1 2015, or if they're just R&D for future games.

Another session, hosted by ARM and Epic Games, will discuss bringing Vulkan to mobile through Unreal Engine 4. Mobile processors have quite a few cores, albeit ones that are slower at single-threaded tasks, and decent GPUs. Being able to keep all of those cores loaded will bring mobile gaming closer to the GPU's theoretical performance, which has surpassed both the Xbox 360 and PlayStation 3, sometimes by a factor of 2 or more.

Many (most?) slide decks and video recordings are available for free after the fact, but we can't really know which ones ahead of time. It should be an interesting year, though.

It's nice to see long-term roundups every once in a while. They do not really provide useful information for someone looking to make a purchase, but they show how our industry is changing (or not). In this case, Phoronix tested twenty-seven NVIDIA GeForce cards across four architectures: Tesla, Fermi, Kepler, and Maxwell. In other words, from the GeForce 8 series all the way up to the GTX 980 Ti.

Nine years of advancements in ASIC design, with a doubling period of 18 months, should yield a 64-fold improvement. Transistor count falls short, showing about a 12-fold increase between the Titan X and the largest first-wave Tesla chip, although that is largely outside a fabless semiconductor designer's control. The main reason I include this figure is to show the actual Moore's Law trend over this time span, but it also highlights the slowdown in process technology.
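The 64-fold expectation is just compounding doublings over the nine-year span:

```python
# Moore's Law arithmetic: one doubling every 18 months over 9 years.
years = 9
doubling_period = 1.5                 # years per doubling
doublings = years / doubling_period   # 6 doublings
expected_gain = 2 ** doublings
print(doublings, expected_gain)       # 6.0 64.0
```

Against that 64x yardstick, the observed 12x in transistor count makes the process-technology slowdown easy to see.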

Performance per watt does depend on NVIDIA, though, and the ratio between the GTX 980 Ti and the 8500 GT is about 72:1. While this is slightly better than the target 64:1 ratio, these parts come from very different positions in their respective product stacks. Swap the 8500 GT for the following year's 9800 GTX, which makes it a comparison between top-of-the-line GPUs of their respective times, and you see only a 6.2x improvement in performance per watt against the GTX 980 Ti. On the other hand, that part was outstanding for its era.

I should note that each of these tests takes place on Linux. The results might not perfectly reflect the landscape on Windows, but again, it's interesting in its own right.

NVIDIA is reportedly preparing faster mobile GPUs based on Maxwell, with a GTX 980MX and 970MX on the way.

The new GTX 980MX would sit between the GTX 980M and the laptop version of the full GTX 980, with 1664 CUDA cores (compared to 1536 with the 980M), 104 Texture Units (up from the 980M's 96), a 1048 MHz core clock, and up to 8 GB of GDDR5. Memory speed and bandwidth will reportedly be identical to the GTX 980M at 5000 MHz and 160 GB/s respectively, with both GPUs using a 256-bit memory bus.

The GTX 970MX represents a similar upgrade over the existing GTX 970M, with the CUDA core count increased from 1280 to 1408, Texture Units up from 80 to 88, and 8 additional raster devices (56 vs. 48). Both the 970M and 970MX use 192-bit GDDR5 clocked at 5000 MHz and are available with the same 3 GB or 6 GB frame buffer.

WCCFtech prepared a chart to demonstrate the differences between NVIDIA's mobile offerings:

Model            | GTX 980 Laptop Version | GTX 980MX    | GTX 980M     | GTX 970MX    | GTX 970M     | GTX 965M   | GTX 960M
Architecture     | Maxwell                | Maxwell      | Maxwell      | Maxwell      | Maxwell      | Maxwell    | Maxwell
GPU              | GM204                  | GM204        | GM204        | GM204        | GM204        | GM204      | GM107
CUDA Cores       | 2048                   | 1664         | 1536         | 1408         | 1280         | 1024       | 640
Texture Units    | 128                    | 104          | 96           | 88           | 80           | 64         | 40
Raster Devices   | 64                     | 64           | 64           | 56           | 48           | 32         | 16
Clock Speed      | 1218 MHz               | 1048 MHz     | 1038 MHz     | 941 MHz      | 924 MHz      | 950 MHz    | 1097 MHz
Memory Bus       | 256-bit                | 256-bit      | 256-bit      | 192-bit      | 192-bit      | 128-bit    | 128-bit
Frame Buffer     | 8 GB GDDR5             | 8/4 GB GDDR5 | 8/4 GB GDDR5 | 6/3 GB GDDR5 | 6/3 GB GDDR5 | 4 GB GDDR5 | 4 GB GDDR5
Memory Frequency | 7008 MHz               | 5000 MHz     | 5000 MHz     | 5000 MHz     | 5000 MHz     | 5000 MHz   | 5000 MHz
Memory Bandwidth | 224 GB/s               | 160 GB/s     | 160 GB/s     | 120 GB/s     | 120 GB/s     | 80 GB/s    | 80 GB/s
TDP              | ~150W                  | 125W         | 125W         | 100W         | 100W         | 90W        | 75W

These new GPUs will reportedly be based on the same Maxwell GM204 core, and TDPs are apparently unchanged at 125W for the GTX 980MX, and 100W for the 970MX.
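The bandwidth figures in the chart follow directly from the effective memory clock and bus width; a quick sanity check (assuming the quoted MHz values are effective per-pin data rates, i.e. MT/s):

```python
def bandwidth_gbs(effective_mhz: float, bus_width_bits: int) -> float:
    """GB/s from effective memory clock (MHz, i.e. MT/s per pin) and bus width in bits."""
    return effective_mhz * bus_width_bits / 8 / 1000

print(bandwidth_gbs(5000, 256))  # 160.0   -> GTX 980MX / 980M
print(bandwidth_gbs(5000, 192))  # 120.0   -> GTX 970MX / 970M
print(bandwidth_gbs(7008, 256))  # 224.256 -> GTX 980 laptop version (~224 GB/s)
```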

UltraWide G-Sync Arrives

When NVIDIA first launched G-Sync monitors, they had the advantage of being first to literally everything. They had the first variable refresh rate technology, the first displays of any kind that supported it, and the first ecosystem to enable it. AMD talked about FreeSync just a few months later, but it wasn't until March of 2015 that we got our hands on the first FreeSync-enabled display, and it was very much behind the experience provided by G-Sync displays. That said, what we saw with that launch, and continue to see as time goes on, is that there are far more FreeSync options, with varying specifications, than what NVIDIA has built out.

This is important to note only because, as we look at the Acer Predator X34 monitor today, the first 34-in curved panel to support G-Sync, it comes 3 months after the release of a similarly specced monitor from Acer that works with AMD FreeSync. The not-as-sexily-named Acer XR341CK offers a 3440x1440 resolution, a 34-in curved IPS panel, and a 75Hz refresh rate.

But, as NVIDIA tends to do, it found a way to differentiate its own products, with the help of Acer. The Predator X34 monitor has a unique look and style to it, and it raises the maximum refresh rate to 100Hz (although that is considered overclocking). The price is a bit higher too, coming in at $1300 or so on Amazon.com; the FreeSync-enabled XR341CK sells for just $941.

NVIDIA has been pushing for WHQL certification for its drivers, but sometimes issues slip through QA, both at Microsoft and at NVIDIA's own internal team(s). Sometimes these issues will be fixed in a future release, but sometimes NVIDIA pushes out a "HotFix" driver immediately. These drivers are often great for people who experience the problems, but they should not be installed otherwise.

In this case, GeForce Hotfix driver 361.60 fixes two issues. One is listed as "install & clocking related issues," which refers to the GPU memory clock. According to Manuel Guzman of NVIDIA, some games and software were not causing the driver to fully raise the memory clock to its high-performance state. The other is "Crashes in Photoshop & Illustrator," which fixes blue screen issues in both applications, and possibly in other programs that use the GPU in similar ways. I've never seen GeForce driver 361.43 cause a BSOD in Photoshop, but I am a few versions behind with CS5.5.

How much information can be gleaned from an import shipping manifest (linked here)? The data indicates a chip with a 37.5 x 37.5 mm package and 2152 pins, which is being attributed to the GP104 based on knowledge of “earlier, similar deliveries” (or possible inside information). This has prompted members of the 3dcenter.org forums (German language) to speculate on the use of GDDR5 or GDDR5X memory based on the likelihood of HBM being implemented on a die of this size.

Of course, NVIDIA has stated that Pascal will implement 3D memory, and the upcoming GP100 will reportedly be on a 55 x 55 mm package using HBM2. Could this be a new, lower-cost part using the existing GDDR5 standard or the faster GDDR5X instead? VideoCardz and WCCFtech have posted stories based on the 3DCenter report, and to quote directly from the VideoCardz post on the subject:

"3DCenter has a theory that GP104 could actually not use HBM, but GDDR5(X) instead. This would rather be a very strange decision, but could NVIDIA possibly make smaller GPU (than GM204) and still accommodate 4 HBM modules? This theory is not taken from the thin air. The GP100 aka the Big Pascal, would supposedly come in 55x55mm BGA package. That’s 10mm more than GM200, which were probably required for additional HBM modules. Of course those numbers are for the whole package (with interposer), not just the GPU."

All of this is a lot to take from a shipping record that might not even be related to an NVIDIA product, but the report has made the rounds at this point so now we’ll just have to wait for new information.

NVIDIA has just announced a new game bundle. If you purchase an NVIDIA GeForce GTX 970, GTX 980 desktop or mobile, GTX 980 Ti, GTX 980M, or GTX 970M, then you will receive a free copy of Rise of the Tomb Raider. As always, make sure the retailer is selling the participating card. If the product has a download code, it will be specially marked. NVIDIA will not upgrade non-participating stock to the bundle.

Rise of the Tomb Raider will go live on January 29th. It was originally released in November as an Xbox One timed exclusive. It will also arrive on the PlayStation 4, but not until “holiday,” which is probably around Q4 (or maybe late Q3).

If you purchase the bundle, then your graphics card will obviously be powerful enough to run the game. At a minimum, you will require a GeForce GTX 650 (2GB) or an AMD HD 7770 (2GB). The CPU needs are light too, requiring just a Sandy Bridge Core i3 (Intel Core i3-2100) or AMD's equivalent. Probably the only concern would be the minimum of 6GB of system RAM, which also requires a 64-bit operating system. Now that the Xbox 360 and PlayStation 3 have been deprecated, 32-bit gaming will be increasingly rare for "AAA" titles. That said, we've been ramping up to 64-bit for the last decade; one of the first games to support x86-64 was Unreal Tournament 2004.

Other than the in-depth discussion of the Drive PX 2 and its push into autonomous driving, NVIDIA didn't have much news to report. We stopped by the suite and got a few updates on SHIELD and the company's VR Ready program, which certifies systems that meet minimum recommended specifications for a solid VR experience.

For the SHIELD, NVIDIA is bringing Android 6.0 Marshmallow to the device, with new features like shared storage and the ability to customize the home screen of the Android TV interface. Nothing earth-shattering, and all of it is part of the 6.0 rollout.

The VR Ready program from NVIDIA will validate notebooks, systems, and graphics cards that have enough horsepower to meet the minimum performance levels for a good VR experience. At this point, the specs essentially match what Oculus has put forth: a GTX 970 or better on the desktop and a GTX 980 (full, not 980M) on mobile.

Other than that, Ken and I took in some of the more recent VR demos, including Epic's Bullet Train on the final Oculus Rift and Google's Tilt Brush on the latest iteration of the HTC Vive. Those were both incredibly impressive, though the Everest demo that simulates a portion of the mountain climb was the one that really made me feel like I was somewhere else.