As Intel's next generation enthusiast desktop platform gets closer to fruition, several leaks (such as Gigabyte's X99 manual) and motherboard teasers have surfaced on the Internet. A few days ago, EVGA posted a teaser photograph of an upcoming "next generation" Micro ATX motherboard on its Instagram page.

The so-called EVGA X99 Micro is set to be the company's smallest Classified-branded X99 chipset offering, supporting multiple graphics cards, DDR4 memory, and (of course) Intel's upcoming Haswell-E processors. The all-black motherboard features black heatsinks over the PCH and power delivery hardware. It is outfitted with a 10-phase VRM feeding the CPU socket (LGA 2011-3), two DDR4 memory slots on each side of the processor socket, three PCI-E 3.0 x16 slots (just enough to max out a Core i7-5820K), one M.2 connector, and six SATA III 6Gbps ports. The board will support USB 3.0 and USB 2.0 ports, but beyond that it is difficult to say what the exact rear I/O configuration will be, as a metal shield blocks off the ports in the teaser photo. An eight-pin CPU power connector along with a 24-pin ATX connector gets power to the board. Overclockers will be further pleased to see physical power and reset buttons.

According to Maximum PC, this pint-sized Classified motherboard will be priced around $250, making it one of the most expensive mATX motherboards around. As part of EVGA's Classified series, it should pack plenty of overclocking-friendly features in its UEFI firmware along with high-end hardware build quality. This could make for one heck of a powerful small form factor system, and I'm looking forward to seeing what people are able to get out of this board (especially when it comes to overclocking Haswell-E)!

Haswell-E, with its X99 chipset, is expected to launch soon. This will bring a new spread of processors and motherboards to the high-end, enthusiast market. These are the processors that fans of Intel should buy if they have money, want all the RAM, and have a bunch of PCIe expansion cards to install.

The Intel enthusiast platform typically has 40 PCIe lanes, while the mainstream platform has 16. For Haswell-E, the Core i7-5820K will be the exception with just 28 lanes. According to Gigabyte's X99 manual, the four full-sized PCIe slots will have the following possible configurations:

Core i7-5930K (and above)

First Slot (PCIe 1) | Second Slot (PCIe 4) | Third Slot (PCIe 2) | Fourth Slot (PCIe 3)
16x                 | Unused               | 16x                 | 8x
8x                  | 8x                   | 16x                 | 8x

Core i7-5820K

First Slot (PCIe 1) | Second Slot (PCIe 4) | Third Slot (PCIe 2) | Fourth Slot (PCIe 3)
16x                 | Unused               | 8x                  | 4x
8x                  | 8x                   | 8x                  | 4x

If you count the PCIe x1 slots, the table would refer to the first, third, fifth, and seventh slots.

To me, this is not too bad. You are able to use three GPUs with eight-lane bandwidth and stick a four-lane PCIe SSD in the last slot. Considering that each lane is PCIe 3.0, it is similar to having three PCIe 2.0 x16 slots. While two-way and three-way SLI are supported on all CPUs, four-way SLI is only allowed with processors that provide forty lanes of PCIe 3.0.
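For a rough sanity check on that comparison, here is the arithmetic. The per-lane figures follow from each generation's signaling rate and encoding overhead (a quick sketch, not a measurement of any real board):

```python
# Per-lane throughput after encoding overhead, in MB/s.
# PCIe 2.0 signals at 5 GT/s with 8b/10b encoding (80% efficient).
# PCIe 3.0 signals at 8 GT/s with 128b/130b encoding (~98.5% efficient).
pcie2_lane_mbs = 5e9 * (8 / 10) / 8 / 1e6     # 500.0 MB/s per lane
pcie3_lane_mbs = 8e9 * (128 / 130) / 8 / 1e6  # ~984.6 MB/s per lane

print(f"PCIe 2.0 x16: {pcie2_lane_mbs * 16:.0f} MB/s")  # 8000 MB/s
print(f"PCIe 3.0 x8:  {pcie3_lane_mbs * 8:.0f} MB/s")   # 7877 MB/s
```

So an eight-lane PCIe 3.0 slot delivers essentially the same bandwidth as a sixteen-lane PCIe 2.0 slot, which is why the 8-8-8 layout is less of a compromise than it looks.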

Gigabyte also provides three PCIe 2.0 x1 slots, which are not handled by the CPU and do not count against its available lanes.

Since I started to write up this news post, Gigabyte seems to have replaced their manual with a single, blank page. Thankfully, the page stayed cached long enough for me to finish my thoughts. Some sites claim that the manual failed to mention the 8-8-8 configuration and suggest that three-GPU configurations were impossible. That is not true; the manual refers to these situations, just not in the clearest of terms.

Haswell-E should launch soon, with most rumors pointing to the end of the month.

At $145, the ASUS Z97-A is an inexpensive base for a system, and yet it still offers quite a few higher-end features. Three PCIe 3.0 x16 slots that support CrossFire and SLI, along with a pair each of PCIe 2.0 x1 and legacy PCI slots, allow for a variety of configurations. The half dozen SATA 6Gbps ports are simply expected now, but the addition of an M.2 port is a welcome enhancement. When [H]ard|OCP overclocked their 4790K in this board they could almost hit 4.8GHz, ending up with 4.7GHz as the best overclock, which makes it perfect for the price-conscious consumer. Read their full review of this Gold-winning motherboard here.

"While ASUS is usually known for motherboards like the Maximus and Rampage Extreme series’ or even feature rich solutions like the Z97-Deluxe it is motherboards like the Z97-A that are ASUS’ bread and butter. Shopping for an inexpensive motherboard doesn’t have to mean accepting poor quality feature stripped solutions."

Intel's first generation low-power SoC board, which goes by the name of Galileo and is powered by a 400MHz Quark X1000, is now capable of running Windows with the help of the latest firmware update. If you are familiar enough with Intel's tweaked Arduino IDE, you should be able to build a testbed for low-powered machines running Windows. You will want to have some time on hand: loading Windows onto the microSD card can take up to two hours, and those used to SSDs will be less than impressed with the boot times. For developers this is not an issue and is well worth the wait, as it gives them a brand new tool to work with. Pop by The Register for the full details of the firmware upgrade and installation process.

"Windows fans can run their OS of choice on Intel’s counter to Raspberry Pi, courtesy of an Intel firmware update."

Let's be clear: there are two stories here. The first is the release of OpenGL 4.5 and the second is the announcement of the "Next Generation OpenGL Initiative". They both occur in the same press release, but they are two different statements.

OpenGL 4.5 Released

OpenGL 4.5 expands the core specification with a few extensions. Compatible hardware, with OpenGL 4.5 drivers, will be guaranteed to support these. This includes features like ARB_direct_state_access, which allows modifying objects without binding them to the context, and support for OpenGL ES 3.1 features that are traditionally missing from OpenGL 4, which allows easier porting of OpenGL ES 3.1 applications to OpenGL.

It also adds a few new extensions as an option:

ARB_pipeline_statistics_query lets a developer ask the GPU what it has been doing. This could be useful for "profiling" an application (listing completed work to identify optimization points).

ARB_sparse_buffer allows developers to perform calculations on pieces of generic buffers, without loading the whole buffer into memory. This is similar to ARB_sparse_texture... except that that extension is for textures. Buffers are useful for things like vertex data (and so forth).

ARB_transform_feedback_overflow_query is apparently designed to let developers choose whether or not to draw objects based on whether the buffer has overflowed. I might be wrong, but it seems like this would be useful for deciding whether or not to draw objects generated by geometry shaders.

KHR_blend_equation_advanced allows new blending equations between objects. If you use Photoshop, this would be "multiply", "screen", "darken", "lighten", "difference", and so forth. On NVIDIA's side, this will be directly supported on Maxwell and Tegra K1 (and later). Fermi and Kepler will support the functionality, but the driver will perform the calculations with shaders. AMD has yet to comment, as far as I can tell.
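To make the Photoshop analogy concrete, these are the standard formulas for a few of those blend modes, sketched here in Python for a single color channel in [0, 1] with both layers fully opaque (the actual extension also handles premultiplied alpha, which is omitted in this sketch):

```python
# Simplified "advanced" blend equations for opaque layers, one channel.
# src is the incoming fragment color, dst is the color already in the buffer.
def multiply(src, dst):   return src * dst              # always darkens
def screen(src, dst):     return src + dst - src * dst  # always lightens
def darken(src, dst):     return min(src, dst)
def lighten(src, dst):    return max(src, dst)
def difference(src, dst): return abs(src - dst)

print(multiply(0.5, 0.5))  # 0.25
print(screen(0.5, 0.5))    # 0.75
```

Moving these equations into the blending stage means a renderer no longer needs a shader pass (or a texture copy of the framebuffer) just to composite layers the way an image editor would.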

Next Generation OpenGL Initiative Announced

The Khronos Group has also announced "a call for participation" to outline a new specification for graphics and compute. They want it to give developers explicit control over CPU and GPU tasks, be multithreaded, have minimal overhead, have a common shader language, and undergo "rigorous conformance testing". This sounds a lot like the design goals of Mantle (and what we know of DirectX 12).

And really, from what I hear and understand, that is what OpenGL needs at this point. Graphics cards look nothing like they did a decade ago (or over two decades ago). They each have very similar interfaces and data structures, even if their fundamental architectures vary greatly. If we can draw a line in the sand, legacy APIs can be supported but not optimized heavily by the drivers. After a short time, available performance for legacy applications would be so high that it wouldn't matter, as long as they continue to run.

On top of that, next-generation drivers should be significantly easier to develop, considering the reduced error checking (and other responsibilities). As I said on Intel's DirectX 12 story, it is still unclear whether it will lead to enough of a performance increase to make most optimizations, such as those which increase workload or developer effort in exchange for queuing fewer GPU commands, unnecessary. We will need to wait for game developers to use it for a bit before we know.

Along with GDC Europe and Gamescom, Siggraph 2014 is going on in Vancouver, BC. At it, Intel had a DirectX 12 demo at their booth. This scene, containing 50,000 asteroids, each in its own draw call, was developed with both Direct3D 11 and Direct3D 12 code paths, which could apparently be switched while the demo is running. Intel claims to have measured both power and frame rate.

Variable power to hit a desired frame rate, DX11 and DX12.

The test system is a Surface Pro 3 with an Intel HD 4400 GPU. Doing a bit of digging, this would make it the Core i5-based Surface Pro 3. Removing another shovel-load of mystery, this would be the Intel Core i5-4300U with two cores, four threads, a 1.9 GHz base clock, up to 2.9 GHz turbo clock, 3MB of cache, and (of course) the Haswell architecture.

While not top-of-the-line, it is also not bottom-of-the-barrel. It is a respectable CPU.

Intel's demo on this processor shows a significant power reduction in the CPU, and even a slight decrease in GPU power, for the same target frame rate. If power is not throttled, Intel's demo goes from 19 FPS all the way up to a playable 33 FPS.

Intel will discuss more during a video interview, tomorrow (Thursday) at 5pm EDT.

Maximum power in DirectX 11 mode.

For my contribution to the story, I would like to address the first comment on the MSDN article. It claims that this is just an "ideal scenario" of a scene that is bottlenecked by draw calls. The thing is: that is the point. Sure, a game developer could optimize the scene to (maybe) instance objects together, and so forth, but that is unnecessary work. Why should programmers, or worse, artists, need to spend so much of their time developing art so that it can be batched together into fewer, bigger commands? Would it not be much easier, and all-around better, if the content could be developed as it most naturally comes together?

That, of course, depends on how much performance improvement we will see from DirectX 12, compared to theoretical max efficiency. If pushing two workloads through a DX12 GPU takes about the same time as pushing one, double-sized workload, then it allows developers to, literally, perform whatever solution is most direct.
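As a toy model of that trade-off (every number below is made up purely to illustrate the reasoning, not taken from Intel's demo): once per-call submission overhead shrinks enough, frame time is dominated by actual GPU work rather than by how many draw calls the scene issues.

```python
# Toy frame-time model: per-call submission overhead plus fixed GPU work.
# All overhead figures are hypothetical, for illustration only.
def frame_time_ms(draw_calls, per_call_overhead_us, gpu_work_ms):
    return draw_calls * per_call_overhead_us / 1000 + gpu_work_ms

calls, gpu_work = 50_000, 8.0  # e.g. one draw call per asteroid
slow_api = frame_time_ms(calls, 1.0, gpu_work)  # ~58 ms (~17 FPS)
fast_api = frame_time_ms(calls, 0.1, gpu_work)  # ~13 ms (~77 FPS)
print(slow_api, fast_api)
```

In this sketch, a 90% cut in per-call overhead moves the same unbatched scene from unplayable to comfortable; batching still helps, but it stops being mandatory.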

Maximum power when switching to DirectX 12 mode.

If, on the other hand, pushing two workloads is 1000x slower than pushing a single, double-sized one, but DirectX 11 was 10,000x slower, then it could be less relevant because developers will still need to do their tricks in those situations. The closer it gets, the fewer occasions that strict optimization is necessary.

If there are any DirectX 11 game developers, artists, and producers out there, we would like to hear from you. How much would a (let's say) 90% reduction in draw call latency (which is around what Mantle claims) give you, in terms of fewer required optimizations? Can you afford to solve problems "the naive way" now? Some of the time? Most of the time? Would it still be worth it to do things like object instancing and fewer, larger materials and shaders? How often?

Transactional Synchronization Extensions, aka TSX, are a backwards-compatible set of instructions which first appeared in some Haswell chips as a method to improve concurrency and multi-threaded scaling with as little work for the programmer as possible. It was intended to improve the scaling of multi-threaded apps running on multi-core processors and has not yet been widely adopted. Adoption has now run into another hurdle: in some cases the use of TSX can cause critical software failures, and as a result Intel will be disabling the instruction set via new BIOS/UEFI updates which will be pushed out soon. If your software uses the new instruction set and you wish it to continue to do so, you should avoid updating your motherboard BIOS/UEFI and ask your users to do the same. You can read more about this bug/errata and other famous problems over at The Tech Report.

"The TSX instructions built into Intel's Haswell CPU cores haven't become widely used by everyday software just yet, but they promise to make certain types of multithreaded applications run much faster than they can today. Some of the savviest software developers are likely building TSX-enabled software right about now."

Coming in 2014: Intel Core M

The era of Broadwell begins in late 2014 and based on what Intel has disclosed to us today, the processor architecture appears to be impressive in nearly every aspect. Coming off the success of the 22nm Haswell design in 2013, the Broadwell-Y architecture will not only be the first to market with the new microarchitecture, but will be the flagship product on Intel’s new 14nm tri-gate process technology.

The Intel Core M processor, as Broadwell-Y has been dubbed, includes impressive technological improvements over previous low-power Intel processors that result in lower power consumption, thinner form factors, and longer battery life. Broadwell-Y will stretch into even lower TDPs, enabling 9mm or smaller fanless designs that maintain current battery lifespans. A new 2nd generation FIVR with a modified power delivery design allows for even thinner packaging and a wider range of dynamic frequencies than before. And of course, along with the shift comes an updated converged core design and improved graphics performance.

All of these changes are in service to what Intel claims is a re-invention of the notebook. Compared to 2010, when the company introduced the original Intel Core processor and almost completely redirected its roadmap, Intel Core M and the Broadwell-Y changes will allow for some dramatic platform shifts.

Notebook thickness will go from 26mm (~1.02 inches) down to as small as 7mm (~0.28 inches), as Intel has proven with its Llama Mountain reference platform. Reductions in total thermal dissipation of 4x, while improving core performance by 2x and graphics performance by 7x, are something no other company has been able to do over the same time span. And in the end, one of the most important features for the consumer is getting double the useful battery life from a smaller (and lighter) battery.

But these kinds of advancements just don’t happen by chance – ask any other semiconductor company that is either trying to keep ahead of or catch up to Intel. It takes countless engineers and endless hours to build a platform like this. Today Intel is sharing some key details on how it was able to make this jump including the move to a 14nm FinFET / tri-gate transistor technology and impressive packaging and core design changes to the Broadwell architecture.

Intel 14nm Technology Advancement

Intel consistently creates and builds the most impressive manufacturing and production processes in the world, and that has helped the company maintain market leadership over rivals in the CPU space. It is also one of the key tenets that Intel hopes will help it deliver in the world of mobile, including tablets and smartphones. At the 22nm node, Intel was the first to offer 3D transistors, which it calls tri-gate and others refer to as FinFET. By focusing on power consumption rather than top-level performance, Intel was able to build the Haswell design (as well as Silvermont for the Atom line) with impressive performance and power scaling, allowing thinner and less power-hungry designs than previous generations. Some enthusiasts might think that Intel has done this at the expense of high-performance components, and there is some truth to that. But Intel believes that by committing to this space it builds the best future for the company.