I keep wondering: why doesn't AMD just slap a CPU and a GPU architecture on the same die?

For instance, if AMD's 77xx class GPU has 1.5 billion transistors and Trinity has 1.3 billion, why not just stick both of them on the same die? Don't AMD's 78xx GPUs have a 2.8 billion transistor budget?

One of the major problems with what you're suggesting (an enthusiast class discrete gpu coupled with a cpu of equivalent pedigree) is memory. To have the memory interface necessary to feed the gpu, I think you'd be looking at a LOT more pins on the cpu and traces on the motherboard. You'd probably also need a separate memory interface for the cpu side of things: the memory used for gpus is high bandwidth but also high latency, whereas cpu memory needs lower latency and so goes with lower bandwidth as well.

Getting that much on one die is also going to cause a lot of manufacturing headaches, because if there's a defect in a single critical area of either the cpu or gpu, say goodbye to the whole 3 billion transistor chip. That's going to drive up manufacturing costs, which drives up the final price and likely results in a product that's overpriced compared to just purchasing the cpu and gpu separately in the traditional manner.

Finally, dissipating that much heat is going to be a major problem, because you're going to have a 65-130 watt processor and a 200-300 watt gpu all concentrated in a single area. It would almost certainly require a cooling setup that vents the exhaust from your monstrous chip outside the case, which might mean water cooling or custom cases and ductwork, further driving up the price. You'd probably need some sort of "cooperation" between the cpu and gpu to manage thermals, which isn't out of the question since modern cpus with igpus already do this. But to be workable and even moderately affordable, I think you'd have to create separate dies for the cpu and the gpu and then mount them on the same package with some sort of interconnect.

The only real advantage I could see is that there would be (or could be) a very high bandwidth, low latency interconnect between the cpu and gpu, and you might be closer to AMD's vision of heterogeneous computing if you made some modifications to exploit the advantages conferred by having an enthusiast class gpu on the same package as the cpu.

I don't think it's a stupid question; I wondered that long ago, although that was before they actually did move the gpu onto the cpu die.

I think the two biggest things standing in the way of a company like AMD putting an enthusiast class GPU and CPU on the same die are TDP and market reception.

For the first, imagine a single die that requires 350W or more of power. Current CPU heatsink/fan combos, even the pricier aftermarket models, are really only designed to dissipate up to about 150W, if that. We are seeing GPU air coolers that manage 250W+, but these become quite large and noisy when pushed to handle anything above 150-200W (i.e., at load). As a result, a combo enthusiast die would in all likelihood require quite an obnoxious air cooler (especially if one wanted to overclock).
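To put the thread's numbers side by side, here's a quick back-of-the-envelope sketch in Python. The wattages are the ballpark ranges quoted in this thread, not official TDP specs for any actual product:

```python
# Back-of-the-envelope sum of the thermal envelopes quoted in this thread
# (ballpark figures, not official TDP specs for any real product).
cpu_tdp = (65, 130)    # enthusiast CPU range, watts
gpu_tdp = (200, 300)   # enthusiast discrete GPU range, watts

combined = (cpu_tdp[0] + gpu_tdp[0], cpu_tdp[1] + gpu_tdp[1])
print(f"combined die: {combined[0]}-{combined[1]} W")  # 265-430 W

cooler_limit = 150     # rough capacity of a typical CPU air cooler, watts
print(f"worst case exceeds a typical CPU cooler by "
      f"{combined[1] - cooler_limit} W")                # 280 W over
```

Even the low end of that combined range is well beyond what a single-socket air cooler is built for.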

The added cost of designing a motherboard that could handle a significantly higher power envelope and higher memory bandwidth package might be another factor AMD doesn't want to risk. Even if feasible, it may well drive up the cost of the requisite motherboard compared to currently offered alternatives with separate enthusiast CPUs/GPUs, making it a less desirable option for customers.

I think the most important reason is market reception. AMD is pretty much the only company that could offer such a product right now, yet they don't have the best reputation for high end CPUs at the moment. This is the great AMD catch-22. With Trinity, AMD is still playing it relatively safe and following Intel by adding a GPU that doesn't take many more resources than a CPU alone, both in cost to AMD and in memory/power requirements. As such, they can still offer Trinity at affordable prices (competitive with Intel's chips). Sure, a few enthusiasts may be OK with paying $400 or more for an AMD CPU with an enthusiast class GPU (i.e., HD 78XX or greater), but perhaps AMD is just too wary of the market's reception of such a product as a whole. In other words, perhaps they're afraid it will rub people the wrong way to pay so much for a CPU that is outperformed by an Intel CPU at half the cost, regardless of the GPU performance. If AMD's attempt to steer media coverage of Trinity toward the GPU performance side (by encouraging early coverage of GPU performance) is any indication, I think this is likely.

Ironically, if Intel was in the same position GPU-wise, I wouldn't be surprised if such a product was in the works already.

AMD is doing it, Intel is doing it - it's the direction the industry is moving in.

But like others have said, there are power/heat/memory/space/cost limitations to consider - and even the fact that the computer industry has its own "momentum" is worth considering; it's hard to change standards and practices overnight.

But I wouldn't be surprised if in 10-15 years your regular Intel or AMD CPU is powerful enough for all your GPU needs - 4K 3D video and powerful 3D rendering for games/CAD.

Which of course has Nvidia scared like you wouldn't believe - not that they'd tell you that.

Right, I'm aware of Llano and Trinity. I've been following AMD's Fusion for years (or so it seems). I've been excited and daydreaming about building a great little all purpose ITX system, but now that I'm seeing the benchmarks I'm rather sad. I was expecting performance of at least a low end discrete GPU. That's why I used a 7750 or 7770 in my hypothetical.

I thought that if AMD can manufacture a 3+ billion transistor 7970 die, then why not a modest CPU and a 77xx class IGP?

I can accept that Trinity's CPU != i3, but its graphics, while much better than Intel's, are still not worth losing a discrete graphics card over. In fact, I'm amazed at how many sites are pairing up Trinity with a graphics card in their benchmarks! Wasn't that the whole point of Trinity?

I'm scared to death of AMD going bankrupt, after which Intel would really slow down their graphics push.

CPU based graphics are never going to catch up to a discrete solution in terms of performance.

And I suspect CPU based graphics always will lag 2-3 generations behind a discrete solution in terms of performance.

What will probably happen, however, is that eventually both CPU graphics and discrete solutions become so powerful that people stop caring - sort of like what's happening right now with CPUs. After all, a humble i3 is more than enough for 95% of today's business users, and a tablet is all that's needed to play casual games and browse the web.

It's still early in the game - if you want to play the latest generation of games with good settings, you're going to have to get the latest generation of graphics cards as well, which means a discrete solution.

The reason they're pairing Trinity up with GPUs is to test the new Piledriver core, especially compared to the largely equivalent i3 (where it loses). Mostly, it was hope.

But don't expect 'CPU graphics' (you mean the IGP) to get anywhere near real discrete cards. To get anywhere near the memory bandwidth needed, they're going to have to invest more in the motherboard, socket, and CPU die to get more memory channels, which would only be used for intense gaming; the CPUs don't need it. It's a losing proposition from every point of view.

I think memory bandwidth is the single largest constraining factor in getting the performance levels you're talking about, with TDP and die size second and third. A 7750 has 72 GB/s of memory bandwidth, all dedicated to the GPU. The CPUs in the target market for integrated graphics have 16-20 GB/s, and it's shared between the cpu and the gpu. Providing enough bandwidth is going to be expensive in terms of motherboard modifications (you'll probably need at least 8 channels, with all of the related traces and DIMM slots).
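For a sense of scale, here's a quick sketch of the channel math, assuming standard 64-bit DDR3 channels and using theoretical peak numbers only (real-world sustained bandwidth is lower, which is why the estimate lands at "at least 8" rather than 6):

```python
import math

# Rough check of the bandwidth gap using theoretical peak numbers.
def ddr3_peak_gbps(mt_per_s, bus_bytes=8):
    """Theoretical peak of one 64-bit (8-byte) DDR3 channel in GB/s."""
    return mt_per_s * bus_bytes / 1000

per_channel = ddr3_peak_gbps(1600)            # DDR3-1600: 12.8 GB/s peak
gpu_need = 72.0                               # HD 7750's memory bandwidth
channels = math.ceil(gpu_need / per_channel)  # channels needed at peak
print(per_channel, channels)                  # 12.8 6
```

Six channels at unachievable theoretical peak; with realistic efficiency and the CPU contending for the same bus, 8 channels is a plausible floor.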

10 years down the road, who knows? Possibly. Or it's possible we won't even be using polygonal rendering techniques. I think it'll take longer than that for graphics to get good enough on the igp to essentially kill the discrete market, but it's likely that it wouldn't take a mass exodus to igps even amongst gamers to make discrete graphics an untenable business. The graphics cards manufacturers rely on a large userbase to fund the R&D that goes into each new generation, and to amortize production costs. If say, 20% or 25% of current video card customers just stopped buying cards tomorrow, it might be enough to push at least one of the graphics card players out of the market.

A 7750 has 72 GB/s of memory bandwidth, all dedicated to the GPU. The CPUs in the target market for integrated graphics have 16-20 GB/s, and it's shared between the cpu and the gpu.

Oh. There must be a (very large) gap in my understanding. So the 72 GB/s is over the bus? That would be a PCIe bus? Whereas the 16-20 GB/s would be between the APU and motherboard? So increasing the memory bandwidth on the APU would require adding more pins to the APU which would of course require altering the motherboard etc.?

And are both these 72 GB/s and 16-20 GB/s interfaces communicating with the DRAM? I'm leaving out the Video memory on the discrete graphics card for simplicity. (That's mainly for texture and screen buffering?)

Yup, it basically boils down to heat and memory bandwidth. Providing even modest 3D graphics performance is quite demanding in terms of memory bandwidth; performance of both the GPU and CPU will suffer without dedicated frame buffer and texture storage. So on-die GPU doesn't really make sense for anything above low-mid range.

WhatMeWorry wrote: Oh. There must be a (very large) gap in my understanding. So the 72 GB/s is over the bus? That would be a PCIe bus?

No, it's an internal memory bus between the GPU and the memory chips on the graphics card.

WhatMeWorry wrote: And are both these 72 GB/s and 16-20 GB/s interfaces communicating with the DRAM? I'm leaving out the Video memory on the discrete graphics card for simplicity. (That's mainly for texture and screen buffering?)

That's precisely where you need the most bandwidth. Without dedicated DRAM, it all needs to go over the CPU's memory bus.

The years just pass like trains. I wave, but they don't slow down. -- Steven Wilson

So hypothetically, even if you yanked the CPU off the motherboard and replaced it with a pure GPU, you'd still have a GPU with only a 16-20 GB/s memory interface.

So this would seem to severely restrict any APU or IGP. In fact, doesn't this raise the question of how good an APU or IGP could ever get? Is it even possible (using existing infrastructure) for an APU or IGP to ever get as fast as even the lowest tier enthusiast class discrete video card?

Guess I'd better put aside my childish dreams of building a great little video-card-less game machine. And maybe Intel (even with Haswell or Broadwell) will never be able to encroach on AMD's or NVIDIA's discrete video card business.

As for the memory bandwidth, I don't see that as being the main deterrent in and of itself. Sure, it wouldn't work with current cheaper APU designs that share system memory. Since an entirely new motherboard socket would need to be designed anyway, there is no reason AMD couldn't keep high speed frame buffer memory separate from system memory on the motherboard. Heck, AMD could even start marketing expensive GDDR5 (or equivalent) modules that you plug into a separate slot, or just keep the frame buffer memory onboard the APU package. But then, of course, you'd have something resembling a video card in place of the CPU socket - which, given the TDP, would probably be more expensive to implement and cool than the current CPU/video card combo. And it wouldn't really save any room or allow for a much more compact enclosure.

Then you're still faced with the current market perception of AMD CPUs. Right now, most enthusiasts craving performance would just prefer an i5/i7 CPU. For AMD, such an endeavor at this stage would be quite risky, even if R&D and production costs of such a high performing APU package were reasonable.

cynan wrote: Since an entirely new motherboard socket would need to be designed anyway, there is no reason AMD couldn't keep high speed frame buffer memory separate from system memory on the motherboard.

Problem with this is that you'd need a socket with several hundred additional pins to provide the sort of memory bandwidth you'd need. Not impossible, but you'd either need a really huge socket (and correspondingly large CPU package), or have to go to more closely spaced pins/pads.


codedivine wrote: An alternative possibility is to look at GPU architectures that require less memory bandwidth. Tile based renderers might help?

Possibly. I am not certain by how much (if any) tile-based rendering reduces memory bandwidth requirements.

The only reference I have is this paper: http://academic.research.microsoft.com/ ... 91556.aspx They claim a tile-based renderer can provide a 2x reduction in memory bandwidth compared to "traditional architectures". This work is a bit old, though; it's not really based on modern pipelines and modern workloads, so I'm not sure how much of it is applicable in 2012.