Two major pieces of AMD news crossed the wire today, and both could be welcome developments for the struggling chip company. First, AMD is announcing a major reorganization of its graphics division. The entire graphics team will now be headed by Raja Koduri, including all aspects of GPU architecture, hardware design, driver deployments, and developer relations. Koduri left AMD for Apple in 2009, only to return to the company in 2013. Since then, he’s served as Corporate Vice President of Visual Computing.

Now, Raja is being promoted to senior vice president and chief architect of AMD’s entire graphics business (dubbed the Radeon Technologies Group). In this new role, Koduri will oversee the development of future console hardware, AMD’s FirePro division, the GPU side of APUs, and all of AMD’s graphics designs on 14/16nm. Bringing all of these elements under one roof, along with developer relations and driver development, will allow AMD to unify its approach to products that were previously managed by different departments reporting to different managers; that could pay significant dividends in areas like driver quality and feature updates. Koduri is well-respected in the industry, and we’ve heard that the R9 Nano, which debuts in the very near future, was a project he championed at AMD.

Based on what we’ve heard, AMD isn’t just shuffling employees on a spreadsheet — it’s looking to increase its investment in graphics products as well. While we wouldn’t expect the company to suddenly hurl huge amounts of money at the concept, this is an excellent time to make prudent additional expenditures in the GPU market. 14/16nm GPUs will come to market next year, with significant performance and power consumption improvements. The advent of HBM2 will allow for larger frame buffers and could turbo-charge AMD’s future Zen-based APUs. If AMD seizes these technological opportunities and capitalizes on the recent launch of DirectX 12, it’ll be in a much stronger competitive position 12-18 months from now.

Did Silver Lake acquire part of AMD?

There’s a rumor making the rounds today that Silver Lake Partners may have purchased a significant stake in AMD. Fudzilla reports that Silver Lake Partners, which holds significant shares in Avago, Alibaba, and Dell, may have purchased a 20% share of the company. Such a move would inject much-needed capital into AMD and likely negate the need to take on additional debt to finance continuing operations.

If the rumor proves true, it suggests that Silver Lake saw something in AMD’s future roadmap that they felt justified the investment stake. Such an announcement would likely buoy the company’s rather battered stock price, while the fresh cash injection could help AMD hit its production targets for 2016 and beyond. Even if Zen and AMD’s first ARM-based hardware both hit the ground running in 2016, it’ll take time for AMD to rebuild its overall market position.

Ok awesome… except!! Last I heard, Zen was something different from their ARM hybrid project (which they cancelled). Zen is their next CPU line, meant to move past FX as their higher-end CPUs.

Daniel Anderson

Hopefully I can give a little info; you can find much more on various tech websites and sites such as Seeking Alpha. Zen is moving back to SMT architecture which was last used in their Phenom II series. However, this time they’ll employ a type of Hyperthreading (see: Intel) as well as use all instruction sets used by Intel, dropping some of AMD’s own bulldozer instruction set. They are basically trying to bridge the gap they’ve suffered from not having their instruction sets adopted. They’ll be gaining 8 physical cores on the desktop and 16+ on the server side, and Hyperthreading will double these. The biggest piece isn’t the fact that they’ll be throwing more cores, but they’ll be stacking the cores on top of the solution – which on paper should be a more efficient chip. IPC is said to be 40% better than excavator to boot, which isn’t tough to do seeing they are drastically different. Their chip is being created for servers first and foremost; then it will trickle down to the home user. The significance of this is that they’ll be maximizing efficiency (high performance and low TDP) first and foremost, so theoretically the end user should receive a decent product. They haven’t done this since their Athlon 64 series chips, IIRC (the latest would potentially be Barcelona).

Most of this should be considered rumor, but hopefully it sheds a little light. What can be verified is that they’re taking 2016 seriously for both graphics and CPUs; even if some of my claims are false, 2016 will make or break them, IMO.

yesfull_man

Now all that info… I did not know! Oh man, that sounds amazing! All that, and they’ll be keeping their 8 cores! I definitely like that they’re looking back to their older architecture… I did feel like it was more stable than the one used with FX… I would love to see what happens.

I do agree AMD is starting to make a much tougher push again, which I am excited for! My next upgrade is going to be next year, so I am excited. Thanks for the info, man! Zen SOUNDS AWESOME!

Daniel Anderson

Yes it does. I highly recommend waiting for benchmarks and reviews of Intel vs. AMD CPU performance and whatnot. But I think it’s safe to say that if AMD doesn’t at least get close and compete again on price vs. performance, they’ll be done for. They have plenty of room to grow.

Domaldel

Well, the rumors put the Zen chip roughly in the Haswell ballpark as far as single-threaded performance is concerned.
That will of course be a bit behind Intel’s newest and best in 2016, but with a good price it should be good enough for most of us, I think.
Especially when you look at their core counts (and thread counts).
And when you look at the possibilities of their APUs with the tech the company has now…
I mean, their top CPU for the consumer socket will probably be the 16-core, 32-thread part.
But the same socket should also allow us to run an 8-core, 16-thread APU, and that APU could potentially have on-board high-bandwidth memory shared between the GPU and CPU thanks to HSA.
And considering that bandwidth has traditionally been the Achilles’ heel of APUs, this could allow them to dramatically increase the number of GPU cores on their APUs.
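To put rough numbers on that bandwidth gap: peak memory bandwidth is just bus width times transfer rate. The figures below are the published specs for dual-channel DDR4-2400 (what a 2015-era APU feeds its iGPU with) and a single stack of first-generation HBM; the comparison is illustrative only, not a claim about any specific AMD product.

```cpp
#include <cassert>

// Theoretical peak bandwidth in GB/s = (bus width in bits / 8) * transfer rate in GT/s.
constexpr double peak_bandwidth_gbs(int bus_bits, double gigatransfers) {
    return bus_bits / 8.0 * gigatransfers;
}

// Dual-channel DDR4-2400: 128-bit bus at 2.4 GT/s -> ~38.4 GB/s.
constexpr double ddr4_dual_channel = peak_bandwidth_gbs(128, 2.4);

// One stack of first-gen HBM: 1024-bit bus at 1.0 GT/s -> 128 GB/s.
constexpr double hbm1_one_stack = peak_bandwidth_gbs(1024, 1.0);

// A single HBM stack more than triples what the iGPU gets from system DDR4,
// which is why on-package memory could change the GPU-core budget of an APU.
static_assert(ddr4_dual_channel < hbm1_one_stack, "one HBM stack out-runs dual-channel DDR4");
```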

Now, that could allow them to do some amazing things, if developers actually choose to sit down, look at these things, and write programs that utilize the possibilities this affords.

Of course, there’s a fair bit of speculation here, and it’s possible this won’t all be in the first generation, but they have all the pieces, so…

Daniel Anderson

I could be wrong, but most indications point to an 8-physical-core max for desktop and 16 physical cores for servers in 2016. I’d love to see a Zen APU with HBM in 2016, but from what I’ve read it looks like Excavator will be it: Zen APUs for servers in 2016, Zen APUs (with or without HBM) for desktops in 2017.

The thing I’m going to love most about so many cores/threads is that the AI in video games can be made much better. Hopefully the PC industry booms in 2016.

Domaldel

Um, nope, I believe you’ve missed out on quite a bit of information.
You’re right about the desktop Zen APU not being out till 2017, but the Zen desktop CPU will come out in 2016.
And for the CPU you’ll get a max core count of 16 cores; the APU, however, is limited to 8 CPU cores, as the rest of the power and heat budget is reserved for the iGPU.
As for thread count, we’re talking 2x the core count.

For servers they plan even more cores.
Heck, there’s even talk of a 32-core (64-CPU-thread, plus who knows how many GPU cores and threads) server “EHP” (basically an APU on steroids) for the high-end computing/supercomputer crowd…
That’s not something us regular mortals will ever see, though…

Of course, those EHPs are meant to be part of a supercomputer with hundreds of processors, each with 32 cores and 64 threads, about 32 GB of on-board high-bandwidth memory according to one source, a memory controller able to access more than 1 TB (TB, not Tb) of DDR4, an iGPU whose exact power we don’t know but which has been speculated to be roughly on par with a 290X, and so on and so forth…
Yeah, it’s kind of a monster.
It won’t be as fast as an Intel + Nvidia dGPU solution in some tasks, but it will be faster in others, if the rumors hold water.

Of course, exactly which sources you (more or less) trust plays a huge part in all the estimates involved here…

For instance, one of the places I’m reading, WCCFTech, is a site you should *always* take with more than one pinch of salt…

Daniel Anderson

The EHP ended up being a concept aimed at lower power consumption, and nothing based on that architecture is planned for sale as of yet.

I understand the Zen desktop CPU will be out in 2016; the question remains the core count. We’ll be finding out that information from a more grounded source in the next six months or so, I’m sure. 8 physical cores, 16 physical cores: I think we’re talking about the same thing with different terminology. You mention “As for thread count we’re talking 2 x core count”; this is why I said phy cores (physical cores), which really amounts to the same thing. It’s all good in my book. With the onset of DX12 and faster internet speeds being more readily available, it’s just the right time for home IT businesses to spring up. Time to convert my kitchen into a walk-in freezer, lol.

Domaldel

Yes, the idea was lower power consumption; integrating things onto one chip tends to help with that, with shorter travel distances for the signals and such.
But why do you think that implies a low-performance chip?
(From your tone, it sounds like you do.)

As for the core count:
Remember that although few people actually *need* 16 cores, it would allow them a cheap marketing trick: they’d again be ahead of Intel in the consumer space in core count and in total multithreaded performance of the whole chip, even if the individual threads were behind.
(That no one actually needs that many cores in everyday life is beside the point in that situation; your average consumer doesn’t know that and has been taught over the last few years to equate more cores with more performance. It’s a cheap trick, but AMD needs everything it can get…)

And at any rate, having a socket with a lot of headroom for good graphics solutions might make it possible to actually outperform Intel where it counts:
integrated graphics processors.
If they’re capable of offering an APU that serves the average guy’s gaming needs better than an Intel option at a lower price, they’ll win market share.
And they’d be close on the CPU side while also having power headroom that none of Intel’s sockets have, for unrivalled graphics performance unless you spend extra money on a dGPU…
They already have experience selling power-hungry CPUs with water cooling included in the deal, and they know they have enthusiasts willing to use water cooling on their products.
And with the availability of AIO coolers, they might actually make the 8-core-plus-big-GPU solution work quite well.

Of course, a side effect of such a high power headroom on the socket is that if they chose to sell a CPU-only part with double the core count, they could do so on the same motherboards as their other consumer products.
It’s really more of a side effect than anything else, I’m guessing.
And it would let them offer a value proposition to small companies, which could take an off-the-shelf consumer motherboard and still get a high core count.
I think it’s kind of a neat blow against Intel and Nvidia at the same time, and for that reason a potentially cost-effective way of fighting back with a relatively low research investment.
And on Intel’s side, it would take a little while to develop a new socket capable of hosting an AMD-class graphics solution, something AMD is threatening to overtake on the current sockets and in fact has beaten on occasion.

It might really be an attempt at pushing Nvidia off the market and giving AMD some breathing space to really focus on Intel.
And it might be a curveball for Nvidia: a product Nvidia might not be expecting, one that could make some of Nvidia’s research into low- to midrange GPUs wasted, hurting its bottom line and reducing its ability to out-research AMD in the GPU space.

No, I’m convinced that this is AMD’s answer to Intel and Nvidia.

Oh, and it would help them when talking with OEMs too. ;-)

Of course, Intel will continue to be unrivalled for the snappiest processors in the office space and for lightweight home use, so they might not feel threatened by AMD’s move, since it’s really an attack on Nvidia more than on Intel, differentiating the two from each other.
Nvidia, on the other hand…
How on earth are they supposed to respond to that?
Both AMD and Nvidia earn most of their graphics money on lower-midrange chips, in the roughly $200-and-below space. And if I’m right, AMD intends to seriously eat into that market with an APU instead of a GPU.

Daniel Anderson

Low power consumption, not low performance. The only reason “no one needs” a 16-core chip is that the applications out there aren’t designed to use that many cores. Just like at one point no one needed a quad core. If higher core counts are released to customers, then developers can build applications that take advantage of the extra hardware. E.g.: AI in video games can be tweaked to become much smarter.

Domaldel

That’s *kind* of true.
However, the issue is that there are limits to how parallelized it’s possible to make a task.
And even when you *do* have a task that can technically be parallelized, there are huge issues with doing so.
There’s a whole lot of concurrency problems that regular sequential programming simply doesn’t face; you can avoid some of them in high-level languages like Erlang, but they’re harder to avoid in the low-level languages that programs like games tend to be written in (say, C/C++). Just look up data races for one example.
And when you write code that tries to be concurrent and also safe, you often end up with overhead from checking the state of the data, so in some cases having more threads doesn’t speed up execution at all compared with nearly the same code in a single-threaded implementation that doesn’t have to pay that overhead.
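A minimal sketch of both halves of that point: unsynchronized `++counter` from several threads is a data race (undefined behavior in C++, typically losing updates), and the conventional fix, a mutex, serializes the hot path so every increment pays lock/unlock overhead that the single-threaded version never does. This is illustrative code, not from any shipping program.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Increment a shared counter from n_threads threads. The mutex makes the
// result deterministic, but it also serializes every increment -- the
// overhead that can make "multithreaded" code no faster than serial code.
long locked_count(int n_threads, int increments_per_thread) {
    long counter = 0;
    std::mutex m;
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(m);  // remove this and it's a data race
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

With the lock, the result is always `n_threads * increments_per_thread`; delete the `lock_guard` line and the program becomes undefined behavior, usually returning a smaller number under contention.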

That said, I do have high hopes for things like the Rust programming language, which might help make concurrency more common in low-level code as well.
And there’s a whole host of high-level languages well suited to concurrent programming: Erlang being the king, of course, but also newer languages like Go (even though a lot of the people I’ve read about using it don’t seem all that impressed by it…)

Oh, and another thing.
I suspect that by the time we get any significant amount of everyday-user code that would actually benefit from 16 threads (let alone 16 cores and 32 threads), the single-threaded performance of these processors will already be so obsolete that we’d be getting new processors for that reason alone anyway.
4 cores and 8 threads are currently more than the great majority of regular users need; we might reach 8 cores/threads being useful soon, but 16 threads…
That’s some years into the future.
I’d still like that number of threads, though, simply because I have some *really* bad habits (like opening 400-500 browser tabs at once across several browser windows while also running games, virtual machines, and other things…)
But even I probably won’t find a use for a full 32 threads for the next five or six years at least, I think.
At least not for anything other than, say, BOINC…
So I’d rather go with the 8-core APUs.
But at least I’d have the option of upgrading to a 16-core with a dGPU whenever such a processor actually started to make sense, if I really wanted to. And who knows, perhaps they’ll keep using the same socket for a while, and one of the last processors to use it might be a worthwhile 16-core part, at a time when they’re starting to become useful?
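The diminishing returns being argued over here are usually framed as Amdahl’s law: if a fraction p of a program parallelizes and the rest is serial, speedup on n cores is 1 / ((1 − p) + p/n). A quick worked example (the 90%-parallel figure is an arbitrary illustration, not a measurement of any real workload):

```cpp
#include <cassert>

// Amdahl's law: speedup caps at 1/(1-p) no matter how many cores you add.
constexpr double amdahl_speedup(double parallel_fraction, int cores) {
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores);
}

// For a 90%-parallel program: 2 cores are nearly 2x, but 16 cores give
// well under 7x, and even 1024 cores can never reach the 10x ceiling --
// which is why doubling 8 cores to 16 buys so little for typical code.
static_assert(amdahl_speedup(0.90, 2) > 1.8, "near-linear at low core counts");
static_assert(amdahl_speedup(0.90, 16) < 6.5, "16 cores: less than 6.5x");
static_assert(amdahl_speedup(0.90, 1024) < 10.0, "hard ceiling of 1/(1-p) = 10x");
```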

Daniel Anderson

With all the technicals aside, since computing inevitably adapts once the hardware is available, I find it fiscally responsible to get what will last the longest as opposed to getting the bare minimum. So even though 16 physical cores is overkill currently, I’d rather purchase that outright as a longer-term solution than go with a 4-8 core system and have to upgrade sooner. My current system will be six years old when Zen/Arctic Islands is released, and I’m looking forward to getting all the flashy goodness I can hold onto till the next big release happens.

Domaldel

Well, personally I think it’s more likely that an 8-core APU with HSA will have a longer shelf life than a 16-core CPU from AMD.
After all, the APU will be able to run compute-accelerated programs faster than a non-GPU-accelerated CPU with more cores and higher single-threaded speed, so its performance has nowhere to go but up.
The CPU, however, has only its single- and multithreaded performance to rely on, and its single-threaded performance will be outdated well before that amount of multithreading becomes a thing, I think.
So 16 cores is, in my view, simply a waste: by the time 16 or more threads are actually used in mainstream programs, you’d be looking to upgrade anyway, and I don’t think the 8-core will need replacing any earlier than the 16-core, because single-core performance will become an issue well before multi-core performance does. (Overall potential throughput isn’t as important as single-threaded performance for the average user’s experience, even if extra cores and throughput can be neat.)

We’re already seeing that with the FX processors.
They’re so far behind now that even though you get twice the number of threads you get on an i5, a ton of people still consider them more on par with an i3 or lower, due to the single-core performance…
They simply don’t feel the FX chips offer a good enough experience to be viable.

I still decided to pick up an 8350, due to my particular use case and being an AMD fan, but just throwing in more cores isn’t a real solution.
At least AMD’s APUs have some real innovation where they’re actually ahead of Intel, so I for one believe they’re the more future-proof of the two alternatives.

Of course, I could be wrong, I guess…
Anyway, that’s just me…
If you really do think the 16-core, 32-thread processor will be of use to you, then go for it. =)

Daniel Anderson

The thing with Bulldozer is that it’ll see a new breath of life once DX12 is released. That doesn’t mean much for production work, but then again, if production were a worry, I wouldn’t have grabbed Bulldozer anyhow. A 16-core proccy wouldn’t hurt in the least, at a reasonable price of course. I wouldn’t consider it a waste, because I wouldn’t upgrade until it started slowing down, and if I did, it would be in order to repurpose it for whatever I might need (Mumble servers, game servers, tinkering applications, fiddling with whatever new software of the month I find, a guest PC, whatever it might be).

Domaldel

Well, yes, DX12 helps, but they’re still dwarfed in performance by the newer Intel chips at the moment, even in DX12 games.
Of course, settings and such matter a ton.
The chip behaves very differently from one BIOS setting to another, practically like a different chip sometimes.
That said, its total throughput is roughly on par with a modern i5 in a best-case scenario, but with worse single-threaded performance (despite actually having had the potential to match a quad-core i7 in some use cases back when it was new).
While I of course understand why AMD chose to do what they did, I still kind of wish I could have had a high-end AMD chip with all the single-thread performance improvements Kaveri and Carrizo were given, but in an 8-core like the 8350.

As for those uses: well, yes, you might find a use for a few extra cores that way, but 16 cores with 32 threads still sounds like overkill for you.

Btw, processors never slow down with age per se; it’s just the software that keeps getting more and more demanding.
And usually that increase in resource demand is single-threaded, even these days…

If you choose to go with things like game servers or Mumble servers and guest PCs, you might want to look into what code the software in question is running, because it had better be pretty well multithreaded to really make use of such a processor.
Still, at least you can run all those things on such a processor without bogging down other cores. =)

Daniel Anderson

I think it goes without saying that Intel has the performance crown; AMD hasn’t pulled it off since the Athlon 64 X2. Phenom II was decent but still nowhere near taking the crown.

Also, I can simultaneously run several of each server with such beefy specs without issue. So regardless of it being “overkill” (which, at a consumer price point, let’s be honest, it’s nothing of the sort), I can put it to use properly. In the proper config I can have those services running and still use it as a guest machine. Worrying about idle clock cycles on a non-server-oriented chip isn’t that big a deal.

Domaldel

Fair enough; perhaps you’ll be one of the few who can make use of all of those cores.
Most people won’t be.
But fair enough. =)
Myself, I’ll be looking at the benchmarks for it and the 8-core APU before I make up my mind.
But I’m leaning heavily towards the APU for my own use case.

Daniel Anderson

Most definitely. Much like you stated previously, HSA will be a pretty important performance earner. I’m thinking that by the time I upgrade from Zen (or the Intel equivalent), HSA will have much more adoption. Seeing what the laptop APUs can do with HSA, it’s definitely going to be an exciting period.

Domaldel

Well…
I’m actually thinking HSA might be most interesting in midrange desktop computers and in supercomputers and HPC.
But that’s just me…

Joel Hruska

“Zen is moving back to SMT architecture which was last used in their Phenom II series.”

Phenom II never used SMT. SMT (simultaneous multithreading) refers to running multiple threads on a single core. The acronym you are looking for is symmetric multiprocessing (SMP).

Zen has multiple cores (SMP) and can execute multiple threads on each of those cores (SMT). It is the first AMD CPU to do both in a single chip.
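The SMP/SMT distinction reduces to one multiplication: logical processors = physical cores (SMP) times hardware threads per core (SMT). A sketch with the chips under discussion (the Zen figures are the rumored 8-core/2-way-SMT configuration, not confirmed specs):

```cpp
#include <cassert>

// Logical processors = SMP core count x SMT threads per core.
constexpr unsigned logical_cpus(unsigned smp_cores, unsigned smt_per_core) {
    return smp_cores * smt_per_core;
}

// Phenom II X4: 4 cores, no SMT -> 4 hardware threads.
static_assert(logical_cpus(4, 1) == 4, "SMP only");

// Rumored desktop Zen: 8 cores, 2-way SMT -> 16 threads,
// matching the "2 x core count" figure in the thread above.
static_assert(logical_cpus(8, 2) == 16, "SMP + SMT");
```

At runtime, `std::thread::hardware_concurrency()` reports this logical count, not the physical core count.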

“dropping some of AMD’s own bulldozer instruction set.”

True, but very little code used these instructions and the performance benefits of doing so were very small.

“The biggest piece isn’t the fact that they’ll be throwing more cores, but they’ll be stacking the cores on top of the solution”

I’m not sure what you mean by “stacking cores on top of.” AMD has not announced plans to build 3D microprocessors and 3D chips are extremely difficult to design due to heat dissipation problems. You can’t stack multiple cores vertically, and you can’t stack memory on top of 95W TDP processors without melting it.

“IPC is said to be 40% better than excavator to boot, which isn’t tough to do seeing they are drastically different.”

Hitting 40% better IPC than Excavator is easy, though there is some concern over how they measured it. While I haven’t had the opportunity to measure Excavator’s IPC, previous chips had IPC so terrible it’s hard to imagine AMD couldn’t improve on it.

Daniel Anderson

Good call on the first two points. By “stacking the cores on top of a solution” I didn’t mean a 3D interposer; it was more metaphorical. They’ll have plenty of cores, but each core should be far more efficient compared to the Bulldozer approach. I didn’t want to throw too much in there, just general info.

Yeah, Bulldozer’s arch in general just wasn’t able to pull through. They cut R&D too much, way too early. Hear some of the stories about the decisions that got AMD into its current position (hurting) and you’ll cringe. I just hope they pull through this time around; otherwise I think it’ll be the last time I give them my money.

Joel Hruska

“Yeah bulldozers arch in general just wasn’t able to pull through. They cut too much on R&D way too early.”

I’ve spent a lot of time analyzing BD, and I still don’t know what the hell went wrong with that chip. I think part of the problem was that the CPU was both late and underperformed. If, for example, you go to Anandtech and use its “Bench” tool to compare the FX-8350 against the old Core i7-940, you’ll see that the FX-8350 spanks Nehalem in a number of tests. It’s even reasonably comparable against the Core i7-970, which launched in Q2 2010: http://anandtech.com/bench/product/697?vs=157

If Piledriver had launched at the end of 2010 instead of 2012 (replacing BD), it would’ve been in a better overall position. If AMD had even managed to hold IPC even with Thuban as opposed to losing a huge chunk of it, PD would’ve been a match for Sandy Bridge.

You can see echoes of what they tried to do. But in three years of work, they never managed to patch the problems. Kaveri did a great job on the 20% multi-threaded penalty you took for running two threads per module, cutting it back to 10%, but even doubling up on dispatch only improved single-threaded IPC by about 7%. L1 cache contention got better, but never really got *fixed.*

It’s still not clear exactly what went wrong. I suspect there were multiple interlocking problems but cannot prove it.

FSRed94

Joel,

Long time reader and I want to thank you for the work that you do. You sure do know your stuff.

How much of the BD problem do you believe was cache related? I think the amount of work that needed to be done with the exclusive cache may have increased latency. The fact that Zen uses smaller private inclusive caches with a large LLC, a departure from what AMD had been doing forever with its L1/L2, points to this. Coherency must have been a nightmare.

Also, I think BD could have benefited from some form of micro-op cache (or whatever AMD calls its decoded x86 instructions). I believe I read that something like that was added with Kaveri, or maybe Excavator, but perhaps it was too little too late.

Anyway, BD is still a bit of a mystery to me. I’m just wondering what you think hurt it most. Cache and decoding are what I’ve thought for some time.

Joel Hruska

“How much of the BD problem do you believe was cache related?”

Anandtech did a major profile of BD and concluded that cache latencies were not the overwhelming problem that some people thought they were… in server workloads. I don’t think anyone ever did a full code-level analysis of consumer software.

What I know boils down to this: Bulldozer had terrible L1 cache contention, in which the two threads sharing a module would evict each other’s data from the L1 cache. I think the fact that AMD expanded both the L1 instruction and data caches over the years indicates that this core had some significant problems in that area. Typically, when Intel or AMD builds an L1, they stick with that structure for a long time.
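The kind of contention Joel describes is easiest to see in miniature as false sharing: two threads writing data that lands in the same cache line keep invalidating each other’s copy of that line. A toy sketch (the 64-byte line size is an assumption that holds on the x86 parts discussed here; this models the general phenomenon, not Bulldozer’s specific L1 design):

```cpp
#include <atomic>
#include <thread>

// With alignas(64), each counter gets its own cache line and the two
// threads never contend. Drop the alignas and both counters share one
// line, which then ping-pongs between cores on every write. Either way
// the *results* are identical -- only throughput differs -- which is why
// cache contention like Bulldozer's never shows up in correctness tests.
struct Padded {
    alignas(64) std::atomic<long> a{0};  // own cache line
    alignas(64) std::atomic<long> b{0};  // own cache line
};

long hammer(int iterations) {
    Padded counters;
    std::thread t1([&] { for (int i = 0; i < iterations; ++i) counters.a.fetch_add(1); });
    std::thread t2([&] { for (int i = 0; i < iterations; ++i) counters.b.fetch_add(1); });
    t1.join();
    t2.join();
    return counters.a.load() + counters.b.load();
}
```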

I once ran Quake 3 Arena on an FX-9590 against the Core i7-4960X. I set it to like, 320×200 or 512×384 with bare minimum details, because that’s the resolution we used to test in to measure things like cache performance in the old P4 / K7 days. IIRC, the FX-9590 @ 5GHz hit something like 767 FPS. The Core i7-4960X at 3.9GHz was like, 949 FPS. It’s a silly little test, but it points to a profound performance difference between the two.

PD’s L2 latency was more like Intel’s L3 latency. Its L3 latency was more like Intel’s main-memory latency. I do not know if these issues accounted for 50% of its performance problems or 10%, but I’m certain they whacked it.

I believe Kaveri had a very small cache for 1-4 instructions. I do not know about Excavator.

The decode thing is weird. Very weird. Kaveri could issue instructions to all four cores every clock cycle, and microbenchmarks I performed with Agner Fog confirmed it did so. Was the problem in the instruction fetch? Pipeline stalls? The lack of a significant uop cache?

I’m hoping once Zen is out, AMD will quietly tell us.

FSRed94

Just by looking at the design, you would think the extra decode hardware would have helped more. Indeed, it seemed to help only slightly, and that gain was for the most part offset by reduced clock speed.

Back to cache: that brings up another interesting bit. Despite lacking the L3 cache, the APUs have done very well. I’m sure there are some examples where that L3 helps a lot, but I wonder whether it was worth the extra die size in the end. Of course, if we had never seen BD-derived chips with L3, we would be speculating that its absence was the reason for the lackluster performance.
I have a feeling you may be right, and we will have a better idea once Zen is out. We do seem to get some quality information during such transitions about what went right and what went wrong. I am eagerly awaiting Zen. On paper it looks good. Hopefully it gets built on a good process and power is kept under control. If it’s a dud, the x86 space will get even more boring than it already is.

Joel Hruska

It’s true that cutting the L3 off didn’t really seem to hurt the APUs vs. BD. It’s hard to say how much L3 is really necessary these days; most motherboards no longer offer the option to turn it off.

Another reason why L3 might not have helped BD much is that it was clocked so much lower than the rest of the chip. Only Intel’s first-gen Nehalem cores used a downclocked uncore; SNB and subsequent designs all ran the uncore at an equal clock.

The FX-9590, in contrast, pairs a 5GHz CPU with a 2.2GHz Uncore.

Another oddity I forgot to mention: because BD’s cache is write-through rather than write-back, L1 performance is constrained by L2 performance, and L2 performance was terrible. Some interesting further info here:
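As an aside, the write-through constraint Joel describes can be captured in a toy model (this is an illustration of the two policies in general, not a model of Bulldozer’s actual cache hierarchy): with write-through, every store is forwarded to L2 immediately, so a slow L2 caps L1 write throughput; with write-back, stores just dirty the L1 line and L2 only sees the data on eviction.

```cpp
#include <cassert>

// Count how many writes reach L2 under each policy for the same store stream.
struct WritePolicyModel {
    long l2_writes = 0;
    bool line_dirty = false;

    void store_write_through() { ++l2_writes; }        // every store hits L2
    void store_write_back()    { line_dirty = true; }  // store stays in L1
    void evict() {                                     // write-back pays only on eviction
        if (line_dirty) { ++l2_writes; line_dirty = false; }
    }
};
```

For 1,000 stores to one line, the write-through model performs 1,000 L2 writes, while the write-back model performs one (on eviction), which is why a slow L2 hurts a write-through design so much more.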

FSRed94

I never understood the decision to go with a write-through design. I just know that the engineers who designed it must have had a good reason for it.

Good point about the L3 running noticeably slower. That must limit its usefulness, along with its already abysmal latency. Thanks for the link; I’ll have to take a look at it.

Daniel Anderson

Bulldozer was meant primarily for multithreaded processing, so in fully multithreaded tests it definitely shines. However, AMD’s management decided to cut costs by going the modular route and did not allow their engineers to go fully custom on the components that mattered most for better performance and lower latency; I’ll defer that part to those who know a hell of a lot more than I do. Basically, this is where they backpedal with Zen. From my understanding, they have basically given Keller and team full control over how Zen operates, customizing the chip as they see fit, and from the sounds… or rumors… it will definitely make Intel at least raise an eyebrow, if all goes in AMD’s favor.

CMT wasn’t a bad design; it was just arrogance to assume the industry would change the way its code works to use AMD’s proprietary instruction sets. AMD didn’t have the market power to make such a move; if Intel did such a thing (and I believe they have a coprocessor now), they would see adoption, considering they have much more of the market. CMT is more specific in its ability to process multiple threads (inefficiently, compared to a proper setup, which would be much more expensive); SMT and SMP are more general, from what I gather, basically allowing good performance for both single- and multithreaded tasks.

Joel Hruska

Again, I wouldn’t focus on the AMD instruction sets. I had applications provided *by* AMD that used them. Performance uplift was tiny. The most important instructions, like FMA, were eventually added by Intel.

I’m not saying that people couldn’t tune for BD and get better performance, but this wasn’t like SSE2, where the P4 was terrible without it and awesome when using it. Bulldozer was awful without them and very slightly less awful when using them.

Sirò

AMD never said anything about stacking cores; that’s only theoretical right now and has a ton of problems that make it not worth it. They do want to use HBM2 on APUs, which is a stacking tech and should make for a great high-to-mid-range laptop part.

Daniel Anderson

Lol, I replied to Joel with the same confusion. I was just trying to throw out condensed info. I didn’t mean physical stacking; the solution, of course, is AMD actually using an efficient design. I meant there will be plenty of cores, but unlike previous generations, they’ll be more efficient cores.

They are going to pair HBM with Zen in the server environment, but that’s the 115-130W arena, I believe (my own uneducated estimate). If I’ve been following correctly, desktop APUs will get Excavator next year and then Zen or a newer arch in 2017.

Joel Hruska

Yes, just noticed this. A word got omitted.

Daniel Anderson

” Bringing all of these elements under one roof, along with developer relations and driver development,”

A word got omitted there.

Joel Hruska

Thanks for pointing it out. I have rewritten the sentence to make it absolutely clear that K12 and Zen are two different things.

pelov lov

Sounds to me as if Raja Koduri is preparing to lead an independent graphics company in the near future.

Joel Hruska

Inside sources at AMD say that they have no plans to do this, and I believe them. AMD has tied its GPU tech to its CPU tech at every level. The complexity of the cross-licensing and revenue sharing would be astronomical, and the bulk of AMD’s earnings comes from sources that aren’t dGPU.

God knows I can’t say “never,” but I think spinning Radeon off would be hilariously unhelpful.

pelov lov

Whether it’s helpful or not is a different matter. Likewise, so are their HSA/APU endeavors.

From a financial perspective, they’ve got to do something drastic, and preferably as soon as possible. The AnandTech article notes that Koduri will have the flexibility and freedom that you only see in independent companies. He’s in charge of product development, direction, even down to the marketing/PR. That doesn’t indicate a unified AMD to me. In fact, the tone of AnandTech’s article seems to point toward HSA/Fusion being a failure (they stop short of saying so, but I don’t think outright failure is hyperbole when you look at how their products have fared).

Given the timing of this announcement and how it coincides with rumors of a Silver Lake investment, I think it’s justified to question whether AMD’s publicly stated plans matter at all. It’s a company with lots of plans and a reputation for never actually abiding by them. They wouldn’t be in this position had they had just a single good plan in the first place. They have a new one now, though. Be sure to add it to the list of all the other old ones that clearly weren’t the right plans.

The timing isn’t confidence-inspiring either. Koduri is definitely the right guy to spearhead something like this, but placing the right guy at the helm right after a really poor product launch, plummeting market share, and with FinFETs just around the corner? Are they not happy with where they are right now? Or is this a case of poor current performance along with their next generation being underwhelming as well?

The answer to the first question is undoubtedly yes, but the second is where things can get incredibly ugly. Moves like these take years to develop into something beneficial, and if this one revolves around GPU microarchitecture (next year’s FinFET cycle), then I fear it’s too little, too late.

Joel Hruska

“In fact, the tone of AnandTech’s article seems to point towards HSA/Fusion being a failure (they stop short of failure, but I don’t think outright failure is hyperbole when you look at how their products have fared).”

I think we should separate whether the HSA / Fusion *concept* failed from whether or not AMD’s implementation of that concept failed. It may be academic from a financial perspective but not, I think, from a planning and implementation angle — and planning and implementation are key parts of what Koduri is charged with being able to execute.

There’s broad agreement across the entire industry that heterogeneous compute and many-core architectures are the future of computing. Qualcomm implemented it with Snapdragon 820. ARM talks about it. Intel has its own projects. NV uses certain types of offload for image processing.

Furthermore, we know that the PS4 and Xbox One have some level of HSA capability (Mark Cerny has referred to HSA functionality explicitly in some of his discussions, though he didn’t call it by that name). These capabilities are part of why AMD won the contracts for those consoles in the first place, which puts them squarely in the “vitally important” camp.

Has HSA helped AMD in the core PC space? No. But given that the technology was already being developed for other projects, deploying it in Kaveri was probably a minimal expense. HSA doesn’t eat much power, and it doesn’t chew up die size.

I think AMD was right to champion heterogeneous compute. I think it was right to integrate the capability, since it was rolling out in other places along the product stack. And while it may not have boosted AMD’s sales revenue directly, the truth is, AMD has a long history of technical innovation and standard adoption. HyperTransport (point-to-point bus topology), x86-64, NUMA, GDDR5, HBM, HSA: AMD has often had a finger on the sorts of approaches the computer industry would adopt in the future.

I wish that history *also* made them money. But HSA-like computing is where the industry is moving.

pelov lov

Heterogeneous compute will take off with or without AMD; they’re a non-factor in its success or failure. By the way, I believe it’ll undoubtedly succeed. The entire industry is or will be moving in that direction, but whether AMD will survive until then is a separate matter.

It’s a bad case of the Bulldozers yet again. They’ve got this product that, under very particular circumstances and with absolutely perfect software, is awesome. Now we just have to wait a decade to actually get there, and by the time we do, the hardware is laughably outdated.

What they required before any of these efforts even got off the ground were increasing revenue, tighter developer relations, a strong software foundation with a focus on ease of development, and a market share high enough not to be ignored. All of these factors got worse, and HSA never got off the ground.

I hope their new plans take into account the current state of the company and aren’t based on some delusional perception of where things stand in an alternate universe.

Btw, DigiTimes is reporting that GloFo is running into problems getting FinFET yields up to snuff. Apparently Q4 2016 at the earliest. Might as well stick a fork in it, if that’s true.

Joel Hruska

“Btw, DigiTimes is reporting that GloFo is running into problems getting FinFET yields up to snuff. Apparently Q4 2016 at the earliest. Might as well stick a fork in it, if that’s true.”

That… would be unfortunate. It’s not hard to believe that GloFo would suffer that kind of delay. Then again, “Truth” and “DigiTimes” are only passing acquaintances.

pelov lov

Very true.

Then again, a delay on a modified low-power SoC process, licensed from a third party and used to produce high-power, high-frequency server silicon, is very much in the realm of “duh.”

If true, it certainly makes one view their recent announcements in quite a different manner.

Joel Hruska

I’ve put out some feelers on this. I’ve wondered before whether AMD would move its x86 business to TSMC to use their tech, since GF killed its own 14nm process and agreed to use Samsung’s. I didn’t think it particularly likely, but foundry agreements between AMD and GF have proven malleable. The original terms of the agreement between the two never materialized, and AMD was forced to dump its 20nm SoC designs for consoles.

I’m not saying AMD would or could dump GF, but we know TSMC can fab low-power x86. It wouldn’t shock me if AMD investigated such options.

pelov lov

AMD has already restructured its wafer commitment with GloFo for full-year 2015 after its PC&G group revenue and sales plummeted. IIRC, the floor of the commitment is $1bn a year, and it looks like they’re going to come in under that this year (they were only a tad over it last year).

They’re perfect for each other, imo. GloFo can’t produce a competitive process within a reasonable time frame while AMD can’t make products people want to buy.

If AMD is forced to look elsewhere, we’re talking about hundreds of millions more in charges and upwards of a year for porting and production, probably even longer. Currently, AMD has neither the money nor the time to make that transition, even if TSMC were offering the process.

At this point it’s GloFo or bust for AMD. Even if we were to assume they’d incur no charges for hopping foundries, the timing issues alone would make such a move completely pointless. Does it really matter if Zen is on TSMC in 2017 or GloFo in 2017? I really don’t think so.

Again, this is all a hypothetical of a hypothetical. We can discuss the advantages and disadvantages of each foundry and process, but that discussion can only take place if one assumes AMD can survive any sort of delay whatsoever. They truly can’t, so the rest really makes no difference.

“AMD denies rumor that it’s mulling breakup or spinoff
By Joel Hruska on June 20, 2015
AMD’s own spokesperson Sarah Youngbauer told ExtremeTech the following: “While we normally would not comment on such a matter, we can confirm that we have no such project in the works at this time. We remain committed to the long-term strategy we laid out for the company in May at our Financial Analyst Day.”
