While some argue about the semantics, I hope others are optimizing software for the architecture.

Now that's a problem.

You see, the point is that only big-ass problems need the entire CPU. For everything else you're wasting cycles on thread start, locking, syncing and teardown. That's cool and all when you make GUI apps (multithreaded programming is a bliss there, and in general), but you're wasting resources for the end user, not for yourself.
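To put a rough number on that overhead, here's a minimal sketch (my own toy example, timings are machine-dependent): doing a tiny task inline versus spawning a thread per task.

```python
# A minimal sketch (toy example; timings vary by machine): spawning a
# thread per tiny task costs far more than doing the work inline.
import threading
import time

def tiny_task(out):
    out.append(sum(range(100)))  # a trivial amount of actual work

N = 200

# Inline: just do the work.
start = time.perf_counter()
results = []
for _ in range(N):
    tiny_task(results)
inline_time = time.perf_counter() - start

# Threaded: pay thread start/join overhead for every task.
start = time.perf_counter()
results2 = []
threads = [threading.Thread(target=tiny_task, args=(results2,)) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_time = time.perf_counter() - start

print(inline_time < threaded_time)  # usually True: the overhead dominates
```

The point isn't the exact ratio, just that for work this small the thread bookkeeping is the whole cost.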

I see my computer as a computer, not a kitchen appliance. Therefore I want enormous power. Not because I'm a lazy programmer, but because I find resource-restriction problems fascinating. Floating point is pretty much a requirement for visual and audio work.

This CPU just made floating-point operations at least four times more expensive than integer operations. Vincent does not like.

Floating point is a freaking part of the CPU. Given that there are two of those, it's dual-core. Unless you'd call it four cores, of two different kinds at that.

The term dual-core was invented for what is essentially two CPUs fused together on one die. This case is no different. Unless you'd call it a six-core with four integer and two float cores. But they are not entirely separate, so I'll simply call this dual-core.

In fact an FX-4000 is an 8-core CPU... but only 4 cores can be used directly by the thread scheduler if the workload is integer and not SIMD and not floating point.
The Core i3 is an 8-core CPU too, but only 2 cores can be used directly by the thread scheduler.
An Intel 2600 is a 16-core CPU emulating a 4-core.
The Bulldozer FX-8000 is also a 16-core CPU.

But people count what the Windows system monitor shows... and in the system monitor there are only 4 cores for the FX-4000 or 8 cores for the FX-8000.
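For what it's worth, the number the system monitor shows is the count of logical processors the OS scheduler can dispatch threads to, not a count of ALUs or FPUs. You can query it directly; a one-line sketch in Python:

```python
# The OS reports logical processors (scheduler-visible hardware threads),
# not execution units -- this is the same number Task Manager displays.
import os
print(os.cpu_count())
```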

Faster single-threading speed through out-of-order FAKE single-core emulation is just bullshit bingo for the poor mindless people.

Intel, for example, loses up to 60% of theoretical speed only because they prefer to fake single-core speed.
If you compare an Intel 2600 to a REAL in-order 16-core CPU, the Intel 2600 LOSES in speed per watt.
That's why the smartphone companies prefer ARM CPUs: ARM cores are REAL cores, meaning no "fake", and a REAL in-order CPU architecture, not out-of-order.

You can only burn power like that on desktop PCs and servers... on notebooks and smartphones it's just stupid.

Q, I believe you're confusing hyper-threading (which seriously must die at once, unless maybe you're running a server) with instruction-ordering problems and x86-to-RISC translation, and got it all messed up and backwards.

I do not confuse anything. Modern out-of-order CPUs do have tiny cores inside a big core, and they emulate a single core.

Maybe the Bulldozer should have been built as 2 CPUs + 1 FPU + 1 cl_gpu.
By cl_gpu I mean only the elements of a GPU that run OpenCL, absolutely not a full GPU in the CPU.
Maybe then this CPU would have been better. By the way, the gap is not that big between the number 1 in the results and this one, which is the lowest-MHz model.

The term dual-core was invented because there were actually 2 cores on one die. In BD's case a module does not have 2 cores. It has 2 integer processors and one independent floating-point processor. In BD the FP processor was decoupled from the pipeline to allow it to be shared. The BD architecture is NOT the same as older generations; it cannot be compared as such. ... If you wanted to continue with your method of reasoning, it still wouldn't be a dual-core, it would be an asymmetric tri-core... But then what is the frontend? What's the cache hierarchy? What do you call them? Are they also cores?

If we go by what a core was in older generations, then a module -is- a core.
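The counting ambiguity can be made concrete with a toy model (my own construction, not AMD's terminology): a Bulldozer module as 2 integer clusters sharing one FPU and one frontend, counted under three different rules.

```python
# Toy model (my own construction, not AMD terminology) of why "core count"
# is ambiguous for Bulldozer: a module has 2 integer clusters that share
# one FP unit and one frontend.
from dataclasses import dataclass

@dataclass
class Module:
    integer_clusters: int = 2   # independent integer pipelines
    shared_fpus: int = 1        # one FP unit shared by both clusters
    shared_frontends: int = 1   # fetch/decode shared across the module

def count_cores(modules, rule):
    """Different counting rules give different 'core counts'."""
    if rule == "integer":          # what the OS scheduler sees
        return sum(m.integer_clusters for m in modules)
    if rule == "module":           # one module ~ one old-style core
        return len(modules)
    if rule == "execution_units":  # the 'asymmetric tri-core' view
        return sum(m.integer_clusters + m.shared_fpus for m in modules)
    raise ValueError(rule)

fx8000 = [Module() for _ in range(4)]              # 4 modules
print(count_cores(fx8000, "integer"))              # 8 - what the system monitor shows
print(count_cores(fx8000, "module"))               # 4
print(count_cores(fx8000, "execution_units"))      # 12
```

Same silicon, three defensible answers, which is exactly why the thread keeps going in circles.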

That would be stage3 Fusion. AMD is currently working on it. Stage2 Fusion is what AMD has on the market now with the APUs based on Stars cores, and also the APUs that will be released based on BD cores. Stage1 Fusion was never released as a commercial product.

EDIT: Take a look at how the BD architecture decoupled the FP pipeline from the integer pipelines and unified the frontend across them... Integrating a GPU's processing elements into a module is possible, and will likely be done as a replacement for the existing FP pipeline. This means existing instruction sets will still be supported, but to fully exploit the instruction parallelism a new instruction set will need to be written for general-purpose software.
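As a toy illustration (pure Python, not any real ISA) of the kind of data parallelism such an instruction set would expose: one "instruction" that operates on every lane of a vector at once, instead of one scalar at a time.

```python
# Toy illustration of the SIMD-style programming model: a single
# conceptual 'instruction' applied across all lanes of a vector.
def simd_add(a, b):
    # in real hardware all lanes would be added in parallel
    return [x + y for x, y in zip(a, b)]

print(simd_add([1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
```

General-purpose code only benefits from this if the compiler or the programmer expresses the work in vector form, which is why a new instruction set alone isn't enough.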

EDIT2: Worth noting, stage3 fusion is still probably 4-5 more years away.

For servers, maybe: 4-5 years is when AMD announced they'll get APUs. But for desktop purposes there's Trinity, which will be using Piledriver + VLIW4, and then, guessing from the architecture announcement a while back, the 2013 APU will be using GCN and should be the stage3 fusion, if I'm not misunderstanding what you're getting at.

Trinity's APU will be a separate core on the same die. That is stage2 fusion. Stage3 fusion is when they integrate the APU's processing elements directly into the CPU module and develop a native instruction set to utilize them in general purpose code. In the same way that the current FP pipeline is decoupled from the integer pipeline but still shares the frontend, the APU elements for stage3 fusion will be decoupled and share frontend and will most likely be used as a replacement for the existing FP pipeline.

I said 4-5 years because that is generally about how long it takes to get from design to silicon. It's nothing more than a rough guess though.