All of the above would involve Intel or AMD building them into the chip, which they have no plans to do... So I'm guessing nowhere. This can be made by existing heatsink manufacturers... which means I'd give it a year or two before you start seeing them.

I'm not sure about the interface film, but that may be made into a TIM pad, or you might start seeing graphene thermal compounds. Either way, this is quite a bit different from the other examples you listed.

Thanks for the update; I'd not heard about that. I've been tempted to go to the local tech shop, try to build a rather large one, and give it a run on a small AC unit to see how well it works. If their numbers work out it should prove to be a very good tech, and it should be cheap, but we all know that won't happen.

Industrial diamond is not terribly expensive. It can be had for about 2 grand per kilo. Admittedly this is a good deal more than copper, but this graphene composite may be cheaper to manufacture than diamond.

The biggest obstacle to higher clock speeds has been getting rid of the heat (which is why supercooled processors can be overclocked to 7 GHz). This could potentially lead to adding another GHz to clock speeds of domestic computers, perhaps 2 GHz per node for top-end supercomputers. That's valuable, for although multicores are good, there just aren't that many decent parallel programmers out there. I (and a few others) find parallel programming easy, but the vast majority of coders in the world got into the field as a way to get rich quick and aren't adept at anything beyond Visual Basic or the most trivial aspects of Java.

Badly-coded programs won't run better on multi-way chips, but can be forced to run faster on faster chips, so the only way to compensate for the lack of skill is to crank up the clock, which is only possible if you can avoid the chip cooking itself.

That's a terminal barrier for synchronous chips, but it's not an obstacle at the puny 3GHz speeds we're currently operating with (especially as overclockers have already established the same chips are capable of 7GHz without issues).

By the time we get to 30 GHz, we may well be working with 3D chips. Furthermore, you don't need a standardized clock for asynchronous chips and async CPUs already exist. (There's even a program to help you design them listed on Freshmeat.)

1. Speed of light in a wire is at best 0.7c, typically 0.5c-0.65c depending upon the material. So, cut your propagation distances accordingly.

2. As another poster suggested, that's only a limit for fully synchronous designs. Async (clockless) and semi-synchronous (partially async, with some clocking) designs are limited by switching times and feature density (which is related to both the speed of light and the gate size, but is less rigidly limited than in synchronous designs).

The speed of light in an optical fiber is also about 0.7c. So, you can only improve this if you have optical electronics that use air/gas/partial-vacuum tubes/channels as the transmission medium. Speed of light in air is >0.999c and will be similar for other gases at/below 1 atm.
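For concreteness, here's a back-of-the-envelope C sketch of what those fractions of c mean per clock cycle; the media fractions and frequencies are just illustrative picks from this thread, not anything official:

```c
/* How far a signal travels in one clock cycle at a given fraction of c.
 * Fractions: ~0.5c (wire), ~0.7c (fiber), ~0.999c (air). */
#include <stdio.h>

int main(void) {
    const double c = 2.998e8;                        /* speed of light, m/s */
    const double fractions[] = { 0.5, 0.7, 0.999 };
    const double clocks_ghz[] = { 3.0, 7.0, 30.0 };

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            printf("%.3fc at %4.1f GHz: %6.2f mm per cycle\n",
                   fractions[i], clocks_ghz[j],
                   fractions[i] * c / (clocks_ghz[j] * 1e9) * 1e3);
    return 0;
}
```

At 3 GHz and 0.5c that's about 50 mm per cycle, but at 30 GHz it drops to 5 mm, which is where the synchronous-design ceiling people mention comes from.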

That is perfectly true, but many algorithms can be parallelized (and aren't). Even when algorithms have to be serialized, those algorithms generally form a small part of the overall program. (If you were to draw out a timing diagram for a program after the fashion of critical path analysis, you'd see lots of bits of work that don't need to be done sequentially. There isn't a serial list of dependencies, in the general case, for a complete program from start to end.)
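That intuition is essentially Amdahl's law: the serial fraction caps the total speedup. A quick C sketch (the parallel fractions and core counts here are made-up illustrations, not measurements):

```c
/* Amdahl's law: speedup = 1 / ((1 - p) + p/n), where p is the
 * parallelizable fraction of the work and n is the core count. */
#include <stdio.h>

double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void) {
    const double parallel_fractions[] = { 0.50, 0.90, 0.99 };
    for (int i = 0; i < 3; i++)
        for (int n = 2; n <= 64; n *= 4)
            printf("p=%.2f, %2d cores -> %5.2fx\n",
                   parallel_fractions[i], n,
                   amdahl_speedup(parallel_fractions[i], n));
    return 0;
}
```

Even at p=0.90, 64 cores only buy you about an 8.8x speedup, which is why shrinking the serial part matters more than piling on cores.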

Personally I think that single-threaded asynchronous software gives the highest performance. However, there are cases where you want to do things that don't have async APIs. In these cases you need some way of blocking in a synchronous API while letting the server do other work--and epoll/kqueue isn't always an option. In these cases threading can give slightly higher performance than separate processes.
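As an illustration of that single-threaded async pattern, here's a minimal sketch using Linux epoll. It's a skeleton, not a full server: listen_fd is a hypothetical non-blocking listening socket set up elsewhere, and error handling is omitted:

```c
/* Single-threaded event loop: block in epoll_wait only, and keep all
 * other I/O non-blocking so one slow client can't stall the server. */
#include <sys/epoll.h>

#define MAX_EVENTS 64

void event_loop(int listen_fd) {
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* The only place the loop blocks. */
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                /* accept() the new connection, make it non-blocking,
                 * and register it with epoll_ctl(). */
            } else {
                /* read()/write() whatever is ready, never blocking. */
            }
        }
    }
}
```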

In most cases though I prefer to use separate processes with explicit message-passing. It's easier to reason about and debug.
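As a toy illustration of that separate-process, explicit message-passing style (the pipe setup and the "task 42" message are invented for the example):

```c
/* Parent sends an explicit message to a child worker over a pipe. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fds[2];
    pipe(fds);                     /* fds[0] = read end, fds[1] = write end */
    if (fork() == 0) {             /* child: the "worker" process */
        close(fds[1]);
        char buf[64];
        ssize_t n = read(fds[0], buf, sizeof buf - 1);
        buf[n > 0 ? n : 0] = '\0';
        printf("worker got: %s\n", buf);
        return 0;
    }
    close(fds[0]);                 /* parent: send the message and wait */
    const char *msg = "task 42";
    write(fds[1], msg, strlen(msg));
    close(fds[1]);
    wait(NULL);
    return 0;
}
```

Because the only shared state is the pipe, there's nothing to lock and nothing to race on, which is the whole appeal.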

Ultimately, if you are switching between tasks, you have the CPU and memory overhead of a task-switching mechanism plus the latency overhead. It makes no difference whether the mechanism is in the OS, the program or a tea cosy. If you have such a mechanism and it is already running and you are already paying the price for it, then provided the mechanism is implemented efficiently it will be cheaper to use what you're already paying for than to re-implement it in yet another layer.
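One crude way to put a number on that task-switching price is a pipe ping-pong between two processes, which forces a context switch on every hop. A C sketch (results vary wildly by OS, scheduler, and hardware):

```c
/* Two processes ping-pong one byte over a pair of pipes; the
 * round-trip time divided by 2 approximates one context switch. */
#include <stdio.h>
#include <unistd.h>
#include <time.h>

#define ROUNDS 100000

int main(void) {
    int ab[2], ba[2];
    char byte = 'x';
    pipe(ab); pipe(ba);
    if (fork() == 0) {                    /* child: echo every byte back */
        for (int i = 0; i < ROUNDS; i++) {
            read(ab[0], &byte, 1);
            write(ba[1], &byte, 1);
        }
        return 0;
    }
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++) {    /* parent: drive the ping-pong */
        write(ab[1], &byte, 1);
        read(ba[0], &byte, 1);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("~%.0f ns per switch (round trip / 2)\n", ns / ROUNDS / 2);
    return 0;
}
```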

No. Barring the upper limit, and excluding theoretical quantum devices, the other bar keeping speeds stagnant is the fabs. Fabricating at that level of density without leakage is very hard, and fabs are very expensive to build. On top of that, consumer need for faster clocks has tapered off, so it's not worth the expense of massive retooling.

When they can get the metal well below 1 part per billion in the fabs, and create a process to minimize wafer breakage for wafers being cut so precisely, then we may see a doubling of clock speed 2 more times. Then that will be it.

For doubling, you're correct. But I'm talking a 25-33% increase in clock speeds, not a 100-200% increase. And there's a far worse increase in leakage dropping from 35nm to 22nm than going from 3GHz to 4GHz (the proof of which is that you CAN run a Core2Duo at 7GHz reliably - which would be absolutely impossible if leakage was causing significant errors at that speed).

Fabrication AS IT STANDS is capable of making a 7GHz chip - we know this because the chips they produce can be run at that speed. The problem is getting the heat out, not making the chip.

It's impossible. There is always leakage. Yes, as you scale, the leakage does grow, both empirically and as a signal/noise problem. But there are ways to minimize this. It has been foreseen for some time, and a lot of research goes into ways to mitigate it.

Despite all the improvements made sub-surface - that is, in how the semiconductor itself is altered - to allow scaling and improve efficiency, and despite how much the tools and methods used to make the devices have improved, the industry really hasn't had any radical changes in many years. It has all been planar designs that date back to the '70s. Sure, the materials have improved, and it's not entirely silicon any more... but it's still planar, and subject to some fundamental limits of the planar design and the substrate choice.

That's why Intel is pushing into 3D designs. Do some reading on FinFETs and their benefits, especially with respect to leakage and control. And those can still be silicon-based, without pushing at all into heterogeneous semiconductor systems.

Fabs are very expensive to build. On top of that, consumer need for faster clocks has tapered off, so it's not worth the expense of massive retooling.

Oh [intel.com], really [xbitlabs.com]? Why are the industry giants doing it, then? A smaller die improves speed and potential clock, can improve power efficiency, and means more dies per wafer or more advanced designs. It also brings the ability to do different etches, deposit different films, etc., to improve device characteristics.

Clockspeed isn't everything anyway, or we'd still be using the Pentium 4 chips that were pushing 4 GHz from the manufacturer, and not the 2 GHz-range Core 2/iX chips. Smart design can trump clockspeed. (I use Intel as an example here because they had the more recent significant architecture change which illustrates this point very well.) We could, y'know, go back to making Pentiums... with current manufacturing technology, we might make them, what, 1/8 the size? Could probably clock them at several GHz.

When they can get the metal well below 1 part per billion in the fabs, and create a process to minimize wafer breakage for wafers being cut so precisely, then we may see a doubling of clock speed 2 more times. Then that will be it.

What makes you think metal contaminants and wafer breakage are the limiting factors in clockspeed scaling? And where do you get a "doubling of clock speed 2 more times" from? What are you considering the base clockspeed that you are multiplying? It seems like you're pulling it out of your ass. Think about it: we're doing 3 GHz+ already. Doubling that puts us in the 6-8 GHz range. Doubling again puts us in the 12-16 GHz range. That's what people above are claiming as the fundamental limit for a synchronous chip, set by the propagation of a signal in a metal.

The speed of light in metal is in no way the limiting factor in clockspeed. That would be the case for a single wire in isolation. There are other effects, namely capacitive coupling, in a chip where you are wiring up billions of transistors, and they are much more limiting. And we're talking wires of non-negligible resistance here: if you want to put a bunch of small transistors close together, you need to be able to make really thin metal wires to make the right connections. Assuming metal is the only interconnect, of course, and completely ignoring all the research into optical interconnects...
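To make that point concrete, here's a rough C sketch comparing the distributed RC delay of a thin on-chip wire against pure time-of-flight. The per-mm resistance and capacitance are order-of-magnitude assumptions for a narrow wire, not measured values:

```c
/* Distributed RC delay of a wire grows with the square of its length
 * (~0.38 * r * c * L^2 for r, c per unit length), so it swamps the
 * speed-of-light term long before light speed matters. */
#include <stdio.h>

int main(void) {
    const double len_mm  = 1.0;
    const double r_per_mm = 2000.0;    /* ohms/mm  - assumed ballpark */
    const double c_per_mm = 200e-15;   /* farads/mm - assumed ballpark */

    double rc_delay = 0.38 * r_per_mm * c_per_mm * len_mm * len_mm;
    double flight   = (len_mm * 1e-3) / (0.5 * 2.998e8);   /* at 0.5c */

    printf("RC delay:       %6.1f ps\n", rc_delay * 1e12);
    printf("Time of flight: %6.1f ps\n", flight * 1e12);
    return 0;
}
```

With those assumed numbers, the RC delay of a 1 mm wire is roughly 150 ps while time-of-flight is under 7 ps, which is the parent's point: the wires' resistance and capacitance, not c, set the ceiling.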

Of course you do, which is why you know that not all tasks lend themselves to parallelization, and that even tasks which can be parallelized generally have a point after which the overhead adds more work than time saved. Parallel processing is a great tool to have, but that doesn't make it the right tool for every job.

I think parallel programming is easy, and I don't understand all the trouble people have with it. The biggest issue is debugging, but as long as you have clear and concise entry and exit points in your code, it is just a matter of time to track down the issue. Unit testing + modular code = win.

Probably because their graphene is polycrystalline (if one can call them crystals) and this composite is actually better at conducting heat from one crystal to the next than pure graphene. But I'm not sure, as I didn't read the scientific article.

Anyway, comparing the heat conductivity of this polycrystalline material with monocrystalline graphene is useless.