Posted by timothy on Tuesday August 31, 2010 @10:29PM from the promising-stuff dept.

Lucas123 writes "Researchers at Rice University said today they have been able to create a new non-volatile memory using nanocrystal wires as small as 5 nanometers wide, which can make chips five times more dense than the 27 nanometer NAND flash memory being manufactured today. And the memory is cheap, because it uses silicon rather than the more expensive graphite used in previous iterations of the nanowire technology. The nanowires also allow stacking of layers to create even denser 3-D memory. 'The fact that they can do this in 3D makes it highly scalable. We've got memory that's made out of dirt-cheap material and it works,' a university spokesman said."

We don't just go vertical without solving the heat dissipation problem. We already have a hard time dissipating the heat off the surface area of one layer. Now imagine trying to dissipate the heat off of the layer that is trapped between two more layers also generating the same amount of problematic heat. Then try to figure out how to dissipate the heat off a thousand layers to buy you just 10 more years of Moore's law.

Well, at least you have a theoretical possibility of avoiding that problem in SSDs. Since you are only going to access one part of the memory at a time, the rest could be unpowered. This gives a constant amount of heat to get rid of, regardless of the number of layers.

This is of course not possible for CPUs and other circuits where all parts are supposed to be active.

Along with wear leveling, I'm sure the algorithm would also account for the physical location of data. If thermal sensors report back that all locations are heat-soaked, an algorithm could throttle back reads/writes. But the latter would have to be an extreme rarity for that to be required above and beyond engineering limitations.

Such a thing would not be possible with current computer architectures, even if we had the materials. There is a fundamental theorem in physics/computing that the destruction of information causes an increase in entropy, i.e. generates heat (Landauer's principle). Thus, an information-destroying gate such as AND can never be completely free of inefficiency, simply because it destroys information: if the output of AND is 0, you cannot tell whether the input was 00, 01, or 10, therefore information was destroyed.
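The non-invertibility of AND, and the Landauer bound it implies, can be sketched numerically (a Python illustration; the room-temperature value of 300 K is just a convenient round number):

```python
import math

# Landauer's principle: erasing one bit dissipates at least k_B * T * ln(2) joules.
k_B = 1.380649e-23  # Boltzmann constant, J/K
T = 300.0           # assumed room temperature, K

landauer_limit = k_B * T * math.log(2)
print(f"Minimum energy to erase one bit at {T} K: {landauer_limit:.3e} J")

# An AND gate is non-invertible: three distinct inputs all map to output 0,
# so observing a 0 cannot tell you which input produced it.
preimages = {}
for a in (0, 1):
    for b in (0, 1):
        preimages.setdefault(a & b, []).append((a, b))
print(preimages)  # output 0 has three preimages, output 1 has one
```

The asymmetry in `preimages` is exactly the "destroyed information" the comment refers to; only reversible gates (one preimage per output) escape the Landauer bound in principle.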

The first solution that comes to my mind (to your hypothetical limiting condition, at least ;D) would be to put leads on destructive logic gates to conserve the unused information electronically. Imagine an AND gate with a rarely used "remainder" bit, for example. Designers could glom onto that if they wanted it; if not, lead most of the unutilized results off into a seeding algorithm for /dev/urandom, and the rest (those prone to entropy feedback) into controlling blinky LEDs.

Yes, it's non-trivial. Such a gap would have to be more than a few air molecules wide to allow free flow (avoiding turbulence against the edges). This would make the size of your third dimension grow much faster, negating a lot of the proposed benefit in terms of Moore's law scaling. Also, existing air-flow dissipation strategies just wind up heating the nearby air, and then trying to dispose of that heat, which means we'd still have a growing problem to deal with... so even if we've gotten the heat an inch away from the chip, it still has to go somewhere.

Where does your heat from the ducts GO? You reach the surface of the device, and now you still have 1000 times the heat to dissipate that you had trouble dissipating with fans/heatsinks/liquid cooling already. And that assumes you can do a PERFECT job of reaching the surface of the device with your strategy.

One thing you could run into is heat issues. Remember that high-performance chips tend to give off a lot of heat. Memory isn't as bad, but it still warms up. Start stacking layers on top of each other and it could be a problem.

Who knows? We may be in for a slowing down of transistor count growth rate. That may not mean a slow down in performance, perhaps other materials or processes will allow for speed increases. While lightspeed is a limit, that doesn't mean parts of a CPU couldn't run very fast.

They're all over that. As transistors shrink they give off less heat. New transistor technologies also use less energy per square nanometer, and there are new ones in the pipeline. Not all parts of a CPU, SSD cell, or RAM chip are working at the same time, so intelligent distribution of the loads gives more thermal savings. Then there are new technologies for conducting the heat out of the hotspots, including using artificial diamond as a substrate rather than silicon, or as an intermediate heat-spreading layer.

If they were making the same spec part that would be fine, but as transistors shrink they cram more into the same space, so total heat flux tends to go up. The leakage gets worse too, but that gets offset by lower voltages.

This might be a dumb question, but why not have some sort of capillary-esque network with a high heat-capacity fluid being pumped through it? Maybe even just deionized water if you have a way of keeping the resistivity high enough.

The problem with proposals like this is that there is no way to actually build the channels. It is a great idea in theory, and quite obvious (so there is already a huge amount of research on it), but nobody has actually been able to build it.

L1 CPU caches are shamefully stuck with the laughable 20-year-old 640K meme in rarely noticed ways. Everyone's first thought is about RAM, but remember that CPUs are less change-friendly and would benefit more from something like 5x larger 128K caches at the new density.

Cache is not a case where more is necessary. What you discover is it is something of a logarithmic function in terms of amount of cache vs performance. On that scale, 100% would be the speed you would achieve if all RAM were cache speed, 0% is RAM only speed. With current designs, you get in the 95%+ range. Adding more gains you little.
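The diminishing returns described here fall out of a simple average-memory-access-time model (the 4-cycle and 200-cycle latencies below are hypothetical round numbers, not measurements of any particular chip):

```python
# Average memory access time (AMAT) as a function of cache hit rate.
# Assumed latencies: 4 cycles for a cache hit, 200 cycles for a RAM access.
CACHE_LATENCY = 4
RAM_LATENCY = 200

def amat(hit_rate):
    """Expected cycles per memory access for a given hit rate."""
    return hit_rate * CACHE_LATENCY + (1 - hit_rate) * RAM_LATENCY

for hr in (0.90, 0.95, 0.99, 0.999):
    print(f"hit rate {hr:.1%}: {amat(hr):.1f} cycles")
```

Going from a 95% to a 99% hit rate roughly halves the average latency, but each further gain requires far more cache for far less benefit, which is the logarithmic flattening the comment describes.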

Now not everything works quite the same. Servers often need more cache for ideal performance so you'll find some server chips have more. In systems with a lot of physical CPUs, more cache can be important too so you see more on some of the heavy hitting CPUs like Power and Itanium.

At any rate you discover that the chip makers are reasonably good with the tradeoff in terms of cache and other die uses and this is demonstrable because with normal workloads, CPUs are not memory starved. If the CPU was continually waiting on data it would have to work below peak capacity.

In fact you can see this well with the Core i7s. There are two different kinds, the 800s and the 900s, and they run on different boards with different memory setups. The 900s feature memory that is faster by a good bit. However, for most consumer workloads you see no performance difference at equal clocks. What that means is that the cache is being kept full by the RAM, despite the slower speed, and the CPU isn't waiting. On some pro stuff you do find that the increased memory bandwidth helps, where the 800s get bandwidth starved. More cache could also possibly fix that problem, but perhaps not as well.

Bigger caches are fine, but only if there's a performance improvement. No matter how small transistors get, space on a CPU will always be precious. You can always do something else with them other than memory, if it isn't useful.

It's not as obvious as it sounds. Some things get easier if you're basically still building a 2D chip but with one extra z layer for shorter routing. It quickly gets difficult if you decide you want your 6-core chip to now be a 6-layer one-core-per-layer chip. Three or four issues come to mind.

First is heat. Volume (a cubic function) grows faster than surface area (a square function). It's hard enough as it is to manage the hotspots on a 2D chip with a heatsink and fan on its largest side. With a small number of z layers, you would at the very least need to make sure the hotspots don't stack. For a more powerful chip, you'll have more gates, and therefore more heat. You may need to dedicate large regions of the chip for some kind of heat transfer, but this comes at the price of more complicated routing around it. You may need to redesign the entire structure of motherboards and cases to accommodate heatsinks and fans on both large sides of the CPU. Unfortunately, the shortest path between any two points is going to be through the center, but the hottest spot is also going to be the center, and the place that most needs some kind of chunk of metal to dissipate that heat is going to have to go through the center. In other words, nothing is going to scale as nicely as we like.
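The volume-vs-surface-area problem can be made concrete with a toy model (the per-layer power and face area below are made-up numbers for illustration only):

```python
# Toy model: each stacked layer dissipates P watts, but only the top and
# bottom faces of the stack carry heatsinks. Total heat grows with the
# number of layers (a volume effect) while the heatsink area stays fixed,
# so the flux each face must handle grows linearly with layer count.
P_PER_LAYER = 10.0   # watts per layer (hypothetical)
FACE_AREA = 2.0      # cm^2 per large face (hypothetical)

def flux_per_face(layers):
    """Heat flux (W/cm^2) each of the two large faces must carry."""
    return layers * P_PER_LAYER / (2 * FACE_AREA)

for n in (1, 2, 8, 1000):
    print(f"{n:>4} layers -> {flux_per_face(n):,.1f} W/cm^2 per face")
```

The thousand-layer case mentioned earlier in the thread lands three orders of magnitude above the single-layer flux, which is why the hotspot-stacking constraint bites so quickly.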

Second is delivering power and clock pulses everywhere. This is already a problem in 2D, despite the fact that radius (a linear function) scales slower than area and volume. There's so MUCH hardware on the chip that it's actually easier to have different parts run at different clock speeds and just translate where the parts meet, even though that means we get less speed than we could in an ideal machine. IIRC some of the benefit of the multiple clocking scheme is also to reduce heat generated, too. The more gates you add, the harder it gets to deliver a steady clock to each one, and the whole point of adding layers is so that we can add gates to make more powerful chips. Again, this means nothing will scale as nicely as we like (it already isn't going as nicely as we'd like in 2D). And you need to solve this at the same time as the heat problems.

Third is an insurmountable law of physics: the speed of light in our CPU and RAM wiring will never exceed the speed of light in vacuum. Since we're already slicing every second into 1-4 billion pieces, the amazing high speed of light ends up meaning that signals only travel a single-digit number of centimeters of wire per clock cycle. Adding z layers in order to add more gates means adding more wire, which is more distance, which means losing cycles just waiting for stuff to propagate through the chip. Oh, and with the added complexity of more layers and more gates, there's a higher number of possible paths through the chip, and they're going to be different lengths, and chip designers will need to juggle it all. Again, this means things won't scale nicely. And it's not the sort of problem that you can solve with longer pipelines - that actually adds more gates and more wiring. And trying to stuff more of the system into the same package as the CPU antagonizes the heat and power issues (while reducing our choices in buying stuff and in upgrading. Also, if the GPU and main memory performance *depend* on being inside the CPU package, replacement parts plugged into sockets on the motherboard are going to have inherent insurmountable disadvantages).
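A quick back-of-the-envelope check of the "centimeters per cycle" claim (the 0.5 velocity factor for on-chip wiring is a rough assumption, not a measured value):

```python
# How far can a signal travel in one clock cycle?
C = 2.998e8  # speed of light in vacuum, m/s

def cm_per_cycle(freq_hz, velocity_factor=1.0):
    """Distance (cm) a signal covers per cycle; velocity_factor < 1
    models propagation slower than c in real interconnect (assumed)."""
    return C * velocity_factor / freq_hz * 100  # metres -> cm

for ghz in (1, 3, 4):
    f = ghz * 1e9
    print(f"{ghz} GHz: {cm_per_cycle(f):.1f} cm in vacuum, "
          f"~{cm_per_cycle(f, 0.5):.1f} cm in wire")
```

At 3-4 GHz the wire figure is indeed single-digit centimeters per cycle, so every extra layer of routing distance costs real cycles.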

First is heat. Volume (a cubic function) grows faster than surface area (a square function). It's hard enough as it is to manage the hotspots on a 2D chip with a heatsink and fan on its largest side. With a small number of z layers, you would at the very least need to make sure the hotspots don't stack.

I'm not saying your point is entirely invalid; however, heat isn't necessarily a problem if you can parallelize the computation. Rather the opposite, in fact. If you decrease clock frequency and voltage, you get a non-linear decrease in power for a linear decrease in processing power. This means two slower cores can produce the same total number of FLOPS as one fast core while using less power (meaning less heat to dissipate). As an extreme example of where this can get you, consider the human brain -- a massively parallel 3D processing structure with an estimated processing power of 38*10^15 operations per second, consuming only about 20 W of power.
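The frequency/voltage tradeoff can be sketched with the usual dynamic-power rule of thumb, P ∝ C·V²·f (assuming, as an idealization, that voltage can scale linearly down with frequency):

```python
# Dynamic CMOS power scales roughly as P ∝ C * V^2 * f.
# relative_power() returns power relative to one core at nominal V and f.
def relative_power(freq_scale, volt_scale):
    return (volt_scale ** 2) * freq_scale

one_fast_core = relative_power(1.0, 1.0)
two_slow_cores = 2 * relative_power(0.5, 0.5)  # same total throughput

print(f"one core at full speed : {one_fast_core:.2f}")
print(f"two cores at half speed: {two_slow_cores:.2f}")
```

Under this idealized scaling, two half-speed cores deliver the same aggregate throughput at a quarter of the power, which is the "non-linear decrease" the comment is pointing at. Real chips can't scale voltage all the way down with frequency, so actual savings are smaller.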

Indeed, although if the need for more processing power arises from increasing data sets, Amdahl's law isn't as relevant. (Amdahl's law applies when you try to solve an existing problem faster, not when you increase the size of a problem and try to solve it equally fast as before, which is often the case.)
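The distinction between the two regimes can be shown side by side (p = 0.95 and n = 64 are arbitrary example values):

```python
# Amdahl's law: speedup for a FIXED-size problem with parallel fraction p
# when run on n cores.
def amdahl(p, n):
    return 1 / ((1 - p) + p / n)

# Gustafson's law: scaled speedup when the problem GROWS with core count
# so that each core stays busy for the same wall-clock time.
def gustafson(p, n):
    return (1 - p) + p * n

p, n = 0.95, 64
print(f"Amdahl    (p={p}, n={n}): {amdahl(p, n):.1f}x")
print(f"Gustafson (p={p}, n={n}): {gustafson(p, n):.1f}x")
```

For the same parallel fraction, the fixed-size speedup saturates around 15x while the scaled speedup keeps growing with core count, which is why growing data sets blunt Amdahl's pessimism.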

I wasn't trying to say that we can parallelize every problem -- I was commenting that 3D processing structures might very well have merit, since they are useful in cases where you can parallelize. And those cases are plentiful.

You can't compare a brain to a computer; they are nothing alike. Brains are chemical, computers are electric. Brains are analog, computers are digital.

If heat dissipation in the brain was a problem, we wouldn't have evolved to have so much hair on our heads and so little elsewhere; lack of heat to the brain must have been an evolutionary stumbling block.

If heat dissipation in the brain was a problem, we wouldn't have evolved to have so much hair on our heads and so little elsewhere

Hair is insulation against the sun. The reason why Africans have curly hair is to provide insulation while letting cooling air circulate. In colder climates, straight hair still provides enough protection from the sun while letting some air circulate.

I'm not sure I would call the neurons either analog or digital -- they are more complicated than that. But regardless, both the brain and a computer do computations, which is the important aspect in this case.

Not that brain heat dissipation matters for the discussion (as we already know roughly how much energy the brain consumes), but as far as I can recall, some theories in evolutionary biology assume that heat dissipation from the head actually has been a "problem".

But regardless, both the brain and a computer do computations, which is the important aspect in this case

Both an abacus and a slide rule will do computations, too, but they're nothing alike, either. A computer is more like a toaster than a brain or slide rule; you have electrical resistance converting current to heat. The brain has nothing like that (nor does a slide rule or abacus, even though the friction of the beads against wires and the rules sliding must generate some heat).

The existence of the human brain shows that physics allows for three-dimensional computing structures with very high processing power and low energy use, and consequently low heat dissipation. In other words, efficient computation (by today's standards) is not fundamentally limited to flat objects, if you can exploit parallelism.

But it doesn't address the heat problem in electric circuits -- again, it's more like a toaster than a brain. And note that it takes a computer far less time to compute pi to the nth digit than it does the human brain, despite the brain's 3D structure and the computer's 2D one.

That would be because, AFAIK, most of the brain's power goes into conceptualizing and all kinds of other tasks; pure maths is a very small part of its activities. But then again, those people whose visual cortex (or whatever the area is called which handles visual data and eye movements) gets involved are simply amazing at maths. Autistic persons can do amazing things as well.

It's a matter of how the horsepower is used, not its availability, in the case of brains. For a brain it's a very easy task to detect objects and attach all kinds of associations to them.

... consider the human brain -- a massively parallel 3D processing structure. The brain has an estimated processing power of 38*10^15 operations per second (according to this reference [insidehpc.com]), while consuming about 20 W of power (reference [hypertextbook.com])...
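Taking the quoted figures at face value, the energy-per-operation gap is easy to compute (the 100 GFLOPS / 100 W CPU below is a hypothetical comparison point, not a specific product):

```python
# Operations per joule, using the figures quoted above for the brain
# (38e15 ops/s at ~20 W) versus a hypothetical 100 GFLOPS, 100 W CPU.
brain_ops_per_joule = 38e15 / 20
cpu_ops_per_joule = 100e9 / 100  # assumed CPU figures for comparison

print(f"brain: {brain_ops_per_joule:.2e} ops/J")
print(f"cpu  : {cpu_ops_per_joule:.2e} ops/J")
print(f"ratio: {brain_ops_per_joule / cpu_ops_per_joule:,.0f}x")
```

Even granting large error bars on the brain estimate, the gap of roughly six orders of magnitude supports the point that 3D, massively parallel structures can be extremely energy-efficient.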

Good point. I believe I have solved Moore's Law in computing for some years. I need shovels, accomplices, and every Roger Corman movie.

Yes, but I'd like to see a human brain run the Sieve of Eratosthenes, or accurately simulate a 3-body orbit, or run a given large-scale cellular automaton for more than a couple thousand steps.

There are times when parallel computing is useful, but there are also times when pure "how fast can you add 1 + 1" calculations are incredibly useful. You can't just abandon linear computation completely.

There's a great shortcut if you're just adding 1 + 1 over and over. 1 + (previous sum), even...:P. Although I do understand what you are saying.

As another poster said, more efficient software / methods / development processes are probably more important. The problem is that it's hard to balance development by slow, unreliable humans with good design.

Human brains are FAR from unreliable and slow. We just aren't consciously aware of the insane multitude of seemingly trivial and simple tasks that nevertheless require immense processing power to make happen.

Can you stand on one leg? Can you run, jump mid-stride, and keep running without falling down? Can you run your hand down your girlfriend's back with all five fingers almost touching her, but not quite, so that she still feels it?

Yes, certainly we cannot abandon linear computation. The point I was trying to make is that the merit of 3D computational structures isn't nullified by problems with heat dissipation. They would be limited in the sense that they would need to be very parallel -- but that is still useful for a vast number of problems.

You have a point in that the body also consumes power, and is a required support system for the brain (much like my computer needs a PSU, motherboard, memory circuits, etc). However, 100 W is still extremely low from an energy per computation perspective.

For low-power but high transistor (or transistor-substitute) count stuff like memory, I'm inclined to agree with you.

For processors, AFAICT the limiting factor is how much power we can get rid of, more than how many transistors we can pack in.

Also (unless there is a radical change in how we make chips), going 3D is going to be expensive, since each layer of transistors (or transistor substitutes) will require separate deposition, masking, and etching steps.

2D: anything that only has connections in 2 directions. The fact that it's stacked does not change its 2D-ness if the layers don't interact in a significant way (a book would not be considered 3D, nor even 2.5D, nor would a chip structured like a book).

2.5D: anything that has connections in 3 directions, but one of the directions is severely limited in what it can connect and in which way the wires can run (e.g. you can only have wires going straight up, with no further structure).

3D: true 3D means you can etch arbitrary structures along all three axes.

Their technology requires polycrystalline silicon, and the demand is increasing much faster than the supply. China might build more polysilicon factories, but they'll undoubtedly reserve the output for their own uses. This isn't a new problem, since manufacturers have been complaining about shortages since 2006-ish (IIRC).

Were we to run out of silicon, it'd be time to find a new rock, because something very serious has happened to this one. That said, the fact that silicon is among the most common of atoms tells us nothing about the short- to medium-term supply of sufficiently pure and correctly structured polycrystalline silicon.

If it takes 18 months to bring a plant online, that is pretty much the limit of the market's ability to cope with surprise demand (minus any slack in existing capacity that can be wrung out). For highly predictable stuff, no big deal, the plant will be built by the time we need it; but surprises can and do happen, even for common materials (especially given the degree to which "just in time" has come to dominate the supply chain. This isn't your merchant-princes of old, sitting on warehouses piled high. Inventory that isn't flowing like shit through a goose is considered a failure, with the rare exception of "national security" justified stockpiles or the rare hedge or futures position that is actually stored in kind, rather than in electronic accounts somewhere...)

This isn't your merchant-princes of old, sitting on warehouses piled high. Inventory that isn't flowing like shit through a goose is considered a failure...

Sir! Sir! Yes, you! I have a package for you here; it's a plaque from the "most awesome remarks ever" voting board. Yes, that's right, sign here, initial there. Yes, you too sir. Have a good day, sir.

I don't think I do. More money will (typically) speed a process; but there are hard limits. You can pay overtime, run lights all night, bribe permitting guys; but concrete still takes time to set. Steel still takes time to assemble. Fine-tuning touchy chemical processes isn't instant.

You can certainly hire better people faster by throwing more money at them; but that isn't instant either.

The exact shape of the tradeoff curve between time and money varies by enterprise; but it never passes through T=0.

If a single dimension shrinks, assuming the NAND cell structure is similar, there would be roughly a 5x reduction in size in each of the X and Y dimensions. Therefore, you would get up to ~25x more density than current NAND. This is why process technologies roughly target the smallest drawn dimension to progressively double gate density every generation (i.e. 32nm has about 2x the cell density of 45nm).
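The quadratic relationship between feature size and array density is a one-liner (this assumes a simple 2D cell array where both axes shrink by the same factor):

```python
# A linear feature-size shrink gives a quadratic density gain in a 2D array.
def density_gain(old_nm, new_nm):
    return (old_nm / new_nm) ** 2

# The comment above rounds 27/5 to "5x per axis", i.e. ~25x by area;
# the exact ratio of 5.4 per axis gives about 29x.
print(f"27 nm -> 5 nm: {density_gain(27, 5):.1f}x more cells per unit area")
```

This is also why the headline "five times more dense" undersells the claim if the 5x figure refers to the linear dimension rather than the area.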

The big question I have for all of these technologies is whether or not it is mass-production worthy and reliable over a normal usage life.

Just trying to follow what you're saying. You're saying that "X is 200% more than Y" means X=2*Y+Y, but that "X is 2x more than Y" means X=2*Y. I thought that "200%" was synonymous with "2 times".

Ignoring percentages and simply focusing on "X is two times more than Y" meaning X=2*Y, I'm assuming that "X is 1.1 times more than Y" means X=1.1*Y. Does "X is 0.5 times more than Y" mean that X=0.5*Y, that X is actually less than Y? Would this mean that "X is 0 times more than Y" means that X=0?

Best Buy and Amazon are both selling Intel's 40 GB flash drive for just under $100 this week... I'm building a server based around it and will likely later post on how that goes. Intel recently announced that they're upping the sizes so you're likely going to see the 40 GB model in the clearance bin soon.

It's here, it's ready... and when you don't have a TB of data to store they're a great choice, especially when you read much more often than you write.

And if you do need a big SSD Kingston has had a laptop 512GB SSD out since May with huge performance, and this month Toshiba and Samsung will both step up to compete and bring the price down. We're getting close to retiring mechanical media in the first tier. Intel's research shows failure rates of SSD at 10% that of mechanical media. Google will probably have a whitepaper out in the next six months on this issue too.

This is essential because for server consolidation and VDI the storage bottleneck has become an impassable gate with spinning media. These SSDs are being used in shared storage devices (SANs) to deliver the IOPS required to solve this problem. Because incumbent vendors make millions from each of their racks-of-disks SANs, they're not about to migrate to inexpensive SSD, so you'll see SAN products from startups take the field here. The surest way to get your startup bought by an old-school SAN vendor for $Billions is to put a custom derivative of OpenFiler on a dense rack of these SSDs and dish it up as block storage over the user's choice of FC, iSCSI or Infiniband, as well as NFS and SAMBA file-based storage. To get the best bang for the buck, adapt the BackBlaze box [backblaze.com] for SFF SSD drives. Remember to architect for differences in drive bandwidths or you'll build in bottlenecks that will be hard to overcome later and drive business to your competitors with more forethought. Hint: when you're striping in a commit-on-write log-based storage architecture, it's OK to oversubscribe individual drive bandwidths in your fanout to a certain multiple, because the blocking issue is latency, not bandwidth. For extra credit, implement deduplication and back the SSD storage with supercapacitors and/or an immense battery-powered write-cache RAM for nearly instantaneous reliable write commits.

I should probably file for a patent on that, but I won't. If you want to then let me suggest "aggregation of common architectures to create synergistic fusion catalysts for progress" as a working title.

That leaves the network bandwidth problem to solve, but I guess I can leave that for another post.

TMS already made such a product with their RAMSAN; it's a niche player for those who need more IOPS than god and don't need to store very much (i.e. a niche market). Their top-end product claims to do 5M IOPS and 60GB/s sustained, but only stores 100TB, which is the size of a mid-range array like a loaded HP EVA 6400 while costing about 20x more (granted, the EVA tops out in the tens of thousands of IOPS, but that's enough for the vast majority of workloads).

Really? Then my load must be very atypical, because I have about a rack full of database servers and another rack of 144GB x5570 VM servers hitting a half-full 6400, and it's not complaining, let alone being ground to dust =) Now, I'm not doing VDI because I haven't seen a TCO calculation that passes the smell test, but from what I've seen in benchmarks and white papers, putting a few of the HP SSDs into the 6400 should settle the boot-storm problem. Is it possible to completely overwhelm such a SAN with just a couple racks of servers?

So it's more dense than NAND flash (and 3D, wow!), but how does it compare on speed, reliability, and endurance?

Taking a wild guess here: TFA states that the 1/0 is stored as a nanowire that is continuous/interrupted (thus not requiring any electric charge).

Yao applied a charge to the electrodes, which created a conductive pathway by stripping oxygen atoms from the silicon oxide, forming a chain of nanometer-sized silicon crystals. Once formed, the chain can be repeatedly broken and reconnected by applying a pulse of varying voltage, the University said.

Toshiba has started mass production of 24nm NAND cells. Just saying... Intel and Micron are already at 25nm in their most recent production lines, Hynix at 26nm. Only Samsung, albeit the world's first NAND manufacturer, seems to be at 27nm.

Okay, so they claim that the memory is denser than NAND, and cheap to boot. That's great. But TFA makes no mention of its performance. How does the read/write speed compare to that of NAND, or magnetic drives? Could the 3D architecture potentially slow read/write times? I'm not trying to make any claims here, but it's a little disconcerting that there is no mention of it at all within the article.

I'm still waiting for some cheap, stable, high-density ROM, or preferably WORM/PROM. Even flash has only about 20 years' retention with the power off. That sounds like a lot, but it's not all that difficult to find a synthesizer or drum machine from the mid-80s in working condition. If you put flash in everything, though, your favorite devices may be dead in 20 years. For most devices this is OK. But what if some of us want to build something a little more permanent? Like an art piece, a space probe, a DSP-based guitar effects pedal, or a car?

Some kind of device with nanowires that I could fuse to a plate with voltage would be nice, if it could be made at a density of at least 256Mbit (just an arbitrary number I picked). EPROMs (with the little UV window) also only last for about 10-20 years (and a PROM is just an EPROM without a window). So we should expect to already have this digital-decay problem in older electronics. Luckily, for high volumes it was cheaper to use a mask ROM than a PROM or EPROM. But these days NAND flash (MLC) is so cheap and high-density that mask ROMs seem like a thing of the past, to the point that it is difficult to find a place that can do mask ROMs and can also do high-density wafers.

My parents were told that they were lucky the clutch and clutch plate in their car could be replaced, because the car is a whopping 16 years old. A different part for a different model had to be fitted by a tech who happened to be able to figure out it would work, then the adjustments needed to be twiddled. If Ford Motor Company has problems with the rate at which parts become obsolete, I don't imagine many CE companies are planning for 20-year serviceability either.

It is still quite easy to buy replacement parts for 1970s Fords, Chevys, and Chryslers; most of them are third-party aftermarket parts. If your parents took their old car to the dealer, that is likely why there were problems acquiring the part: dealers just look in inventories of OEM parts. An independent mechanic can do a broader search and save you quite a bit of money when fixing an old car.

My car is 17 years old and all the parts I can think of are available. I can still get replacement cylinder heads (albeit at uneconomical prices), not to mention all the other parts I can think of. The only stuff that seems to be unobtainable is the alarm key fobs... I've had Land Rovers that are older than I am, and someone built a complete one from parts a couple of years back. As the parent says, either your dealer couldn't/wouldn't look up aftermarket parts, or you bought one rare-as-rocking-horse-poo car.

I guess the materials alone don't determine the price, but the expertise/work to put them together. I'm also typing on a computer that's made out of cheap materials (lots of plastic, some alumin(i)um, small quantities of other stuff) - but it didn't come that cheap.

Sure it did. When the IBM PC came out with a 4 MHz processor and 64 KB of RAM, it cost ~$4,000. My netbook has an 1800 MHz processor and 1,000,000 KB of RAM plus a 180 GB hard drive, and it cost ~$300. I'd say that's pretty damned cheap.

All of the tech we actually purchase comes out of tech published in articles like this one. Processor process technologies, bus evolutions, memory architectures, and advancements in lithography are printed here and wind up in the products you buy. Not all of the articles are successful technologies, but all of the successful technologies have articles, and the time spent reading about the failures is the price we pay to know about such things in advance. Most of us don't mind, because there are lessons in failures too.

I think the criticism here is aimed at the university labs, where people invent stuff using outrageous amounts of money that is difficult or impossible to commercialize.

Absolutely. Commercial labs rarely do this kind of whizz-bang pre-announcement, which means that virtually any story like this is about a technology that is a) still in the lab and b) will never get out.

You have to get to the second page of the article to find out that some tiny tech company no one has ever heard of is "testing" a 1 kilo-bit chip these guys have made. That's right, a whole 128 bytes!

Unsurprisingly, the company is impressed. I was always impressed by the stuff my clients were doing too.

Some supercapacitors have made it to market and refinements on lithium technologies have come a long way in the last decade, tripling the maximum storage density available. The problem is our demand for portable power has outstripped that growth (my blackberry is significantly more powerful than my desktop from 10 years ago and talks 6 different wireless protocols).