I will list a chip in the table above when we have all of the following data:

Hashrate either in a claim from the manufacturer or measurement by a third party

Die size either in an unambiguous claim by the manufacturer or die photo from a third party

Process node in an unambiguous claim by the manufacturer

A plausible date by which independent verification will be possible.

Summary

As more and more announcements about bitcoin-specific chips come out, it would be useful to have a metric that compares the quality of the underlying design. I recommend "hash-meters per second" as a metric. This is calculated by dividing the hashrate (in H/s) by the die area in square meters and then multiplying by the cube of the process's feature size in meters (half of the process node's "name", so a 90nm process has a 45nm feature size). If you use hash-picometers instead of hash-meters you wind up with reasonable-sized numbers.

Current GPUs and FPGAs get 8-24 H*pm/s; the three ASICs we have numbers for have η-factors around 2,400-2,800 H*pm/s -- 100 times more efficient use of silicon than FPGAs and GPUs.

Migrating a design from one process to another by direct scaling -- when possible -- will not change this metric. Therefore it gives you a good idea of how the "rising tide" of semiconductor process technology will lift the various "boats".

Details

Process-invariant metrics factor out the contribution of capital to the end product, since the expenditure of capital can overwhelm the quality of the actual IP and give misleading projections of its future potential. A 28nm mask set costs at least 1000 times as much as a 350nm mask set, but migrating a design from 350nm to 28nm is not going to give you anywhere near 1000 times as much hashpower.

This metric probably does not matter for immediate end-user purchasing decisions -- MH/$ and MH/J matter more for that -- but for investors, designers, and long-range planning purposes it gives a better idea of how much "headroom" a given design has to improve simply by throwing more money at it and using a more-expensive IC process. Alternatively, this can be seen as a measure of how much of its performance is due to money having been thrown at it. That is important for investors -- and the line between presale-customers and investors is a bit blurry these days with all the recent announcements.

As semiconductor processes become more advanced, two important things happen:

1. The transistors get smaller (area).

2. The time required for transistors to turn on gets shorter (speed).

Area

Generally #1 (area) is indicated by the process name. For example, in a 90nm process the smallest transistor gates are 90nm long.

Chip designers refer to half of this length (i.e. 45nm on a 90nm process) as the feature size. The feature size is half of a gate length because you can always place transistors on a grid whose squares are at least half the length of the smallest gate. Usually you get an even finer grid than that, but it's not universally guaranteed.

Therefore, to get a process-independent measure of the size of a circuit, measure the circuit's area (units: square meters) and divide that by the square of the feature size (units: square meters) to get a unitless quantity. Well, almost unitless. Technically the units for a process's feature size are "meters per lambda" rather than meters, meaning the units for the final quantity should be (hash-meters) per (second*lambda-cubed).
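As a rough sketch of the normalization just described (the function name is my own, and the inputs are only illustrative: a 300mm^2 die on a 45nm process, i.e. a 22.5nm feature size), the area in "square lambda" could be computed like this:

```python
def area_in_square_lambda(area_m2: float, feature_size_m: float) -> float:
    """Express a die area as a multiple of feature_size^2 (i.e. in square lambda)."""
    if area_m2 <= 0 or feature_size_m <= 0:
        raise ValueError("area and feature size must be positive")
    return area_m2 / feature_size_m ** 2


# Illustrative inputs: a 300 mm^2 die on a 45nm process (22.5nm feature size).
squares = area_in_square_lambda(300e-6, 22.5e-9)
print(f"{squares:.3e} square lambda")
```

The result is a count of feature-size squares, so two layouts on different processes can be compared by how many grid squares they occupy rather than by raw silicon area.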

Speed

Semiconductor processes are also characterized by a measure called "tau", which is the RC time constant of the process. This is the time it takes a symmetric inverter to drive a wire high or low, assuming the wire has no load.

The raw tau factor ignores the load presented by wires and other gates, so some designers prefer to use the FO4 delay instead, which is also called the normalized gate delay. FO4 is the same measurement, but each gate drives four copies of itself.

Unfortunately the tau and FO4 numbers can be hard to come by, and they frequently get mixed up with each other (one is listed where the other ought to be). Also, there is a bit of "wiggle room" in exactly how the RC circuit or loading is done, so it's common to see inconsistent numbers cited by different sources for the same process. Because of this, using tau or FO4 directly in a competitive metric is a bad idea: people will fight over which tau or FO4 numbers to use. A previous proposal used gate delays as part of the metric, but I no longer recommend that metric since if it were to gain popularity it would inevitably lead to people playing games with the tau/FO4 numbers, picking and choosing whichever number cast their favorite product in the best light.

Fortunately, there is a fix. All we need here is a relative comparison of two circuits. It turns out that both tau and FO4 scale more or less linearly with the gate length (and therefore with the feature size). So instead of converting hashes/sec into hashes/tau or hashes/FO4 we can use the feature size as a proxy for the gate delay time and multiply the measure of hashes/sec by the feature size instead of multiplying by the tau/FO4 time. The resulting number will be totally meaningless as an absolute quantity, but the ratio of this metric for two different circuits will still give the ratio of their performance on equivalent processes.
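Here is a small sketch of why the feature-size proxy makes the metric process-invariant. It assumes an idealized shrink by a linear factor s, where area scales with s^2 and gate delay scales with s (so hashrate scales with 1/s). Real ports will deviate from this; the function names are hypothetical.

```python
import math


def eta(hashrate_hps: float, area_m2: float, feature_size_m: float) -> float:
    """(hashrate / area) * feature_size^3, in H*m/s."""
    return hashrate_hps / area_m2 * feature_size_m ** 3


def direct_scale(hashrate_hps, area_m2, feature_size_m, s):
    """Idealized shrink by linear factor s (ASSUMPTION: delay ~ s, area ~ s^2)."""
    return hashrate_hps / s, area_m2 * s ** 2, feature_size_m * s


# Illustrative design: 300 MH/s on a 300 mm^2 die with a 22.5nm feature size.
before = (300e6, 300e-6, 22.5e-9)
after = direct_scale(*before, s=0.5)  # shrink every linear dimension by half

print(eta(*before), eta(*after))
print("invariant:", math.isclose(eta(*before), eta(*after)))
```

Under those assumptions the three factors of s cancel (one from speed, two from area), which is exactly why the feature size is cubed.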

Formula

So the formula is:

(hashrate / area_in_square_lambda) * gate_switching_time

The units for this number are simply "hashes" (or "hashes per square lambda").

However remember that we're using feature_size (measured in meters per lambda) as a proxy for gate_switching_time since there is less wiggle room in how feature_size is measured and the two values tend to scale proportionally. This substitution gives us:

(hashrate / area_in_square_lambda) * feature_size

Since area_in_square_lambda is (area_in_square_meters / feature_size^2) we can substitute to get:

(hashrate / (area_in_square_meters / feature_size^2)) * feature_size

which is equivalent to

((hashrate * feature_size^2) / area_in_square_meters) * feature_size

Collecting the occurrences of feature_size gives us:

(hashrate * feature_size^3) / area_in_square_meters

or alternatively:

(hashrate / area_in_square_meters) * feature_size^3

Example

The Bitfury hasher gets 300MH/s:

300*10^6 H/s

It runs on a Spartan-6, which has a 300mm^2 (300*10^-6 m^2) die. Dividing the hashrate by the area in square meters gives:

1*10^12 H/(s*m^2)

This is why the Bitfury hasher is a convenient example -- by coincidence, its hashrate in MH/s happens to equal its die area in square millimeters, which makes the numbers simpler.

Multiplying the number above by the cube of the feature_size (22.5*10^-9 m), which is 11390.625*10^-27 m^3, gives

11390.625*10^-15 H*m/s

which is:

11.390625*10^-12 H*m/s

The SI prefix for 10^-12 is "pico", so the Bitfury hasher gets

11.39 H*pm/s
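The arithmetic in this example can be double-checked with a few lines of Python. This is only a sketch: the function name is mine, and the inputs are the figures used above (300MH/s, a 300mm^2 die, a 22.5nm feature size).

```python
def eta_si(hashrate_hps: float, area_m2: float, feature_size_m: float) -> float:
    """(hashrate / area_in_square_meters) * feature_size^3, in H*m/s."""
    return (hashrate_hps / area_m2) * feature_size_m ** 3


hashrate = 300e6        # 300 MH/s in H/s
area = 300e-6           # 300 mm^2 in m^2
feature_size = 22.5e-9  # half of the 45nm process name, in meters

eta_hm = eta_si(hashrate, area, feature_size)  # H*m/s
eta_hpm = eta_hm * 1e12                        # convert meters to picometers
print(f"{eta_hpm:.2f} H*pm/s")
```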

Summary

To compute the metric, take the overall throughput of the device (hashes/sec), divide by the chip area measured in square meters and multiply by the cube of the process's feature size. Shortcut: take the hashrate in gigahashes per second, divide by the area in mm^2, and multiply by the feature size (half the minimum gate length) in nanometers three times. The result comes out directly in H*pm/s.
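The shortcut works because the unit prefixes cancel: 10^9 (giga) divided by 10^-6 (mm^2) times 10^-27 (nm cubed) leaves 10^-12, i.e. pico. A minimal sketch, with a function name of my own choosing, using the Bitfury figures from the example:

```python
import math


def eta_shortcut(hashrate_ghps: float, area_mm2: float, feature_size_nm: float) -> float:
    """Shortcut form: GH/s divided by mm^2, times nm cubed, yields H*pm/s directly."""
    return hashrate_ghps / area_mm2 * feature_size_nm ** 3


def eta_si_in_pm(hashrate_hps: float, area_m2: float, feature_size_m: float) -> float:
    """Full SI form, converted from H*m/s to H*pm/s for comparison."""
    return hashrate_hps / area_m2 * feature_size_m ** 3 * 1e12


short = eta_shortcut(0.3, 300, 22.5)           # 300 MH/s = 0.3 GH/s
full = eta_si_in_pm(300e6, 300e-6, 22.5e-9)
print(short, full, math.isclose(short, full))
```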

This number can then be used to project the performance of the same design under the huge assumption that the layout won't have to be changed radically. This assumption is almost always false, but assuming the design is ported with the same level of skill and same amount of time as the original layout, it's unlikely to be wrong by a factor of two or more. So I would consider this metric to be useful for projecting the results of porting a design up to roughly a factor of 2x. That might sound bad, but at the moment we don't have anything better. It also gives you an idea of how efficiently you're utilizing the transistors; once I get the numbers I'm looking forward to seeing how huge the divergence is between CPUs/GPUs/FPGAs/ASICs.
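To illustrate the kind of projection meant here, the formula can be inverted to estimate the hashrate of the same design on another process. Everything below is a sketch under the assumptions stated above (a port of similar skill, no radical layout changes). The target process (11.25nm feature size at the same 300mm^2 area) is purely hypothetical, and the 2x band simply encodes the "factor of two" caveat.

```python
import math


def project_hashrate_ghps(eta_hpm: float, area_mm2: float, feature_size_nm: float) -> float:
    """Invert the shortcut formula: hashrate (GH/s) = eta * area (mm^2) / feature_size (nm)^3."""
    return eta_hpm * area_mm2 / feature_size_nm ** 3


def with_error_band(estimate: float, factor: float = 2.0):
    """Return (low, estimate, high) using a multiplicative uncertainty factor."""
    return estimate / factor, estimate, estimate * factor


# eta from the Bitfury example; HYPOTHETICAL target: half the feature size, same die area.
low, mid, high = with_error_band(project_hashrate_ghps(11.390625, 300, 11.25))
print(f"projected: {mid:.2f} GH/s (plausible range {low:.2f} to {high:.2f} GH/s)")
```

Halving the feature size at constant area multiplies the projected hashrate by eight, which is the cube in the formula at work.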

I propose to denote this metric by the Greek letter η, from which the Latin letter "H" arose. "H" is for hashpower, of course. Here is a table of some existing designs and their η-factor (I will update this periodically):

This metric does not take power consumption into account in any way. I believe there ought to be a separate process-independent metric for that.

If anybody can add information to the table, please post below. Getting die sizes can be difficult; I know the Spartan-6 die size above is a conservative estimate (it definitely isn't any bigger or it wouldn't fit in the csg484).

The printing press heralded the end of the Dark Ages and made the Enlightenment possible, but it took another three centuries before any country managed to put freedom of the press beyond the reach of legislators. So it may take a while before cryptocurrencies are free of the AML-NSA-KYC surveillance plague.

Quote

Example

The Bitfury hasher gets 300MH/s: 300*10^6 H/s

It runs on a Spartan-6, which is a 45nm device with lambda=22.5nm on a 300mm^2 die.

Dividing the hashrate by the area gives: 1*10^6 H/(s*mm^2)

Converting from mm^2 to m^2 gives 1 H/(s*m^2)

Dividing this by lambda (22.5*10^-9 meters) gives 0.0444*10^9 H/(s*m), which is 44.44*10^6 H/(s*m) or roughly 44MH/s*m.

Summary

To compute the metric, take the overall throughput of the device (hashes/sec), divide by the chip area measured in square meters and divide again by the lambda factor for the process used.

If you want to see how efficiently the transistors are utilized, you have to multiply by lambda (in meters) rather than divide by it. In effect the units will actually be H/(m*s), but the calculated values will be a dozen orders of magnitude smaller.

Yes, I swapped the multiplication and division in the instructions. Thanks for catching this! I've fixed it.


I've also added figures for the ATI 5870 since it seemed to be the popular card (I've never mined with GPUs so I'm probably wrong here). I was initially surprised to find that it has an η-factor that is actually on par with most Spartan-6 bitstreams. Three reasons for this:

1. We have an exact die size for the 5870 but only an upper bound for the Spartan-6. The Spartan-6 is definitely smaller than 300mm^2, but Xilinx won't say how much smaller and I haven't gotten around to grinding the top off of one of my dead chips yet.

2. If you think about it, FPGAs have an enormous amount of routing, and any given design uses only a tiny fraction of it (probably under 5%). Since η measures only how efficiently the silicon is used and has no bearing on power efficiency, it shouldn't be all that surprising that the unused routing on an FPGA accounts for about the same amount of silicon as the architecture-task mismatch on a GPU. The main difference is that on an FPGA that unused routing sits idle and consumes no power.

3. The $/(MH/s) for 5870's and volume-priced Spartan-6's using a high-end bitstream is nearly identical -- 2 $/(MH/s). Unfortunately the mining-hardware market is a lot smaller than the GPU market, so FPGA mining board vendors' markups have to be a lot higher than ATI's.

Edit: it turns out that my estimate of the die size for Spartan-6 was wildly off -- wrong by 250%. See below for actual measurements from a demolished chip.


Updated with die size for BFL SC. We now have the first ASIC η-factor figures! Thanks for the transparency, BFL. Hopefully your competitors will follow suit.


So ... reading through the first post ... I've not spotted where it says what use this is.

Here, I'll put it in red for you:

Quote

Migrating a design from one process to another by optical scaling -- when possible -- will not change this metric. Therefore it gives you a good idea of how the "rising tide" of semiconductor process technology will lift the various boats.

This metric probably does not matter for immediate end-user purchasing decisions -- MH/$ and MH/J matter more for that -- but for investors, designers, and long-range planning purposes it gives a better idea of how much "headroom" a given design has to improve simply by throwing more money at it and using a more-expensive IC process. Alternatively, this can be seen as a measure of how much of its performance is due to money having been thrown at it. That is important for investors -- and the line between presale-customers and investors is a bit blurry these days with all the recent announcements.

When firstly it ignores the majority of the non-GPU devices currently mining: Icarus, Lancelot, ModMiner, Cairnsmore …

None of these mine without a bitstream. The bitstream affects the hashrate (and therefore the η-factor) a lot while the particular board has very little impact aside from the number of chips on it. That's why the η-factor is listed by bitstream, per chip -- just like hashrates.

The "M" was a typo (fixed). All the numbers in that column have the same units.


When firstly it ignores the majority of the non-GPU devices currently mining: Icarus, Lancelot, ModMiner, Cairnsmore …

None of these mine without a bitstream. The bitstream affects the hashrate (and therefore the η-factor) a lot while the particular board has very little impact aside from the number of chips on it. That's why the η-factor is listed by bitstream, per chip -- just like hashrates.

So you're saying your metric isn't much use since you can't list the most common mining FPGA's?

There are bitstreams and devices used WAY more than anything you listed (ignoring GPUs)


Updated with info for the BFL SC card, thanks to BFL themselves (hint to their competitors: maybe you might want to consider releasing figures like they have?)

This increases the urgency of getting an exact die size for the Spartan-6. We've always known the die size is less than 300mm^2: for one, the package cavity is square but the chip is rectangular: in FPGA editor it's almost twice as tall as it is wide. There's no guarantee that aspect ratio matches the silicon, but it's unlikely to be off by so much that it's square in real life.

I really doubt that BFL is squeezing nearly twice the η-factor out of their chips compared to anybody else, so I now suspect that the Spartan-6 die is substantially smaller than the 300mm^2 package cavity. Unfortunately I seem to have lost the two dead chips I had… argh. I'm almost tempted to sacrifice one of the occasionally-flaky-but-mostly-working ones.

Also keep in mind that there's a substantial amount of per-die overhead for I/O pads and clocking infrastructure, so using two huge chips (like BFL does) instead of five tiny ones (like Bitfury would to get the same hashrate) is inherently a more efficient use of silicon -- but not 2x more efficient. Sadly there aren't any bitstreams for Virtex-class devices that have had as much care put into them as the Bitfury/Tricone/BFL bitstreams for their respective devices.

Edit: the die size estimate for the Spartan-6 was off by 250%; I have actual measurements now (see below).


A moment of silence, please, for the XC6LX150 pictured below. He gave his life (and his 224MH/s -- slowest chip in the cluster) in the name of science:

Apologies for the low-tech measuring equipment and misaligned ruler.

It turns out that my estimate of the size of the Spartan-6 was an overestimate by more than a factor of 2! The die itself is 10mm on one side and between 11mm and 12mm on the other side. Let's call it 10x12 = 120mm^2. I've updated the η-factors for all the Spartan-6 bitstreams; these are now final numbers using actual measurements (not estimates) for all of the parameters.
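Since η is inversely proportional to die area, correcting the area from 300mm^2 to 120mm^2 multiplies every Spartan-6 η-factor by the same 2.5x. A quick sketch of that rescaling follows; the function name is mine, and the starting value of 10.0 is a placeholder rather than a figure from the table.

```python
def rescale_eta(eta_old: float, old_area_mm2: float, new_area_mm2: float) -> float:
    """Correct an eta-factor after a die-area revision (eta is proportional to 1/area)."""
    return eta_old * old_area_mm2 / new_area_mm2


correction = rescale_eta(1.0, 300, 120)     # multiplier applied to every Spartan-6 entry
placeholder = rescale_eta(10.0, 300, 120)   # 10.0 is a PLACEHOLDER, not a table value
print(f"correction factor: {correction}x; placeholder eta 10.0 becomes {placeholder}")
```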

PS: this means that Xilinx's "CS" package, which is supposed to stand for "Chip Scale", is not actually a Chip Scale Package. I had assumed it was.


Avalon chip count and power usage are available. You can now update your comparison table.

Thanks, but I need the actual die measurements, not the number of chips-per-wafer.

Please let me know if/when they are posted by either the Avalon manufacturers (I'll take their word for it) or some third party (must include a photo).
