One thing AMD has taught me is that you can never beat Intel at its own game. Simply trying to do what Intel does will leave you confined to whatever low margin market Intel deems too unattractive to pursue. It’s exactly why AMD’s most successful CPU architectures are those that implement features that Intel doesn’t have today, but perhaps will have in a few years. Competing isn’t enough, you must innovate. Trying to approach the same problem in the same way but somehow do it better doesn’t work well when your competition makes $9B a quarter.

We saw this in the SSD space as well. In the year since Intel’s X25-M arrived, the best we’ve seen is a controller that can sort of do what Intel’s can, just at a cheaper price. Even then, the cost savings aren’t that great because Intel gets great NAND pricing. We need companies like Indilinx to put cost pressure on Intel, but we also need the equivalent of an AMD: a company that can put technological pressure on Intel.

That company, at least today, is SandForce. And its disciple? OCZ. Yep, they’re back.

Why I Hate New SSDs

I’ll admit, I haven’t really been looking forward to this day. Around the time when OCZ and Indilinx finally got their controller and firmware to acceptable levels, OCZ CMO Alex Mei dropped a bombshell on me - OCZ’s Vertex 2 would use a new controller by a company I’d never heard of. Great.

You may remember my back and forth with OCZ CEO Ryan Petersen about the first incarnation of the Vertex drive before it was released. Needless to say, what I wrote in the SSD Anthology was an abridged (and nicer) version of the back and forth that went on in the months prior to that product launch. After the whole JMicron fiasco, I don’t trust these SSD makers or controller manufacturers to deliver products that are actually good.

Aw, sweet. You'd never hurt me would you?

Which means that I’ve got to approach every new drive and every new controller with the assumption that it’s either going to somehow suck, or lose your data. And I need to figure out how. Synonyms for daunting should be popping into your heads now.

Ultimately, the task of putting these drives to the test falls on the heads of you all - the early adopters. It’s only after we collectively put these drives through hundreds and thousands of hours of real world usage that we can determine whether or not they’re sponge-worthy. Even Intel managed to screw up two firmware releases and they do more in-house validation than any company I’ve ever worked with. The bugs of course never appeared in my testing, but only in the field in the hands of paying customers. I hate that it has to be this way, but we live in the wild west of solid state storage. It’ll be a while before you can embrace any new product with confidence.

And it only gets more complicated from here on out. The old JMicron drives were easy to cast aside. They behaved like jerks when you tried to use them. Now the true difference between SSDs rears its head after months or years of use.

I say that because unlike my first experience with OCZ’s Vertex, the Vertex 2 did not disappoint. Or to put it more directly: it’s the first drive I’ve seen that’s actually better than Intel’s X25-M G2.

If you haven't read any of our previous SSD articles, I'd suggest brushing up on The Relapse before moving on. The background will help.

Call me cynical but I'd be very suspicious of benchmark results from this controller. How can you be sure that the write amplification during the benchmark resembles that during real world use? If you write completely random data to the disk then surely it's impossible to achieve a write amplification of less than 1.0? I would have thought that home users would be mostly storing compressed images, audio and video, which must be pretty close to random. I'd also be interested to know if the deduplication/compression is helping them to increase the effective reserved space. That would go a long way to mask read-modify-write latency issues but again, what happens if the data on the disk can't be deduplicated/compressed?
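The random-data concern can be checked roughly on any desktop. The sketch below is only an illustration: it uses Python's zlib as a stand-in for whatever the controller actually does (the SandForce algorithm is not public), and the helper name compression_only_write_ratio is made up for this example. It counts compression alone and ignores garbage collection, deduplication and spare area.

```python
import os
import zlib

def compression_only_write_ratio(host_data: bytes) -> float:
    """Bytes that would reach flash divided by bytes the host sent.

    Hypothetical model: compression is the only effect counted, and
    zlib stands in for the controller's unknown algorithm.
    """
    stored = zlib.compress(host_data)
    return len(stored) / len(host_data)

# Highly repetitive data, similar in spirit to logs or plain text.
text_like = b"The quick brown fox jumps over the lazy dog. " * 20_000
# Random bytes approximate already-compressed media (JPEG, MP3, H.264).
random_like = os.urandom(len(text_like))

print(f"repetitive text: {compression_only_write_ratio(text_like):.3f}")
print(f"random bytes:    {compression_only_write_ratio(random_like):.3f}")
```

Under these assumptions the repetitive input lands far below 1.0, while the random input cannot get below 1.0 and picks up a little container overhead instead.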

It's always a matter of which compression algorithm is used. There are algorithms that are able to compress a whole AVI movie (= already compressed) to a few megabytes. The problem with these algorithms is that they are so demanding it takes days even for a neural network to compress and decompress. We had one "very simple" compression algorithm in graph theory classes... honestly I got completely lost after reading the first paragraph (out of like 30 pages).

So depending on algorithms used you can compress already compressed data. You can take your bitmap, run it through Run Length Encoding, then run it through Huffman encoding and finish with some dictionary based encoding... In most cases you'll compress your data a bit more every time.
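Whether chaining helps is easy to try. The sketch below is an illustration only: it pushes a synthetic low-entropy buffer through three standard-library compressors in sequence (zlib, bz2, lzma) and records the size after each stage. These are general-purpose stand-ins chosen for the example, not the RLE/Huffman/dictionary chain named above and not anything the controller is known to use.

```python
import bz2
import lzma
import random
import zlib

def chained_sizes(data: bytes) -> list:
    """Return [original size, size after zlib, after bz2, after lzma],
    where each stage compresses the previous stage's output."""
    sizes = [len(data)]
    for compress in (zlib.compress, bz2.compress, lzma.compress):
        data = compress(data)
        sizes.append(len(data))
    return sizes

# Synthetic input: 16 possible byte values, so about 4 bits of
# information per byte. Fixed seed keeps the run repeatable.
rng = random.Random(0)
raw = bytes(rng.choice(b"ABCDEFGHIJKLMNOP") for _ in range(200_000))

sizes = chained_sizes(raw)
for label, size in zip(("raw", "zlib", "then bz2", "then lzma"), sizes):
    print(f"{label:>10}: {size} bytes")
```

On this input the first pass removes most of the redundancy and the later passes have essentially nothing left to work with. Extremely repetitive inputs (a file of all zeros, for example) can behave differently, because a single pass may leave output that is itself repetitive.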

There is no way to tell how this new technology handles its task in the end. Not until it has been run with petabytes of data.

Please don't use an authoritative tone when you actually don't know much about the subject. You are likely to confuse readers who believe that what you write is factual.

The compression of movies that you were talking about is a lossy compression and would never, ever be suitable in any way for compressing data internally within an SSD.

Run Length Encoding requires knowledge of the internal structure of the data being stored, and an SSD is an agnostic device that knows nothing about the data itself, so that's out.

Huffman encoding (or derivatives thereof) is universally used in pretty much every compression algorithm, so it's pretty much a given that this is a component of whatever compression SandForce is using. Also, dictionary based encoding is once again only relevant when you are dealing with data of a generally restricted form, not for data which you know nothing about, so it's out; and even if it were used, it would be used before Huffman encoding, not after it as you suggested.

I think your basic point is that many different individual compression technologies can be combined (typically by being applied successively); but that's already very much de rigueur in compression, with every modern compression algorithm I am familiar with already combining several techniques to produce whatever combination of speed and effective compression ratios is desired. And every compression algorithm has certain types of data that it works better on, and certain types of data that it works worse on, than other algorithms.

I am skeptical about SandForce's technology; if it relies on compression then it is likely to perform quite poorly in certain circumstances (as others have pointed out); it reminds me of "web accelerator" snake oil technology that advertised ridiculous speeds out of 56K modems, and which only worked for uncompressed data, and even then, not very well.

Furthermore, this tradeoff of on-board DRAM for extra spare flash seems particularly backwards. Why would you design algorithms that do away with cheap DRAM in favor of expensive flash? You want to use as little flash as possible, because that's the expensive part of an SSD; the DRAM cache is a minuscule part of the total cost of the SSD, so who cares about optimizing that away?

Well I know quite a bit about the subject, but if you feel offended in any way I am sorry.

I think it's more that we had a bit of a misunderstanding.

What I wrote was more or less a series of examples where you could go and compress some already compressed data.

It's quite common knowledge that you won't be able to losslessly compress a well-made AVI movie with commonly used lossless compression software like ZIP or RAR. But that is not even the slightest proof that there isn't some kind of algorithm that can compress this data to a much smaller volume.

To illustrate my point I took the example of a bitmap (uncompressed) and then applied various lossless compression algorithms. In most cases, every time I applied another algorithm I would get more and more compressed data (well, maybe except RLE, which could end up with a result longer than the original file).

I was not attributing any specific "front end" algorithms to this controller, because honestly all talk about how (or if) it compresses the data is mere speculation. So I went back to basics to keep the idea as simple as possible.

The whole point I was trying to make is that there is no way to tell whether it saves data traffic to the NAND when you save your file XY on this device, simply because we don't know what kind of algorithm is used. We can only guess by trying to compress the file with common algorithms (be they lossless or not) and then checking whether the controller saves the NAND some work. Of course, the algorithms used on the controller must be lossless and must be stable. But that's about all we can say at this point.

Sorry if I caused some kind of confusion.

What _seems_ to me is that basically there is this difference between X25-M and Vertex 2 Pro logic (taking the 25 vs 11 gigs example used in the article):
System -> 25GB -> X25-M controller -> writes/overwrites -> 25 GB -> NAND flash (but due to overwrites, deletes etc. there is 11GB of data)

[quote]
It's quite common knowledge that you won't be able to losslessly compress a well-made AVI movie with commonly used lossless compression software like ZIP or RAR. But that is not even the slightest proof that there isn't some kind of algorithm that can compress this data to a much smaller volume.
[/quote]

Well actually there is. The entropy of the original file bounds the minimum possible size of the compressed file. It's the same reason you compress before encrypting something: as the goal of encryption is to generate maximum entropy, encrypted data cannot be compressed further. Not even with some advanced but not yet known algorithm.
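The entropy argument can be made concrete with a small sketch. It measures the empirical byte-level (order-0) Shannon entropy of a buffer and then hands the same buffer to zlib. Both are assumptions of this example: pseudo-random bytes stand in for encrypted or already-compressed data, and order-0 entropy is only the true bound when bytes are independent, so structured data such as text can compress below what this simple measure suggests.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Empirical order-0 Shannon entropy in bits per byte (8.0 is the maximum)."""
    total = len(data)
    counts = Counter(data)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

# Pseudo-random bytes as a stand-in for encrypted data; the fixed seed
# keeps the run repeatable.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(100_000))

print(f"entropy: {byte_entropy(noise):.3f} bits/byte")
print(f"zlib:    {len(noise)} -> {len(zlib.compress(noise, 9))} bytes")
```

With entropy sitting at essentially 8 bits per byte, the compressor has nothing to remove and its output comes back no smaller than the input.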