Friday, May 23. 2008

So the question has been bothering me for quite some time: how many times can I really write to one of these USB memory sticks before it finally dies? With USB-drive-based distributions becoming as popular as they are, I was wondering how practical these distributions really are. The stick used in this experiment is a 1G Sony Microvault USB flash drive.

I'm going to see how many times I have to write to this before it dies. (More after the fold; this is a very long post.)

Prepping the Media
I had to write a utility that could read and write files using the O_DIRECT flag, which makes the I/O bypass the kernel's cache and go straight to the device. This is needed since the kernel likes to cache file data in RAM. That's ideal under most circumstances, but for this testing we don't want it. In order to use O_DIRECT, I'm going to have to reformat the drive to use the ext3 filesystem. The drive came with a vfat filesystem, which doesn't appear to support O_DIRECT properly.
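For the curious, opening a file this way looks roughly like the snippet below. This is a minimal sketch, not the exact utility I used; note that O_DIRECT requires the I/O buffer (and transfer size) to be suitably aligned, which is why a plain malloc'd buffer won't work:

/* Sketch: open a file for O_DIRECT I/O on Linux. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    int fd = open("test-file", O_RDWR | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* The buffer must be aligned; 1024 matches my filesystem block size. */
    void *buf;
    if (posix_memalign(&buf, 1024, 1024) != 0) return 1;

    /* ... fill buf, then write(fd, buf, 1024) goes straight to the device ... */

    free(buf);
    close(fd);
    return 0;
}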

Here is how I created the filesystems:

mkfs.ext3 -m 0 -b 1024 /dev/sdb1

Yes, it has a journal, but that's how I would actually use the drive, so that's acceptable.

The Test
Fill up the disk, leaving one free block. Then write to that block over and over, with new random data on each write, until the device fails.

Create test file:

dd if=/dev/urandom of=test-file bs=1024 count=1

Sync that file once, then fill up the disk using this command:

dd if=/dev/urandom of=big-file

We then start running the application, which will overwrite test-file until the drive fails.

This test isn't really possible as described, since a one-block file actually needs more than one block; presumably the data block plus filesystem metadata such as the inode and journal, magic I don't completely understand. That's OK: as long as all the needed blocks get written on each pass, the test still makes sense. It turned out the drive had to have 3 free blocks for this test to work.

The numbers I keep reading say that a USB flash drive block can take about 10K writes before it dies, so with 3 blocks in play I'm going to guess I should get somewhere around 30K writes before the device fails.

The Results
It turns out that the wear leveling works better than I expected. It took 90,593,104 writes for the drive to die. That's 90.5 million, well beyond my expected 30K. From looking at my data, it appears that the flash drive moves used blocks around as part of the wear leveling. As I had no idea how the wear leveling would work, this was a pleasant surprise. The graph below shows the number of microseconds needed for each write. As I have too much data for gnuplot to handle easily, I'm only graphing every 1000th point. There appear to be three distinct lines: the top two, I presume, are when the wear leveling is moving blocks around, while the bottom solid line shows that the vast majority of writes took about 1500 microseconds.

The long write times don't seem to follow any real pattern. I graphed how many writes occurred between each "long" write (one that took more than 10,000 microseconds), and that graph is even more puzzling. It would seem the drive was well aware of its impending death and, toward the end, stopped wear leveling as often.

The reads were a bit more constant. Even as the drive was dying, the time it took to read data never really changed.

The Death
The application I was using to read and write the data isn't what caught the death of the drive; it ended up being ext3 itself. While the application did quit working, it failed during the write, not during the read-back verification. This was the console message I saw:

I can still mount and read from the drive, but I can no longer write to it. It's nice to know that when a drive dies, it's more likely you just won't be able to write new data to it than that you'll suffer complete data loss.

I'm happy to say that I now have far more confidence in these little drives than I thought I would. While they obviously can't stand up to the sort of abuse you could give a hard drive, they're easily good enough for my emergency needs.

You really should be doing your test with ext2 mounted 'noatime', not ext3. With ext3, the filesystem writes to the journal every time metadata for the file is synced, and with default options it updates that metadata (atime/mtime) on every write. This means several blocks beyond your file data get written each time.
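Something like this would do it, reusing the mkfs parameters from the post:

mkfs.ext2 -m 0 -b 1024 /dev/sdb1
mount -o noatime /dev/sdb1 /mnt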

The noatime suggestion is a good call, but I was well aware of the journal updates, and having them there was intentional. My goal wasn't so much to get an exact number of writes as to get a general idea of how a typical filesystem, as I'd run it, would stand up.

I'm certain there are lots of things I could have done to make this test more exact. Perhaps I'll put this knowledge to use in a future run.

Absolutely fascinating for a lurker who knows little but loves to listen to the "big kids" talk. As others have said, this does increase my confidence that the sticks I carry are likely to be functional longer than I feared.

I am using ext3 on Ubuntu with an 8 GB A-Data USB flash drive.
I use:
noatime in /etc/fstab
tmpfs on /tmp
I tried to raise the commit interval (something like the fstab lines below), but I don't know if it works correctly.
I disabled the disk cache in Firefox, but it still seems to me that Firefox is doing most of the writes to the drive.
My machine writes several hundred megabytes to the drive each day while browsing the Internet and doing normal stuff.
You left very little space for the wear leveling to kick in. If I leave approximately 1 GB of the 8 GB free at all times, how long can the flash drive live?
I would run the experiment myself if you gave your testing software to the public.
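Roughly, the relevant /etc/fstab lines would be (the device name and commit value here are examples from my setup, not the author's):

/dev/sdb1  /     ext3   noatime,commit=60  0  1
tmpfs      /tmp  tmpfs  defaults           0  0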

Hello, can you please post the script you used here, or at least some explanation of how to bypass the caching using that flag?
I created a small script in Fedora that continuously regenerates the test file on the USB key, with a simple counter to show me how many times the file has been written, but I don't think it works because of the caching!

It's not a script, but rather two small C programs I wrote. You can get them here:
http://josh.bressers.name/sync-file.c
http://josh.bressers.name/sync-random.c

Just build them with gcc; no extra libraries are needed.

sync-file.c copies a file using the O_DIRECT flag.
sync-random.c writes random data to the output file. The random data matters: if the flash drive is smart, it will only flip the bits that change, so writing the same data over itself wouldn't result in an actual write.
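In case those links ever go away, here's a rough sketch of the idea behind sync-random. This is an illustrative reconstruction, not the original source (the original, as a commenter notes below, also calls sync() each pass):

/* Sketch: overwrite one block with fresh random data via O_DIRECT,
 * read it back, and count iterations until the device fails. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK 1024

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

    char *wbuf, *rbuf;
    if (posix_memalign((void **)&wbuf, BLOCK, BLOCK) != 0 ||
        posix_memalign((void **)&rbuf, BLOCK, BLOCK) != 0)
        return 1;

    int fd = open(argv[1], O_RDWR | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    srandom(getpid());
    for (unsigned long count = 1; ; count++) {
        for (int i = 0; i < BLOCK; i++)
            wbuf[i] = random() & 0xff;

        if (pwrite(fd, wbuf, BLOCK, 0) != BLOCK) {
            printf("write failed after %lu writes\n", count);
            break;
        }
        if (pread(fd, rbuf, BLOCK, 0) != BLOCK ||
            memcmp(wbuf, rbuf, BLOCK) != 0) {
            printf("verify failed after %lu writes\n", count);
            break;
        }
    }
    close(fd);
    return 0;
}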

You could try to low-level format the USB flash disk. First search the Internet for Chip Genius, a Chinese program that will identify the microcontroller in the stick; then you can search for an MPT (Mass Production Tool) for that specific chip. With this tool you can low-level format the stick... it will automatically detect bad blocks as well.

I saw your post on bress.net re: Chip Genius, and figure you may know more than I do about unlocking a spontaneously write-protected Ativa 8GB flash drive (no switch). Using Diskpart, I got these results:

I tried that on my USB stick, which I can't write to anymore after just one write. It doesn't work, but the stick can still be read. Pretty awkward, since there are still embarrassing photos on it. I don't know how to make it unreadable even by forensics. I'm a bit paranoid, and afraid of having to kill it with fire.

Thanks for the interesting reading. I had always thought that these things didn't last forever, and your data and testing prove that. I have two 8 GB flash drives from Micro Center that died at the same time. One turned out to be a simple mechanical failure due to the pins wearing out; the other just quit, most likely from the same thing, or even from a broken internal pin. Which prompted me to search for just this kind of blog. Keep up the geek work.

Hi, I join those who thank you for this excellent piece of work. Just one question remains in my mind: how long will data stored on a flash drive last? Does it deteriorate with time? For example, if I store photos on one, will they still be there in 10 or 20 years' time, assuming technology still exists then to allow me to retrieve them? (A very real concern; think of floppy disks: the only way you can get information off them now is to find an old computer that still has a floppy drive.)
Regards,
Fred CB

It's hard to say how long a flash drive will "last". It's going to be subject to the same deterioration as any other electronic or magnetic media: over time, the cells will lose their charge state, resulting in the discrete 1s and 0s being muddled. I've heard 10 years used as a liberal outside number for when data will start to deteriorate.

If you were to take the drive out of storage and rewrite it every year or two, I see no reason why it wouldn't last 10 years. But realistically, wouldn't you want to migrate it to a larger, newer device within a couple years anyway?

Thanks Ben, I had an idea it was roughly 10 years but was seeking confirmation, not being exactly sure how data is stored on a flash drive as compared to a floppy disk.
You are right that it's best to move the data to a newer device every few years, but I was more concerned about putting something aside on a flash drive, unintentionally forgetting about it, and then remembering it at some later stage; especially photos.
FredCB

Just to be clear, data written once to flash storage will not likely last for 10 years. The memory states will likely decay, and at least some of the data will be unrecoverable. Barring other failures, and assuming the data is rewritten periodically over the years, I suspect it -might- last that long.

Though we've not had flash drives long enough to make a concrete determination. They're the modern floppy disks, albeit with significantly less volatility and greater density. I would personally consider them about as reliable as tape media, for all intents and purposes; i.e., don't use them for single-source archival.

It is a very interesting article, and I am currently looking into something similar. However, I do not understand why you would fill up the disk with random data before doing the test. I would presume that the wear leveling does not understand the filesystem, and therefore does not know which blocks are used and which aren't; have you done a test on a disk without filling it?

That said, I can still remember when people were fascinated by CD-ROMs and wondered how long they would last. I heard some funny estimates, like that a CD should be rewritten every 3 to 5 years because it would fade. I now have some CDs going back to 1996, and they still read just fine. My point is that how long these things last in storage is speculation. It could be 5 to 10 years, or maybe someone will find a USB stick in 50 years and it will still have data on it. Who knows...

Hi Zingbot, I suspect that storage life must also take into account where such items are actually held/stored and how often the data on them is accessed. I do not believe that you would still be able to access the content of your 1996 CDs if they had been subject to considerable temperature variations, or if you had been using them regularly. Time itself, although relevant, is not the only factor to keep in mind. Cheers, Fred CB

We just don't know yet how long data can be stored on any medium, as we simply haven't had any REAL long-term tests done.

I have CDs that date back 25 years, and they still play back just as perfectly today as they did when I first bought them.
The same question is asked of printed photos: well, my mum still has an old photo album, and sure, the photos may not be as colourful as modern ones, but they are still there, perfectly viewable.

Of course, no manufacturer can guarantee 10, 20, 100, or 1000 years of storage.
In 50 years' time, perhaps flash drive manufacturers will have a better idea of just how long data can safely be stored on their products, but right now we are in the experimental phase.

So come back and visit this blog in 50 years, and then you might just get a little more accurate information... of course, by then we might not even care how long data can last on a flash drive!

CDs and CDRs are drastically, drastically different things. CDRs are, for all intents and purposes, a crude workaround to make recordable optical media work with existing CD drives.

CDs are pressed - like records used to be, or like someone might do to a piece of play-doh with the palm of their hand.

CDRs, on the other hand, are burned - literally. The laser is focused to a point on the disk at high enough intensity to "burn" the impregnated inks, discoloring them.

I've got CDs from the early 1990s. I have a couple CDRs from 2002, maybe. (The quality of these disks varies greatly, I've found - much more than pressed disks.)

I also have a flash drive from 1998 or 1999. It's (IIRC) 64MB, and it's seen periods of frequent writes as well as years of sitting idle. It's still 100% valid: I've kept a record of the available bytes on the device since 2002, and all of them remain both readable and writable, at least.

Granted, I've also got a number of hard disk drives from that era kicking around, still in use, and those aren't exactly known for their longevity. Even a 40GB Deskstar (one of two originally purchased; no idea where the other is now)!

Long term, we'll see. I'd say the prospects for flash memory are at least as good as hard drives and likely better than optical. Archive quality? Probably not, at least for a while. Still, electron decay is a relatively slow process...

This is great, thank you. I had recently read about the finite lifetime of flash drives and was starting to worry about using one for my school programming projects, since I do all the compilation etc. on the flash drive. This gives me more confidence in using it.

Hi, interesting post. I guess over the years I've had all these problems with sticks, but they are still useful, so I guess I will have to persist; although I think the approach is not to use them for really important storage, and never to say no to a freebie.

I am using one (an HP 4GB) as a hard disk in a Unix weather station built from a Cisco NSLU2. A guy in Germany sells the software, http://www.meteohub.de (and is probably wildly knowledgeable regarding stick death), and it has been a lot of fun (I don't do sudoku or crosswords). It seems to have lasted 6 months (running 24x7, mind you) before the corruption crashed the system. Linux does a far better job than Microsoft of recovering what is left (3.73GB), which I have used to rebuild the station as a temporary fix while I get a new stick.

If I only use/format, say, 2GB of a 4GB stick, do you think the self-healing software on the stick will be more successful?

I'm sure I read somewhere that you can get "better quality" SD cards; I came across your post looking to see if anything had been posted regarding sticks.

You said: "If I only use/format, say, 2GB of a 4GB stick, do you think the self-healing software on the stick will be more successful?"

No. As per my understanding, the controllers on these things are designed to export a specific amount of usable memory, with a little in reserve. They're dumb as to what the OS has allocated, managing bad blocks internally. When the number of bad blocks surpasses the available 'repair' blocks, you'll start to get errors, regardless of how much unformatted space there is.

A better way to mitigate (as you call it) stick death is to limit the number of writes. One possible way to do this would be to store working data in /tmp and copy it to primary storage daily; the stick would last quite some time with writes reduced that way.
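For example, a crontab entry along these lines would do it (both paths here are made up for illustration):

0 3 * * * cp -a /tmp/weatherdata /media/usbstick/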

Using CF (expensive) instead of SD (cheap) might also help; I don't know if the device you're using supports such things, though.

For what it is worth, back in the late 80's I was able to write some data to a CD R/W disc; VERY new technology THEN, not yet available to the 'public'. The data written was just basic programming code, written using my 'Commie' (C-64) system. I used COBOL primarily back then, but BASIC was better for my needs with the little system I will ALWAYS hold near and dear in my heart.
I take that disk out once a year, and guess what? It can STILL be read, and even WRITTEN TO, with the old, old hardware I have taken good care of. My dad was a retired electronics tech from the military, and through him I got to 'play' with all sorts of WOO-HOO hardware, most of it now in people's homes... LOL!
Keep up your good work, buddy!

Hey, the way I see it, the USB flash drive is only a data transporter, not storage. The large number of write and read cycles seen here is the result of a scientific study. I have pupils who use their flash drives at school, at home, and in the office, so the drives see different machines and different operating systems, and the controller trips up very often; the scientific result is not the reality. I recommend keeping a duplicate of the flash drive's data on your home PC or Mac. Then you can lose the flash drive without regret.

Here is another excellent article about flash memory that my dad showed me about a year ago.
http://www.anandtech.com/show/2738/8
This portion of the article explains, I think, why the write speeds dropped so suddenly. It has to do with that mysterious "filesystem magic" mentioned earlier. If that article is tl;dr:
SSDs can "delete" small sections but can only write big sections. So if you "delete" a small section (read: mark it unreadable, though the data is still there) and later want to use that space, the drive copies all the other data in the big section somewhere else (like a cache), erases the entire big section, and then rewrites it with the new stuff you want to save plus the old stuff you wanted to keep. Read the whole article; it's fascinating, even for a layman like me.
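(To put some hypothetical numbers on that: if a drive writes in 2 KB pages but erases in 128 KB blocks, changing a single 1 KB file can force it to copy out, erase, and reprogram the full 128 KB; roughly a hundredfold write amplification. The real sizes vary from drive to drive.)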

Hi, nice post. But for me the real hassle is flash drives that suddenly stop reading completely when you insert them into the USB port. It would be great to understand why this phenomenon happens and how it could be avoided.

I checked your source code, and have two questions: (a) since you are using O_DIRECT, why did you need to invoke sync() in each iteration of the loop? and (b) instead of using strncmp, shouldn't you be using memcmp? strncmp would stop at the first 0 byte...
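The strncmp point is worth spelling out, since the test file is full of random binary data that will contain plenty of 0 bytes. A tiny demonstration (my own example, not from the original code):

#include <stdio.h>
#include <string.h>

int main(void) {
    /* Two buffers that differ only after an embedded 0 byte. */
    char a[4] = { 'x', '\0', 'A', 'B' };
    char b[4] = { 'x', '\0', 'C', 'D' };

    /* strncmp stops at the first 0 byte, so it reports a match... */
    printf("strncmp: %d\n", strncmp(a, b, 4)); /* prints 0 */
    /* ...while memcmp compares all 4 bytes and catches the difference. */
    printf("memcmp: %d\n", memcmp(a, b, 4));   /* nonzero */
    return 0;
}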

You people do realize that a test on a single thumb drive is pretty much *statistically irrelevant*, right?

A nice anecdotal case and a good basis for research, but a sentence like "It's nice to know that when a drive dies, it's more likely you just won't be able to write new data to it, rather than complete data loss" doesn't make any sense after just one test.

Interesting; I would give it a try with my stick.
Unfortunately, the C files mentioned are no longer available at the given location:
http://www.bress.net/~bress/sync-file.c
http://www.bress.net/~bress/sync-random.c
Is there a new location?

Thanks for doing this. Most tests I've seen deal with read/write to the raw device. This is the first one I've seen that takes the overhead of a real filesystem into account.

These results are very interesting, but they should be understood in their context.

If the stick is inserted and pulled out frequently, the contacts will probably fail mechanically before the SSD fails electronically.

If the stick is left in place and small files are read/written to it randomly over time (e.g. used as a root/usr/home/... filesystem for a dedicated computer), the failure modes will be similar to those described here, but the random nature and small size of the writes may change things -- this test involved repeatedly rewriting a single small file.

It would be interesting to try it with a script that works the filesystem code more intensively.
