My opinion of S.M.A.R.T. values and their importance.

Guest Keatah

I put a lot of faith in S.M.A.R.T. It is like engineering data, and it is up to you how you interpret it and use it. SMART reports need to be taken in context with a number of big picture aspects. Not all errors mean impending failure as we will soon find out.

TL;DR version:

The end-user software aspect of S.M.A.R.T. is HUGE! It's the software telling us what's happening. Software that indicates pass/fail for S.M.A.R.T. values is software that has formed an "opinion". An opinion drawn from observing rates of change of some counters, absolutes in others. An opinion that the programmer thinks is correct. The key word here is thinks. You will soon discover that his/her version of fail can be different than yours or mine or reality itself. And it doesn't always take into account the environment wherein the disk operates.

Make no mistake about it, the S.M.A.R.T. counters are great indicators. If disk hardware generates some internal fault, you can bet the SMART chart will show it. What you **DO** with this information is up to you and the software reporting it to you. Let me show you this chart right here. It has two counters out of spec and displaying an ominous warning! OMG!! HIDE YOUR DAUGHTERS! MY DISK IS GOING TO EXPLODE! NEVER TRUST IT AGAIN!

NOT! This chart is for a disk in a laptop. Note that the Spin Retry Count is at 52. And the Reallocated Sector Count is at 2. Them ******** #@$^@#*! (insert HDD mfg name here) really sucks and they are the worst ever. Except that's not really the case. I said this is a laptop disk. And being portable it is going to be subject to improper power on/off cycles. Furthermore the 2 sector errors can also be attributed to improper shutdowns or perhaps the disk got jostled, the head jumped and made an error. These are errors which I can explain away with outside factors influencing the disk. On this very disk these counters haven't increased since like last year and can be considered isolated incidents. The firmware in the disk's CPU has already compensated and dealt with these problems long ago.

The important takeaway point is to watch these counts with an eagle eye once they show a change. Or let quality software do it for you. You need to determine if it's an isolated incident or a new trend forming. If another error happens, and another and another, we're in trouble. But as long as they stay stable and don't increase we're alright. The above disk is alright and fully trustworthy. It's when the counters start increasing quickly and the rate of change goes up that you need to consider whether you have a failing disk on hand.
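Here is a rough sketch of that "isolated incident or new trend" check in Python. Everything in it is an assumption for illustration: the `Snapshot` type, the `classify` function, the 90-day quiet window, and the rule that two or more recent increases count as a trend. It is not how any particular monitoring tool works.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    day: int    # days since monitoring started (hypothetical unit)
    raw: dict   # SMART attribute id (hex string) -> raw counter value

def classify(snapshots, attr, quiet_days=90):
    """Label one counter as 'clean', 'stable', 'watch', or 'trending'.

    quiet_days is an assumed window, not a standard value.
    """
    history = sorted(snapshots, key=lambda s: s.day)
    latest = history[-1]
    if latest.raw.get(attr, 0) == 0:
        return "clean"        # counter is zero, nothing to explain
    window_start = latest.day - quiet_days
    # Count increases between consecutive snapshots inside the window.
    recent_increases = sum(
        1
        for prev, cur in zip(history, history[1:])
        if cur.day > window_start and cur.raw.get(attr, 0) > prev.raw.get(attr, 0)
    )
    if recent_increases == 0:
        return "stable"       # non-zero, but has not moved lately
    if recent_increases == 1:
        return "watch"        # one recent change: keep an eagle eye on it
    return "trending"         # repeated recent changes: suspect the disk

# The laptop disk from the post: non-zero counters that have not moved in a year.
laptop = [
    Snapshot(0, {"05": 2, "0A": 52}),
    Snapshot(120, {"05": 2, "0A": 52}),
    Snapshot(365, {"05": 2, "0A": 52}),
]
print(classify(laptop, "05"), classify(laptop, "0A"))
```

With these made-up rules the laptop disk comes out as "stable" on both counters, while a counter that climbs at every reading would come out as "trending".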

Alright, we've established the disk is perfectly healthy and ready to roll. Everything that should be zero is zero. Values (05) and (0A), the Reallocated Sector Count and the Spin Retry Count, have been accounted for. So we normalize them to zero and carry on. Also observe these next two disks. They are also perfectly healthy and I trust the above disk (which has warnings) as much as I do these "error-free" units.

Everything that is supposed to be zero is zero. And it better damn well be! ..Since, for these two, I could not reconcile and explain away any errors like I did in the first example. An error here would be coming from something failing within.

It is my strongly held opinion that SMART data is a good thing. But it needs to be interpreted in context. You need to consider the operating environment, power status, quality of cables, physical mounting, motherboard stability (RAM, CPU, southbridge), and user behavior. All these things are going to affect how report values should be interpreted. You need to consider all that in conjunction with a SMART report in order to determine if you have a failing disk now, just a glitch, impending failure later on, or 100% functionality.

I have yet to see a quality piece of commercial or freeware software always be in agreement with its peers. Some packages will say a disk is bad, others say it's good. This includes Speccy and anything from Norton or Crystal Disk or any of those PC optimizer suites and backup suites from the likes of AusLogics or Acronis or Macrium. While they will read and report the raw values, those values are open to interpretation. Some software does a better job than others.
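To make the disagreement concrete, here is a small Python sketch of two pass/fail rules judging the same raw values. Both policies and their limits are invented for illustration; they are not the actual logic of any of the packages named above. The raw values are the ones from the laptop disk earlier in this post.

```python
# Raw values from the laptop disk discussed earlier in the post.
laptop_raw = {"05": 2, "0A": 52}   # Reallocated Sector Count, Spin Retry Count

def strict_policy(raw):
    """Hypothetical tool A: any non-zero critical counter means FAIL."""
    critical = ("05", "0A")
    return "FAIL" if any(raw.get(attr, 0) > 0 for attr in critical) else "PASS"

def tolerant_policy(raw, limits=None):
    """Hypothetical tool B: FAIL only above per-attribute limits.

    The default limits are made-up numbers, not vendor thresholds.
    """
    if limits is None:
        limits = {"05": 10, "0A": 100}
    over = any(raw.get(attr, 0) > limit for attr, limit in limits.items())
    return "FAIL" if over else "PASS"

print("tool A says:", strict_policy(laptop_raw))
print("tool B says:", tolerant_policy(laptop_raw))
```

Same disk, same numbers, two different verdicts. Neither rule knows the disk lives in a laptop or that the counters have been flat for a year.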

To make matters more difficult, no manufacturer has really standardized their ranges and the whole SMART thing continues to evolve. Different disk models report different variables too. And every utility that gives you a green check or a red X is really giving you a guess formed from what someone else thinks is good or bad. I would currently place the predictive nature of SMART at around 65% - 85% accuracy.

No utility out there can be expected to be aware of the thousands of models of disks and infinitely variable conditions under which they operate. For more in-depth reading of SMART check out the wiki as a starting point -- http://en.wikipedia....wiki/S.M.A.R.T.

I've never put much thought into what programs tell me the S.M.A.R.T. value is for a hard disk, probably solely because I've luckily never had a hard disk failure issue. Optical disc drives are another bag of worms altogether for both PC and game systems.

Guest Keatah

I can see them just fine, though, and I have no clue why you cannot. This is a problem for the moderators to figure out.

I uploaded them, this time, as attachments instead of media files.

I've had good luck with all optical media except for CD R/W. CD R/W is known to be universally unreliable. Optical disk mechanisms do have a shorter life than magnetic media drives. Dust and laser burnout (10,000 hours) are the common modes of failure.

And when I see a set of "0" in the data column, like in the above screenshots, that's when I know a disk is good. Any non-zero result needs some explaining as to why it is non-zero. Like I said earlier, error rates in disks are less than 1 bit in 10^15. Enterprise disks score even better at less than 1 in 10^16.

So if a disk is spitting out errors, frequently enough that you can make note of them occurring, you have a problem! A serious problem.
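For a sense of scale, here is the arithmetic as a few lines of Python. It reads the error rates quoted above as 1 bit in 10^15 and 1 bit in 10^16, which is my assumption about the intended exponents, and the function name is just made up for the example.

```python
def terabytes_per_expected_error(bits_per_error):
    """Convert '1 error per N bits read' into terabytes read per expected error."""
    bytes_per_error = bits_per_error / 8   # 8 bits per byte
    return bytes_per_error / 1e12          # decimal terabytes (10^12 bytes)

desktop_tb = terabytes_per_expected_error(10**15)
enterprise_tb = terabytes_per_expected_error(10**16)
print(f"1 in 10^15 bits: about {desktop_tb:.0f} TB read per expected error")
print(f"1 in 10^16 bits: about {enterprise_tb:.0f} TB read per expected error")
```

That works out to roughly 125 TB and 1,250 TB read per expected error. A disk that throws errors often enough for you to notice is nowhere near those rates.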

Guest Keatah

I tend to like how the chart looks visually. It is easy to read and has a colorful layout. HD Tune Pro seems to have just enough of the basics to benchmark your disk, read SMART, and verify the surface. It's easy enough for anyone to use. And it won't let you destructively write anything unless you go through the trouble of removing the partition. So it's safe for noobs.

If I were to trust the automatic green check / red X interpretation of SMART data to software, I'd go with a drive monitor that's part of a backup suite. Or HD Tune Pro, or CrystalDiskInfo, or PassMark's DiskCheckup.