Bad sector or no bad sector?


I recently bought a Lenovo ThinkPad T61P (model 6460-d8g, to be exact). It came with a Hitachi 7K200 160GB hard drive and had Windows Vista Ultimate 32bit preinstalled.

Having installed all my necessary software (Office and so on), I just happened to run chkdsk /r (or was it chkdsk /f? I can't remember). It found one bad sector. Well, I phoned Lenovo support and the hard disk was replaced. So far so good.

Now I have installed Vista again (and please note, using the Vista restore DVDs provided by Lenovo, NOT an actual Vista installation DVD, as I don't have one; should this make any difference?), and just out of curiosity I decided to run chkdsk again. Guess what: it reported ONE sector as bad again!

After that I ...

... tried to check the drive with the latest version of SpeedFan, but it didn't recognize the drive in either AHCI mode or Compatibility mode (a BIOS setting).


Post the SMART values. A handful of bad sectors won't make Hitachi DFT report the HDD as failing, but even a few bads should be visible in the SMART values, if one is able to interpret them. Keep your eye on the raw data of the following attributes: Reallocation Event Count, Reallocated Sector Count, Current Pending Sector Count, and the attribute that is named either "Off-line Uncorrectable", "Off-line Correctable" or "Uncorrectable Sector Count". I don't know why that last attribute is named in such drastically different ways in different SMART monitoring tools... especially as correctable and uncorrectable are opposites of each other, so it can't be just a trivial typo. Anyway, these four attributes all have something to do with bad sectors. On a perfectly operating disk the raw data of all these attributes would be zero... but a handful of bad sectors isn't necessarily alarming if they don't keep growing in number. HDDs that have a few bads may stop growing them and live for twenty years... or they may grow 100,000+ more bad sectors within a single year... or get their service areas damaged before that, making the disk inaccessible.
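As a minimal sketch of that check (the attribute names and raw values below are hypothetical example data, not read from a real drive; real tools would obtain them via smartctl or a similar utility):

```python
# Sketch: flagging the bad-sector-related SMART attributes mentioned above.
# The dictionary is hypothetical example data, not read from a real drive.
attributes = {
    "Reallocation Event Count": 0,
    "Reallocated Sector Count": 1,
    "Current Pending Sector Count": 0,
    "Off-line Uncorrectable": 0,  # a.k.a. "Uncorrectable Sector Count"
}

# On a perfectly operating disk all four raw values are zero; a small
# non-zero count is not alarming by itself unless it grows between checks.
suspect = {name: raw for name, raw in attributes.items() if raw != 0}
print(suspect)  # {'Reallocated Sector Count': 1}
```

The point is simply to compare the raw counts against zero and then watch whether any non-zero count grows over time.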

Try HDDScan. It can run the short & extended SMART self-tests in Windows (these are exactly the same tests done by Hitachi DFT, but it's more convenient as you don't have to boot to DOS and you can continue using the computer normally while scanning), the offline data collection routine (this self-test is handled completely autonomously by the HDD itself once it receives the command to start), plus a verify scan, read scan and erase scan. The last three aren't SMART tests, but you'll see a transfer-rate graph and they'll create a list of slow and bad sectors. HDDScan can also read the SMART attribute values.


Ok, thank you very much, "whiic" and "extrabigmehdi", for your time on this, and sorry it took me a while to get back here.

So, I installed HDDScan, but it didn't recognize the drive in either AHCI or Compatibility mode. I don't know whether this is already an indication of an actual problem with the drive?

Anyway, I tried and installed HD Tune Pro version 3.10. It seemed to recognize the drive correctly in Compatibility mode. HD Tune did not recognize it in AHCI mode either, though. Here are the results shown on the "Health" tab of HD Tune:

(Whoops, something went wrong with the tabs, but I hope you can still read the list.)

Are these normal values? Does it seem to you that everything is OK? At least the program seems to think so. The number in the data column on the "(C2) Temperature" line does seem a bit strange, though; does this mean the rest of the results can't be trusted either? The program's main window showed the temperature at a more acceptable 33 degrees, though.

"Extrabigmehdi", you said that if the drive is not formatted with Vista, this could cause all kinds of errors. How could I correct this situation then, as I don't have the Vista installation disk...?


Your hard drive is fine. You can also submit the data for your HDD online with SpeedFan, so that it is compared with data from similar drives.

According to the "reallocated count" stats you have zero bad sectors.

Are you sure the errors reported by Vista were bad sectors?

"you said that if the drive is not formatted with Vista, this could cause all kinds of errors. How could I correct this situation then, as I don't have the Vista installation disk...?"

Well, you installed Vista with a restore DVD, so I thought you had formatted before doing so, but obviously there's no need to do that. Maybe the "bad sector" information is stored in your restore DVD itself, and hence it's the restore DVD that is defective. Unless that's a "secret trick" done by Lenovo to check the validity of your system.

Or worse, Lenovo just sent you back the same hard drive.

Maybe you could download a full retail version of Vista ("lost somewhere in the dark space"), install it, and see if you still get the "bad sector" error. At least you'll avoid all the pre-installed crapware on your laptop.


"The number in the data column on the line "(C2) Temperature" does seem a bit strange, though - does this mean the rest of the results cannot be trusted either? The programs main window showed the temperature to be on a bit more acceptable level of 33 degrees, though."

Hitachi drives store the temperature raw data in the format 00xx00yy00zz, where xx is the highest temperature ever recorded, yy is the lowest temperature ever recorded and zz is the current temperature reading. Or maybe xx and yy were the other way around... anyway, it's zz that's the most relevant.

In hexadecimal, one byte can represent values from 00 to FF, that is, two digits. So, to get the correct temperature reading you convert the LAST BYTE of the raw data to a decimal number, and that number represents the current drive temperature in degrees Celsius.

Most other HDD manufacturers only report the current temperature (that is, the temperature raw data is in the format 0000000000zz), and because of that some SMART monitoring tools convert the whole raw data to decimal form (since the leading zeroes don't affect the outcome of a hexadecimal-to-decimal conversion). Doing the same for Hitachi HDDs gives a very funny result, as there are non-zero digits prior to zz. For example, a Hitachi drive with a recorded maximum temperature of 50 deg C would have 003200yy00zz as its temperature raw data. Converting all of that raw data to decimal form will give a "temperature" higher than 200 BILLION degrees Celsius.
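To make the arithmetic concrete, here is a small sketch (the 12-digit raw value is a hypothetical example in the 00xx00yy00zz layout described above):

```python
# Hypothetical Hitachi temperature raw data in the 00xx00yy00zz format:
# xx = 0x32 (50 C max recorded), yy = 0x19 (25 C min), zz = 0x21 (33 C now).
raw = "003200190021"

# Naive conversion of the whole raw value -- the "200 billion degrees" bug:
naive = int(raw, 16)
print(naive)  # 214750003233, i.e. about 214 billion "degrees"

# Correct reading: convert only the last byte (the last two hex digits):
current = int(raw[-2:], 16)
print(current)  # 33 (degrees Celsius)

# The other recorded bytes, for drives that store them:
maximum = int(raw[2:4], 16)  # 50
minimum = int(raw[6:8], 16)  # 25
```

Since the leading bytes are zero on most other manufacturers' drives, both conversions agree there, which is why the bug goes unnoticed until a Hitachi drive comes along.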


extrabigmehdi: "AFAIK, only the manufacturer knows how to interpret the data column, so this one should be discarded."

Only the manufacturer knows for certain, but I do trust that my interpretations, based on my own empirical research, are correct at least for those production batches of the 7K250 and 7K400 I own samples of. I don't agree with you that they should be discarded. As the raw data isn't standardized, even the use of the last byte isn't written in any standard; it just happens to be the de facto way of doing it... using the last byte to report the temperature in degrees Celsius.

If you say any attempt at interpreting raw data should be left undone, then you're basically saying we shouldn't monitor HDD temperatures through software at all. I think software monitoring is good, even though it relies on the hardware manufacturer complying with an unwritten de facto standard for the raw data. I don't suggest blind belief in SMART temperature monitoring. While there are only rare exceptions to the way the raw data is used, the temperature sensor itself may be inaccurate, or it may be placed next to a component that warms up rapidly during disk I/O, etc. Also, there may be non-linearities and some weird inconsistencies. Most of the odd readings I've had have come from Samsung HDDs. A PL40 reported a temperature that seemed to be half the real temperature. Maybe the sensor was inaccurate and only reported 2-degree increments, and they decided to drop the unused bit from the end, reducing all readings by a factor of 2. My P80 reports kind of odd temps... if I power it up cold it warms up in typical fashion... 20 -> 21 -> 22 -> 23 -> 35 -> 25... wait. What was that one reading? Yes. Some readings are just way off.

Don't trust temperature readings blindly. Start the computer cold and after 30 minutes check the graph of how the temperature has developed. The initial temperature should be within 5 deg of room temperature (the HDD does warm up a bit by the time you get booted into Windows) and the end temperature should be close to the temperature measured by some external sensor (a hand isn't an accurate sensor, but it's better than nothing). The graph should be smooth, and the rate of temperature rise should decrease steadily as the temperature increases. Then put some high I/O load on the drive. This should increase the HDD temperature, but only slowly, and it should stabilize gradually. A sudden increase would mean the temperature sensor is "badly" placed near some hot chip.
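That warm-up sanity check can be sketched roughly like this (the readings and thresholds are hypothetical, just to illustrate the idea):

```python
# Sketch of the sanity check described above. The readings are hypothetical
# SMART temperatures sampled every few minutes after a cold boot.
readings = [22, 24, 26, 27, 28, 29, 29, 30]
room_temp = 21

# The initial reading should be within ~5 degrees of room temperature.
ok_start = abs(readings[0] - room_temp) <= 5

# The curve should be smooth: no sudden jump between samples (a spike
# suggests the sensor sits next to a hot chip, or the readings are off).
jumps = [b - a for a, b in zip(readings, readings[1:])]
ok_smooth = all(abs(j) <= 4 for j in jumps)

# The rate of rise should taper off as the drive approaches steady state.
ok_taper = jumps[-1] <= jumps[0]

print(ok_start and ok_smooth and ok_taper)  # True
```

The exact thresholds (5 degrees, 4-degree jumps) are arbitrary choices for the example; the point is the shape of the curve, not the specific numbers.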


About the P80: it doesn't appear to be just random noise in the measurements... if I stabilize it at 24, it'll remain "35" forever. That example was just a dramatization, as I don't remember whether 24 was one of those magical temperatures where the readings went ballistic. Also, the temperature reading seemed to skip a few values here and there, whereas elsewhere it reported 1-degree increments.

My F1 does seem to give more logical readings than my other Samsungs, though they are a bit on the low side, considering that the more power-economical GreenPower appears to run warmer. The F1 is another of those exceptions in how the temperature raw data is used. The last byte still means the current temperature... at least in all likelihood.

Thus, the temperature is 28 and the "airflow" temperature is 27. Probably two different sensors in different places. But that raw data format is odd. 1D can't be the highest recorded, as it's too low. Neither can 1A be the lowest recorded. Maybe they're the highest or lowest within an hour? Probably not three different sensors, since that would mean a total of six different sensors for the two attributes.

Temperature has value 72, worst 61, threshold 0. 72 happens to be the same as 100-28. If the same applies to worst, 100-61=39, which is probably the highest temperature ever recorded.
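In other words, assuming this observed value = 100 - temperature relation holds (an empirical guess from the numbers above, not documented behavior):

```python
# Sketch: the apparent relation between the normalized SMART "value" and the
# Celsius temperature on this Samsung F1 (value = 100 - temperature).
def temp_from_value(value):
    return 100 - value

print(temp_from_value(72))  # 28, the current temperature
print(temp_from_value(61))  # 39, probably the highest temperature recorded
```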

Nothing is certain, but if we were to ditch any theory that might fail, we'd still be living in the Stone Age. SMART monitoring offers lots of information, which is hard to interpret due to the lack of standardization. SMART can predict ~50% of drive deaths. If one monitors SMART manually for smaller changes (ones not yet reaching the thresholds), the prediction percentage is unlikely to be better and there'd be more false alarms (which is why the thresholds are set to allow a small number of bad sectors, etc.), but manual analysis of SMART values does give extra time to react to a possible HDD death.