Like anyone apart from me cares…
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        40%             1594  6323152
# 2  Extended offline    Completed: read failure        40%             1594  6323152
# 3  Extended offline    Completed: read failure        40%             1594  -
# 4  Short offline       Completed: read failure        60%             1594  6323152
Ulp.
After advising someone about this the other day, it took a recommendation from Dave Goodwin for me to look at smartmontools. The tools essentially ask the disk itself to report its S.M.A.R.T. status and can also run some self-tests. Most of the point of this post is to remind myself in future what I did and what to read.
With the help of this and this and the surrounding pages.
A quick self-test failed, as you can see above:
# smartctl -t short /dev/hda
So I ran a few long tests, which also failed, as can be seen above:
# smartctl -t long /dev/hda
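To pull the failing sector numbers out of a self-test log without reading it by eye, something like the following might do. It's only a sketch: failed_lbas is a name I made up, and it assumes the log layout shown at the top of this post, with LBA_of_first_error as the last field on each line.

```shell
# Hypothetical helper: print each distinct LBA_of_first_error found in a
# SMART self-test log read from stdin. Assumes the LBA is the last field
# and that entries without one show a dash instead of a number.
failed_lbas() {
    awk '/^# *[0-9]+ / && /read failure/ && $NF ~ /^[0-9]+$/ { print $NF }' | sort -u
}

# Sample input copied from the log at the top of this post.
failed_lbas <<'EOF'
# 1 Extended offline Completed: read failure 40% 1594 6323152
# 2 Extended offline Completed: read failure 40% 1594 6323152
# 3 Extended offline Completed: read failure 40% 1594 -
# 4 Short offline Completed: read failure 60% 1594 6323152
EOF
```

On a real system the input would come from something like "smartctl -l selftest /dev/hda | failed_lbas" rather than a pasted sample.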
I ran several commands which told me:
# smartctl -H /dev/hda
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
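PASSED here is only the drive's own overall verdict. smartctl also reports the disk's condition through its exit status, which the smartctl(8) man page documents as a bit mask. Below is a rough sketch of decoding the disk-related bits; explain_smartctl_status is a made-up helper name, and 192 is an illustrative value, not one I recorded.

```shell
# Hypothetical helper: decode smartctl's exit status, which the smartctl(8)
# man page describes as a bit mask. Only bits 3-7 (disk condition) are shown.
explain_smartctl_status() {
    st=$1
    [ $((st & 8)) -ne 0 ]   && echo "bit 3: SMART status says DISK FAILING"
    [ $((st & 16)) -ne 0 ]  && echo "bit 4: a prefail attribute is at or below threshold"
    [ $((st & 32)) -ne 0 ]  && echo "bit 5: an attribute was at or below threshold in the past"
    [ $((st & 64)) -ne 0 ]  && echo "bit 6: the device error log contains errors"
    [ $((st & 128)) -ne 0 ] && echo "bit 7: the self-test log contains errors"
    return 0
}

# 192 is an illustrative value (bits 6 and 7), not one recorded in this post.
explain_smartctl_status 192
```

In practice you would run something like "smartctl -a /dev/hda >/dev/null; explain_smartctl_status $?" so that the error and self-test logs are actually read.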
# smartctl -Hc /dev/hda
Self-test execution status: ( 116) The previous self-test completed having
the read element of the test failed.
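The number in brackets is worth a second look. As far as I understand the ATA SMART data layout, it is a single byte whose upper four bits hold the self-test status code and whose lower four bits hold how much of the test was left to run, in tenths. A small sketch (decode_selftest_status is my own hypothetical helper):

```shell
# Hypothetical helper: split the self-test execution status byte into the
# status code (upper four bits) and the part of the test left to run
# (lower four bits, in steps of 10%).
decode_selftest_status() {
    byte=$1
    echo "status code $((byte >> 4)), $(( (byte & 15) * 10 ))% of test remaining"
}

decode_selftest_status 116   # prints: status code 7, 40% of test remaining
```

The 40% agrees with the Remaining column of the extended tests in the self-test log.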
# smartctl -A /dev/hda
told me that 7 drive attributes were in the pre-fail stage but haven’t failed yet.
# smartctl -a /dev/hda
informed me that it encountered 459 errors before the test failed and
Error 459 occurred at disk power-on lifetime: 1583 hours (65 days + 23 hours)
When the command that caused the error occurred, the device was in an unknown state.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 d0 7b 60 e0 Error: UNC 1 sectors at LBA = 0x00607bd0 = 6323152
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 01 d0 7b 60 e0 08 00:16:05.184 READ DMA EXT
25 00 02 cf 7b 60 e0 08 00:16:04.000 READ DMA EXT
25 00 08 cf 8e 60 e0 08 00:16:04.000 READ DMA EXT
25 00 08 37 0a 60 e0 08 00:16:04.000 READ DMA EXT
25 00 08 07 1a 60 e0 08 00:16:03.984 READ DMA EXT
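The LBA in the error line can be rebuilt from the register dump: SN, CL and CH hold the low, middle and high bytes of the address. Here is a quick sketch, assuming the address fits in those three bytes (it does for this error); lba_from_regs is a made-up helper name.

```shell
# Hypothetical helper: rebuild the LBA from the SN, CL and CH register bytes
# (hex, without 0x). Assumes the address fits in 24 bits, as it does here.
lba_from_regs() {
    sn=$1 cl=$2 ch=$3
    echo $(( (0x$ch << 16) | (0x$cl << 8) | 0x$sn ))
}

lba_from_regs d0 7b 60   # prints: 6323152
```

That matches both the 0x00607bd0 in the error line and the LBA_of_first_error in the self-test log.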
Interestingly, the output at the top seems to think I have approximately 1594 hours of disk lifetime left. That’s about 66 days! Yeah, I know it’s not exact, but it knows better than I do.
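The "about 66 days" figure is just the hours value divided by 24, the same conversion smartctl prints in the error log ("1583 hours (65 days + 23 hours)"). A quick sketch of the arithmetic, with hours_to_days being my own made-up helper:

```shell
# Hypothetical helper: convert a number of hours to whole days plus hours.
hours_to_days() {
    echo "$(( $1 / 24 )) days + $(( $1 % 24 )) hours"
}

hours_to_days 1583   # prints: 65 days + 23 hours
hours_to_days 1594   # prints: 66 days + 10 hours
```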
Either way, if the drive fails the test on a read failure then it’s starting to die. The Linux Journal article recommends moving all data off the disk, which I did a few days ago, and then looking for a vendor tool that will either remap the bad sectors or give me an error code to quote when requesting a replacement from the manufacturer. I think I’ll do this, and also try it on my other drive that died (both are Maxtor), then check how long the warranty runs. If it’s a year, I should get a replacement drive. I might try the supplier for a replacement first, though.
I’m glad I’m using Linux. I check my logs reasonably often just to see if things are OK. Under Windows I’ve had disks die in the past and knew nothing about it until the machine started failing to boot.
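“Remaining” refers to the remainder of the test, not “Remaining LifeTime”. The LifeTime(hours) column is the drive’s power-on hours when the test ran, so 1594 is not an estimate of the hours it has left.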