[nmglug] SMART error message

Chris Brotherton chris at protonlab.net
Mon Apr 18 17:08:11 PDT 2011


> What's the output of:
>
> smartctl -H /dev/sdd

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
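For anyone who wants to check this line from a script rather than by eye, here is a minimal sketch. The function name health_result is made up for illustration; it only filters text in the format shown above and does not talk to the drive itself.

```shell
# Print whatever follows the colon on the overall-health line
# (for example PASSED). Reads smartctl -H output on stdin.
# health_result is a hypothetical helper name, not part of smartmontools.
health_result() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Usage on a live system (needs root, not run here):
#   smartctl -H /dev/sdd | health_result
```

As I understand the smartctl man page, the exit status also carries a bit for a failing health check (bit 3), so a script could test that instead of parsing text.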

>
> in particular the "SMART overall-health self-assessment test result"?  I would further recommend running another offline test:
>
> smartctl -t offline /dev/sdd
>

I ran this test, but no errors were found.
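A clean test does not say whether the sector counters have moved, so it may be worth watching those as well. The sketch below assumes the usual ten-column attribute table printed by smartctl -A, with the raw value in the last column; attr_raw is a hypothetical helper name, and the attribute names are the ones commonly seen on ATA drives.

```shell
# Print the raw value of one SMART attribute by name.
# Reads "smartctl -A" output on stdin and assumes the standard layout:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
attr_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage on a live system (needs root, not run here):
#   smartctl -A /dev/sdd | attr_raw Current_Pending_Sector
#   smartctl -A /dev/sdd | attr_raw Reallocated_Sector_Ct
```

If either number keeps climbing between checks, that would count as "additional bad sectors" in the sense discussed in this thread.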

> and see what the result is.  The badblockhowto.html that you've cited should allow you to identify which files are affected by bad sectors, and as described in badblockhowto.html you can use 'dd' to force the disk to reallocate a bad block by writing zeros to it. You can restore the affected files from backup if desired, or take other measures after identifying which files the bad sectors belong to.
>
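For the record, the arithmetic behind the dd step can be sketched as below. This assumes 512-byte logical sectors and a filesystem with a fixed block size sitting directly on a partition; the function names and all numbers are made up for illustration. The second function only prints the dd command, since running it destroys whatever is in that block.

```shell
# Filesystem block that contains a given bad sector:
#   block = (lba - partition_start) * sector_size / fs_block_size
# Shell integer division rounds down, which is what is wanted here.
lba_to_fs_block() {
    lba=$1; part_start=$2; fs_bs=$3
    sector_size=${4:-512}    # assumption: 512-byte logical sectors
    echo $(( (lba - part_start) * sector_size / fs_bs ))
}

# Print (do NOT run) the dd command that would zero that single block.
zero_block_cmd() {
    part_dev=$1; fs_bs=$2; block=$3
    echo "dd if=/dev/zero of=$part_dev bs=$fs_bs count=1 seek=$block"
}

# Hypothetical example: bad LBA 1000063, partition starts at sector 63,
# 4096-byte filesystem blocks, partition device /dev/sdd1.
#   zero_block_cmd /dev/sdd1 4096 "$(lba_to_fs_block 1000063 63 4096)"
```

Double-check the partition start (fdisk -lu) and the block size before trusting the result, and find out which file owns the block first so it can be restored afterwards.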
> A friend of mine and I recently had a similar situation with an enterprise-class drive that started sprouting bad sectors.  The procedure he followed was pretty much *exactly* as you've mentioned and as described in badblockhowto.html.  We eventually replaced the failing 3.5" Samsung disk in a 1U chassis with two commodity WD 2.5" drives (SUPERMICRO MCP-220-00044-0N Hard Drive Bracket) in a software RAID-1 configuration.
>
> I'd say if the SMART test fails or if you see additional bad sectors after writing zeros to the existing bad sectors, replace the drive.  You could (if it's not a server) take the drive offline, boot from a manufacturer's diagnostic CD (SeaTools or the Western Digital Data Lifeguard bootable CD) and do a full surface scan.  If it fails with bad sectors (likely), you could RMA the drive (if in warranty) based upon a result code other than 0x00.  If it were me and it was not a production system and I had backups, I'd probably skip straight to pulling the drive model and serial number:
>
> hdparm -i /dev/sdd
>
> and if the drive was in warranty (the manufacturer's website will usually tell you), I'd do the full surface scan (which I'm wagering will fail) and RMA the drive.
>
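To pull just the two fields needed for a warranty lookup, something like this sketch should do. It assumes the identification line that hdparm -i normally prints, of the form "Model=..., FwRev=..., SerialNo=..."; model_and_serial is a hypothetical helper name.

```shell
# Extract model and serial number from "hdparm -i" output on stdin.
# Assumes a line of the form: Model=<model>, FwRev=<rev>, SerialNo=<serial>
model_and_serial() {
    sed -n 's/^ *Model=\(.*\), FwRev=.*, SerialNo=\(.*\)$/model=\1 serial=\2/p'
}

# Usage on a live system (needs root, not run here):
#   hdparm -i /dev/sdd | model_and_serial
```

If the line looks different on a particular system, smartctl -i /dev/sdd reports the same details in its own format.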
Thanks for all of the advice.  SMART doesn't seem to think that there
is a problem.  I'll try running the WD diagnostic tools this weekend.

Chris.

