One disk with bad sectors: what steps to take?

Status
Not open for further replies.

jfr2006

Contributor
Joined
May 27, 2011
Messages
174
Yesterday I found out that one of the Samsung disks has bad sectors (very common with this drive; it's the third one I've caught with bad sectors :mad:):

Code:
freenas# zpool status
  pool: volume1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        volume1                                         ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/be4f7e18-91d9-11e0-869c-f46d04923303  ONLINE   56.8K 14.3M     0
            gptid/bea6a51b-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0
            gptid/bf1702f1-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0
            gptid/bf6feb92-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0
            gptid/bfd28ef2-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0

errors: No known data errors



So, my question is: what are the best steps to take to replace this disk?

What I want to know is the sequence of commands to give on the command line, or the steps to take in the GUI.

Regards.
 

freeflow

Dabbler
Joined
May 29, 2011
Messages
38
Use Samsung ESTOOLS to do a full media scan.

If ESTOOLS reports errors then replace the disk.

If ESTOOLS reports an OK disk, then replace the SATA cable, do a zpool clear, then do a zpool scrub.

After the scrub, check the status of the RaidZ1 and the SMART data of the affected drive.

If there are no zpool errors, then it was possibly just a dodgy SATA cable/connection. Check the SMART data and decide if the disk needs replacing. Copy data to the disk and then review as above.

If there are zpool errors, then replace the disk.
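The clear/scrub/status part of the steps above could be sketched roughly as follows. This is only a sketch under assumptions: the pool and device names are copied from the zpool status output earlier in the thread, and the `run` helper, the `DRY_RUN` switch and the function name are made up for this example.

```shell
#!/bin/sh
# Sketch of the "clear, scrub, check" sequence after swapping a cable.
# POOL and DEV come from the zpool status output in this thread;
# DRY_RUN, run and check_after_recabling are hypothetical names.
POOL="${POOL:-volume1}"
DEV="${DEV:-gptid/be4f7e18-91d9-11e0-869c-f46d04923303}"

# Print each command; execute it only when DRY_RUN is empty.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

check_after_recabling() {
    run zpool clear "$POOL" "$DEV"   # reset the READ/WRITE/CKSUM counters
    run zpool scrub "$POOL"          # start re-reading and verifying all data
    run zpool status -v "$POOL"      # repeat this until the scrub has finished
}
```

Calling `DRY_RUN=1 check_after_recabling` only prints the three commands, which is a safe way to review them before running anything against a live pool.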
 

jfr2006

Contributor
Joined
May 27, 2011
Messages
174
Yes, I intend to use ESTOOLS to scan the disk. It's not the first time I've had a Samsung with bad sectors :(

But I don't think the problem is the SATA cable. They are very good quality and have clips on the ends, so they stay well attached :)

Regards.
 

jfr2006

Contributor
Joined
May 27, 2011
Messages
174
OK, I did some testing with ESTOOLS and no errors were found (complete surface scan). I found out that the power cable had a bad contact, making the disk turn off and on. I put the disk back in place, replaced the power cable, and after about an hour I get this:

Code:
freenas# zpool status
  pool: volume1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Mon Jun 13 12:39:54 2011
config:

        NAME                                            STATE     READ WRITE CKSUM
        volume1                                         ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/be4f7e18-91d9-11e0-869c-f46d04923303  ONLINE       0     0   251  1.47M resilvered
            gptid/bea6a51b-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0
            gptid/bf1702f1-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0
            gptid/bf6feb92-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0
            gptid/bfd28ef2-91d9-11e0-869c-f46d04923303  ONLINE       0     0     0

errors: No known data errors
freenas#


Should I be worried? The data cable is OK, since it's the same for all disks. I can try to exchange it, however...
 

jafin

Explorer
Joined
May 30, 2011
Messages
51
Is the disk still turning on/off? Perhaps your disk platter is OK but the disk drive motor is flaky? That might be identified if you run some sort of exhaustive test that goes for a few hours on the drive (not sure if Samsung ESTOOLS has this).
 

jfr2006

Contributor
Joined
May 27, 2011
Messages
174
Hi:

Problem solved. The CKSUM errors were left over from the previous behavior of the disk, and the problem was indeed the power cable. SATA power connectors are the worst invention ever made :mad:
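One way to confirm that such counters are only leftovers is to clear them and later check that none has climbed again. A rough sketch of that check, assuming the `zpool status` table layout shown in this thread (NAME, STATE, READ, WRITE, CKSUM columns); the function name is made up for illustration:

```shell
#!/bin/sh
# List vdev lines whose READ, WRITE or CKSUM column is not "0".
# Reads zpool status text on stdin; nonzero_counters is a hypothetical name.
nonzero_counters() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }  # header row of the config table
        /^errors:/                    { in_cfg = 0 }        # the table ends before this line
        in_cfg && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}
```

Used as `zpool status volume1 | nonzero_counters`, it prints nothing when every counter is zero, so empty output some time after a `zpool clear` suggests the old errors have not come back.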

Regards.
 