ZFS RaidZ2 Issues with 2 drives

Status
Not open for further replies.
Joined
Oct 10, 2013
Messages
3
I have a ZFS RaidZ2 array with 7 drives. I was migrating the drives to different hardware. While migrating, I finally figured out how to set up email alerts properly, after this system had been running for a year. Now that the email alerts are set up, I have learned that 2 drives have uncorrectable sectors.

Device: /dev/ada1, 4513 Offline uncorrectable sectors Seagate ST2000DM001

Device: /dev/ada3, 160 Offline uncorrectable sectors Seagate ST2000DL003

Is it just me, or does 4513 sectors seem high enough that the array should be giving health alerts (it's normally green)?

I am trying to decide on next steps. Fortunately, the DL003 drive is still under warranty for another week or so, so I was able to RMA it after running a long test with SeaTools.

The other drive, the ST2000DM001, is out of warranty. My question is: what should I do with this drive?

I pulled the drive to run SeaTools in another computer, and it fails. CrystalDisk shows about 1155 pending sectors, and I am running the Fix All Long option in SeaTools. If the drive is repaired and all of the sectors are successfully reallocated, should I still keep it out of the array?

Also, if it does get repaired, would I run into any issues reinserting it into the array? I didn't do anything (detach or replace) in FreeNAS; I just shut it down and pulled the drives. Any help is greatly appreciated.

The good thing is any meaningful data on this array is already backed up onto a separate external hard drive.
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
The GUI only gives alerts in case of ZFS problems, i.e. checksum errors, degraded pools.
If you did regular scrubs and haven't gotten a warning yet, your data should be ok.
To detect disk problems early on, a good way is to set up SMART email alerts (like you have now done).
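
As a rough illustration of working with such alerts, the two lines quoted in the first post could be parsed and ranked with a few lines of Python. This is only a sketch: the regular expression is inferred from those two lines, and the function names and the threshold of 100 sectors are hypothetical choices for the example, not anything FreeNAS or smartd defines.

```python
import re

# Pattern inferred from the two alert lines quoted in this thread (an assumption).
ALERT_RE = re.compile(
    r"Device:\s+(?P<dev>/dev/\S+),\s+(?P<count>\d+)\s+Offline uncorrectable sectors"
)

def parse_alerts(text):
    """Return {device: offline_uncorrectable_count} for every matching line."""
    return {m.group("dev"): int(m.group("count")) for m in ALERT_RE.finditer(text)}

def disks_over(counts, threshold=100):
    """List devices above a hypothetical threshold, worst first."""
    flagged = [dev for dev, count in counts.items() if count > threshold]
    return sorted(flagged, key=lambda dev: -counts[dev])

sample = """\
Device: /dev/ada1, 4513 Offline uncorrectable sectors Seagate ST2000DM001
Device: /dev/ada3, 160 Offline uncorrectable sectors Seagate ST2000DL003
"""

counts = parse_alerts(sample)
print(counts)
print(disks_over(counts))
```

Any cutoff is a judgment call; the point is simply that the SMART counts, not the pool status in the GUI, are what reveal a disk in this state.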

I personally would replace the ST2000DM001, as 4500 offline uncorrectable sectors is quite a lot. Luckily you have backups and a RAID-Z2, so as long as no third disk starts displaying errors, the replacement should go through without a problem.
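
The reasoning about a third disk comes down to simple arithmetic: a RAID-Z2 vdev tolerates two failed disks, so two suspect disks leave no margin. A minimal sketch of that calculation, where the function name is made up for illustration and the worst case (every suspect disk failing outright) is assumed:

```python
# Number of disk failures each RAID-Z level can survive per vdev.
PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}

def remaining_margin(level, suspect_disks):
    """Additional failures the vdev could absorb if every suspect disk died.

    A result of 0 means one more failure loses the vdev;
    a negative result means the vdev would already be lost.
    """
    return PARITY[level] - suspect_disks

# The situation in this thread: RAID-Z2 with two disks reporting bad sectors.
print(remaining_margin("raidz2", 2))
```

With both ada1 and ada3 counted as suspect, the result is 0, which is why replacing them one at a time, promptly, matters.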

If you just reinsert the exact same disk, FreeNAS will pick it up on boot and you don't have to do anything. To replace the other disk, just follow the steps outlined in the manual.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
It's as warri says. I'd replace the one with 4500 errors ASAP. That's the one you RMAed, I believe, but I'd get a spare disk and replace that disk right now. You run the risk of losing data since you already have 2 disks that are failing.

Personally, from what I've seen with disks that have your errors, they only get worse. The "fix" is very temporary, as the issues just keep growing over time. If I were you, I'd replace both of those disks right away and just toss the one that is out of warranty in the trash (or keep it in case you ever want to use a failed disk to see how a system responds). I keep a bad disk handy just for the occasional test. I wrote "BAD" on it in big letters with a Sharpie so I don't use it for data.
 