Second hard drive failed during resilvering

Status
Not open for further replies.

TonyToews

Dabbler
Joined
Mar 20, 2012
Messages
33
I replaced a failed hard drive, added the new hard drive in the FreeNAS configuration, and about five hours later the SMART daemon emailed me to tell me that it was "unable to open device."

So now when I run zpool status, that hard drive is showing 49 in the read column and 1.02M in the write column. The other hard drives show 0, so presumably those numbers are error counts. And the write number is increasing. So I'm thinking that it could be greatly slowing down the resilvering. Is this the case? If so, should I just deactivate or physically unplug that hard drive?

Is there a way of seeing the resilver progress to see if it is indeed making reasonable progress?

If the scrub took about 30 or so hours how long would you expect a resilver to run?
 

TonyToews

Dabbler
Joined
Mar 20, 2012
Messages
33
Ahh, I had to pipe the output through | more to see the first part of the zpool status output. It indicates 70% done, so I'll just check back in an hour.
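
For anyone who would rather script the check than page through the output, here is a minimal Python sketch that pulls the percentage out of the scan section. The sample text is made up to resemble a resilver report, and the function name is only illustrative. The wording of the scan lines differs between ZFS versions, so treat the regex as an assumption to verify against your own output.

```python
import re

# Hypothetical sample resembling the scan section of `zpool status`.
# The pool name, sizes, and times are invented for illustration.
SAMPLE = """\
  pool: tank
 state: DEGRADED
  scan: resilver in progress since Sun Mar 25 10:00:00 2012
        1.20T scanned out of 1.71T at 50.1M/s, 2h58m to go
        300G resilvered, 70.00% done
"""


def resilver_percent(status_text):
    """Return the 'NN.NN% done' figure as a float, or None if it is absent."""
    match = re.search(r"(\d+(?:\.\d+)?)% done", status_text)
    return float(match.group(1)) if match else None


print(resilver_percent(SAMPLE))
```

On a live system the text could come from something like subprocess.run(["zpool", "status"], capture_output=True, text=True).stdout instead of the sample string.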
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
From what I've seen, the resilver takes roughly the same amount of time as a scrub (give or take a few hours).

I wouldn't unplug the drive unless the resilvering is going so slowly that it can't realistically finish. If the disk is no longer available (most likely, based on the SMART daemon message), then it is already 'disconnected' from the machine and isn't capable of being used. If it is still available, it might save your data if another drive has issues and you need some parity data.

In short, if it's disconnected then it's no big deal for the resilver. If it's not disconnected, I wouldn't disconnect it unless your resilver time changes to be weeks long or something. I'd definitely get a new drive on order right now, as that other drive has almost certainly failed.
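
To judge whether a resilver is still on track or heading toward "weeks long", a rough linear estimate from the elapsed time and the reported percentage is usually enough. This is a minimal sketch that assumes the resilver keeps its average rate, which is only an approximation; the function name and the numbers are illustrative, not taken from this pool.

```python
def remaining_hours(elapsed_hours, percent_done):
    """Estimate hours left, assuming the average rate so far stays constant."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be greater than 0 and at most 100")
    fraction = percent_done / 100.0
    return elapsed_hours * (1.0 - fraction) / fraction


# Illustrative numbers only: 21 hours elapsed with 70% reported as done.
left = remaining_hours(21, 70)
print(f"about {left:.1f} hours left, {21 + left:.1f} hours in total")
```

If the estimated total stays in the same range as the last scrub, the failing disk is probably not holding things up.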
 

TonyToews

Dabbler
Joined
Mar 20, 2012
Messages
33
Now that I know about | more, I can see the resilver is continuing at the expected speed. Ah yes, View Disks no longer shows that hard drive. But what's interesting to me is that zpool status shows the write errors are now up to 1.21M for that hard drive.

I'm also not doing any writes to the NAS and very few reads.

And yes, I do have another hard drive on order. Even if I were to unplug the failed drive, plug it back in, and find it worked just fine, I'd no longer trust it.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
That's because ZFS checks the drives while it's resilvering the disk you replaced. The "failed" disk isn't responding with the correct data (duh, there is no data), so it's attempting to fix the issue by writing the correct data to the drive. It doesn't know that the drive is missing, so you're racking up lots of write errors. Nothing to worry about though.
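
If you want to watch those counters without reading the whole table each time, a small parser can list the devices whose READ, WRITE, or CKSUM columns are non-zero. This is a rough sketch: the sample text and gptid names are invented, the function names are illustrative, and it assumes abbreviated counts such as 1.02M use powers of 1024, which is worth checking on your system.

```python
import re

# Multipliers for abbreviated counts (assumed to be powers of 1024).
_SUFFIX = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Hypothetical config table; device names and layout are illustrative.
SAMPLE = """\
config:

	NAME            STATE     READ WRITE CKSUM
	tank            DEGRADED     0     0     0
	  raidz1-0      DEGRADED     0     0     0
	    gptid/aaaa  ONLINE       0     0     0
	    gptid/bbbb  ONLINE      49 1.02M     0
"""


def parse_count(text):
    """Convert '49' or '1.02M' to an approximate integer, or None if not a count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text)
    if match is None:
        return None
    return int(float(match.group(1)) * _SUFFIX[match.group(2)])


def devices_with_errors(status_text):
    """Map device name to (read, write, cksum) for rows with any non-zero count."""
    found = {}
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        counts = [parse_count(field) for field in fields[2:5]]
        if None in counts:
            continue  # header row or a line that is not part of the table
        if any(counts):
            found[fields[0]] = tuple(counts)
    return found


print(devices_with_errors(SAMPLE))
```

Running it periodically and comparing the write count between runs would show whether the number is still climbing, as it did in this thread.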
 

TonyToews

Dabbler
Joined
Mar 20, 2012
Messages
33
Ahh, thanks for the explanation. That makes sense. That hard drive is obviously somewhat still working then. At least the electronics. Not like my other one that failed due to a partially melted/burnt power connector.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
TonyToews said:
Ahh, thanks for the explanation. That makes sense. That hard drive is obviously somewhat still working then. At least the electronics. Not like my other one that failed due to a partially melted/burnt power connector.

Not necessarily. The drive is most likely detached from the system already, based on the SMART message you got. But the system still considers the gptid to be part of the zpool and tries to write the correct data to the gptid. However, since the gptid no longer maps to a physical drive that is attached to the system and responding, you're getting the write errors.

If I had to guess, I'd say there's a 99% chance that the disk has completely failed. It may come back online if you power cycle the drive, but considering the errors it has racked up already, I wouldn't do anything until the resilver completes. Then replace the drive with a spare. As I said before, you should be ordering a replacement drive for that disk if you don't already have a spare.
 

TonyToews

Dabbler
Joined
Mar 20, 2012
Messages
33
Thanks. Resilver successfully completed. Second replacement hard drive on order. No spares on site.
 