ZFS error recovery?

Status
Not open for further replies.

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
I have a question about ZFS' error recovery process in FreeNAS.

Here is a scenario I actually experienced on a Solaris 10 SPARC server recently. A run of "zpool status"
showed a double failure, in a mirrored setup. It even told me which file was lost, (some language file that
was pretty worthless). It seems someone else had renamed that file and restored a copy, but left the rest
of the mess for me to clean up.

I was able to run a scrub, which still showed the bad file, as its blocks were not yet "fixed". I removed it and
re-ran the scrub, which repaired 29KB but reported no losses. (Or maybe that 29KB was found in the first scrub...)
Next, I cleared the error counts and scheduled the disks for replacement, (one at a time). At no time did
the server crash or need to be rebooted.
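
Roughly, the sequence looked like this. (A sketch only; "rpool" is my actual OS pool, but the
file path and device names below are placeholders, not the real ones.)

Code:
# Identify the damaged file(s) by name:
zpool status -v rpool

# Scrub; the bad file still shows until its blocks are freed:
zpool scrub rpool

# Remove the damaged file (placeholder path), then scrub again:
rm /usr/lib/locale/xx_XX.UTF-8/somefile
zpool scrub rpool

# Clear the logged error counts:
zpool clear rpool

# Replace the disks one at a time (placeholder device names):
zpool replace rpool c0t0d0s0 c0t2d0s0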

Now how does FreeNAS' ZFS handle a similar situation?

If I experience an unrecoverable fault, can I manually recover?
Like from backups, or by re-generating the file, (like from my music CDs)?
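
On the Solaris box, at least, "zpool status -v" named the damaged file outright, something
like this (illustrative output; the path is a placeholder):

Code:
errors: Permanent errors have been detected in the following files:

        /usr/lib/locale/xx_XX.UTF-8/somefile

So manual recovery was just: delete the named file, restore it from backup (or re-rip it),
then scrub and clear.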

I know that FreeNAS, (and BSD in general), use a forked version of ZFS. So what I have on the
Solaris 10 SPARC server would likely be different from FreeNAS. It appears that if Solaris 10 SPARC
has direct access to the drives, it will perform a SCSI "repair" on the bad blocks.

So I am just asking how FreeNAS handles something like this today.

P.S. Regardless, I don't want to administer a Solaris 10, (or later), x86 box for home. Looks like a nice
FreeNAS Mini is in my future :smile:.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
When you say double failure, do you mean 2 disks were bad in the same 2-disk mirror vdev?
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Yes. It was a simple 2-disk mirror, (a 2-disk VDEV), with the same blocks bad in the file.
Thus, there was no possible way for ZFS to recover.

To be clear, it was the OS disks, (Solaris 10 SPARC "rpool"), and only a few blocks
were bad.

I don't know if ZFS stores the file blocks in the exact same place on mirrored disks,
like some other software mirroring programs, (Solaris DiskSuite, Veritas Volume
Manager, Linux MD-RAID or Linux LVM). If the same blocks were bad on BOTH
disks, that is a pretty strange chance event. But, given the hundreds of trillions of reads,
(for the entire world), it was bound to happen to someone. I feel SO lucky :-(.
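
As a rough back-of-envelope, (assuming a typical quoted unrecoverable-read-error rate of
about 1 in 10^14 bits and a 128 KiB record; both numbers are assumptions, not measurements):

Code:
# Rough odds sketch (assumptions: URE ~1e-14 per bit, 128 KiB record):
awk 'BEGIN {
    p_one  = 131072 * 8 * 1e-14;   # ~1e-8: one copy of a record unreadable
    p_both = p_one * p_one;        # ~1e-16: BOTH mirror copies of the SAME record
    printf "one copy: %.3g  both copies: %.3g\n", p_one, p_both
}'

Tiny per read, but multiply by all the reads happening worldwide and someone eventually
draws the short straw.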
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Under those conditions I'd have expected FreeNAS to perform the same way. Not because FreeNAS is awesome or anything, but because that's how ZFS is coded to work. It *should* work that way.
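
The same userland commands exist on FreeBSD, so the workflow you described carries over
directly from the FreeNAS shell. ("tank" here is just a placeholder pool name.)

Code:
zpool status -v tank   # lists any permanently damaged files by name
zpool scrub tank       # verify everything; repair what redundancy allows
zpool clear tank       # reset the error counters afterward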
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Thanks for the answer. I hope I never find out...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I have seen the worst of the worst while being here. If you follow the recommendations in the manual, the stickies, and my guide, you have a very low chance of losing data. On the other hand, if you think you are so awesome that you are smarter than me and know better, prepare to be a statistic for data loss. We don't waste time on ignorant people who do stupid things and lose their data. ;)

Most of those recommendations are painted with blood and tears.
 