I think my boot pool is toast

Status
Not open for further replies.

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
OK, so last night my scheduled zpool scrub reported critical errors on my boot pool. Checking the pool with zpool status -v freenas-boot gave me the following bad news:

  pool: freenas-boot
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 10.9M in 0h50m with 26 errors on Sat Oct 17 12:55:24 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        freenas-boot                                    DEGRADED     0     0    26
          mirror-0                                      DEGRADED     0     0    52
            gptid/9f30190b-0dc3-11e5-a332-08002717050c  DEGRADED     0     0   179  too many errors
            gptid/9f5017bd-0dc3-11e5-a332-08002717050c  DEGRADED     0     0   123  too many errors

errors: Permanent errors have been detected in the following files:

freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/sbin/fsck_ffs
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/usr/local/sbin/fsck_ext2fs
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/sbin/fsck_msdosfs
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/sbin/hastd
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/rescue/glabel
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/make
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/nm
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/nslookup
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/nsupdate
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/kgdb
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/ld
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-08-25-09:02:35:/boot/kernel/kernel

So that leads me to a few questions. First of all, just how screwed am I? The fact that /boot/kernel/kernel is in the list is especially troubling. Are my boot devices toast? If so, that makes the 5th and 6th flash drives I've killed this year. It's getting ridiculous. However, all of the file errors are in the latest snapshot. Could I just roll back to the previous snapshot, delete the latest one, and then update again?
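To see which snapshots the damaged files actually belong to, the error list above can be grouped by snapshot. This is only a minimal Python sketch, assuming the entries are pasted in as text in the dataset@snapshot:/path form that zpool status printed; the function name group_errors_by_snapshot and the ERRORS constant are made up for illustration and are not part of any ZFS tooling.

```python
from collections import defaultdict

# Entries copied from the "Permanent errors" list in the zpool status output.
ERRORS = """
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/sbin/fsck_ffs
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/usr/local/sbin/fsck_ext2fs
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/sbin/fsck_msdosfs
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/sbin/hastd
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-06-08-10:01:51:/rescue/glabel
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/make
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/nm
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/nslookup
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/nsupdate
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/kgdb
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-09-20-13:22:46:/usr/bin/ld
freenas-boot/ROOT/FreeNAS-9.3-STABLE-201509282017@2015-08-25-09:02:35:/boot/kernel/kernel
"""


def group_errors_by_snapshot(lines):
    """Group 'dataset@snapshot:/path' entries by (dataset, snapshot)."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if "@" not in line:
            continue  # skip blank lines and entries that are not in a snapshot
        dataset, _, rest = line.partition("@")
        # The snapshot name itself contains ':' (it is a timestamp), so split
        # on the ':/' that introduces the absolute file path instead.
        snapshot, sep, path = rest.partition(":/")
        if not sep:
            continue  # not in the expected form; ignore rather than guess
        groups[(dataset, snapshot)].append("/" + path)
    return dict(groups)


if __name__ == "__main__":
    grouped = group_errors_by_snapshot(ERRORS.splitlines())
    for (dataset, snapshot), paths in sorted(grouped.items()):
        print(f"{dataset}@{snapshot}: {len(paths)} file(s)")
        for path in paths:
            print(f"    {path}")
```

Run against the list above, this puts five files under the 2015-06-08-10:01:51 snapshot, one under 2015-08-25-09:02:35, and six under 2015-09-20-13:22:46, all within the same FreeNAS-9.3-STABLE-201509282017 dataset.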

Beyond that, as I mentioned, it's getting ridiculous how quickly this thing eats flash drives and spits them back out. Does anybody know of a flash drive brand/model that is known for long-term reliability? I know of the $150 industrial-grade drives, but I don't want to go that crazy. It just seems insane that I haven't yet had a single flash drive last longer than 6 months in this machine (and frankly, it terrifies me how much I've trusted flash drives to store data without the error-detection capabilities of ZFS).

Thoughts? Suggestions? Condolences?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I bought a 16 GB SATA DOM late last year for about $30 off eBay (actually two of them; I run them mirrored). Scrubs every couple of weeks. No errors so far. A small SSD would be another option--32 GB units are under $20 on eBay right now. Both would require a free SATA port, though.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Condolences are all I have. There's a reason why I've advocated SSDs for the last few months. It's becoming a bigger and bigger problem.
 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
I think I have 2 spare SATA ports, but no room for additional drives... I'll have to look into those DOMs. Pretty sure those SATA ports are on Marvell controllers, but that's still probably better than USB.

 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
cyberjock said:
Condolences are all I have. There's a reason why I've advocated SSDs for the last few months. It's becoming a bigger and bigger problem.
So what's changed to make it a bigger problem? I have some FreeNAS 8.x boxes still running off their original flash drives. Sure, without ZFS on the boot drives I'm probably missing errors that are there, but at the rate I'm losing drives these days, I'd expect those old machines to be completely non-functional by now.

 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
The boot devices are now ZFS, and there is more read/write/scrub activity than in the older versions of FreeNAS.
 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
The system database exported successfully, so the corruption didn't hit that. I got a pair of KingSpec DOMs in yesterday and reinstalled everything today, and it worked almost without a hitch. A word of warning if anybody gets one of these (http://www.ebay.com/itm/161299452084): there's basically nothing holding the black plastic part of the SATA port to the PCB, so it's very easy to accidentally bend the board at a 90 degree angle and pull the pins right out of it. I managed to get it back together, and it's all working, but be VERY careful when inserting them into the motherboard.
 