Checksum errors on boot

Status
Not open for further replies.

kjp4756

Contributor
Joined
Feb 11, 2014
Messages
102
This morning I woke up to an email saying my boot drive was in critical condition. There were no read or write errors, only 25 checksum errors. The USB drive was a single Kingston micro 8GB.

I immediately backed up my config and went to the store to get a new drive. I picked up a 16GB Sony and installed FreeNAS 9.3 from the ISO onto the new drive. I then updated through the update tab. After the updates applied, I restored my config. Everything works well. I then ran a scrub on the new boot drive and got another critical alert email. The GUI shows 2 checksum errors.

Is it likely the new USB drive is bad as well? My storage pool shows no errors, so I doubt it's a system problem, unless of course the USB is flaking out on me.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710

kjp4756

Contributor
Joined
Feb 11, 2014
Messages
102
I cleared the errors with a 'zpool clear freenas-boot' then re-ran a scrub. The checksum errors didn't come back. I'll leave it as is until I order something better online.

Thanks.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
That sounds like the USB stick is storing the data correctly, but bits are being randomly corrupted when being transferred from the memory on the USB stick to the system.
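One rough way to tell the two cases apart from user space is to read the same file several times and compare the hashes. If the stored data were bad, every read should return the same wrong bytes; if bits are flipped in transit, the hash may change from read to read. The sketch below is only an illustration of that idea, not something from this thread: the function names are made up, and repeated reads can be served from the OS or ZFS cache rather than the stick, so a stable result does not prove much unless the cache is bypassed (for example by rebooting or re-importing the pool between passes).

```python
import hashlib
from collections import Counter


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_read_digests(path, passes=5):
    """Hash the same file `passes` times and count each distinct digest."""
    return Counter(hash_file(path) for _ in range(passes))


def reads_are_stable(path, passes=5):
    """True if every pass produced the same digest."""
    return len(repeated_read_digests(path, passes)) == 1
```

More than one distinct digest for an unchanged file would point toward corruption on the read path rather than in the stored data.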
 

kjp4756

Contributor
Joined
Feb 11, 2014
Messages
102
I'm not too sure what is going on. I took my 8GB micro Kingston USB drive (the one I thought was bad) and created a ZFS pool on it from the GUI called "test". I then copied a bunch of movies to the new pool (3GB worth), ran a scrub on it, and got no checksum errors. I did this 10 times with no errors. I then took the same 8GB drive and installed FreeNAS onto it. I updated to the latest version and restored my config. After the reboot I ran a scrub and now there are checksum errors.

Why am I getting checksum errors when the drive is used as a boot drive but not when it's a data drive?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
An oddly specific controller firmware bug on the flash drive? That's the only logical thing I can come up with.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
You copied movies, which are no doubt large files and easy for ZFS to keep somewhat contiguous. As a boot device, the files are scattered all over the disk, with lots of small random reads and writes. If you read around the forums, you'll see I keep telling people that USB sticks suck at random reads and writes. Not surprisingly, you are having problems with that workload. Kingston is on my bad list because their quality seems to be declining (long story, read the forums if you want to know more).

In short, your workload didn't really represent the workload as a boot device, so your test really doesn't do much except prove that the drive can read/write large blocks without problems.
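A test closer to that workload would write many small files and read them back in random order, checking each one against a known hash. The following is a minimal sketch under assumptions, not a FreeNAS tool: the function names, file naming scheme, and default sizes are all hypothetical, and it only approximates what a boot device does. Reads may also be satisfied from cache instead of the stick, so a ZFS scrub remains the more reliable check.

```python
import hashlib
import os
import random
from pathlib import Path


def write_small_files(directory, count=200, max_size=8192, seed=0):
    """Write `count` small random files and return {path: sha256 hex digest}."""
    rng = random.Random(seed)  # only controls the file sizes
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    expected = {}
    for index in range(count):
        data = os.urandom(rng.randint(1, max_size))
        path = directory / f"small_{index:05d}.bin"
        with open(path, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the write out to the device
        expected[path] = hashlib.sha256(data).hexdigest()
    return expected


def verify_in_random_order(expected, seed=1):
    """Re-read every file in shuffled order; return paths whose hash changed."""
    rng = random.Random(seed)
    paths = list(expected)
    rng.shuffle(paths)
    mismatches = []
    for path in paths:
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != expected[path]:
            mismatches.append(path)
    return mismatches
```

Pointing `write_small_files` at a dataset on the test pool and then calling `verify_in_random_order` on the result gives a list of files that no longer match; an empty list means no mismatches were seen in that pass.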
 

jlpellet

Patron
Joined
Mar 21, 2012
Messages
287
I've had similar occurrences across my 5 FN 9.3 systems: uncorrectable errors get reported on 1 or 2 mirrored boot devices (various name brands, matched 8GB pair for each system). When it is reported, I run a scrub, which detects no issues, and all seems well (updates and boots do not generate a new error). So, for now, I'm living with it. Glad for mirrored boot devices.
 