Boot USB drive failing, process to replace?

Status
Not open for further replies.

my95z34

Explorer
Joined
Oct 25, 2014
Messages
51
So, my mom's server emailed me this morning to tell me that it scrubbed the boot drive and found 15 checksum errors. (Boot Volume Condition: ONLINE One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.) I told her that she needs to pick up 1-2 new USB drives. Now I'm trying to figure out the best process to replace it. Currently she's got just one USB stick, but I want her to move to 2. Should I add the new sticks, set them to mirror, then remove the original stick? Or would that potentially copy bad data to the new sticks? If that's the case, should I back up the config, do a clean install on one new stick, restore the config, then mirror to the second new stick?

Thanks in advance!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The best option is to back up the config (which should be done anyways, as a matter of good practice) and reinstall FreeNAS from ISO.

The installer will also allow you to set up the mirror directly.

Out of curiosity, what brand is the failing drive?
 

my95z34

Explorer
Joined
Oct 25, 2014
Messages
51
That's kinda what I figured. Thanks!

And, it's unbranded, lol. Definitely el cheapo. I'm not really surprised it's failing, to be honest. It's probably 5 years old too.

 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Since we now have a mechanism to detect failing drives, we're starting to get a better idea of which brands are reliable.

I gave up on el cheapos quite a few years ago, after a not-so-cheap el cheapo (it was 2GB - at the time, it felt like a crazy amount of storage to carry around) bit the dust after a rather trivial fall (pun not intended). It started showing up as a few MB drive and Windows never managed to format it.
 

albiurs

Cadet
Joined
Feb 6, 2015
Messages
4
I set up the server about 6 weeks ago, and the boot devices were fine then. I got the same message today. In my case, I have two mirrored boot devices (Kingston 8GB) and both of them have a checksum error count higher than 0: one has 2, the other 25. I guess I have to replace both of them (fresh install from image) as well, as both seem to be corrupted. Am I right? Is it just bad luck that both devices are corrupted at the same time? I picked the Kingstons because they were suggested. Would you suggest going with SanDisk drives for the next try?
Thanks
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If the boot pool shows no errors, you can start by replacing the one with 25 errors. If the pool does show errors, you're best off reinstalling from scratch.
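To make that rule of thumb concrete, here is a minimal Python sketch. It is only an illustration: `parse_config` and `recommend` are hypothetical helper names, the sample text is the device table from the status report posted in this thread, and the parser assumes plain integer counters (it does not handle abbreviated values such as `1.2K`).

```python
# Device table copied from the `zpool status` report posted in this thread.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
freenas-boot ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
da6p2 ONLINE 0 0 25
da7p2 ONLINE 0 0 2
"""


def parse_config(text):
    """Return {name: (read, write, cksum)} for each row of the config table."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have five fields, the last three being numeric counters.
        # The header row fails the isdigit() check and is skipped.
        if len(parts) == 5 and all(p.isdigit() for p in parts[2:]):
            rows[parts[0]] = tuple(int(p) for p in parts[2:])
    return rows


def recommend(rows, pool="freenas-boot"):
    """Apply the rule of thumb: pool errors -> reinstall, else replace the worst device."""
    if any(rows[pool]):
        return "reinstall from scratch"
    # Leaf devices are everything except the pool row and grouping rows like mirror-0.
    leaves = {
        name: counters
        for name, counters in rows.items()
        if name != pool and not name.startswith(("mirror", "raidz"))
    }
    bad = {name: sum(counters) for name, counters in leaves.items() if any(counters)}
    if not bad:
        return "nothing to do"
    worst = max(bad, key=bad.get)
    return f"replace {worst} first"


print(recommend(parse_config(SAMPLE)))  # replace da6p2 first
```

The sketch only looks at the error counters. It does not replace reading the `errors:` line at the bottom of the report, which is what actually tells you whether any data was lost.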
 

albiurs

Cadet
Joined
Feb 6, 2015
Messages
4
Thanks for your reply. I'm new to FreeNAS, so I'm not sure about that. I guess the first row in the table, in my case named "freenas-boot", is the boot pool you mean. It has no errors (checksum=0). But how can the pool have no errors when both of the devices have failures? I expected that if both drives in the mirror have a hardware failure, the pool must have an error as well, as it actually consists of the two drives.

Today I got two more security run output mails from the server. The first was:
freenas.local kernel log messages:
(da6:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 31 d1 c6 00 00 80 00
(da6:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da6:umass-sim0:0:0:0): Retrying command
(da6:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 5c 09 74 00 00 80 00
(da6:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da6:umass-sim0:0:0:0): Retrying command
(da6:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 38 49 41 00 00 80 00
(da6:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da6:umass-sim0:0:0:0): Retrying command
(da6:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 27 50 09 00 00 80 00
(da6:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da6:umass-sim0:0:0:0): Retrying command

-- End of security output --
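For anyone trying to read those kernel messages: the bytes after "CDB:" are the raw SCSI READ(10) command. In that format, byte 0 is the opcode (0x28), bytes 2-5 are the starting logical block address (big-endian) and bytes 7-8 are the number of blocks requested. A small Python sketch to decode them follows; `decode_read10` is a hypothetical helper name, not part of any FreeNAS tool.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns (starting_lba, block_count).
    """
    cdb = bytes.fromhex(cdb_hex)  # spaces between byte pairs are accepted
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) command descriptor block")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


# The four failed reads from the log above.
for cdb in (
    "28 00 00 31 d1 c6 00 00 80 00",
    "28 00 00 5c 09 74 00 00 80 00",
    "28 00 00 38 49 41 00 00 80 00",
    "28 00 00 27 50 09 00 00 80 00",
):
    lba, blocks = decode_read10(cdb)
    print(f"LBA {lba}, {blocks} blocks")
```

The failing reads land at four different addresses spread across the stick, each asking for 128 blocks (64 KiB if the stick uses 512-byte sectors, which is an assumption here), so this does not look like a single bad spot.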



...and the second email was this one:
Checking status of zfs pools:
NAME           SIZE  ALLOC   FREE  EXPANDSZ  FRAG   CAP  DEDUP  HEALTH  ALTROOT
freenas-boot  7.25G  2.43G  4.82G         -     -   33%  1.00x  ONLINE  -
volume0       10.9T  1.78T  9.10T         -    9%   16%  1.00x  ONLINE  /mnt

  pool: freenas-boot
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 1.67M in 0h9m with 0 errors on Sat Feb 7 07:20:23 2015
config:

        NAME          STATE   READ  WRITE  CKSUM
        freenas-boot  ONLINE     0      0      0
          mirror-0    ONLINE     0      0      0
            da6p2     ONLINE     0      0     25
            da7p2     ONLINE     0      0      2

errors: No known data errors

-- End of daily output --


I tried to replace the second USB drive by hitting the "Replace" button, but that was not possible because no USB device was available (the "Member disk" field was empty). Any suggestions?

Thanks!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The pool will only have errors if both devices have errors in the same block(s).
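A toy model may make this easier to picture. The Python sketch below is not how ZFS is implemented: it uses CRC32 as a stand-in for the real block checksums, and `corrupt` and `scrub` are hypothetical names. It only shows why each drive can rack up checksum errors while the pool stays at zero.

```python
import zlib


def corrupt(block):
    """Flip one bit in a block to simulate flash corruption."""
    return bytes([block[0] ^ 1]) + block[1:]


def scrub(copy_a, copy_b, checksums):
    """Check every block on both halves of a two-way mirror.

    Returns (cksum_errors_a, cksum_errors_b, pool_errors). A block only
    counts against the pool when neither copy matches its checksum,
    because otherwise the good copy can be used to repair the bad one.
    """
    errors_a = errors_b = pool_errors = 0
    for a, b, expected in zip(copy_a, copy_b, checksums):
        ok_a = zlib.crc32(a) == expected
        ok_b = zlib.crc32(b) == expected
        errors_a += not ok_a
        errors_b += not ok_b
        if not ok_a and not ok_b:
            pool_errors += 1
    return errors_a, errors_b, pool_errors


blocks = [b"block-%d" % i for i in range(8)]
checksums = [zlib.crc32(b) for b in blocks]
drive_a, drive_b = list(blocks), list(blocks)

# Both drives are damaged, but in different blocks.
drive_a[1] = corrupt(drive_a[1])
drive_a[5] = corrupt(drive_a[5])
drive_b[3] = corrupt(drive_b[3])
print(scrub(drive_a, drive_b, checksums))  # (2, 1, 0)

# Now block 5 is bad on both drives: nothing left to repair it from.
drive_b[5] = corrupt(drive_b[5])
print(scrub(drive_a, drive_b, checksums))  # (2, 2, 1)
```

In the first case every bad block still has a good twin on the other stick, which matches the "scrub repaired 1.67M ... with 0 errors" line in the report above.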
 

travanx

Explorer
Joined
Jul 1, 2014
Messages
62
When 1 of the 2 USB flash drives fails, is there an easy way to figure out which one it physically is? 1 of my 2 Newegg Team-branded 16GB drives just failed on me (after 2 months of use).
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Remove one. If you guessed correctly, hooray. If not, no big deal. Put the other one back plus the replacement.
 

travanx

Explorer
Joined
Jul 1, 2014
Messages
62
I guessed wrong I think. Or both USB drives are going bye bye.

I re-attached the new boot drive to the pool. Once the "please wait" popup went away, it showed the boot pool with the 2 drives plus the bad one, so I detached the bad drive. Then the resilver started.

And now I am getting errors in the resilver. These Team branded Newegg USB drives are pure garbage. Bought some backup Sandisk drives just in case this happened.
 

avalon60

Guru
Joined
Jan 15, 2014
Messages
597
Just for information, I have a Kingston DataTraveler SE9 8GB as the main boot drive, and a Sandisk Cruzer Blade 16GB as the mirrored usb drive.
I have not had any problems with either in the 6 months they have been in use.
 

travanx

Explorer
Joined
Jul 1, 2014
Messages
62
Looks like all of my 16GB Team C141 drives (3) bit the dust in the process. The fix was as easy as installing the 2 new 16GB SanDisk drives and importing the config.
 

dschoorisse

Cadet
Joined
Nov 5, 2014
Messages
4
Just experienced the same problem, a failing USB drive. The failing drive was a 'temporary' solution, a micro SD card in a dodgy memory card reader... but hey, it ran for more than 1.5 years ;)

I've replaced it with two Kingston DataTraveler Micro USB sticks 8GB, in mirror. Backed up the config file, reinstalled FreeNAS (from a third USB stick) and reuploaded the config file afterwards. Took me less than 20 minutes to get it running again. Thanks for the info!

Edit: I've plugged the two small USB drives directly to the mainboard using this little adapter.
 

Ismael Duarte

Contributor
Joined
Jun 13, 2011
Messages
154
This happens very often, mostly after an upgrade, but not only then.
I've already tried 3 USB drives.
It seems that something else is wrong.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Updates naturally stress drives more than regular usage, so drives will tend to fail then.

What drives are you using?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
1st was a white brand
2nd Sony
3rd Kingston

but, if I reformat the drive and re-install, it works OK!

Pretty much matches general experience since 9.3 arrived. Kingston drives have been crapping out left and right, same with no-brand stuff. I wouldn't put selling bad flash past Sony, either, given their abysmal track record with business practices.

Toshiba, SanDisk, Lexar and Corsair seem to be the least problematic ones.
 

Ismael Duarte

Contributor
Joined
Jun 13, 2011
Messages
154
@dschoorisse, how did you get the boot drive mirrored with another one?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504