How to identify the correct failed USB boot drive

Status
Not open for further replies.

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I received an alert that one of my two USB boot drives failed, so I'm going to replace it. The problem is that I can't figure out which one it is. I have two SanDisk Cruzer minis. Any tips?

With a hard disk, I'd try to dd if=/dev/adaXX and look for the blinky light, but I can't see anything on these things.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
The only 100% reliable way is to use the serial number; however, on USB drives it's not written anywhere, so unless you wrote it on a sticker placed on the drive you won't be able to do that.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I can turn it off and pull them out to see if the S/n is legible.

Of course, I can't even find the corresponding S/n for the failed da1 device in FreeNAS. Grrrr
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yep, you can do that, but just be careful to not write a single bit to the stick during the process.

Use my script (link is in my signature) or directly the commands inside it ;)
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Thanks for the reminder, but your scripts don't seem to work for these USB drives. glabel and smartctl come up empty.

Code:
[root@freenas2] ~# zpool status -v
  pool: freenas-boot
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 0h7m with 0 errors on Tue Oct 13 03:52:41 2015
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            da1p2   FAULTED      1     1     0  too many errors
            da2p2   ONLINE       0     0     0

errors: No known data errors

  pool: tank2
 state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank2                                           ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/e59fb1bc-8443-11e5-8981-001f33eaf869  ONLINE       0     0     0
            gptid/e6689760-8443-11e5-8981-001f33eaf869  ONLINE       0     0     0
            gptid/e73d4371-8443-11e5-8981-001f33eaf869  ONLINE       0     0     0
            gptid/e819921f-8443-11e5-8981-001f33eaf869  ONLINE       0     0     0
            gptid/e8f87625-8443-11e5-8981-001f33eaf869  ONLINE       0     0     0
            gptid/e9cdb4bc-8443-11e5-8981-001f33eaf869  ONLINE       0     0     0

errors: No known data errors
[root@freenas2] ~# glabel status -s da1p2
glabel: No such geom: da1p2.
[root@freenas2] ~# glabel status -s da2p2
glabel: No such geom: da2p2.
[root@freenas2] ~# smartctl -i /dev/da1
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p28 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/da1: Unable to detect device type
Please specify device type with the -d option.

Use smartctl -h to get a usage summary
[root@freenas2] ~#
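
For reference, the faulted member can be picked out of the "zpool status" output above mechanically. This is only a minimal sketch, not something posted in the thread: "faulted_members" is a hypothetical helper name, and it assumes the column layout shown above (device name in the first column, state in the second).

```shell
#!/bin/sh
# Sketch: print pool members that are in a failed state.
# Reads "zpool status" text on stdin, so it works on live output or a saved copy.
# faulted_members is a hypothetical helper name, not a ZFS command.
faulted_members() {
    awk '
        # The device table starts right after the NAME/STATE header line.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # A blank line ends the table.
        in_table && NF == 0 { in_table = 0 }
        # Report rows whose state column shows a failed or missing device.
        in_table && $2 ~ /^(FAULTED|UNAVAIL|REMOVED|OFFLINE)$/ { print $1, $2 }
    '
}
```

Usage would be something like "zpool status freenas-boot | faulted_members", which for the output above should print the da1p2 row. It only tells you the device node, not which physical stick that is.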
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Ah, yes, of course, it's a USB drive, silly me...

So the next question is "how do we get the serial number for a USB drive?", I'll let you search this one :P
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Finally figured it out using a combination of "camcontrol devlist" and "camcontrol inquiry da2" (since da1 is toast, it wasn't responding to any inquiries, duh! *smacks forehead*). "dmesg | grep da1" would have worked, but /var/log/messages had already rolled over (surprisingly, nothing showed up when running "dmesg").

Code:
[root@freenas2] /var/log# dmesg
[root@freenas2] /var/log# camcontrol devlist
<ST6000DX000-1H217Z CC48>          at scbus0 target 0 lun 0 (ada0,pass0)
<ST6000DX000-1H217Z CC48>          at scbus1 target 0 lun 0 (ada1,pass1)
<ST6000DX000-1H217Z CC48>          at scbus2 target 0 lun 0 (ada2,pass2)
<ST6000DX000-1H217Z CC48>          at scbus3 target 0 lun 0 (ada3,pass3)
<ST6000DX000-1H217Z CC48>          at scbus4 target 0 lun 0 (ada4,pass4)
<ST6000DX000-1H217Z CC48>          at scbus5 target 0 lun 0 (ada5,pass5)
<SMI USB DISK 1100>                at scbus7 target 0 lun 0 (pass6,da0)
<SanDisk Cruzer Fit 1.27>          at scbus8 target 0 lun 0 (pass7)
<KVM vmDisk-CD 0.01>               at scbus9 target 0 lun 0 (pass8,cd0)
<SanDisk Cruzer Fit 1.27>          at scbus10 target 0 lun 0 (pass9,da2)
[root@freenas2] /var/log# camcontrol inquiry 8:0:0
[root@freenas2] /var/log# camcontrol inquiry 10:0:0
pass9: <SanDisk Cruzer Fit 1.27> Removable Direct Access SCSI-6 device
pass9: Serial Number 4C53000123819101112
pass9: 40.000MB/s transfers
[root@freenas2] /var/log# camcontrol inquiry da2
pass9: <SanDisk Cruzer Fit 1.27> Removable Direct Access SCSI-6 device
pass9: Serial Number 4C5302234567819101112
pass9: 40.000MB/s transfers
[root@freenas2] /var/log# 
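
In the devlist output above, the stick on scbus8 shows only a pass device and no da device, and it is also the one that returned nothing to "camcontrol inquiry". A rough sketch for spotting entries like that follows. It assumes that a missing disk node means the failed stick, which matches this output but is not guaranteed in general, and "passthrough_only" is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print "camcontrol devlist" entries that expose only passN devices,
# i.e. entries with no da/ada/cd node attached.
# Reads devlist text on stdin. passthrough_only is a hypothetical helper name.
passthrough_only() {
    awk '
        NF == 0 { next }                       # skip blank lines
        {
            periphs = $NF                      # last field, e.g. "(pass9,da2)"
            gsub(/[()]/, "", periphs)          # strip the parentheses
            count = split(periphs, names, ",")
            only_pass = 1
            for (i = 1; i <= count; i++)
                if (names[i] !~ /^pass[0-9]+$/) only_pass = 0
            if (only_pass) print               # line is passed through unchanged
        }
    '
}
```

Run it as "camcontrol devlist | passthrough_only". Against the listing above it should print only the scbus8 line, which gives the bus number of the dead stick but still not its physical position.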
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Upon further review, even this method doesn't work, since the serial number reported by camcontrol doesn't match the serial number printed on either USB drive (not even close to the right format). I'm just going to roll the dice and pull one to see what message pops up in the console (the new USB detected alert).
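
The "pull one and see what changes" approach can also be done by comparing the device list before and after, instead of watching the console. The following is a minimal sketch under that assumption. "removed_devices" is a hypothetical helper name, and the two file arguments are saved copies of "camcontrol devlist" output.

```shell
#!/bin/sh
# Sketch: show which devlist entries disappeared between two snapshots.
# $1 = file with "camcontrol devlist" output taken before pulling a stick
# $2 = file with the same output taken afterwards
# removed_devices is a hypothetical helper name.
removed_devices() {
    before=$1
    after=$2
    # -F fixed strings, -x whole-line match, -v invert:
    # keep the lines of "before" that do not appear anywhere in "after".
    # grep exits non-zero when nothing was removed, so mask that with true.
    grep -Fxv -f "$after" "$before" || true
}
```

For example: save "camcontrol devlist > /tmp/before", pull one stick, save "camcontrol devlist > /tmp/after", then run "removed_devices /tmp/before /tmp/after". Whatever line it prints belongs to the stick in your hand.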
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yeah, but you can plug the USB stick in another PC and read its S/N to see which is which.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I tried that. The numbers are completely different.

For instance, here are the serial numbers from camcontrol:
4C530102841224103441
4C530009340819108345

And here are the serial numbers printed on the USB sticks:
BM140824736V
BM141224736V

Anyone see a pattern?
 

Attachments

  • IMG_20151109_134526.jpg

Robert Smith

Patron
Joined
May 4, 2014
Messages
270
Perhaps the larger number on the inside corresponds to the larger number on the outside, and vice versa…

But yeah, wow. I think we need a recommendation to mark USB drives before use, or to use different brands/models together so it's easy to tell which is which.

If you can import the USB sticks one at a time on another FreeNAS, you should be able to figure out which one shows as healthy with the second drive missing. I am just not completely sure that the second FreeNAS won't try writing anything to the stick.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Anyone see a pattern?

Yes, but we (as humans) are pretty good at finding patterns that don't mean anything, so...

For instance, here are the serial numbers from camcontrol:
4C530102841224103441
4C530009340819108345

And here are the serial numbers printed on the USB sticks:
BM140824736V
BM141224736V
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Take one out and see if the machine boots? If not, it's the other one you want.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Take one out and see if the machine boots? If not, it's the other one you want.
That is exactly what I'd do. Also, just because FreeNAS said it failed, I'd take the failing device and check whether it merely became corrupted or whether there is an actual hardware failure. If you find out that the device did fail, next you should ask yourself why it failed: too much writing? If the device was only corrupted (no hardware failure), then I'd submit a bug report about the drive corruption and watch to see if the new boot device fails.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
That's actually what I ended up doing (shut down, pulled one out, started up, shut down again, then started with the other one). Of course, they were both OK at that point. I ran a scrub and 132KB was corrected on one drive. I'll keep an eye on it, and if it gets worse or happens again I'll replace it.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
My concern is possible data corruption caused by FreeNAS, and that is the only reason I think you should check the USB flash drive to confirm it was a hardware failure. You could test it with some dd command, or maybe on a different computer using some throughput test, and loop it for several hours. I think it would be good to rule out FreeNAS software as the culprit.
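
As an illustration of the dd idea, here is a minimal read-only sketch. It is an assumption about what such a test could look like, not a command from the thread: "read_test" is a hypothetical helper name, and it only catches read errors. A write test would destroy the contents of the stick and is not shown.

```shell
#!/bin/sh
# Sketch: read a whole device N times and stop at the first read error.
# $1 = device or file to read (e.g. /dev/da1), $2 = number of passes (default 1)
# read_test is a hypothetical helper name.
read_test() {
    dev=$1
    passes=${2:-1}
    i=1
    while [ "$i" -le "$passes" ]; do
        # Read-only: data goes to /dev/null, nothing is written to the stick.
        # The block size is given in bytes so it works with BSD and GNU dd.
        if ! dd if="$dev" of=/dev/null bs=1048576 2>/dev/null; then
            echo "pass $i: read error on $dev"
            return 1
        fi
        echo "pass $i: $dev read without errors"
        i=$((i + 1))
    done
}
```

Something like "read_test /dev/da1 50" would loop over the stick 50 times. Pick the pass count to cover the several hours suggested above, and run it with the stick out of the boot pool or on another computer.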
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I should mention that I have mirrored USB drives running ZFS. Wouldn't the scrub correct the corruption?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You would think it should. If you run into more corruption, though, you should find out whether it's software or hardware causing it. It could be difficult to track down without quality testing equipment to measure signal response times. Eh, might be more trouble than it's worth unless it's an obvious failure.
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
Hey, @joeschmuck, has any of the above changed? In other words, is there any way to identify the failed USB drive to be replaced?
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
Dealing with the same issue; I asked on the FB group and was directed here (by melloa, it looks like!). I think I may have lucked out: my Cruzer Fits are 16GB and 8GB, and the 8GB one failed. I think I'll just power down and look for the 8GB.

Also think I'm going to mix brands from now on.

Now, when I swap out the failed one and resilver, do I have to do anything extra in terms of making the new one bootable? I know (or assume) the current one is bootable, so if I resilver and then the current good one fails, will I be able to boot the system?
 