
How to determine which HDD or SSD has failed?

Joined
Aug 25, 2014
Messages
86
Thanks
3
#1
Good afternoon FreeNAS Gurus. I have a 20-HDD FreeNAS SAN (FreeNAS-9.3-STABLE-201506292130).
After 4+ years of faithful service, about 90 minutes ago I got the following message:
CRITICAL: The volume Athene (ZFS) state is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.

When I log into my SAN and click Storage, then Volumes, I see that my primary storage vault says "DEGRADED".

I went to our Data Center to inspect the FreeNAS SAN and all 20 HDDs have blinking blue lights on the front of my SuperMicro Chassis.

My SAN has 20 HDDs and four 240GB SSDs for cache. The SSDs are SATA and don't blink, as only SAS devices blink on my SuperMicro servers.

Still logged in, I click Storage on the left and open the Volumes tab to reach View Disks. From View Disks I can see all of the components of the SAN, but I don't see an error message saying which drive I should pull so I can add a new HDD.

Here is what I see when I inspect View Disks:

To save time I only copied the serial numbers of HDDs 1, 10, 11 and 20, and of SSDs 1 and 4.

drive chart.png
 
Joined
Aug 25, 2014
Messages
86
Thanks
3
#3
At the very end of my HDD list, below da27, is da28, and it says 16.8GB. I think I might know what it is. My FreeNAS has two mirrored USB thumb drives and one is on the outside of the chassis. I am heading to the Data Center now to see if it is just a loose USB drive.
 

nephri

FreeNAS Aware
Joined
Sep 20, 2015
Messages
37
Thanks
18
#4
You can start by checking the status of your pool:

Code:
zpool status


That will show you which disk failed, identified by the gptid label of its freebsd-zfs partition.
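A minimal sketch of that first step, assuming the pool members show up under gptid/ labels: it filters `zpool status` text for members whose state is anything other than ONLINE. The function name `failed_members` and the sample text are made up for illustration; the real labels on your box will be long UUIDs. Note that a disk that has been physically removed may be listed by a numeric GUID instead of a gptid, and this filter would not catch that case.

```shell
#!/bin/sh
# Print every gptid member whose state is not ONLINE.
# Expects `zpool status` text on stdin. (failed_members is a hypothetical helper.)
failed_members() {
    awk '$1 ~ /^gptid\// && $2 != "ONLINE" { print $1, $2 }'
}

# Made-up sample shaped like `zpool status` output; labels are placeholders.
sample='  pool: Athene
 state: DEGRADED
config:

    NAME              STATE     READ WRITE CKSUM
    Athene            DEGRADED     0     0     0
      mirror-0        DEGRADED     0     0     0
        gptid/aaaa    ONLINE       0     0     0
        gptid/bbbb    FAULTED      0     0     0  too many errors'

printf '%s\n' "$sample" | failed_members
```

On the NAS itself you would pipe the live output instead: `zpool status Athene | failed_members`.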

After that, you can find which device is behind this gptid.
One way to do that is with disklist.pl, downloadable here: https://github.com/nephri/FreeNas-DiskList

You can print all the information like this:

Code:
./disklist.pl -all -i:gptid gptid/xxxxxxxxxxxxxx


That will give you the device, the serial number of the disk, and the SAS enclosure position if it's a SAS device.

PS: disklist.pl may be unable to get disk information if the disk doesn't respond at all.
But in that case you will have some difficulty identifying the disk anyway.

I hope this helps.
 
Joined
Aug 25, 2014
Messages
86
Thanks
3
#5
Do you use the commands from the FreeNAS Shell?

It looks like a shutdown after 934 days to replace one of the two USB flash drives I use to boot.
 
Joined
Aug 25, 2014
Messages
86
Thanks
3
#7
Things are getting worse fast. Today I am logged into the FreeNAS console, and when I click on Storage and then on View Disks the page is blank.
 

Redcoat

FreeNAS Expert
Joined
Feb 18, 2014
Messages
1,224
Thanks
314
#8
But the pool data is still available on the network?
 

nephri

FreeNAS Aware
Joined
Sep 20, 2015
Messages
37
Thanks
18
#11
As said in a previous answer here, you can download and use disklist.pl.

But if you prefer to do it with native tools, you can do it from the shell:

Code:
> glabel status
                                      Name  Status  Components
gptid/8b325d19-482c-11e5-99e5-0cc47a18b26c     N/A  da0p2
gptid/e2e06a8c-21d4-11e6-a7ff-0cc47a320ec8     N/A  da1p2
...


The da0p2 tells you that this is partition p2 of the device da0.

Now you can ask for information about the device da0:

Code:
> geom disk list da0
Geom name: da0
Providers:
1. Name: da0
   Mediasize: 100030242816 (93G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e3
   descr: INTEL SSDSC2BA100G3
   lunid: 55cd2e404b645783
   ident: BTTVXXXXXXVD100FGN
   rotationrate: 0
   fwsectors: 63
   fwheads: 16


The ident is the serial number of the underlying disk.
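The two lookups above can be chained. This is only a sketch: `gptid_to_disk` is a hypothetical helper name, and it assumes the three-column `glabel status` layout shown earlier (Name, Status, Components). It reads `glabel status` text on stdin and prints the disk name with the partition suffix stripped, so the result can be handed to `geom disk list`.

```shell
#!/bin/sh
# Map a gptid label to its parent disk (e.g. da1p2 -> da1).
# $1 = gptid label; `glabel status` text is read from stdin.
# (gptid_to_disk is a hypothetical helper, not a FreeNAS command.)
gptid_to_disk() {
    awk -v id="$1" '$1 == id { sub(/p[0-9]+$/, "", $3); print $3 }'
}

# Sample rows copied from the glabel output above.
labels='gptid/8b325d19-482c-11e5-99e5-0cc47a18b26c     N/A  da0p2
gptid/e2e06a8c-21d4-11e6-a7ff-0cc47a320ec8     N/A  da1p2'

printf '%s\n' "$labels" | gptid_to_disk gptid/e2e06a8c-21d4-11e6-a7ff-0cc47a320ec8
```

On the live system this would look like `geom disk list "$(glabel status | gptid_to_disk gptid/xxxxxxxxxxxxxx)"`, with the gptid taken from `zpool status`.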

You can also use smartctl to get the serial number, but if the disk has failed, I think (without being sure) that smartctl is more likely to fail than geom disk list.

Code:
> smartctl -i /dev/da5
 
Joined
Aug 25, 2014
Messages
86
Thanks
3
#13
Good morning all. After work last night I shut down all of my VMs and my FreeNAS SAN and pulled all 30 FreeNAS drives, one at a time. I took a picture showing each drive's serial number and noted the drive bay of each drive. Then of course I replaced the drive with the serial number identified with help from all of you.

It appears that I did not solve my problem. Today I have a different error message: CRITICAL: The volume Athene (ZFS) state is DEGRADED: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state.

I presume there must be something I need to do to bring the new HDD into the 20 HDD fold? The 4TB SAS HDD I added when I removed the bad drive was 1 of 21 when I built my SAN. The serial numbers are almost sequential so compatibility will not be a problem.

What is the next step to get this drive recognized and brought into the SAN as usable storage space?

Thank you to everyone!
 

nephri

FreeNAS Aware
Joined
Sep 20, 2015
Messages
37
Thanks
18
#15
When you removed the bad drive and added the new one, FreeNAS did not automatically treat it as a "replacement".
So now you have to tell FreeNAS that the new disk will replace the removed one.

Once that is done, ZFS will start the resilvering process on the new disk.
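To confirm that the resilver actually started, you can look at the "scan:" line of `zpool status`. The sketch below classifies that line; `resilver_state` is a hypothetical helper, and it only looks for the phrases "resilver in progress" and "resilvered", which is what I would expect the scan line to contain. Treat it as an assumption and compare against the real output on your version.

```shell
#!/bin/sh
# Classify the "scan:" line of `zpool status` text read on stdin.
# (resilver_state is a hypothetical helper, not a ZFS command.)
resilver_state() {
    awk '/^ *scan:/ {
        if ($0 ~ /resilver in progress/) { print "resilvering"; found = 1 }
        else if ($0 ~ /resilvered/)      { print "resilvered";  found = 1 }
        exit
    }
    END { if (!found) print "no resilver reported" }'
}

# Made-up sample line; the date is a placeholder.
printf '  scan: resilver in progress since Thu Jul  4 10:00:00 2019\n' | resilver_state
```

On the NAS: `zpool status Athene | resilver_state`. The pool stays DEGRADED until the resilver completes, so seeing DEGRADED right after the replace is expected.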

I don't have a FreeNAS box available where I am, so I cannot give more precise details on the UI, but from memory you will find it under the Storage menu.
 
Joined
Sep 20, 2015
Messages
37
Thanks
18
#17
From memory, you would click "Replace" on the partition that shows "REMOVED".
Under mirror-0 you should have at least 2 partitions:
- One in the ONLINE state, which is a disk that is present and OK
- One in the REMOVED state, which is the removed disk that you want to replace with the new one
 
Joined
Sep 20, 2015
Messages
37
Thanks
18
#19
Yes it is! Select this line, click Replace and select your new drive. (It will show only unused disks, so it will probably show only one.)
 