Dell R720XD / H310 in IT Mode / Identify Drive

dmburgess

Cadet
Joined
Jul 11, 2019
Messages
6
I have a 20-disk pool in an R720XD using an H310 in IT mode. Everything has been fine so far. This morning I got an e-mail telling me it's degraded. I have replacement drives standing by, but I have NO CLUE which drive is in which bay. :( Guess I should have noted them down. I use this for my production iSCSI storage, so it's not really possible to shut it down.

I understand that da19 is the device that is offline. I have the gptid and ran a script that is supposed to show which serials they are.

Code:
 pool: pool1
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 25.5G in 0 days 00:19:04 with 0 errors on Mon Jul 29 10:00:33 2019
config:

        NAME                                              STATE     READ WRITE CKSUM
        pool1                                             DEGRADED     0     0     0
          raidz3-0                                        DEGRADED     0     0     0
            gptid/f6bbd828-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/f7c2452d-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/f8c8f66b-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/f9d3e76e-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/fae07f00-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/fbed5ba0-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/fd145f9c-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/fe27b8e7-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/ff3df68a-ad6b-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/00582c7b-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/017430b7-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/02933c5c-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/03b38d92-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/04d424b5-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/05f6ef71-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/073351c5-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            gptid/085a1e58-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
            spare-17                                      DEGRADED     0     0     0
              15896693473823712310                        OFFLINE      6     0     0  was /dev/gptid/09851bce-ad6c-11e9-b966-bc305bf2c008
              gptid/0c373339-ad6c-11e9-b966-bc305bf2c008  ONLINE       0     0     0
            gptid/0aaf7b5f-ad6c-11e9-b966-bc305bf2c008    ONLINE       0     0     0
        spares
          10396465602319241838                            INUSE     was /dev/gptid/0c373339-ad6c-11e9-b966-bc305bf2c008

errors: No known data errors


I ran the drives script, but it doesn't come back with serials:

Code:
+--------+--------------------------------------------+-----------------+
| da3    | gptid/f7c2452d-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da4    | gptid/f8c8f66b-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da5    | gptid/f9d3e76e-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da6    | gptid/fae07f00-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da7    | gptid/fbed5ba0-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da8    | gptid/fd145f9c-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da9    | gptid/fe27b8e7-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da10   | gptid/ff3df68a-ad6b-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da11   | gptid/00582c7b-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da12   | gptid/017430b7-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da13   | gptid/02933c5c-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da14   | gptid/03b38d92-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da15   | gptid/04d424b5-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da16   | gptid/05f6ef71-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da17   | gptid/073351c5-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da18   | gptid/085a1e58-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da19   | gptid/09851bce-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da20   | gptid/0aaf7b5f-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+
| da21   | gptid/0c373339-ad6c-11e9-b966-bc305bf2c008 |                 |
+--------+--------------------------------------------+-----------------+


When I open my web browser, go to zpool status, and click Edit on da19, I get 19042910240381 as the serial number. I don't know if that is the actual serial number; just making sure. If I replace the drive with that serial number, that should work. Since I don't know which serial is in which bay, is my only option to do a 3 a.m. shutdown, remove each drive, record which serial number is in which bay, then find that serial number and replace it? Any other options? Note this is a RAIDZ3 system; I thought I had at least one online spare, but my assumption is that I cannot just pull the drive I think it is and expect the RAIDZ to keep working? If it's not that drive, how long do I have to wait until I pull the next one? Just wondering.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Did you mark the outside of the drive trays before you put them in the system? I am not sure how the serial number helps you.
You could try to turn on the indicator light, but I don't know if that is supported on the Dell disk backplane. I use this to find drives in my Chenbro and Supermicro chassis systems and even in the QNAP SAS enclosures, but it will not work if the drive is no longer detected.
At the command line, type sesutil locate da19 on and the red LED associated with the drive should start flashing. Just repeat it with off to turn the LED off when you are done. You can turn all the other LEDs on if the one you are looking for isn't detected and look for the one that isn't flashing. I have had to do that when a drive failed in such a way that it was not detected by the system.
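Something like this (sesutil ships with FreeBSD/FreeNAS; whether the locate LED actually lights depends on what the backplane supports):

Code:
sesutil map                 # list enclosure slots and which da device sits in each
sesutil locate da19 on      # start blinking the locate LED for da19
sesutil locate da19 off     # turn it back off when you are done
sesutil locate all on       # light everything, then look for the one that is NOT blinking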
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I ran the drives script, but it doesn't come back with serials:
Which script are you running?
I find that this one works well:

Utility: disklist.pl, for listing partition, gptid, slot, devices, disktype, serial num, & multipath
https://www.ixsystems.com/community...ktype-serial-num-multipath.59319/#post-421424
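If you just want a quick device-to-serial mapping in the meantime, a loop like this from the shell should do it (smartctl ships with FreeNAS; the exact "Serial Number" line varies between SATA and SAS drives):

Code:
for disk in $(sysctl -n kern.disks); do
    printf '%s: ' "${disk}"
    smartctl -i /dev/${disk} | grep -i 'serial number'
done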

Since I don't know which serial is in which bay, is my only option to do a 3 a.m. shutdown, remove each drive, record which serial number is in which bay, then find that serial number and replace it?
No, that is not the only option.

When I open my web browser, go to zpool status, and click Edit on da19, I get 19042910240381 as the serial number.
What version of FreeNAS are you using?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Note this is a RAIDZ3 system; I thought I had at least one online spare, but my assumption is that I cannot just pull the drive I think it is and expect the RAIDZ to keep working?
Your spare is in use and the failed drive is offline. You still have three drives of redundancy, which is way, way overkill: three drives of redundancy AND a hot spare AND cold spares. If it were me, I would have gone with RAIDZ2, and you also should have gone with two vdevs instead of putting all those drives into a single vdev. You sacrifice a lot of performance and use the same number of drives that you would have with two RAIDZ2 vdevs.
I use this for my production iSCSI storage, so it's not really possible to shut it down.
This configuration must really perform like a dog with only one vdev of RAIDz3 being used for iSCSI. This is totally not the way it should be done if you need any decent performance.

You can pull a drive out, look at the serial number, and put it back in the chassis, and ZFS will automatically resilver it back into the pool in a matter of minutes. With RAIDZ3, you should have no problem with that.
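If you would rather do it by hand than wait for ZFS to pick the disk up again, the rough sequence is something like this (pool name and gptid taken from your zpool status output above; the numeric guid shown there works as the identifier too):

Code:
zpool offline pool1 gptid/09851bce-ad6c-11e9-b966-bc305bf2c008   # take the suspect disk out of service before pulling it
# ...reseat the drive, note the serial number...
zpool online pool1 gptid/09851bce-ad6c-11e9-b966-bc305bf2c008    # bring it back; the resilver starts automatically
zpool status pool1                                               # watch the resilver progress
zpool clear pool1                                                # clear the old error counters once it is healthy again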
 

dmburgess

Cadet
Joined
Jul 11, 2019
Messages
6
Guess I don't understand the difference between two vdevs and one? In my case I am using 99% of this unit for iSCSI; I currently have one iSCSI partition. Maybe you can educate me, or point me in the right direction on what I should be doing differently and why? Currently I am getting plenty of speed out of this compared to my old NAS, but my ignorance could be limiting me. Using ESXi. Currently running FreeNAS 11.2.

 

dmburgess

Cadet
Joined
Jul 11, 2019
Messages
6
So I offlined the offending drive, da19. I then unplugged it, reinserted it, and onlined the drive, and since then the state has been healthy. However, I have been getting SMART errors telling me that ALL of the drives (for the most part) are NOT capable of SMART self-check. They are all 1TB SSDs.
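For the SMART warnings, I figure checking one of the SSDs directly with smartctl should show whether self-tests are actually supported (da19 here just as an example):

Code:
smartctl -i /dev/da19    # identity, including the serial number
smartctl -c /dev/da19    # capabilities, including self-test support
smartctl -H /dev/da19    # overall health assessment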

 

dmburgess

Cadet
Joined
Jul 11, 2019
Messages
6
Turned out the H310 was not even seated in the on-board slot, just sitting on top of it. :( Got it seated and powered back on: no errors, healthy pool, etc. :)
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Turned out the H310 was not even seated in the on-board slot, just sitting on top of it. :( Got it seated and powered back on: no errors, healthy pool, etc. :)
Wow. I can see how a loose connection would give you intermittent behavior.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Guess I don't understand the difference between two vdevs and one?
You might want to look at this presentation; even though it is old, it has some good foundational information that might help you:

Slideshow explaining VDev, zpool, ZIL and L2ARC
https://www.ixsystems.com/community...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/
In my case I am using 99% of this unit for iSCSI; I currently have one iSCSI partition. Maybe you can educate me, or point me in the right direction on what I should be doing differently and why?
The limitation of ZFS, generally (very roughly) speaking, is that pool IOPS equal the IOPS of a single drive per vdev, multiplied by the number of vdevs. That means your pool probably gives you about the performance of a single SSD of the type the pool is built from. Usually, a single SSD is not as fast as 10Gb networking. SSDs are fast though, so you might be fine with just two vdevs, but I would probably go for a minimum of three. Since SSDs are also pretty reliable, I might even go with four RAIDZ1 vdevs of five drives each. Having four vdevs would give you four times as many IOPS as you have now while still keeping a fairly significant amount of capacity. Out of 20 drives, 16 would be data and four would be redundancy, which is effectively what you are rolling with now, just distributed differently to give you four vdevs instead of one.
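A minimal sketch of what that 4 x RAIDZ1 layout would look like from the command line (device and pool names are hypothetical; on FreeNAS you would normally build this in the GUI so the disks get partitioned and labeled by gptid):

Code:
# Hypothetical 20-disk layout: four RAIDZ1 vdevs of five disks each.
zpool create tank \
    raidz1 da2  da3  da4  da5  da6  \
    raidz1 da7  da8  da9  da10 da11 \
    raidz1 da12 da13 da14 da15 da16 \
    raidz1 da17 da18 da19 da20 da21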

General guidance for iSCSI, and I am sure that @jgreco will back me up on this, is to use mirrored vdevs, but that would give you significantly less usable space in your pool. AND the guidance for iSCSI is that you should not fill the pool beyond 50% because doing so significantly hurts performance. I have not had the opportunity to use SSDs for iSCSI, so I can't speak from experience, but I would think that SSDs give you a bit more latitude in that area because of their ability to randomly access any block without the seeking that mechanical disks need to do.
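For comparison, the mirrored layout with the same 20 disks would be ten 2-way mirror vdevs, roughly like this (again hypothetical device and pool names):

Code:
# Ten 2-way mirrors: roughly 10x the IOPS of a single drive,
# at the cost of half the raw capacity (before the ~50%-full guideline for iSCSI).
zpool create tank \
    mirror da2  da3  \
    mirror da4  da5  \
    mirror da6  da7  \
    mirror da8  da9  \
    mirror da10 da11 \
    mirror da12 da13 \
    mirror da14 da15 \
    mirror da16 da17 \
    mirror da18 da19 \
    mirror da20 da21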

Here are links to the best guidance I have available for you:

10 Gig Networking Primer
https://www.ixsystems.com/community/resources/10-gig-networking-primer.42/

Why iSCSI often requires more resources for the same result (block storage)
https://www.ixsystems.com/community...res-more-resources-for-the-same-result.28178/

Some differences between RAIDZ and mirrors, and why we use mirrors for block storage (iSCSI)
https://www.ixsystems.com/community...and-why-we-use-mirrors-for-block-storage.112/

The ZFS ZIL and SLOG Demystified
https://www.ixsystems.com/blog/zfs-zil-and-slog-demystified/

Some insights into SLOG/ZIL with ZFS on FreeNAS
https://www.ixsystems.com/community/threads/some-insights-into-slog-zil-with-zfs-on-freenas.13633/

Testing the benefits of SLOG using a RAM disk!
https://www.ixsystems.com/community/threads/testing-the-benefits-of-slog-using-a-ram-disk.56561/

SLOG benchmarking and finding the best SLOG
https://www.ixsystems.com/community/threads/slog-benchmarking-and-finding-the-best-slog.63521/
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Currently I am getting plenty of speed out of this compared to my old NAS, but my ignorance could be limiting me. Using ESXi. Currently running FreeNAS 11.2.
How much speed is plenty?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
PS. iXsystems sells this system: https://www.ixsystems.com/freenas-centurion/

and these are some of the configurations they suggest:
[attached image: 1564531330105.png, screenshot of the suggested configurations]
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
General guidance for iSCSI, and I am sure that @jgreco will back me up on this, is to use mirrored vdevs, but that would give you significantly less usable space in your pool

It might or might not give you significantly less usable space, depending on the specifics of the pool design and block sizes. This is one of the reasons not to use RAIDZ - you can totally screw up and get it to use bad amounts of overhead space if you do it wrong. Mirrors consume a fixed amount.
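As a rough worked example of how that goes wrong (my arithmetic, assuming 4K sectors / ashift=12 and the 19-wide RAIDZ3 vdev shown above):

Code:
# RAIDZ allocates, per block: data sectors + p parity sectors per stripe row,
# then pads the total up to a multiple of (p + 1).
#
# 19-wide RAIDZ3, 4K sectors, 16K zvol blocks:
#   4 data + 3 parity = 7 sectors, padded to 8   -> 32K on disk for 16K of data (100% overhead)
# Same vdev, 128K blocks:
#   32 data + 6 parity = 38 sectors, padded to 40 -> 160K for 128K of data (~25% overhead)
# 2-way mirrors: always exactly 2x, regardless of block size.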
 