calidancer
Hi Team,
Apologies if this post already has an answer in this community: I searched before posting and found several posts, but nothing really helpful.
ENVIRONMENT:
- Data Store Server: Supermicro X8DTU-6+
- Number of drives: 8 x 1TB (932G)
- OS: TrueNAS-12.0-U5.1 (Core edition)
- OS running on an on-board SATADOM disk configured in RAID1
- Storage Pools: 2
- First Storage Pool:
- Name: fs-pool
- RAID Type: Mirror
- Number of disks: 2
- Members: mfid0p2, mfid1p2
- Second Storage Pool:
- Name: vms-pool
- RAID Type: RAIDZ1
- Number of disks: 6
- Members: mfid7p2, mfid6p2, mfid2p2, mfid4p2, mfid3p2, mfid5p2
- Disks are managed by a RAID card, hence not much information is visible from the TrueNAS GUI or from the FreeBSD CLI
ISSUE:
A few days ago I received the following alert:
Pool vms-pool state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected
I'm currently unable to identify the physical location of the affected disk in order to replace it.
TROUBLESHOOTING:
- From a first check in the GUI I can see a single error for the mfid5p2 disk, under the Checksum column only (I tried to post a screenshot here but the forum does not allow me to do so)
- I could click the menu (3 vertical dots) next to the affected disk, click Offline and follow the procedure indicated on this page; however, I can't identify the exact physical location of this affected disk
- From the OS CLI I obtained the following information:
Code:
root@ds[~]# zpool status -v
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:12 with 0 errors on Wed Jan  4 03:45:12 2023
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          ada0p2    ONLINE       0     0     0

errors: No known data errors

  pool: fs-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:25:48 with 0 errors on Sun Jan  1 00:25:50 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        fs-pool                                         ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/6681fcf6-7454-11ec-9adc-00259072914e  ONLINE       0     0     0
            gptid/6684ef93-7454-11ec-9adc-00259072914e  ONLINE       0     0     0

errors: No known data errors

  pool: vms-pool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 92.5M in 00:00:11 with 0 errors on Tue Jan 10 12:34:53 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        vms-pool                                        ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/eabb0d80-6eff-11ec-a73e-00259072914e  ONLINE       0     0     0
            gptid/ead9c3db-6eff-11ec-a73e-00259072914e  ONLINE       0     0     0
            gptid/eb020189-6eff-11ec-a73e-00259072914e  ONLINE       0     0     0
            gptid/eb140120-6eff-11ec-a73e-00259072914e  ONLINE       0     0     0
            gptid/eb13b166-6eff-11ec-a73e-00259072914e  ONLINE       0     0     0
            gptid/eb248fcd-6eff-11ec-a73e-00259072914e  ONLINE       0     0     1
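- For reference, the two actions suggested by the "action:" line above would look roughly like the sketch below; the replacement device name (mfid8p2) is only a placeholder and I have not run either command yet:
Code:
root@ds[~]# zpool clear vms-pool                                                         # reset the error counters if the disk is deemed healthy
root@ds[~]# zpool replace vms-pool gptid/eb248fcd-6eff-11ec-a73e-00259072914e mfid8p2   # mfid8p2 is a hypothetical new device; normally done from the GUI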
Code:
+========+==========================+======================+============================================+
| Device | DISK DESCRIPTION         | SERIAL NUMBER        | GPTID                                      |
+========+==========================+======================+============================================+
| mfid0  |                          |                      | gptid/6681fcf6-7454-11ec-9adc-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| mfid1  |                          |                      | gptid/6684ef93-7454-11ec-9adc-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| mfid2  |                          |                      | gptid/eb020189-6eff-11ec-a73e-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| mfid3  |                          |                      | gptid/eb13b166-6eff-11ec-a73e-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| mfid4  |                          |                      | gptid/eb140120-6eff-11ec-a73e-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| mfid5  |                          |                      | gptid/eb248fcd-6eff-11ec-a73e-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| mfid6  |                          |                      | gptid/ead9c3db-6eff-11ec-a73e-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| mfid7  |                          |                      | gptid/eabb0d80-6eff-11ec-a73e-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
| ada0   | SATADOM D150QV-L         | 20111213AA0000000035 | gptid/f3d28c0a-01dc-11ec-8d24-00259072914e |
+--------+--------------------------+----------------------+--------------------------------------------+
Code:
root@ds[~]# glabel status
                                       Name  Status  Components
gptid/6681fcf6-7454-11ec-9adc-00259072914e     N/A  mfid0p2
gptid/6684ef93-7454-11ec-9adc-00259072914e     N/A  mfid1p2
gptid/eb020189-6eff-11ec-a73e-00259072914e     N/A  mfid2p2
gptid/eb13b166-6eff-11ec-a73e-00259072914e     N/A  mfid3p2
gptid/eb140120-6eff-11ec-a73e-00259072914e     N/A  mfid4p2
gptid/eb248fcd-6eff-11ec-a73e-00259072914e     N/A  mfid5p2
gptid/ead9c3db-6eff-11ec-a73e-00259072914e     N/A  mfid6p2
gptid/eabb0d80-6eff-11ec-a73e-00259072914e     N/A  mfid7p2
gptid/f3d28c0a-01dc-11ec-8d24-00259072914e     N/A  ada0p1
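- To double-check a single gptid-to-partition mapping, I assume something like the sketch below should work (I am taking the rawuuid reported by gpart to be the same identifier that glabel shows, which I have not confirmed):
Code:
root@ds[~]# gpart list mfid5 | egrep 'Name:|rawuuid'   # list partition names and their raw UUIDs on the mfid5 volume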
Code:
root@ds[~]# smartctl --scan-open
/dev/pass0 -d scsi # /dev/pass0, SCSI device
/dev/pass1 -d scsi # /dev/pass1, SCSI device
/dev/pass2 -d scsi # /dev/pass2, SCSI device
/dev/pass3 -d scsi # /dev/pass3, SCSI device
/dev/pass4 -d scsi # /dev/pass4, SCSI device
/dev/pass5 -d scsi # /dev/pass5, SCSI device
/dev/pass6 -d scsi # /dev/pass6, SCSI device
/dev/pass7 -d scsi # /dev/pass7, SCSI device
/dev/ada0 -d atacam # /dev/ada0, ATA device
/dev/ses0 -d atacam # /dev/ses0, ATA device
Code:
Filesystem                                                  Type     Size  Used  Avail  Capacity  Mounted on
boot-pool/ROOT/default                                      zfs       14G  1.2G    13G        8%  /
devfs                                                       devfs    1.0K  1.0K     0B      100%  /dev
tmpfs                                                       tmpfs     32M   10M    22M       33%  /etc
tmpfs                                                       tmpfs    4.0M  8.0K   4.0M        0%  /mnt
tmpfs                                                       tmpfs     48G   27M    48G        0%  /var
fdescfs                                                     fdescfs  1.0K  1.0K     0B      100%  /dev/fd
vms-pool                                                    zfs      2.8T  154K   2.8T        0%  /mnt/vms-pool
fs-pool                                                     zfs      763G  112K   763G        0%  /mnt/fs-pool
fs-pool/iocage                                              zfs      763G   11M   763G        0%  /mnt/fs-pool/iocage
fs-pool/usersdropbox                                        zfs      763G  446M   763G        0%  /mnt/fs-pool/usersdropbox
fs-pool/fileserver                                          zfs      899G  136G   763G       15%  /mnt/fs-pool/fileserver
fs-pool/iocage/download                                     zfs      763G   96K   763G        0%  /mnt/fs-pool/iocage/download
fs-pool/iocage/releases                                     zfs      763G   96K   763G        0%  /mnt/fs-pool/iocage/releases
fs-pool/iocage/jails                                        zfs      763G   96K   763G        0%  /mnt/fs-pool/iocage/jails
fs-pool/iocage/templates                                    zfs      763G   96K   763G        0%  /mnt/fs-pool/iocage/templates
fs-pool/iocage/images                                       zfs      763G   96K   763G        0%  /mnt/fs-pool/iocage/images
fs-pool/iocage/log                                          zfs      763G   96K   763G        0%  /mnt/fs-pool/iocage/log
fs-pool/usersdropbox/ivano_buffa                            zfs      763G   96K   763G        0%  /mnt/fs-pool/usersdropbox/user1_name_not_shown_here_on_purpos
fs-pool/usersdropbox/alex_knight                            zfs      763G   96K   763G        0%  /mnt/fs-pool/usersdropbox/user2_name_not_shown_here_on_purpose
vms-pool/.system                                            zfs      2.8T  205K   2.8T        0%  /var/db/system
vms-pool/.system/cores                                      zfs      1.0G  154K   1.0G        0%  /var/db/system/cores
vms-pool/.system/samba4                                     zfs      2.8T  735K   2.8T        0%  /var/db/system/samba4
vms-pool/.system/syslog-676fc17d05c04653ab8528faa22c6340    zfs      2.8T  1.7M   2.8T        0%  /var/db/system/syslog-676fc17d05c04653ab8528faa22c6340
vms-pool/.system/rrd-676fc17d05c04653ab8528faa22c6340       zfs      2.8T   62M   2.8T        0%  /var/db/system/rrd-676fc17d05c04653ab8528faa22c6340
vms-pool/.system/configs-676fc17d05c04653ab8528faa22c6340   zfs      2.8T   52M   2.8T        0%  /var/db/system/configs-676fc17d05c04653ab8528faa22c6340
vms-pool/.system/webui                                      zfs      2.8T  154K   2.8T        0%  /var/db/system/webui
vms-pool/.system/services                                   zfs      2.8T  154K   2.8T        0%  /var/db/system/services
Code:
root@ds[~]# mfiutil show drives
mfi0 Physical Drives:
16 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19VN0> SCSI-6 E1:S4
17 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19VP4> SCSI-6 E1:S5
20 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05AW1> SAS E1:S0
21 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19TW2> SCSI-6 E1:S1
22 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05GJZ> SAS E1:S2
23 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05HH9> SAS E1:S3
24 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19V89> SCSI-6 E1:S6
25 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05A3F> SAS E1:S7
root@ds[~]# mfiutil show config
mfi0 Configuration: 8 arrays, 8 volumes, 0 spares
    array 0 of 1 drives:
        drive 20 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05AW1> SAS
    array 1 of 1 drives:
        drive 21 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19TW2> SCSI-6
    array 2 of 1 drives:
        drive 22 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05GJZ> SAS
    array 3 of 1 drives:
        drive 23 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05HH9> SAS
    array 4 of 1 drives:
        drive 16 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19VN0> SCSI-6
    array 5 of 1 drives:
        drive 17 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19VP4> SCSI-6
    array 6 of 1 drives:
        drive 24 (  932G) ONLINE <SEAGATE ST91000640SS 0002 serial=9XG19V89> SCSI-6
    array 7 of 1 drives:
        drive 25 (  932G) ONLINE <SEAGATE ST91000640SS 0001 serial=9XG05A3F> SAS
    volume mfid0 (930G) RAID-0 64K OPTIMAL spans:
        array 0
    volume mfid1 (930G) RAID-0 64K OPTIMAL spans:
        array 1
    volume mfid2 (930G) RAID-0 64K OPTIMAL spans:
        array 2
    volume mfid3 (930G) RAID-0 64K OPTIMAL spans:
        array 3
    volume mfid4 (930G) RAID-0 64K OPTIMAL spans:
        array 4
    volume mfid5 (930G) RAID-0 64K OPTIMAL spans:
        array 5
    volume mfid6 (930G) RAID-0 64K OPTIMAL spans:
        array 6
    volume mfid7 (930G) RAID-0 64K OPTIMAL spans:
        array 7
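- If I read these two outputs together correctly, each mfidN volume spans exactly one array, each array holds exactly one physical drive, and the drive line carries both the serial number and the E#:S# enclosure/slot, so the lookup chain would be roughly the sketch below (X and YY are placeholders; this is only my own reading of the output, not something I have verified):
Code:
# hypothetical chain: mfidN -> "spans: array X" -> "drive YY serial=..." -> "E#:S#"
root@ds[~]# mfiutil show config | grep -A1 'array X'   # X  = the array spanned by the mfidN volume
root@ds[~]# mfiutil show drives | grep '^ *YY '        # YY = the drive number found above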
- I can also execute commands like the one below to get full information about any device, including its serial number (for instance /dev/pass5):
Code:
root@ds[~]# smartctl -d scsi /dev/pass5 -i
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST91000640SS
Revision:             0001
Compliance:           SPC-3
User Capacity:        1,000,204,886,016 bytes [1.00 TB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Form Factor:          2.5 inches
Logical Unit id:      0x5000c50033e5f2bb
Serial number:        9XG05HH900009133SQE3
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Jan 10 11:29:59 2023 GMT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
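- I assume I could also loop over all the pass devices to collect every serial number in one go, with something like the sketch below (written for a Bourne-style/zsh shell; not tested yet):
Code:
root@ds[~]# for d in /dev/pass0 /dev/pass1 /dev/pass2 /dev/pass3 /dev/pass4 /dev/pass5 /dev/pass6 /dev/pass7; do echo "=== $d ==="; smartctl -d scsi -i $d | egrep 'Product|Serial number'; done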
- And, if needed, I can even turn on the locate LED of any disk (for instance with the following command):
Code:
mfiutil locate "E1:S4" on
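- So, once I know the right E#:S# slot for the affected disk, I assume I could confirm the physical bay like this (E1:S5 below is only a placeholder, not a confirmed answer):
Code:
root@ds[~]# mfiutil locate E1:S5 on    # blink the locate LED of that bay
root@ds[~]# mfiutil locate E1:S5 off   # stop blinking once the bay has been found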
- Based on the above I can conclude that the affected disk is the one below; however, I can't find its serial number or physical location:
- Component: mfid5p2 (which means disk mfid5 partition 2)
- Name: gptid/eb248fcd-6eff-11ec-a73e-00259072914e
- Serial Number: COULD NOT FIND IT
- Physical location: COULD NOT FIND IT
QUESTIONS:
- Because of the nature of the error (checksum), and considering that every disk appears online and optimal, is this something I should be concerned about? Should I replace the disk anyway? Please note that, although every disk appears online and optimal, in the GUI the vms-pool pool itself appears Unhealthy
- I would highly appreciate it if someone could show how to correlate the above information with the physical location of each disk (or at least the affected one)
Thanks in advance.