razvanc.mobile
Hey guys,
I'm having a small issue with my FreeNAS server, which runs FreeNAS-9.3-STABLE-201509282017.
My hardware is:
SR2612UR chassis with 12 hot-swap 3.5" drive cages
Single L5630 CPU
12 GB ECC RAM
Intel RS2WC040 RAID controller (LSI2008)
12 x 1 TB HDD (will post the make and model of each)
The controller is not a true HBA (I mean it has the original Intel firmware, not reflashed to IT mode), but it has the option of exporting the drives as JBOD.
This is my RAIDZ1 pool. The server is used only for backups (it has an 8 TB iSCSI target exported to a Windows machine running Veeam B&R, and a smaller dataset exported over NFS to a Proxmox 4.0 server).
Code:
  pool: storage
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/a6a38928-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/a78b17c9-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/a85f63ae-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/a8f0e77a-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/a984d1a4-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/aa34a38b-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/aaf61800-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/abb6de05-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/ac9cb3dc-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/ad833005-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
            gptid/ae56d58c-75a5-11e5-9681-001e6728a258  ONLINE       0     0     1
            gptid/af1aab04-75a5-11e5-9681-001e6728a258  ONLINE       0     0     0
I just saw a message about this error in the UI today.
As far as I can see, it's a checksum error on one of the drives.
I'm trying to identify the drive like this:
Code:
camcontrol devlist
<ATA ST31000524NS SN12>            at scbus0 target 14 lun 0 (pass0)
<ATA ST1000NC001-1DY1 CN01>        at scbus0 target 15 lun 0 (pass1)
<ATA ST1000NC001-1DY1 CN01>        at scbus0 target 16 lun 0 (pass2)
<ATA ST31000524NS SN12>            at scbus0 target 17 lun 0 (pass3)
<INTEL SR2612UR I106>              at scbus0 target 18 lun 0 (ses0,pass4)
<ATA WDC WD1002FAEX-0 1D05>        at scbus0 target 19 lun 0 (pass5)
<ATA Hitachi HUA72201 A3EA>        at scbus0 target 20 lun 0 (pass6)
<ATA Hitachi HUA72201 A3EA>        at scbus0 target 21 lun 0 (pass7)
<ATA Hitachi HUA72201 A3EA>        at scbus0 target 22 lun 0 (pass8)
<ATA ST31000528AS CC38>            at scbus0 target 23 lun 0 (pass9)
<ATA Hitachi HUA72201 A3EA>        at scbus0 target 24 lun 0 (pass10)
<ATA Hitachi HUA72201 A3EA>        at scbus0 target 25 lun 0 (pass11)
<ATA Hitachi HUA72201 A3EA>        at scbus0 target 26 lun 0 (pass12)
<Kingston DataTraveler 2.0 PMAP>   at scbus2 target 0 lun 0 (pass13,da0)
Code:
glabel status
                                      Name  Status  Components
gptid/a85f63ae-75a5-11e5-9681-001e6728a258     N/A  mfisyspd0p2
gptid/a8f0e77a-75a5-11e5-9681-001e6728a258     N/A  mfisyspd1p2
gptid/a984d1a4-75a5-11e5-9681-001e6728a258     N/A  mfisyspd2p2
gptid/aa34a38b-75a5-11e5-9681-001e6728a258     N/A  mfisyspd3p2
gptid/aaf61800-75a5-11e5-9681-001e6728a258     N/A  mfisyspd4p2
gptid/abb6de05-75a5-11e5-9681-001e6728a258     N/A  mfisyspd5p2
gptid/ac9cb3dc-75a5-11e5-9681-001e6728a258     N/A  mfisyspd6p2
gptid/ad833005-75a5-11e5-9681-001e6728a258     N/A  mfisyspd7p2
gptid/ae56d58c-75a5-11e5-9681-001e6728a258     N/A  mfisyspd8p2
gptid/af1aab04-75a5-11e5-9681-001e6728a258     N/A  mfisyspd9p2
gptid/a6a38928-75a5-11e5-9681-001e6728a258     N/A  mfisyspd10p2
gptid/a78b17c9-75a5-11e5-9681-001e6728a258     N/A  mfisyspd11p2
gptid/83846e35-7415-11e5-86c0-001e6728a258     N/A  da0p1
Apparently the issue is with mfisyspd8: the gptid that shows the checksum error (gptid/ae56d58c-...) is on mfisyspd8p2.
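The lookup from the pool listing to the mfisyspd device can also be done mechanically. Below is a minimal Python sketch, assuming the `zpool status` and `glabel status` output has been captured as text; the sample strings are trimmed copies of the listings in this post, and the function names `labels_with_errors` and `label_to_disk` are made up for illustration.

```python
import re

# Rows copied from the zpool status listing in this post (trimmed).
ZPOOL_STATUS = """\
NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
gptid/ad833005-75a5-11e5-9681-001e6728a258 ONLINE 0 0 0
gptid/ae56d58c-75a5-11e5-9681-001e6728a258 ONLINE 0 0 1
gptid/af1aab04-75a5-11e5-9681-001e6728a258 ONLINE 0 0 0
"""

# Rows copied from the glabel status listing in this post (trimmed).
GLABEL_STATUS = """\
Name Status Components
gptid/ad833005-75a5-11e5-9681-001e6728a258 N/A mfisyspd7p2
gptid/ae56d58c-75a5-11e5-9681-001e6728a258 N/A mfisyspd8p2
gptid/af1aab04-75a5-11e5-9681-001e6728a258 N/A mfisyspd9p2
"""


def labels_with_errors(zpool_text):
    """Return gptid labels whose READ, WRITE or CKSUM counter is not 0."""
    bad = []
    for line in zpool_text.splitlines():
        fields = line.split()
        # Expected columns: NAME STATE READ WRITE CKSUM
        if len(fields) == 5 and fields[0].startswith("gptid/"):
            if any(counter != "0" for counter in fields[2:5]):
                bad.append(fields[0])
    return bad


def label_to_disk(glabel_text):
    """Map each gptid label to its disk name, without the partition suffix."""
    mapping = {}
    for line in glabel_text.splitlines():
        fields = line.split()
        # Expected columns: Name Status Components
        if len(fields) == 3 and fields[0].startswith("gptid/"):
            mapping[fields[0]] = re.sub(r"p\d+$", "", fields[2])
    return mapping


disks = label_to_disk(GLABEL_STATUS)
suspects = [disks[label] for label in labels_with_errors(ZPOOL_STATUS)]
print(suspects)  # ['mfisyspd8']
```

This only automates the two lookups already done by hand above; it says nothing yet about which /dev/passN device the disk is.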
Code:
mfiutil show drives
mfi0 Physical Drives:
14 ( 931G) JBOD <ST31000524NS SN12 serial=9WK39MH1> SATA E1:S1
15 ( 931G) JBOD <ST1000NC001-1DY1 CN01 serial=Z1D2V8C6> SATA E1:S2
16 ( 931G) JBOD <ST1000NC001-1DY1 CN01 serial=Z1D2V9GP> SATA E1:S3
17 ( 931G) JBOD <ST31000524NS SN12 serial=9WK3A4YP> SATA E1:S4
19 ( 931G) JBOD <WDC WD1002FAEX-0 1D05 serial=WD-WCATR5822785> SATA E1:S9
20 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9H0N01EMMKV> SATA E1:S10
21 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9H0N01E8GBV> SATA E1:S11
22 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9H0N01EML1V> SATA E1:S12
23 ( 931G) JBOD <ST31000528AS CC38 serial=5VP3W9HN> SATA E1:S5
24 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9L0N10JLJBV> SATA E1:S6
25 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9H0N01ELD1V> SATA E1:S7
26 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9H0N01EM34V> SATA E1:S8
As far as I can see, the disk at target 23 is this one:
23 ( 931G) JBOD <ST31000528AS CC38 serial=5VP3W9HN> SATA E1:S5
which I assume is the same as
<ATA ST31000528AS CC38> at scbus0 target 23 lun 0 (pass9)
So the device with the checksum error is pass9.
Indeed, running smartctl -a -d sat /dev/pass9 shows plenty of SMART errors.
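The same cross-reference between `mfiutil show drives` and `camcontrol devlist` can be sketched in Python. It rests on the assumption made above, that the mfiutil device ID is the same number as the CAM target; this is not confirmed anywhere in this post. The sample text is trimmed from the listings here, and the two regular expressions and function names are illustrative only.

```python
import re

# Rows copied from the camcontrol devlist output in this post (trimmed).
CAMCONTROL_DEVLIST = """\
<INTEL SR2612UR I106> at scbus0 target 18 lun 0 (ses0,pass4)
<ATA Hitachi HUA72201 A3EA> at scbus0 target 22 lun 0 (pass8)
<ATA ST31000528AS CC38> at scbus0 target 23 lun 0 (pass9)
<ATA Hitachi HUA72201 A3EA> at scbus0 target 24 lun 0 (pass10)
"""

# Rows copied from the mfiutil show drives output in this post (trimmed).
MFIUTIL_SHOW_DRIVES = """\
22 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9H0N01EML1V> SATA E1:S12
23 ( 931G) JBOD <ST31000528AS CC38 serial=5VP3W9HN> SATA E1:S5
24 ( 931G) JBOD <Hitachi HUA72201 A3EA serial=JPW9L0N10JLJBV> SATA E1:S6
"""

CAM_RE = re.compile(
    r"<[^>]+>\s+at scbus\d+ target (?P<target>\d+) lun \d+ \((?P<devs>[^)]+)\)"
)
MFI_RE = re.compile(
    r"^\s*(?P<id>\d+)\s+\(.*?\)\s+\S+\s+<.*? serial=(?P<serial>[^>]+)>"
    r"\s+\S+\s+(?P<slot>\S+)"
)


def pass_by_target(devlist_text):
    """Map CAM target number -> passN device name."""
    result = {}
    for line in devlist_text.splitlines():
        match = CAM_RE.search(line)
        if not match:
            continue
        # The device list may hold several names, e.g. "ses0,pass4".
        for dev in match.group("devs").split(","):
            if dev.startswith("pass"):
                result[int(match.group("target"))] = dev
    return result


def drives_by_id(mfiutil_text):
    """Map mfiutil device ID -> (serial number, enclosure:slot)."""
    result = {}
    for line in mfiutil_text.splitlines():
        match = MFI_RE.match(line)
        if match:
            result[int(match.group("id"))] = (
                match.group("serial"),
                match.group("slot"),
            )
    return result


passes = pass_by_target(CAMCONTROL_DEVLIST)
drives = drives_by_id(MFIUTIL_SHOW_DRIVES)
# ASSUMPTION: mfiutil device ID == CAM target number.
for dev_id in sorted(drives):
    serial, slot = drives[dev_id]
    print(dev_id, passes.get(dev_id), serial, slot)
```

For device ID 23 this pairs pass9 with serial 5VP3W9HN in slot E1:S5, which is handy for finding the physical drive in the chassis. It still goes through the target number, so it does not by itself tie an mfisyspdN name to a passN device.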
Which brings me to my two questions:
1. How can I identify, using the mfisyspdXX numbering, which mfisyspd device corresponds to /dev/pass9?
2. I'm unable to use SMART monitoring in the UI; no SMART tests can be initiated, probably because of the not-so-true-HBA controller I'm using.
The drives show up as mfisyspd0 to mfisyspd11 in the FreeNAS UI, and smartctl gives an error on /dev/mfisyspdX (No such file or directory), but works just fine on /dev/pass0->11. So how can I make FreeNAS see my drives with /dev/passX naming, or is there another way of making smartctl check my drives automatically, without me having to run it manually?
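One possible stopgap for the second question, until the UI can do it, is a small script that runs smartctl against each pass device and decodes its exit status. This is only a sketch under assumptions: it reuses the `smartctl -a -d sat` invocation shown above, takes the pass numbers from the camcontrol listing in this post (pass4 is the enclosure, so it is skipped), and relies on the exit-status bit meanings documented in the smartctl(8) man page. The names `decode_exit_status`, `smartctl_command` and `check_devices` are made up for illustration.

```python
import subprocess

# Bit meanings of smartctl's exit status, paraphrased from smartctl(8).
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "a SMART or ATA command failed",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold in the past",
    6: "device error log contains errors",
    7: "self-test log contains errors",
}

# Pass devices that are disks in the camcontrol listing above.
# ASSUMPTION: this numbering stays stable across reboots.
DISK_PASS_NUMBERS = [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12]
DEVICES = ["/dev/pass%d" % n for n in DISK_PASS_NUMBERS]


def decode_exit_status(status):
    """Turn smartctl's exit status bitmask into a list of messages."""
    return [EXIT_BITS[bit] for bit in sorted(EXIT_BITS) if status & (1 << bit)]


def smartctl_command(device):
    """Build the same smartctl invocation that was run by hand above."""
    return ["smartctl", "-a", "-d", "sat", device]


def check_devices(devices):
    """Run smartctl on each device; return {device: [problems]}."""
    report = {}
    for dev in devices:
        proc = subprocess.Popen(
            smartctl_command(dev),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        proc.communicate()  # full output is discarded in this sketch
        report[dev] = decode_exit_status(proc.returncode)
    return report
```

Calling `check_devices(DEVICES)` as root, for example from a cron job, and printing only the devices with a non-empty list would give a basic periodic check. It does not start any SMART self-tests and is not a replacement for the monitoring in the UI.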