Scharbag
Just recently, I have started having issues with one of my pools detaching devices for no apparent reason. I have 3 pools in addition to the mirrored boot pool. All issues are confined to my backup pool. Here is what is happening:
I am currently running FreeNAS-9.3-STABLE-201509220011 and I started to run into issues when updating to the Sept 28th release using the GUI. The drives do not disconnect from the backup pool consistently. On one occasion all drives were disconnected; on other occasions only a few disconnected. Once a drive has disconnected, it will not reconnect, even when I pull the drive tray and re-insert it while the system is online. The system requires a reset to see the drives again.
The really strange thing is, when I reboot, the backup pool is fine and online for a few minutes and then the devices start to be detached. If I instead pull the drive trays, reboot the computer and then insert the drives and import the pool, everything is fine. I am lucky that this is my backup pool, but it is a bit frustrating.
I have changed which of my 2 V20 LSI cards is used for my pools, and the backup pool devices get detached regardless of the controller. Both controllers run the same V20 IT firmware and everything is connected through a single Intel RES2SV240. Only 1 LSI card is connected to the Intel expander at any time. The Intel expander has the latest firmware available.
The difference between my production pool and my backup pool is that I had the backup pool's HDDs configured to turn off to save power. I have since removed that from my config now that I have things running again. Could this be the issue, given that the latest update seems to address something to do with LSI cards and power savings?
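One way to check whether the timed-out commands in the log below are related to power management is to decode the CDB they carry. The Python sketch below assumes the standard ATA PASS-THROUGH(16) layout (opcode 0x85, count in byte 6, ATA command in byte 14) and the usual ATA power-management opcodes. The function names are made up for illustration and are not part of FreeNAS or any library.

```python
# Sketch only: decode an ATA PASS-THROUGH(16) CDB as printed in the mps log.
# Assumes the standard SAT byte layout; helper names are hypothetical.

ATA_POWER_COMMANDS = {
    0xE0: "STANDBY IMMEDIATE",
    0xE1: "IDLE IMMEDIATE",
    0xE2: "STANDBY",
    0xE3: "IDLE",
    0xE5: "CHECK POWER MODE",
    0xE6: "SLEEP",
}


def decode_ata_pass_through_16(cdb_hex):
    """Return the fields of interest from a space-separated hex CDB string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not a 16-byte ATA PASS-THROUGH(16) CDB")
    command = cdb[14]
    return {
        "protocol": (cdb[1] >> 1) & 0x0F,  # 3 means non-data
        "count": cdb[6],                   # low byte of the count field
        "command": command,
        "command_name": ATA_POWER_COMMANDS.get(command, "unknown"),
    }


def standby_timer_seconds(count):
    """Translate the IDLE/STANDBY count field for the simple cases only."""
    if count == 0:
        return 0                # standby timer disabled
    if 1 <= count <= 240:
        return count * 5        # units of 5 seconds
    return None                 # other encodings are not handled in this sketch


if __name__ == "__main__":
    fields = decode_ata_pass_through_16(
        "85 06 00 00 00 00 01 00 00 00 00 00 00 40 e3 00"
    )
    print(fields, standby_timer_seconds(fields["count"]))
```

Under those assumptions, the CDB `85 06 00 00 00 00 01 00 00 00 00 00 00 40 e3 00` decodes to a non-data IDLE command (0xE3) with a count of 1, which would correspond to a 5-second standby timer. That would be consistent with the HDD standby setting being involved, though it does not prove it.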
My server has, up to this point, been pretty much bulletproof. I am not sure if there is a bad drive that is causing the system to panic or if there is a hardware issue. I checked all of my cables just in case, but the server does not get moved. System temperatures are normal, all fans are functioning, and the system has otherwise been stable. My UPS reports ~220W with all drives in the system. I have a 750W power supply.
If anyone has a suggestion, please let me know.
Cheers,
Code:
Oct 3 08:44:08 freenas (pass16:mps0:0:24:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 00 00 00 00 01 00 00 00 00 00 00 40 e3 00 length 0 SMID 509 command timeout cm 0xfffffe0000b24c28 ccb 0xfffff800477e6800
Oct 3 08:44:08 freenas (noperiph:mps0:0:4294967295:0): SMID 1 Aborting command 0xfffffe0000b24c28
Oct 3 08:44:08 freenas mps0: Sending reset from mpssas_send_abort for target ID 24
Oct 3 08:44:08 freenas (pass14:mps0:0:22:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 00 00 00 00 01 00 00 00 00 00 00 40 e3 00 length 0 SMID 524 command timeout cm 0xfffffe0000b25f60 ccb 0xfffff8000cc93800
Oct 3 08:44:08 freenas (noperiph:mps0:0:4294967295:0): SMID 2 Aborting command 0xfffffe0000b25f60
Oct 3 08:44:08 freenas mps0: Sending reset from mpssas_send_abort for target ID 22
Oct 3 08:44:08 freenas (pass17:mps0:0:25:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 00 00 00 00 01 00 00 00 00 00 00 40 e3 00 length 0 SMID 553 command timeout cm 0xfffffe0000b28488 ccb 0xfffff8000cc84000
Oct 3 08:44:08 freenas (noperiph:mps0:0:4294967295:0): SMID 3 Aborting command 0xfffffe0000b28488
Oct 3 08:44:08 freenas mps0: Sending reset from mpssas_send_abort for target ID 25
Oct 3 08:44:08 freenas (pass13:mps0:0:21:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 00 00 00 00 01 00 00 00 00 00 00 40 e3 00 length 0 SMID 533 command timeout cm 0xfffffe0000b26ae8 ccb 0xfffff8000cccd000
Oct 3 08:44:08 freenas (noperiph:mps0:0:4294967295:0): SMID 4 Aborting command 0xfffffe0000b26ae8
Oct 3 08:44:08 freenas mps0: Sending reset from mpssas_send_abort for target ID 21
Oct 3 08:44:09 freenas (da17:mps0:0:25:0): READ(16). CDB: 88 00 00 00 00 01 43 ca 52 60 00 00 00 08 00 00 length 4096 SMID 370 terminated ioc 804b scsi 0 state c xfer 0
Oct 3 08:44:09 freenas mps0: Unfreezing devq for target ID 25
Oct 3 08:44:10 freenas mps0: mpssas_prepare_remove: Sending reset for target ID 21
Oct 3 08:44:10 freenas mps0: mpssas_prepare_remove: Sending reset for target ID 22
Oct 3 08:44:10 freenas mps0: mpssas_prepare_remove: Sending reset for target ID 24
Oct 3 08:44:10 freenas da13 at mps0 bus 0 scbus0 target 21 lun 0
Oct 3 08:44:10 freenas da13: <ATA ST4000DM000-1F21 CC54> s/n S300YC75 detached
Oct 3 08:44:10 freenas da14 at mps0 bus 0 scbus0 target 22 lun 0
Oct 3 08:44:10 freenas da14: <ATA ST4000DM000-1F21 CC54> s/n S300YBTS detached
Oct 3 08:44:10 freenas da16 at mps0 bus 0 scbus0 target 24 lun 0
Oct 3 08:44:10 freenas da16: <ATA ST4000DM000-1F21 CC54> s/n S300YBLN detached
Oct 3 08:44:10 freenas GEOM_ELI: Device da13p1.eli destroyed.
Oct 3 08:44:10 freenas GEOM_ELI: Detached da13p1.eli on last close.
Oct 3 08:44:10 freenas GEOM_ELI: Device da14p1.eli destroyed.
Oct 3 08:44:10 freenas GEOM_ELI: Detached da14p1.eli on last close.
Oct 3 08:44:10 freenas GEOM_ELI: Device da16p1.eli destroyed.
Oct 3 08:44:10 freenas GEOM_ELI: Detached da16p1.eli on last close.
Oct 3 08:44:10 freenas zfsd: Replace vdev(backuptank/17892785827659776471) by physical path: Unable to allocate spare target data.
Oct 3 08:44:10 freenas zfsd: Replace vdev(backuptank/12561127196141559507) by physical path: Unable to allocate spare target data.
Oct 3 08:44:10 freenas zfsd: Replace vdev(backuptank/15713028563994597996) by physical path: Unable to allocate spare target data.
Oct 3 08:44:10 freenas (da17:mps0:0:25:0): READ(16). CDB: 88 00 00 00 00 01 43 ca 52 60 00 00 00 08 00 00
Oct 3 08:44:10 freenas (da17:mps0:0:25:0): CAM status: SCSI Status Error
Oct 3 08:44:10 freenas (da17:mps0:0:25:0): SCSI status: Check Condition
Oct 3 08:44:10 freenas (da17:mps0:0:25:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Oct 3 08:44:10 freenas (da17:mps0:0:25:0): Retrying command (per sense data)
Oct 3 08:44:10 freenas mps0: IOCStatus = 0x4b while resetting device 0x1a
Oct 3 08:44:10 freenas mps0: Unfreezing devq for target ID 24
Oct 3 08:44:10 freenas mps0: Unfreezing devq for target ID 24
Oct 3 08:44:11 freenas mps0: IOCStatus = 0x4b while resetting device 0x18
Oct 3 08:44:11 freenas mps0: Unfreezing devq for target ID 22
Oct 3 08:44:11 freenas mps0: Unfreezing devq for target ID 22
Oct 3 08:44:11 freenas mps0: IOCStatus = 0x4b while resetting device 0x17
Oct 3 08:44:11 freenas mps0: Unfreezing devq for target ID 21
Oct 3 08:44:11 freenas mps0: Unfreezing devq for target ID 21
Oct 3 08:44:11 freenas (da16:mps0:0:24:0): Periph destroyed
Oct 3 08:44:11 freenas (da14:mps0:0:22:0): Periph destroyed
Oct 3 08:44:11 freenas (da13:mps0:0:21:0): Periph destroyed
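To see at a glance which targets timed out and which drives were dropped, a log like the one above can be summarized with a few regular expressions. This is a rough Python sketch, not a FreeNAS tool: the patterns are written against the message shapes shown in this log only, and the `summarize` function and `SAMPLE` excerpt are hypothetical names used for illustration.

```python
import re

# Patterns assume the mps/CAM message formats seen in the log above.
TIMEOUT_RE = re.compile(
    r"\(pass\d+:mps\d+:\d+:(\d+):\d+\): ATA COMMAND PASS THROUGH\(16\)\. "
    r"CDB: (?:[0-9a-f]{2} ){16}length \d+ SMID \d+ command timeout"
)
ATTACH_RE = re.compile(r"(da\d+) at mps\d+ bus \d+ scbus\d+ target (\d+) lun \d+")
DETACH_RE = re.compile(r"(da\d+): <([^>]+)> s/n (\S+) detached")


def summarize(log_text):
    """Collect timed-out target IDs and detached drives from raw log text."""
    return {
        # Target IDs that logged a pass-through command timeout.
        "timed_out_targets": sorted({int(t) for t in TIMEOUT_RE.findall(log_text)}),
        # Device name to target ID, taken from the "daN at mps0 ..." lines.
        "target_of": {dev: int(t) for dev, t in ATTACH_RE.findall(log_text)},
        # (device, model string, serial) for every "detached" message.
        "detached": DETACH_RE.findall(log_text),
    }


# Three entries copied from the log above, used as a small test input.
SAMPLE = (
    "Oct 3 08:44:08 freenas (pass16:mps0:0:24:0): ATA COMMAND PASS THROUGH(16). "
    "CDB: 85 06 00 00 00 00 01 00 00 00 00 00 00 40 e3 00 length 0 SMID 509 "
    "command timeout cm 0xfffffe0000b24c28 ccb 0xfffff800477e6800\n"
    "Oct 3 08:44:10 freenas da16 at mps0 bus 0 scbus0 target 24 lun 0\n"
    "Oct 3 08:44:10 freenas da16: <ATA ST4000DM000-1F21 CC54> s/n S300YBLN detached\n"
)

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Because the patterns do not depend on line breaks, the same function should also work on log text that has been pasted as one long line. In the full log above, four targets (21, 22, 24 and 25) report command timeouts, but only three devices (da13, da14 and da16) are reported as detached; da17 on target 25 retries its read instead.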