Kirill_v_b
Cadet
Joined: Apr 22, 2017
Messages: 4
Hello,
After updating from 11.2-u1 to 11.2-u2.1, the pool (RaidZ3, 8 HDDs: 7 + 1 spare, on a Supermicro LSI2308-IT controller) starts resilvering. zpool status then shows a lot of errors (read, write, checksum) on ALL pool devices. smartctl records errors like this for ALL devices:
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 41 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 08 98 98 cb 40 40 08 00:52:23.536 WRITE FPDMA QUEUED
61 10 90 08 61 89 40 08 00:52:23.534 WRITE FPDMA QUEUED
61 10 88 f8 85 d2 40 08 00:52:23.534 WRITE FPDMA QUEUED
61 10 80 10 ee a0 40 08 00:52:22.898 WRITE FPDMA QUEUED
61 10 78 10 ec a0 40 08 00:52:22.896 WRITE FPDMA QUEUED
Error 94 occurred at disk power-on lifetime: 17584 hours (732 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 41 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
61 08 a8 10 6e 89 40 08 00:52:22.322 WRITE FPDMA QUEUED
ef 02 00 00 00 00 40 08 00:52:22.321 SET FEATURES [Enable write cache]
ef aa 00 00 00 00 40 08 00:52:22.319 SET FEATURES [Enable read look-ahead]
c6 00 10 00 00 00 40 08 00:52:22.319 SET MULTIPLE MODE
ef 10 02 00 00 00 40 08 00:52:22.318 SET FEATURES [Enable SATA feature]
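For readers unfamiliar with the register dump above, here is a minimal Python sketch of how the ER and ST bytes break down into the flags smartctl prints. It assumes the standard ATA bit assignments and covers only the commonly named bits; the helper names (`decode_bits`, `decode_registers`) are illustrative and not part of smartctl.

```python
# Assumed ATA bit assignments (only the commonly named bits are listed).
ERROR_BITS = {7: "ICRC", 6: "UNC", 4: "IDNF", 2: "ABRT"}
STATUS_BITS = {7: "BSY", 6: "DRDY", 5: "DF", 3: "DRQ", 0: "ERR"}


def decode_bits(value, names):
    """Return the flag names whose bits are set in value, highest bit first."""
    return [names[bit] for bit in sorted(names, reverse=True) if value & (1 << bit)]


def decode_registers(row):
    """Decode the ER and ST columns of a smartctl register row.

    The row is expected to start with the hex bytes in the order
    ER ST SC SN CL CH DH, as in the log above.
    """
    fields = row.split()
    error_reg = int(fields[0], 16)
    status_reg = int(fields[1], 16)
    return decode_bits(error_reg, ERROR_BITS), decode_bits(status_reg, STATUS_BITS)


if __name__ == "__main__":
    row = "84 41 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0"
    error_flags, status_flags = decode_registers(row)
    print("ER:", error_flags)
    print("ST:", status_flags)
```

For the rows above, ER = 0x84 decodes to ICRC and ABRT, and ST = 0x41 to DRDY and ERR. ICRC is an interface CRC error, which generally means data was corrupted in transit between the drive and the controller, so on its own it does not indicate bad sectors on the disk.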
After that, within about 15 minutes, the system hangs completely and only a power reset can restart the server.
Rebooting into the previous version (11.2-u1 or 11.2-Release) starts the resilver again. During that resilver there are no errors at all, and no new errors in smartctl. After the resilver finishes, the pool returns to a normal state and works without any errors.
Trying to update again from 11.2-u1 to 11.2-u2.1 starts the nightmare again: resilvering, zpool status showing a lot of errors (read, write, checksum) on ALL pool devices, etc.
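One way to back up the "no new errors under the old version" observation with numbers is to save the smartctl error log for each drive before and after a boot and compare the error counters. The following is a rough Python sketch, assuming the saved text contains header lines in the same format as the log above ("Error N occurred at disk power-on lifetime: H hours"); the function names and the sample "after" text are made up for illustration.

```python
import re

# Matches header lines such as:
# "Error 94 occurred at disk power-on lifetime: 17584 hours (732 days + 16 hours)"
ERROR_HEADER = re.compile(
    r"^Error (\d+) occurred at disk power-on lifetime: (\d+) hours",
    re.MULTILINE,
)


def summarize_errors(smartctl_text):
    """Return (highest error number, sorted distinct power-on hours)."""
    matches = ERROR_HEADER.findall(smartctl_text)
    if not matches:
        return 0, []
    highest = max(int(number) for number, _ in matches)
    hours = sorted({int(hour) for _, hour in matches})
    return highest, hours


def new_errors(before_text, after_text):
    """Number of errors logged between two saved smartctl outputs."""
    before, _ = summarize_errors(before_text)
    after, _ = summarize_errors(after_text)
    return after - before


if __name__ == "__main__":
    # "before" uses the header from the log above; "after" is hypothetical.
    before = "Error 94 occurred at disk power-on lifetime: 17584 hours (732 days + 16 hours)\n"
    after = (
        "Error 96 occurred at disk power-on lifetime: 17590 hours (732 days + 22 hours)\n"
        "Error 95 occurred at disk power-on lifetime: 17590 hours (732 days + 22 hours)\n"
        "Error 94 occurred at disk power-on lifetime: 17584 hours (732 days + 16 hours)\n"
    )
    print(summarize_errors(after))
    print(new_errors(before, after))
```

If the difference stays at zero for every drive under 11.2-u1 and jumps only after booting 11.2-u2.1, that supports the pattern described above; it does not by itself identify the cause.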
Please help me find, understand, and solve the problem.
Thanks and sorry for the broken English!