Hi All
I need some advice, please. One of my volumes is coming up with the error below. On investigation I found that one of the drives has 3 CKSUM errors.
I did some reading and found other threads with the same issue and the troubleshooting they did. The troubleshooting I have done so far:
1. Ran a memory test with Memtest86: no errors.
2. Replaced the SATA cable and ran a scrub: the problem came back, 3 errors, 24K repaired.
3. Moved the drive to another SATA port and power cable, cleared the errors, and ran a scrub again: the problem is still there, 3 errors, 24K repaired.
The problem moves with the drive, and it is always 3 errors and 24K repaired. The SMART report is not showing anything wrong with the drive.
I was also getting the error below while the drive was ada1. I then moved it to ada3 to test whether the port was the problem.
(ada1:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 c0 c4 c7 40 8d 00 00 01 00 00
(ada1:ahcich3:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada1:ahcich3:0:0:0): Retrying command
  pool: Pool 1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 24K in 3h17m with 0 errors on Tue Sep 5 19:28:32 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        Movies_and_Music                                ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/82380884-6c3e-11e7-b922-d02788c30dde  ONLINE       0     0     0
            gptid/834a3ffa-6c3e-11e7-b922-d02788c30dde  ONLINE       0     0     0
            gptid/845ab596-6c3e-11e7-b922-d02788c30dde  ONLINE       0     0     3

errors: No known data errors
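
For anyone who wants to track the counters across scrubs instead of reading the table by eye, here is a minimal Python sketch that picks out the rows with a nonzero READ, WRITE, or CKSUM count from pasted `zpool status` text. The function name `devices_with_errors` is my own and not part of any ZFS tool, and the sketch assumes plain integer counters (abbreviated values such as `1.2K` are skipped).

```python
def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for config rows with a nonzero counter."""
    flagged = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True          # header row of the config table
            continue
        if fields[0] == "errors:":
            in_config = False         # the table ends before this line
            continue
        if in_config and len(fields) >= 5:
            counters = fields[2:5]
            # Assumption: counters are plain integers; anything else is skipped.
            if all(c.isdigit() for c in counters):
                counts = tuple(int(c) for c in counters)
                if any(counts):
                    flagged[fields[0]] = counts
    return flagged


sample = """\
NAME STATE READ WRITE CKSUM
Movies_and_Music ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
gptid/82380884-6c3e-11e7-b922-d02788c30dde ONLINE 0 0 0
gptid/834a3ffa-6c3e-11e7-b922-d02788c30dde ONLINE 0 0 0
gptid/845ab596-6c3e-11e7-b922-d02788c30dde ONLINE 0 0 3
errors: No known data errors
"""

print(devices_with_errors(sample))
```

With the table above, only the third gptid entry is reported, with counts (0, 0, 3).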
Local system status:
3:01AM up 10:59, 0 users, load averages: 0.08, 0.16, 0.15
-- End of daily output --
########## SMART status report for ada3 drive (Western Digital Red: WD-WCC4N5RCSER1) ##########
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build)
SMART overall-health self-assessment test result: PASSED
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   176   174   021    Pre-fail  Always       -       6175
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       130
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   079   079   000    Old_age   Always       -       15966
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       130
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       65
193 Load_Cycle_Count        0x0032   193   193   000    Old_age   Always       -       21721
194 Temperature_Celsius     0x0022   121   107   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
No Errors Logged
Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
Extended offline    Completed without error  00%        15320            -
Short offline       Completed without error  00%        15946            -
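
To double-check the SMART table without reading every row, here is a rough Python sketch that reports any watched attribute whose RAW_VALUE is nonzero. The set of watched IDs (5, 197, 198, 199) is my own assumption about which counters are usually looked at first, and the function name `nonzero_watched` is made up for this post. It only handles raw values that are plain integers.

```python
# Assumption: these are the attribute IDs worth flagging when nonzero.
WATCHED_IDS = {5, 197, 198, 199}


def nonzero_watched(smart_text, watched=WATCHED_IDS):
    """Return {attribute_name: raw_value} for watched attributes with a nonzero raw value."""
    found = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # An attribute row has 10 columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in watched and raw.isdigit() and int(raw) != 0:
            found[name] = int(raw)
    return found


report = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
"""

print(nonzero_watched(report))
```

For the values in my report this returns an empty dict, which matches what I see by eye: none of the watched counters has moved.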
So what do I do now? Is the drive faulty? Do I need to worry? I see the drive is still under limited warranty with WD. I suspect they won't replace it, since I
can't really prove there is anything wrong with the drive other than the 3 checksum errors.