Uncorrectable parity/CRC error

Status
Not open for further replies.

Wenborn

Cadet
Joined
Feb 13, 2014
Messages
9
Hello

I just decided to scrub my drives and got this error in the console:

Jun 16 13:18:41 freenas kernel: (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e0 a0 00 40 40 00 00 00 00 00 00
Jun 16 13:18:41 freenas kernel: (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Jun 16 13:18:41 freenas kernel: (ada1:ahcich1:0:0:0): Retrying command

Is this something that I should be concerned about?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Yes it is. Disk ada1 is crapping out.

Please post the disk's S.M.A.R.T. data.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
You should have automatic scrubs set up anyway (say, every 2 weeks), and it should email you the instant there's a problem.

As Ericloewe says, please post the result of
Code:
 smartctl -q noserial -a /dev/ada1
 

Wenborn

Cadet
Joined
Feb 13, 2014
Messages
9
Here is my SMART result. I would post all of it, but for some reason I can't work out how to scroll up in the console.
Code:
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 413) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x703d) SCT Status supported
                           SCT Error Recovery Control supported.
                           SCT Feature Control supported.
                           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 100   253   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 177   176   021    Pre-fail Always  -           6108
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           33
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   099   000    Old_age  Always  -           232
 10 Spin_Retry_Count        0x0032 100   253   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           33
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           16
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           27
194 Temperature_Celsius     0x0022 122   116   000    Old_age  Always  -           28
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 100   253   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           1
200 Multi_Zone_Error_Rate   0x0008 100   253   000    Old_age  Offline -           0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status                    Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Interrupted (host reset)  20%       189             -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Not_testing
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I have scrubs set to run once a month. Would it be better to set them to every 2 weeks? The reason I scrubbed my volume was that I got the following yellow warning 2 days ago:
WARNING: The Volume Volume1 (ZFS) status is ONLINE: one or more devices has experienced an unrecoverable error. an attempt was made to correct the error. Applications are unaffected. Determine if the device needs to be replaced, and clear the errors using "zpool clear" or replace the device with "zpool replace".
It disappeared after I restarted the server, so I ran a scrub to see if it came back. It didn't, but I then got the Uncorrectable parity/CRC error.

I also checked my zpool status, but it showed no errors. This could be because I haven't been using the server: I built it in April and haven't been storing data on it, as I wanted to familiarize myself with setup and maintenance before trusting it with everything. This is my first FreeNAS build.

Also, I thought I had set up my email, but for some reason I have never received any status reports from my NAS. I also tried to use the script from https://gist.github.com/fkleon/6147471 to get temperature reports for my drives, but I can't seem to receive emails.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Hmm... I'm not quite sure, but it looks like it might've been something on the interface, since everything checks out except for the UDMA_CRC_Error_Count, which is at 1. Not absolutely sure that it represents a transmission error, though.
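
For anyone who wants to watch that counter over time, here is a minimal shell sketch that pulls the raw value of attribute 199 out of smartctl's attribute table. The function name `crc_error_count` is made up for illustration, and the sketch assumes the ten-column attribute layout shown in the output posted above.

```shell
# Hypothetical helper: print the RAW_VALUE of SMART attribute 199
# (UDMA_CRC_Error_Count) from smartctl attribute output read on stdin.
# Assumes the ten-column table layout posted earlier in this thread.
crc_error_count() {
    awk '$1 == 199 && $2 == "UDMA_CRC_Error_Count" { print $10 }'
}

# Usage on the NAS itself (not run here):
#   smartctl -A /dev/ada1 | crc_error_count
```

If the number keeps climbing after the cabling has been checked, that would suggest the link is still dropping data; if it stays where it is, the event was probably a one-off.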
 

Wenborn

Cadet
Joined
Feb 13, 2014
Messages
9
Hm, ada1 would most likely be the drive on SATA 1, right? I will try reseating the SATA cable to see if that helps. Out of interest, what does a UDMA_CRC_Error mean? I can't find much info about it on the internet.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
UDMA is a PATA-era term (Ultra DMA transfer modes). SATA drives keep using the old designation to avoid breaking some tools, especially because the command set is the same (except for a few additions that were never implemented in PATA devices because the interface was already obsolete). This allows for a sort of minimum-change upgrade that tries to abstract away the new interface.

CRC is an error-detection code.

Basically, I take it to mean an error transmitting (or receiving) data that was caught by the interface's error detection.
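
To make the idea concrete, here is a small shell illustration using the POSIX `cksum` utility, which prints a CRC checksum of its input. This is only an analogy: it is not the CRC that the SATA link computes in hardware, and the helper name `crc_of` and the sample strings are invented for the example.

```shell
# Illustration only: cksum prints a CRC checksum of its input.
# If the data changes in transit, the checksum no longer matches.
crc_of() {
    printf '%s' "$1" | cksum | awk '{ print $1 }'
}

sent="block of data"
received="block of dbta"   # one character corrupted "on the wire"

if [ "$(crc_of "$sent")" = "$(crc_of "$received")" ]; then
    echo "CRC match: data accepted"
else
    echo "CRC mismatch: retry the transfer"
fi
```

The receiver cannot tell what went wrong, only that the checksum it computed differs from the one it was sent, so the usual response is to retry, which matches the "Retrying command" line in the console log.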
 

Wenborn

Cadet
Joined
Feb 13, 2014
Messages
9
OK, thanks for the description; it's good for me to know. I have reseated the SATA cable for ada1 and also checked the other drives, and that seems to have solved the problem, as I no longer get the error messages in the console during a scrub. Is there anything else I could do to ensure the problem is rectified?

Also, I've set my scrub interval to every two weeks as DrKK mentioned. Do you have any ideas on why I can't seem to receive emails from my NAS?

Thanks, I appreciate your help!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Keep an eye on the drive, including regular S.M.A.R.T. long and short tests, and set up the e-mails.

Check the manual, I believe it has a section on setting this stuff up. If not, there's a sticky somewhere, I think.
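
In the meantime, a self-test can be kicked off by hand from the shell. The sketch below wraps the standard `smartctl -t` and `smartctl -l selftest` options; the wrapper names `start_smart_test` and `show_selftest_log` are hypothetical, and `/dev/ada1` is simply the disk discussed in this thread.

```shell
# Sketch: start a SMART self-test on a disk and show the self-test log.
# The function names are hypothetical wrappers; -t and -l selftest are
# standard smartctl options.
start_smart_test() {
    kind=$1    # "short" or "long"
    disk=$2    # e.g. /dev/ada1
    case $kind in
        short|long) smartctl -t "$kind" "$disk" ;;
        *) echo "usage: start_smart_test short|long /dev/adaN" >&2; return 1 ;;
    esac
}

show_selftest_log() {
    smartctl -l selftest "$1"
}

# Usage on the NAS (not run here):
#   start_smart_test long /dev/ada1
#   show_selftest_log /dev/ada1    # once the test has had time to finish
```

Going by the output posted earlier, this drive estimates about 413 minutes for an extended test, so the log is worth checking the next day rather than straight away.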
 