SOLVED Real world implications of "Uncorrectable parity/CRC error"

Status
Not open for further replies.

entilza72

Dabbler
Joined
Oct 8, 2017
Messages
21
Hi team,

I've searched for and found a great thread here on how to track down the source of an "Uncorrectable parity/CRC error".

But I notice there is no discussion of the real-world impact of getting one of those errors.

What does it actually mean for your data that is being written at that point in time? Is the written data safe?

I'm currently doing an emergency ingest from a dying system onto a new FreeNAS box. I'm more than happy to follow the linked thread to isolate the issue, and capable of doing so (I already have a suspect: I'm using the crazy thin Silverstone Thin SATA cables, and one of them got pinched on install). However, I'd like to understand the risk to the data I am ingesting, especially whether there could be random corruption during the write.

Cheers,
Ent.
 

entilza72

Dabbler
Joined
Oct 8, 2017
Messages
21
Clarification around my situation:

I have 3 data loads I am doing:

1. 1 TB load (completed)
2. 5 TB load (underway, ~12 hours and 50% remaining)
3. 1.5 TB load (next)

I will stop and investigate the hardware after load 2 completes tomorrow.

What I want to know is: is there any risk to the data already loaded (6 TB once load 2 completes)? Would I benefit from wiping the entire transfer to date and starting again?

Kind regards,
Ent.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
If it is ZFS, the data is either intact (corrected via redundancy if needed), or you get an I/O error when you attempt to read it because it can't be corrected. So if there were no I/O errors, there is no need to start your transfer again.

If the concern is that the error happened on the disk you are writing to, then the answer is to run a scrub after the transfer is complete. At the end, you'll know either that everything read correctly or which data had errors.
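
For anyone who would rather check the counters in a script than by eye, here is a minimal Python sketch. It assumes the plain-text layout of `zpool status` as pasted in this thread (device name, state, then READ, WRITE and CKSUM counts as plain integers). The function names are made up for illustration, and the parser is not part of any ZFS tooling.

```python
import re

# Matches config rows such as "raidz1-0 ONLINE 0 0 0".
# Assumption: counters are plain integers; some versions abbreviate
# large counts (for example "1.2K"), which this sketch does not handle.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)


def device_error_counts(status_text):
    """Map each pool/vdev/disk name to its (read, write, cksum) counters."""
    counts = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m:
            name, _state, read, write, cksum = m.groups()
            counts[name] = (int(read), int(write), int(cksum))
    return counts


def devices_with_errors(status_text):
    """Return only the entries where at least one counter is non-zero."""
    return {
        name: c for name, c in device_error_counts(status_text).items() if any(c)
    }


# Sample text in the same shape as the output posted in this thread.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
Tank01 ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
gptid/22b3bc95-af2e-11e7-b171-1c1b0d9c50a4 ONLINE 0 0 0
"""

print(devices_with_errors(SAMPLE))  # prints {}
```

To run it against a live system, the text could be captured with `subprocess.run(["zpool", "status"], capture_output=True, text=True).stdout`. An empty result only says the counters are zero; the scrub is still what verifies the data on disk.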
 

entilza72

Dabbler
Joined
Oct 8, 2017
Messages
21
Thanks rs225! I completely overlooked talking about the RAID level! Much embarrassment.

ZFS RAIDZ1 across three disks. So I'm guessing it should have been corrected via redundancy, but I cannot check individual files, as I do not know which ones were being written when the errors occurred.

root@prd-nas-01:~ # zpool status
  pool: Tank01
 state: ONLINE
  scan: resilvered 3.57M in 0h0m with 0 errors on Thu Oct 12 22:40:26 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        Tank01                                          ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/22b3bc95-af2e-11e7-b171-1c1b0d9c50a4  ONLINE       0     0     0
            gptid/232474c3-af2e-11e7-b171-1c1b0d9c50a4  ONLINE       0     0     0
            gptid/23adc986-af2e-11e7-b171-1c1b0d9c50a4  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        freenas-boot    ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            da0p2       ONLINE       0     0     0
            da1p2       ONLINE       0     0     0

errors: No known data errors


I will run a scrub as suggested. FYI, I changed the SATA cable on the disk involved after the second data load. All errors disappeared after the change.

Thanks!
Ent.
 

styno

Patron
Joined
Apr 11, 2016
Messages
466
The scrub will tell you the files that are impacted, if any. I wouldn't worry too much upfront as the scrub from Oct 12 completed without issue.
 

entilza72

Dabbler
Joined
Oct 8, 2017
Messages
21
Thanks rs225 and styno for your input - scrub has now completed and it is reporting no errors (BTW the 12 Oct Scrub was prior to data ingest).

So it seems that, when running RAIDZ1 at least, a series of Uncorrectable parity/CRC errors on the SATA channel does not necessarily harm the stored data, thanks to the redundancy RAIDZ1 provides. The scrub has confirmed this for my pool.

Of course, you should seek out and eliminate the source of the errors as I did. Follow my link in the opening post for a great thread on this.

All is well that ends well.

  pool: Tank01
 state: ONLINE
  scan: scrub repaired 0 in 8h45m with 0 errors on Sun Oct 15 17:27:43 2017
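
If you want to confirm a clean scrub from a script, the scan line above can be picked apart with a small Python sketch like the one below. It assumes the wording shown in this thread ("scrub repaired ... in ... with N errors on ..."). Other ZFS releases may word or format this line differently (for example printing the repaired amount as `0B`), so treat the pattern and the function names as illustrative assumptions.

```python
import re

# Pattern for a completed-scrub scan line as it appears in this thread.
SCAN = re.compile(
    r"scan:\s+scrub repaired (?P<repaired>\S+) in (?P<duration>\S+) "
    r"with (?P<errors>\d+) errors on (?P<finished>.+)"
)


def parse_scrub_line(status_text):
    """Return the scrub summary fields as a dict, or None if no scrub line is found."""
    m = SCAN.search(status_text)
    if m is None:
        return None
    info = m.groupdict()
    info["errors"] = int(info["errors"])
    info["finished"] = info["finished"].strip()
    return info


def scrub_is_clean(status_text):
    """True only if a completed scrub reports nothing repaired and zero errors."""
    info = parse_scrub_line(status_text)
    # "0B" is accepted on the assumption that some releases add a unit suffix.
    return (
        info is not None
        and info["errors"] == 0
        and info["repaired"] in ("0", "0B")
    )


LINE = "scan: scrub repaired 0 in 8h45m with 0 errors on Sun Oct 15 17:27:43 2017"
print(parse_scrub_line(LINE))
print(scrub_is_clean(LINE))
```

A resilver line (such as the one from 12 Oct earlier in the thread) deliberately does not match, so `scrub_is_clean` returns False until a real scrub has finished.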


Kind regards,
Ent.
 