SOLVED Real world implications of "Uncorrectable parity/CRC error"

entilza72 · Oct 13, 2017

Hi team,

I've searched for and found a great thread here on how to track down the source of an "Uncorrectable parity/CRC error".

But I notice there is no discussion of the real-world impact of getting one of those errors.

What does it actually mean for your data that is being written at that point in time? Is the written data safe?

I'm currently doing an emergency ingest from a dying system onto a new FreeNAS box. I'm more than happy to follow the linked thread to isolate the issue, and capable of doing so (I already have a suspect: I'm using the crazy-thin Silverstone SATA cables, and one of them got pinched on install). However, I'd like to understand the risk to the data I am ingesting, especially whether there could be random corruption during the write.

Cheers,
Ent.

entilza72 · Oct 13, 2017

Clarification around my situation:

I have 3 data loads I am doing:

1. 1 TB load (completed)
2. 5 TB load (underway, ~12 hours and 50% remaining)
3. 1.5 TB load (next)

I will stop and investigate the hardware after load 2 completes tomorrow.

What I want to know is: is there any risk to the 6 TB already loaded? Would I benefit from wiping the entire transfer to date and starting again?

Kind regards,
Ent.

rs225 · Oct 14, 2017

If it is ZFS, the data is either intact (corrected via redundancy if needed), or you get an I/O error when you attempt to read it because it couldn't be corrected. So if there were no I/O errors, there is no need to start your transfer again.

If the concern is that the error happened on the disk you are writing to, then the answer is to run a scrub after the transfer is complete. When it finishes, you'll know either that everything read back correctly or which data had errors.
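To make the "intact or I/O error" behaviour concrete, here is a toy Python sketch of the idea, not of how ZFS is actually implemented. It assumes a single-parity stripe built with plain XOR and uses SHA-256 as a stand-in checksum; the names `checksum`, `xor_blocks` and `read_block` are made up for illustration.

```python
import errno
import hashlib
from typing import List


def checksum(block: bytes) -> str:
    """Stand-in checksum (SHA-256 here purely for illustration)."""
    return hashlib.sha256(block).hexdigest()


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def read_block(data: List[bytes], parity: bytes, sums: List[str], index: int) -> bytes:
    """Return data[index], rebuilding it from parity if its checksum fails.

    Raises OSError(EIO) when the block is bad and cannot be rebuilt,
    which mirrors the "you get an I/O error on read" case.
    """
    block = data[index]
    if checksum(block) == sums[index]:
        return block  # stored copy is intact

    # Rebuild the block from parity plus every other data block.
    rebuilt = parity
    for i, other in enumerate(data):
        if i != index:
            rebuilt = xor_blocks(rebuilt, other)

    if checksum(rebuilt) == sums[index]:
        return rebuilt  # corrected via redundancy
    raise OSError(errno.EIO, "checksum mismatch and reconstruction failed")


# Two data blocks plus one parity block (hypothetical contents).
good = [b"abcd", b"wxyz"]
parity = xor_blocks(good[0], good[1])
sums = [checksum(b) for b in good]

damaged = [b"abXd", b"wxyz"]  # one block corrupted on its way to disk
print(read_block(damaged, parity, sums, 0))  # prints b'abcd'
```

If a second block in the same stripe is also damaged, the rebuilt data fails its checksum and `read_block` raises instead of returning bad data. That is the toy equivalent of the read-time I/O error described above.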

entilza72 · Oct 14, 2017

Thanks rs225! I completely overlooked talking about the RAID level! Much embarrassment.

ZFS RAIDZ1 over three disks. So I'm guessing it should have been corrected via redundancy, but I cannot check individual files as I do not know which ones might be affected.

root@prd-nas-01:~ # zpool status
  pool: Tank01
 state: ONLINE
  scan: resilvered 3.57M in 0h0m with 0 errors on Thu Oct 12 22:40:26 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        Tank01                                          ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/22b3bc95-af2e-11e7-b171-1c1b0d9c50a4  ONLINE       0     0     0
            gptid/232474c3-af2e-11e7-b171-1c1b0d9c50a4  ONLINE       0     0     0
            gptid/23adc986-af2e-11e7-b171-1c1b0d9c50a4  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            da0p2     ONLINE       0     0     0
            da1p2     ONLINE       0     0     0

errors: No known data errors

I will run a scrub as suggested. FYI, I changed the SATA cable on the disk involved after the second data load. All errors disappeared after the change.
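For anyone who wants to check the READ/WRITE/CKSUM columns of `zpool status` without eyeballing them, here is a rough Python sketch. It assumes the five-column layout shown in this thread and simply flags any device whose counters are not all "0"; the function name `nonzero_counters`, the state list, and the sample text are illustrative only, and the line with a non-zero CKSUM count is invented for the example.

```python
from typing import Dict, Tuple

# Device states that can appear in the STATE column (assumed list).
STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def nonzero_counters(status_text: str) -> Dict[str, Tuple[str, str, str]]:
    """Map device name -> (READ, WRITE, CKSUM) for rows with any non-zero counter."""
    flagged = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Device rows look like: NAME STATE READ WRITE CKSUM
        if len(fields) >= 5 and fields[1] in STATES:
            counters = (fields[2], fields[3], fields[4])
            if counters != ("0", "0", "0"):
                flagged[fields[0]] = counters
    return flagged


# Hypothetical sample: the second disk row is made up to show a hit.
SAMPLE = """
        NAME          STATE     READ WRITE CKSUM
        Tank01        ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            disk-a    ONLINE       0     0     0
            disk-b    ONLINE       0     0     3
"""

print(nonzero_counters(SAMPLE))  # prints {'disk-b': ('0', '0', '3')}
```

Run against the output pasted above, it returns an empty dict, since every counter there is 0.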

Thanks!
Ent.

styno · Oct 15, 2017

The scrub will tell you which files are affected, if any. I wouldn't worry too much up front, as the scrub from Oct 12 completed without issue.

entilza72 · Oct 15, 2017

Thanks rs225 and styno for your input. The scrub has now completed and reports no errors (BTW, the 12 Oct scrub was prior to the data ingest).

So it seems that, when running RAIDZ1 at least, a series of Uncorrectable parity/CRC errors from the SATA channel can leave the stored data unharmed, thanks to the redundancy RAIDZ1 provides. The scrub has confirmed this.

Of course, you should seek out and eliminate the source of the errors as I did. Follow my link in the opening post for a great thread on this.

All is well that ends well.

  pool: Tank01
 state: ONLINE
  scan: scrub repaired 0 in 8h45m with 0 errors on Sun Oct 15 17:27:43 2017
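If you want to pull the scrub result out of that `scan:` line in a script, a minimal Python sketch could look like this. It assumes the line format shown in this thread; other ZFS versions may word it differently, and the name `parse_scrub_line` is made up here.

```python
import re
from typing import Dict, Optional, Union

# Matches e.g. "scan: scrub repaired 0 in 8h45m with 0 errors on Sun Oct 15 17:27:43 2017"
SCRUB_RE = re.compile(
    r"scan:\s+scrub repaired (\S+) in (\S+) with (\d+) errors on (.+)"
)


def parse_scrub_line(status_text: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the scrub summary found in zpool status text, or None if absent."""
    match = SCRUB_RE.search(status_text)
    if match is None:
        return None  # no completed scrub reported (e.g. a resilver line)
    return {
        "repaired": match.group(1),
        "duration": match.group(2),
        "errors": int(match.group(3)),
        "finished": match.group(4).strip(),
    }


line = "  scan: scrub repaired 0 in 8h45m with 0 errors on Sun Oct 15 17:27:43 2017"
result = parse_scrub_line(line)
print(result["errors"])  # prints 0
```

A result with `errors` equal to 0 and `repaired` equal to "0" corresponds to the clean outcome reported here.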

Kind regards,
Ent.
