New NAS box - how many CRC errors is "acceptable"?

demon

Contributor
Joined
Dec 6, 2014
Messages
117
I've nearly finished a build of a new NAS box (decided to switch from a Coffee Lake Xeon E-2136 to a Coffee Lake-R E-2288G; running 8 Seagate Exos X16 drives), and was testing the drives with the sol.net test script. I had been testing over the course of about 10 days, and received 14 CRC errors (in both dmesg and SMART output, so I'm sure that they were real). I ordered new MiniSAS cables (the drives are connected to an LSI SAS 9211-8i HBA, and yes, running P20 IT firmware) and replaced the old ones, and have been testing for 3 days so far. In that period, it's received one more CRC error on a new run.

Would that be within acceptable margins? My existing NAS build has one CRC error in total over its current drives' entire operational lifetime, so hopefully this won't be a regular occurrence. The commands are reissued immediately and it seems to carry on fine, and I've verified no medium errors so far in testing.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
UDMA_CRC_Error_Count is that? I've only ever got those from dodgy cables/interfaces. I'd suggest pulling the cables off and putting them on again a few times to break up any surface gunge on the connectors. My current build (new cables and SAS card) has zero.

If you're getting Raw_Read_Error_Rate as well, that would warrant further investigation. If not, keep an eye on them but no panic.
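
For keeping an eye on the counter, here is a minimal sketch that pulls the raw UDMA_CRC_Error_Count out of `smartctl -A` output. The function name `crc_count` and the `/dev/da[0-7]` glob in the commented usage are assumptions for illustration, and it relies on the usual ten-column attribute table layout where the raw value is the last field.

```shell
# Print the raw UDMA_CRC_Error_Count value (SMART attribute 199)
# from `smartctl -A` output supplied on stdin.
# Assumes the standard attribute table, where field 10 is RAW_VALUE.
crc_count() {
    awk '$2 == "UDMA_CRC_Error_Count" { print $10 }'
}

# Hypothetical usage; the device glob is an assumption, adjust it to your system:
# for dev in /dev/da[0-7]; do
#     printf '%s %s\n' "$dev" "$(smartctl -A "$dev" | crc_count)"
# done
```

Running the loop before and after a test pass and comparing the numbers shows whether the count is still climbing.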
 

demon

Contributor
Joined
Dec 6, 2014
Messages
117
That's correct, yes. Based on another post found elsewhere, it looks like for Seagates you have to do a little adjusting of the value. After doing that, it seems the value is 0 for all drives:

Code:
root@europa[~]# for dev in /dev/da[0-7] ; do smartctl -a -v 1,raw48:54 "${dev}" | awk '{ if ($2 == "Raw_Read_Error_Rate") print $0 }' ; done
  1 Raw_Read_Error_Rate     0x000f   081   064   044    Pre-fail  Always       -       0
  1 Raw_Read_Error_Rate     0x000f   080   064   044    Pre-fail  Always       -       0
  1 Raw_Read_Error_Rate     0x000f   077   064   044    Pre-fail  Always       -       0
  1 Raw_Read_Error_Rate     0x000f   084   064   044    Pre-fail  Always       -       0
  1 Raw_Read_Error_Rate     0x000f   082   064   044    Pre-fail  Always       -       0
  1 Raw_Read_Error_Rate     0x000f   081   064   044    Pre-fail  Always       -       0
  1 Raw_Read_Error_Rate     0x000f   082   064   044    Pre-fail  Always       -       0
  1 Raw_Read_Error_Rate     0x000f   083   064   044    Pre-fail  Always       -       0


Considering the amount of drive activity the script is generating, I'm not super concerned (going on 4 days now and just that one CRC error in this round of testing), just want to be careful.
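
For anyone wondering what the `-v 1,raw48:54` adjustment above does: the byte-order suffix `54` tells smartctl to display only bytes 5 and 4 of the 48-bit raw value, i.e. the upper 16 bits. The sketch below does the same split by hand. It assumes the commonly reported (not vendor-documented, as far as I know) Seagate layout of error count in the upper 16 bits and an operation count in the lower 32 bits; the function name `seagate_split` is made up for illustration.

```shell
# Split a 48-bit Seagate Raw_Read_Error_Rate raw value into two fields.
# Assumption (widely reported, not confirmed by this thread):
#   upper 16 bits = error count, lower 32 bits = operation count.
seagate_split() {
    raw=$1
    errors=$(( raw >> 32 ))          # bytes 5 and 4
    ops=$(( raw & 0xFFFFFFFF ))      # bytes 3 through 0
    printf 'errors=%d ops=%d\n' "$errors" "$ops"
}
```

For example, `seagate_split 150994944` reports zero errors, because the whole value fits in the lower 32 bits. That is why a large unadjusted raw number on these drives does not by itself indicate read errors.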
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
Bloody Seagate and their daft SMART numbers :rolleyes:
 
Joined
May 10, 2017
Messages
838
One or two CRC errors from time to time are OK; any more than that I would consider abnormal, though the number you mention isn't that bad.

and yes, running P20 IT firmware

Make sure it's P20.00.07.00; earlier P20 releases have known issues, one of which is abnormal CRC errors.
 

demon

Contributor
Joined
Dec 6, 2014
Messages
117
I am:
Code:
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
 

demon

Contributor
Joined
Dec 6, 2014
Messages
117
Day 5 and no new CRC errors registered, so I think I’m safe to call this good to go. Thanks!
 

powderMonkey

Cadet
Joined
Feb 8, 2020
Messages
2
Hello - first post here - I am looking for this sol.net test script. Every link I find to it is dead, most were posted in 2016. This seems to be the newest conversation about it. Can you point me in the right direction please? : ) Thanks!
 

demon

Contributor
Joined
Dec 6, 2014
Messages
117
I downloaded it from:

ftp://ftp.sol.net/incoming/solnet-array-test-v2.sh

I just pulled it down a couple weeks ago, so I know that link works.

That link came from the following post:

 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
That link's dead this morning. Author @jgreco will likely advise status.
 

demon

Contributor
Joined
Dec 6, 2014
Messages
117
I just downloaded it on my Linux machine with no issues.

Edit: If someone really can't download it from that FTP link, try this:

 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Both links working now... Thx @demon
 

powderMonkey

Cadet
Joined
Feb 8, 2020
Messages
2
Thanks for the replies guys! Yes that sol.net based link was dead for me earlier as well, but it is working now here also. Thanks for the Nextcloud link too! I grabbed both. ; )
 

demon

Contributor
Joined
Dec 6, 2014
Messages
117
Just to cap this off: after the one CRC error early on, I observed no further CRC errors in the logs. The new box is up and running in the old one's stead, and everything is humming along nicely.
 