Resilvering + Uncorrectable parity/CRC errors

Status
Not open for further replies.

antsrealm

Explorer
Joined
Jan 28, 2013
Messages
82
Hi,

I've just started replacing a WD Green drive, which had 2 offlined sectors reported by a SMART test, with a Red. The resilvering is in progress, but now I am getting parity/CRC errors again. I didn't have these on the old drive; as soon as I swapped the drive I started getting heaps of them. First question: is this a problem? For now I'm going to let the resilvering complete. Second question: where to from here? I think it's time to get a SATA card rather than using the onboard ports. I've had these errors in the past and solved them by changing ports and cables.

Here is the zpool status of the resilvering process:-

Code:
[root@freenas ~]# zpool status                                                                                                     
  pool: NAS                                                                                                                       
state: DEGRADED                                                                                                                   
status: One or more devices is currently being resilvered.  The pool will                                                         
        continue to function, possibly in a degraded state.                                                                       
action: Wait for the resilver to complete.                                                                                         
  scan: resilver in progress since Mon May 12 11:41:51 2014                                                                       
        163G scanned out of 9.81T at 158M/s, 17h46m to go                                                                         
        32.6G resilvered, 1.62% done                                                                                               
config:                                                                                                                           
                                                                                                                                   
        NAME                                              STATE    READ WRITE CKSUM                                               
        NAS                                              DEGRADED    0    0    0                                               
          raidz1-0                                        DEGRADED    0    0    0                                               
            gptid/71cbb176-bcb1-11e1-aa48-c8600014c8a1    ONLINE      0    0    0                                               
            gptid/7266b6e7-bcb1-11e1-aa48-c8600014c8a1    ONLINE      0    0    0                                               
            gptid/72f9c4f5-bcb1-11e1-aa48-c8600014c8a1    ONLINE      0    0    0                                               
            gptid/738c5264-bcb1-11e1-aa48-c8600014c8a1    ONLINE      0    0    0                                               
            replacing-4                                  OFFLINE      0    0    0                                               
              6054011502170042877                        OFFLINE      0    0    0  was /dev/gptid/74228c86-bcb1-11e1-aa48-c8600014c8a1
              gptid/96cee5b7-d976-11e3-86e0-3085a9a218e2  ONLINE      0    0    0  (resilvering)                               
                                                                                                                                   
errors: No known data errors                                                                                                       
[root@freenas ~]#                  
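
As a rough sanity check on the "to go" estimate in that output, the remaining time can be recomputed from the scanned size, total size, and scan rate. This is only a sketch: the helper names are made up for illustration, and it assumes zpool prints sizes in binary units (T = 1024^4 bytes and so on).

```python
# Assumption: zpool status sizes use binary (power-of-1024) units.
SIZE_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert a size such as '9.81T' or '158M' to a byte count."""
    return float(text[:-1]) * SIZE_UNITS[text[-1]]


def resilver_eta_seconds(scanned, total, rate_per_second):
    """Estimate the seconds left, given the current scan rate."""
    remaining = to_bytes(total) - to_bytes(scanned)
    return remaining / to_bytes(rate_per_second)


# Values taken from the zpool status output above.
eta = resilver_eta_seconds("163G", "9.81T", "158M")
hours, rem = divmod(int(eta), 3600)
print(f"about {hours}h{rem // 60:02d}m remaining")
```

The result lands within a couple of minutes of the 17h46m that zpool reports; the small difference comes from the rounded figures in the output. The estimate only holds if the scan rate stays constant, which it usually does not.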


And here are the errors I'm seeing in the GUI shell in the footer:

Code:
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 28 14 87 40 5b 00 00 01 00 00
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 a8 2f 87 40 5b 00 00 01 00 00
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 68 50 87 40 5b 00 00 01 00 00
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 68 72 87 40 5b 00 00 01 00 00
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:27 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:28 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 68 7f 87 40 5b 00 00 01 00 00
May 12 12:05:28 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:28 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:28 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 68 94 87 40 5b 00 00 01 00 00
May 12 12:05:28 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:28 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:29 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 a8 29 98 40 5b 00 00 01 00 00
May 12 12:05:29 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:29 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:29 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 a8 5d 98 40 5b 00 00 01 00 00
May 12 12:05:29 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:29 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:30 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 a8 13 9a 40 5b 00 00 01 00 00
May 12 12:05:30 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:30 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
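
To see whether the errors are confined to one device and channel, the log can be tallied with a short script. This is a minimal sketch that assumes the message format shown above; the function name and the regular expression are illustrative, not part of FreeNAS.

```python
import re
from collections import Counter

# Matches lines such as:
# "... kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error"
CRC_LINE = re.compile(
    r"\((?P<dev>[a-z]+\d+):(?P<chan>[a-z]+\d+):[\d:]+\): "
    r"CAM status: Uncorrectable parity/CRC error"
)


def count_crc_errors(log_text):
    """Return a Counter mapping (device, channel) to its CRC error count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CRC_LINE.search(line)
        if match:
            counts[(match.group("dev"), match.group("chan"))] += 1
    return counts


# A few lines copied from the log above.
sample = """\
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): Retrying command
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 28 14 87 40 5b 00 00 01 00 00
May 12 12:05:26 freenas kernel: (ada4:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
"""

for (dev, chan), n in count_crc_errors(sample).items():
    print(f"{dev} on {chan}: {n} CRC errors")
```

If every hit is on the same device and channel, as in the excerpt here, that points at one cable, port, or drive rather than the whole controller.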
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Those CRC errors are often a sign of a bad SATA cable. I'd ignore the errors until your resilvering completes since you have zero redundancy. Once the resilvering is done replace the data cable for ada4.
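
One way to check whether a cable swap helped is to watch SMART attribute 199 (usually named UDMA_CRC_Error_Count, though the name varies by vendor) in the `smartctl -A` output for the drive. The sketch below parses that table; the function name is hypothetical and the sample values are invented purely for illustration.

```python
def udma_crc_error_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is
        # the tenth column of the table printed by `smartctl -A`.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


# Invented sample in the layout `smartctl -A` prints (not real drive data).
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       37
"""

print(udma_crc_error_count(sample))
```

The raw counter is cumulative and does not reset, so the useful signal is whether it keeps climbing after the cable has been replaced.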
 

antsrealm

Explorer
Joined
Jan 28, 2013
Messages
82

Hey Cyberjock, has anyone had any of these CRC/parity issues using an M1015 ServeRAID card? If not, I think I might bite the bullet and just get one.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'm sure people have. All hardware is susceptible to various problems. A cable problem (as yours appears to be) isn't something the M1015 can fix. The controller itself can only be identified as the problem once you've ruled out the other potential causes.
 

antsrealm

Explorer
Joined
Jan 28, 2013
Messages
82

OK, thanks Cyberjock. The resilvering is complete now and the pool is back to healthy. I'll try a new cable again today. I also upgraded FreeNAS to the latest version and noticed that zpool status says I should run a zpool upgrade, but when I do, it doesn't seem to do anything. What should I do here?

Code:
[root@freenas ~]# zpool status                                                                                                     
  pool: NAS                                                                                                                       
state: ONLINE                                                                                                                     
status: Some supported features are not enabled on the pool. The pool can                                                         
        still be used, but some features are unavailable.                                                                         
action: Enable all features using 'zpool upgrade'. Once this is done,                                                             
        the pool may no longer be accessible by software that does not support                                                     
        the features. See zpool-features(7) for details.                                                                           
  scan: resilvered 1.96T in 18h55m with 0 errors on Tue May 13 06:37:13 2014                                                       
config:                                                                                                                           
                                                                                                                                   
        NAME                                            STATE    READ WRITE CKSUM                                                 
        NAS                                            ONLINE      0    0    0                                                 
          raidz1-0                                      ONLINE      0    0    0                                                 
            gptid/71cbb176-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/7266b6e7-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/72f9c4f5-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/738c5264-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/96cee5b7-d976-11e3-86e0-3085a9a218e2  ONLINE      0    0    0                                                 
                                                                                                                                   
errors: No known data errors                                                                                                       
[root@freenas ~]# zpool upgrade                                                                                                   
This system supports ZFS pool feature flags.                                                                                       
                                                                                                                                   
All pools are formatted using feature flags.                                                                                       
                                                                                                                                   
                                                                                                                                   
Some supported features are not enabled on the following pools. Once a                                                             
feature is enabled the pool may become incompatible with software                                                                 
that does not support the feature. See zpool-features(7) for details.                                                             
                                                                                                                                   
POOL  FEATURE                                                                                                                     
---------------                                                                                                                   
NAS                                                                                                                               
      multi_vdev_crash_dump                                                                                                       
      spacemap_histogram                                                                                                           
      enabled_txg                                                                                                                 
      hole_birth                                                                                                                   
      extensible_dataset                                                                                                           
      bookmarks                                                                                                                   
                                                                                                                                   
[root@freenas ~]#                                                                                                                 
                                                
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Typically if you have not had the errors prior to replacing the drive then you can be fairly certain you disturbed the ada4 cable. First just reseat both ends. If that doesn't solve the issue then replace the SATA cable.
 

antsrealm

Explorer
Joined
Jan 28, 2013
Messages
82

Yeah, I have tried reseating the SATA cable, then replacing the cable, then swapping to the other spare SATA port (although I've had this issue in the past on this port). To be honest, I'm fed up with it. I'm seriously considering buying the M1015 and the SAS-to-SATA cables, flashing the card to IT mode, and trying that. I just don't really want to add to the power bill when I don't need to.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah.. if you've swapped cables and ports I'd just get the M1015 and be done with it. I've got an M1015 and it's been completely flawless for me.
 

antsrealm

Explorer
Joined
Jan 28, 2013
Messages
82
Now I am getting the system warning in the GUI, and data is being lost on ada4, causing the resilvering to occur. Here is the current zpool status:

Code:
[root@freenas ~]# zpool status                                                                                                     
  pool: NAS                                                                                                                       
state: ONLINE                                                                                                                     
status: One or more devices has experienced an unrecoverable error.  An                                                           
        attempt was made to correct the error.  Applications are unaffected.                                                       
action: Determine if the device needs to be replaced, and clear the errors                                                         
        using 'zpool clear' or replace the device with 'zpool replace'.                                                           
  see: http://illumos.org/msg/ZFS-8000-9P                                                                                         
  scan: scrub in progress since Thu May 15 09:44:28 2014                                                                           
        38.7G scanned out of 9.93T at 149M/s, 19h19m to go                                                                         
        384K repaired, 0.38% done                                                                                                 
config:                                                                                                                           
                                                                                                                                   
        NAME                                            STATE    READ WRITE CKSUM                                                 
        NAS                                            ONLINE      0    0    0                                                 
          raidz1-0                                      ONLINE      0    0    0                                                 
            gptid/71cbb176-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/7266b6e7-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/72f9c4f5-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/738c5264-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0                                                 
            gptid/96cee5b7-d976-11e3-86e0-3085a9a218e2  ONLINE    146 2.75K    0  (repairing)                                   
                                                                                                                                   
errors: No known data errors                                                                                                       
[root@freenas ~]#                            
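
To pick out which device is accumulating errors without reading the whole table, the config section can be filtered for nonzero counters. This is a sketch only: the helper names are made up, and treating the K/M/G suffixes as multiples of 1024 is an assumption (the abbreviated figures are approximate in any case).

```python
# Device states that can appear in the STATE column of the config table.
STATES = {"ONLINE", "DEGRADED", "OFFLINE", "FAULTED", "UNAVAIL", "REMOVED"}
# Assumption: abbreviated counters use 1024-based suffixes.
COUNT_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_count(text):
    """Convert a counter such as '146' or '2.75K' to an integer."""
    if text[-1] in COUNT_UNITS:
        return round(float(text[:-1]) * COUNT_UNITS[text[-1]])
    return int(text)


def devices_with_errors(status_text):
    """Yield (name, read, write, cksum) for rows with any nonzero counter."""
    for line in status_text.splitlines():
        fields = line.split()
        # Device rows look like: NAME STATE READ WRITE CKSUM [note]
        if len(fields) < 5 or fields[1] not in STATES:
            continue
        try:
            read, write, cksum = (parse_count(f) for f in fields[2:5])
        except ValueError:
            continue
        if read or write or cksum:
            yield fields[0], read, write, cksum


# Rows copied from the zpool status output above.
sample = """\
        NAME                                            STATE    READ WRITE CKSUM
        NAS                                            ONLINE      0    0    0
          raidz1-0                                      ONLINE      0    0    0
            gptid/738c5264-bcb1-11e1-aa48-c8600014c8a1  ONLINE      0    0    0
            gptid/96cee5b7-d976-11e3-86e0-3085a9a218e2  ONLINE    146 2.75K    0  (repairing)
"""

for name, read, write, cksum in devices_with_errors(sample):
    print(f"{name}: read={read} write={write} cksum={cksum}")
```

Here only the newly resilvered disk is flagged, with read and write errors but no checksum errors.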


I have again tried swapping to the other SATA port and, again, a new cable. I have now ordered the M1015 off eBay along with 2 SAS-SATA forward breakout cables. Am I safe to keep running the system in the meantime, and do you have any advice for me on swapping to the M1015?

Thanks,
Tony.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
My advice is, once resilvering is done, turn off your system and wait for the new card to arrive.
 

antsrealm

Explorer
Joined
Jan 28, 2013
Messages
82

Ok.

On changing to the M1015: after I flash it to IT mode, install the card, and connect the drives, will FreeNAS automatically see that my 5 drives are now connected to the M1015, identify the serial numbers, and reconnect my pool without my needing to do anything further? Or is there a process I should follow for the changeover?

Thanks,
Tony.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Correct.
 

antsrealm

Explorer
Joined
Jan 28, 2013
Messages
82
Just a follow up.

I received the M1015 and SAS-SATA cables today and, after a bit of mucking around getting the flashing to work, I have successfully got the NAS back up and running without any parity/CRC errors on the new card. The system is back to healthy and all is good again.

I'll need to do a full scrub tonight.

Thanks for the advice.

Tony.
 