Resilvering in progress although zpool status shows every disk online

Scampicfx

Contributor
Joined
Jul 4, 2016
Messages
125
Hey guys,

It looks like I have just run into my first resilver!

I just received the following email from my NAS:

Code:
FreeNAS @ xxx

New alerts:
* Pool storage state is ONLINE: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state..

Current alerts:
* Pool storage state is ONLINE: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state..


I logged into the FreeNAS shell and ran
Code:
zpool status
to identify the failed disk. It shows the following result:

Code:
root@xxx:~ # zpool status
  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Feb  9 14:56:31 2021
        0 scanned at 0/s, 0 issued at 0/s, 62.9T total
        0 resilvered, 0.00% done, no estimated completion time
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/710e2999-3f16-11e7-96ba-0007433aed30  ONLINE       0     1     0
            gptid/721ab656-3f16-11e7-96ba-0007433aed30  ONLINE       0     0     0
            gptid/730fb2a4-3f16-11e7-96ba-0007433aed30  ONLINE       0     0     0
            gptid/741430d1-3f16-11e7-96ba-0007433aed30  ONLINE       0     0     0
            gptid/7515b9cd-3f16-11e7-96ba-0007433aed30  ONLINE       0     0     0
            gptid/760f7faf-3f16-11e7-96ba-0007433aed30  ONLINE       0    49     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/362b5134-9764-11e8-9c79-0007433aed30  ONLINE       0     0     0
            gptid/374cc247-9764-11e8-9c79-0007433aed30  ONLINE       0     0     0
            gptid/386def54-9764-11e8-9c79-0007433aed30  ONLINE       0     0     0
            gptid/3911679c-9764-11e8-9c79-0007433aed30  ONLINE       0     0     0
            gptid/399c038a-9764-11e8-9c79-0007433aed30  ONLINE       0     0     0
            gptid/3a2f6cb7-9764-11e8-9c79-0007433aed30  ONLINE       0     0     0

errors: No known data errors


How exactly can I identify the disk that is being resilvered? I can see a disk with 49 write errors, but it is still displayed as "ONLINE".

Why does the progress stay at 0.00%?
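
One way to narrow down the first question is to scan the config table for rows whose READ, WRITE or CKSUM counter is non-zero. Below is a minimal Python sketch, assuming the plain five-column layout shown above. The sample text is a shortened excerpt of that output, and devices_with_errors is a hypothetical helper name, not part of any ZFS tooling.

```python
# Shortened excerpt of the `zpool status` config table from this post.
ZPOOL_STATUS = """\
        NAME                                            STATE     READ WRITE CKSUM
        storage                                         ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/710e2999-3f16-11e7-96ba-0007433aed30  ONLINE       0     1     0
            gptid/721ab656-3f16-11e7-96ba-0007433aed30  ONLINE       0     0     0
            gptid/760f7faf-3f16-11e7-96ba-0007433aed30  ONLINE       0    49     0
"""


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with a non-zero counter."""
    flagged = {}
    for line in status_text.splitlines():
        fields = line.split()
        # A plain config row has five fields: name, state, three counters.
        # The header row is skipped because its last three fields are not digits.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[2:]):
            continue
        counters = tuple(int(f) for f in fields[2:])
        if any(counters):
            flagged[fields[0]] = counters
    return flagged


for name, (read, write, cksum) in devices_with_errors(ZPOOL_STATUS).items():
    print(f"{name}: read={read} write={write} cksum={cksum}")
```

A row that carries a trailing note after the counters, or an abbreviated counter value, does not match the five-field check and is skipped by this simple sketch.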

EDIT: This is what I see in the console shown in the footer of the FreeNAS GUI:


Code:
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 1c 50 bf f0 00 00 20 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): CAM status: SCSI Status Error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI status: Check Condition
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Command Specific Info: 0
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Progress: 0% (0/65536) complete
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x80: f5 05
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x81: 00 00 00 00 00 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Error 16, Unretryable error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 1c 50 c0 10 00 00 04 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): CAM status: SCSI Status Error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI status: Check Condition
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Command Specific Info: 0
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Progress: 0% (0/65536) complete
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x80: f5 05
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x81: 00 00 00 00 00 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Error 16, Unretryable error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 4b 62 54 4e 00 00 20 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): CAM status: SCSI Status Error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI status: Check Condition
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Command Specific Info: 0
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Progress: 0% (0/65536) complete
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x80: f5 05
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x81: 00 00 00 00 00 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Error 16, Unretryable error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 4b 62 54 6e 00 00 20 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): CAM status: SCSI Status Error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI status: Check Condition
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Command Specific Info: 0
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Progress: 0% (0/65536) complete
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x80: f5 05
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x81: 00 00 00 00 00 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Error 16, Unretryable error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 4b 62 54 8e 00 00 20 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): CAM status: SCSI Status Error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI status: Check Condition
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Command Specific Info: 0
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Progress: 0% (0/65536) complete
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x80: f5 05
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x81: 00 00 00 00 00 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Error 16, Unretryable error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 4b 62 54 ae 00 00 01 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): CAM status: SCSI Status Error
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI status: Check Condition
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Command Specific Info: 0
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Progress: 0% (0/65536) complete
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x80: f5 05
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Descriptor 0x81: 00 00 00 00 00 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): Error 16, Unretryable error
Feb  9 14:56:21 xxx ZFS: vdev state changed, pool_guid=6655185457139183021 vdev_guid=867113203268394805


Is a stuck S.M.A.R.T. self-test the reason for the resilver?
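
To make the console log easier to read, the failed commands can be decoded and the sense messages tallied per device. Below is a minimal Python sketch under stated assumptions: the sample is a shortened excerpt of the log above, the byte layout used is the standard WRITE(10) CDB (opcode 0x2A, 4-byte big-endian LBA in bytes 2 to 5, 2-byte transfer length in bytes 7 to 8), and decode_write10 and summarize are hypothetical helper names. It only summarises what the log says and does not prove the cause.

```python
import re
from collections import Counter

# Shortened excerpt of the console log above (two of the six failed writes).
LOG = """\
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 1c 50 bf f0 00 00 20 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): WRITE(10). CDB: 2a 00 1c 50 c0 10 00 00 04 00
Feb  9 14:56:21 xxx (da14:mpr0:0:26:0): SCSI sense: NOT READY asc:4,9 (Logical unit not ready, self-test in progress)
"""


def decode_write10(cdb_hex):
    """Split a WRITE(10) CDB into (starting LBA, number of blocks)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


def summarize(log_text):
    """Collect decoded writes and count sense messages per device."""
    writes, senses = [], Counter()
    for line in log_text.splitlines():
        m = re.search(r"\((\w+):[^)]*\): (.*)$", line)
        if not m:
            continue
        dev, rest = m.groups()
        if rest.startswith("WRITE(10). CDB: "):
            lba, blocks = decode_write10(rest.split("CDB: ", 1)[1])
            writes.append((dev, lba, blocks))
        elif rest.startswith("SCSI sense: "):
            # The human-readable reason is the trailing parenthesised text.
            note = re.search(r"\(([^)]*)\)\s*$", rest)
            if note:
                senses[(dev, note.group(1))] += 1
    return writes, senses


writes, senses = summarize(LOG)
for dev, lba, blocks in writes:
    print(f"{dev}: WRITE(10) of {blocks} blocks at LBA {lba}")
for (dev, note), count in senses.items():
    print(f"{dev}: {count} x {note}")
```

In the excerpt, every failed write on da14 is paired with the same sense message, "Logical unit not ready, self-test in progress", which suggests the drive rejected the writes while a self-test was running.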

EDIT: I received another email after a few minutes:
Code:
FreeNAS @ xxx

New alert:
* Pool storage state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected..

The following alert has been cleared:
* Pool storage state is ONLINE: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state..

Current alerts:
* Pool storage state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected..


So I guess the resilver has finished, all disks are okay and there is nothing to be worried about?


Thanks for any help!
 

Alecmascot

Guru
Joined
Mar 18, 2014
Messages
1,177
What is the exact model number of these drives?
 

Scampicfx

It's a mix, but most of them are HGST HUH728080AL4200 and HGST HUS726060AL4210.
 

Alecmascot

Is da14 the drive with the write errors?

glabel status shows gptid to device id
 

Scampicfx

Thanks, executing
Code:
glabel status shows gptid to device id 
throws
Code:
glabel: No such geom: shows
 

Alecmascot

Code:
glabel status
 

Scampicfx

Thank you!


Code:
root@xxx:~ # glabel status
                                      Name  Status  Components
gptid/f9e89844-cf18-11e7-82e2-0007433aed30     N/A  nvd0p1
gptid/baf1a6d7-cd29-11e7-b88e-0007433aed30     N/A  ada0p1
gptid/bafee7bd-cd29-11e7-b88e-0007433aed30     N/A  ada1p1
gptid/710e2999-3f16-11e7-96ba-0007433aed30     N/A  da0p2
gptid/721ab656-3f16-11e7-96ba-0007433aed30     N/A  da1p2
gptid/362b5134-9764-11e8-9c79-0007433aed30     N/A  da2p2
gptid/f71cda28-cf18-11e7-82e2-0007433aed30     N/A  da3p2
gptid/f7a2bb9c-cf18-11e7-82e2-0007433aed30     N/A  da4p2
gptid/730fb2a4-3f16-11e7-96ba-0007433aed30     N/A  da5p2
gptid/741430d1-3f16-11e7-96ba-0007433aed30     N/A  da6p2
gptid/374cc247-9764-11e8-9c79-0007433aed30     N/A  da7p2
gptid/f827b158-cf18-11e7-82e2-0007433aed30     N/A  da8p2
gptid/f8b04921-cf18-11e7-82e2-0007433aed30     N/A  da9p2
gptid/7515b9cd-3f16-11e7-96ba-0007433aed30     N/A  da10p2
gptid/399c038a-9764-11e8-9c79-0007433aed30     N/A  da11p2
gptid/386def54-9764-11e8-9c79-0007433aed30     N/A  da12p2
gptid/f934e56f-cf18-11e7-82e2-0007433aed30     N/A  da13p2
gptid/760f7faf-3f16-11e7-96ba-0007433aed30     N/A  da14p2
gptid/3a2f6cb7-9764-11e8-9c79-0007433aed30     N/A  da15p2
gptid/3911679c-9764-11e8-9c79-0007433aed30     N/A  da16p2
gptid/f9c009dd-cf18-11e7-82e2-0007433aed30     N/A  da17p2
gptid/373bbbc6-9764-11e8-9c79-0007433aed30     N/A  da7p1
gptid/740b5518-3f16-11e7-96ba-0007433aed30     N/A  da6p1
gptid/73068028-3f16-11e7-96ba-0007433aed30     N/A  da5p1
gptid/f79115a7-cf18-11e7-82e2-0007433aed30     N/A  da4p1
gptid/f7101b16-cf18-11e7-82e2-0007433aed30     N/A  da3p1
gptid/361647b7-9764-11e8-9c79-0007433aed30     N/A  da2p1
gptid/72126965-3f16-11e7-96ba-0007433aed30     N/A  da1p1
gptid/710664ba-3f16-11e7-96ba-0007433aed30     N/A  da0p1

The drive with the 49 write errors in the zpool status output above is gptid/760f7faf-3f16-11e7-96ba-0007433aed30, so this would be da14p2, which matches the da14 device in the console log.
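
Cross-referencing the two listings in this thread can be sketched in Python. The excerpt and the WRITE_ERRORS values below are copied by hand from the zpool status and glabel status outputs above, and label_to_component and disk_name are hypothetical helper names. With that data, the label carrying 49 write errors, gptid/760f7faf-3f16-11e7-96ba-0007433aed30, resolves to da14p2, the same da14 that appears in the console log.

```python
import re

# Excerpt of the `glabel status` output above (three of the pool members).
GLABEL_STATUS = """\
                                      Name  Status  Components
gptid/710e2999-3f16-11e7-96ba-0007433aed30     N/A  da0p2
gptid/741430d1-3f16-11e7-96ba-0007433aed30     N/A  da6p2
gptid/760f7faf-3f16-11e7-96ba-0007433aed30     N/A  da14p2
"""

# WRITE counters taken from the `zpool status` output earlier in the thread.
WRITE_ERRORS = {
    "gptid/710e2999-3f16-11e7-96ba-0007433aed30": 1,
    "gptid/760f7faf-3f16-11e7-96ba-0007433aed30": 49,
}


def label_to_component(glabel_text):
    """Map each label (first column) to its component (third column)."""
    mapping = {}
    for line in glabel_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] != "Name":
            mapping[fields[0]] = fields[2]
    return mapping


def disk_name(partition):
    """Strip the trailing partition suffix, e.g. da14p2 -> da14."""
    return re.sub(r"p\d+$", "", partition)


mapping = label_to_component(GLABEL_STATUS)
# List the affected labels, highest write-error count first.
for label, errors in sorted(WRITE_ERRORS.items(), key=lambda kv: -kv[1]):
    part = mapping[label]
    print(f"{label} -> {part} (disk {disk_name(part)}), {errors} write errors")
```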
 