Mike Bruns
Hi all, quick question here. I wanted to ask before running any potentially destructive commands, not to mention that my Unix/FreeNAS skills are rusty.
I have a FreeNAS 9.10 RAIDZ2 system with 6x 6TB disks, set up a year ago on a Dell PowerEdge server with 32GB of ECC memory. No problems up to now.
================
Two days ago, received a critical alert saying:
The volume fullvolume (ZFS) state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
Device: /dev/ada3, unable to open device
================
I shut down, reseated the SATA and power cables, and rebooted, then ran an extended SMART test (smartctl) on the disk in question. From what I can tell, the disk appears fine:
http://pastebin.com/gYHjUg78
http://pastebin.com/PFiSMLmi
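For reference, the extended test was run roughly like this (device name /dev/ada3 taken from the alert; a sketch, adjust for your setup):

```shell
# Kick off a long (extended) SMART self-test on the suspect disk.
# smartctl prints the expected duration; the test runs in the background.
smartctl -t long /dev/ada3

# After the test completes, review the self-test log...
smartctl -l selftest /dev/ada3

# ...and the full attribute/error output (this is what the pastebins show).
smartctl -a /dev/ada3
```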
================
zpool status shows:

[root@agnas1] ~# zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h5m with 0 errors on Tue Dec 6 03:50:57 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        freenas-boot                                    ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/c97814ea-a44e-11e5-b1d0-f8db88ffc155  ONLINE       0     0     0
            gptid/c9a28995-a44e-11e5-b1d0-f8db88ffc155  ONLINE       0     0     0

errors: No known data errors

  pool: fullvolume
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 21h29m with 0 errors on Sun Dec 18 21:29:10 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        fullvolume                                      DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/83a93d5d-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/846ee8d6-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/8546ac0c-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/860548b7-c1a1-11e5-9c88-782bcb45965d  DEGRADED     0     0   147  too many errors
            gptid/86c56366-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/878eb8ce-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0

errors: No known data errors

[root@agnas1] ~#
=========================================
Questions:
1) What additional testing (if any) should I do on the disk or system? I.e., is there any reason to RMA the disk?
2) If I do want to re-add the disk, what is the procedure? It shows DEGRADED rather than OFFLINE, so a replace option doesn't seem to be available:
http://i.imgur.com/kSIzIMA.png
From the docs, I think the command would be "zpool replace fullvolume gptid/860548b7-c1a1-11e5-9c88-782bcb45965d", but I always hesitate to run a command-line "change" command. My rule of thumb is that if the GUI doesn't allow it, there is a reason.
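For the record, the two options the status message points at would look roughly like this (pool name and gptid taken from the zpool status output above; a sketch of the standard ZFS commands, not something I've confirmed is right for this box):

```shell
# Option 1: if the errors were transient (e.g. the cable that got reseated),
# clear the error counters and let the next scrub confirm the disk is healthy.
zpool clear fullvolume gptid/860548b7-c1a1-11e5-9c88-782bcb45965d

# Option 2: replace the device in place. With a single device argument,
# zpool replace resilvers onto the same disk.
zpool replace fullvolume gptid/860548b7-c1a1-11e5-9c88-782bcb45965d

# Either way, watch the resilver/scrub progress afterwards.
zpool status -v fullvolume
```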
===========================================
Thanks!