Disk Error - Pool Degraded

Status
Not open for further replies.

Mike Bruns

Dabbler
Joined
Dec 9, 2015
Messages
21
Hi all, quick question here. I wanted to ask before running any potentially destructive commands, not to mention that my Unix/FreeNAS skills are rusty.

I have a FreeNAS 9.10 RAIDZ2 system with 6x 6TB disks, set up a year ago on a Dell PowerEdge server with 32GB of ECC memory. No problems until now.

================
Two days ago, I received a critical alert saying:

The volume fullvolume (ZFS) state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
Device: /dev/ada3, unable to open device

================

I shut down, reseated the SATA and power cables, and rebooted. I then ran an extended SMART test with smartctl on the disk in question. From what I can tell, the disk appears fine:

http://pastebin.com/gYHjUg78
http://pastebin.com/PFiSMLmi

================

zpool status shows:

[root@agnas1] ~# zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h5m with 0 errors on Tue Dec 6 03:50:57 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        freenas-boot                                    ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/c97814ea-a44e-11e5-b1d0-f8db88ffc155  ONLINE       0     0     0
            gptid/c9a28995-a44e-11e5-b1d0-f8db88ffc155  ONLINE       0     0     0

errors: No known data errors

  pool: fullvolume
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 21h29m with 0 errors on Sun Dec 18 21:29:10 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        fullvolume                                      DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/83a93d5d-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/846ee8d6-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/8546ac0c-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/860548b7-c1a1-11e5-9c88-782bcb45965d  DEGRADED     0     0   147  too many errors
            gptid/86c56366-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0
            gptid/878eb8ce-c1a1-11e5-9c88-782bcb45965d  ONLINE       0     0     0

errors: No known data errors
[root@agnas1] ~#
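
For reference, the device table above can be filtered mechanically, and the gptid label can be tied back to the /dev/ada3 named in the alert. The sketch below is illustrative only: `flag_vdevs` and `gptid_to_disk` are hypothetical helper names, not FreeNAS tools. Both only read text on stdin, so nothing on the pool is changed. It assumes the five-column `zpool status` layout shown above and a three-column `glabel status` layout (Name, Status, Components).

```shell
#!/bin/sh
# Hypothetical read-only helpers; neither one modifies the pool.

# Print every row of a `zpool status` device table whose state is not
# ONLINE or whose READ/WRITE/CKSUM counters are not all zero.
# Parent rows (the pool and the raidz2 vdev) are listed too when degraded.
flag_vdevs() {
    awk 'NF >= 5 && $3 ~ /^[0-9]/ && $4 ~ /^[0-9]/ && $5 ~ /^[0-9]/ &&
         ($2 != "ONLINE" || $3 + $4 + $5 > 0) {
             print $1, $2, "read=" $3, "write=" $4, "cksum=" $5
         }'
}

# Given `glabel status` output on stdin, print the disk behind a gptid
# label, with the partition suffix (e.g. p2) stripped.
gptid_to_disk() {
    awk -v id="$1" '$1 == id { sub(/p[0-9]+$/, "", $3); print $3 }'
}

# Usage on the NAS itself (assumed, not run here):
#   zpool status fullvolume | flag_vdevs
#   glabel status | gptid_to_disk gptid/860548b7-c1a1-11e5-9c88-782bcb45965d
```

Confirming that the flagged gptid really maps to the disk named in the alert is worth doing before pulling any hardware.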

=========================================

Questions:

1) What additional testing (if any) should I do on the disk or system? I.e., is there any reason to RMA the disk?
2) If I do want to re-add the disk, what is the procedure? It shows Degraded rather than Offline, so a replace option doesn't seem to be available.
http://i.imgur.com/kSIzIMA.png

From the docs, I think the command would be "zpool replace fullvolume gptid/860548b7-c1a1-11e5-9c88-782bcb45965d", but I always hesitate to run a command-line "change" command. My rule of thumb is that if the GUI doesn't allow it, there is a reason.

===========================================

Thanks!
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
There are zero CLI steps when using FreeNAS. Never run them or you will probably mess everything up. You should check all your disks for SMART failures and make sure you have the right disk. You can try to use the GUI to replace the drive with itself, but I would probably replace it. Disks don't just drop out of your system randomly. There is a hardware failure somewhere.

 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
ada3 logged one uncorrectable read error about 4-5 months ago. Consider pulling it and burning it in again with a destructive badblocks test. Perhaps also run benchmarks on it. If it passes, replace it with itself; otherwise, return it. Alternatively, you could just keep an eye on it.
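
A burn-in along those lines might look roughly like the sketch below. It is a dry run: `burn_in_plan` is a hypothetical helper that only prints the commands, because `badblocks -w` overwrites the whole disk and must never be pointed at a disk that is still a pool member. The `-b 4096` block size is an assumption chosen for a 6TB disk, and the device name is whatever the pulled disk shows up as on the test machine.

```shell
#!/bin/sh
# Dry run only: prints a burn-in plan for one disk, executes nothing.
burn_in_plan() {
    dev="$1"
    # 1. Start an extended SMART self-test (non-destructive). It runs in
    #    the background, so wait for it to finish before step 2.
    echo "smartctl -t long $dev"
    # 2. Destructive write/verify pass over the whole disk; erases all data.
    echo "badblocks -b 4096 -ws $dev"
    # 3. Review SMART attributes and the self-test log afterwards.
    echo "smartctl -a $dev"
}

burn_in_plan /dev/ada3
```

If the reallocated or pending sector counts in the final SMART report have grown, or badblocks reports any bad blocks, that would argue for returning the disk rather than putting it back.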
 