My first time with degraded volume. Where to start.

Status
Not open for further replies.

willrun4fun

Dabbler
Joined
Jan 19, 2016
Messages
29
I am not onsite now, but will head in to take a look at it later this morning. I just want to get some pointers on what should be done. I don't have a spare drive on hand. If the drive has failed, is it OK to keep running over the weekend? This is a backup repository for a group of workstations.

The drives are all WD Red 4 TB and the system is only a few months old. The motherboard is a Supermicro X10SLL-F-O.

The volume vol1 (ZFS) state is DEGRADED: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state.

Here is the Daily Run Output email:

Code:
Checking status of zfs pools:
NAME  SIZE  ALLOC  FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
freenas-boot  14.5G  1.56G  12.9G  -  -  10%  1.00x  ONLINE  -
vol1  29T  5.78T  23.2T  -  8%  19%  1.00x  DEGRADED  /mnt

  pool: vol1
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
  the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
  see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 4h50m with 0 errors on Sun Oct 16 04:50:58 2016
config:

  NAME  STATE  READ WRITE CKSUM
  vol1  DEGRADED  0  0  0
  raidz2-0  DEGRADED  0  0  0
  gptid/a184af89-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
  gptid/a235d65f-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
  gptid/a2e3f441-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
  gptid/a38f99bc-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
  gptid/a4436390-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
  297144162559208937  UNAVAIL  0  75  0  was /dev/gptid/a501faad-5424-11e6-ad63-0cc47aacb640
  gptid/a5b42fe3-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
  gptid/a664a78f-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0

errors: No known data errors

-- End of daily output --


Here is a section of the log that keeps repeating:

Code:
>  (da1:mps0:0:1:0): WRITE(10). CDB: 2a 00 a4 51 69 a0 00 00 08 00 length 4096 SMID 237 terminated ioc 804b scsi 0 state c xfer 0
> (da1:mps0:0:1:0): WRITE(10). CDB: 2a 00 a4 51 69 a0 00 00 08 00 
> (da1:mps0:0:1:0): CAM status: CCB request completed with an error
> (da1:mps0:0:1:0): Retrying command
> (da1:mps0:0:1:0): WRITE(10). CDB: 2a 00 a4 51 69 a0 00 00 08 00 
> (da1:mps0:0:1:0): CAM status: SCSI Status Error
> (da1:mps0:0:1:0): SCSI status: Check Condition
> (da1:mps0:0:1:0): SCSI sense: NOT READY asc:4,0 (Logical unit not ready, cause not reportable)
> (da1:mps0:0:1:0): Retrying command (per sense data)
> (da1:mps0:0:1:0): WRITE(10). CDB: 2a 00 a4 51 69 a0 00 00 08 00 
> (da1:mps0:0:1:0): CAM status: SCSI Status Error
> (da1:mps0:0:1:0): SCSI status: Check Condition
> (da1:mps0:0:1:0): SCSI sense: NOT READY asc:4,0 (Logical unit not ready, cause not reportable)
> (da1:mps0:0:1:0): Retrying command (per sense data)
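
For anyone reading a repeating log like this, a quick tally per device makes it easier to see whether the errors come from a single disk. This is only a minimal sketch: the sample lines are copied from the excerpt above, and the `count_cam_errors` helper and its regular expression are illustrative assumptions, not part of FreeNAS.

```python
import re
from collections import Counter

# Sample lines copied from the log excerpt above.
LOG = """\
> (da1:mps0:0:1:0): WRITE(10). CDB: 2a 00 a4 51 69 a0 00 00 08 00
> (da1:mps0:0:1:0): CAM status: CCB request completed with an error
> (da1:mps0:0:1:0): Retrying command
> (da1:mps0:0:1:0): CAM status: SCSI Status Error
> (da1:mps0:0:1:0): SCSI sense: NOT READY asc:4,0 (Logical unit not ready, cause not reportable)
"""

# Matches "(da1:mps0:0:1:0): CAM status: <text>" and captures device and status.
CAM_LINE = re.compile(
    r"\((?P<dev>[a-z]+\d+):[a-z]+\d+:[\d:]+\): CAM status: (?P<status>.+)"
)


def count_cam_errors(text):
    """Count CAM status messages per (device, status) pair (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        match = CAM_LINE.search(line)
        if match:
            counts[(match.group("dev"), match.group("status").strip())] += 1
    return counts


if __name__ == "__main__":
    for (dev, status), n in sorted(count_cam_errors(LOG).items()):
        print(f"{dev}: {status} x{n}")
```

In this excerpt every message refers to da1, which is consistent with one disk (or its cable) being the problem rather than the controller as a whole.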
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
It does sound like the disk and/or cable has died. Yes, you can continue to run the system in this state, though your redundancy is obviously compromised. When you get a replacement, burn it in properly and follow the manual's instructions for replacing a failed disk.
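
To pin down which pool member needs replacing, the config section of the status output above can be scanned for devices in a failed state. The following is a rough sketch, not an official tool: the `unhealthy_members` helper is hypothetical, the sample text is a shortened copy of the output posted above, and the set of states treated as failed is an assumption.

```python
# Shortened copy of the config section from the status output above.
STATUS = """\
  NAME  STATE  READ WRITE CKSUM
  vol1  DEGRADED  0  0  0
  raidz2-0  DEGRADED  0  0  0
  gptid/a184af89-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
  297144162559208937  UNAVAIL  0  75  0  was /dev/gptid/a501faad-5424-11e6-ad63-0cc47aacb640
  gptid/a5b42fe3-5424-11e6-ad63-0cc47aacb640  ONLINE  0  0  0
"""

# Assumption: these states mark a member that needs attention.
FAILED_STATES = {"UNAVAIL", "FAULTED", "REMOVED", "OFFLINE"}


def unhealthy_members(config_text):
    """Return (name, state, former_path) for members in a failed state."""
    found = []
    for line in config_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything too short to be a device row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        name, state = fields[0], fields[1]
        if state not in FAILED_STATES:
            continue
        # A missing device is listed by numeric GUID, followed by "was <old path>".
        former = fields[fields.index("was") + 1] if "was" in fields else None
        found.append((name, state, former))
    return found


if __name__ == "__main__":
    for name, state, former in unhealthy_members(STATUS):
        print(name, state, former)
```

On FreeBSD-based systems, `glabel status` lists which disk partition each gptid label belongs to, which helps match the reported gptid to a physical drive before pulling it.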
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
If it were a primary I'd be telling you to refresh the backup.

As it's a backup... you just need to get a replacement disk burned in and replace the failed disk asap.
 

willrun4fun

Dabbler
Joined
Jan 19, 2016
Messages
29
This is the only FreeNAS system, so I guess that makes it the primary. What I was referring to is that a bunch of Windows systems are running their backups to it.

Ok. Got a better look at it. Is it that the drive is online but for some reason dropped out of the volume? Here are a couple of screen caps. Looking for advice here.
fuSzeMy.png

qAPiLRo.png
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
I understand the server holds the backups for several Windows PCs, so losing it is a risk if one of them needs to be restored. I'm not sure you can find somewhere to back up what's in your pool, but do so if you can, because:

- Burning in a new HD will take time;
- You are down to one disk of redundancy;
- Resilvering stresses the other disks in your pool and can cause another one to fail.

In the process check all your cables :)

Have I mentioned back-up if you can?
 