Degraded Pool

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Here is the zpool status -v code
Something didn't work there... That's the help text.

Did you type:

zpool status -v

At the command prompt?
 

jspfunk

Dabbler
Joined
Oct 26, 2017
Messages
48
Ok, here is the output. I typed zpool stats -v before, by mistake. The resilver is still at 1% after going on 48 hours. It's the Plex2 pool that's being resilvered.

Code:
# zpool status -v
  pool: Plex
state: ONLINE
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0 days 07:48:29 with 0 errors on Sun Sep  1 07:48:39 2019
config:

        NAME                                          STATE     READ WRITE CKSUM
        Plex                                          ONLINE       0     0     0
          gptid/04bc2da1-b37e-11e7-8981-bcaec5bc5b7b  ONLINE       0     0     0
          gptid/68654533-b420-11e7-9d4b-bcaec5bc5b7b  ONLINE       0     0     0
        cache
          5852771565860687755                         UNAVAIL      0     0     0  was /dev/gptid/051fb789-b37e-11e7-8981-bcaec5bc5b7b

errors: No known data errors

  pool: Plex2
state: UNAVAIL
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Sep  6 18:56:09 2019
        297G scanned at 1.83M/s, 59.6G issued at 377K/s, 7.31T total
        5.24G resilvered, 0.80% done, no estimated completion time
config:

        NAME                                              STATE     READ WRITE CKSUM
        Plex2                                             UNAVAIL    141     0   366
          raidz1-0                                        UNAVAIL    299     0   986
            replacing-0                                   UNAVAIL      0     0     0
              6419097555458380165                         UNAVAIL      0     0     0  was /dev/gptid/eacf0ae1-b798-11e9-a14c-bcaec5bc5b7b
              gptid/a809e59e-d112-11e9-90f6-bcaec5bc5b7b  ONLINE       0     0     0
            gptid/ec1e9c1b-b798-11e9-a14c-bcaec5bc5b7b    DEGRADED     0     0     0  too many errors
            15000897179839332203                          REMOVED      0     0     0  was /dev/ada1p2

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0x1>
        <metadata>:<0x29>
        <metadata>:<0x2b>
        <metadata>:<0x2e>
        <metadata>:<0x2f>
        <metadata>:<0x39>
        <metadata>:<0x3c>
        <metadata>:<0x3d>
        <metadata>:<0x3f>
        <metadata>:<0x42>
        <metadata>:<0x43>
        <metadata>:<0x4c>
        <metadata>:<0x53>
        <metadata>:<0x54>
        <metadata>:<0x57>
        <metadata>:<0x58>
        <metadata>:<0x5a>
        <metadata>:<0x5e>
        <metadata>:<0x5f>
        <metadata>:<0x60>
        <metadata>:<0x62>
        <metadata>:<0x63>
        <metadata>:<0x66>
        <metadata>:<0x67>
        <metadata>:<0x68>
        <metadata>:<0x6a>
        <metadata>:<0x6b>
        <metadata>:<0x6d>
        <metadata>:<0x70>
        <metadata>:<0x71>
        <metadata>:<0x73>
        <metadata>:<0x75>
        <metadata>:<0x77>
        <metadata>:<0x7b>
        <metadata>:<0x7d>
        <metadata>:<0x7f>
        <metadata>:<0x87>
        <metadata>:<0x89>
        <metadata>:<0x8b>
        <metadata>:<0x95>
        <metadata>:<0x99>
        <metadata>:<0x9a>
        <metadata>:<0x9b>
        <metadata>:<0x9c>
        <metadata>:<0x9d>
        <metadata>:<0xa1>
        <metadata>:<0xb3>
        <metadata>:<0xb8>
        <metadata>:<0xb9>
        <metadata>:<0xbc>
        <metadata>:<0xbe>
        <metadata>:<0xc0>
        <metadata>:<0xc2>
        <metadata>:<0xc4>
        <metadata>:<0xc8>
        <metadata>:<0xca>
        <metadata>:<0xcd>
        <metadata>:<0xd1>
        <metadata>:<0xd4>
        <metadata>:<0xd5>
        <metadata>:<0xdc>
        <metadata>:<0xe6>
        Plex2:<0x1>
        Plex2/Media:<0x0>
        Plex2/Media:<0x10202>
        Plex2/Media:<0x10106>
        Plex2/Media:<0x10406>
        Plex2/Media:<0x1010f>
        Plex2/Media:<0x10211>
        Plex2/Media:<0x10412>
        Plex2/Media:<0x10016>
        Plex2/Media:<0x10419>
        Plex2/Media:<0x1021d>
        Plex2/Media:<0x1001f>
        Plex2/Media:<0x10420>
        Plex2/Media:<0x10520>
        Plex2/Media:<0x10127>
        Plex2/Media:<0x10427>
        Plex2/Media:<0x10433>
        Plex2/Media:<0x10634>
        Plex2/Media:<0x10040>
        Plex2/Media:<0x1014b>
        Plex2/Media:<0x1004f>
        Plex2/Media:<0x10268>
        Plex2/Media:<0x10471>
        Plex2/Media:<0x10483>
        Plex2/Media:<0x1018a>
        Plex2/Media:<0x10094>
        Plex2/Media:<0x10595>
        Plex2/Media:<0x1049b>
        Plex2/Media:<0x1019c>
        Plex2/Media:<0x1019f>
        Plex2/Media:<0x101a5>
        Plex2/Media:<0x100ac>
        Plex2/Media:<0x104b1>
        Plex2/Media:<0x105b9>
        Plex2/Media:<0x103c4>
        Plex2/Media:<0x101c6>
        Plex2/Media:<0x105c8>
        Plex2/Media:<0x100ca>
        Plex2/Media:<0x103ca>
        Plex2/Media:<0x104cf>
        Plex2/Media:<0x103d6>
        Plex2/Media:<0x102e3>
        Plex2/Media:<0x104e3>
        Plex2/Media:<0x106eb>
        Plex2/Media:<0x104ec>
        Plex2/Media:<0x103ee>
        Plex2/Media:<0x102ef>
        Plex2/Media:<0x101f0>
        Plex2/Media:<0x104f2>

  pool: freenas-boot
state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 00:26:27 with 2 errors on Sun Sep  8 04:11:27 2019
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     3
          da0p2     ONLINE       0     0     6

errors: Permanent errors have been detected in the following files:

        freenas-boot/ROOT/11.2-U5@2018-12-22-15:41:01:/usr/local/lib/python3.6/site-packages/django/contrib/admin/static/admin/fonts/Roboto-Regular-webfont.woff
        //usr/local/lib/python3.6/site-packages/middlewared/__pycache__/job.cpython-36.pyc
 

PhiloEpisteme

Joined
Oct 18, 2018
Messages
969
Code:
        Plex2                                             UNAVAIL    141     0   366
          raidz1-0                                        UNAVAIL    299     0   986
            replacing-0                                   UNAVAIL      0     0     0
              6419097555458380165                         UNAVAIL      0     0     0  was /dev/gptid/eacf0ae1-b798-11e9-a14c-bcaec5bc5b7b
              gptid/a809e59e-d112-11e9-90f6-bcaec5bc5b7b  ONLINE       0     0     0
            gptid/ec1e9c1b-b798-11e9-a14c-bcaec5bc5b7b    DEGRADED     0     0     0  too many errors
            15000897179839332203                          REMOVED      0     0     0  was /dev/ada1p2

If I am reading this correctly, you've got a RAIDZ1 vdev in which
  • 1 drive, /dev/gptid/eacf0ae1-b798-11e9-a14c-bcaec5bc5b7b, is unavailable and being replaced by gptid/a809e59e-d112-11e9-90f6-bcaec5bc5b7b (the resilver target)
  • 1 drive, gptid/ec1e9c1b-b798-11e9-a14c-bcaec5bc5b7b, is still present but DEGRADED ("too many errors")
  • 1 drive, 15000897179839332203, was removed
This vdev is not going to be able to finish the resilvering process: RAIDZ1 tolerates the loss of only one drive, and two are missing. The fact that so many drives have dropped off may suggest a cable or controller issue. Would you mind posting your complete system specs, including memory, motherboard, CPU, and any hard drive controllers? Exact model numbers are helpful. It may also be worth double-checking that all of your drives have both their power and data cables properly seated.
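
The RAIDZ1 reasoning above can be written down as a rough Python check. This is only a sketch: the member states are transcribed by hand from the raidz1-0 lines of the status output, the replacing-0 slot is counted as a single member, and treating DEGRADED as still usable is an assumption.

```python
# Parity level (number of members a vdev can lose) per RAIDZ type.
RAIDZ_PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}

# Assumption: a member counts as usable only in these states.
USABLE_STATES = {"ONLINE", "DEGRADED"}


def vdev_survives(vdev_type, member_states):
    """Return True if the unusable members do not exceed the parity level."""
    missing = sum(1 for state in member_states if state not in USABLE_STATES)
    return missing <= RAIDZ_PARITY[vdev_type]


# Plex2 raidz1-0 as posted: replacing-0 (UNAVAIL), one DEGRADED disk, one REMOVED disk.
plex2_members = ["UNAVAIL", "DEGRADED", "REMOVED"]
print(vdev_survives("raidz1", plex2_members))  # False: two members missing, parity is one
```

With only one member missing the same check returns True, which is why getting either of the dropped disks back matters so much.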
 

sretalla

Your pool is in serious trouble and I doubt you will be able to recover all of the data on it.

It looks to me like you might have selected the wrong disk to replace and now you don't have enough disks in there to work with.

Since there are no disks in that pool that show resilvering (and it's not going to be mathematically possible for it to happen), I think the answer is that it's not resilvering at all and will never finish.

Perhaps the resilvering progress shown at the pool level is from a point when you still had enough disks present.
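
For a sense of scale, here is a rough Python estimate that takes the figures from the Plex2 scan lines (59.6G issued at 377K/s, 7.31T total) at face value. It assumes zpool's K/G/T suffixes are binary units and that the rate stays constant, neither of which is guaranteed.

```python
# Figures copied from the Plex2 "scan:" lines in the zpool status output.
# Assumption: the K/G/T suffixes are powers of 1024.
KIB, GIB, TIB = 1024, 1024**3, 1024**4

total_bytes = 7.31 * TIB
issued_bytes = 59.6 * GIB
issue_rate = 377 * KIB  # bytes per second


def days_remaining(total, done, rate):
    """Naive estimate: remaining bytes divided by a constant rate, in days."""
    return (total - done) / rate / 86400


print(round(days_remaining(total_bytes, issued_bytes, issue_rate)))
```

Even if the resilver were actually progressing, that rate works out to many months, not days.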

You may have one last chance to recover something if you can bring one or both of the "bad" disks back online (in addition to the ones currently online); resilvering might then have a small chance of resuming.
 

jspfunk

@PhiloEpisteme and @sretalla
I have put my system in my signature, so hopefully it will show now. Ok, if I lose the data from the Plex2 pool it's ok. It will be just from the last 30 or so days. I have a backup of my on the Plex pool. I'm not sure what to do about the drives not showing. I reseeded the connections, so I am wondering if my motherboard is having issues with the SATA ports. Once I get that figured out, can I use my config file to recreate the Pools? or will I have to recreate the Plex2 pool from scratch?

It looks like the drives are recognized by Freenas. All 3 (ada1, ada2, ada3) for Plex2 are showing. I know my Plex pool is really having errors. I keep getting the following message;
Code:
 
/dev/ada0, 8 currently unreadable
/dev/ada0, 8 offline uncorrectable


I'm sure that's not a good sign.
 

PhiloEpisteme

Once I get that figured out, can I use my config file to recreate the pools, or will I have to recreate the Plex2 pool from scratch?
The config file stores information like which pools the system knows about and what your sharing settings are. The data itself needs to be recreated from backup.

Ok, if I lose the data from the Plex2 pool it's ok. It will be just from the last 30 or so days. I have a backup of my data on the Plex pool.
Good that you have a backup. You may also consider having a backup external to your server. I keep two backups, one on-site (but external to my main server) and one off-site. This protects you from issues with your main machine that could affect all pools, and from disasters like a fire in your house. Of course, your backup strategy will depend on your budget, data, etc.

I have put my system in my signature, so hopefully it will show now.
Thanks for the information. It looks like you're not using an HBA to connect the drives, so any communication issue with the drives would come from the board, the cables, or the drives themselves. You should also feel free to edit your very first post in this thread with that information. That way, as your system (and therefore your signature) changes, the information in this thread will still be accurate.

Depending on what happened to your system and what the actual issue is, you could possibly recover your pool, but I suspect that if the data is replaceable, your time is better spent buying new drives, burning them in, and rebuilding Plex2.

/dev/ada0, 8 currently unreadable
/dev/ada0, 8 offline uncorrectable
Yes, this is certainly an indication that the drive is having some issues.

Also, as a note, it looks like your L2ARC drive has failed in your Plex pool.
Code:
        cache
          5852771565860687755                         UNAVAIL      0     0     0  was /dev/gptid/051fb789-b37e-11e7-8981-bcaec5bc5b7b

 

sretalla

It looks like the drives are recognized by FreeNAS. All 3 (ada1, ada2, ada3) for Plex2 are showing
Don't rely on those adaN assignments, as they can change on reboot or disk replacement. Use serial numbers to make sure you're talking about the right disks (you don't need to post the serial numbers here).
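
One way to get from a gptid in zpool status to a physical disk is to look up the partition in the output of glabel status and then read that disk's serial number with smartctl -i. Below is a minimal Python sketch of the first step. The sample text and its device assignments are made up for illustration (only the column layout mirrors glabel status), so substitute your own output.

```python
# Hypothetical sample in the column layout of `glabel status`; the gptid
# values and adaN assignments are placeholders, not taken from this system.
SAMPLE = """\
                                      Name  Status  Components
gptid/11111111-aaaa-11e9-0000-000000000001     N/A  ada1p2
gptid/22222222-bbbb-11e9-0000-000000000002     N/A  ada2p2
"""


def gptid_to_partition(glabel_output):
    """Map each gptid label to the partition it currently lives on."""
    mapping = {}
    for line in glabel_output.splitlines():
        fields = line.split()
        # Data rows have three fields and a label starting with "gptid/".
        if len(fields) == 3 and fields[0].startswith("gptid/"):
            mapping[fields[0]] = fields[2]
    return mapping


for label, partition in gptid_to_partition(SAMPLE).items():
    print(label, "->", partition)
```

The adaN side of this mapping is only valid for the current boot; the gptid and the serial number are the stable identifiers.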
 

jspfunk

To try to rebuild the Plex2 pool, do I just need to go to "Disks" in the GUI, select the disk, and use the replace option in the settings?

Well, it does not give me the option to replace. I only have the options to scrub or disconnect now.

@PhiloEpisteme, how do I fix the cache for the Plex pool? From what I read, it's the RAM: https://www.45drives.com/wiki/index.php?title=FreeNAS_-_What_is_ZIL_&_L2ARC. I also read I could use an SSD. Should I get a cheap 120 GB drive for cache?
 

PhiloEpisteme

@PhiloEpisteme, how do I fix the cache for the Plex pool? From what I read, it's the RAM: https://www.45drives.com/wiki/index.php?title=FreeNAS_-_What_is_ZIL_&_L2ARC. I also read I could use an SSD. Should I get a cheap 120 GB drive for cache?
I'm not 100% sure I am understanding your question correctly. Based on my understanding, I'll add that the primary ARC is your RAM. ZFS uses it to improve the read performance of your pool by caching frequently accessed data there. Some folks find that their system doesn't support enough RAM for their use case, so they add a "cache" drive, which adds an L2ARC to their pool and grows the size of the cache. It looks like your system is using an L2ARC for that pool and it is listed as unavailable. If you still want the L2ARC, you can replace the device; I believe the replacement process is outlined in the User Guide. If your performance for that pool is fine for now, you can skip replacing it and just remove the device. You can search these forums or the User Guide for instructions on how to do that.

To try to rebuild the Plex2 pool, do I just need to go to "Disks" in the GUI, select the disk, and use the replace option in the settings?
By rebuild I mean detach and remove that pool and start fresh: create an entirely new pool out of new disks. I suspect that you will not be able to replace/resilver any of the drives in Plex2 because you have 1 removed drive and 1 unavailable drive in a RAIDZ1 vdev. RAIDZ1 tolerates a single drive being unavailable; you've got two. The User Guide is a great resource for how to destroy and rebuild your pool. Keep in mind that when you destroy your pool, you will lose any data on it. It is likely lost anyway, but it is worth mentioning. To rebuild it, follow the instructions for creating a new pool.

I'm also looking at your parts list and trying to understand how your system is built.

Your system has 3 pools, Plex, Plex2, and freenas-boot, which are made up of 3, 3, and 1 drives respectively (counting the cache device in Plex). Therefore, your total drive count should have been 7 at some point in order to match the information you provided above. Your parts list only shows 3 drives: 1 Seagate and 2 WD. From zpool status, it looks like you should have at least 5 drives in your system, plus possibly 3 faulted drives. Could you clarify exactly how many drives you have in your system right now and what they are being used for?
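
For reference, here is a rough Python tally of the leaf devices listed in the zpool status output earlier in the thread. The states are transcribed by hand from that output, and treating ONLINE and DEGRADED as "present" is my own simplification.

```python
from collections import Counter

# Leaf device states transcribed from the zpool status output in this thread.
LEAVES = {
    "Plex": ["ONLINE", "ONLINE", "UNAVAIL"],  # two data disks plus the cache device
    "Plex2": ["UNAVAIL", "ONLINE", "DEGRADED", "REMOVED"],  # old disk, its replacement, two others
    "freenas-boot": ["ONLINE"],
}

# Assumption: a device in one of these states is physically present.
PRESENT_STATES = {"ONLINE", "DEGRADED"}


def tally(leaves):
    """Count leaf devices across all pools as present or missing."""
    counts = Counter()
    for states in leaves.values():
        for state in states:
            counts["present" if state in PRESENT_STATES else "missing"] += 1
    return counts


counts = tally(LEAVES)
print(counts["present"], counts["missing"])  # 5 3
```

Plex2 shows four leaf entries rather than three because the replacing-0 slot lists both the old disk and its replacement.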

Looking at your motherboard I see that you have the following options for adding storage
Code:
AMD SB850 controller : 
5 x SATA 6Gb/s port(s), blue
1 x eSATA 6Gb/s port(s), red
Support Raid 0, 1, 5, 10, JBOD
VIA VT6330 controller : 
1 x UltraDMA 133/100/66 for up to 2 PATA devices , navy blue

It looks like you have 5 on-board SATA ports and 1 eSATA port. I assume you're not using PATA devices. How exactly do you have all of your drives hooked up? Are any of them hooked up via the eSATA port, or any expander cards etc? And if so, what models etc are you using?

Sorry if I've misunderstood something or missed you explaining any of this before. Feel free to point me in the right direction if so.
 

jspfunk

@PhiloEpisteme Thank you for taking the time to help me, and thanks to everyone else who has helped as well.

I'm really sorry. I was not even thinking about the 2 pools since I was working on the Plex2. I will update my system to show both Pools and the USB boot drive.

For the L2ARC cache, I guess I wouldn't worry about it right now. I didn't set up an L2ARC, so I'm not sure how to fix or create one.

I have 2 drives for the Plex Pool:
Plex Pool (backup of Plex2):
Seagate 4TB 4000DM000
Seagate 4TB 4000DM000

Boot USB:
Kingston Digital 16GB Data Traveler 3.0

I'm backing up all my data to an external drive, so I will try to recreate the Plex2 pool afterwards. I'm not sure why it's showing that Plex2 only has 1 drive now. I replaced the Seagate with another WD 8 TB drive. So in the screenshot below, ada1, ada2, and ada3 should be the drives for Plex2, where ada1 and ada3 are the old WD drives and ada2 is the new drive.
1568053174516.png
 