Understanding Spare errors

kolya1972

Cadet
Joined
Oct 23, 2023
Messages
1
I have a pool that is degraded. I inherited this pool and did not configure it. What is unclear is why I have a failed spare that has caused my pool to be degraded. As I understand it, the spares attached to the pool should replace the failed drive. 1) Why has this not happened? 2) Why does the UI not allow me to replace a failed spare with another spare? Please see the following pool status:

config:

NAME STATE READ WRITE CKSUM
NewStorage DEGRADED 0 0 0
raidz3-0 DEGRADED 0 0 0
gptid/9a3e5f8d-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9b1cffb8-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
spare-2 DEGRADED 0 0 0
gptid/9b77df39-91f1-11ed-8a2c-0cc47aded37e FAULTED 71 0 0 too many errors
gptid/9e947455-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9bbe8320-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9c0ef01d-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9b9f7eeb-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9bd58fb0-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9c710efc-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9c61c43a-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9d11c582-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9d769ba9-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9dca128d-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
cache
gptid/9d229645-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9dd23b35-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
gptid/9d9a3b57-91f1-11ed-8a2c-0cc47aded37e ONLINE 0 0 0
spares
gptid/9e947455-91f1-11ed-8a2c-0cc47aded37e INUSE currently in use
gptid/9f286bae-91f1-11ed-8a2c-0cc47aded37e AVAIL
gptid/9fafd1bf-91f1-11ed-8a2c-0cc47aded37e AVAIL
gptid/a07b7bf9-91f1-11ed-8a2c-0cc47aded37e AVAIL

errors: No known data errors

pool: boot-pool
state: ONLINE
scan: scrub repaired 0B in 00:00:35 with 0 errors on Wed Oct 18 03:45:35 2023
config:

NAME STATE READ WRITE CKSUM
boot-pool ONLINE 0 0 0
da57p2 ONLINE 0 0 0

errors: No known data errors
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Your formatting is off; please use CODE tags instead of QUOTE tags, because CODE tags preserve the indentation. That matters here: I am 99% sure the output is simply showing you NORMAL ZFS spare usage.

For example, if your output is this:
Code:
NAME                                              STATE     READ WRITE CKSUM
NewStorage                                        DEGRADED     0     0     0
  raidz3-0                                        DEGRADED     0     0     0
    gptid/9a3e5f8d-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9b1cffb8-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    spare-2                                       DEGRADED     0     0     0
      gptid/9b77df39-91f1-11ed-8a2c-0cc47aded37e  FAULTED     71     0     0  too many errors
      gptid/9e947455-91f1-11ed-8a2c-0cc47aded37e  ONLINE       0     0     0
    gptid/9bbe8320-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9c0ef01d-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9b9f7eeb-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9bd58fb0-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9c710efc-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9c61c43a-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9d11c582-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9d769ba9-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9dca128d-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
  cache
    gptid/9d229645-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9dd23b35-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
    gptid/9d9a3b57-91f1-11ed-8a2c-0cc47aded37e    ONLINE       0     0     0
  spares
    gptid/9e947455-91f1-11ed-8a2c-0cc47aded37e    INUSE     currently in use
    gptid/9f286bae-91f1-11ed-8a2c-0cc47aded37e    AVAIL
    gptid/9fafd1bf-91f1-11ed-8a2c-0cc47aded37e    AVAIL
    gptid/a07b7bf9-91f1-11ed-8a2c-0cc47aded37e    AVAIL

errors: No known data errors

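That shows that a SPARE disk has already taken over the function of one of the RAID-Z3 disks, so the spare did do its job. ZFS displays the pair much like a Mirror, labeled "spare-2", containing the FAULTED disk and the spare standing in for it. The pool keeps reporting DEGRADED for as long as the FAULTED disk remains part of that pairing.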

All of that is normal. You can either replace the FAULTED disk, or make the SPARE disk a permanent part of the RAID-Z3.
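For reference, here is a rough sketch of how those two options look at the ZFS command line. It only prints the commands (preview mode) so nothing is changed by accident. The pool and FAULTED disk names are taken from the status output above; `NEW_DISK` is a hypothetical placeholder for whatever identifier your replacement disk gets. On TrueNAS the web UI is usually the preferred way to do disk replacement, so treat this as an illustration of what happens underneath, not as the recommended procedure.

```shell
#!/bin/sh
# Preview mode: with RUN=echo the commands are only printed.
# Set RUN= (empty) to really execute them, as root, at your own risk.
RUN=echo

POOL="NewStorage"
FAULTED="gptid/9b77df39-91f1-11ed-8a2c-0cc47aded37e"  # FAULTED disk inside spare-2
NEW_DISK="gptid/REPLACEMENT-DISK"                     # hypothetical placeholder

# Option A: make the in-use spare a permanent RAID-Z3 member.
# Detaching the FAULTED disk promotes the spare and removes it
# from the "spares" list.
promote_spare() {
    $RUN zpool detach "$POOL" "$FAULTED"
}

# Option B: resilver onto a new disk. Once the resilver finishes,
# the spare should go back to AVAIL.
replace_faulted() {
    $RUN zpool replace "$POOL" "$FAULTED" "$NEW_DISK"
}
```

Call `promote_spare` or `replace_faulted` (one, not both) to see the command for the option you pick. With Option A you end up with 3 AVAIL spares instead of 4, so you may want to add a new spare afterwards.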


By the way, depending on the amount of RAM and the sizes of the L2ARC Cache devices, you may want to re-think having 3 L2ARC Cache devices. The general rule of thumb is a total L2ARC Cache size of about 5 times RAM size, going up to 10 times if you push it. Adding far more than that simply does not work, because RAM is needed for the pointers to the data in the L2ARC Cache devices. Thus, you may be robbing Peter (aka RAM) to pay Paul (aka L2ARC).
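To put numbers on that rule of thumb, here is a tiny sketch. The function name `l2arc_guideline` and the 64 GiB figure are made up for illustration; plug in your own RAM size.

```shell
#!/bin/sh
# Rule of thumb from above: total L2ARC of about 5x RAM,
# up to 10x RAM if you push it.
# Argument: RAM size in GiB. Prints "<5x> <10x>" in GiB.
l2arc_guideline() {
    ram_gib=$1
    echo "$((ram_gib * 5)) $((ram_gib * 10))"
}

# Hypothetical example: a system with 64 GiB of RAM.
l2arc_guideline 64    # prints "320 640"
```

Compare the result against the combined size of the three cache devices; `zpool list -v NewStorage` should show the size of each one.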