Noticed my scrub is still running. Is this OK and/or normal?

LtHaus

Dabbler
Joined
Apr 21, 2020
Messages
31
I've been trying to do a fair bit of reading ahead of posting. I dug in and found that, for some reason, Disk 2 was not being checked, so I fixed my long and short SMART tests.
While poking around I noticed my scrub has been running for 1.5 days...
One thing to point out: I did upgrade to the latest FreeNAS-11.3-U2.1, but I have NOT upgraded the pool to the new version yet.

That said, I have 2 disks reporting:
Apr 28 12:22:05 freenas smartd[9575]: Device: /dev/ada5, 3 Currently unreadable (pending) sectors
Apr 28 12:22:06 freenas smartd[9575]: Device: /dev/ada2, 7 Currently unreadable (pending) sectors
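
For reference, this is roughly how I'm checking those counts from the shell (just a sketch; the ada device names match the smartd lines above):
Code:
# raw SMART attribute the smartd warnings come from (197 Current_Pending_Sector)
smartctl -A /dev/ada2 | grep -i pending
smartctl -A /dev/ada5 | grep -i pending
# results of the scheduled short/long self-tests
smartctl -l selftest /dev/ada2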

I'm keeping an eye on these. The counts have not grown, and I have disks on hand to swap. But on to the point:

Is 4+ days a reasonable scrub time for 13.6T?

Thanks!

Code:
root@freenas:~ # zpool status
  pool: P1
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
    still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: scrub in progress since Mon Apr 27 00:00:02 2020
    4.24T scanned at 33.9M/s, 4.15T issued at 33.2M/s, 13.6T total
    0 repaired, 30.53% done, 3 days 10:52:09 to go
config:

    NAME                                            STATE     READ WRITE CKSUM
    P1                                              ONLINE       0     0     0
      raidz2-0                                      ONLINE       0     0     0
        gptid/b561d9c7-abed-11e3-a052-f46d0492b6bc  ONLINE       0     0     0
        gptid/b650eaac-abed-11e3-a052-f46d0492b6bc  ONLINE       0     0     0
        gptid/932896fd-a2b2-11e5-a46a-f46d0492b6bc  ONLINE       0     0     0
        gptid/b833ae2c-abed-11e3-a052-f46d0492b6bc  ONLINE       0     0     0
        gptid/b91ef87b-abed-11e3-a052-f46d0492b6bc  ONLINE       0     0     0
        gptid/ba122598-abed-11e3-a052-f46d0492b6bc  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:03:12 with 0 errors on Fri Apr 24 03:48:13 2020
config:

    NAME        STATE     READ WRITE CKSUM
    freenas-boot  ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        da0p2   ONLINE       0     0     0
        da1p2   ONLINE       0     0     0

errors: No known data errors
root@freenas:~ #
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
That seems long. My pool holds 14 TiB and finishes in 1 day and 3 hours. 1.5 to 2 days, sure. 4 days, a little long.
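
Rough numbers from your zpool status, just as a back-of-the-envelope check (zpool's T and M are binary units, and scrub speed shifts as it moves across the platters):
Code:
# time left = (13.6T total - 4.15T issued) / 33.2M/s
echo "(13.6 - 4.15) * 1024 * 1024 / 33.2 / 86400" | bc -l   # ~3.45 days, matches the "3 days 10:52:09 to go"
# the whole 13.6T at that rate
echo "13.6 * 1024 * 1024 / 33.2 / 86400" | bc -l            # ~5 days end to end
# rate needed to scrub 13.6T in about 27 hours (1 day 3 hours)
echo "13.6 * 1024 * 1024 / (27 * 3600)" | bc -l             # ~147 M/s

So at 33 M/s you are looking at roughly 5 days for a full pass, which is well below what a healthy pool of that size should manage.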

It could be because of failing disks, or because of the type of disks.

What's the model number of the disks in the pool, and of their ready-to-go replacements?

They have not grown and I have disks on hand to swap

Is this a raidz1? A raidz2? A bunch of mirrors? If you have two disks that are starting to show errors and you are waiting until you resilver replacements in, you are running a real risk of losing the entire vdev, depending on how much parity you have.
 

LtHaus

Dabbler
Joined
Apr 21, 2020
Messages
31
Drives are 6x4TB
Model:
WDC WD40EFRX-68WT0N0

I have one original replacement that should be identical, and then 2 new WD40EFAX drives. They show a bigger cache, but they are still 4TB WD Reds.
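
A quick way to double-check which model is actually in each bay (just a sketch, using the same ada device names as the smartd log):
Code:
# print the reported model for each disk
for d in /dev/ada?; do
    echo -n "$d: "
    smartctl -i $d | grep 'Device Model'
done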
 

LtHaus

Dabbler
Joined
Apr 21, 2020
Messages
31
Raidz2
 

LtHaus

Dabbler
Joined
Apr 21, 2020
Messages
31
For reference, this was the last scrub:
Code:
zfs.pool.scrub
100.00%
Status: SUCCESS
Start Time: Fri Apr 24, 2020, 3:45:00 (America/Chicago)
Finished Time: Fri Apr 24, 2020, 3:48:14
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
2 new WD40EFAX

Oh boy. Return those if you can; they are DM-SMR. Best case, they will work but slow down writes and make resilvers take 4.5 to 5 days. Worst case, they will bomb out of the resilver entirely.

The EFRX are fine. WD Red Pro would be fine, and WD Red 8TB or greater would be fine, including shucked ones. Seagate IronWolf and Toshiba N300 are fine too, but they are 7200 RPM, so louder and hotter.

With raidz2, consider the possibility that the drives that are starting to fail now will fail hard during the resilver, which means you are at "no parity". If one more drive fails, you are down hard. Because you have two drives starting to fail and only a few read errors, I'd consider leaving the old drive in during the resilver so it can still provide parity data if another drive starts failing. I'd not resilver onto the EFAX, because even in the best case the increased resilver time makes pool failure more likely.
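
If you end up doing the swap from the CLI rather than the FreeNAS GUI, the in-place form of that looks roughly like this (a sketch only; the gptid values are placeholders, and the GUI's replace workflow normally handles the partitioning and gptid setup for you):
Code:
# the old disk stays in the pool and keeps serving parity while the new one resilvers;
# zpool status shows them paired under a temporary "replacing" vdev until it finishes,
# then the old disk is detached automatically
zpool replace P1 gptid/<old-partition-guid> gptid/<new-partition-guid>
zpool status -v P1    # watch resilver progress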
 

LtHaus

Dabbler
Joined
Apr 21, 2020
Messages
31
Just processed my return and found the EFRX on Amazon as well, so I got 2 more of those. I'd like to keep all the drives identical, but I guess nowadays it's not that important.
 

K_switch

Dabbler
Joined
Dec 18, 2019
Messages
44
Because you have two drives starting to fail and few read errors, I'd consider leaving the drive in during resilver so it can provide additional parity data if another drive starts failing
@LtHaus

I ran into a very similar situation, except I was using mirrored sets... I lost 3 drives, each in a different vdev, and then their mirror partners started to fail. I actually had several vdevs reporting as lost and had to reboot the NAS to "jump start" the pool. Moral of the story: I didn't leave the old disk in while resilvering, and the resilver would hit 74% and run into corrupted sectors that it could not migrate. I put the old drive back in days later, after I had backed up all the data to a recovery pool. The resilver finally completed and the pool didn't show as degraded. I then proceeded to replace every drive, and ZFS was able to rebuild the vdev using the failed drive.
 