ZFS disk replace - resilvering taking way too long

Status
Not open for further replies.

duddit2

Cadet
Joined
Jul 24, 2012
Messages
7
Hi all,

So first off the specs:
FreeNAS-8.0.3-RELEASE-p1-x86 (9591)
2 x 1TB SATA 3 drives
4GB RAM
Core 2 Duo CPU

The system:
2 x 1TB drives set up as a mirrored ZFS pool (a mirror vdev, not RAIDZ)
ada2p2
ada3p2

The problem:
It started with errors from a Windows server that uses an iSCSI file extent located on the zpool as a backup disk. Performance seemed to be at a crawl, so I enabled SMART and let it run a test and, what do you know, ada2p2 was having serious problems.

So I added a new 1TB drive and booted back up (I didn't remove anything), then used the GUI's 'replace disk' option and selected the new drive.

It's now resilvering, but the issue here is the time it says it's going to take...

[ssh@freenas] /# zpool status -v
pool: data
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scrub: resilver in progress for 0h39m, 0.01% done, 4592h56m to go
config:

NAME STATE READ WRITE CKSUM
data ONLINE 0 0 0
mirror ONLINE 0 0 0
replacing ONLINE 0 0 1
ada2p2 ONLINE 0 0 2 8K resilvered
ada1p2 ONLINE 0 0 3 127M resilvered
ada3p2 ONLINE 0 0 0

errors: No known data errors


That's 4592 hours!! Good bloody lord, can someone on here tell me if this is normal and what, if anything, can be done to speed things up?
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
Be calm!
The first part takes a long time, but I guess that by the time you read this it has already progressed a lot.

Use "systat -vm" to see how hard the disks are working.

BTW, x86 (32-bit) is not a good environment for ZFS: it's slow and it might abend due to memory shortage.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
Did you replace ada2p2 with ada1p2, or did you attach ada1p2 to the existing mirror? God forbid you put the output inside some [code][/code] tags so I could actually read it properly. Is ada1p2 any good? It appears to be showing 3 checksum errors already.
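
For anyone reading along, the non-zero error counters can be pulled out of that output with a small awk filter. This is only a sketch: it assumes the NAME / STATE / READ / WRITE / CKSUM column layout shown in the first post, the helper name zpool_errors is made up for illustration, and the sample rows are copied from the thread rather than read from a live pool.

```shell
#!/bin/sh
# List pool members whose READ, WRITE or CKSUM counter is non-zero.
# zpool_errors is a hypothetical helper name; it reads `zpool status`
# text on stdin, so on a live system: zpool status data | zpool_errors
zpool_errors() {
    awk '
        # Device rows look like: NAME STATE READ WRITE CKSUM [notes...]
        $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
        ($3 + $4 + $5) > 0 {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}

# Sample rows copied from the status output in the first post.
zpool_errors <<'EOF'
NAME STATE READ WRITE CKSUM
data ONLINE 0 0 0
mirror ONLINE 0 0 0
replacing ONLINE 0 0 1
ada2p2 ONLINE 0 0 2 8K resilvered
ada1p2 ONLINE 0 0 3 127M resilvered
ada3p2 ONLINE 0 0 0
EOF
```

On the sample rows this lists the replacing vdev, ada2p2 and ada1p2, which are the three entries with checksum errors.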

It's taking so long because ada2p2 is killing the pool performance.

"BTW, x86 is not a good environment for ZFS: it's slow and it might abend due to memory shortage."
peterh is right. Is your Core 2 Duo not 64-bit capable?
 

duddit2

Cadet
Joined
Jul 24, 2012
Messages
7
Hi again all,

OK, so first off, this isn't my FreeNAS box. I inherited it from a customer's old IT guy and do plan to upgrade to the latest x64 version, but thought it was best to resolve this issue first.

Onto the update, it looks to be stuck (with code tags this time, sorry about that):

Code:
[ssh@freenas] /#  zpool status -v
  pool: data
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 16h28m, 0.01% done, 116218h38m to go
config:

        NAME           STATE     READ WRITE CKSUM
        data           ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            replacing  ONLINE       0     0     1
              ada2p2   ONLINE       0     0     2  8K resilvered
              ada1p2   ONLINE       0     0     3  127M resilvered
            ada3p2     ONLINE       0     0     0

errors: No known data errors
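
One way to put a number on "stuck" is to work backwards from the two status reports. This is a rough sketch that assumes the "to go" figure is a straight-line extrapolation from elapsed time and fraction done (an assumption about how this version of zpool status estimates it, not something the output states), so that fraction done = elapsed / (elapsed + remaining). The helper names to_minutes and implied_percent are made up for illustration.

```shell
#!/bin/sh
# Convert a zpool-style duration such as "16h28m" to minutes.
to_minutes() {
    awk -v t="$1" 'BEGIN { split(t, a, /[hm]/); print a[1] * 60 + a[2] }'
}

# Percent done implied by elapsed and remaining time, assuming the
# estimate is linear: done = elapsed / (elapsed + remaining).
implied_percent() {
    awk -v e="$(to_minutes "$1")" -v r="$(to_minutes "$2")" \
        'BEGIN { printf "%.5f\n", 100 * e / (e + r) }'
}

implied_percent 0h39m 4592h56m       # first report
implied_percent 16h28m 116218h38m    # second report, about 16 hours later
```

Both reports work out to roughly 0.014%, so under that assumption the resilver made essentially no progress between them.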


And to clear up, ada1p2 is a brand new drive bought yesterday; ada2p2 is the disk that had the issues.

An idea - since I'm looking at upgrading to latest x64 version anyway - is it possible to save config from this setup and then import to a fresh install thereby giving me an x64 system to work with?
 

duddit2

Cadet
Joined
Jul 24, 2012
Messages
7
Code:
 systat -vm
    1 users    Load  0.00  0.00  0.00                  Sep 21 07:50

Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act  121492    9628   350692    12608  715328  count
All  216220   13532  4652100    32416          pages
Proc:                                                            Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        cow    3997 total
             88       333   11  138    1  130   15      3 zfod        atkbd0 1
                                                          ozfod       atapci0+ 1
 0.1%Sys   0.1%Intr  0.0%User  0.0%Nice 99.8%Idle        %ozfod  1998 cpu0: time
|    |    |    |    |    |    |    |    |    |    |       daefr     1 re0 irq256
                                                          prcfr  1998 cpu1: time
                                       260 dtbuf          totfr
Namei     Name-cache   Dir-cache     67549 desvn          react
   Calls    hits   %    hits   %     23054 numvn          pdwak
     227     227 100                 16886 frevn          pdpgs
                                                          intrn
Disks  ada0  ada1  ada2  ada3   md0   md1   md2    119564 wire
KB/t   0.00  0.00  0.00  0.00  0.00  0.00  0.00     98556 act
tps       0     0     0     0     0     0     0     48164 inact
MB/s   0.00  0.00  0.00  0.00  0.00  0.00  0.00        64 cache
%busy     0     0     0     0     0     0     0    715264 free
    1 users    Load  0.02  0.01  0.00                  Sep 21 07:54

Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act  121824    9628   351716    12608  714984  count
All  216560   13532  4653124    32416          pages
Proc:                                                            Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        cow    3997 total
             88       321    3   83    1  128             zfod        atkbd0 1
                                                          ozfod       atapci0+ 1
 0.1%Sys   0.1%Intr  0.0%User  0.0%Nice 99.8%Idle        %ozfod  1998 cpu0: time
|    |    |    |    |    |    |    |    |    |    |       daefr     1 re0 irq256
                                                          prcfr  1998 cpu1: time
                                       260 dtbuf          totfr
Namei     Name-cache   Dir-cache     67549 desvn          react
   Calls    hits   %    hits   %     23054 numvn          pdwak
                                     16886 frevn          pdpgs
                                                          intrn
Disks  ada0  ada1  ada2  ada3   md0   md1   md2    119576 wire
KB/t   0.00  0.00  0.00  0.00  0.00  0.00  0.00     98892 act
tps       0     0     0     0     0     0     0     48160 inact
MB/s   0.00  0.00  0.00  0.00  0.00  0.00  0.00        64 cache
%busy     0     0     0     0     0     0     0    714920 free
                                                   109952
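
For reading captures like the two above, a small filter can report whether any disk in the systat "Disks" table shows transfers. This is a sketch only: disks_idle is a made-up helper name, it assumes the table layout seen in this capture (a "Disks" header row followed by a "tps" row), and the sample rows are copied from the first snapshot.

```shell
#!/bin/sh
# Report which disks in a `systat -vm` capture show non-zero tps.
# disks_idle is a hypothetical helper name; it reads systat text on stdin.
disks_idle() {
    awk '
        $1 == "Disks" {
            # Collect device names; stop at the first non-device field,
            # since the rest of the row belongs to the memory column.
            n = 0; tables++
            for (i = 2; i <= NF && $i ~ /^[a-z]+[0-9]+$/; i++) name[++n] = $i
        }
        $1 == "tps" && n > 0 {
            for (i = 1; i <= n; i++) if ($(i + 1) + 0 > 0) busy = busy " " name[i]
        }
        END {
            if (!tables) print "no Disks table found"
            else if (busy == "") print "all disks idle"
            else print "active:" busy
        }
    '
}

# Rows copied from the first snapshot above.
disks_idle <<'EOF'
Disks  ada0  ada1  ada2  ada3   md0   md1   md2    119564 wire
KB/t   0.00  0.00  0.00  0.00  0.00  0.00  0.00     98556 act
tps       0     0     0     0     0     0     0     48164 inact
MB/s   0.00  0.00  0.00  0.00  0.00  0.00  0.00        64 cache
%busy     0     0     0     0     0     0     0    715264 free
EOF
```

Both snapshots show zero tps on every disk, which fits a resilver that has stalled rather than one that is merely slow.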
 

duddit2

Cadet
Joined
Jul 24, 2012
Messages
7
Also, to clear up: I added the drive and booted up, then within the GUI, on the check disks page (from the zpool options), I used the replace disk option on ada2 and chose ada1. I don't understand why ada2 is still having an effect, since I thought replacing the disk would simply boot ada2 out of the pool and ada1 would just mirror the contents of ada3 and become a functioning mirror again.
 

duddit2

Cadet
Joined
Jul 24, 2012
Messages
7
Can anyone give me any more pointers on this? Really don't want to start from scratch and lose the backups stored on here.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'd wait for one of the more experienced people to agree with what I'm about to say before you do this, since your data is at risk... I'd do this if it were my data, though.

Since you are trying to replace ada2 with ada1, and ada2 is basically trashed, I'd shut down and physically remove ada2 from the system. Then boot up and start the resilver over (if it doesn't start on its own). When you boot up, ada3 will now be ada2 since the pool has only 2 drives left. Don't panic, that's normal.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
"Onto the update, it looks to be stuck (with code tags this time, sorry about that):"
Much better and thanks for clearing up the steps you took.

"I don't understand why ada2 is still having an effect since I thought replacing the disk would simply boot ada2 out of the pool and ada1 would just mirror the contents of ada3 and become a functioning mirror again."
Because ada2 is still an active member of the zpool. It won't be booted until the resilver finishes. With mirrors I would have created a 3-way mirror and, once the new disk was resilvered, detached the old drive, turning it back into a 2-way mirror.
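
The attach-then-detach route might look roughly like this from the shell, had the replace not already been started. This is a sketch under assumptions: the pool and device names are the ones from this thread, and plan_mirror_swap is a made-up helper that only prints the commands instead of running them, since detaching before the resilver has finished would leave the pool without redundancy.

```shell
#!/bin/sh
# Print (not run) the commands for swapping a mirror member by attaching
# the new disk first. plan_mirror_swap is a hypothetical helper name.
# Arguments: pool, healthy member, new disk, failing disk.
plan_mirror_swap() {
    pool=$1 good=$2 new=$3 old=$4
    cat <<EOF
# 1. Attach the new disk next to the healthy one (3-way mirror)
zpool attach $pool $good $new
# 2. Check progress; repeat until the resilver has completed
zpool status -v $pool
# 3. Only then remove the failing disk (back to a 2-way mirror)
zpool detach $pool $old
EOF
}

plan_mirror_swap data ada3p2 ada1p2 ada2p2
```

The printed commands are meant to be reviewed and run by hand, one step at a time.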

"An idea - since I'm looking at upgrading to latest x64 version anyway - is it possible to save config from this setup and then import to a fresh install thereby giving me an x64 system to work with?"
Yes, that will work.

I would actually do the replacement under FreeNAS-8.3.0-BETA2. The newer ZFS v28 code handles resilvering and replacements better than the ZFS v15 code in 8.2. What you will want to do is run "zpool offline data ada2p2". This will stop the system from reading from the failing drive. Again, I would do this under FreeNAS-8.3.0-BETA2. You can physically remove the drive or not. The offline command is "politer" and saves you a step if you later need ada2p2, e.g. because of a bad sector on ada3p2.
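
As a rough illustration of the offline step, the sequence is shown in comments, with a small filter to confirm the device state afterwards. The helper name device_state is made up, the pool and device names are the ones from this thread, and the sample line fed to it is invented for the demo, not real output from this pool.

```shell
#!/bin/sh
# Print the STATE column for one device from `zpool status` text on stdin.
# device_state is a hypothetical helper name; exits non-zero if the
# device is not listed.
device_state() {
    awk -v dev="$1" '$1 == dev { print $2; found = 1 } END { exit !found }'
}

# On the live system the sequence would be roughly:
#   zpool offline data ada2p2
#   zpool status data | device_state ada2p2
#   zpool online data ada2p2      # only if the old disk is needed again

# Demo on a made-up sample line (not real output from this pool):
device_state ada2p2 <<'EOF'
ada2p2 OFFLINE 0 0 2
EOF
```

If the filter prints anything other than OFFLINE for the failing disk, the pool is still reading from it.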
 

duddit2

Cadet
Joined
Jul 24, 2012
Messages
7
Hi, thanks everyone for your help. I took a config backup, then pulled the faulty drive and the hard disk used for FreeNAS, and booted from a USB stick with FreeNAS 8.3 Beta 2 imaged onto it. I found the ZFS volume and imported it, and it was fixed fairly quickly. I didn't upgrade the ZFS pool (as prompted by the GUI alert) as I had an error in the GUI reporting, so I just used 8.3 Beta 2 to fix the pool.

I then installed 8.2 (64-bit as well now) onto the USB drive, booted, uploaded the config, and all is good :)

Thanks again...
 

duddit2

Cadet
Joined
Jul 24, 2012
Messages
7
Oh, and in the 8.3 Beta 2 install I couldn't elevate the session over SSH for some reason (which is why I just went down the route of pulling the drive as opposed to running the commands you mentioned).

Also, the drive that was used for the old FreeNAS install is now added to the pool as a cache drive.
 