Resilver 2nd disk much slower than 1st

Status
Not open for further replies.

Joerg G

Dabbler
Joined
Jan 7, 2015
Messages
16
Hi there,

In order to increase the size of my pool I started replacing and resilvering drives.
My pool is a 4 disk raidz2 with WD30EFRX disks that I am now replacing with WD50EFRX disks.

The replace and resilver of the first disk took about 10 hours. The resilver of the second disk has now been running for 12 hours and has only just passed 50%, so it is considerably slower. It ran at a comparable speed for the first hour, then slowed down.

Should I be worried? Could this be just normal?

I stopped all services (AFP, NFS, CIFS) before starting and also ran a scrub beforehand. SMART is not showing any errors, and neither are the system messages.

The pool is at 86% capacity, which isn't recommended, I know; that is why I am increasing its size.
FreeNAS is the latest version (FreeNAS-9.3-STABLE-201501090144) with updates applied.
The disks are attached to an IBM ServeRAID M1015 flashed to IT mode.
The disks were hot swapped.
The system has been running in this configuration for about 6 months without any issues so far.

Here is the output of zpool status:

Code:
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jan 11 22:23:49 2015
        4.99T scanned out of 9.42T at 123M/s, 10h28m to go
        1.21T resilvered, 53.04% done

config:

NAME                                              STATE     READ WRITE CKSUM
zroot                                             DEGRADED     0     0     0
  raidz2-0                                        DEGRADED     0     0     0
    gptid/a8707053-9a75-11e3-9287-0050569e52b8    ONLINE       0     0     0
    gptid/a8eebc4d-9a75-11e3-9287-0050569e52b8    ONLINE       0     0     0
    replacing-2                                   OFFLINE      0     0     0
      10221799220697409017                        OFFLINE      0     0     0  was /dev/gptid/a96eb79b-9a75-11e3-9287-0050569e52b8
      gptid/21825bf6-99d8-11e4-b972-0050569e3db2  ONLINE       0     0     0  (resilvering)
    gptid/1b882e48-9984-11e4-b972-0050569e3db2    ONLINE       0     0     0

errors: No known data errors
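
The figures in the scan line can be cross-checked with a little arithmetic. The following is only a rough sketch: it assumes the K/M/G/T suffixes printed by zpool are powers of 1024 and takes the 12 hours of elapsed time from the post above. The helper names `to_bytes` and `resilver_progress` are made up for illustration.

```python
import re

# Assumption: zpool prints sizes with binary suffixes (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def to_bytes(text):
    """Convert a zpool size such as '4.99T' or '123M' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]

def resilver_progress(scan_line, elapsed_hours):
    """Return (percent scanned, average MiB/s so far, hours left at the reported rate)."""
    m = re.search(
        r"([\d.]+[KMGT]) scanned out of ([\d.]+[KMGT]) at ([\d.]+[KMGT])/s",
        scan_line,
    )
    if m is None:
        raise ValueError("not a resilver scan line")
    scanned, total, rate = (to_bytes(g) for g in m.groups())
    percent = 100 * scanned / total
    average_mib_s = scanned / (elapsed_hours * 3600) / UNITS["M"]
    hours_left = (total - scanned) / rate / 3600
    return percent, average_mib_s, hours_left

line = "4.99T scanned out of 9.42T at 123M/s, 10h28m to go"
percent, average, hours_left = resilver_progress(line, elapsed_hours=12)
print(f"{percent:.1f}% scanned, average {average:.0f} MiB/s, about {hours_left:.1f} h left")
```

If the remaining time holds, the second resilver would take roughly 22 hours in total, about twice as long as the first.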

 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, I'd look at the SMART info on all of the disks. It could be that one of the disks is starting to have problems now and you're seeing that in resilver speed.

It could also be that your pool is busy doing things it wasn't doing before. There may be a bug where the system dataset does small log writes, which would interrupt the resilver at nearly constant intervals.

But what you've provided doesn't show anything is wrong. Resilvering is going to be faster (or slower) based on a whole list of possible reasons, many of which you won't have a whole lot of control of or influence over once resilvering has started.

To sum everything up, if SMART looks okay for all of your disks and you can't find actual smoking gun error messages somewhere, I wouldn't worry about it at all.
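
As a rough illustration of the SMART check suggested here, the sketch below scans the attribute table printed by `smartctl -A` for a few counters that are commonly watched on spinning disks. The choice of attributes is an assumption, the function name `suspicious_attributes` is hypothetical, and the sample text is made up to show the layout; it is not output from the disks in this thread.

```python
# Assumption: these three counters are the ones worth flagging when non-zero.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def suspicious_attributes(smartctl_output):
    """Return {attribute: raw value} for watched attributes whose raw value is non-zero."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) > 0:
                found[fields[1]] = int(raw)
    return found

# Made-up sample in the layout of `smartctl -A`, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""
print(suspicious_attributes(SAMPLE))
```

An empty result for every disk in the pool would fit the "SMART looks okay" case described above.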
 

Joerg G

Dabbler
Joined
Jan 7, 2015
Messages
16
Hi cyberjock,

Thanks for the quick response.
I'm still wondering about this difference, and I'm curious to see how the remaining resilvers behave. Two more to go...

Could you propose other things/places to check for "smoking gun error messages" that I might have missed? Just to make sure...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Umm... there are too many places that could have errors. This is where your troubleshooting skills are going to be needed. But /var/log/messages would be a good place to start.
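
One way to make a first pass over that log is to filter it for a handful of keywords. This is a minimal sketch only: the keyword list is an assumption about what disk or controller trouble tends to look like, and `suspicious_lines` and `scan_file` are hypothetical helper names.

```python
import re

# Assumption: these keywords are a reasonable first filter, not a complete list.
PATTERN = re.compile(r"error|timeout|retry|fail", re.IGNORECASE)

def suspicious_lines(lines, pattern=PATTERN):
    """Yield (line number, text) for every line that matches the pattern."""
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            yield number, line.rstrip("\n")

def scan_file(path="/var/log/messages"):
    """Print matching lines from a log file, prefixed with their line numbers."""
    with open(path, errors="replace") as log:
        for number, text in suspicious_lines(log):
            print(f"{number}: {text}")
```

Calling `scan_file()` on the NAS itself would list candidate lines; anything it turns up still has to be judged by hand.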
 

Joerg G

Dabbler
Joined
Jan 7, 2015
Messages
16
I checked a few things but could not find any errors or issues.
Meanwhile, the resilver has completed without problems.
After a reboot I am now resilvering the third disk, which is going fine as well.

Let's see if the auto expand will work after replacing all four...
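
For a rough idea of what the expansion should yield once all four disks are replaced, the raw pool size can be estimated from the disk sizes. This sketch assumes the WD30EFRX and WD50EFRX are 3 TB and 5 TB (decimal) drives, that the 9.42T in the zpool status output is allocated raw space including parity, and it ignores swap partitions and metadata overhead. The helper name `raw_pool_tib` is made up.

```python
TIB = 1024**4

def raw_pool_tib(disks, disk_tb):
    """Raw raidz vdev size in TiB (parity included), ignoring partitioning overhead."""
    return disks * disk_tb * 1e12 / TIB

allocated_tib = 9.42  # total from the "scanned out of" line in zpool status
before = raw_pool_tib(4, 3)
after = raw_pool_tib(4, 5)
print(f"before: {before:.2f} TiB raw, {100 * allocated_tib / before:.0f}% used")
print(f"after:  {after:.2f} TiB raw, {100 * allocated_tib / after:.0f}% used")
```

Under these assumptions the "before" figure lands on the 86% mentioned at the start of the thread, and the expanded pool would be a little over half full.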
 