Replacing disk not working

Status
Not open for further replies.
Joined
Mar 12, 2012
Messages
6
Hi,
We had a disk go bad in our backup server, so I shut the server down and replaced it with a new one. After boot, I clicked "replace" in the FreeNAS GUI and it started resilvering. It took forever, but now it's done. However, the newly inserted disk won't take the old disk's place, and the pool reports data errors on some files. Here's the output from zpool status:

zpool status
  pool: Backupstorage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
  scan: resilvered 2.46T in 48h12m with 428 errors on Wed Sep 25 18:59:31 2013
config:

        NAME STATE READ WRITE CKSUM
        Backupstorage DEGRADED 0 0 0
          raidz1-0 DEGRADED 0 0 0
            gptid/420dd7b7-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/426b38b5-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            replacing-2 DEGRADED 0 0 0
              10563331935651512754 UNAVAIL 0 0 0 was /dev/gptid/42ddf682-3008-11e2-b18c-002590576f25
              gptid/b530b4b4-246f-11e3-b8d5-00304856e6c4 ONLINE 0 0 0
            gptid/4389f83f-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/445673c3-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/44de6ef5-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/45757cf0-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/4607641d-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
          raidz1-1 ONLINE 0 0 0
            gptid/991f1e15-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/9a133b28-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/9ac8d146-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/9bc7bdce-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/9ca82ec1-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/9d561675-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/9e15dd2f-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/9eb88180-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
          raidz1-2 ONLINE 0 0 0
            gptid/cc4ed7b8-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/cce83aa1-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/cdbea5d8-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/ce944c0f-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/cfb69630-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/d0acb9f5-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/d18a2631-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
            gptid/d25d280b-3008-11e2-b18c-002590576f25 ONLINE 0 0 0

errors: 428 data errors, use '-v' for a list

I need some advice:
1. How can I get it to recognise the new disk?
2. What can I do about the data errors? Let's say I'm fine with the data loss (it's a backup server, no biggie), but can I "reset" the warning in some way?

Thanks!
Micke
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
1. It recognizes the new disk. What you need to do is go to the manual and finish the disk replacement procedure. You didn't do the last step. That will remove the "old" disk from the pool for good.
2. You can reset the warning by deleting the offending files. "zpool clear" will clear any errors (but you have 0 for all disks right now). The files will keep listing themselves as corrupt for as long as they exist. And hopefully you don't have metadata corruption, as those errors list hex locations rather than file names, and you definitely aren't going to find that hex location without destroying the whole pool.
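
For readers following along from the command line: assuming the "last step" above means detaching the stale device that still sits under the replacing-2 entry, the ZFS command for that is `zpool detach <pool> <device>`. The sketch below only finds the stale member in pasted `zpool status` text and builds the command without running it. The function names are made up for illustration, and it assumes a replacing vdev lists exactly two children (old and new), which is what the output in the first post shows.

```python
import re

# Excerpt of the zpool status output from the first post.
SAMPLE = """\
replacing-2 DEGRADED 0 0 0
10563331935651512754 UNAVAIL 0 0 0 was /dev/gptid/42ddf682-3008-11e2-b18c-002590576f25
gptid/b530b4b4-246f-11e3-b8d5-00304856e6c4 ONLINE 0 0 0
gptid/4389f83f-3008-11e2-b18c-002590576f25 ONLINE 0 0 0
"""


def stale_members(status_text, children_per_replacing=2):
    """Return names of non-ONLINE devices listed directly under a replacing-N vdev.

    Assumption: each replacing vdev is followed by exactly
    `children_per_replacing` device lines (old disk, new disk).
    """
    stale = []
    remaining = 0
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name, state = fields[0], fields[1]
        if re.fullmatch(r"replacing-\d+", name):
            remaining = children_per_replacing
        elif remaining > 0:
            remaining -= 1
            if state != "ONLINE":
                stale.append(name)
    return stale


def detach_command(pool, device):
    """Build (but do not execute) the zpool detach command line."""
    return ["zpool", "detach", pool, device]


for device in stale_members(SAMPLE):
    print(" ".join(detach_command("Backupstorage", device)))
```

Building the command as a list rather than running it leaves the final decision with the admin; on FreeNAS the same step can be done from the GUI, as the manual describes.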

Doing an 8-disk RAIDZ1 is wicked crazy. See that link in my sig about RAIDZ1/RAID5 being dead? Read up on UREs. That sounds like exactly why you have 428 data errors. You had read errors on the other disks and couldn't fix the issues since you had no redundancy during the resilver. As a backup server that might be acceptable... but it's your risk (or reward) to take.
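
To put a rough number on the URE argument, here is a back-of-the-envelope sketch. Everything in it is an assumption rather than a measurement from this pool: a URE rate of 1 per 1e14 bits read (a figure commonly quoted on consumer drive spec sheets), independent errors, and each of the 7 surviving disks in the vdev reading about as much as was resilvered onto the new disk (2.46T, treated here as TiB).

```python
import math

TIB = 2**40  # bytes per tebibyte


def p_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
    """Probability of hitting at least one unrecoverable read error.

    Assumes every bit fails independently with probability `ure_per_bit`.
    Uses log1p/expm1 so the tiny per-bit rate does not vanish in rounding.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


surviving_disks = 7            # 8-disk RAIDZ1 with one disk being replaced
read_per_disk = 2.46 * TIB     # assumed equal to the resilvered amount
p = p_at_least_one_ure(surviving_disks * read_per_disk)
print(f"chance of at least one URE during the resilver: {p:.0%}")
```

Under these assumptions the odds of getting through the resilver without a single URE are poor, and with RAIDZ1 there is no parity left to repair the affected block while a disk is missing.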
 
Joined
Mar 12, 2012
Messages
6
Hi Cyberjock,
Thanks for your reply, that worked just fine! And thanks for the reading on RAIDZ1, really interesting, never thought of it like that!
So, can I ask you for your advice? I have a production server that is set up like the following:

NAME STATE READ WRITE CKSUM
Fileserver ONLINE 0 0 0
raidz2 ONLINE 0 0 0
da16p2 ONLINE 0 0 0
da17p2 ONLINE 0 0 0
da20p2 ONLINE 0 0 0
da21p2 ONLINE 0 0 0
da22p2 ONLINE 0 0 0
da23p2 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
da1p2 ONLINE 0 0 0
da2p2 ONLINE 0 0 0
da3p2 ONLINE 0 0 0
da4p2 ONLINE 0 0 0
da5p2 ONLINE 0 0 0
da7p2 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
da8p2 ONLINE 0 0 0
da9p2 ONLINE 0 0 0
da10p2 ONLINE 0 0 0
da11p2 ONLINE 0 0 0
da14p2 ONLINE 0 0 0
da15p2 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
da24p2 ONLINE 0 0 0
da25p2 ONLINE 0 0 0
da26p2 ONLINE 0 0 0
da27p2 ONLINE 0 0 0
da28p2 ONLINE 0 0 0
da29p2 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
da0p2 ONLINE 0 0 0
da6p2 ONLINE 0 0 0
da12p2 ONLINE 0 0 0
da13p2 ONLINE 0 0 0
da18p2 ONLINE 0 0 0
da19p2 ONLINE 0 0 0
errors: No known data errors

This is replicated onto a backup server configured like the one you saw in the last post. It's all 3TB drives, so I get roughly the same amount of storage.
Is there a better/more clever way of setting this up? How would you do it?
I'm a bit of a ZFS noob, but have had great success with it for the last year or so (partly luck, I guess... ;-) )

Thanks!
Micke
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, there's no formatting for your output, but if all of those disks are RAIDZ2 vdevs of 6 disks that looks fine to me.
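
The "roughly the same amount of storage" remark can be checked with simple arithmetic: each RAIDZ vdev gives up its parity disks, and the rest hold data. The sketch below uses the layouts posted in this thread (5 x 6-disk RAIDZ2 for production, 3 x 8-disk RAIDZ1 for backup) with 3TB drives. The helper name is made up, and it ignores ZFS metadata overhead, padding, and the TB/TiB difference, so the results are upper bounds.

```python
def usable_tb(vdevs, disks_per_vdev, parity_disks, disk_tb=3):
    """Raw data capacity in TB: the non-parity disks of every vdev, summed."""
    return vdevs * (disks_per_vdev - parity_disks) * disk_tb


production = usable_tb(vdevs=5, disks_per_vdev=6, parity_disks=2)  # RAIDZ2
backup = usable_tb(vdevs=3, disks_per_vdev=8, parity_disks=1)      # RAIDZ1

print(f"production: {production} TB, tolerates 2 failed disks per vdev")
print(f"backup:     {backup} TB, tolerates 1 failed disk per vdev")
```

The two pools come out close in capacity, but the production layout spends 10 of its 30 disks on parity while the backup spends only 3 of 24, which is the trade-off discussed above.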
 