Hey guys,
I'm using FreeNAS 8.3 with a raidz1 configuration (6x 2 TB) under ESXi 5. For the last several months a disk has been dropping out of the pool inexplicably, but it could be brought back with a full power cycle of the ESXi host (off and then on, not a reboot). Since this is a home server, that wasn't a big deal. This week I decided to replace that drive (SN ending in 8GG), and halfway through the resilver (56%) another disk failed (SN ending in 0BM)! I can't get 0BM to do anything and I'm worried I've lost the pool.
I was thinking that maybe I could undo the replacement of 8GG and instead replace 0BM; is this possible? My reasoning is that 8GG hadn't fully failed yet; it was only dropping out of the array occasionally, and I have yet to detach it. However, the pool is not recognized if I plug it back in (with or without its replacement disk). Running the following while 8GG is plugged in gives an I/O error:
Code:
zpool import files
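To be clear about what I mean by "undo": my understanding is that an in-progress replace can be cancelled by detaching the new disk from the replacing vdev, roughly like this (untested on my end, since the pool won't even import; the gptid is the replacement disk's, taken from the status output at the bottom):

```shell
# Cancel the in-progress replace by detaching the NEW disk from the
# replacing-0 vdev. gptid/20a24ed9... is the replacement disk as shown
# in zpool status. Untested - the pool currently won't import at all.
zpool detach files gptid/20a24ed9-8a8e-11e2-8608-000c29eb7541
```

If that worked, I imagine the old 8GG disk would become the active member of that slot again, which is why I'm asking whether it's worth trying.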
Is all lost, or is it possible to undo the replacement so I can replace the other disk, resilver, and then replace the first disk afterwards? Alternatively, is it possible to resilver using 8GG, or to clone it somehow so that I'd have all replicas and could then replace 0BM?
My procedure for replacing 8GG:
- ran zpool scrub files
- offlined 8GG in the GUI
- shut down the server
- physically replaced the disk
- booted up
- replaced 8GG in the GUI
- this is where 0BM failed
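For reference, I believe the GUI steps above correspond roughly to the following commands (gptids taken from the status output below; I'm hedging here since the FreeNAS GUI may do extra bookkeeping under the hood):

```shell
# Check pool health before touching anything
zpool scrub files

# Offline the failing disk (8GG) - this gptid is what the pool
# previously showed for it
zpool offline files gptid/b79a4f7f-ba3e-11e0-b27e-50e54950d8e6

# ...shut down, swap the physical disk, boot back up...

# Replace the offlined disk with the freshly partitioned new one,
# which kicks off the resilver
zpool replace files gptid/b79a4f7f-ba3e-11e0-b27e-50e54950d8e6 \
    gptid/20a24ed9-8a8e-11e2-8608-000c29eb7541
```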
Booting up I get this from 0BM, repeatedly.
Code:
(da4:mps0:0:3:0): READ(10). CDB: 28 0 e8 e0 87 80 0 1 0 0
(da4:mps0:0:3:0): CAM status: SCSI Status Error
(da4:mps0:0:3:0): SCSI status: Check Condition
(da4:mps0:0:3:0): SCSI sense: MEDIUM ERROR info:e8e08860 asc:11,0 (Unrecovered read error)
zpool status without 8GG plugged in; the FAULTED disk is 0BM, the UNAVAIL one is 8GG (being replaced):
Code:
  pool: files
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 8h51m with 203133 errors on Fri Mar 15 01:41:28 2013
config:

        NAME                                              STATE     READ WRITE CKSUM
        files                                             DEGRADED     0     0  336K
          raidz1-0                                        DEGRADED     0     0  959K
            replacing-0                                   DEGRADED     0     0     0
              7423993124130373765                         UNAVAIL      0     0     0  was /dev/dsk/gptid/b79a4f7f-ba3e-11e0-b27e-50e54950d8e6
              gptid/20a24ed9-8a8e-11e2-8608-000c29eb7541  ONLINE       0     0     0
            gptid/b83b4301-ba3e-11e0-b27e-50e54950d8e6    FAULTED      9    79     2  too many errors
            gptid/b8b1b0f0-ba3e-11e0-b27e-50e54950d8e6    ONLINE       0     0     0
            gptid/b92173a6-ba3e-11e0-b27e-50e54950d8e6    ONLINE       0     0     0
            gptid/b98c8805-ba3e-11e0-b27e-50e54950d8e6    ONLINE       0     0     0
            gptid/ba01227d-ba3e-11e0-b27e-50e54950d8e6    ONLINE       0     0     0

errors: 203133 data errors, use '-v' for a list