danb35
So I was too clever by half, and I've now gotten my pool into a state I'm not sure how best to recover from. I wanted to remove one of my drives to run a couple of tests, and didn't want to bother with the few minutes of system downtime it would take to power down. So I offlined the drive from the GUI, pulled it out of its hot-swap bay, ran the tests, put it back in, and tried to replace it with itself. That's when the fun started.
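(For reference, the GUI's "Offline" should amount to roughly the following at the CLI; the gptid is my read of the "was /dev/gptid/..." line in the status output below, so treat this as a sketch rather than exactly what the middleware ran.)
Code:
# Roughly what the GUI's Offline does, going by the gptid that shows up in
# the zpool status output below; a sketch, not the middleware's exact call
zpool offline tank gptid/f6284bf9-8e41-11e4-8732-0cc47a01304d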
The web GUI gave me an error of:
Code:
Nov 13 09:09:35 freenas2 manage.py: [middleware.exceptions:38] [MiddlewareError: Disk replacement failed: "invalid vdev specification, use '-f' to override the following errors:
/dev/gptid/2b1fae37-8a10-11e5-bec2-002590de8695 is part of active pool 'tank'
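If I'm reading that right, the partition the GUI just created still carries the old ZFS label claiming membership in tank, and that leftover label is what the replace is choking on. I'd guess it could be confirmed with zdb (this is my assumption, not something the error actually says):
Code:
# Inspect the ZFS labels on the partition named in the error; if my guess
# is right, the output should still show pool 'tank' and its guid
zdb -l /dev/gptid/2b1fae37-8a10-11e5-bec2-002590de8695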
...so I figured I'd have to reboot anyway, and that the system would pick the drive back up on bootup. No dice. When I rebooted and ran zpool status, this is what came up:
Code:
[root@freenas2] ~# zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Thu Nov 5 03:45:51 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        freenas-boot                                    ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/1b6fb23e-bec6-11e4-8407-0cc47a01304d  ONLINE       0     0     0
            gptid/1b7f00c5-bec6-11e4-8407-0cc47a01304d  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 26h21m with 0 errors on Mon Nov 2 01:21:42 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/9a85d15f-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9afa89ae-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9b6cc00b-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9c501d57-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9cc41939-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9d39e31d-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
          raidz2-1                                      DEGRADED     0     0     0
            gptid/f5b737a6-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            7019498564335405691                         OFFLINE      0     0     0  was /dev/gptid/f6284bf9-8e41-11e4-8732-0cc47a01304d
            gptid/f68f4fa9-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/f722e509-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/f7d115c2-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/f84821c1-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0

errors: No known data errors
So then I figured I should be able to online the device, as the status message says:
Code:
[root@freenas2] ~# zpool online tank /dev/gptid/f6284bf9-8e41-11e4-8732-0cc47a01304d
warning: device '/dev/gptid/f6284bf9-8e41-11e4-8732-0cc47a01304d' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present
[root@freenas2] ~# zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Thu Nov 5 03:45:51 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        freenas-boot                                    ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/1b6fb23e-bec6-11e4-8407-0cc47a01304d  ONLINE       0     0     0
            gptid/1b7f00c5-bec6-11e4-8407-0cc47a01304d  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 26h21m with 0 errors on Mon Nov 2 01:21:42 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/9a85d15f-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9afa89ae-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9b6cc00b-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9c501d57-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9cc41939-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/9d39e31d-8d5c-11e4-8732-0cc47a01304d  ONLINE       0     0     0
          raidz2-1                                      DEGRADED     0     0     0
            gptid/f5b737a6-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            7019498564335405691                         UNAVAIL      0     0     0  was /dev/gptid/f6284bf9-8e41-11e4-8732-0cc47a01304d
            gptid/f68f4fa9-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/f722e509-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/f7d115c2-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0
            gptid/f84821c1-8e41-11e4-8732-0cc47a01304d  ONLINE       0     0     0

errors: No known data errors
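Taking the message at its word, the CLI version would presumably be a zpool replace against the numeric guid, though I'd expect it to hit the same stale-label complaint as the GUI unless forced; the target gptid here is just the one from the GUI's failed attempt, so this is a guess on my part:
Code:
# Sketch of the replace the message suggests; the target gptid is the
# partition from the GUI's failed attempt, so this is an assumption
zpool replace -f tank 7019498564335405691 gptid/2b1fae37-8a10-11e5-bec2-002590de8695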
Since this is a RAIDZ2 vdev, I'm not critically worried, but obviously it's something that needs to be fixed quickly. I could, I guess, wipe the disk with DBAN and then replace it, but that seems kind of wasteful. Any other ideas (other than "next time, just shut down the server!")?
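(If it helps frame an answer: rather than a full DBAN pass, I'm assuming it would be enough to blow away the partition table and zero the first few MB of the disk before re-running the replace. ada3 below is a placeholder; I'd confirm which device the disk actually is with glabel status first.)
Code:
# ada3 is a placeholder; match the disk to its device name first with:
#   glabel status
gpart destroy -F ada3                        # drop the partition table
dd if=/dev/zero of=/dev/ada3 bs=1m count=16  # zero the start of the disk
# then retry the replace of the UNAVAIL member from the GUI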