purduephotog · Explorer · Joined: Jan 14, 2013 · Messages: 73
So... how many drives did you pull before it 'came back'? Was it just after the first one? Can you enumerate your steps using the history command?
root@mfsbsd:/root # history
 1  10:03   zpool import
 2  10:03   zpool import storage
 3  10:04   zpool import -T 735242 storage
 4  10:07   zpool status
 5  10:07   zpool status -v
 6  10:10   zpool status -x
 7  10:15   zpool status -v
 8  10:15   zpool import storage
 9  10:15   zpool status
10  10:16   zpool status -xv
11  16:10   history
Apparently most, if not everything, is still intact...!
Great... I'm very happy right now, but before I do something stupid: what should I do next? Can the ZFS pool be repaired and brought back into working order, so I don't even lose my FreeNAS settings etc.?
Of course it would be best to back everything up now, but I don't have the spare space or hard disks to transfer it all. Still, since I can access the pool now, that means it can also be fixed, right?
Please provide me with some solutions or next steps... Thanks...
errors: Permanent errors have been detected in the following files:

        /rw/storage/Jail/plugins/var/log/messages
root@mfsbsd:/root # zpool status -v
  pool: storage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 3h50m with 0 errors on Sun Feb 24 03:13:57 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         DEGRADED     0     0     2
          raidz2-0                                      DEGRADED     0     0     4
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE       0     0     0
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            5393521929904432319                         UNAVAIL      0     0     0  was /dev/gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /rw/storage/Jail/plugins/var/log/messages
root@mfsbsd:/root # camcontrol devlist
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 0 lun 0 (pass0,da0)
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 1 lun 0 (pass1,da1)
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 3 lun 0 (pass2,da2)
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 4 lun 0 (pass3,da3)
<ATA WDC WD20EARX-008 AB51>        at scbus0 target 5 lun 0 (pass4,da4)
<SanDisk Extreme 0001>             at scbus11 target 0 lun 0 (da5,pass5)
root@mfsbsd:/root # zpool import storage
cannot import 'storage': I/O error
        Recovery is possible, but will result in some data loss.
        Returning the pool to its state as of Sun Mar 24 11:13:56 2013
        should correct the problem.  Approximately 496 minutes of data
        must be discarded, irreversibly.  Recovery can be attempted
        by executing 'zpool import -F storage'.  A scrub of the pool
        is strongly recommended after recovery.
Yeah exactly... I have been patient enough so far and I am really happy that at least most stuff seems to be safe (as far as I can tell).
I also ran ls -Ral on storage, and after displaying several files and directories it rebooted the machine?!
Hopefully it didn't do any damage... :S
*FUCK*.... there's that bloody I/O error AGAIN..... there's some hardware problem SOMEWHERE!
I'd take that new controller and all the disks you've got connected and put them in another system. If a simple "ls" is causing your system to crash, trying to copy files isn't going to go very smoothly either.
Wait for PaleoN just so we have some consensus.
Just noticed the 496 minutes of suggested rollback is 'after' the initial problem accessing the pool.
This implies the pool has changed after it was mounted for recovery.
Would it have been better to have been trying readonly mounts exclusively during attempted recovery? Just to ensure nothing got changed on a potentially damaged pool?
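A read-only import of the sort suggested here could look like the following. This is a hedged sketch: the pool name "storage" is from this thread, and /mnt is an arbitrary altroot chosen for illustration.

```shell
# Import the damaged pool read-only so recovery attempts cannot write to it.
# -o readonly=on keeps all datasets read-only; -R /mnt keeps mountpoints
# contained under an alternate root on the rescue system.
zpool import -o readonly=on -R /mnt storage

# Confirm the pool really came up read-only before touching any data:
zpool get readonly storage
```

With the pool read-only, no new transaction groups are written, so a later rewind attempt has the same set of recovery points available.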
Yeah, well, I/O error again, but now it says it can be repaired with "only" 496 minutes of data lost.
Maybe this data was "already" lost, but ZFS couldn't calculate or display that before because of the other problems with the pool...?
And maybe the "ls -Ral" command triggered a reboot because it tried to access/read files which were destroyed or unavailable. My guess is that it rebooted at 30% - 40% of the files being displayed; I can tell where it rebooted by a movie I downloaded. I don't know how ZFS stores files, but I reckon it's randomly written, right?
In regards to the 496 minutes of lost data: how should I interpret that? I think it's normal for ZFS to show this amount in minutes, but what does that mean in MB or GB? I can't put a finger on that...
Also it says it wants to, and I quote:
"Returning the pool to its state as of Sun Mar 24 11:13:56 2013"
That's the state AFTER I could finally access the ZFS pool called "storage". But after that I didn't add, change or remove files. So how can stuff be lost if I didn't change, add or delete anything after that time? Weird...
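One way to make sense of the "minutes" figure: ZFS rewinds to an earlier transaction group (txg), so the window is expressed in time, and how much actual data that represents depends entirely on write activity during those minutes. A hedged sketch of previewing a rewind before committing to it:

```shell
# Dry-run recovery: with -F -n, zpool reports whether discarding the most
# recent transactions would make the pool importable, without actually
# discarding anything.
zpool import -F -n storage

# If that looks acceptable, the real rewind discards those txgs for good:
# zpool import -F storage
```

On a pool that was idle during the rollback window, 496 minutes of discarded transactions can amount to almost nothing in MB or GB terms; it may also explain how "loss" is reported even when no files were deliberately changed, since background writes (logs, atime updates, metadata) still create transactions.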
I also think I'd do what the error says and try "zpool import -F storage" and then do a scrub; then you can remount it read-only and try copying stuff off when you have some disks to copy to.
root@mfsbsd:/root # zpool import
   pool: storage
     id: 17472259698871586545
  state: DEGRADED
 status: One or more devices are missing from the system.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: http://illumos.org/msg/ZFS-8000-2Q
 config:

        storage                                         DEGRADED
          raidz2-0                                      DEGRADED
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE
            5393521929904432319                         UNAVAIL  cannot open
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE

root@mfsbsd:/root # zpool import storage
cannot import 'storage': I/O error
        Recovery is possible, but will result in some data loss.
        Returning the pool to its state as of Sun Mar 24 11:13:56 2013
        should correct the problem.  Approximately 496 minutes of data
        must be discarded, irreversibly.  Recovery can be attempted
        by executing 'zpool import -F storage'.  A scrub of the pool
        is strongly recommended after recovery.

root@mfsbsd:/root # zpool import -F storage
Pool storage returned to its state as of Sun Mar 24 11:13:56 2013.
Discarded approximately 496 minutes of transactions.

root@mfsbsd:/root # zpool status
  pool: storage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 3h50m with 0 errors on Sun Feb 24 03:13:57 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE       0     0     0
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            5393521929904432319                         UNAVAIL      0     0     0  was /dev/gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list

root@mfsbsd:/root # zpool status -v
  pool: storage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 3h50m with 0 errors on Sun Feb 24 03:13:57 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE       0     0     0
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            5393521929904432319                         UNAVAIL      0     0     0  was /dev/gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /rw/storage/Jail/plugins/var/log/messages
root@mfsbsd:/root #
If it were me, I'd hold off on doing a scrub.
I thought it was inadvisable to scrub a pool that is having problems?
I agree the 496-minute rollback is probably minor. But I wouldn't be scrubbing it until I had copies of my data, or unless a scrub was absolutely required in order to get at the data.
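Copying everything off before any scrub or repair could look like this. A hedged sketch: /mnt/backup is a hypothetical destination with enough free space, and rsync may need to be installed separately on a rescue system like mfsBSD.

```shell
# Copy the whole pool's contents to a backup location, preserving
# permissions, hard links, and timestamps.
rsync -aHv /storage/ /mnt/backup/storage/
```

Re-running the same rsync command after an interruption (such as one of the reboots seen in this thread) resumes roughly where it left off, which matters on a machine that keeps crashing mid-copy.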
Honestly, in this situation, if I had good copies of my data, I'd probably recreate the pool from scratch just to be sure. After copying everything off, and before recreating the pool, I'd probably do some surface verification of the drives. As in a dd wipe, dd read, and another smart long test of each drive. Might be a bit overkill, but it's something I would do.
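The surface verification described above might look like the following on FreeBSD. A hedged sketch: the dd write pass is destructive, so it must only run after all data has been copied off; da0 is an example device taken from the camcontrol listing earlier, and each disk would be checked in turn.

```shell
# SMART long self-test first (smartctl is from the smartmontools package):
smartctl -t long /dev/da0

# Full write pass -- WIPES THE DISK:
dd if=/dev/zero of=/dev/da0 bs=1m

# Full read pass to verify every sector reads back cleanly:
dd if=/dev/da0 of=/dev/null bs=1m

# Check the SMART attributes afterwards for reallocated or pending sectors:
smartctl -a /dev/da0
```

A disk that survives a full write pass, a full read pass, and a clean long self-test is reasonably unlikely to be the source of the I/O errors, which helps narrow the fault down to cabling, controller, or PSU.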
Uhmz... I just read titan_rw's post about holding off on the scrub... :s
It's now running... :/
- - - Updated - - -
And a new reboot during scrub. Not good. :S
root@mfsbsd:/root # zpool status -v
  pool: storage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Sun Mar 24 22:07:34 2013
        33.0G scanned out of 3.74T at 352M/s, 3h4m to go
        0 repaired, 0.86% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         DEGRADED     0     0     8
          raidz2-0                                      DEGRADED     0     0    32
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE       0     0     0
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            5393521929904432319                         UNAVAIL      0     0     0  was /dev/gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        storage:<0x0>
        /rw/storage/Jail/plugins/var/log/messages
Or I occasionally do other things; I was also fighting a PoS new cable modem today and only looked now.

Yeah, that's not good, but I suspect it has to do with that I/O error. I would just stop and wait for some disks to copy stuff to, AND see if PaleoN has any ideas, though at this point he probably doesn't want to help since you didn't wait... I don't know.
> Now I am running / trying the commands without 1 disk (at a time).

This is not what I asked you to do. You were only supposed to try a normal zpool import with a missing disk. Still, I'm ecstatic you appear to have made some progress.
> I think in this case a scrub is needed.

Quite possibly required, but as titan_rw suggested I'd attempt a read-only import first. Try to see what we can get first and then scrub. This may be an upgraded pool, which means the failmode is likely not set to continue, which would be better.
> Now it is getting worse. Whenever I try to import 'storage' it gives a kernel panic and causes a reboot.

Essentially, we can always make it worse. Did you happen to record the kernel panic? While I certainly can't do anything directly with it, it very well may provide some useful information. Write down or take pictures of all such occurrences.
> There's that bloody I/O error AGAIN..... there's some hardware problem SOMEWHERE!

Assuming the message isn't a red herring, which I'm inclined to believe it's not. Perhaps the PSU isn't sending the correct voltages to all the drives all the time. Usually such problems are more overt, but it'd be nice to rule out everything except the disks themselves.
> ...and I would redo the mount -F and scrub after each crash.

And don't do this. At least not blindly. Though some -F imports may be required.
If the pool is still up after that scrub, leave it alone and leave it on. If not record the error, shutdown and reconnect the disconnected drive before trying any further imports.
root@mfsbsd:/root # zpool status
no pools available
root@mfsbsd:/root # zpool import
   pool: storage
     id: 17472259698871586545
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        storage                                         ONLINE
          raidz2-0                                      ONLINE
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE
            gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a  ONLINE
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE
root@mfsbsd:/root # zpool import storage
root@mfsbsd:/root # zpool import -T 735242 storage
Pool storage returned to its state as of Fri Mar 15 02:03:31 2013.
root@mfsbsd:/root # zpool status
  pool: storage
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 3h50m with 0 errors on Sun Feb 24 03:13:57 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         ONLINE       0     0     2
          raidz2-0                                      ONLINE       0     0     4
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE       0     0     0
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list
root@mfsbsd:/root # zpool status -v
  pool: storage
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 3h50m with 0 errors on Sun Feb 24 03:13:57 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        storage                                         ONLINE       0     0     2
          raidz2-0                                      ONLINE       0     0     4
            gptid/19177fb9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/19b5ec3a-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/3dc2f956-3de6-11e2-8af1-00151736994a  ONLINE       0     0     0
            gptid/1aefa3e9-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/1b8f2b64-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0
            gptid/1c2d6a74-25fa-11e2-9ab0-00151736994a  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /rw/storage/Jail/plugins/var/log/messages