ZFS failed, can't import pool

Status
Not open for further replies.

Beeth

Cadet
Joined
Jan 5, 2016
Messages
5
I am having a big problem with ZFS/FreeNAS that may cause me to lose confidence in it. Can anyone please help?
Here are the details:

I have 5x 2TB drives in raidz1 with about 5TB of data on it, serving as my data archive/backup, to which I regularly dump data from a Synology NAS (which has less storage). I keep the FreeNAS box offline most of the time, but when I powered it on last week I found that one of the disks had bad sectors (reported by smartd). I upgraded FreeNAS from 9.1.0 to 9.3.1 without upgrading the ZFS pool, and everything looked fine until I decided to replace the failing disk "ada3", which was still ONLINE while the pool zfs0 was reported as HEALTHY.

I took ada3 offline (the pool became DEGRADED), powered FreeNAS off, swapped in a good disk, and powered it back on. I could see the zpool for about two minutes, then the system suddenly hung: I lost the GUI and SSH connections and got no response from the console. I reset the box, but it couldn't boot (it was stuck on the message "spa_load_impl waiting claims to sync"; waiting a long time didn't help). I reinstalled FreeNAS 9.3.1 so I could boot the system and ran "zpool import" to list the pool; however, "zpool import -f zfs0" failed. It ran for a long time, even a full day, with no output on the console, and every other zfs command failed, so I believe the import was hanging the system. I also tried FreeNAS 9.1.0 just to make sure I was using the right ZFS version, but had no luck (even though I had not upgraded the pool when I went from 9.1.0 to 9.3.1).

I did a lot of searching and found quite a few people complaining that "zpool import" took a long time or hung the system, but I saw no solution. I did find that I could import the pool read-only (why???), and at least I can see my data inside the pool (though I'm not sure all of it is intact).

[root@freenas] ~# zpool import
   pool: zfs0
     id: 756149107281189722
  state: DEGRADED
 status: One or more devices are offlined.
 action: The pool can be imported despite missing or damaged devices.  The
         fault tolerance of the pool may be compromised if imported.
 config:

        zfs0                                            DEGRADED
          raidz1-0                                      DEGRADED
            gptid/31e69f5e-1523-11e3-b638-406186f225d7  ONLINE
            gptid/323b141e-1523-11e3-b638-406186f225d7  ONLINE
            gptid/32955024-1523-11e3-b638-406186f225d7  ONLINE
            7438062148680404053                         OFFLINE
            gptid/334be691-1523-11e3-b638-406186f225d7  ONLINE

[root@freenas] ~# zpool import -f -o readonly=on zfs0 mnt
[root@freenas] ~#
[root@freenas] ~# zpool status -v
  pool: mnt
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 10.8M in 0h0m with 0 errors on Sun Jan  3 08:17:02 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        mnt                                             DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/31e69f5e-1523-11e3-b638-406186f225d7  ONLINE       0     0     0
            gptid/323b141e-1523-11e3-b638-406186f225d7  ONLINE       0     0     0
            gptid/32955024-1523-11e3-b638-406186f225d7  ONLINE       0     0     0
            7438062148680404053                         OFFLINE      0     0     0  was /dev/gptid/32ea34ce-1523-11e3-b638-406186f225d7
            gptid/334be691-1523-11e3-b638-406186f225d7  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x95>
        <metadata>:<0xb7>


I could copy all 5TB+ of data off the NAS, rebuild it, and copy the data back, but I'm having difficulty finding another 5TB+ of free space to hold it all, and it would definitely take days to transfer the data back and forth. Although I can go through all that trouble to get the NAS working again, I wonder whether there is an easier way to fix the zpool. Most importantly, I want to understand why FreeNAS/ZFS could fail like this, because I really don't want to hit the same problem in the future. I tried to be very careful when I picked ZFS years ago and with every single change I made to the system, so I don't know what I could have done wrong. Also, why did I get those metadata errors when all the other disks are still good and the single failing disk only has a few bad sectors? Why can the pool be imported read-only while a normal import fails? Without a good explanation for this, I wonder if I should go with another filesystem, such as BTRFS, for the new build.
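
For reference, this is roughly how I would copy the data off the read-only import if I have to go that route (just a sketch; "backuphost" is only a placeholder for wherever I can find enough free space):
Code:
# import read-only under an alternate root so nothing writes to the pool
zpool import -f -o readonly=on -R /mnt zfs0
# copy the datasets out to temporary storage over the network
rsync -avP /mnt/zfs0/ backuphost:/volume1/zfs0-copy/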

PLEASE HELP. THANK YOU IN ADVANCE!
 

Beeth

Cadet
Joined
Jan 5, 2016
Messages
5
That being said, it seems ZFS can fail from a single drive with some bad sectors, and users risk losing all their data. Any idea how to prevent this from happening again? Thanks!
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Your hardware please. Motherboard model, amount and type of RAM, etc.
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
As @DrKK said, we need specs. It sounds like a lack of RAM is preventing the pool from mounting.
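
If it is memory pressure, it's worth watching wired memory and the ARC while the import runs before anything else. A rough sketch for FreeBSD/FreeNAS 9.x (capping the ARC sometimes helps on small-RAM boxes, though more RAM is the real fix; the 4G value is only an example):
Code:
# watch memory usage while the import is running
top -o res
sysctl kstat.zfs.misc.arcstats.size
# optionally cap the ARC via a loader tunable, then reboot and retry the import
echo 'vfs.zfs.arc_max="4G"' >> /boot/loader.conf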
 

Bhoot

Patron
Joined
Mar 28, 2015
Messages
241
I'm resilvering as well, and it's a very different experience with a RAIDZ2 vdev.
Code:
bhoot@freenas:~ % zpool status -v
  pool: bhoot
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
  continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Jan  8 13:48:03 2016
  13.3T scanned out of 18.1T at 49.3M/s, 28h13m to go
  1.63T resilvered, 73.61% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        bhoot                                           ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/cce1f2ca-e4d8-11e4-b39d-f07959376c84  ONLINE       0     0     0
            gptid/cd427285-e4d8-11e4-b39d-f07959376c84  ONLINE       0     0     0
            gptid/cda4f9a1-e4d8-11e4-b39d-f07959376c84  ONLINE       0     0     0
            gptid/ce06b19f-e4d8-11e4-b39d-f07959376c84  ONLINE       0     0     0
            gptid/ce69a75d-e4d8-11e4-b39d-f07959376c84  ONLINE       0     0     0
            gptid/4cfc6072-b5e0-11e5-8742-f07959376c84  ONLINE       0     0     0  (resilvering)
            gptid/cf2dd08e-e4d8-11e4-b39d-f07959376c84  ONLINE       0     0     0
            gptid/cf91d6e8-e4d8-11e4-b39d-f07959376c84  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        freenas-boot    ONLINE       0     0     0
          da0p2         ONLINE       0     0     0

errors: No known data errors


Could the OP be suffering from the "Evil Ram" problem?
@Beeth please read @cyberjock's guide on what to do and what not to do: https://forums.freenas.org/index.ph...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/
Personally, I think a degraded RAIDZ1 is as good as a stripe: once the parity redundancy is gone, checksums can still detect errors but can no longer repair them, so almost any further failure or read error will lose data. As @jgreco said, and I quote: "You're only under threat of losing data if ZFS is not able to retrieve your data from *somewhere*. Two disks being totally removed from a RAIDZ2 eliminates redundancy, and ANY read error on the remaining disks could result in an uncorrectable error. That's why for a larger array, we use RAIDZ3 ... because experience says failures like to crop up in small batches."
This was the answer to a question I asked: https://forums.freenas.org/index.ph...oss-of-two-hdds-in-raid-z2.40476/#post-254820
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I find it curious that @Beeth has not seen fit to give us the information we need to assess his problem.
 

Bhoot

Patron
Joined
Mar 28, 2015
Messages
241
I find it curious that @Beeth has not seen fit to give us the information we need to assess his problem.
Maybe he's just a little (or more than a little) sad. I remember the first time I had a problem with a boot drive: an HP 16GB USB stick failed within the first month of building my FreeNAS system. I had just finished dumping multiple external disks onto the machine when suddenly it wouldn't start. I was running the system completely headless at the time, so I didn't even know it was a boot drive failure. I went into a semi-depressed state trying to figure out why the headless system wasn't working :D. I finally picked up a used monitor with a defective power button; now I literally pull the power cable to switch it off.
The guy probably hasn't read a lot about FreeNAS before building his system. He's been a member since Tuesday and has only posted in this thread. @Beeth, you should stay active on the forums once you have reported a problem. This is one of the most active forums I have seen, with the gurus and admins online most of the time. Their response times and advice are faster than calling a support number and waiting on hold, but they can only help if you are cooperative and active.

No offense meant to anyone :)
 

Beeth

Cadet
Joined
Jan 5, 2016
Messages
5
Your hardware please. Motherboard model, amount and type of RAM, etc.

Hi DrKK, Bhoot,

Thank you for your comments, and I'm sorry for not updating this post sooner.

I used 12GB of non-ECC RAM for the NAS. I understood the importance of ECC RAM for filesystems/storage, but this is a home-built NAS serving as a backup for my main Synology NAS, so I decided to take the risk, which I thought was low, especially the risk of losing all the data. It looks like I was wrong.

I ran memtest86 against the full 12GB of non-ECC RAM for an entire day and not a single error was found. I understand this doesn't prove a bad DIMM wasn't the root cause. However, I think the chance is really low unless someone can convince me it was bad RAM, from the system log or somewhere else I can check. If not, how can I know the same problem won't happen again once I switch to proper ECC RAM, or is this just a ZFS bug?

Also, what confuses me most, as I mentioned in my original post, is that I can successfully import/mount the pool in read-only mode and copy data out without any problem (not the whole 5TB, but at least several hundred GB so far; if some files turn out to be corrupted I don't care much, but I haven't found any yet). I found other posts in this forum about the same read-only-import issue, with no solution. Why? What corrupted metadata makes ZFS allow a read-only mount but refuse a read-write one, and what exactly happened? Is there any chance this can be fixed so that I can do a normal zpool import?

I believe that not only ZFS but other filesystems like NTFS and ext3/ext4 also rely on ECC RAM for reliability. What surprised me about ZFS is that I was in fact risking losing all the data instead of just some of it, the "all or none" that "cyberjock" describes in his great article at https://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/, and now it's real! Even if I use RAIDZ2 instead of RAIDZ1, it won't help much if I keep my current specs (non-ECC). For my purpose, a backup NAS for the main one, I really want to reconsider whether to use ZFS again; maybe I'll try SnapRAID. What do you think? I appreciate any comments and suggestions.

Thanks again!
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I suspected the issue was insufficient RAM, rather than ECC vs. non-ECC, which frankly shouldn't make a difference in this case.

12GB sounds like plenty of RAM to mount this pool. I am not sure why it isn't mountable read-write. Perhaps @cyberjock will have an idea.
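
One thing that might be worth trying before anything more drastic is a rewind import, which throws away the last few transaction groups. Run the dry-run form first and only attempt the real thing if it reports the pool would become importable; no guarantees, and the most recent writes would be lost:
Code:
# dry run: check whether discarding recent transactions would make the pool importable
zpool import -F -n zfs0
# if that looks sane, the actual rewind import (discards the newest transactions)
zpool import -F zfs0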
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
@Beeth ,

If I had to guess, I'd say you lost one disk and another disk had some errors, silent corruption, or something like that. Using RAIDZ1 is like playing Russian roulette with your data: all you need is a few errors in the right (or wrong, depending on how you look at it) places and poof, it's all gone. ZFS is very much "all or none", as you quoted me saying. It's a fairly dangerous game to play when you have no redundancy left. I'd gladly pay for the extra disk and go with RAIDZ2 rather than stick with RAIDZ1.

In your case, it looks like this is an archive/backup system, so the data may be easily replaceable and, aside from a little inconvenience, you lost nothing very important. Still, you lost who-knows-how-many hours troubleshooting and later recovering. Is the inconvenience, the time spent troubleshooting, and the recovery worth the cost of an extra disk? Generally, the answer is "yes". In fact, when this kind of thing happens, nobody ever says "I'll gladly rebuild my RAIDZ1 and do this all over again". It blew up in their face, they see the path to the light, and they go RAIDZ2. ;)

RAIDZ2 is still useful even if you stick with non-ECC RAM. The two are totally different issues and solve different problems. RAIDZ2 protects you with an additional disk of redundancy; ECC RAM protects you from RAM-based corruption. You could go with a 25-way mirror (25 copies of the data on 25 different disks) and bad RAM could still corrupt all 25 copies on all 25 disks, rendering your zpool just as broken. So go with RAIDZ2, but remember that bad RAM can still take down your zpool just as quickly and as painfully as what you just experienced.

Unfortunately, you are correct that a RAM test can't really prove RAM was at fault; it just doesn't rule it out. So you still have that unknown factor you can never eliminate. :/

And let me say that you have my sympathies for your data loss. It's never fun. :(
 

Beeth

Cadet
Joined
Jan 5, 2016
Messages
5
@cyberjock,

Thanks for the comments and advice! I think I will go with RAIDZ2 if I decide to keep using ZFS for the new build.

In fact, I didn't lose my data, because I could still import/mount the pool and access all of it (at least I haven't found any problems with the data I've copied out so far, although there's no way to be sure nothing was affected).

When I found that the pool could be mounted read-only, I assumed it would be able to recover by itself. I still don't understand why ZFS doesn't try to fix/repair the pool in this situation. What extra metadata does the pool need for read-write that it doesn't need for read-only?

Thanks again!

 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
@cyberjock

Can you speak a little more on what circumstances would cause a pool to be mountable read-only versus read-write? I am interested in that.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Problems with the spacemap (needed for writes) can cause the pool to only work read-only.

Had this pool ever been scrubbed? Even with no redundancy, metadata corruption also seems unlikely to me, given that ZFS supposedly stores it twice.
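
Regular scrubs are what catch this kind of corruption while there is still redundancy left to repair it. Once the pool is healthy again (or on the new build), running something like this on a schedule is the usual practice; FreeNAS can also schedule scrubs from the web GUI:
Code:
# start a scrub and check its progress/results
zpool scrub zfs0
zpool status -v zfs0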
 