CRITICAL: The volume Main (ZFS) state is UNKNOWN after replacing a drive

Status
Not open for further replies.

Ahmed Esmat

Cadet
Joined
Nov 25, 2014
Messages
5
I replaced a failed drive and am now getting this error:
CRITICAL: The volume Main (ZFS) state is UNKNOWN:

This happened even though I set the failed drive to OFFLINE before shutting down the system.

Volume "Main" appears only when I try to extend a volume.
When trying to add a drive, I can see that 3 drives her... I just canceled and did nothing.

I did not attempt to write any data to the volume after it became degraded, and it was functioning normally just before I switched off the server.
 

Attachments

  • Fullscreen capture 232017 40055 PM.bmp.jpg
  • Fullscreen capture 232017 40637 PM.bmp.jpg
  • Fullscreen capture 232017 42403 PM.bmp.jpg
  • Fullscreen capture 232017 42637 PM.bmp.jpg
  • Fullscreen capture 232017 42703 PM.bmp.jpg
  • New Disk.jpg

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
It appears you suffered a second disk failure after shutdown. With 2 out of 3 drives
failed, you appear to have lost your pool.

That said, it's possible you removed the wrong drive. My suggestion is this:
  1. Power down gracefully, (using the GUI or CLI)
  2. Put back the bad drive.
  3. Check all the other cables, (did you accidentally remove a data cable from another drive?).
  4. Boot up.
  5. If the pool is now available, (with 1 off-lined drive), write down the serial numbers of the drives, including which one is bad.
  6. Perform a full backup, if possible. If not, perform a backup of critical files.
  7. Power down gracefully.
  8. Try to remove the failing drive again.
  9. Power back up.
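
For step 5, one way to capture the serial numbers from the shell is sketched below. It assumes smartctl is available on the system and that the disks show up under names such as ada0, ada1 and ada2. list_serials is a made-up helper name, not a FreeNAS command.

```shell
# Hypothetical helper: print "<disk> <serial>" for each disk name given.
# Read-only: smartctl -i only queries the drive's identity information.
list_serials() {
    for disk in "$@"; do
        # Pull the value out of the "Serial Number:" line of the report.
        serial=$(smartctl -i "/dev/$disk" | awk -F': *' '/^Serial Number/ {print $2}')
        echo "$disk $serial"
    done
}
```

Running list_serials ada0 ada1 ada2 together with zpool status Main before pulling any drive gives a written record that ties each device name to a physical disk.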
This brings forth something I learned a long time ago. Never return, re-use or
destroy a failing or failed disk until you have the replacement working. I also
apply this to field engineer disk replacement. They don't get to leave until the
new disk is syncing to the old one(s).

Also, this reminds me that RAID-Z2, (and >= 3 way Mirror'ed vDevs), does
not just protect from 2 disk failures. It also protects against path failures, like bad
cables, cables accidentally removed, disks accidentally removed, and such.

Good luck.
 

Ahmed Esmat

Cadet
Joined
Nov 25, 2014
Messages
5
I still have the failed disk, untouched. But it is not working at all. When I connect it, it makes a tik tik tik sound; even when I connect it to a USB3 adapter and try to read it on a Windows machine, it makes the same sound and is not recognized at all. Anyway, I did as you suggested, but the disk is not recognized at all by FreeNAS.

I'm sure that the other two disks are intact. But I did something that I'm not sure about, and it could be the reason. After I took the bad disk Offline and switched off FreeNAS from the GUI, I removed the bad disk and tested it using the USB3 adapter, just to double check. I also removed one of the good disks and attached it to the USB3 adapter to check whether Windows would recognize it. Once it was recognized, I removed it and put it back in my FreeNAS server in the same place. Could this be the reason? Would Windows write something to the disk? If so, can this be fixed?

Please let me know what else to try; my data is very important. I'm quite sure that the other 2 disks are healthy.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
But I did something that I'm not sure about, and it could be the reason. After I took the bad disk Offline and switched off FreeNAS from the GUI, I removed the bad disk and tested it using the USB3 adapter, just to double check. I also removed one of the good disks and attached it to the USB3 adapter to check whether Windows would recognize it. Once it was recognized, I removed it and put it back in my FreeNAS server in the same place. Could this be the reason? Would Windows write something to the disk? If so, can this be fixed?
...
I don't know if MS-Windows did something to the disk. My experience with MS-Windows is limited.

However, if the disks are the same model and size, you can check the partition table in FreeNAS and
see if the still good disk and the removed, (but thought to be good), disk have the same partition table.
If so, good. If not, then the good but removed disk may have been corrupted by using it in MS-Windows.
Don't know how to fix that.

I think these commands would help show whether they are different, and they should not change anything.

diskinfo -v DEVICE
gpart show DEVICE
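
To run both commands over every disk in one pass, so the results can be compared side by side, a small loop is enough. This is only a sketch: show_disks is a made-up name, and the device names are whatever your system reports.

```shell
# Hypothetical helper: print geometry and partition table for each disk.
# Both commands only read from the disk; neither writes to it.
show_disks() {
    for disk in "$@"; do
        echo "=== $disk ==="
        diskinfo -v "$disk"   # size, sector size, sector count
        gpart show "$disk"    # partition table, if one can be read
    done
}
```

For example, show_disks ada0 ada1 ada2. The loop keeps going if one command fails, so a disk without a readable partition table still gets its own section in the output.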
 

Ahmed Esmat

Cadet
Joined
Nov 25, 2014
Messages
5
Here are the outputs from the commands provided. ada0 is my new disk; ada1 and ada2 are the old ones, which are good.
I have some questions:
- gpart is not showing anything for ada0. Is that because it is the new one?
- gpart is not showing anything for ada2, although it should be a good disk. Why is that?
What proves that the removed disk really is the bad one is that I was still able to read my files after taking it OFFLINE; for example, I played a movie.
Fullscreen capture 242017 120616 PM.bmp.jpg Fullscreen capture 242017 115959 AM.bmp.jpg Fullscreen capture 242017 120430 PM.bmp.jpg
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@Ahmed Esmat,
For the ada0 disk, if it's the new, replacement disk, then yes, lack of a partition table is normal.

The ada1 disk's partition table seems to be perfectly normal for FreeNAS.

If you are certain disk ada2 is the other good disk, and was the one you put into the MS-Windows
machine, then either MS-Windows or the USB adapter seems to have done something to it.
What it did, I would not be able to tell remotely.

But, the lack of a partition table indicates that either the disk has been over-written, or perhaps just
its partition table.

My skills with Unix in general are quite good. And very good for Solaris & Linux. But, FreeBSD,
(used by FreeNAS), not so much. So I can't walk you through a potential fix. Sorry. Perhaps someone
else can help further with the information you have supplied.
 

Ahmed Esmat

Cadet
Joined
Nov 25, 2014
Messages
5
I checked the drive serial. Yes, ada2 is the one I connected to Windows. I wish I had not done that :(
 

Ahmed Esmat

Cadet
Joined
Nov 25, 2014
Messages
5
Looking at the FreeNAS boot screen I can see:

GEOM: ada2: the secondary GPT header is not in the last LBA
GEOM_PART: integrity check failed (ada2, GPT)
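
These messages concern the two copies of the GPT header: the primary sits at LBA 1 and the backup is expected at the last LBA of the disk, and each copy begins with the 8-byte signature EFI PART. A read-only way to check whether that signature is present at a given LBA is sketched below. It assumes 512-byte logical sectors, and gpt_signature_at is a made-up helper name.

```shell
# Hypothetical read-only helper: print the printable characters found in
# the first 8 bytes at a given LBA. A GPT header starts with "EFI PART".
# Assumes 512-byte logical sectors.
gpt_signature_at() {
    # $1 = device or image file, $2 = LBA to inspect
    dd if="$1" bs=512 skip="$2" count=1 2>/dev/null | head -c 8 | tr -cd '[:print:]'
    echo
}
```

gpt_signature_at /dev/ada2 1 checks the primary header. For the backup copy, take the sector count that diskinfo -v reports for the disk and subtract one to get the last LBA. An empty result means no GPT signature was found at that location.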
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Looks like Windows formatted the drive for you. How nice of them to do that. You might be able to rebuild the partitions and hope data is still there. That's not something most people would do though.

 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
You might be able to rebuild the partitions and hope data is still there. That's not something most people would do though.
...
Yes, that is what I might try to do. Except I would see if I could perform a raw backup of the disk first.
Like with a dd piped to gzip then saved to another disk, or series of disks. Thus, if I accidentally screw
up my new partition table work, I'd be able to restore the disk back to its current state. And then try again.
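
A minimal sketch of that raw backup and restore, assuming the source disk reads without errors and the destination has enough free space. The function names and any paths you pass in are made up for illustration.

```shell
# Hypothetical helpers for a raw, compressed disk image.
# raw_backup reads the whole source and writes a gzip-compressed copy.
raw_backup() {
    # $1 = source device (e.g. /dev/ada2), $2 = compressed image file
    dd if="$1" bs=1048576 | gzip -c > "$2"
}

# raw_restore writes the image back. This OVERWRITES the target, so
# double check the device name before running it.
raw_restore() {
    # $1 = compressed image file, $2 = target device
    gzip -dc "$1" | dd of="$2" bs=1048576
}
```

For example, raw_backup /dev/ada2 /mnt/backup/ada2.img.gz, where /mnt/backup stands for a location on a different disk. The image is taken before any partition table work, so a failed attempt can be undone with raw_restore and tried again.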

However, trying to walk someone else who is remote, through that procedure is not something I can do.
I just don't know GPT partition tables, GEOM and FreeBSD that well. Now if it were Solaris SPARC, I've done
something similar dozens of times.
 

nipanike

Cadet
Joined
Sep 9, 2017
Messages
2
Ahmed, have you managed to save your data?
I'm having the same problem as you after connecting the good disk to Windows.
 

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
Windows cannot recognize a ZFS formatted disk. There is no reason to connect a FreeNAS disk to a Windows system. I guess you have figured that out.

I'm going to guess that you tried to run a Windows command, which attempted to access the disk. No go.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Theoretically, the ZFS data should still be there, you just need to recreate the partition table....

Easier said than done.
 