Pool unlock fails with geli.key

involut

Dabbler
Joined
Feb 27, 2020
Messages
23
Hi,

I recently set up a bare-metal FreeNAS server, configured everything, and put data on the system. The version I used was 11.2-U2. I created a passphrase and downloaded the geli.key file.

Since then, I have moved to 11.3-U1 on a VM. I loaded the config file after finishing the installation, reworked the network interface slightly, and gained access to the web interface. The pools are displayed properly, and the disks show up as ada0-ada3 just as they did on the bare-metal setup. When I try to unlock the pool with the .key file, I receive the following error:

[EFAULT] Pool could not be imported: 4 devices failed to decrypt.

Clicking "More info" reveals this:
Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 95, in main_worker
res = loop.run_until_complete(coro)
File "/usr/local/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
return future.result()
File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 51, in _run
return await self._call(name, serviceobj, methodobj, params=args, job=job)
File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 43, in _call
return methodobj(*params)
File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 43, in _call
return methodobj(*params)
File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 965, in nf
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/zfs.py", line 382, in import_pool
zfs.import_pool(found, found.name, options, any_host=any_host)
File "libzfs.pyx", line 369, in libzfs.ZFS.__exit__
File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/zfs.py", line 380, in import_pool
raise CallError(f'Pool {name_or_guid} not found.', errno.ENOENT)
middlewared.service_exception.CallError: [ENOENT] Pool 8982781163759361897 not found.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/pool.py", line 1660, in unlock
'cachefile': ZPOOL_CACHE_FILE,
File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1127, in call
app=app, pipes=pipes, job_on_progress_cb=job_on_progress_cb, io_thread=True,
File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1074, in _call
return await self._call_worker(name, *args)
File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1094, in _call_worker
return await self.run_in_proc(main_worker, name, args, job)
File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1029, in run_in_proc
return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1003, in run_in_executor
return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [ENOENT] Pool 8982781163759361897 not found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 349, in run
await self.future
File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 386, in __run_body
rv = await self.method(*([self] + args))
File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 961, in nf
return await f(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/pool.py", line 1672, in unlock
raise CallError(msg)
middlewared.service_exception.CallError: [EFAULT] Pool could not be imported: 4 devices failed to decrypt.

If you need any additional information to investigate, I will happily provide more details.

Best,
Michael
 
Joined
Oct 18, 2018
Messages
969
I have recently set up a bare metal FreeNAS server, configured everything and put data on the system. The version I have used was 11.2 U2. I have created a passphrase and downloaded the geli.key file.
Did you also happen to add and download the recovery key?

My very first step would be to put geli.key into the place where your system expects to find it for your pool. Check the database for the file name:
Code:
sqlite3 /data/freenas-v1.db 'select vol_encryptkey from storage_volume where vol_name = "<pool>";'

Then check whether that file exists in /data/geli. If it does, make a backup of it:
Code:
cp /data/geli/<key>.key /data/geli/<key>.key.bak

Now copy the key you downloaded to that same file name:
Code:
cp geli.key /data/geli/<key>.key

Try unlocking the pool again. Does it work? If not, continue below.
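One extra check (my own suggestion, not part of the steps above): before overwriting the installed key, compare it with the one you downloaded. If they already match, replacing the file will not change anything. A minimal sketch with a hypothetical helper, check_key:

```shell
#!/bin/sh
# Hypothetical helper: compare a downloaded geli key with the key FreeNAS
# has installed in /data/geli. cmp -s exits 0 only when the files are
# byte-for-byte identical.
check_key() {
    # $1 = path to the downloaded key, $2 = path to the installed key
    if cmp -s "$1" "$2"; then
        echo "keys match"
    else
        echo "keys differ"
    fi
}

# Typical use (paths as in the steps above):
# check_key geli.key /data/geli/<key>.key
```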

Since you went from bare metal to a VM, I'd first check the drives your system expects to have in the pool against the drives your system actually sees:
Code:
sqlite3 /data/freenas-v1.db 'select sed.encrypted_provider from storage_encrypteddisk as sed join storage_volume sv on sv.id = sed.encrypted_volume_id where sv.vol_name = "<pool>";'

Compare that with what you see in /dev/gptid.

If that succeeds, attempt to unlock the disks manually with your geli key:
Code:
geli attach -k /path/to/geli.key /dev/gptid/<device_id>

Try that for every disk in the pool; it seems you have 4 disks. If those all work, then try zpool import. If your pool is listed and your drives unlock, you know you have the correct key and the pool is healthy; in that case I would consider filing a bug. If you cannot unlock your disks, you may have the wrong key. If the correct devices are not listed under /dev/gptid, I would look more closely at your VM settings to be sure FreeNAS is getting unaltered access to the disks. You can, of course, use the CLI to import your pool, but this is not advised: doing so leaves FreeNAS's database out of sync with the pool itself (its state, whether it is encrypted, and so on). You could still access the data that way for backups and the like.
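The per-disk attach step above can be scripted. A minimal dry-run sketch, with a hypothetical key path and placeholder provider IDs (it only prints each command; remove the leading echo to actually run them, and substitute the gptid values from the storage_encrypteddisk query):

```shell
#!/bin/sh
# Dry run: print the geli attach command for each provider in the pool.
# KEY and PROVIDERS are assumptions -- use your own key path and the
# gptid values your database query returned.
KEY=/path/to/geli.key
PROVIDERS="aaaa-1111 bbbb-2222 cccc-3333 dddd-4444"

for p in $PROVIDERS; do
    # Remove "echo" to actually attach the provider.
    echo geli attach -k "$KEY" "/dev/gptid/$p"
done
```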
 

involut

Dabbler
Joined
Feb 27, 2020
Messages
23
Hi Philo,

Thank you very much for reaching out to me.

I have made progress in the meantime: I managed to unlock my pool using the recovery key. The pool unlocks perfectly, and the drives and shares are visible.

However, once I reboot FreeNAS, the pool is locked again and I need to unlock it manually using the recovery key. Is there a way to boot FreeNAS without having to log in and unlock the pool manually? I know this was possible when I was on bare metal.
 
Joined
Oct 18, 2018
Messages
969
FreeNAS uses two keys to unlock your pool. Only one of them is stored on the system; it is typically used to unlock the pool at boot time or when you enter the passphrase. If this key/passphrase combination is not working, you can provide the recovery key. That will unlock your pool, but the system does NOT store the recovery key. Thus, after a reboot the machine has no key with which to unlock the pool, which is why it won't unlock automatically.

To fix this you need to rekey your pool AND regenerate the recovery key. Be sure you do both, do not do only one or the other. If you were using a passphrase, reset the passphrase as well. When you are done, download BOTH keys and store them somewhere safe. Your old keys will no longer be valid and these new keys will be the keys to your pool moving forward. See the instructions for your version of FreeNAS for how to rekey your pool and add a recovery key.
 

involut

Dabbler
Joined
Feb 27, 2020
Messages
23
That's great.

I just tried to reset my keys, following the 'Reset Keys' procedure. I downloaded both the encryption key and the recovery key. After a reboot, the pool is locked again. I was, however, able to unlock the pool with the newly created recovery key.

Am I missing something fundamental?
 
Joined
Oct 18, 2018
Messages
969
Check out this resource I made, it may help. Are you using a passphrase? Could you show the output here of any commands you run from the resource page?
 

involut

Dabbler
Joined
Feb 27, 2020
Messages
23
Here is my follow-up:

Entering the first command, I get no output at all:
Code:
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select sed.encrypted_provider from storage_encrypteddisk as sed join storage_volume sv on sv.id = sed.encrypted_volume_id where sv.vol_name = "{pool_name}";'
root@saturn[~]#


I thought pool_name was meant to be the actual name of the pool, which is why I replaced it with core, the pool's name. Still, no output:
Code:
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select sed.encrypted_provider from storage_encrypteddisk as sed join storage_volume sv on sv.id = sed.encrypted_volume_id where sv.vol_name = "{core}";'
root@saturn[~]#



Code:
root@saturn[~]# gpart show
=>    33  294097  cd0  MBR  (574M)
      33  294097       - free -  (574M)

=>        40  7814037088  ada0  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>        40  7814037088  ada1  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>        40  7814037088  ada2  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>        40  7814037088  ada3  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>      40  16777136  da0  GPT  (8.0G)
        40    532480    1  efi  (260M)
    532520  16220160    2  freebsd-zfs  (7.7G)
  16752680     24496       - free -  (12M)

=>    33  294097  iso9660/FREENAS  MBR  (574M)
      33  294097                   - free -  (574M)



The same odd behaviour occurs with these commands:
Code:
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select vol_encryptkey from storage_volume where vol_name = "{pool}";'
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select vol_encryptkey from storage_volume where vol_name = "{core}";'
root@saturn[~]#



Note: while running these commands, the pool was locked; that is, I entered the commands directly after a reboot, without manually unlocking the pool with the recovery key.
 
Joined
Oct 18, 2018
Messages
969
You need to replace all of "{pool_name}", braces included, with the name of your pool.

This command
Code:
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select sed.encrypted_provider from storage_encrypteddisk as sed join storage_volume sv on sv.id = sed.encrypted_volume_id where sv.vol_name = "{pool_name}";'


Should become this command
Code:
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select sed.encrypted_provider from storage_encrypteddisk as sed join storage_volume sv on sv.id = sed.encrypted_volume_id where sv.vol_name = "core";'

I have updated my resource accordingly.
 

involut

Dabbler
Joined
Feb 27, 2020
Messages
23
I am sorry, I feel completely stupid now... This is my follow-up:

Code:
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select sed.encrypted_provider from storage_encrypteddisk as sed join storage_volume sv on sv.id = sed.encrypted_volume_id where sv.vol_name = "core";'
gptid/196085e2-1152-11ea-957b-3cecef012038
gptid/1a3b1dad-1152-11ea-957b-3cecef012038
gptid/1b1ee326-1152-11ea-957b-3cecef012038
gptid/1be79663-1152-11ea-957b-3cecef012038
root@saturn[~]#


This looks good. All 4 drives that make up the core pool are visible, which aligns with the user interface, where all drives are detected.

Code:
root@saturn[~]# gpart show
=>    33  294097  cd0  MBR  (574M)
      33  294097       - free -  (574M)

=>        40  7814037088  ada0  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>        40  7814037088  ada1  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>        40  7814037088  ada2  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>        40  7814037088  ada3  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)

=>      40  16777136  da0  GPT  (8.0G)
        40    532480    1  efi  (260M)
    532520  16220160    2  freebsd-zfs  (7.7G)
  16752680     24496       - free -  (12M)

=>    33  294097  iso9660/FREENAS  MBR  (574M)
      33  294097                   - free -  (574M)


This looks fine, too.

Code:
root@saturn[~]# ls -al /dev/gptid
total 1
dr-xr-xr-x   2 root  wheel      512 Mar 21 21:42 .
dr-xr-xr-x  12 root  wheel      512 Mar 21 21:42 ..
crw-r-----   1 root  operator  0x93 Mar 21 21:42 196085e2-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0x95 Mar 21 21:42 1a3b1dad-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0x97 Mar 21 21:42 1b1ee326-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0x99 Mar 21 21:42 1be79663-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0x9a Mar 21 21:42 81c87f42-656c-11ea-8728-000c2913c66e


Now this is where it gets interesting. On the bare-metal setup I had an SSD installed and set up as a second pool. Judging by the IDs, the SSD corresponds to 81c87f42-656c-11ea-8728-000c2913c66e.

Code:
root@saturn[~]# sqlite3 /data/freenas-v1.db 'select vol_encryptkey from storage_volume where vol_name = "core";'
7f0b1fbc-8246-419d-9548-c50f0b3757f5


However, I have already tried copying a regular key into /data/geli. I took pool_core_encryption.key and renamed the file to the expected ID, 7f0b1fbc-8246-419d-9548-c50f0b3757f5.key.
Should I use the recovery key for that instead? I have also set up a passphrase with the re-key procedure.

Code:
root@saturn[~]# ls -la /data/geli
total 5
drwxr-xr-x  2 root  www   4 Mar 19 20:57 .
drwxr-xr-x  8 www   www  14 Mar 21 21:43 ..
-rw-r--r--  1 root  www  64 Mar 13 22:02 085ff1fa-ff96-4060-a470-843bc0857e9f.key.backup
-rw-r--r--  1 root  www  64 Mar 19 20:57 7f0b1fbc-8246-419d-9548-c50f0b3757f5.key


What do you suggest? Thank you very much for your help!
 

involut

Dabbler
Joined
Feb 27, 2020
Messages
23
In addition, I have tried a fresh VM instance of FreeNAS without importing the configuration file. I imported the pool through the web interface, which worked perfectly fine. Then I rebooted FreeNAS, and after the reboot the pool was locked again.

This is the output:

Code:
root@freenas[~]# sqlite3 /data/freenas-v1.db 'select sed.encrypted_provider from storage_encrypteddisk as sed join storage_volume sv on sv.id = sed.encrypted_volume_id where sv.vol_name = "core";'
gptid/1be79663-1152-11ea-957b-3cecef012038
gptid/1b1ee326-1152-11ea-957b-3cecef012038
gptid/1a3b1dad-1152-11ea-957b-3cecef012038
gptid/196085e2-1152-11ea-957b-3cecef012038


Code:
root@freenas[~]# ls -la /dev/gptid
total 1
dr-xr-xr-x   2 root  wheel      512 Mar 23 14:46 .
dr-xr-xr-x  11 root  wheel      512 Mar 23 14:46 ..
crw-r-----   1 root  operator  0x89 Mar 23 14:46 196085e2-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0xb1 Mar 23 14:51 196085e2-1152-11ea-957b-3cecef012038.eli
crw-r-----   1 root  operator  0x8b Mar 23 14:46 1a3b1dad-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0xb0 Mar 23 14:51 1a3b1dad-1152-11ea-957b-3cecef012038.eli
crw-r-----   1 root  operator  0x97 Mar 23 14:46 1b1ee326-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0xaf Mar 23 14:51 1b1ee326-1152-11ea-957b-3cecef012038.eli
crw-r-----   1 root  operator  0x99 Mar 23 14:46 1be79663-1152-11ea-957b-3cecef012038
crw-r-----   1 root  operator  0xae Mar 23 14:51 1be79663-1152-11ea-957b-3cecef012038.eli
crw-r-----   1 root  operator  0x9a Mar 23 14:46 3c6cf6c5-6d4d-11ea-b9b5-000c2977edf4


Code:
root@freenas[~]# sqlite3 /data/freenas-v1.db 'select vol_encryptkey from storage_volume where vol_name = "core";'
2abe7f52-9d10-42e9-b64a-849d2ea619ac


Code:
root@freenas[~]# ls -la /data/geli
total 3
drwxr-xr-x  2 root  www   3 Mar 23 14:44 .
drwxr-xr-x  7 www   www  12 Mar 23 14:51 ..
-rw-r--r--  1 root  www  64 Mar 23 14:44 2abe7f52-9d10-42e9-b64a-849d2ea619ac.key
 
Joined
Oct 18, 2018
Messages
969
However, I have already tried copying a regular key into /data/geli. I took pool_core_encryption.key and renamed the file to the expected ID, 7f0b1fbc-8246-419d-9548-c50f0b3757f5.key.
Should I use the recovery key for that instead? I have also set up a passphrase with the re-key procedure.
When possible, I recommend you use the main key. Even though geli (the underlying encryption software) doesn't differentiate between the two keys, FreeNAS does. If you have to use the recovery key in /data/geli in place of the regular key, I would suggest you rekey and re-enter the passphrase.

In addition, I have tried with a fresh VM instance of FreeNAS. I did not import the configuration file.
I only imported the pool through the web interface which worked perfectly fine. I rebooted FreeNAS. After the reboot the pool was locked again.
Does your pool use a passphrase? If it does, the pool will not unlock automatically at boot.

Code:
crw-r----- 1 root operator 0x89 Mar 23 14:46 196085e2-1152-11ea-957b-3cecef012038
crw-r----- 1 root operator 0xb1 Mar 23 14:51 196085e2-1152-11ea-957b-3cecef012038.eli
crw-r----- 1 root operator 0x8b Mar 23 14:46 1a3b1dad-1152-11ea-957b-3cecef012038
crw-r----- 1 root operator 0xb0 Mar 23 14:51 1a3b1dad-1152-11ea-957b-3cecef012038.eli
crw-r----- 1 root operator 0x97 Mar 23 14:46 1b1ee326-1152-11ea-957b-3cecef012038
crw-r----- 1 root operator 0xaf Mar 23 14:51 1b1ee326-1152-11ea-957b-3cecef012038.eli
crw-r----- 1 root operator 0x99 Mar 23 14:46 1be79663-1152-11ea-957b-3cecef012038
crw-r----- 1 root operator 0xae Mar 23 14:51 1be79663-1152-11ea-957b-3cecef012038.eli
All of the devices above that have the .eli suffix are already unlocked.

In order to simplify debugging, it may help to pick one installation and stick with it. If you cannot get the pool to unlock via the GUI, first identify which disks belong to the pool by looking in the storage_encrypteddisk table, as you did above. Then check whether those disks all appear in /dev/gptid. If they are present but none have the .eli suffix, your disks are not unlocked; in that case, try the manual unlock instructions from my resource to check your key. If they are unlocked, check whether they are visible to FreeNAS with the command zpool status.
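The "present but no .eli suffix" check above can be sketched as a small script. The helper name report_locked and the sample IDs below are hypothetical; in practice you would feed it the output of the sqlite3 query and of ls /dev/gptid:

```shell
#!/bin/sh
# Sketch: given the providers FreeNAS expects (one per line) and the current
# /dev/gptid listing, report any provider without a matching .eli device,
# i.e. any disk that is still locked.
report_locked() {
    # $1 = expected providers, $2 = /dev/gptid listing
    for p in $1; do
        # grep -qxF: quiet, whole-line, fixed-string match
        if ! printf '%s\n' "$2" | grep -qxF "$p.eli"; then
            echo "locked: $p"
        fi
    done
}

# Hypothetical sample data: the first provider is attached, the second is not.
expected="aaaa-1111
bbbb-2222"
listing="aaaa-1111
aaaa-1111.eli
bbbb-2222"
report_locked "$expected" "$listing"   # prints: locked: bbbb-2222
```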
 

involut

Dabbler
Joined
Feb 27, 2020
Messages
23
Thank you very much for your thorough feedback and help. I have figured it out: the problem was indeed the passphrase. I did a re-key without setting a passphrase, unlocked again with the newly created key, rebooted, and voilà.

Thank you very much @PhiloEpisteme
 
Joined
Oct 18, 2018
Messages
969
Glad things are working. Since you did a rekey, be sure to regenerate the recovery key as well, download both the new key and the new recovery key, and keep them safe.
 
Joined
Oct 18, 2018
Messages
969
Hi @involut, I wanted to follow up with something I recently learned. There is a significant bug in 11.3-U1 and 11.3-U2: if you have a locked, encrypted pool and export a different pool, it will wipe your locked, encrypted pool. You can find the bug report here, the fix here, and release notes referring to it in 11.3-U2.1.
 