Encrypted Pool Missing in 11.3

raidoh

Dabbler
Joined
Jul 2, 2015
Messages
12
I have a FreeNAS Mini that I've been using since 9.3. When I upgraded from 9.10 to 11.1, it crashed my encrypted pool, so I had to rebuild it from backup (it wouldn't recognize the recovery keys I'd been using for years on 9.10). I didn't lose any data, but the encrypted pool was rebuilt for 11.1. As I recall, when I rebuilt the encrypted pool I could choose either a password or a recovery key, but not both; I chose the key.

When I boot into 11.2-U6, the pool opens fine. It doesn't have a password, so it decrypts automatically from the recovery key. However, if I boot into 11.2-U8 or 11.3-U1, the pool won't decrypt: both environments ask for the recovery key but won't accept it. Then, today, rebooting into either showed the pool as completely missing, while dropping back to 11.2-U6 brought it up perfectly.

This seems like a setup that's already past its expiration date and blocks future upgrades. Any ideas why this might be happening?

Thanks.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Try rekeying and generating a new recovery key on 11.2-U6.
 

raidoh

Thanks for the response and the idea. I rekeyed in 11.2-U6, saved the new encryption key, and added a recovery key. The pool came up in 11.2-U6 successfully; I then booted into 11.3-U1. It didn't unlock automatically, so I tried both keys, but it failed to unlock the pool. I got the following error output:

Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 95, in main_worker
    res = loop.run_until_complete(coro)
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 51, in _run
    return await self._call(name, serviceobj, methodobj, params=args, job=job)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 43, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 43, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 965, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/zfs.py", line 382, in import_pool
    zfs.import_pool(found, found.name, options, any_host=any_host)
  File "libzfs.pyx", line 369, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/zfs.py", line 380, in import_pool
    raise CallError(f'Pool {name_or_guid} not found.', errno.ENOENT)
middlewared.service_exception.CallError: [ENOENT] Pool 2820076106864343640 not found.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/pool.py", line 1660, in unlock
    'cachefile': ZPOOL_CACHE_FILE,
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1127, in call
    app=app, pipes=pipes, job_on_progress_cb=job_on_progress_cb, io_thread=True,
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1074, in _call
    return await self._call_worker(name, *args)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1094, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1029, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1003, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [ENOENT] Pool 2820076106864343640 not found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 349, in run
    await self.future
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 386, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 961, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/pool.py", line 1672, in unlock
    raise CallError(msg)
middlewared.service_exception.CallError: [EFAULT] Pool could not be imported: 5 devices failed to decrypt.
 

Samuel Tai

On 11.2-U6, have you upgraded to the latest ZFS feature flags after unlocking?
 

Samuel Tai

Also, on 11.2-U6, please run zdb -eC <your pool name> | grep pool_guid. Does that GUID match the ID in the traceback above, 2820076106864343640? I wonder if 11.3 is trying to unlock the pool as it existed before your rebuild.
 

raidoh

In 11.2-U6, the pool does appear to be at the latest ZFS feature flags, since the Upgrade Pool alert isn't showing. And no, the GUID does not match:
zdb -eC FileCabinet | grep pool_guid
    pool_guid: 18412830247627684552
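For anyone else checking, the comparison boils down to pulling pool_guid out of the zdb output and comparing it against the GUID in the traceback. A minimal sketch (the helper name and the trimmed sample string are mine, not from FreeNAS):

```python
import re

def pool_guid_from_zdb(zdb_output: str) -> str:
    """Extract the pool_guid value from `zdb -eC <pool>` output."""
    match = re.search(r"pool_guid:\s*(\d+)", zdb_output)
    if not match:
        raise ValueError("no pool_guid found in zdb output")
    return match.group(1)

# Relevant line of the zdb output pasted above.
sample = "    pool_guid: 18412830247627684552\n"

on_disk = pool_guid_from_zdb(sample)
in_traceback = "2820076106864343640"  # GUID from the 11.3 error

print(on_disk == in_traceback)  # False: 11.3 is asking for a different pool
```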
 

Samuel Tai

OK, that explains the mystery. Unfortunately, the only way I see for you to upgrade is to:
  1. Pull your existing boot drive and keep it as your known-good boot volume.
  2. Clean-install 11.3-U3.2 onto a new boot volume, then import your pool using the recovery key you recently created.
  3. Try importing your 11.2 config. After this step you'll need to rekey and save recovery keys again, as the ones in the config will be out of sync with the current state of the pool.

Edited: On second thought, since the config refers to pool 2820076106864343640, don't import your config. Take screenshots instead and recreate your config from scratch, to remove references to the missing pool.
 

Samuel Tai

Also, note this caveat.

 

raidoh

Thanks for the help. The pool is GELI-encrypted, but there's no passphrase, so I think that caveat doesn't apply. So is this a corrupted FreeNAS boot volume?
 

Samuel Tai

More like a database entry that never got updated for your rebuilt pool.
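To spell that out: the middleware unlocks the pool by the GUID recorded in its config database, so a stored GUID that still points at the destroyed pre-rebuild pool fails even though the pool on disk is healthy. A toy sketch of that failure mode, using an in-memory SQLite database (the table and column names are invented for illustration and are not FreeNAS's actual schema):

```python
import sqlite3

# Pools actually importable on disk, keyed by GUID (values from this thread).
pools_on_disk = {"18412830247627684552": "FileCabinet"}

# Simulated config database with a stale entry: it still records the GUID
# of the pool that was destroyed during the rebuild.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE volume (name TEXT, guid TEXT)")
db.execute("INSERT INTO volume VALUES (?, ?)", ("FileCabinet", "2820076106864343640"))

def unlock(name: str) -> str:
    """Look up the pool by the GUID stored in the config, as the middleware does."""
    (guid,) = db.execute("SELECT guid FROM volume WHERE name = ?", (name,)).fetchone()
    if guid not in pools_on_disk:
        # Mirrors the ENOENT CallError in the traceback above.
        raise LookupError(f"Pool {guid} not found.")
    return pools_on_disk[guid]

try:
    unlock("FileCabinet")
except LookupError as err:
    print(err)  # Pool 2820076106864343640 not found.
```

Rekeying on 11.2-U6 can't fix this, because the keys are fine; it's the GUID reference that's stale, which is why a clean config is needed.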
 

raidoh

While in 11.2-U6, I was able to delete the old 11.2-U8 and 11.3-U1 boot environments and re-run the updates to 11.2-U8 and then 11.3-U3.2. FreeNAS is now working properly on 11.3-U3.2 and recognizing the pool. Thanks for the help.
 