Thibaut
Hello, this is somewhat related to this other thread (89487), but I think it deserves a separate discussion since it pinpoints an error occurring in the TrueNAS interface.
I'm using TrueNAS-12.0-U4, but the problem has also occurred on earlier v12 versions; I can't remember whether it happened under v11 as well, but I think it did.
The situation:
A drive fails in a raidz pool and the system issues an alert indicating that it has to be replaced.
The failed hard disk is shown as UNAVAILABLE in the TrueNAS interface:
The failed hard disk is taken offline using the menu, and the action is confirmed in the dialog that pops up:
The failed hard disk is then taken out of the server and physically replaced with a new hard disk, which then appears in the Storage > Disks list:
Finally, the reference to the failed disk is replaced in the pool's disks list:
The replacement process starts:
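For reference, the same workflow can be done entirely from the shell with `zpool offline` and `zpool replace`. Here is a minimal sketch that only builds the equivalent commands (the pool and device names are placeholders, not my actual ones); review them before running anything:

```python
def replacement_commands(pool, failed_dev, new_dev):
    """Return the zpool commands equivalent to the UI steps above.
    All names are placeholders; run them manually and at your own risk."""
    return [
        ["zpool", "offline", pool, failed_dev],           # UI: Offline action
        # (physically swap the disk at this point)
        ["zpool", "replace", pool, failed_dev, new_dev],  # UI: Replace action
        ["zpool", "status", pool],                        # watch the resilver
    ]

for cmd in replacement_commands("mypool", "gptid/failed-disk", "da8"):
    print(" ".join(cmd))
```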
The problem:
After a moment, an error is displayed:
The full content of the error report is:
Code:
Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 277, in replace
    target.replace(newvdev)
  File "libzfs.pyx", line 391, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 277, in replace
    target.replace(newvdev)
  File "libzfs.pyx", line 2060, in libzfs.ZFSVdev.replace
libzfs.ZFSException: no such pool or dataset

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/process.py", line 243, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 94, in main_worker
    res = MIDDLEWARE._run(*call_args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 45, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 977, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 279, in replace
    raise CallError(str(e), e.code)
middlewared.service_exception.CallError: [EZFS_NOENT] no such pool or dataset
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 367, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 403, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 973, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool_/replace_disk.py", line 122, in replace
    raise e
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool_/replace_disk.py", line 102, in replace
    await self.middleware.call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1241, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1206, in _call
    return await self._call_worker(name, *prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1212, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1139, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1113, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [EZFS_NOENT] no such pool or dataset
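As an aside, the layered trace above is just Python's standard exception chaining: the middleware worker catches the libzfs.ZFSException and re-raises it as a CallError, which is why the same "no such pool or dataset" message shows up repeatedly. A minimal sketch of that pattern (these classes are illustrative stand-ins, not the actual middlewared code):

```python
class ZFSException(Exception):
    """Illustrative stand-in for libzfs.ZFSException."""

class CallError(Exception):
    """Illustrative stand-in for middlewared.service_exception.CallError."""

def replace_vdev():
    # Simulates libzfs failing to locate the pool/dataset.
    raise ZFSException("no such pool or dataset")

def middleware_replace():
    # The middleware catches the low-level error and re-raises it;
    # "from e" preserves the original as __cause__, producing the
    # "direct cause of the following exception" lines in the trace.
    try:
        replace_vdev()
    except ZFSException as e:
        raise CallError(f"[EZFS_NOENT] {e}") from e

try:
    middleware_replace()
except CallError as err:
    print(err)                            # [EZFS_NOENT] no such pool or dataset
    print(type(err.__cause__).__name__)   # ZFSException
```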
Although at this point it may seem that the whole disk replacement process has failed, the resilvering does in fact get started, as can be seen from the CLI with the
zpool status mypool
command. At that point, however, the TrueNAS web interface does NOT report the resilvering as being in progress.
After a moment, reloading the TrueNAS web page shows that a resilvering is going on:
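Since the web UI lags behind the actual pool state, a quick way to watch the resilver is to poll `zpool status` from a script. A minimal sketch, assuming the "resilver in progress ... NN.N% done" wording that my pools print (the exact format can vary between ZFS versions, so parse defensively):

```python
import re
import subprocess

def resilver_progress(pool, status_text=None):
    """Return the resilver completion percentage, or None if no resilver
    is running. If status_text is None, `zpool status <pool>` is invoked;
    passing canned text lets this run without an actual pool."""
    if status_text is None:
        status_text = subprocess.run(
            ["zpool", "status", pool],
            capture_output=True, text=True, check=True,
        ).stdout
    if "resilver in progress" not in status_text:
        return None
    # zpool prints a progress line such as "..., 45.6% done, ..."
    m = re.search(r"([\d.]+)% done", status_text)
    return float(m.group(1)) if m else None

# Canned sample output so this example runs anywhere:
sample = """  pool: mypool
 state: ONLINE
  scan: resilver in progress since Mon Jun 21 10:00:00 2021
        512G issued at 150M/s, 45.6% done, 0 days 01:02:03 to go
"""
print(resilver_progress("mypool", sample))  # 45.6
```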
Reproducibility:
From what I have experienced so far, this error doesn't occur every time a disk is replaced. Of the three disks replaced so far under v12, two have shown this problem, but one went without a glitch.
I currently have no clue as to what causes the problem, why it happens, or whether it stems from the TrueNAS code itself.
Any idea regarding this would be welcome.
Thank you.