This replication job had been working for several weeks but broke today. The main pool is DATA and the backup pool is BACKUP. Existing replication jobs replicate from DATA to BACKUP once a week, and these succeed. Other snapshot jobs on the DATA pool itself also succeed. The failing job is set up to replicate from the local BACKUP pool to a "remote" TrueNAS box called MIRROR.
When the job starts it fails immediately. The job logs in /var/log/jobs are below. The folder called out in the error is mounted from the DATA pool (from the .system folder). I don't see anything mounted from BACKUP.
Any thoughts about why this is failing, and how to fix it?
Running TrueNAS Scale 22.02.3 on both boxes.
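For anyone who wants to repeat the mount check I described, here is a minimal sketch of one way to do it on a Linux-based box such as SCALE. It assumes `/proc/mounts` is readable and in the usual Linux format; the function names are my own and are not part of TrueNAS or zettarepl. It only reports what is mounted where, it does not change anything.

```python
import os


def parse_mounts(text):
    """Parse /proc/mounts-style text into (source, mountpoint, fstype) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        source, mountpoint, fstype = fields[:3]
        # /proc/mounts escapes spaces in names as the octal sequence \040
        source = source.replace("\\040", " ")
        mountpoint = mountpoint.replace("\\040", " ")
        entries.append((source, mountpoint, fstype))
    return entries


def zfs_mounts_for_pool(entries, pool):
    """Return (dataset, mountpoint) pairs for ZFS datasets belonging to one pool."""
    prefix = pool + "/"
    return [
        (source, mountpoint)
        for source, mountpoint, fstype in entries
        if fstype == "zfs" and (source == pool or source.startswith(prefix))
    ]


def dataset_at(entries, path):
    """Return the source mounted at exactly this path, or None if nothing is."""
    matches = [source for source, mountpoint, _ in entries if mountpoint == path]
    # If several mounts are stacked on the same path, the last one listed is on top.
    return matches[-1] if matches else None


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as fh:
        mounts = parse_mounts(fh.read())
    # Path taken from the error message in the job log.
    busy_path = "/var/db/system/syslog-cd93307f360c4818ad53abf4dac4059c"
    print("mounted at busy path:", dataset_at(mounts, busy_path))
    print("mounted from BACKUP:", zfs_mounts_for_pool(mounts, "BACKUP"))
```

Note that the error comes back over SSH, so it may be worth running the same check on MIRROR as well as on the sending box.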
[2023/01/14 16:47:47] INFO [Thread-435] [zettarepl.paramiko.replication_task__task_3] Connected (version 2.0, client OpenSSH_8.4p1)
[2023/01/14 16:47:47] INFO [Thread-435] [zettarepl.paramiko.replication_task__task_3] Authentication (publickey) successful!
[2023/01/14 16:47:49] INFO [replication_task__task_3] [zettarepl.replication.pre_retention] Pre-retention destroying snapshots: []
[2023/01/14 16:47:50] INFO [replication_task__task_3] [zettarepl.replication.run] For replication task 'task_3': doing push from 'BACKUP' to 'MIRROR' of snapshot='auto-week-2023-01-13_02-30' incremental_base=None include_intermediate=False receive_resume_token=None encryption=False
[2023/01/14 16:47:50] ERROR [replication_task__task_3] [zettarepl.replication.run] For task 'task_3' unhandled replication error ExecException(1, "Warning: Permanently added the ECDSA host key for IP address '192.168.xx.yy' to the list of known hosts.\ncannot unmount '/var/db/system/syslog-cd93307f360c4818ad53abf4dac4059c': pool or dataset is busy\n")
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/zettarepl/replication/run.py", line 181, in run_replication_tasks
    retry_contains_partially_complete_state(
  File "/usr/lib/python3/dist-packages/zettarepl/replication/partially_complete_state.py", line 16, in retry_contains_partially_complete_state
    return func()
  File "/usr/lib/python3/dist-packages/zettarepl/replication/run.py", line 182, in <lambda>
    lambda: run_replication_task_part(replication_task, source_dataset, src_context, dst_context,
  File "/usr/lib/python3/dist-packages/zettarepl/replication/run.py", line 278, in run_replication_task_part
    run_replication_steps(step_templates, observer)
  File "/usr/lib/python3/dist-packages/zettarepl/replication/run.py", line 611, in run_replication_steps
    replicate_snapshots(step_template, incremental_base, snapshots, include_intermediate, encryption, observer)
  File "/usr/lib/python3/dist-packages/zettarepl/replication/run.py", line 652, in replicate_snapshots
    run_replication_step(step, observer)
  File "/usr/lib/python3/dist-packages/zettarepl/replication/run.py", line 732, in run_replication_step
    ReplicationProcessRunner(process, monitor).run()
  File "/usr/lib/python3/dist-packages/zettarepl/replication/process_runner.py", line 33, in run
    raise self.process_exception
  File "/usr/lib/python3/dist-packages/zettarepl/replication/process_runner.py", line 37, in _wait_process
    self.replication_process.wait()
  File "/usr/lib/python3/dist-packages/zettarepl/transport/ssh.py", line 154, in wait
    stdout = self.async_exec.wait()
  File "/usr/lib/python3/dist-packages/zettarepl/transport/async_exec_tee.py", line 104, in wait
    raise ExecException(exit_event.returncode, self.output)
zettarepl.transport.interface.ExecException: Warning: Permanently added the ECDSA host key for IP address '192.168.xx.yy' to the list of known hosts.
cannot unmount '/var/db/system/syslog-cd93307f360c4818ad53abf4dac4059c': pool or dataset is busy