Replication error since upgrade to TrueNAS-12.0-U8

Hx Jai

Dabbler
Joined
Dec 4, 2016
Messages
29
Ever since I upgraded, my replication task has been failing with the error below. Oddly, when I run it manually, it succeeds with the message "No snapshots to send for replication task 'task_1' on dataset 'xyz'".

Any advice on how to debug this?

Error message on task...

[2022/04/05 01:00:00] INFO [replication_task__task_1] [zettarepl.replication.run] For replication task 'task_1': doing push from 'xyz' to 'usbsea45t/lib' of snapshot='auto-2022-04-05_01-00' incremental_base='auto-2022-04-04_01-00' receive_resume_token=None encryption=False
[2022/04/05 01:00:02] ERROR [replication_task__task_1] [zettarepl.replication.run] For task 'task_1' unhandled replication error ExecException(1, '')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 164, in run_replication_tasks
    retry_stuck_replication(
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/stuck.py", line 18, in retry_stuck_replication
    return func()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 165, in <lambda>
    lambda: run_replication_task_part(replication_task, source_dataset, src_context, dst_context,
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 258, in run_replication_task_part
    run_replication_steps(step_templates, observer)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 592, in run_replication_steps
    replicate_snapshots(step_template, incremental_base, snapshots, encryption, observer)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 687, in replicate_snapshots
    run_replication_step(step, observer)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/run.py", line 764, in run_replication_step
    ReplicationProcessRunner(process, monitor).run()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/process_runner.py", line 33, in run
    raise self.process_exception
  File "/usr/local/lib/python3.9/site-packages/zettarepl/replication/process_runner.py", line 37, in _wait_process
    self.replication_process.wait()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/local.py", line 164, in wait
    self.async_exec.wait()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/async_exec_tee.py", line 103, in wait
    raise ExecException(exit_event.returncode, self.output)
zettarepl.transport.interface.ExecException: Command failed with code 1
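
Since the exception carries an empty output string, the INFO line just before it is the most useful clue: it names the source, the target, the snapshot being sent, and the incremental base. One possible way to start debugging, as a rough sketch only, is to pull those names out of the log line and check by hand whether each snapshot exists on each side. The regular expression below is inferred from the single log line quoted above, not from zettarepl's source, and `parse_step` and `snapshot_check_commands` are hypothetical helper names.

```python
import re

# Pattern inferred from the one "doing push from ..." line quoted in this thread.
# It is an assumption about the log format, not something taken from zettarepl itself.
STEP_RE = re.compile(
    r"doing (?P<direction>push|pull) from '(?P<source>[^']+)' to '(?P<target>[^']+)' "
    r"of snapshot='(?P<snapshot>[^']+)' incremental_base=(?P<base>'[^']+'|None)"
)


def parse_step(line):
    """Return datasets and snapshot names from a step log line, or None if it does not match."""
    m = STEP_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    base = info.pop("base")
    # The log prints a literal None when the send is not incremental.
    info["incremental_base"] = None if base == "None" else base.strip("'")
    return info


def snapshot_check_commands(step):
    """Build read-only `zfs list` commands, one per (dataset, snapshot) pair."""
    names = [step["snapshot"]]
    if step["incremental_base"]:
        names.append(step["incremental_base"])
    return [
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name", f"{dataset}@{name}"]
        for dataset in (step["source"], step["target"])
        for name in names
    ]


if __name__ == "__main__":
    line = (
        "For replication task 'task_1': doing push from 'xyz' to 'usbsea45t/lib' "
        "of snapshot='auto-2022-04-05_01-00' incremental_base='auto-2022-04-04_01-00' "
        "receive_resume_token=None encryption=False"
    )
    for cmd in snapshot_check_commands(parse_step(line)):
        print(" ".join(cmd))
```

The script only prints the commands; running them is left to the reader. For an incremental send to succeed, the base snapshot generally has to exist on both the source and the target, while the new snapshot is expected to be missing on the target before the run.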
 

Hx Jai

This issue took a few different forms. Updating only the snapshot schedule caused a different error, saying that the incremental snapshots did not exist and that a new full snapshot was not allowed. Then, after recreating the jobs, replication failed with an error saying that the target had its own encryption root.
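
For the encryption-root error, one thing that can be checked before deleting anything is the `encryptionroot` property of the target dataset. The sketch below assumes the usual OpenZFS behaviour where `zfs get -H -o value encryptionroot <dataset>` prints the name of the dataset's encryption root, or `-` for an unencrypted dataset; `is_own_encryption_root` and `check_dataset` are hypothetical helper names.

```python
import subprocess


def encryption_root_command(dataset):
    """Build a read-only command that prints only the encryptionroot value."""
    return ["zfs", "get", "-H", "-o", "value", "encryptionroot", dataset]


def is_own_encryption_root(dataset, encryptionroot_value):
    """True when the dataset is its own encryption root.

    Assumes the value is the encryption root's dataset name, or "-" if unencrypted.
    """
    return encryptionroot_value.strip() == dataset


def check_dataset(dataset):
    """Run the command on a system with ZFS and report the result."""
    result = subprocess.run(
        encryption_root_command(dataset),
        capture_output=True,
        text=True,
        check=True,
    )
    return is_own_encryption_root(dataset, result.stdout)
```

Calling `check_dataset("usbsea45t/lib")` on the affected system would show whether the target from the log above is its own encryption root. This only inspects the property and does not explain why the replication task objected to it.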

In the end, I just deleted the snapshot jobs, the replication jobs, and the target dataset, then recreated everything, and it worked.

Given that it started failing out of nowhere after the upgrade, I suspect the upgrade has a bug somewhere that affects replication jobs for encrypted local snapshots.
 