Unable to scan for ssh key for replication task.

Joined
Oct 18, 2018
Messages
969
Hi folks,

The relevant systems are as follows:

Primary
FreeNAS: 11.2-U6
Pool 1: crypt
Pool 2: vault
Board: X11SSM-F
WebGUI IP: 192.168.77.250 via onboard NIC
WebGUI Port: 1970
NIC: Chelsio T520-CR direct-connected to backup on 192.168.77.250/24

Backup
FreeNAS: 11.2-U6
Pool 1: vault1
Board: X9SCM-F
WebGUI IP: 192.168.77.150 via onboard NIC
WebGUI Port: 4309
NIC: Chelsio T520-CR direct-connected to primary on 192.168.77.150/24
SSH IP: 192.168.14.150
SSH Port: 31892


Very recently I had to replace every drive in the vault pool. A strange work requirement meant that resilvering each drive one by one was not an option, so I performed the following steps instead.
  1. created the new pool tmpVault
  2. created a manual recursive snapshot and performed replication between the pools from the command line to copy the data over
  3. saved a copy of the system config
  4. exported both pools through the GUI and destroyed configs for the pools but not the data
  5. imported them manually and changed the names via zpool import tmpVault vault; zpool import vault tmpVault
  6. exported them again via CLI
  7. imported both from the GUI
  8. restored my system config
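The command-line parts of the steps above (2, 5 and 6) can be sketched roughly as follows. This is a sketch under assumptions, not the exact commands that were run: the snapshot name migrate and the function names are hypothetical, and the zfs send/recv flags are just one common choice for a full pool copy. Only the two zpool import renames are taken directly from step 5.

```shell
#!/bin/sh
# Sketch only: these commands overwrite and rename whole pools.

SNAP="migrate"   # hypothetical snapshot name, not from the post

# Step 2: recursive snapshot of vault, then a full local copy into tmpVault.
# The -R and -F flags are an assumption; the post does not state which were used.
copy_pool() {
    zfs snapshot -r "vault@${SNAP}" &&
    zfs send -R "vault@${SNAP}" | zfs recv -F tmpVault
}

# Step 5: with both pools exported, import each one under the other's name.
swap_pool_names() {
    zpool import tmpVault vault &&
    zpool import vault tmpVault
}

# Step 6: export both pools again so they can be imported from the GUI.
export_pools() {
    zpool export vault &&
    zpool export tmpVault
}
```

Nothing runs until the functions are called: copy_pool first, then swap_pool_names and export_pools once both pools have been exported through the GUI (step 4).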
At this point all of my data was safe, but my replication and snapshot tasks for vault were no longer there. No problem; I recreated the automated snapshot task for vault and then tried to set up the replication task.

  • Pool: vault
  • Remote ZFS Pool: vault1
  • Recursively Replicate Child Dataset Snapshots: checked
  • Delete Stale Snapshots on Remote System: checked
  • Replication Stream Compression: lz4
  • Limit: 0
  • Begin Time: 00:00
  • End Time: 23:59
  • Enabled: checked
  • Setup-Mode: semi-automatic
  • Remote Hostname: 192.168.14.150
  • Remote HTTP Port: 4309
  • Remote HTTPS: not checked
  • Remote Auth Token: Copied from other machine
  • Encryption cipher: standard
  • Dedicated User Enabled: not checked
  • Dedicated User: empty

When I click "Scan SSH Key" I get the following error:

Code:
Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 165, in call_method
    result = await self.middleware.call_method(self, message)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1096, in call_method
    return await self._call(message['method'], serviceobj, methodobj, params, app=app, io_thread=False)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1044, in _call
    return await methodobj(*args)
  File "/usr/local/lib/python3.6/site-packages/middlewared/schema.py", line 664, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/replication.py", line 307, in ssh_keyscan
    raise CallError(errmsg)
middlewared.service_exception.CallError: [EFAULT] ssh key scan failed for unknown reason


I looked at line 307 of "/usr/local/lib/python3.6/site-packages/middlewared/plugins/replication.py" to see what it runs, and then tried to run the command manually: $ /usr/bin/ssh-keyscan -p 4309 -T 2 192.168.14.150. The first time I got an error. Unfortunately I cannot reproduce that error, so I cannot post it, but it was something related to an unknown host. If I run the command now I get no error, but no response either.
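One thing that may be worth comparing: the manual scan above targets port 4309, which is listed as the backup's WebGUI port, while the ssh commands that work use port 31892. The helper below is a minimal sketch for running the same scan against both ports and seeing which one returns a host key. The function name scan_port and the KEYSCAN override variable are hypothetical. Whether the port is what makes the GUI scan fail is an assumption that nothing in this thread confirms.

```shell
#!/bin/sh
# Hypothetical helper: report whether ssh-keyscan gets a host key on a port.
# KEYSCAN may be overridden; it defaults to the binary path used above.
scan_port() {
    host=$1
    port=$2
    # Keys are printed on stdout; banner and error text goes to stderr.
    keys=$("${KEYSCAN:-/usr/bin/ssh-keyscan}" -p "$port" -T 2 "$host" 2>/dev/null)
    if [ -n "$keys" ]; then
        echo "port $port: host key(s) returned"
    else
        echo "port $port: no SSH host key returned"
    fi
}

# Usage, with the address and ports listed in this post:
#   scan_port 192.168.14.150 4309     # WebGUI port, as in the manual scan
#   scan_port 192.168.14.150 31892    # port used by the working ssh commands
```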

I have tried to SSH to the other machine, and it works whether I use the onboard NIC or the Chelsio NIC: ssh -i /data/ssh/replication -p 31892 192.168.77.150 and ssh -i /data/ssh/replication -p 31892 192.168.14.150 both work. I tried the .14.150 address first and had to accept the new key. The .77.150 address then failed because the key had changed, so I deleted the key for 192.168.77.150 in /etc/.ssh/known_hosts, tried again, and it worked.
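As an alternative to editing the file by hand, a stale entry like the one for 192.168.77.150 can usually be dropped with ssh-keygen -R. The sketch below assumes the entry was recorded in the bracketed [host]:port form that OpenSSH uses for non-default ports. The function name forget_host and the KEYGEN override variable are hypothetical, and the known_hosts path is passed in rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper: remove one host's entry from a known_hosts file.
forget_host() {
    host=$1
    port=$2
    file=$3
    # Port 22 entries are stored as a bare host name, others as [host]:port.
    if [ "$port" = "22" ]; then
        entry=$host
    else
        entry="[$host]:$port"
    fi
    "${KEYGEN:-ssh-keygen}" -R "$entry" -f "$file"
}

# Usage, with the address and SSH port from this post and whichever
# known_hosts file holds the stale key:
#   forget_host 192.168.77.150 31892 /path/to/known_hosts
```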

So, I'm a bit stumped; I'm not sure what else to try.

Oddly enough, the replication task for crypt seems to still be working and running without any changes required.
 

dlavigne

Guest
Were you able to resolve this? If not, please create a report at bugs.ixsystems.com and post the issue number here.
 

wongdongfu

Cadet
Joined
Mar 13, 2017
Messages
4
Howdy,

I get the exact same error when I try to set up replication between two 11.2-U6 boxes.

Does anyone have a solution for this?

Thanks in advance
 