Can't join Active Directory, Failed to validate bind credentials: [EFAULT] timed out

dstewart51

Dabbler
Joined
Apr 2, 2019
Messages
12
I updated the activedirectory.py to the new version in post #39 by anodos on our problem system. Worked like a charm, I was able to join our AD and no issues after reboots.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Changes that have been made in 11.3-stable:
1) password validation is now carried out via a simple `kinit`. This fixes issues with potentially high latency links (or slow responses by DC).
2) the actual domain join is converted into a middleware job. Once the password is verified, the actual domain join is backgrounded. Progress can be watched by clicking on the 'tasks' button in the GUI. This avoids middleware timeouts.
3) if password auth and kerberos keytabs are selected, we will preferentially use the keytab (and clear out the password). This means no more error messages about both being present.
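Change 3 amounts to a small normalization step before the config is saved. Here is a rough sketch of the idea only, not the actual middleware code; the function name and the `bindpw` / `kerberos_principal` field names are assumptions made for illustration.

```python
def prefer_keytab(config):
    """Return a copy of the config with the password cleared when a keytab principal is set.

    Hypothetical helper: the field names are assumed, not taken from the plugin.
    """
    cleaned = dict(config)
    if cleaned.get("kerberos_principal") and cleaned.get("bindpw"):
        # Both credentials were supplied: keep the keytab and drop the password.
        cleaned["bindpw"] = ""
    return cleaned
```

With both fields filled in, the returned config keeps the principal and has an empty password; with only a password, nothing changes.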
 

g847

Cadet
Joined
Mar 5, 2020
Messages
5
Related to this: instead of the [EFAULT] timed out error, I now get the following error when I click Enable and Save: Failed to validate bind credentials: [EFAULT] kinit for domain [****.****] with password failed: kinit: Password incorrect

I can ping the Windows server from FreeNAS and the other way around, and FreeNAS does reach AD on the Windows server, because it tells me when I enter a user that doesn't exist.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
If kinit is failing with "password incorrect" then the password is probably incorrect. You should run the command "midclt call activedirectory.config" and verify that your settings there are correct. There's a button on the top-right of the screen that shows the health of the directory services. If the AD service shows a status of "FAULTED" then we're failing health checks.
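If you'd rather not eyeball the output, here is a minimal sketch for checking it, assuming `midclt call activedirectory.config` prints a single JSON object. The key names `domainname`, `bindname`, `bindpw` and `kerberos_principal` are assumptions and may differ between releases; the function name is made up for this example.

```python
import json


def find_config_problems(raw_json):
    """Return a list of human-readable problems found in the AD config JSON."""
    config = json.loads(raw_json)
    problems = []
    # Key names below are assumed; adjust them to what your release prints.
    if not config.get("domainname"):
        problems.append("domainname is empty")
    has_password = bool(config.get("bindname")) and bool(config.get("bindpw"))
    has_keytab = bool(config.get("kerberos_principal"))
    if not has_password and not has_keytab:
        problems.append("no bind account with password and no kerberos principal")
    return problems
```

Save the midclt output to a file, read it back, and pass the text to `find_config_problems`. An empty list only means the fields are filled in, not that the password itself is right.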
 

g847

Cadet
Joined
Mar 5, 2020
Messages
5
I had somehow reset the password of the user, so never mind that. It now gives me the famous [EFAULT] timed out error again when I try enabling.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Is this with the latest version of activedirectory.py from the git repo I posted earlier or is this with the default in 11.3-U1?
 

g847

Cadet
Joined
Mar 5, 2020
Messages
5
Even with the latest version it gives me the Timed out error.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Did you restart middlewared? The socket call only logs an error message on failure. If you are seeing an exception, then it sounds like you're not using the new code. If the issue persists after applying the version from the 11.3-stable branch and restarting middlewared, PM me the contents of /var/log/middlewared.log.
 

g847

Cadet
Joined
Mar 5, 2020
Messages
5
Yes, I did. I even restarted the server. I'm about to install 11.2-U7 just to see if it works, like some have said.
 

g847

Cadet
Joined
Mar 5, 2020
Messages
5
I made it work in 11.2-U7! Something is definitely off in 11.3, because I followed the same steps but made it work in the older version.
 

Julien.guay

Cadet
Joined
Mar 17, 2020
Messages
9
I can confirm this issue is also present on the FreeNAS 11.3 release (legacy). I've been fighting with my configuration for a good day now.
I could try to get a pcap with Wireshark to find out why it doesn't work. We have a large Active Directory with thousands of users. I tried creating the computer object first and then without it, and tried multiple users across different domains in our forest. Nothing works, and access is not restricted anywhere on the network. Getting a pcap might take some time, as the FreeNAS box is connected over fiber and the only way I could get a decent capture is with a network tap on the trunk link. (I do not have direct access to the DC, only to Active Directory.)
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
I've put in quite a few fixes for U2 (which is a couple of weeks out). You can partially test for yourself using instructions earlier in this thread about how to replace the activedirectory plugin with the one from our 11.3-stable repository. There were also some python-ldap library fixes that you will have to wait until U2 for.
 

Julien.guay

Cadet
Joined
Mar 17, 2020
Messages
9
Oh thanks, much appreciated. We are trying to put this NAS into a production environment. By "a couple of weeks", how long are we talking about? 3-5? More?
 

uri

Dabbler
Joined
Jul 27, 2012
Messages
20
I found the problem in my situation!

I have a lagg interface in loadbalance mode, and one of the network adapters kept going down. That caused problems for the whole lagg interface and for the connection to AD!

Now I've replaced the card and everything works like a charm!

Maybe this will be helpful for someone!
 

jeremyrea

Cadet
Joined
Mar 30, 2020
Messages
1
I was getting an error: "[EFAULT] active directory update: Failed to validate domain configuration: No response received from domain controller."
I had just upgraded to 11.3-U1. I tried to replace activedirectory.py and restart middleware but still no success. I had to revert to the 11.2-U8 Environment. It connects just fine now to active directory.
 

Gremlin

Dabbler
Joined
Jun 30, 2018
Messages
10
Has anyone tried to domain join after updating from U1 to U2? I've updated a fresh build of 11.3-U1 to U2 (the join failed under U1 and still fails under U2).
Tried the additional steps here: https://www.reddit.com/r/freenas/comments/bgji05/changes_to_ad_directory_service_in_freenas_113/ (Primarily steps 1-3).

I have another build, that's older, first built in 11.1 or 2 I think, domain joined, has no issues.
Before the error log: I've set up the network global config with only name server 1 set, pointing to the one DC by IP (same subnet), and FreeNAS can ping it from the shell by both name and IP.

Main error is (dc name replaced): [EFAULT] activedirectory_update: Failed to validate domain configuration: No response received from dc.domain.local

Log that comes up:
Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/ntplib.py", line 311, in request
    response_packet, src_addr = s.recvfrom(256)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/activedirectory.py", line 839, in do_update
    await self.middleware.run_in_thread(self.validate_domain, new)
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/io_thread_pool_executor.py", line 25, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/activedirectory.py", line 1132, in validate_domain
    self.middleware.call_sync('activedirectory.check_clockskew', data)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1147, in call_sync
    io_thread=True, job_on_progress_cb=job_on_progress_cb,
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1170, in run_coroutine
    return fut.result()
  File "/usr/local/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/usr/local/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1095, in _call
    return await run_method(methodobj, *args)
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/io_thread_pool_executor.py", line 25, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/activedirectory.py", line 1120, in check_clockskew
    response = c.request(pdc[0]['host'])
  File "/usr/local/lib/python3.7/site-packages/ntplib.py", line 316, in request
    raise NTPException("No response received from %s." % host)
ntplib.NTPException: No response received from dc.domain.local.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 130, in call_method
    io_thread=False)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1081, in _call
    return await methodobj(*args)
  File "/usr/local/lib/python3.7/site-packages/middlewared/service.py", line 303, in update
    f'{self._config.namespace}.update', self, self.do_update, [data]
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1081, in _call
    return await methodobj(*args)
  File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 961, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/activedirectory.py", line 843, in do_update
    f"Failed to validate domain configuration: {e}"
middlewared.service_exception.ValidationError: [EFAULT] activedirectory_update: Failed to validate domain configuration: No response received from dc.domain.local.
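
Reading the traceback, the failure is in check_clockskew: an NTP request is sent to the DC and no reply comes back before the socket times out. To test that by hand without the middleware, a minimal stdlib-only sketch of an SNTP query might look like the following. The function names are made up for this example, and it is not the code the plugin uses (the plugin goes through ntplib).

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """Build a 48-byte SNTP client request (LI=0, version 3, mode 3)."""
    return b"\x1b" + 47 * b"\0"


def parse_transmit_time(packet):
    """Extract the server transmit timestamp from a reply, as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    # The transmit timestamp is the last 8 bytes of the 48-byte header.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query_ntp(host, timeout=5.0):
    """Send one request to UDP port 123; raises socket.timeout if nothing answers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_request(), (host, 123))
        packet, _ = s.recvfrom(256)
    return parse_transmit_time(packet)
```

Calling `query_ntp("dc.domain.local")` and comparing the result with `time.time()` gives a rough idea of the clock skew. If it raises `socket.timeout`, the DC is not answering NTP on UDP 123 from this host, which matches the error above.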
 

Gremlin

Dabbler
Joined
Jun 30, 2018
Messages
10
Gah, can't see a way to edit my previous post. Meant to add, I've increased the AD and DNS timeout in the directory services page from 60 and 10 to 120 and 20 respectively.
 

Gremlin

Dabbler
Joined
Jun 30, 2018
Messages
10
Is this a Windows AD domain or a Samba one?
Windows Server 2016 DC, a fresh install on that machine, not migrated from an older OS etc. The dc.domain.local in the log above is the correct name of the server. The other FreeNAS (when on an older release) was joined to it successfully in the past. Neither FreeNAS name existed on any previous machine (so no old records etc.).
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Did you disable NTP on your DC?
 