Can't get NFS v4 + Kerberos to work

Status
Not open for further replies.

Ling Ho

Dabbler
Joined
Nov 13, 2014
Messages
11
I have been trying to mount a shared dataset, with NFS v4 turned on, and sec set to krb5, krb5i or krb5p, without sys.

I have tried mounting from a RHEL6 client, and also from another FreeNAS 9.4 box.

From the FreeNAS client box, I can mount using NFSv3 but can't list any directory (I get "Permission denied"). Using NFSv4, I get "nfsv4 err=10016" followed by "mount_nfs: ... : Input/output error".

However, my main concern is mounting from RHEL6, because the majority of my hosts are running RHEL 5 or RHEL6.

Does anyone have a working example?

Using tcpdump and wireshark, I see the following:
(My configurations are explained after these 2 packet traces)

Frame 23: 718 bytes on wire (5744 bits), 718 bytes captured (5744 bits)
Ethernet II, Src: Qumranet_15:21:02 (00:1a:4a:15:21:02), Dst: All-HSRP-routers_3d (00:00:0c:07:ac:3d)
Internet Protocol Version 4, Src: 172.21.49.85 (172.21.49.85), Dst: 172.21.32.81 (172.21.32.81)
Transmission Control Protocol, Src Port: 45012 (45012), Dst Port: nfs (2049), Seq: 1, Ack: 1, Len: 652
Remote Procedure Call, Type:Call XID:0xda09dc45
Fragment header: Last fragment, 648 bytes
1... .... .... .... .... .... .... .... = Last Fragment: Yes
.000 0000 0000 0000 0000 0010 1000 1000 = Fragment Length: 648
XID: 0xda09dc45 (3658079301)
Message Type: Call (0)
RPC Version: 2
Program: NFS (100003)
Program Version: 4
Procedure: NULL (0)
[The reply to this request is in frame 25]
Credentials
Flavor: RPCSEC_GSS (6)
Length: 20
GSS Version: 1
GSS Procedure: RPCSEC_GSS_INIT (1)
GSS Sequence Number: 0
GSS Service: rpcsec_gss_svc_none (1)
GSS Context
GSS Context Length: 0
GSS Context: <MISSING>
Verifier
Flavor: AUTH_NULL (0)
Length: 0
Network File System
[Program Version: 4]
[V4 Procedure: NULL (0)]
GSS Token: 000002456082024106092a864886f71201020201006e8202...
GSS Token Length: 581
GSS-API Generic Security Service Application Program Interface
OID: 1.2.840.113554.1.2.2 (KRB5 - Kerberos 5)
krb5_blob: 01006e8202303082022ca003020105a10302010ea2070305...
krb5_tok_id: KRB5_AP_REQ (0x0001)
Kerberos AP-REQ
Pvno: 5
MSG Type: AP-REQ (14)
Padding: 0
APOptions: 20000000 (Mutual required)
0... .... .... .... .... .... .... .... = reserved: RESERVED bit off
.0.. .... .... .... .... .... .... .... = Use Session Key: Do NOT use the session key to encrypt the ticket
..1. .... .... .... .... .... .... .... = Mutual required: MUTUAL authentication is REQUIRED
Ticket
Tkt-vno: 5
Realm: PCDSN
Server Name (Service and Host): nfs/psnfs1.pcdsn
Name-type: Service and Host (3)
Name: nfs
Name: psnfs1.pcdsn
enc-part des-cbc-crc
Encryption type: des-cbc-crc (1)
Kvno: 1
enc-part: 04083acff1a30163a376ad8a3ea190543e13274036ccfc6d...
Authenticator des-cbc-crc
Encryption type: des-cbc-crc (1)
Authenticator data: 3c0adb6fabe80574ed3a5db4601be9309c05dc05e5d0a50e...

Frame 25: 90 bytes on wire (720 bits), 90 bytes captured (720 bits)
Ethernet II, Src: Cisco_9b:27:00 (00:1d:71:9b:27:00), Dst: Qumranet_15:21:02 (00:1a:4a:15:21:02)
Internet Protocol Version 4, Src: 172.21.32.81 (172.21.32.81), Dst: 172.21.49.85 (172.21.49.85)
Transmission Control Protocol, Src Port: nfs (2049), Dst Port: 45012 (45012), Seq: 1, Ack: 653, Len: 24
Remote Procedure Call, Type:Reply XID:0xda09dc45
Fragment header: Last fragment, 20 bytes
1... .... .... .... .... .... .... .... = Last Fragment: Yes
.000 0000 0000 0000 0000 0000 0001 0100 = Fragment Length: 20
XID: 0xda09dc45 (3658079301)
Message Type: Reply (1)
[Program: NFS (100003)]
[Program Version: 4]
[Procedure: NULL (0)]
Reply State: denied (1)
[This is a reply to a request in frame 23]
[Time from request: 0.000287000 seconds]
Reject State: AUTH_ERROR (1)
Auth State: client must begin new session (2)
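
For readers following the two traces above, here is a minimal Python sketch that decodes the numeric fields Wireshark is showing. The auth_stat names are taken from the ONC RPC specification (RFC 5531). The function names are made up for illustration, and the hex values 0x80000288 and 0x80000014 are reconstructed from the fragment-header bit patterns in frames 23 and 25, not copied from a raw capture.

```python
import struct

# auth_stat values from RFC 5531 (subset relevant to this trace).
AUTH_STAT = {
    0: "AUTH_OK",
    1: "AUTH_BADCRED",
    2: "AUTH_REJECTEDCRED",  # Wireshark text: "client must begin new session"
    3: "AUTH_BADVERF",
    4: "AUTH_REJECTEDVERF",
    5: "AUTH_TOOWEAK",
    13: "RPCSEC_GSS_CREDPROBLEM",
    14: "RPCSEC_GSS_CTXPROBLEM",
}


def decode_record_marker(raw):
    """Split the 4-byte RPC-over-TCP record marker into (last_fragment, length)."""
    (word,) = struct.unpack(">I", raw)
    return bool(word & 0x80000000), word & 0x7FFFFFFF


def auth_stat_name(code):
    """Map a numeric auth_stat to its RFC 5531 name (hypothetical helper)."""
    return AUTH_STAT.get(code, "unknown auth_stat %d" % code)


if __name__ == "__main__":
    print(decode_record_marker(bytes.fromhex("80000288")))  # frame 23 (call)
    print(decode_record_marker(bytes.fromhex("80000014")))  # frame 25 (reply)
    print(auth_stat_name(2))                                # reply's Auth State
```

Read this way, the server rejects the credential on the RPCSEC_GSS_INIT call itself (NULL procedure), so the failure appears to happen during GSS context setup, before any NFSv4 operation is attempted.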

My /etc/exports file:
[root@psnfs1] /etc/rc.d# cat /etc/exports
V4: /
/mnt/datapool/ling -alldirs -sec=krb5 -network 172.21.0.0/16
/mnt/datapool/ling -alldirs -sec=krb5 127.0.0.1

Server:
[root@psnfs1] /etc/rc.d# host psnfs1
psnfs1.pcdsn has address 172.21.32.81

Client:
[root@psnfs1] /etc/rc.d# host psana107
psana107.pcdsn has address 172.21.49.85

My domain (private) is pcdsn.
Kerberos REALM is PCDSN

I have set up host and ftp Kerberos principals in the keytab files for both machines, using des-cbc-crc only. I have tried other encryption types, but it made no difference.

I have rpc.gssd and rpc.idmapd running on the RHEL6 client.

I can get a lot of logs on my RHEL6 client, but unfortunately I am not sure how to get more logs on the FreeNAS box. I have set syslog-ng to send everything to /var/log/all.log, but there is not much to see. I also tried running gssd on the FreeNAS box with -d -d -d, and nothing was shown either. If I try to run truss on the "nfsd: server" process, it dies right away. "nfsd: master" doesn't show anything.

Thanks,
 

Ling Ho

Dabbler
Joined
Nov 13, 2014
Messages
11
No, I still could not figure it out. I can't even mount from another FreeNAS box using NFSv4 with Kerberos. Is this feature supposed to work? If not, I will give up.

Thanks,
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Can you kinit from the server and client? You should start by eliminating variables.
 

Ling Ho

Dabbler
Joined
Nov 13, 2014
Messages
11
Yes, I can kinit on both the server and the client. From the client, I am also able to mount a directory from another RHEL 6 server using NFSv4 and krb5.
 

dlavigne

Guest
If it's not working, it's worth creating a bug report at bugs.freenas.org. It may be a software bug or a doc bug. If you do, post the issue number here.
 

ebledsoe

Cadet
Joined
Apr 18, 2014
Messages
6
I have the same problem as the OP except using RHEL7 and/or CentOS7 clients. I didn't look at all the detail of his packet captures but mine look very similar if not exactly the same.

One thing I noted was that on the FreeNAS box console I see "nfsd: can't register svc name" whenever the NFS service is started. A little googling led me here: http://lists.freebsd.org/pipermail/freebsd-arm/2013-September/006514.html

From the code snippet "if (principal[0] != '\0')" I thought maybe the nfs service principal had to be first in /etc/krb5.keytab. I jockeyed that around a bit to make sure it was first but no dice, still not able to mount. I can change to "sec=sys" and the mount works fine. So it does seem to be a kerberos issue.

Oh and I can kinit from any client and from the FreeNAS box.
 

ebledsoe

Cadet
Joined
Apr 18, 2014
Messages
6
So I thought I'd see if I could get kerberos to work with NFSv3, initial attempts did not work. The "nfsd: can't register svc name" kept bugging me. The fix for this is to comment out "127.0.0.1 freenas freenas.example.com" (or whatever your host/domain combination is) at the bottom of /etc/hosts and restart gssd and nfsd, assuming you have working forward and reverse DNS. If you see "nfsd: can't register svc name" on the console something is still wrong. After this change NFSv3 and kerberos is working, but still no v4. And I realize the change to /etc/hosts may break something else... It certainly isn't persistent across reboots.
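
Before editing /etc/hosts by hand, it can help to confirm that the file really maps the machine's own name to a loopback address. The following is a rough Python sketch of that check. The function name and the SAMPLE text are illustrative assumptions: the two 127.0.0.1 lines mirror the entries quoted in this thread, and the ::1 line is only a guess at what a stock file contains.

```python
import ipaddress


def loopback_aliases(hosts_text, names):
    """Return (line_number, line) for entries mapping any of `names` to loopback."""
    wanted = set(names)
    hits = []
    for lineno, line in enumerate(hosts_text.splitlines(), start=1):
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not a plain IP address, skip the line
        if addr.is_loopback and wanted & set(fields[1:]):
            hits.append((lineno, line.strip()))
    return hits


# Hypothetical file contents, for illustration only.
SAMPLE = """\
::1 localhost localhost.my.domain
127.0.0.1 localhost localhost.my.domain
127.0.0.1 freenas freenas.example.com
"""

if __name__ == "__main__":
    for lineno, line in loopback_aliases(SAMPLE, ["freenas", "freenas.example.com"]):
        print("line %d maps the host name to loopback: %s" % (lineno, line))
```

On a real system you would pass the contents of /etc/hosts and the box's short name and FQDN instead of SAMPLE. The localhost lines are deliberately not flagged, since only the host's own name matters here.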

For grins I installed a plain vanilla copy of FreeBSD 9.3-RELEASE-p5 in a VM to use as an NFS client. As above v3 works but not v4. It fails a little differently than CentOS 7 as a client does. The mount seems to work but I get an error when I try to access the mounted directory. I get "nfsv4 client/server protocol prob err=10006"

Bottom line, with both CentOS 7 and FreeBSD 9.3 as NFS clients: looking at the packet captures in Wireshark, both fail at the "SETCLIENTID" opcode. The server always responds with "SETCLIENTID Status: NFS4ERR_SERVERFAULT". I don't know if there's anything I can do configuration-wise to make this happy(?). nfsuserd and rpc.idmapd are running where appropriate.
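
The numeric codes in the client messages quoted in this thread (err=10016 in the original post, err=10006 above) can be matched to the NFSv4 status names from RFC 7530. Here is a small Python sketch of that lookup. The table is a deliberately tiny subset, and the helper name is hypothetical.

```python
import re

# NFSv4 status codes from RFC 7530 (subset; only the ones seen in this thread).
NFS4_STATUS = {
    0: "NFS4_OK",
    10006: "NFS4ERR_SERVERFAULT",
    10016: "NFS4ERR_WRONGSEC",
}


def explain(message):
    """Find 'err=<number>' in a client log line and return the status name, if known."""
    match = re.search(r"err=(\d+)", message)
    if match is None:
        return None
    return NFS4_STATUS.get(int(match.group(1)))


if __name__ == "__main__":
    for line in (
        "nfsv4 err=10016",
        "nfsv4 client/server protocol prob err=10006",
    ):
        print(line, "->", explain(line))
```

So the FreeBSD client's err=10006 is the same NFS4ERR_SERVERFAULT that shows up in the SETCLIENTID reply, while err=10016 (NFS4ERR_WRONGSEC) generally indicates the server did not accept the security flavor used for the request.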

Is this enough info to put into a bug report? I'll create one if so (it seems the OP lost interest).
 

ebledsoe

Cadet
Joined
Apr 18, 2014
Messages
6
Just updated to FreeNAS-9.3-STABLE-201503270027. It appears there's been a bit of a regression. Part of the fix for Bug #7775 is gone. The line in /etc/hosts "127.0.0.1 freenas freenas.example.com" is back.

When the machine rebooted after the update I saw the familiar "nfsd: can't register svc name" on the console and sure 'nuf NFS4/kerberos is not working. I know "127.0.0.1 ..." has been in the /etc/hosts file of unix'ish systems forever, but is it actually needed for anything in FreeNAS? It sure is NOT allowing "gssd" to work, for me at least.

Should I (can I?) re-open Bug #7775? Or is there a better/different way to get this resolved?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
is your fqdn freenas.example.com? That line has to be there and lots of stuff would stop working if it wasn't there. This line probably isn't the reason your kerberos is failing. Things are not working because you have them configured incorrectly. You need to make sure your principals created in your kdc match what freenas expects them to be.
 

ebledsoe

Cadet
Joined
Apr 18, 2014
Messages
6
is your fqdn freenas.example.com?
Yes it is, in the test environment that I have set up. In the "production" environment it's a domain that we own.

That line has to be there and lots of stuff would stop working if it wasn't there.
To be clear the line "127.0.0.1 localhost localhost.my.domain" is still there. I believe "stuff" stops working if it's gone. I'm removing "127.0.0.1 freenas freenas.example.com".

This line probably isn't the reason your kerberos is failing. Things are not working because you have them configured incorrectly. You need to make sure your principals created in your kdc match what freenas expects them to be.
Yes, actually the presence of this line in /etc/hosts causes NFSv4/Kerberos to fail. I don't have the tcpdump captures with me now, but when gssd starts there is an exchange between gssd and the KDC. With that line in /etc/hosts, gssd tells the KDC its IP address is 127.0.0.1, which causes the "transaction" to fail. Without the line in /etc/hosts, gssd does a DNS query to find its IP address, which is then reported correctly to the KDC.
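
A quick way to see what address the box reports for its own name is to ask the system resolver directly. This Python sketch does that and flags a loopback answer. It only shows what the resolver returns to an ordinary process. It is an assumption, not something confirmed in this thread, that gssd goes through the same lookup path. The function name and the injectable `resolve` parameter are illustrative.

```python
import ipaddress
import socket


def self_resolution(hostname=None, resolve=socket.gethostbyname):
    """Resolve this machine's own name and report whether the answer is loopback."""
    name = hostname or socket.getfqdn()
    addr = resolve(name)
    return name, addr, ipaddress.ip_address(addr).is_loopback


if __name__ == "__main__":
    try:
        name, addr, is_loopback = self_resolution()
    except OSError as exc:  # covers socket.gaierror when the name does not resolve
        print("could not resolve own host name:", exc)
    else:
        verdict = "loopback (suspicious for Kerberos)" if is_loopback else "routable"
        print("%s -> %s [%s]" % (name, addr, verdict))
```

Run on the server, a result of 127.0.0.1 for the FQDN would be consistent with the behavior described above. An address that matches DNS would suggest looking elsewhere.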
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'm gonna say just a few things because finding Kerberos/AD/LDAP/etc problems are not trivial.

1. I've never known anyone to have to mess with the hosts file in a production environment (this is for TrueNAS and FreeNAS). If this is fixing your problem, you are masking the problem by removing the entry and you should instead fix the problem. Ultimately, the hosts file is nothing but some static values for things you want to work, even if the network is down. So the entry you get from a DNS request and the entry in the host should be the same. So you should NOT be seeing any behavioral change.
2. The problems that come about when user authentication is used with FreeNAS are often very difficult to identify. This is not because it's hard, people are just inexperienced with it.
3. More than 75% of the time the problem is because their domain/LDAP/etc is not configured and managed properly. You cannot have dead domain controllers from 2008 listed. You cannot have 1/2 broken DNS servers listed. You cannot do stupid crap and expect it to work. In fact, the problem is generally people have no idea they've got all this old deprecated/broken stuff in their domain until they try to setup FreeNAS and FreeNAS knows better.
 

ebledsoe

Cadet
Joined
Apr 18, 2014
Messages
6
Thanks for taking the time to reply, hopefully others will find it useful. Unfortunately I didn't find anything applicable to the issue.

In our environment FreeNAS is the "odd man out". NFSv4/Kerberos works EVERYWHERE else, including a generic FreeBSD 9.3 NFS server. I've taken the time to study the network traffic with tcpdump/wireshark, and I've taken the time to truss nfsd and gssd when they start. I know exactly what the problem is. I just suggested one fix; it may break something else, I don't know.
I'm gonna say just a few things because finding Kerberos/AD/LDAP/etc problems are not trivial.

1. I've never known anyone to have to mess with the hosts file in a production environment (this is for TrueNAS and FreeNAS). If this is fixing your problem, you are masking the problem by removing the entry and you should instead fix the problem.
So how many decades have TrueNAS and/or FreeNAS had Kerberos authentication on NFSv4? The point here is NFSv4/Kerberos is new, so there's at least an outside chance things that always worked in the past may not work now. Incidentally, the other way to fix this problem is to change this line in /etc/nsswitch.conf

From - hosts: files dns
To - hosts: dns files

Ultimately, the hosts file is nothing but some static values for things you want to work, even if the network is down. So the entry you get from a DNS request and the entry in the host should be the same. So you should NOT be seeing any behavioral change.
If the hosts file says freenas.example.com is 127.0.0.1 and DNS says it's 192.168.10.227 do you still think your statement above is true?
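
To make the point about lookup order concrete, here is a toy Python model of the "hosts:" line in /etc/nsswitch.conf. It is only a sketch: real nsswitch processing has status and action rules that are ignored here, and the function and table names are made up. The two addresses are the ones from the question above.

```python
def lookup(name, order, sources):
    """Return (source, address) from the first source in `order` that knows `name`."""
    for source in order:
        addr = sources.get(source, {}).get(name)
        if addr is not None:
            return source, addr
    return None


# Illustrative data: what each source would answer for the box's own FQDN.
SOURCES = {
    "files": {"freenas.example.com": "127.0.0.1"},     # /etc/hosts entry
    "dns": {"freenas.example.com": "192.168.10.227"},  # DNS A record
}

if __name__ == "__main__":
    for order in (["files", "dns"], ["dns", "files"]):
        print("hosts: %s ->" % " ".join(order),
              lookup("freenas.example.com", order, SOURCES))
```

With "files dns" the loopback entry wins. With "dns files" the DNS address wins. That is why swapping the order has the same effect as commenting out the /etc/hosts line.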
2. The problems that come about when user authentication is used with FreeNAS are often very difficult to identify. This is not because it's hard, people are just inexperienced with it.
3. More than 75% of the time the problem is because their domain/LDAP/etc is not configured and managed properly. You cannot have dead domain controllers from 2008 listed. You cannot have 1/2 broken DNS servers listed. You cannot do stupid crap and expect it to work. In fact, the problem is generally people have no idea they've got all this old deprecated/broken stuff in their domain until they try to setup FreeNAS and FreeNAS knows better.
Anyway thanks again. We know how to work around this problem for the time being so I'll fade away for now.
 

lumaforge

Cadet
Joined
Jun 3, 2015
Messages
4
Anyway thanks again. We know how to work around this problem for the time being so I'll fade away for now.

Sorry about the response you got here. Doubly so, because I could really have used a more thorough conclusion to this thread.

-Eric
 

icemachine79

Cadet
Joined
Jan 22, 2014
Messages
5
I'm gonna say just a few things because finding Kerberos/AD/LDAP/etc problems are not trivial.

1. I've never known anyone to have to mess with the hosts file in a production environment (this is for TrueNAS and FreeNAS). If this is fixing your problem, you are masking the problem by removing the entry and you should instead fix the problem. Ultimately, the hosts file is nothing but some static values for things you want to work, even if the network is down. So the entry you get from a DNS request and the entry in the host should be the same. So you should NOT be seeing any behavioral change.

Really? You don't think associating a nonexistent hostname to your FreeNAS box might cause problems for DC authentication and/or interactions with other members of the domain?

3. More than 75% of the time the problem is because their domain/LDAP/etc is not configured and managed properly. You cannot have dead domain controllers from 2008 listed. You cannot have 1/2 broken DNS servers listed. You cannot do stupid crap and expect it to work. In fact, the problem is generally people have no idea they've got all this old deprecated/broken stuff in their domain until they try to setup FreeNAS and FreeNAS knows better.

Sounds like you've had quite a lot of fun dealing with those sorts of issues on networks you've inherited.
 

xenu

Dabbler
Joined
Nov 12, 2015
Messages
43
Sorry for necroing this old thread:
After it was announced that FreeNAS Corral was not going to be continued in its current form, I decided to revert to 9.10.2-U2 (I was not using any of the new features).
So I switched back to an old boot environment and everything seemed fine, except that I could not get kerberized NFSv4 mounts to work. The only error message I found was "nfsd: can't register svc name" every time I changed the NFS config and reloaded and/or rebooted.
After a while I decided to wipe my boot drives and do a fresh install. The result was the same.
Then I vaguely remembered struggling with NFSv4 + Kerberos when I first set it up, and that I eventually changed the "/etc/hosts" file just as mentioned above. I did that again and everything started working. I also added a line "10.0.20.4 freenas01.ipa.mydomain.com freenas01", though I assume this just stops services from making a DNS call.

Here is the part I don't understand, though: I rebooted afterwards to make sure my changes "stick", which they didn't, because /etc/hosts had that line again ("127.0.0.1 freenas01.ipa.mydomain.com freenas01"). BUT now everything works nonetheless. Maybe there is some hidden cache, I don't know. I have now added my custom line above through the GUI (network->global configuration->host name database), but it is added above the 2 lines which caused trouble to begin with.
 