NFS Huge Issue Writing, No Issue Reading

Status
Not open for further replies.

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
I just noticed this when I was trying to add torrents to Deluge: a torrent would download for a few seconds and then error out, complaining that the mount was unavailable or busy. It did manage to download one file successfully, but nothing else. It's not a permissions issue; I believe it's an NFS issue.

Here's the output from nfsiostat. As you can see, reads are fine, but writes show a ton of retransmissions and very high latency. I didn't notice this with Usenet, but it seems to be affecting that mount as well; the Usenet apps are apparently just far more resilient at handling errors than Deluge is.

Code:
192.168.1.6:/mnt/storage/downloads/torrents mounted on /mnt/downloads/torrents:

           ops/s       rpc bklog
         970.743           0.000

read:              ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)
                   3.183          51.758          16.262        0 (0.0%)           1.359           1.910           0.529
write:             ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)
                  21.564        4387.662         203.469   36624 (82.1%)         780.735       11616.492       10654.165

192.168.1.6:/mnt/storage/downloads/usenet mounted on /mnt/downloads/usenet:

           ops/s       rpc bklog
         970.743           0.000

read:              ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)
                 252.616       32318.499         127.935      712 (0.1%)          19.290          32.553          12.839
write:             ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)  avg queue (ms)
                 220.446       74212.082         336.645 930670 (204.1%)        1945.021       12171.875        9592.675
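
A retrans figure above 100% (like the 204.1% on the usenet mount) looks odd at first. Assuming nfsiostat reports it as retransmissions divided by completed ops (an assumption about the tool, not something confirmed in this thread), the percentage can be turned into an average number of transmissions per op. `tx_per_op` is a hypothetical helper name used only for this sketch.

```shell
# Hypothetical helper: average transmissions per op, assuming
# retrans% = retransmissions / ops * 100 (so 0% means each op was sent once).
tx_per_op() {
  awk -v pct="$1" 'BEGIN { printf "%.2f", 1 + pct / 100 }'
}

torrents_write=$(tx_per_op 82.1)    # write retrans% on the torrents mount
usenet_write=$(tx_per_op 204.1)     # write retrans% on the usenet mount

echo "torrents: $torrents_write transmissions per write op"
echo "usenet:   $usenet_write transmissions per write op"
```

Under that assumption, each write on the usenet mount was sent roughly three times on average, while reads were almost never retransmitted.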


And here's further proof that writes are the issue:

Code:
 [bran@pirate torrents]$ dd if=/dev/zero of=test.img bs=16k count=163840 status=progress
2516369408 bytes (2.5 GB, 2.3 GiB) copied, 25 s, 101 MB/s
163840+0 records in
163840+0 records out
2684354560 bytes (2.7 GB, 2.5 GiB) copied, 59.1221 s, 45.4 MB/s

 [bran@pirate torrents]$ dd if=test.img of=/dev/null bs=16k status=progress
163840+0 records in
163840+0 records out
2684354560 bytes (2.7 GB, 2.5 GiB) copied, 0.493303 s, 5.4 GB/s
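
One caveat with the numbers above: the write rate starts high and then drops, which is what you'd expect if dd is filling the client's page cache first, and the 5.4 GB/s read is very likely served from that same cache rather than over NFS. A minimal sketch of a write test that waits for the data to be flushed before reporting a rate, assuming GNU dd on the Linux client. `TARGET_DIR` is a hypothetical variable; it defaults to a temp directory so the sketch runs anywhere, and the count is kept tiny on purpose.

```shell
# Hypothetical target; point TARGET_DIR at the NFS mount for a real test.
target_dir="${TARGET_DIR:-$(mktemp -d)}"

# conv=fdatasync makes GNU dd flush the file data before it prints the
# final rate, so the number is not just the speed of the page cache.
# Increase count for a meaningful run (the original test used 163840).
dd if=/dev/zero of="$target_dir/test.img" bs=16k count=64 conv=fdatasync

size=$(wc -c < "$target_dir/test.img")
echo "wrote $size bytes"
rm -f "$target_dir/test.img"
```

For the read side, the file would need to be evicted from the cache first (or read from a different client) before the number says anything about NFS.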


The NFS server is FreeNAS and the NFS client is Arch Linux, both of which are virtualized inside ESXi 6.5.

Here are my mount options:
Code:
 [bran@pirate torrents]$ mount|grep torrents
systemd-1 on /mnt/downloads/torrents type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=15704)
192.168.1.6:/mnt/storage/downloads/torrents on /mnt/downloads/torrents type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=14,retrans=2,sec=sys,mountaddr=192.168.1.6,mountvers=3,mountport=1001,mountproto=udp,local_lock=none,addr=192.168.1.6)

 [bran@pirate torrents]$ mount|grep usenet
systemd-1 on /mnt/downloads/usenet type autofs (rw,relatime,fd=46,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=15708)
192.168.1.6:/mnt/storage/downloads/usenet on /mnt/downloads/usenet type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=14,retrans=2,sec=sys,mountaddr=192.168.1.6,mountvers=3,mountport=1001,mountproto=udp,local_lock=none,addr=192.168.1.6)
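
One thing that stands out in these options is `timeo=14,retrans=2`. Per nfs(5), `timeo` is given in tenths of a second, and the default for NFS over TCP is 600 (60 seconds). The sketch below just pulls the value out of an option string and converts it; the option string is abbreviated from the mount output above. Whether the short timeout actually explains the write retransmissions is a guess, not something verified here.

```shell
# Option string abbreviated from the mount output in this post.
opts='rw,relatime,vers=3,rsize=131072,wsize=131072,hard,proto=tcp,timeo=14,retrans=2,sec=sys'

# Split on commas and keep only the value of timeo=
timeo=$(printf '%s\n' "$opts" | tr ',' '\n' | sed -n 's/^timeo=//p')

# timeo is in deciseconds, so divide by 10 to get seconds
timeo_seconds=$(awk -v t="$timeo" 'BEGIN { printf "%.1f", t / 10 }')

echo "timeo=$timeo deciseconds = $timeo_seconds s before the client retries"
```

That is 1.4 s, well below the roughly 11.6 s average write execution time that nfsiostat reported for the torrents mount.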


I'm not too good at performance debugging on the BSD side of things, especially in an appliance OS like FreeNAS. nfsstat looks fine from FreeNAS.

Code:
root@freenas:~ # nfsstat
Client Info:
Rpc Counts:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
         0         0         0         0         0         0         0         0
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
         0         0         0         0         0         0         0         0
     Mknod    Fsstat    Fsinfo  PathConf    Commit
         0         0         0         0         0
Rpc Info:
  TimedOut   Invalid X Replies   Retries  Requests
         0         0         0         0         0
Cache Info:
 Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
         0         0         0         0         0         0         0         0
 BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
         0         0         0         0         0         0         0         0

Server Info:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
     35995       826     11237         0   4862736   6433639         0       821
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
       987         0         0        21        13         0      4885     23296
     Mknod    Fsstat    Fsinfo  PathConf    Commit
         0     89266        22        11     64141
Server Ret-Failed
         0
Server Faults
         0
Server Cache Stats:
    Inprog      Idem  Non-idem    Misses
         0         0         0  11528706
Server Write Gathering:
  WriteOps  WriteRPC   Opsaved
   6433639   6433639         0


My FreeNAS exports:
Code:
root@freenas:~ # cat /etc/exports
/mnt/storage/downloads/usenet  -maproot="root":"wheel"
/mnt/storage/downloads/torrents  -maproot="root":"wheel"


Edit:

Knocking the rsize and wsize down to 32k on Linux didn't seem to help:
Code:
 [bran@pirate torrents]$ !471
dd if=/dev/zero of=test.img bs=16k count=163840 status=progress
2683060224 bytes (2.7 GB, 2.5 GiB) copied, 8 s, 335 MB/s
163840+0 records in
163840+0 records out
2684354560 bytes (2.7 GB, 2.5 GiB) copied, 60.6684 s, 44.2 MB/s


Edit 2:

Writes are just as slow directly on FreeNAS, wtf?
Code:
root@freenas:/mnt/storage/downloads/torrents # dd if=/dev/zero of=test.img bs=16k count=163840
163840+0 records in
163840+0 records out
2684354560 bytes transferred in 48.635495 secs (55193322 bytes/sec)


That dataset is using a record size of 16k, and the pool has 2 RAIDZ2 VDEVs of 6 drives each.

Using a different block size tells a different story, though:

Code:
root@freenas:/mnt/storage/downloads/torrents # dd if=/dev/zero of=test1.img bs=2M count=5100
5100+0 records in
5100+0 records out
10695475200 bytes transferred in 58.807041 secs (181874059 bytes/sec)


I tested it on my usenet dataset as well, which has the default record size. I used a block size of 2M and, as you can see, it's a lot quicker:
Code:
root@freenas:/mnt/storage/downloads/usenet # dd if=/dev/zero of=test.img bs=2M count=5100
5100+0 records in
5100+0 records out
10695475200 bytes transferred in 54.203036 secs (197322439 bytes/sec)
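
FreeBSD's dd reports raw bytes/sec while GNU dd on the Linux client reports MB/s (powers of ten), so the three local results are easier to compare with the earlier ones after converting. A small sketch; `to_mb_per_s` is a hypothetical helper name, and the inputs are the bytes/sec figures from the dd runs on FreeNAS above.

```shell
# Hypothetical helper: convert bytes/sec to MB/s (1 MB = 1,000,000 bytes,
# the same unit GNU dd uses in its summary line).
to_mb_per_s() {
  awk -v b="$1" 'BEGIN { printf "%.1f", b / 1000000 }'
}

rate_16k_torrents=$(to_mb_per_s 55193322)    # bs=16k on the torrents dataset
rate_2m_torrents=$(to_mb_per_s 181874059)    # bs=2M on the torrents dataset
rate_2m_usenet=$(to_mb_per_s 197322439)      # bs=2M on the usenet dataset

echo "torrents bs=16k: $rate_16k_torrents MB/s"
echo "torrents bs=2M:  $rate_2m_torrents MB/s"
echo "usenet   bs=2M:  $rate_2m_usenet MB/s"
```

So the local 16k write (about 55 MB/s) is in the same range as the 45 MB/s seen over NFS, while the 2M writes are more than three times faster.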


I just remembered that I have compression enabled on these datasets, so these /dev/zero write numbers are probably not accurate :-/
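
To show why zeros are a poor test payload when compression is on, here is a rough sketch comparing how well 1 MiB of zeros and 1 MiB of random data compress. It uses gzip purely as a stand-in; the algorithm actually configured on the ZFS datasets isn't stated in this post, so the exact sizes would differ.

```shell
# Compress 1 MiB of zeros and 1 MiB of random data and count the output bytes.
# gzip is only a stand-in for whatever compression the dataset uses.
zeros=$(head -c 1048576 /dev/zero | gzip -c | wc -c)
random=$(head -c 1048576 /dev/urandom | gzip -c | wc -c)

echo "1 MiB of zeros compresses to $zeros bytes"
echo "1 MiB of random data compresses to $random bytes"
```

The zeros shrink to almost nothing while the random data does not shrink at all, so a fairer write test would first generate a file from /dev/urandom and then copy that file to the dataset.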
 