Yet Another ZFS Tuning Thread

Status
Not open for further replies.

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Here's another cool screenshot. Never thought I'd see 99% network utilization for longer than 1 second on a Windows machine.

Untitled2.jpg
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm pretty sure that's ZFS because less than 25MB are sent through the network before the file starts playing, but more than 800MB are read from the server before the file starts. Usually there's a 2-3 second delay between when the VLC window opens and when you see the video and hear audio. It COULD be that VLC needed certain info from certain parts of the file yielding the high read rates and low utilization of data.

No, it's not ZFS. Look, we can EXPERIMENT. It's real simple. Run "zpool iostat 1" in one window. Run commands in another. Make sure you're NOT reading the same data so as to force a read off the disk (see below).

So here's what you do. Reading 25MB of data does not result in ZFS reading and caching 800MB of data. It's straightforward. Do "dd if=file.foo of=/dev/null bs=1048576 count=25". Assuming a fast system, you'll see a little spike:

Code:
storage0    7.08T  3.80T      0    113      0   567K
storage0    7.08T  3.80T    481    138  59.5M   666K
storage0    7.08T  3.80T      0    116      0   578K


where that 59.5M coincides exactly with me hitting return. So ZFS is reading the requested 25MB, plus another 34.5MB, a tad above what I said was the maximum buffer, but then again it had to read metadata to open the file, and there is other light activity going on with the system too.
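The arithmetic on that spike is easy to check mechanically. Here is a minimal sketch, assuming the usual `zpool iostat` column order (pool, alloc, free, read ops, write ops, read bandwidth, write bandwidth). The helper name `extra_read_mb` is made up for illustration; it just subtracts what dd asked for from what the pool reported reading in that one-second sample.

```shell
#!/bin/sh
# Hypothetical helper: given one `zpool iostat` data line and the number of
# megabytes dd requested, print how many extra MB the pool read.
extra_read_mb() {
    line=$1
    requested_mb=$2
    printf '%s\n' "$line" | awk -v req="$requested_mb" '
        {
            bw = $6                          # read bandwidth column, e.g. 59.5M
            unit = substr(bw, length(bw))    # trailing unit letter, if any
            value = bw + 0                   # awk keeps the leading number
            if (unit == "K") value /= 1024
            else if (unit == "G") value *= 1024
            else if (unit != "M") value /= 1048576   # no suffix: plain bytes
            printf "%.1f\n", value - req
        }'
}

# The spike line from the sample above, with the 25MB dd request:
extra_read_mb 'storage0    7.08T  3.80T    481    138  59.5M   666K' 25   # prints 34.5
```

This only looks at a single sample, so anything else hitting the pool in that second (metadata reads, other clients) lands in the same number.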

I did the dd you requested and it read 17.4MB. I ran it a second time and got 0, which I assume means the data is in the cache?

Right. Now play around with it a bit. You won't find ZFS reading gobs of data. It isn't efficient to do so. If it sees a file where prefetching is successful, it WILL try to prefetch a larger buffer but only up to about 32MB of data or thereabouts. And that only happens when several megabytes have already been read. If you go and read 1KB, it isn't going to prefetch 32MB.
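For playing around with different request sizes, a small wrapper around dd keeps the test controlled. This is only a sketch: `read_first_mib` is a made-up name, and it pipes through `wc -c` instead of writing to /dev/null so you can confirm how many bytes were actually delivered.

```shell
#!/bin/sh
# Hypothetical helper: read exactly $2 MiB from the start of file $1 and
# print the number of bytes that were actually delivered.
read_first_mib() {
    dd if="$1" bs=1048576 count="$2" 2>/dev/null | wc -c | tr -d ' '
}

# Usage sketch (file.foo is the placeholder file name used earlier):
#   zpool iostat 1                 # in one window
#   read_first_mib file.foo 25     # in another; repeat with 1, 5, 100, ...
# Use a file that has not been read recently, or the second run will be
# served from cache and show no disk reads at all.
```

Comparing the byte count printed here against the read bandwidth in the iostat window shows how much extra ZFS pulled in for each request size.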
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Is there a way to change that?

Of course. However, this is really something that could cause serious levels of stress on the cache if one was careless with it; I've also not looked carefully at the code to see if there might be other pitfalls. It may be better to crank this sort of thing up by upping the record size. You can set vfs.zfs.zfetch.block_cap under tunables to what you'd like.
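To get a feel for what a given setting would mean, here is a rough sketch. It assumes the cap is counted in records, so the prefetch ceiling is roughly block_cap times the record size; that reading is an assumption, not something checked against the code. The value 256 below is also assumed, picked because with 128k records it lines up with the roughly 32MB figure mentioned earlier. `prefetch_cap_mb` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: estimate the prefetch ceiling in MB, assuming
# ceiling = block_cap * recordsize.
prefetch_cap_mb() {
    block_cap=$1        # value of vfs.zfs.zfetch.block_cap (in records)
    recordsize_kb=$2    # dataset record size in KB
    echo $(( block_cap * recordsize_kb / 1024 ))
}

prefetch_cap_mb 256 128    # prints 32

# To see what the box is actually using before changing anything:
#   sysctl vfs.zfs.zfetch.block_cap
```

Doubling either number doubles the estimate, which is why a careless value here can put real pressure on the cache.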
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I thought that the record size was preset to 128k, and also had a maximum value of 128k?

I can't run my test right now.. I'm moving 15TB from one server to another via rsync :( The way that I came up with 800MB was exactly how you described. I did a 'zpool iostat 1', then on my HTPC opened a file with VLC. The HTPC would record about 25MB across the network, but over the next 6 seconds or so it would rack up almost 800MB read before it would settle down. As soon as I can, I'll do another test and post the results. It could be a few days.. 15TB takes a while to move ;)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Yeah, but that's not a controlled test, you don't know what's being asked for by VLC. Use dd and request a specific amount of data, or else it's just meaningless handwaving. I can write a program to more effectively cause ZFS to read lots of data based on very small requests, but it proves relatively little about caching beyond showing that I know how to manipulate I/O subsystems. The original idea here was that someone was streaming a movie to a TV and that so much data had been cached that the drives spun down after a half hour. That story may be true in terms of what someone experienced, but the why's probably wrong - ZFS by itself doesn't want to cache that much data.

Now maybe if he watched part of a show, someone else came over, they restarted the show, and as a result got served out of ARC, drives spun down, etc., well yes that could happen.
 

TheSmoker

Patron
Joined
Sep 19, 2012
Messages
225
Good morning. Sorry for the lack of updates, but I was not able to install the Intel NIC last night.
I am planning on doing it tonight. After that I will post more updates.
 

TheSmoker

Patron
Joined
Sep 19, 2012
Messages
225
Done the upgrade: I've put the Intel CT PCI Express network adapter in and disabled the onboard Realtek. I have also added the following card: http://delock.de/produkte/F_331_intern--SATA-_89302/merkmale.html
On it I have installed an OCZ Nocti 60G SandForce SSD. With the spare port this card provides and the 8 sata ports onboard, I have the 9 sata connections required for the 9 WD Green 2TB drives.
My configuration is as follows:
- 6 WD Green configured in RAIDZ2 using 6 sata ports on mobo;
- 3 WD Green configured in RAIDZ1 using 2 sata ports on mobo and 1 sata port in the add-on card;
All the ports are 6Gbps gen 3 sata ports.

Before starting to do the testing my question was: how do I add the SSD to be the ZIL disk for one of the RAIDZx arrays?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Please tell me this is only for testing. OCZ is world renowned for having very low reliability among SSDs... and you are adding it as a ZIL. Do you enjoy torturing yourself?
 

TheSmoker

Patron
Joined
Sep 19, 2012
Messages
225
I do not ... it was the only mSATA SSD available on market in my country/city. :) I will replace it later on with an Intel one.
 

TheSmoker

Patron
Joined
Sep 19, 2012
Messages
225
freenas_8.3_R6_6D_woSSD.jpg

First round of tests. Intel network adapter on FreeNAS and Realtek network adapter on the client (Windows 7).
I'm going to post additional tests with and without the SSD as ZIL disk, run overnight using Intel NASPT in batch mode.
 