FreeNAS 8.2 LACP

Status
Not open for further replies.

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
I've been reading up on this a lot lately. Seems every post has a different story/opinion.

Basically I want to play around with getting more transfer speeds.

CURRENT SETUP
Right now I only have a single NIC from my NAS to an unmanaged jumbo-frame gigabit switch. I was getting about 40-90MB/s on FTP transfers, but now suddenly I'm getting about 30MB/s. Nothing has changed on the NAS or local machine; all pools are healthy; cables and their connections are good.

Local machine & NAS are all SATA 6Gbps connected with 7200RPM HDDs, RAIDZ2 configuration.

FUTURE SETUP IDEA
Looking to upgrade to 3 NICs total on both the NAS & local machine, connected to a new managed switch that supports 9000-byte jumbo frames and LACP.

Is this worth the extra trouble or is there some red flag I need to check with my current drop in speed?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Sigh.. this has been hashed out before, the whole jumbo versus non-jumbo and LACP versus no-LACP debate.

If you are only getting 30MB/sec you aren't even close to saturating a single Gb NIC, so why would you think LACP would help? LACP will NOT increase single-link speeds. Basically, unless you have lots of users (aka not a home environment), LACP will be a waste of your time.

Jumbo packets can help increase speed slightly, but there are a lot of potential problems because not all network devices use or support jumbo frames. And even if they do, the definition of "jumbo frames" often changes from device to device.

Basically, you should figure out where your bottleneck is and fix that before you go trying to add unnecessary complexity to your setup.
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Sigh.. this has been hashed out before, the whole jumbo versus non-jumbo and LACP versus no-LACP debate.

If you are only getting 30MB/sec you aren't even close to saturating a single Gb NIC, so why would you think LACP would help? LACP will NOT increase single-link speeds. Basically, unless you have lots of users (aka not a home environment), LACP will be a waste of your time.

Jumbo packets can help increase speed slightly, but there are a lot of potential problems because not all network devices use or support jumbo frames. And even if they do, the definition of "jumbo frames" often changes from device to device.

Basically, you should figure out where your bottleneck is and fix that before you go trying to add unnecessary complexity to your setup.

Thanks Cyberjock. Always good to hear from you.

I agree about finding out the current cause. Know of any Windows programs with a good reputation that I can start with to sniff out where the issue is in my network? Wireshark seems to have what I'm looking for, but there are some shady things I've been reading online about it as well. I found a few other programs, but they appear to have the same risks.

Not expecting a guaranteed safe program, but rather something that maybe any of you have worked with in the past that I could look into.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I wouldn't go sniffing traffic. It's pretty much a waste of time. You need to start testing from your zpool and follow the "path" your data takes, step by step, from the zpool to the destination system to find the problem. With speeds that low I doubt it's a problem with your network hardware; more likely something is configured in a way that is killing performance, or some hardware is starting to fail. Sniffing traffic will most likely only show you what your transfer rate is. You already know that. You want to know why it's slow, which isn't going to be found with network packet sniffing unless you have something like massive packet retransmits.

Run long SMART tests on your drives, do an iperf test from your server to 2 different machines, then from 2 machines to your typical desktop, and go from there.
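
If it helps, the tests look roughly like this from the shell (ada0 and 192.168.1.10 are just placeholders for your own drive devices and the server's IP):

# start a long test on each drive, then check the results once it finishes
smartctl -t long /dev/ada0
smartctl -l selftest /dev/ada0

# raw network throughput, independent of the disks: run this on the FreeNAS box
iperf -s
# and this on each client, pointed at the server
iperf -c 192.168.1.10 -t 30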
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
Once you sort out your bottleneck(s), acquire a switch that supports LACP. These are not typically cheap, or aimed at consumers. Once you have a switch that is capable of LACP, it's pretty trivial to set up on both the switch and FreeNAS.
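
For reference, the FreeBSD side of a two-port LACP lagg looks roughly like this (em0/em1 and the address are placeholders; on FreeNAS you'd do the equivalent through the Link Aggregation page in the GUI rather than at the shell):

ifconfig lagg0 create
ifconfig lagg0 up laggproto lacp laggport em0 laggport em1
ifconfig lagg0 inet 192.168.1.10 netmask 255.255.255.0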
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
I wouldn't go sniffing traffic. It's pretty much a waste of time. You need to start testing from your zpool and follow the "path" your data takes, step by step, from the zpool to the destination system to find the problem. With speeds that low I doubt it's a problem with your network hardware; more likely something is configured in a way that is killing performance, or some hardware is starting to fail. Sniffing traffic will most likely only show you what your transfer rate is. You already know that. You want to know why it's slow, which isn't going to be found with network packet sniffing unless you have something like massive packet retransmits.

Run long SMART tests on your drives, do an iperf test from your server to 2 different machines, then from 2 machines to your typical desktop, and go from there.

Will do. Thanks for your time!!
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Once you sort out your bottleneck(s), acquire a switch that supports LACP. These are not typically cheap, or aimed at consumers. Once you have a switch that is capable of LACP, it's pretty trivial to set up on both the switch and FreeNAS.

Thanks Louisk. It's for a SOHO, so getting more performance can definitely be beneficial since time = $ lost. I'll keep looking around to see what kind of performance boosts most folks are getting, and if I don't think the time saved would outweigh the business time being wasted, I can manage without.

On a non-professional note, it could also be a nice tax write-off and fun to set up.
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Getting 22MBytes/s now with iperf running. Running a long S.M.A.R.T. test in about an hour. After that I'll scrub the volume.
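
For the scrub I'll just be running something like this, with DROBO being my pool name, and checking progress in the status output:

zpool scrub DROBO
zpool status -v DROBO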
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Want to make sure I'm doing this right. Is this merely telling me that the test for this drive is currently 80% done? I never get email reports of SMART results, so I was hoping to view them this way:

smartctl -l selftest /dev/ada0
smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Self-test routine in progress 80% 2593 -
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
Post the output of the command "smartctl -a /dev/ada0".

There is a known issue (not only on FreeNAS, but globally) where smartctl will get stuck at some point on certain devices (disks). So if you are still at 80% of the progress, the long test has probably gotten stuck and performance is degraded while it runs.

Try to stop all known activity (transfers from/to the NAS) and check the I/O for each device.
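
From the FreeNAS shell, per-device I/O can be watched with something like this (the pool name is only an example):

# live per-disk I/O
gstat
# or per-disk stats for a single pool, refreshed every second
zpool iostat -v DROBO 1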
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Ahh there we go!! Thanks HolyKiller. Ran that on each drive and the overall-health self-assessment test result is PASSED.

Two suspects I'd like to look into are write speeds and EMI.

Forgive my ignorance on this one, as these were the tests I found online that I was able to get feedback on.

I'm using Seagate Barracuda 7200 RPM HDDs (Drobo share = 3TB disks; DLNA = 2TB disks I believe).

Any feedback on these write speeds?

dd if=/dev/zero of=/mnt/DLNA/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 34.768370 secs (603178120 bytes/sec)


dd if=/dev/zero of=/mnt/DROBO/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 33.876779 secs (619052949 bytes/sec)


dd of=/dev/null if=/mnt/DLNA/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 32.955801 secs (636352914 bytes/sec)


dd of=/dev/null if=/mnt/DROBO/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 34.682836 secs (604665661 bytes/sec)
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
BTW, autotune is enabled on this system, compression is lzjb, and atime is off.

"Drobo" zpool is a RaidZ2
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Any insights?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'll give you some insight. DD tests with /dev/zero are useless if you have any compression on at all. DD with /dev/zero tests always give crazy high write speeds with compression.

Do the same DD test with /dev/random! That'll tell you what kind of speeds you can really expect from DD. I bet you'll get something that is, at best, 1/3 of that 600MB/sec you just claimed to get. I knew compression was on the second I saw 600MB/sec. Nobody gets those speeds without massively powerful hardware (and you wouldn't be complaining about poor performance then).
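
If you want to double-check what's actually set, compression is a per-dataset ZFS property; something like this (your dataset names may differ) shows the setting and how much it's actually compressing:

zfs get compression,compressratio DROBO
zfs get compression,compressratio DLNA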
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
I'll give you some insight. DD tests with /dev/zero are useless if you have any compression on at all. DD with /dev/zero tests always give crazy high write speeds with compression.

Do the same DD test with /dev/random! That'll tell you what kind of speeds you can really expect from DD. I bet you'll get something that is, at best, 1/3 of that 600MB/sec you just claimed to get. I knew compression was on the second I saw 600MB/sec. Nobody gets those speeds without massively powerful hardware (and you wouldn't be complaining about poor performance then).

Thanks brother. Forgive my ignorance, but I'm not exactly sure what bs= & count= are referring to so I'm putting in numbers I've seen other people use. Based on results I'm assuming that count is the # of test files written and bs is the size of the file?

Here's my output for DLNA zpool (mirrored)

dd if=/dev/random of=/mnt/DLNA/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 257.212645 secs (81533783 bytes/sec)

77.7567 MB/sec

Now for Drobo zpool (RaidZ2)

dd if=/dev/random of=/mnt/DROBO/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 253.687442 secs (82666764 bytes/sec)

78.8372 MB/s
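
One thing I might also check, in case the random generator itself is the limit rather than the pool, is timing /dev/random straight into /dev/null:

dd if=/dev/random of=/dev/null bs=2048k count=1000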
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So based on your DD speeds you should expect something between 88MB/sec and 600MB/sec depending on the file type. I always recommend nothing less than 200MB/sec if you want to saturate Gb. Almost certainly compression is killing your performance. Try disabling compression and do your DD tests again. If both are then over 200MB/sec, keep compression disabled. I've seen 200MB/sec as the sweet spot for maximum file sharing; anything less and performance won't be optimal.

Something I've noticed is that the only people who use compression are the ones trying to save money by not buying bigger hard drives. Since they're trying to save money on hard drives, they're also almost always trying to save money by using old CPU/RAM. Having an old CPU is exactly when you don't want to use compression, so compression seems to be completely pointless for just about anyone who could actually use it. Then you add in the fact that a lot of people enable compression on video streams (why??? you won't save any disk space but you will kill your ability to stream) and you have a recipe for unhappiness.

Also, with an 88MB/sec DD test, scrubs can take a hellaciously long time to complete. That would also explain your poor performance.

Edit: So what's my advice after all of that reading? Turn off compression for decent performance, or buy a MUCH more powerful computer and leave compression on. Also keep in mind that turning off compression only affects new data written to the zpool. Any data already on the zpool stays compressed and will perform poorly.
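
If you do turn it off, it's a one-liner per dataset (DROBO here is just a placeholder for your pool/dataset name; you can check atime at the same time):

zfs set compression=off DROBO
zfs get compression,atime DROBO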

bs = block size
count = number of times the block size is written.

So with a bs of 1024k (1MB) and a count of 10000, you can expect approximately 10GB to be written. You need a block size that is not too big or too small (typically 1024k or 2048k is good) and a count that is big enough that the file won't be cached in RAM. Otherwise you'll only be measuring your cache write speed, which doesn't really tell you anything.
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Thanks Cyberjock!! Wow, 200MB/s or anything close to that would be amazing!!

Followed your advice on the compression and disabled it (atime disabled as well). Also, you probably noticed the jump in speed from my last post; I forgot to mention I disabled jumbo frames on the NAS and hosts. An FTP upload (5GB file) went from 21MB/s to about 82MB/s. For the first 20 seconds of that transfer, FileZilla shows speeds very close to 200MB/s, but then it drops to 82MB/s.

TEST 1
dd if=/dev/random of=/mnt/DROBO/ddfile bs=2048k count=10000
10000+0 records in
10000+0 records out
20971520000 bytes transferred in 238.345150 secs (87988029 bytes/sec)

About 84MB/s

TEST 2
dd if=/dev/random of=/mnt/DROBO/ddfile bs=1024k count=10000
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 119.158798 secs (87998202 bytes/sec)

Still about 84MB/s

With that, all my drives are 7200rpm, SATA III mobo w/ SATA III cables, CAT5e cable, Intel NICs, compression & atime disabled.

When I ran these tests I was using about 50% of my CPU and about 15-18GB of RAM (32GB total with ECC).

Here's my system processes (rsync started running because of the dd file):
last pid: 3643; load averages: 1.04, 0.85, 0.61 up 0+00:34:09 04:52:10
45 processes: 2 running, 43 sleeping

Mem: 100M Active, 53M Inact, 11G Wired, 836K Cache, 141M Buf, 20G Free
Swap: 18G Total, 18G Free


PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
3458 root 1 118 0 18108K 3516K CPU2 2 6:55 100.00% rsync
2222 root 6 44 0 141M 76024K uwait 1 0:03 0.00% python
3223 root 1 76 0 87008K 40080K ttyin 1 0:01 0.00% python
2448 www 1 44 0 14372K 5516K kqread 1 0:01 0.00% nginx
2589 root 7 44 0 66388K 7952K ucond 1 0:00 0.00% collectd
2063 uucp 1 44 0 8032K 1740K select 1 0:00 0.00% usbhid-ups
1569 root 1 44 0 11784K 2776K select 0 0:00 0.00% ntpd
2065 uucp 1 44 0 10992K 2604K select 0 0:00 0.00% upsd
2788 root 1 44 0 33308K 5128K select 1 0:00 0.00% sshd
1744 root 1 44 0 46796K 9212K select 0 0:00 0.00% smbd
2078 uucp 1 44 0 12044K 2548K nanslp 0 0:00 0.00% upsmon
1412 root 1 44 0 6908K 1468K select 0 0:00 0.00% syslogd
3240 root 1 44 0 10176K 2848K ttyin 0 0:00 0.00% csh
1808 root 1 44 0 39220K 6360K select 1 0:00 0.00% nmbd
2740 root 1 76 0 7964K 1552K nanslp 1 0:00 0.00% cron
2186 root 1 44 0 16092K 4560K select 0 0:00 0.00% proftpd
2976 root 1 44 0 7840K 1532K select 1 0:00 0.00% rpcbind
2071 uucp 1 44 0 12044K 2500K nanslp 1 0:00 0.00% upslog

Is there anything else I should check on? Not sure if a write limit override tunable would make any difference with autotune enabled.

I appreciate the detail of your last post. I never really noticed my space savings with compression anyway, so it's good to see a small write-speed pickup from disabling it.
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Ah, can't believe I left this part out... unfortunately my clients are Windows users, so I'm using CIFS shares.
 

sonny81

Contributor
Joined
Aug 7, 2012
Messages
105
Looked around in my CIFS settings and tried enabling AIO, but I don't think it's making any difference.

On the first FTP run (AIO off), my speed rapidly jumped around from 115-200MB/s, and halfway through a 7GB video file transfer it steadied around 95MB/s.

Turned on AIO for the second test and it steadied around 102MB/s.

However, when I repeated the whole process the results came out the other way around... so I'm not convinced AIO is making a difference. My min AIO read & write sizes are at the default 4096.
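
For reference, the fields I'm changing correspond to Samba's AIO knobs; as smb.conf auxiliary parameters they would look roughly like this, with 4096 being the default minimum size mentioned above:

aio read size = 4096
aio write size = 4096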
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
You do realize that FTP is NOT CIFS, right? You keep mentioning FTP speeds, which can be a lot different from CIFS.
 