Advise on Tehuti TN4010 10G NICs? Or alternate for use with iSCSI

ee21

Dabbler
Joined
Nov 2, 2020
Messages
33
Let me preface this by saying I am literally brand new to FreeNAS and just installed it on my first box a few days ago. I do have some minimal experience with Linux beforehand, however, which has helped.

I built my box on a budget to begin with, and figured I'd put in some upgrades as I go. I picked a motherboard that has an integrated 2.5Gbase-T NIC using the RTL8125B chip, which I managed to get working on FreeBSD. Performance has been less than desirable, though: only about 1.5x that of any old 1G card. I'm not sure it's to blame, but I think it's also causing my Hyper-V VMs installed on the iSCSI disks to hang/crash.

I'm looking at biting the bullet and upgrading to a 10G card, and want some advice from the community. While doing research, I found a number of newer cards built on the TN4010 chip, which not only seem less expensive than (new) Intel cards, but reportedly use a LOT less power, to the tune of 1-2W vs 10-11W, which is a substantial difference in heat for a 2U chassis without much cooling.

Has anyone else looked at these cards, and do they look like good hardware? I am obviously concerned about reliability, as well as heat. Here is one such card, whose brand name right off the bat causes some concern: https://www.newegg.com/rosewill-rc-nic412v2/p/N82E16833166130

It seems there may now be a working driver for FreeNAS, but I haven't yet researched what it would take to install; my Linux experience in that area is limited to none: http://www.tehutinetworks.net/?t=drivers&L1=8&L2=12&L3=61

Comments, thoughts or suggestions from the community? Is this card worth trying, or is a cheap consumer-grade card not even worth it, and should I just go Intel? It seems like the X550-based cards are the newest option from Intel that are certified for use with FreeNAS; is that the best way to go for reliability's sake, or is there a better recommendation? Not looking for used hardware on eBay.
 

ee21

Dabbler
Joined
Nov 2, 2020
Messages
33
Well, I gave it a shot. For anyone interested, the driver installs perfectly on TrueNAS 12 using:
pkg add tn40xx_freebsd12_amd64-1.1.txz
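For anyone checking whether the install actually took, something like the following should show the module and the new interface; the grep patterns are just my guesses at the names the driver registers, so go by what dmesg actually reports:

Code:
# confirm the kernel module loaded and find the new interface name
# (the "tn40"/"tehuti" patterns are assumptions -- check dmesg output)
kldstat | grep -i tn40
dmesg | grep -i tehuti
ifconfig -l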

The NIC is recognized as a tn400 10Gbase-T interface. iPerf3 tests were less than spectacular, as can be expected with an off-brand budget card:

Code:
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   497 MBytes  4.17 Gbits/sec
[  4]   1.00-2.00   sec   509 MBytes  4.27 Gbits/sec
[  4]   2.00-3.00   sec   490 MBytes  4.11 Gbits/sec
[  4]   3.00-4.00   sec   502 MBytes  4.21 Gbits/sec
[  4]   4.00-5.00   sec   505 MBytes  4.23 Gbits/sec
[  4]   5.00-6.00   sec   514 MBytes  4.31 Gbits/sec
[  4]   6.00-7.00   sec   517 MBytes  4.34 Gbits/sec
[  4]   7.00-8.00   sec   502 MBytes  4.21 Gbits/sec
[  4]   8.00-9.00   sec   505 MBytes  4.24 Gbits/sec
[  4]   9.00-10.00  sec   505 MBytes  4.23 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  4.93 GBytes  4.23 Gbits/sec  sender
[  4]   0.00-10.00  sec  4.93 GBytes  4.23 Gbits/sec  receiver
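For reference, this was just a plain 10-second TCP run between the two hosts, something along these lines (the server address is illustrative):

Code:
# on the TrueNAS box (server side)
iperf3 -s
# on the client, pointed at the NAS's 10G interface
iperf3 -c 10.10.10.2 -t 10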
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Thanks for the report; it makes for an interesting budget option for users gunning for "greater than gigabit" speeds. What switch are you using (or is it direct-connect)?

Just be aware that any updates to TrueNAS may require you to reinstall the driver (or potentially break it), so it would be a good idea to be a little gun-shy when updates do drop.

Comments, thoughts or suggestions from the community? Is this card worth trying, or is a cheap consumer-grade card not even worth it, and should I just go Intel? It seems like the X550-based cards are the newest option from Intel that are certified for use with FreeNAS; is that the best way to go for reliability's sake, or is there a better recommendation? Not looking for used hardware on eBay.

I'm trying to recall, but of the Intel cards I believe the X520 is actually preferred; the X550 has some driver maturity issues. Chelsio is the vendor of choice due to their mature drivers and high sustained throughput. I wouldn't shy away from a working-pull used card myself, but if there are corporate dollars behind the purchase then absolutely let them pay for the warranty.
 

ee21

Dabbler
Joined
Nov 2, 2020
Messages
33
No switch, direct-attached using an 18-inch S/FTP Cat 6 cable, soon to be replaced with a thicker (lower-AWG) Cat 8.

Sustained throughput seems to be nonexistent, however, and I do not recommend it:
Screenshot (108).png


I will likely be returning these and biting the bullet on some Chelsio T520s and fiber DACs when I can afford it.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Sustained throughput seems to be nonexistent, however, and I do not recommend it:
Hold your horses; what's the rest of your system configuration? A long period of high speeds followed by a drop to a slower rate usually indicates that you've exhausted something else in your setup (likely your vdev/pool throughput) and this wouldn't be solved by the Chelsio cards.
 

ee21

Dabbler
Joined
Nov 2, 2020
Messages
33
Hold your horses; what's the rest of your system configuration? A long period of high speeds followed by a drop to a slower rate usually indicates that you've exhausted something else in your setup (likely your vdev/pool throughput) and this wouldn't be solved by the Chelsio cards.

  • AMD Ryzen 3100
  • AMD B550 chipset motherboard
  • Pool of 3 mirrored vdevs (2x2x2), each vdev made up of 2 mirrored SSDs; the SSDs are a mix of 3x Samsung 850 Evo and 3x 860 Evo, unfortunately purchased at separate times and recycled from my old server, hopefully soon to be upgraded
  • 64GB non-ECC RAM
  • No SLOG yet, but going to be implementing a Sabrent Rocket NVMe for that
  • Direct-attached 10G NICs, same card on both ends as already discussed, connected with a Cat 6 cable
I don't think I'm forgetting anything else that's really relevant as far as performance goes, but chime in if I left out any important details.
I did test disk I/O using the "dd" command. I forget exactly what tests I ran, but I think it was several 1GB writes (roughly along the lines of the sketch below), which I don't believe were buffered by the ZFS cache, and they resulted in about a 550MB/s write speed.
I also watched the dashboard during transfers such as the one I screenshotted above: plenty of free memory for ZFS caching and low CPU usage, both before and after the transfer speeds dropped.
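Roughly what the dd test looked like, going from memory (the path is illustrative, and compression would need to be off on the target dataset for a stream of zeroes to mean anything):

Code:
# 1 GiB sequential write test; with compression enabled on the dataset,
# /dev/zero input would give an artificially high number
dd if=/dev/zero of=/mnt/tank/ddtest bs=1m count=1024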

Oh, and in case it's not obvious, the file copy was done from a Windows host to an iSCSI volume backed by a zvol extent on the TrueNAS box.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
22.5MB/s is a pretty big dive to hit for an all-flash setup, but this could still be the write throttle. I'll circle back on this when I can get to a full keyboard and provide some scripts and CLI commands you can use to confirm.
 

ee21

Dabbler
Joined
Nov 2, 2020
Messages
33
22.5MB/s is a pretty big dive to hit for an all-flash setup, but this could still be the write throttle. I'll circle back on this when I can get to a full keyboard and provide some scripts and CLI commands you can use to confirm.

Thanks! I'll take all the help I can get. I've already doubled the RAM and ditched the 2.5G NIC I started with. I'm not sure what else to try next except ditching the Ryzen CPU and getting something with ECC RAM, but I was really hoping to avoid that; I figured a known-good model of NIC was the next thing to try before going down that road.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
So a bit of napkin math says that you were copying a 20GB file (a VHD/X, probably) and looking at the chart your transfer speed choked at the 30-32% mark, or about 6.4GB into the copy job. With 64GB of RAM you're at a max dirty data of 4GB. I'm thinking your SSDs are choking up under sustained writes (consumer SSDs are made for sprints, not marathons) and eventually you're just getting choked out by the write throttle.
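If you'd rather check the actual ceiling on your box than trust my napkin math, the limits are visible via sysctl (the 4GB cap is the default; yours may differ if anything has been tuned):

Code:
# dirty data limit: defaults to 10% of RAM, capped at 4 GiB
sysctl vfs.zfs.dirty_data_max
sysctl vfs.zfs.dirty_data_max_max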

iSCSI will also be chopping your files up into smaller records - the default volblocksize of 16K means a lot more I/Os to your pool than the default 128K recordsize.
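Note that volblocksize can only be set when a zvol is created, so changing it means making a new zvol and migrating the data onto it - something like the following, with the pool/zvol name, size, and block size purely illustrative:

Code:
# volblocksize is fixed at creation time; -s makes the zvol sparse
zfs create -s -V 500G -o volblocksize=64K tank/vm-extent2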

Here's a dtrace script; save it on your TrueNAS machine as a file (e.g. "dirty.d"), run it with dtrace -s dirty.d YOURPOOLNAME, and start a transfer. Watch for the correlation between the transfer speed dropping and dirty data climbing.

Code:
txg-syncing
{
        this->dp = (dsl_pool_t *)arg0;
}

txg-syncing
/this->dp->dp_spa->spa_name == $$1/
{
        printf("%4dMB of %4dMB used", this->dp->dp_dirty_total / 1024 / 1024,
            `zfs_dirty_data_max / 1024 / 1024);
}

Also a note - iSCSI by default isn't using sync writes, so your SLOG device would go unused unless you specify sync=always on the ZVOLs. The Sabrent NVMe SSD also may not be a good SLOG candidate depending on model - check the thread in my signature for details on what makes a good SLOG device (low latency and high endurance)
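If you do decide you want the SLOG in play, forcing sync is a per-zvol property (the dataset name here is illustrative):

Code:
# force synchronous semantics on the iSCSI extent so a SLOG actually gets used
zfs set sync=always tank/vm-extent2
zfs get sync tank/vm-extent2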
 

ee21

Dabbler
Joined
Nov 2, 2020
Messages
33
So a bit of napkin math says that you were copying a 20GB file (a VHD/X, probably) and looking at the chart your transfer speed choked at the 30-32% mark, or about 6.4GB into the copy job. With 64GB of RAM you're at a max dirty data of 4GB. I'm thinking your SSDs are choking up under sustained writes (consumer SSDs are made for sprints, not marathons) and eventually you're just getting choked out by the write throttle.

iSCSI will also be chopping your files up into smaller records - the default volblocksize of 16K means a lot more I/Os to your pool than the default 128K recordsize.

Here's a dtrace script; save it on your TrueNAS machine as a file (e.g. "dirty.d"), run it with dtrace -s dirty.d YOURPOOLNAME, and start a transfer. Watch for the correlation between the transfer speed dropping and dirty data climbing.

Code:
txg-syncing
{
        this->dp = (dsl_pool_t *)arg0;
}

txg-syncing
/this->dp->dp_spa->spa_name == $$1/
{
        printf("%4dMB of %4dMB used", this->dp->dp_dirty_total / 1024 / 1024,
            `zfs_dirty_data_max / 1024 / 1024);
}

Also a note - iSCSI by default isn't using sync writes, so your SLOG device would go unused unless you specify sync=always on the ZVOLs. The Sabrent NVMe SSD also may not be a good SLOG candidate depending on model - check the thread in my signature for details on what makes a good SLOG device (low latency and high endurance)

Correct, a ~20GB VHDX. I thought the Samsung SSDs were able to handle sustained writes at near max speed without throttling, thanks to their cache/write throughput, but thank you for the script; I will test!

I'm not entirely sure what you mean by a max of 4GB dirty data? Where did you get that from based on 64GB of RAM? I thought the whole point of the ZFS cache was that writes could be dumped in at full speed and then flushed to disk? When running over a 1G connection, writes are typically sustained at 115MB/s without any sudden dives, if that tells you anything.

Oh, and just out of curiosity, I felt the little heatsink on the card during an idle workload and it was burning hot to the touch. I'm going to try some active cooling and see if that makes a difference.

And I did go with the default block size of 16K for the zvols o_O Is it worth wiping them out and rebuilding at 64K or 128K? I did some research beforehand, but didn't really get a straight answer beyond "the default size usually works fine". I think the Windows side of the disk partition is at the default as well, which is 4K (64K is the max for NTFS); should I maybe reformat and rebuild both the zvols and the NTFS partition at 64K then?

And the Sabrent is probably the best I can afford for the price, I think. It had one of the lowest latencies of any NVMe drive I have seen, about as close to the Intel Optane 900 as a regular NVMe drive can get, and one of the highest MTBF and TBW ratings I have seen as well. I'll check out your thread though! My whole pool is strictly for VMs, and has sync=always set.

Is the scenario we've just talked about, where the SSDs are running out of steam, even a good use case for a SLOG? EDIT: based on the results below, I'm guessing a SLOG most likely won't actually help this pool?


EDIT: Ran the test, with very interesting results. I'm happy that at least it looks like my pool isn't at fault. Time to ditch the NIC?
Screenshot (24).png
 