10Gb Network - Ping Works But Data Transfer Doesn't

Hatfriend

Cadet
Joined
Nov 27, 2021
Messages
7
Hey everyone,
I have a Linux server and TrueNAS box connected to a 1Gb router and wanted to add a direct 10Gb connection to improve data transfer speeds between the two. I read on other forum posts that it is a fairly straightforward process that involves creating a separate network and mounting drives using the 10Gb IP addresses. So I purchased two Chelsio T520-CR NICs, connected them with a DAC and assigned IP addresses to my network as follows:

Linux Server (1Gb) - IP: 192.168.1.108 - Subnet: 255.255.255.0
Linux Server (10Gb) - IP: 192.168.2.108 - Subnet: 255.255.255.0
TrueNAS Server (1Gb) - IP: 192.168.1.100 - Subnet: 255.255.255.0
TrueNAS Server (10Gb) - IP: 192.168.2.100 - Subnet: 255.255.255.0

Initially, I had issues installing the Chelsio drivers and getting the connection to work on the Linux Server, so I swapped the two cards between machines due to a suggestion in another forum post. After that, everything on the Linux Server worked and I was able to ping the TrueNAS server through the 10Gb connection. Everything seemed fine until I ran iperf to test the speed of the connection. I ran the following command and received the error:
Code:
root@LINUXSERVER:~# iperf -c 192.168.2.108
connect failed: Connection timed out


I also tried mounting a share from TrueNAS on Linux and got the following error:
Code:
root@LINUXSERVER:~# mount -t cifs -o user=USER,password=PASSWORD //192.168.2.108/SHARENAME /mnt/SHARENAME
mount error(115): Operation now in progress
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)


All the commands I executed through the 1Gb network worked. I'm beginning to suspect the card in the TrueNAS server is bad, but I wanted a second opinion before I went through the trouble of replacing it.
 

Hatfriend

Cadet
Joined
Nov 27, 2021
Messages
7
NOTE: I made a small error in the original post. The IP addresses for the 2 servers are as follows:
Linux Server (1Gb) - IP: 192.168.1.100 - Subnet: 255.255.255.0
Linux Server (10Gb) - IP: 192.168.2.100 - Subnet: 255.255.255.0
TrueNAS Server (1Gb) - IP: 192.168.1.108 - Subnet: 255.255.255.0
TrueNAS Server (10Gb) - IP: 192.168.2.108 - Subnet: 255.255.255.0
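
For reference, a minimal sketch of how the corrected 10Gb addressing could be applied and checked on the Linux side with iproute2. The interface name `enp1s0f4` is a placeholder assumption, not something taken from this thread, and `same_subnet_24` is a hypothetical helper that only compares the first three octets, which is sufficient for the 255.255.255.0 masks used here.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two IPv4 addresses share a /24
# (mask 255.255.255.0), i.e. their first three octets match.
same_subnet_24() {
    [ "${1%.*}" = "${2%.*}" ]
}

# Apply the 10Gb address on the Linux server and confirm which interface
# the kernel would use to reach the NAS. The interface name is an assumed
# argument; check `ip -br link` for the real one. Run as root.
setup_10g() {
    iface="$1"
    ip addr add 192.168.2.100/24 dev "$iface"
    ip link set "$iface" up
    ip route get 192.168.2.108   # should report "dev <iface>", not the 1Gb NIC
}

# Usage (not executed here):
#   setup_10g enp1s0f4
```

If `ip route get` names the 1Gb interface instead, traffic for 192.168.2.108 is not leaving through the DAC at all, which would be worth ruling out before suspecting the card.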
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Bad cards do happen. These cards run hot, and a common mistake is to run them without adequate airflow. You haven't provided any context about hardware in which to frame your question. If you had a commercially purchased 2U from Dell, Supermicro, or HP, I would expect sufficient airflow to exist in the chassis. If you have an attempt to build a "quiet NAS" with only gamer-grade experience in building PCs, I would expect you might have toasted (past tense) or be toasting (current tense) your cards if you haven't specifically allowed for cooling of the network card. Other options exist too.
 

Hatfriend

Cadet
Joined
Nov 27, 2021
Messages
7
I completely forgot to add the hardware I'm using.

TrueNAS Server
Fractal Node 304
ASRock Rack C2750D4I
32GB ECC RAM
6x3TB WD Red in RAIDZ2

Linux Server
Modified Micro ATX case
ASRock B450 Pro4
AMD Ryzen 3900
64GB RAM

For the Linux Server I have a 120mm intake fan blowing across a small graphics card and the NIC and the heat sink is measuring 50C. For the TrueNAS Server, I set it up so it is quiet but the fan noise is still noticeable. There is an intake next to the NIC but no fan is directly blowing on it and the heat sink temp for that one is 55C. The T520-CR spec sheet said the operating temp is 0C-55C so I didn't think there would be an issue especially in the installation phase. Do you think overheating is the issue?
 

Hatfriend

Cadet
Joined
Nov 27, 2021
Messages
7
So I took your advice and added fans next to both cards. Now they are operating at 45C and 48C. I was hoping that maybe the cards weren't working because they were on the upper edge of the operating temperature range, but that wasn't the case. I'm still seeing the same problem where ping works but data transfer doesn't. Do you have any suggestions on what I could try to either fix the problem or test the cards to see if they are working properly?

Thanks for your help
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I think you'll need to do more to characterize your problem.

You say "ping" works. Does it work in both directions? Does it work on both networks? That's four separate tests.

Does "ping" with 1500 byte size work?

Have you done anything like trying to configure jumbo frames that can often hose up a network? (just disable jumbo if so)

What happens if you enable ssh on the NAS and try to ssh to it from the client? On each network?

Do iperf3 tests work? (this is a complex mess of possible tests)

I'm sorta concerned about the "had problems installing the driver on Linux" thing so have you tried running some other OS, maybe from a thumb drive, where the driver installs cleanly?
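
A note on the "1500 byte" ping test, sketched under some assumptions: with an interface MTU of 1500, a plain `ping -s 1500` is larger than one frame once the IPv4 and ICMP headers are added, so it is fragmented and can still succeed. Setting Don't Fragment and using a payload of MTU minus 28 tests a single full-size frame instead. The helper names below are hypothetical, and the FreeBSD variant assumes the NAS runs the FreeBSD-based TrueNAS CORE.

```shell
#!/bin/sh
# ICMP echo payload that exactly fills one frame at a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Full-size, unfragmented ping from the Linux side (iputils ping).
# -M do sets Don't Fragment, so an oversized packet fails instead of
# being split. Second argument is the MTU, defaulting to 1500.
ping_df_linux() {
    ping -c 3 -M do -s "$(icmp_payload_for_mtu "${2:-1500}")" "$1"
}

# Same test from a FreeBSD-based NAS, where -D sets Don't Fragment.
ping_df_freebsd() {
    ping -c 3 -D -s "$(icmp_payload_for_mtu "${2:-1500}")" "$1"
}

# Usage (not executed here):
#   ping_df_linux 192.168.2.108      # run on the Linux server
#   ping_df_freebsd 192.168.2.100    # run on the NAS
```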
 

Hatfriend

Cadet
Joined
Nov 27, 2021
Messages
7
You say "ping" works. Does it work in both directions? Does it work on both networks? That's four separate tests.
192.168.1.100 -> 192.168.1.108 Success
192.168.1.100 <- 192.168.1.108 Success
192.168.2.100 -> 192.168.2.108 Success
192.168.2.100 <- 192.168.2.108 Success

Does "ping" with 1500 byte size work?
192.168.1.100 -> 192.168.1.108 Success
192.168.1.100 <- 192.168.1.108 Success
192.168.2.100 -> 192.168.2.108 Success
192.168.2.100 <- 192.168.2.108 Success
I also tried 5000 and 10000 with success for all four configurations

Have you done anything like trying to configure jumbo frames that can often hose up a network? (just disable jumbo if so)
No. I was planning on just getting the system working reliably before I decided to add jumbo frames. It didn't seem like it would give me much of a speed boost, so I wasn't too interested in adding it. Both 10Gb cards indicate mtu 1500.

What happens if you enable ssh on the NAS and try to ssh to it from the client? On each network?
192.168.1.100 -> 192.168.1.108 Success
192.168.2.100 -> 192.168.2.108 Fail (I could not ssh NAS from Linux through 10Gb link)
Windows PC -> 192.168.1.108 Success
Windows PC -> 192.168.1.100 Success

Do iperf3 tests work? (this is a complex mess of possible tests)
I ran the command "iperf3 -c IP" in the configurations below. I tried a variety of other options but found nothing that worked.
192.168.1.100 -> 192.168.1.108 Success
192.168.1.100 <- 192.168.1.108 Success
192.168.2.100 -> 192.168.2.108 Fail
192.168.2.100 <- 192.168.2.108 Fail

I'm sorta concerned about the "had problems installing the driver on Linux" thing so have you tried running some other OS, maybe from a thumb drive, where the driver installs cleanly?
Initially I tried installing the entire Chelsio Unified Driver package on Linux Server. The iSCSI driver install kept failing so I resorted to just installing the NIC/TOE driver. I'm not familiar with the Unified Driver, so I'm not sure if that was expected or something to be concerned about.
After the driver installation, I wasn't able to see the 10Gb network interface in ifconfig, so I swapped the two cards. I then reinstalled the NIC/TOE driver on the Linux Server and then the interface appeared in ifconfig.

I've contemplated swapping the network cards again to see if my original problems reemerge but won't be able to shut the Linux server down for a week due to an urgent job.
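
A rough sketch of how the driver state on the Linux server could be inspected without shutting it down. `cxgb4` is the Linux NIC driver for these cards; treating `toecore` and `t4_tom` as the TOE modules installed by the vendor package is an assumption here, and `loaded_chelsio_modules` is a hypothetical helper name.

```shell
#!/bin/sh
# Filter `lsmod` output (on stdin) down to the Chelsio-related modules.
# cxgb4 is the NIC driver; toecore and t4_tom are assumed to be the TOE
# modules that the vendor package adds.
loaded_chelsio_modules() {
    awk '$1 ~ /^(cxgb4|toecore|t4_tom)$/ { print $1 }'
}

# Usage (not executed here; IFACE is a placeholder for the 10Gb port):
#   lsmod | loaded_chelsio_modules
#   ethtool -i "$IFACE"          # driver name, version, firmware
#   dmesg | grep -i cxgb4        # probe and firmware-load messages
#   ip -br link show             # is the port listed, and is it UP?
```

Seeing only `cxgb4` would suggest the plain NIC path is in use; seeing the TOE modules as well would mean TCP connections may be handled by the offload engine rather than the kernel stack.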
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The success of 1500 byte pings but the failure of iperf3, SSH, and SMB connections, all of which are TCP connections, make me wonder if this is some sort of TCP offload damage on the Linux machine.

Can you retry the iperf3 tests on the 10G link, but add the "--udp" flag?
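
One way to probe the offload theory on the Linux machine, as a sketch rather than a confirmed fix: list which offload features the interface reports as enabled, then temporarily switch the common ones off with `ethtool -K` and retry the TCP tests. `enabled_offloads` is a hypothetical helper and `IFACE` is a placeholder for the 10Gb interface name.

```shell
#!/bin/sh
# Read `ethtool -k` output on stdin and print the features reported as "on".
# Sub-features are indented in that output, so leading whitespace is stripped.
enabled_offloads() {
    awk -F': ' '$2 ~ /^on/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# Usage (not executed here; run as root):
#   ethtool -k "$IFACE" | enabled_offloads
#   ethtool -K "$IFACE" tso off gso off gro off   # not persistent across reboots
#   iperf3 -c 192.168.2.108                       # retry the failing test
```

Note that these `ethtool` switches cover stateless offloads only. If a full TCP offload module from the vendor package is loaded, it would have to be dealt with separately.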
 

Hatfriend

Cadet
Joined
Nov 27, 2021
Messages
7
I ran the following on the 10G link in both directions and received no response from either server. The same command on the 1G link worked fine.

Code:
iperf3 -c 192.168.2.108 --udp
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Then I think what you should do is to toss another install of FreeNAS/TrueNAS on a thumb drive, temporarily boot it on the Linux server, and see if IP connectivity can be established. The driver in FreeNAS/TrueNAS is a known quantity (known to work).

I'm not really sure what else to suggest. Something's obviously broken. My normal technique here would be to simply start replacing bits one at a time until something clears the logjam. If you don't have spare hardware bits to do that with, I understand that this is harder to do. I suppose there's a possibility that it's something like a firewall issue with the Linux host, but I don't know what to suggest for that.

If this was an Intel card, I'd say it was likely a fake card, but I haven't heard of fake Chelsios for reasons I outline elsewhere.

I'm definitely interested in figuring this out, but I'm running out of good/obvious ideas.
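
On the firewall possibility, a minimal sketch of what could be checked on the Linux host. Working ping with failing TCP is at least consistent with a rule set that accepts ICMP everywhere but only accepts other traffic on the 1Gb interface, though nothing in this thread confirms that. `drop_policy_chains` is a hypothetical helper that reads `iptables -S` output.

```shell
#!/bin/sh
# Read `iptables -S` output on stdin and print the chains whose default
# policy is DROP (lines of the form "-P <chain> DROP").
drop_policy_chains() {
    awk '$1 == "-P" && $3 == "DROP" { print $2 }'
}

# Usage (not executed here; run as root):
#   iptables -S | drop_policy_chains
#   iptables -L INPUT -n -v      # per-rule packet counters and interface matches
#   nft list ruleset             # on hosts that use nftables instead
```

If INPUT or OUTPUT shows up with a DROP policy, the next step would be looking for accept rules that name only the 1Gb interface or the 192.168.1.0/24 subnet.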
 

Hatfriend

Cadet
Joined
Nov 27, 2021
Messages
7
So we've made some progress. I booted TrueNAS on the Linux server and was able to SSH into the NAS. Both network cards are functioning properly, so your theory that it could be an issue with Linux was correct. What that issue might be is still up in the air. I will continue trying to figure this out, but I really appreciate your help up to this point. If I'm able to solve the issue, I'll post the solution here. Thanks a lot for your help.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
No problemo. I've been there myself. I spent a little extra time thinking about this one because I understand the value a fresh pair of experienced and cynical eyes can bring to solving this sort of problem.
 