Network drops under load - AsRock C2750D4i

Status
Not open for further replies.

rabiat

Dabbler
Joined
Oct 4, 2011
Messages
19
Hi,

I'm having a really weird issue with my NAS currently running FreeNAS 9.2.1.7 (had the issue in 9.2.1.6 as well) and would really appreciate some assistance as I'm running out of ideas.

The problem is that the NAS becomes unreachable from the LAN when under load (transfers via CIFS and/or NFS). The issue can be reproduced at any time by transferring a large amount of data to the NAS, and it usually takes only a few minutes until network connectivity fails. When I check my switch, the ports are shown as UP. From the KVM console I can sometimes ping some of the other nodes on my LAN (not all of them, only some), which is extremely weird. In most cases pinging other nodes on the same subnet doesn't work at all and outputs "No route to host".

* I have ruled out the switch (Zyxel GS1900-24E) and cables by connecting a workstation directly to the NAS with a different cable. The issue remains.

* I have tried the second interface on the motherboard but it also fails. I have tested link aggregation with LACP, and both NICs fail at the same time.

* I installed nas4free 9.2.0.1.972 on a different USB stick just to try another similar OS. I didn't manage to reproduce this issue in nas4free; it was 100% stable during load testing.

I compared the versions of the igb driver:

nas4free 9.2.0.1.972:
dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.10

FreeNAS 9.2.1.7:
dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.4.0

* The FreeNAS logs show nothing when the network fails.

* Running /etc/rc.d/netif restart resolves the issue but the network will fail again after a while under traffic load.
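
As a stopgap while the root cause is unknown, the manual restart described above could be automated. The following is a minimal Python sketch and not something anyone in this thread actually ran: it pings a few LAN hosts and runs /etc/rc.d/netif restart only when none of them answers. The function names, the host list, and the polling interval are hypothetical, and the ping timeout flag (-t) is assumed to be the FreeBSD one.

```python
import subprocess


def host_reachable(host, runner=subprocess.run):
    """Return True if a single ping to `host` succeeds."""
    # -c 1: send one packet; -t 2: give up after 2 seconds (FreeBSD ping semantics).
    result = runner(
        ["ping", "-c", "1", "-t", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart_netif():
    """Run the same command the original post uses to recover the network."""
    subprocess.run(["/etc/rc.d/netif", "restart"], check=False)


def recover_if_down(hosts, is_reachable=host_reachable, restart=restart_netif):
    """Restart networking only when none of `hosts` answers.

    Returns True if a restart was triggered, False otherwise.
    """
    if any(is_reachable(h) for h in hosts):
        return False
    restart()
    return True


# Hypothetical usage (addresses are placeholders, run as root):
#   import time
#   while True:
#       recover_if_down(["192.168.1.1", "192.168.1.10"])
#       time.sleep(30)
```

Checking several hosts rather than one matters here because the original post notes that some LAN nodes occasionally still answer pings after the failure; requiring all of them to be unreachable avoids restarting on a single host being down. This only hides the symptom and does not address whatever is making the NICs hang.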

The hardware specs:


AsRock C2750D4i
16GB Kingston ECC
IBM m1015 flashed with IT firmware.
8x4TB WD Red encrypted raidz2 pool. Running single 2TB WD Green during testing though.


ifconfig before failure:
Code:
[root@freenas] /# ifconfig igb0
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
    <...>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active


ifconfig after failure:
Code:
[root@freenas] /# ifconfig igb0
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
    <...>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active



Anyone else stumbled upon a similar problem?

Thanks in advance!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
That's for the 10Gb Intel card, not the 1Gb.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Try removing the M1015. I've had some networking issues with the C2750 that don't actually seem to have much to do with the C2750 but rather maybe some interop issues with older switch gear and the new Intel chipset. That was showing up as unexplained poor performance.

I haven't done much with putting a SAS HBA/RAID in the C2750 - definitely want to though to see how it handles abuse as a SLOG device.
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
That's for the 10Gb Intel card, not the 1Gb.

If you read the discussion on the FreeBSD kernel list, the problems they fixed with mbufs affect both Intel drivers, 1g and 10g.

The "under load" and "restart fixes it" smells to me like he's hitting the same problem.
It's easy enough to test for and there's an errata kernel to try, so the worst outcome is that he swaps the kernel and it doesn't fix his issue.
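
The post above does not say how to test for the mbuf problem. One plausible check, offered here as an assumption rather than as the test c32767a had in mind, is to look at the "requests for mbufs denied" line that netstat -m prints on FreeBSD and see whether any of its counters is non-zero after the network has failed. A minimal Python sketch of that parsing step (the function names are hypothetical):

```python
import re

# Matches the FreeBSD `netstat -m` line of the form:
#   0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
DENIED_RE = re.compile(r"^(\d+)/(\d+)/(\d+) requests for mbufs denied", re.MULTILINE)


def denied_mbuf_requests(netstat_m_output):
    """Return the (mbufs, clusters, mbuf+clusters) denied counters.

    Returns None if the expected line is not present in the output.
    """
    match = DENIED_RE.search(netstat_m_output)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def mbufs_exhausted(netstat_m_output):
    """Return True if any mbuf request has been denied."""
    counts = denied_mbuf_requests(netstat_m_output)
    return counts is not None and any(count > 0 for count in counts)


# Hypothetical usage on the NAS itself:
#   import subprocess
#   output = subprocess.run(["netstat", "-m"], capture_output=True, text=True).stdout
#   print(mbufs_exhausted(output))
```

All-zero counters would not rule the mbuf issue out on their own, but non-zero counters after a hang would make it a much stronger suspect.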
 

rabiat

Dabbler
Joined
Oct 4, 2011
Messages
19
Thanks for the replies guys!
I got some help from "jkh" on the staff, who said "There are known firmware issues with some serial # ranges of the ASRock motherboards that cause this exact symptom." and pointed me to an update package. I tried the update but the issue remains.

I do have another box with a similar setup which is currently running ESXi 5.5 without any issues. It's connected to the same Zyxel switch as the NAS, using link aggregation.

Specs:

AsRock C2750D4i
32GB Corsair non-ECC
IBM m1015 flashed with IT firmware.

My next step will be to boot FreeNAS 9.2.1.7 on that box and perform the same tests.
 

rabiat

Dabbler
Joined
Oct 4, 2011
Messages
19
It seems the issue is with the motherboard, so it looks like an RMA is the way to go now. The other C2750D4I I have works without any issues.

The first board must be from an early batch or something. I can't even get the serial number using dmidecode in the shell.
 

AlainD

Contributor
Joined
Apr 7, 2013
Messages
145
Disturbing, but the C2750D4I is a recent MB/CPU/company. Did you contact ASRock Rack Inc.?
 

rabiat

Dabbler
Joined
Oct 4, 2011
Messages
19
Disturbing, but the C2750D4I is a recent MB/CPU/company. Did you contact ASRock Rack Inc.?

No, I didn't. I tried to contact ASRock support a while ago regarding another issue but didn't receive any response at all, so it felt pointless to do it this time.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It seems the issue is with the motherboard, so it looks like an RMA is the way to go now. The other C2750D4I I have works without any issues.

The first board must be from an early batch or something. I can't even get the serial number using dmidecode in the shell.

What _do_ you get?

Code:
Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: ASRock
        Product Name: C2750D4I
        Version:
        Serial Number: E8R-3B000300289
        Asset Tag:
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis:
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0

 

rabiat

Dabbler
Joined
Oct 4, 2011
Messages
19
I get the following:

Code:
Base Board Information
        Manufacturer: ASRock
        Product Name: C2750D4I
        Version:
        Serial Number:
        Asset Tag:
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis:
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0
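
For anyone comparing several boards, the serial-number check shown in these dmidecode outputs can be scripted. The following is a minimal Python sketch, assuming the text layout seen in this thread (a "Base Board Information" heading followed by indented "Key: value" lines); the function name is hypothetical.

```python
def baseboard_serial(dmidecode_output):
    """Extract the serial number from the Base Board section of dmidecode text.

    Returns the serial string, "" if the field is present but blank,
    or None if no Base Board section (or no serial field) is found.
    """
    in_section = False
    for line in dmidecode_output.splitlines():
        stripped = line.strip()
        if stripped == "Base Board Information":
            in_section = True
            continue
        if in_section:
            # A new "Handle ..." line marks the start of the next DMI structure.
            if stripped.startswith("Handle "):
                break
            if stripped.startswith("Serial Number:"):
                return stripped.split(":", 1)[1].strip()
    return None


# Hypothetical usage (dmidecode normally needs root):
#   import subprocess
#   text = subprocess.run(["dmidecode", "-t", "baseboard"],
#                         capture_output=True, text=True).stdout
#   print(repr(baseboard_serial(text)))
```

Fed the two outputs in this thread, it would return the serial for jgreco's board and an empty string for the failing board, which makes the blank field easy to spot.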
 

pontusborg

Cadet
Joined
May 28, 2013
Messages
2
I have also just realized that I have exactly the same problem.
Under high NFS load the Ethernet controllers hang, about once every 5-10 minutes.
The same FreeNAS version on a Supermicro board (A1SAi) with the same CPU has no problems at all.

My magic numbers:
Code:
Base Board Information
        Manufacturer: ASRock
        Product Name: C2750D4I
        Version:
        Serial Number: E8R-41000200048
        Asset Tag:
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis:
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0
 

pontusborg

Cadet
Joined
May 28, 2013
Messages
2
I just got a reply from ASRock support. Very fast reply.
There are some issues with the IPMI sharing the Intel ports. The solution is to go into the IPMI (Megarac SP), select Configure->Network->Network Bond, enable it and choose eth1 (??!!). This will make the IPMI leave the Intel NICs alone and only work on the dedicated network interface. The settings change makes no sense at all, but it solves the problem.

Just as a note: I am running both ethernet ports bonded in FreeNAS.
 

rabiat

Dabbler
Joined
Oct 4, 2011
Messages
19
That was the first thing I did when I built my NAS, i.e. separate the IPMI/KVM thingamajig from the Intel NICs and run it on the dedicated port.
Maybe this solves the problem for some, but not for me :(
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I know I found it rather disconcerting that the IPMI showed up shared despite having a dedicated port, probably my biggest complaint about the C2750's IPMI implementation (which is in most other ways quite nice).
 

rabiat

Dabbler
Joined
Oct 4, 2011
Messages
19
I totally agree!
I also have another "issue", not really related to this thread, but maybe someone has a useful tip. I can't get the KVM console feature to work on Windows with Oracle's Java runtime. I've tried adding the site to "trusted sites" in the Java configuration, changing the security level, and using older versions of Java, but nothing works. I have this problem with both of my C2750D4I boards.

The last thing I tried was installing a CentOS desktop, which comes with "IcedTea web start", and that works perfectly with the KVM console.

Has anyone managed to get this running on Windows 8.1 with a recent version of the Java runtime? If so, which version are you running and what settings do you use?
 

panz

Guru
Joined
May 24, 2013
Messages
556
I have the same problem with a Supermicro X9 and a laptop with Win7 Pro 64-bit. For some weird reason it can't run the Java IPMI program by just double-clicking on it. So I downloaded the IPMI package from Supermicro and used the .bat file provided. Be sure to edit it to point to your Java program. Use this video as a reference (the procedure for IPMI is at the end of the video):

http://youtu.be/IZJRzIq6GuQ
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
No trouble here but I'm kind of used to the whole idiotic process of getting Java to work with $randomcrap.

Windows 8.1, Firefox 31, Java 7u65. You have to configure Java to accept your server as a secure site, and you have to configure Firefox to allow popups. Then you have to configure your antivirus to allow the stupid Java download. There's probably other stuff. Since I didn't specifically set this up on a fresh box to hook up to that specific device I don't have exact steps for you.
 