Troubleshooting odd network/speed problems - general approach?

paulinventome

Explorer
Joined
May 18, 2015
Messages
62
So been running TrueNAS for a while. Had some odd issues along the way but mostly sorted.

But now I see periods where access suddenly becomes super slow, or I start seeing failures to read or write, usually from a Windows machine. I will see great performance up until that point. If I shut everything down and reboot the TrueNAS server, the performance returns.

SMART tests and scrubbing don't bring up any errors.

But my questions really are about how on earth do I begin to troubleshoot something like this. I feel that it might be network based, or permissions or something going on. I am 99% sure it's around the TrueNAS server.

Is there a network or general log file I can see? Or can I enable some logging and diagnostics to hopefully get some feedback from TrueNAS?

Kindest
Paul
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Is there a network or general log file I can see? Or can I enable some logging and diagnostics to hopefully get some feedback from TrueNAS?
You can start by looking in /var/log/messages.
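
A minimal sketch of how one might skim that file for likely trouble spots rather than reading it end to end. The keyword list is an assumption chosen for illustration, not a list of messages TrueNAS is known to emit, and the function names are hypothetical.

```python
import re

# Hypothetical keywords; adjust to whatever your NIC driver and services actually log.
SUSPECT = re.compile(r"(error|timeout|link state|watchdog|reset)", re.IGNORECASE)


def suspicious_lines(lines):
    """Return the lines that contain any of the SUSPECT keywords."""
    return [line.rstrip("\n") for line in lines if SUSPECT.search(line)]


def scan_log(path="/var/log/messages"):
    """Read a log file and return its suspicious lines."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as fh:
        return suspicious_lines(fh)
```

Calling `scan_log()` on the server and comparing the timestamps of the hits against the times the transfers stalled is one way to see whether the log says anything relevant at all.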
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So been running TrueNAS for a while. Had some odd issues along the way but mostly sorted.

Is there a network or general log file I can see? Or can I enable some logging and diagnostics to hopefully get some feedback from TrueNAS?

Welcome to the forums.

Sorry to hear you're having trouble. There aren't any "network" or "log" files that are going to do high-level debugging work for you; you need to avail yourself of tools such as ping and iperf3 to provide some baseline stuff.

I would also note that you're doing yourself a disservice when you describe your server with such generalities as "Xeon build X99 board" and "10Gbe network"; these are almost entirely unhelpful. Your "10Gbe network" could be anything from an Aquantia RJ45 PoS to a Chelsio high end SFP+ card; these will differ greatly in their performance. Please consider checking out the Forum Rules, conveniently located at the top of every page in red, for some guidance on how to generate a useful problem description and outline of your hardware.

Since you're asking about network, you also need to describe your network, such as what switches you're using and how you've got things hooked up. Also describing your client machine would be beneficial.
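
As a rough illustration of the baseline idea, the sketch below reads the report from an iperf3 TCP run made with `--json` and compares it with a known-good figure. The helper names and the 50% tolerance are assumptions for illustration; only the `end.sum_received.bits_per_second` field comes from iperf3's JSON report.

```python
import json


def throughput_gbps(iperf3_json_text):
    """Extract receiver-side throughput in Gbit/s from an iperf3 JSON report (TCP test)."""
    report = json.loads(iperf3_json_text)
    bits_per_second = report["end"]["sum_received"]["bits_per_second"]
    return bits_per_second / 1e9


def looks_degraded(current_gbps, baseline_gbps, tolerance=0.5):
    """True when the current run falls below tolerance * baseline (tolerance is arbitrary)."""
    return current_gbps < baseline_gbps * tolerance
```

One way to use it: run `iperf3 -s` on the server and `iperf3 -c <server> --json` on a client while things are working, keep that number as the baseline, then repeat the run when transfers stall. If the raw network figure is still healthy during a stall, the problem is more likely in storage or the sharing layer than in the network.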
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Welcome to the forums.
He's not exactly new lol. Post count in the double digits (54) and join date of 2015!!!

That being said, the lack of information in the post would certainly lead one to believe it's a newcomer.
 

paulinventome

Explorer
Joined
May 18, 2015
Messages
62
Sorry about the signature, just updated and I'll add some more once I've dug out models.

This has been working fine. Just today it's playing up to the point where I cannot copy anything to any of the pools now, even after shutdowns and restarts. I know there is no easy way to track this down and it's going to be trial and error. But I was wondering if I am missing some logs, and /var/log has some that I didn't know about. Whether they contain anything useful I am not sure yet.

So the server is running one of the 'standard' Intel 10GbE cards (X540?) and my workstation has the same card. I have Macs running 5GbE and other machines running various adaptors. The network is going through a TP-Link 5-port 10GbE switch and there's a Netgear 10GbE switch in there as well.

I have to track down what has changed today and my question is whether there are broad guidelines.

I am pretty sure it's on the server side because I have more than one machine not being able to write to it, different machines and network cards.

Kindest
Paul
 

paulinventome

Explorer
Joined
May 18, 2015
Messages
62
To be fair to me, I may have joined in 2015 but I only built a TrueNAS system last year. I must have registered while thinking about it in 2015.
Point taken on the info, which I am correcting!

thanks
Paul
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Just today it's playing up to the point where I cannot copy anything to any of the pools now.
Define what "cannot copy anything to any of the pools" means, 'cause it's kinda vague. You failed to connect to SMB? Error message/alert? I mean, there must be SOMETHING that made you think that you "cannot copy anything" to it. Be more specific: what were you trying to do, what were you expecting, and what actually happened instead? This is how you want to frame your issues when you want meaningful help. Help us HELP YOU.
 

paulinventome

Explorer
Joined
May 18, 2015
Messages
62
So I can connect to the server from any machine. I can start copying a file, and it will transfer a small part and then just hang, sometimes coming back with a Windows / macOS error. I see the same behaviour from multiple machines. This applies to all of the shares, whether NVMe, SSD or spinning drives.

I can't see any errors on the server itself, but this was part of my effort to work out where the log files are located.

Kindest
Paul
 

paulinventome

Explorer
Joined
May 18, 2015
Messages
62
So just to add to this.

I think it might actually be heat related. I shut everything off last night, let it all cool down and then tried again this morning, and I am back to seeing 800MB/s across all the volumes. I had been running the server with the side off because I am in the process of changing a 4-port HBA to an 8-port one. I noticed yesterday how physically hot everything was. So it could be related to that, in which case I will need to check my fans and airflow, and keep the side on.

I have a bunch of PCI cards (HBA, network and the NVMe RAID), none of which have fans (I'm also trying to keep the server quiet), but I have a lot of 140mm fans pulling the air around. It could be that having the side off just destroyed the airflow.

So something to monitor.
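
A minimal sketch of what that monitoring could look like. It assumes a FreeBSD-based TrueNAS CORE system where `sysctl dev.cpu` prints lines such as `dev.cpu.0.temperature: 45.0C`; the function names and the 80 C limit are hypothetical. CPU temperature is only a proxy here, since passively cooled HBAs and network cards often expose no sensor of their own.

```python
import re

# Assumed line format, as printed by FreeBSD's `sysctl dev.cpu`:
#   dev.cpu.0.temperature: 45.0C
TEMP_LINE = re.compile(r"^dev\.cpu\.(\d+)\.temperature:\s*([\d.]+)C")


def parse_cpu_temps(sysctl_output):
    """Map CPU core index to temperature in degrees Celsius."""
    temps = {}
    for line in sysctl_output.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            temps[int(match.group(1))] = float(match.group(2))
    return temps


def hot_cores(temps, limit_c=80.0):
    """Return the sorted core indices at or above limit_c (an arbitrary example threshold)."""
    return sorted(core for core, temp in temps.items() if temp >= limit_c)
```

Logging these readings every few minutes alongside the time of each stalled transfer would show whether the slowdowns really line up with the case heating up.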

Kindest
Paul
 