Reporting on 11.2 broken

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
We have two production servers, both upgraded recently to 11.2-U1. Today I noticed that reporting on the main system is completely broken (no graphs whatsoever). On the hot standby system, the default graphs work but the whole UI hangs when I try to select additional metrics. The additional metric does eventually come up but it's blank. Anyone else seeing this?
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Also, all the metrics on the dashboard are broken as well. They never load.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
We have two production servers, both upgraded recently to 11.2-U1. Today I noticed that reporting on the main system is completely broken (no graphs whatsoever).
What kind of boot media? More hardware details would help.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211

What kind of boot media? More hardware details would help.

I'm talking about one of the primary production systems BTW. Looks like the graphs just stopped working recently for some reason.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
OS disk: 2 x Kingston DataTraveler 3.0 64GB mirrored
I actually have three or so of those drives myself. I like them for a lot of things. I have some that I use to install Windows 10 from when I need to commission new workstations at the office, but that is a temporary boot device. What I have seen with many of the forum members who chose USB 3.0 drives for boot media is that the drives tend to fail prematurely. The symptoms you describe sound like failed boot media, and if these are all the same age, with about the same amount of wear, they may well have all failed at about the same time. Another thing I have noticed is that when USB boot media fails, you don't get any kind of warning or notification. It is just kind of catastrophic. This is the reason that we don't suggest using USB boot media any more, and have not for over a year, maybe even two.
See:
Hardware Requirements
http://www.freenas.org/hardware-requirements/
Capture.PNG
The good news is that it should be as simple as replacing the boot media, reinstalling, and reloading the config.db file, but the server will be down for a short time while you do that.
I suggest switching to an SSD for your boot media, or perhaps a SATA DOM.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Hrmm - the docs say USB sticks are fine:

The FreeNAS® operating system is installed to at least one device that is separate from the storage disks. The device can be a SSD, USB memory stick...

In fact, the screenshot you posted says the same thing and that SSDs are only recommended. So I guess I'm confused. If USB boot devices aren't recommended any more then why are they listed all over the place in the docs?

I'm also skeptical about not having any indication of failure. These boot devices are regularly scrubbed without issue. I see no error logs of any kind. How could they just silently start failing without the system detecting that?

I think the more likely explanation for the problems I've been seeing recently (and posting here about) has to do with the upgrade to 11.2. I've never seen these kinds of issues until I upgraded.

If anyone has any input on how to troubleshoot this I'd appreciate it. I'll dig around myself and maybe I'll just have to put in a ticket.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Oh and again, we don't have a choice about hardware. This is what we have to work with. I'd have never used USB boot drives if we had a choice or if the docs had said not to.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080

I'm also skeptical about not having any indication of failure. These boot devices are regularly scrubbed without issue. I see no error logs of any kind. How could they just silently start failing without the system detecting that?
There are no tests for USB memory sticks to tell you when they start to fail. That is one of the reasons for NOT using them. I am not going to discuss it with you. You came asking for help. If you refuse to listen, figure it out yourself.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
For anyone who might have this issue in the future, I simply had to start collectd which wasn't running for whatever reason on this particular server (service collectd start).
 

Omega

Dabbler
Joined
Dec 12, 2015
Messages
15
Thanks, that was helpful, just had a similar issue, collectd wasn't running, thus all metrics non-existent. Looks like my collectd hit

Code:
collectd[3698]: nanosleep failed: No child processes


around the time it stopped collecting metrics. Anyhow, not a big deal, but is there a way to auto-restart services?
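
On the auto-restart question, one option is a small watchdog run from cron. The following is only a minimal sketch, assuming `service collectd status` exits non-zero when the daemon is stopped (the usual rc.subr behavior, not verified on every FreeNAS release). The function name `ensure_running` and the `SERVICE_CMD` override are hypothetical, introduced here for illustration. It works around the crash; it does not fix whatever makes collectd exit.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: start a service if its rc script reports it stopped.
# SERVICE_CMD can be overridden so the logic can be tried without a real service.
SERVICE_CMD="${SERVICE_CMD:-service}"

ensure_running() {
    name="$1"
    # Assumption: "<service cmd> <name> status" exits 0 when running, non-zero otherwise.
    if "$SERVICE_CMD" "$name" status >/dev/null 2>&1; then
        echo "$name: running"
    else
        echo "$name: not running, starting"
        "$SERVICE_CMD" "$name" start
    fi
}

# Example call (left commented out; enable it in the script that cron runs):
# ensure_running collectd
```

Saved as a script with the final call enabled and scheduled every few minutes as a cron job, this would bring collectd back after it dies, at the cost of a gap in the graphs for the time it was down.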
 

r7365326dx

Cadet
Joined
May 7, 2019
Messages
2
Definitely some kind of issue that is causing collectd to die and not restart.

This exact issue has occurred for me multiple times.

collectd appears to stop running after this is logged:

May 2 01:42:06 nas.local collectd[3044]: nanosleep failed: No child processes


It can be started back up with "service collectd start".


Currently running: FreeNAS-11.2-U3
 

r7365326dx

Cadet
Joined
May 7, 2019
Messages
2
This happened again in the past week. I noticed that the FreeNAS dashboard was not showing any stats, and the reporting section was also blank.

Confirming that collectd had stopped running:
root@nas[~]# ps faux | grep collectd
root@nas[~]#

Same error message in the system logs:
May 13 15:28:49 nas.local collectd[3467]: nanosleep failed: No child processes


Start collectd back up again:
root@nas[~]# service collectd start
Starting collectd.
option = Hostname; value = nas.local;
option = FQDNLookup; value = true;
option = BaseDir; value = /var/db/collectd;
option = PIDFile; value = /var/run/collectd.pid;
Created new plugin context.
plugin_load: plugin "aggregation" successfully loaded.
plugin_load: plugin "cpu" successfully loaded.
plugin_load: plugin "cputemp" successfully loaded.
plugin_load: plugin "ctl" successfully loaded.
plugin_load: plugin "df" successfully loaded.
plugin_load: plugin "disk" successfully loaded.
plugin_load: plugin "exec" successfully loaded.
plugin_load: plugin "geom_stat" successfully loaded.
plugin_load: plugin "interface" successfully loaded.
plugin_load: plugin "load" successfully loaded.
plugin_load: plugin "memory" successfully loaded.
plugin_load: plugin "network" successfully loaded.
plugin_load: plugin "processes" successfully loaded.
plugin_load: plugin "python" successfully loaded.
plugin_load: plugin "rrdcached" successfully loaded.
plugin_load: plugin "swap" successfully loaded.
plugin_load: plugin "uptime" successfully loaded.


It is running again:
root 18520 0.0 0.1 79508 33988 - Ss 11:01 0:00.12 /usr/local/sbin/collectd
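
To see how often this has happened, the log check above can be wrapped in a couple of helpers. This is a rough sketch, assuming the messages land in a syslog-format file such as /var/log/messages and always contain the "nanosleep failed" text quoted in this thread. The function names `collectd_crash_lines` and `collectd_crash_times` are made up for illustration.

```shell
#!/bin/sh
# Sketch: list collectd "nanosleep failed" entries from a syslog-format file.
# Defaults to /var/log/messages; pass another file as the first argument.
collectd_crash_lines() {
    logfile="${1:-/var/log/messages}"
    # Matches lines like: "... collectd[3467]: nanosleep failed: No child processes"
    grep -E 'collectd\[[0-9]+\]: nanosleep failed' "$logfile"
}

# Print only the timestamp (first three syslog fields) of each matching line.
collectd_crash_times() {
    collectd_crash_lines "$1" | awk '{ print $1, $2, $3 }'
}

# Example calls (commented out so the file can be sourced safely):
# collectd_crash_times
# collectd_crash_times /var/log/messages
```

Comparing those timestamps with the point where the graphs go blank is a quick way to confirm that a reporting gap is this collectd exit and not something else.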
 

z-factor

Cadet
Joined
Jun 12, 2019
Messages
2
I also observed this behavior in FreeNAS-11.2-U4.1.
Restarting collectd fixed it.
Thanks!
 

Huib

Explorer
Joined
Oct 11, 2016
Messages
96
I had the same issue and restarting collectd seems to do the trick.
I noticed this in both 11.2-U3 and 11.2-U4.1 and did not have it in the 9.10 and 11.1 versions.

I will see if I can replace my usb sticks for an ssd to see if collectd will become more stable after this.
 

z-factor

Cadet
Joined
Jun 12, 2019
Messages
2
I will see if I can replace my usb sticks for an ssd to see if collectd will become more stable after this.

@Huib - I'm running a mirrored SSD root and have observed this problem, so I doubt that SSDs will fix it for you. Unfortunately, it's intermittent, so it's difficult to debug.
 

FreeNASBob

Patron
Joined
Aug 23, 2014
Messages
226
Also suffering from this issue.

Code:
Jun 28 21:09:02 freenas collectd[6103]: nanosleep failed: No child processes


All reporting shows 0 for all parameters and devices. I am using an SSD for the system dataset. FreeNAS 11.2-U4.
 

Jose Baars

Cadet
Joined
Mar 6, 2015
Messages
4
Hi,
Running FreeNAS 11.2-U5.
Saw no reports and found this message in /var/log/messages:

Aug 7 03:35:53 zen collectd[4253]: nanosleep failed: No child processes

Had to start collectd with service collectd start as well to get graphs again.
 

Icey

Cadet
Joined
Dec 13, 2013
Messages
7
This has been broken for several months on my system (FreeNAS-11.2-U4.1); I think it broke sometime around U3. Starting collectd got the graphs generating again. There should probably be a bug filed for this. I don't think it relates to failed USB sticks, though...
 