SOLVED Data for UPS is stale

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
This fix is great! But it won't survive a reboot. I would also like to point out that you need to make sure the usbhid-ups driver actually started. You should get similar results to what I have listed below. I had the driver fail to start once, so i started it manually to debug it but it worked that time. I then started the nut service again and everything started successfully. FYI to start driver manually run '/usr/local/libexec/nut/usbhid-ups -a ups -DDD'

Code:
[root@FreeNAS ~]# service nut start
Network UPS Tools - UPS driver controller 2.7.1
Network UPS Tools - Generic HID driver 0.38 (2.7.1)
USB communication driver 0.32
Using subdriver: CyberPower HID 0.3
libusb_get_interrupt: Unknown error
Starting nut.
Network UPS Tools upsd 2.7.1
fopen /var/db/nut/upsd.pid: No such file or directory
listening on 0.0.0.0 port 3493
Connected to UPS [ups]: usbhid-ups-ups
 
[root@FreeNAS ~]# ps auxw | grep ups
root 7763 0.0 0.0 20340 3820 ?? Is 7:31PM 0:00.00 /usr/local/sbin/upsmon localhost
uucp 7764 0.0 0.0 20340 3932 ?? S 7:31PM 0:00.01 /usr/local/sbin/upsmon localhost
uucp 7788 0.0 0.0 20340 3928 ?? Is 7:31PM 0:00.00 /usr/local/bin/upslog -s ups -l /var/log/ups.log -i 300
uucp 8099 0.0 0.0 14232 2148 ?? Ds 7:38PM 0:00.00 /usr/local/libexec/nut/usbhid-ups -a ups
uucp 8101 0.0 0.0 20356 3856 ?? Ss 7:38PM 0:00.00 /usr/local/sbin/upsd
I did notice that the usbhid-ups driver didn't start one time. I did also notice though, that the services can take a few minutes to start up - or at least until they register in the running processes tab (they didn't all show immediately). I think it may have something to do with a chain reaction of requests that the core ups process starts. But this is only based on my own observations - uspd and upsmon I think, are requested by the core ups / nut service. Am I correct in thinking that this: 'libusb_get_interrupt: Unknown error' is what you don't want to see in that service start runout?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
No that's not the error I'm taking about. What I was referring to was the usbhis-ups driver not starting and causing an error which could also be seen by checking if the drive process is running
 

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
Sorry, being slow. It's late. Re read your post.
Do give the services a good 5 minutes to start though, I've noticed they can stagger.

Sent from my T0LTE using Tapatalk
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
@bloomo, this won't survive a reboot, correct? Have you created a feature request ticket for this to be added to the GUI?
 

Yatti420

Wizard
Joined
Aug 12, 2012
Messages
1,437
Interesting.. Glad to see this has been resolved.. Drive issue? Some1 from NUT should take alook at this for the cyberpower pfc driver..
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Actually, I think this is a FreeNAS problem and has to do with bad choice of settings for the nut configuration.
 

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
@bloomo, this won't survive a reboot, correct? Have you created a feature request ticket for this to be added to the GUI?
@budmannxx You're totally correct. It resets every time. It seems the system overwrites all the nut files every time the machine reboots.
The core ups process, which uses ups.conf gets its amended values (if any) from the GUI auxiliary – Obviously why it survives reboot – because those parameters are written in every time. As far as I can tell (and this is just me jumping to conclusions from reading the .sample files and the nut documentation) the ups.conf, upsd.conf and upsmon.conf all configure their relevant processes – different aspects of the nut service that don't share values in their .confs, which is why I did'nt add the MAXAGE value to the GUI ups aux section, as it would have written into ups.conf.

@cyberjock technically it's not a bad setting as the nut upsd docs seem to think that a 15 second driver response time is fair (default MAXAGE is 15 and that's what is does, but you probably knew that).

What I can't comment on is if this is a CyberPower specific issue or across the board. Also, is this response affected by things like power states? Maybe I'm overthinking it, but I clearly had times in my log that the "stale data" message was more prolific – generally at night when no one was using the system. BUT that said, even during the day the messages were still present, just not on clockwork. Still, I turned off all sleep and power saving features on my drives and it persisted. Maybe more messages at night was just time passing and the messages weren't broken up. Bumping MAXAGE was a clean and definitive fix.

I will create a feature request ticket, but @budmannxx or @cyberdog, would you suggest I request that MAXAGE is bumped or that upsd.conf be editable as aux field? Or this just confuses things?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Actually, I don't know what the maxage default is. I do know that your average UPS responds every 6-9 seconds with newer ones being every 2-3 seconds. So to me 15 seconds is plenty conservative. But now the conflict begins. Because clearly it's *not* conservative enough because you made it 25 and that's working. So some research needs to be done. Either something has recently changed in FreeNAS that has somehow broken things(buffer error or something?) or UPSes have suddenly changed and we haven't adapted to the new conditions.
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
I'm far from an expert on this stuff, but it seems like upsd.conf should be editable via the GUI so we can set aux parameters as needed.

I agree that we should try to figure out the root cause, as @cyberjock stated, but it does also seem like setting MAXAGE to 25 (or whatever, greater than the default value) eliminates the issue with logs being flooded and emails going out all the time. Increasing this timeout value isn't a solution to the problem, but it does seem to adequately address the symptoms.

I should note that my UPS functions fine in terms of shutting down my system after a power loss, so my goal here is to eliminate the frequent COMMBAD log messages and emails.
 

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
I'm far from an expert on this stuff, but it seems like upsd.conf should be editable via the GUI so we can set aux parameters as needed.

I agree that we should try to figure out the root cause, as @cyberjock stated, but it does also seem like setting MAXAGE to 25 (or whatever, greater than the default value) eliminates the issue with logs being flooded and emails going out all the time. Increasing this timeout value isn't a solution to the problem, but it does seem to adequately address the symptoms.

I should note that my UPS functions fine in terms of shutting down my system after a power loss, so my goal here is to eliminate the frequent COMMBAD log messages and emails.
Yeh the emails are a pain... I ended up turning them off all together before I fixed the issue.
The only issue with making upsd.conf editable via the GUI is, does this mean that upsmon.conf should be too? Because if the MAXAGE value doesn't work (getting rid of stale data messages), the next troubleshooting step according to the NUT docs is the DEADTIME value... which resides in upsmon.conf! Haha.

I have no idea how many of the .conf files in totality are regenerated every time the system restarts, is it EVERY SINGLE ONE? If so, the aux field seems like the only option. That said, MAXAGE doesn't have a side effect like DEADTIME, it doesn't affect how much info reaches the system in the event of the UPS going on batt. So maybe just bumping MAXAGE to 25 would be fair. I could do some more testing to see if moving it to 20 works. I mean, how far off 15 is it? I tried 30 and 25 and the latter ended up being fine.
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
I think conf files' aux fields are added to the GUI on an as-needed basis, when users make the request and the devs agree. Couldn't hurt for you to open the ticket for one or both conf files, to get a discussion going with the devs (I don't think they hang out in the forums too much, so the ticket system is recommended).

If you do this, would you mind updating this thread with a link to the ticket?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
HAHA. I literally just added this one too. Shall I remove it? Don't even know if it's in the right place... still learning my way around here.
http://forums.freenas.org/index.php...onf-upsmon-conf-fix-for-ups-stale-data.21044/
edit: Just found the ticket section... let me know if I should remove the above and append it to your ticket as a response.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
leave both, the more visibility the better.
 

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
Another update, I moved the MAXAGE value to 20 to see if it would still silence the messages and it did. So the default of 15 seconds is really just a hair short... I still get one or two random stale data messages creeping in, but they follow no pattern and are brief. One or two, here and there, days apart.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
I changed mine to 20 also and haven't received any stale message in a couple weeks.
 

LoopyJuice

Cadet
Joined
Aug 7, 2014
Messages
7
Sorry to resurrect an old thread but this is what fixed mine, I woke up to 70 emails this morning about BADCOMM, the console just shows stale data every 4 mins or so and upsc shows loss of connection and re-establish in the same second every time.

Edited upsd.conf and put MAXAGE 25, no more errors at all, its a real shame this wont persist through a reboot though!

*edit* forgot to mention it was doing the same when connected to my ReadyNAS Pro but the emails were much less frequent, maybe once a day or so so obviously their NUT implimentation has something > 15 but <25 in the config by default
 

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
Yeh, I was hoping 9.2.1.7 would sort it but no dice. I think the target version for a fix is 9.3 in the bug report.
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
I'm actually seeing a slight uptick in the number of error messages since updating from 9.2.1.6 to 9.2.1.7, with MAXAGE 25. Still way fewer than without the edit to upsd.conf though, so I might increase it to 30 until the real fix is available in 9.3.
 

bloomo

Explorer
Joined
Apr 4, 2014
Messages
58
I'm actually seeing a slight uptick in the number of error messages since updating from 9.2.1.6 to 9.2.1.7, with MAXAGE 25. Still way fewer than without the edit to upsd.conf though, so I might increase it to 30 until the real fix is available in 9.3.
Weird, I actually dropped my maxage to 20 and have had nothing. Which CP are you using?
 
Top