Update causing Fan errors

termehansen

Cadet
Joined
Mar 24, 2023
Messages
4
I agree this must be some weird issue in TrueNAS. Even with my LCR set to 0, I still get errors in notifications.

WARNING
Fan FANA Deasserted Lower Non-recoverable going low : Reading 1300 < Threshold 0 RPM.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
A "deasserted" error is one where a condition has been recovered from. This is generated by the IPMI subsystem.
 

avalon60

Guru
Joined
Jan 15, 2014
Messages
597
I have just replaced FAN 4 with a new 4-pin fan, and I am still getting the same error messages. The new fan is rated at 1800 RPM, but it is only showing as spinning at 450 RPM.
I can't see this being a fan fault, so is it something else?
The errors are not current: when I checked them in TrueNAS, they were dated 5 or 6 days ago.
 

Daryle

Dabbler
Joined
Jan 26, 2017
Messages
13
I am also experiencing the same issue since updating to 13u4.

Before I found this post I figured my fans were on their way out. Two at the same time? Unusual, but not the craziest computer hardware thing I've experienced. They are old, so replacing them wasn't a big deal. However, I am still getting these alerts. While trying to investigate what could be happening I came across this post.

I also have a SuperMicro MB. FAN3 and FAN4 are continuously alerting even with brand new fans. This wasn't an issue prior to my updating to 13u4.
 

Maxburn

Explorer
Joined
Oct 26, 2018
Messages
60
Just fixed my alerting and now receiving these alerts. Guess add me to the list?
 

picklefish

Explorer
Joined
Mar 13, 2016
Messages
62
Just upgraded to TrueNAS-SCALE-22.12.1 from TrueNAS-SCALE-22.12.0 and it's now constantly spamming these warnings off and on, hundreds a minute. The events also have messed-up dates of 2017 even though my server has the correct date.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The events also have messed-up dates of 2017 even though my server has the correct date.

What's the date on the IPMI/BMC? You need to remember you have TWO servers there, and the errors are being generated by the tiny little moron grade IPMI server whose job it is to monitor things like the fan speed, not the big NAS server whose time you probably did remember to set.
 

Maxburn

Explorer
Joined
Oct 26, 2018
Messages
60
Maybe this is more serious than I thought. IPMI agrees and I haven't logged into this in over a year. Unless changes in TrueNAS push down to IPMI?

 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Maybe this is more serious than I thought. IPMI agrees and I haven't logged into this in over a year. Unless changes in TrueNAS push down to IPMI?

So ignoring the time issue for the moment, what sort of fans are you using, and did you adjust the IPMI thresholds for them? The IPMI is clearly interpreting the values it is getting from the BMC for the fans as too low. I suspect iXsystems changed something in the software reporting stack that has caused these to be reported by the NAS in the new update, but if the IPMI is actually generating them, I don't think that's an incorrect behaviour. It may be a NEW behaviour, or might be annoying, but the fix could be to work out the IPMI fan thresholds correctly.
 

picklefish

Explorer
Joined
Mar 13, 2016
Messages
62
The bug is that it's reloading a bunch of old events. I just connected to IPMI and I have newer events from 2020. It's not spamming the same event over and over; it's reloading a ton of old events. It's some sort of bug in the TrueNAS logic: it shows old events and reloads them even if you clear them. I just went to my IPMI and deleted the old events.
 

picklefish

Explorer
Joined
Mar 13, 2016
Messages
62
To fix, short term:
Log in to IPMI, go to the logs, "GET" the logs in IPMI, delete the logs.
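The same cleanup can be sketched from the command line. This is a minimal sketch, assuming `ipmitool` is installed and can reach the local BMC; it relies on the `sel save` and `sel clear` subcommands. The `run` wrapper, the `DRY_RUN` variable, and the backup path are illustrative names that do not come from this thread. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: back up the BMC System Event Log (SEL), then clear it.
# DRY_RUN and run() are hypothetical helpers for illustration only.
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command in dry-run mode; execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_and_clear_sel() {
    # $1: file to save the current SEL entries to (assumed path)
    run ipmitool sel save "$1" && run ipmitool sel clear
}

backup_and_clear_sel /tmp/sel-backup.txt
```

Setting `DRY_RUN=0` in the environment makes the script execute the commands instead of printing them. Saving first keeps a copy of the old events in case they are needed later.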
 

Maxburn

Explorer
Joined
Oct 26, 2018
Messages
60
So ignoring the time issue for the moment, what sort of fans are you using, and did you adjust the IPMI thresholds for them? The IPMI is clearly interpreting the values it is getting from the BMC for the fans as too low. I suspect iXsystems changed something in the software reporting stack that has caused these to be reported by the NAS in the new update, but if the IPMI is actually generating them, I don't think that's an incorrect behaviour. It may be a NEW behaviour, or might be annoying, but the fix could be to work out the IPMI fan thresholds correctly.
The timing is suspicious, that's all; this server has been running for years. Can/does TrueNAS set hardware thresholds for things like fan speed?

Edit: the first FANA alert logged in IPMI was 2/13/2023. I don't remember when I updated TrueNAS, but that time frame sounds about right.

FANA is the CPU fan, and now that I'm home again I poked around a little in the machine and I don't think I see anything wrong. The CPU fan wasn't stopped and doesn't even look that dirty. The CPU cooler on this machine is a huge Arctic heat-pipe unit, and with the small CPU and the way things are laid out it might not even need a fan, so I'm no longer concerned about anything dying. On the optimized fan setting the fan seems to hover around 600-700 RPM. I don't see a lower fan speed alert setting in IPMI at all, so it must be in the BIOS.

Edit2; I found the thresholds in IPMI, they are all 500 for CT and 300 for NR whatever those mean.

CPU fan is now hovering around 700 after I cleaned out a little dust, was 600 I think. Other fans are 600.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Can/does TrueNAS set hardware thresholds for things like fan speed?

No, because how would it have any idea what to set them to? There's a reason that the IPMI is decoupled from the rest of the system. The IPMI is designed to be aware of the platform on which it is running, which should be set by the manufacturer. Since you are essentially the manufacturer of your homebuilt system, and only you know what kind of fans you crammed into it, you're responsible for configuring this sort of stuff. You can indirectly ask the IPMI server to set thresholds from the NAS environment using ipmitool, but the NAS on its own would never have any reliable way to know what numbers to set because it has no idea what you've installed for fans.

Edit2; I found the thresholds in IPMI, they are all 500 for CT and 300 for NR whatever those mean.


"NC" is the "non-critical" threshold (i.e. healthy). "CT" is the "critical" threshold (fan may be failing). "NR" is the "non-recoverable" threshold (i.e. it's dead, Jim).
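To make the three lower thresholds concrete, here is a rough sketch of how a reading compares against them. It is a simplification that ignores hysteresis and is not the BMC's actual logic. The function name `classify_fan` is made up for illustration, and the non-critical value of 700 in the example is hypothetical; 300 and 500 are the NR and CT values reported earlier in the thread.

```shell
#!/bin/sh
# Sketch: classify a fan reading against the lower thresholds.
# Argument order follows the `ipmitool sensor` columns: lnr, lcr, lnc.
classify_fan() {
    reading=$1   # current speed in RPM (integer)
    lnr=$2       # lower non-recoverable threshold
    lcr=$3       # lower critical threshold
    lnc=$4       # lower non-critical threshold
    if [ "$reading" -lt "$lnr" ]; then
        echo "lower non-recoverable"
    elif [ "$reading" -lt "$lcr" ]; then
        echo "lower critical"
    elif [ "$reading" -lt "$lnc" ]; then
        echo "lower non-critical"
    else
        echo "ok"
    fi
}

# A 450 RPM reading with NR=300, CT=500 and an assumed NC=700:
classify_fan 450 300 500 700   # prints "lower critical"
```

With these numbers, a slow-spinning fan reading 450 RPM would sit below the critical threshold even though nothing is wrong with it, which is why the thresholds need to match the installed fans.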
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I have no idea how any of you folks survived the constant ramping up and down of the fans on your servers.

This is how you make the errors go away: https://www.truenas.com/community/resources/how-to-change-ipmi-sensor-thresholds-using-ipmitool.35/

Just upgraded to TrueNAS-SCALE-22.12.1 from TrueNAS-SCALE-22.12.0 and it's now constantly spamming these warnings off and on, hundreds a minute. The events also have messed-up dates of 2017 even though my server has the correct date.

You might have an IPMI bug (or crazy fast fans?).
 

avalon60

Guru
Joined
Jan 15, 2014
Messages
597
I just tried changing unc for one of my fans, but I get an error saying 'unc' is invalid, whatever that means:
ipmitool sensor thresh 'FAN 4' upper unc 1600 ucr 1500 unr 1700

What is the correct syntax?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
ipmitool sensor thresh 'FAN 4' upper 1600 1500 1700

But why are you setting the upper thresholds so low?
 

avalon60

Guru
Joined
Jan 15, 2014
Messages
597
OK, I don't know, as this is a bit over my head.
With a fan that can spin at 1800 RPM, what thresholds should I be using?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well above 1800 RPM (like 20% above at least), since you don't want warnings for normal operation of the fan.
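As a rough illustration of that rule of thumb, the sketch below prints an `ipmitool sensor thresh ... upper` command with the three upper thresholds placed above the fan's rated speed. The function name `upper_thresh_cmd` and the 20/30/40 % margins are assumptions chosen for the example; only "at least 20% above" comes from the advice above. The command is printed, not executed, and the BMC may round the values to its own step size.

```shell
#!/bin/sh
# Sketch: build an ipmitool command that sets a fan's upper thresholds
# a fixed percentage above its rated maximum speed.
# The 20/30/40 % margins are illustrative assumptions.
upper_thresh_cmd() {
    name=$1       # sensor name as shown by `ipmitool sensor`
    max_rpm=$2    # rated maximum fan speed in RPM (integer)
    unc=$(( max_rpm * 120 / 100 ))  # upper non-critical
    ucr=$(( max_rpm * 130 / 100 ))  # upper critical
    unr=$(( max_rpm * 140 / 100 ))  # upper non-recoverable
    printf "ipmitool sensor thresh '%s' upper %d %d %d\n" \
        "$name" "$unc" "$ucr" "$unr"
}

upper_thresh_cmd 'FAN 4' 1800
# prints: ipmitool sensor thresh 'FAN 4' upper 2160 2340 2520
```

The values are passed in ascending order (non-critical, critical, non-recoverable), which is the order the `upper` form of the command expects.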
 

termehansen

Cadet
Joined
Mar 24, 2023
Messages
4
I have now even set all three lower thresholds to 0:
FANA | 200.000 | RPM | ok | 0.000 | 0.000 | 0.000 | 25300.000 | 25400.000 | 25500.000

Yet still my notifications are filled up with:

Fan FANA Asserted Lower Critical going low : Reading 0 < Threshold 0 RPM.


I have physically checked the server: the fan is running completely stable, and there is no dust in sight.
This is really annoying, as it drowns out other notifications. Any suggestions on how to stop this spam from FANA?


running on:
Platform: TRUENAS-MINI-3.0-XL+
Version: TrueNAS-SCALE-22.12.2
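For anyone reading an `ipmitool sensor` row like the FANA one quoted in this post, the sketch below labels its columns. It assumes the usual column order of that output (name, reading, unit, status, then lnr, lcr, lnc, unc, ucr, unr); the function name `label_sensor_row` is made up for illustration.

```shell
#!/bin/sh
# Sketch: label the columns of one `ipmitool sensor` row read from stdin.
# Assumed column order: name | reading | unit | status |
#                       lnr | lcr | lnc | unc | ucr | unr
label_sensor_row() {
    awk -F'|' '{
        # Trim the padding around every field.
        for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        printf "%s reading=%s lnr=%s lcr=%s lnc=%s unc=%s ucr=%s unr=%s\n",
            $1, $2, $5, $6, $7, $8, $9, $10
    }'
}

echo 'FANA | 200.000 | RPM | ok | 0.000 | 0.000 | 0.000 | 25300.000 | 25400.000 | 25500.000' |
    label_sensor_row
```

On a live system the input would come from something like `ipmitool sensor | grep FANA` instead of the `echo`. For the row above, all three lower thresholds read as 0 and the reading is 200 RPM.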
 