
How To: Change IPMI Sensor Thresholds using ipmitool

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
My board (Supermicro X11SCH-F) is reporting an absurd reading for a Noctua NF-A4x10 PWM fan. It's rated for 5000 rpm maximum, but when it spins at (presumably) full speed, the IPMI reports 9500. Does anyone have any idea what the problem might be? I only found this out now; I never figured out why I was seeing fan alerts whenever I logged in to update the firmware or whatever, while everything worked flawlessly.
Now I'm a little concerned the damn board might not be reading any fan's speed properly...
 
Joined
Dec 2, 2015
Messages
730
Please confirm that there is only one fan connected to the board. No CPU fan, no other case fan. If so, I'm equally puzzled.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
@Octopuss Maybe this is a case where you should look into IPMI fan control and define suitable UPPER thresholds rather than, or in addition to, LOWER thresholds?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Do other fans read normally?
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
They do. I mean, the numbers look realistic at least. Whether they are real, I have no idea after seeing this.
It's not terribly important since everything just works, but it's puzzling.

I haven't tried disconnecting everything on a live server though. I might, it's just a home box, it can live for two minutes without cooling, hehe.
 
Joined
Dec 2, 2015
Messages
730
I'm wondering if there is any chance the IPMI data you are looking at is really for a different fan. Maybe try disconnecting that fan for a short test and see which IPMI fan shows as no longer present.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If you do so, prepare for all fans to spin up to 100%.

In any case, the options on the table are:
  1. Fan is sending out too many pulses. Can't really think of a realistic cause for this one...
  2. Pulses are being counted incorrectly. Weird, but imaginable, maybe a bit got flipped in the BMC's firmware.
  3. Data is being read incorrectly. Very unlikely, all things considered...
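For options 1 and 2, the arithmetic is worth spelling out. The sketch below assumes the common convention of 2 tachometer pulses per revolution for PC fans, and a hypothetical true speed of 4750 rpm (just under the fan's rated 5000). Neither figure was confirmed by anyone in this thread; the function and variable names are made up for illustration. It only shows that a factor-of-two miscount would produce a number like the 9500 reported above.

```python
# Assumption: a typical PC fan emits 2 tachometer pulses per revolution.
PULSES_PER_REV = 2

def rpm_from_pulses(pulses_per_minute, assumed_pulses_per_rev=PULSES_PER_REV):
    """Convert a tach pulse count into RPM under an assumed pulses-per-revolution."""
    return pulses_per_minute / assumed_pulses_per_rev

# Hypothetical true speed, just under the rated 5000 rpm.
true_rpm = 4750
pulses = true_rpm * PULSES_PER_REV  # pulses per minute the fan actually emits

print(rpm_from_pulses(pulses))     # 4750.0 (counted with the right divisor)
print(rpm_from_pulses(pulses, 1))  # 9500.0 (counted as 1 pulse per revolution)
```

The same doubling would appear if the divisor were right but the fan emitted twice as many pulses, so this arithmetic alone cannot tell option 1 from option 2.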
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
Oh the IPMI on these boards is DUMB. If I disconnect a fan on a running server, the IPMI GUI shows it in red but not disconnected IIRC.
(and it spins everything else to 100%)
And no, the data is not for any other fan. Can you imagine a PC fan that's supposed to spin at 9500rpm? This is the only fan in the server that's significantly faster on the upper end compared to the others (this one is rated at up to 5000 compared to 2000 or so on the others).
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
I tried resetting the BMC too and it didn't make any difference.
I can't test the fan in my PC unfortunately, because it's mounted in a custom 3D-printed shroud on the HBA, and I'm too lazy to shut down the entire server and pull the card out :D
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
Yea, but I said PC fan :)
I guess I will have to try it in my own PC after all, it's pissing me off.
 
Joined
Dec 2, 2015
Messages
730
It may be instructive to swap the fan's connection with that of another fan on the server. If the bad reading follows the fan, then the fan must have some strange issue. If the bad reading stays at the same fan connector, then it is a motherboard problem: either hardware or IPMI firmware.
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
No, the reading stays the same. I checked that (not intentionally, but I did).
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I'm getting "Fan FAN3 Deasserted Lower Critical going low : Reading 2400 < Threshold 500 RPM" errors. Can anyone advise on how to fix this? I manually set the fan thresholds with ipmitool and everything was great, but since last week the fans sometimes spin up and down totally at random... and I get tons of errors like that one.
I already tried power cycling and resetting the values, and I don't know how to address this... besides changing the fan mode to "Full" in Supermicro's IPMI.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
How low do the fans actually go? The message you posted is the one logged after the error has cleared, once the fan spins back up.
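To illustrate the assert/deassert distinction, here is a simplified sketch of how a lower-critical event pair might be generated from a series of readings. It is not the BMC's actual logic (real firmware typically applies hysteresis as well). The 400 rpm dip is a made-up sample value, and the function name is hypothetical; only the 500 threshold and the 2400 reading come from the message quoted above.

```python
def lower_critical_events(readings, threshold):
    """Return (event, rpm) pairs for crossings of a lower critical threshold.

    Simplified model: assert when a reading drops below the threshold,
    deassert on the first later reading at or above it.
    """
    asserted = False
    events = []
    for rpm in readings:
        if not asserted and rpm < threshold:
            asserted = True
            events.append(("Asserted", rpm))
        elif asserted and rpm >= threshold:
            asserted = False
            events.append(("Deasserted", rpm))
    return events

# 2400 and 500 are from the posted message; 400 is a hypothetical dip.
print(lower_critical_events([2400, 400, 2400], threshold=500))
# [('Asserted', 400), ('Deasserted', 2400)]
```

In this model the deassert entry carries the recovered reading (2400), which is why it says nothing about how low the fan dropped. The matching "Asserted" entry in the event log is the one to look for.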
 

Koop

Explorer
Joined
Jan 9, 2024
Messages
59
Oops I guess I should've put this info in here and not the review I left? Well here it is again:

I am a bit confused about the purpose of setting the upper thresholds. Is there any reason to modify them? To let IPMI know how fast a fan can really go? I'm just trying to understand this in the context of the terminology. Let's say the maximum RPM for my fan is 2200 rpm. What should my numbers look like? Just a bit confused by the terminology of Non-Critical vs Critical vs Non-Recoverable with respect to an "upper" number. Would Upper Non-Critical be the maximum rated RPM of my fan, with Critical and Non-Recoverable set to higher numbers? I ask because ipmitool reports all three upper values for my fan as being beyond its rated maximum RPM.

Or does it not really matter and it'll just spin the fan as fast as possible when needed as long as all those values are above the maximum rated RPM?

One additional comment. I assume these changes will stick regardless of reboot or complete shutdown of the system? Even if BMC loses power?

I did this all on SCALE logged in as root via ssh for my X11SPi-TF board. Are there any concerns or complications to be aware of with differences in applying these changes via SCALE vs Core? I assume not, but figured I might as well ask.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Koop said:
  Just a bit confused by the terminology of Non-Critical vs Critical vs Non-Recoverable with respect to an "upper" number.
The same as the lower.

In the case of a 2000 RPM fan with a ±20% error, I would set a 2200 NC value, a 2400 CR value, and a 2500 NR value.
Changes should persist across power cycles, but you might need to apply them a couple of times because of... things.
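A minimal sketch of that arithmetic, for anyone who wants to plug in their own fan. The +10% / +20% / +25% margins are just one reading of the 2200 / 2400 / 2500 example above, not an official rule, and the helper names and the FAN3 sensor name are placeholders. The command it prints uses ipmitool's `sensor thresh <id> upper <unc> <ucr> <unr>` form; the script only builds the command line and does not run it.

```python
import shlex

def upper_thresholds(rated_rpm, nc=0.10, cr=0.20, nr=0.25):
    """Return (non-critical, critical, non-recoverable) upper thresholds.

    The default margins are an assumption inferred from the 2000 RPM example.
    """
    return tuple(int(round(rated_rpm * (1 + margin))) for margin in (nc, cr, nr))

def thresh_command(sensor, thresholds):
    """Build (but do not run) the ipmitool command that sets upper thresholds."""
    unc, ucr, unr = thresholds
    return ["ipmitool", "sensor", "thresh", sensor, "upper", str(unc), str(ucr), str(unr)]

values = upper_thresholds(2000)
print(values)                                      # (2200, 2400, 2500)
print(shlex.join(thresh_command("FAN3", values)))  # ipmitool sensor thresh FAN3 upper 2200 2400 2500
```

Check the sensor name against the output of `ipmitool sensor` on your own board before running the printed command as root.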
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Koop said:
  does it not really matter and it'll just spin the fan as fast as possible when needed as long as all those values are above the maximum rated RPM?
You'd probably get log spam at a minimum... I haven't cared enough to test it out.

Koop said:
  did this all on SCALE logged in as root via ssh for my X11SPi-TF board. Are there any concerns or complications to be aware of with differences in applying these changes via SCALE vs Core?
Nope, exact same thing. I come here to look up my own stuff all the time for Linux machines, even though this was all originally done on FreeBSD.
 

Koop

Explorer
Joined
Jan 9, 2024
Messages
59
Thanks to you both. Not too complicated thankfully. I'll see if my event log yells at me anymore now that I've made my changes. It was only a few times that my fans got stuck in a constant ramp-up / ramp-down situation. But once or twice was enough to be annoying and seek out this change.

Ericloewe said:
  Nope, exact same thing. I come here to look up my own stuff all the time for Linux machines, even though this was all originally done on FreeBSD.

And thanks, I figured as much. I guess the only difference is getting to ipmitool. I couldn't do it via admin or sudo, so I just logged in directly as root and it worked fine. Maybe I am just a Linux/SCALE noob for not knowing how to do it properly?

Maybe it's worth adding info about that for people looking to do this on SCALE, whichever way is the proper way? I am sure people using SCALE in the future may get confused.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Sudo should have worked, but maybe it got missed in the build process?
 