Update causing Fan errors

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Oh, on a TrueNAS Mini? File a bug report asking iX to have their vendor fix the IPMI bug.
 

Smokie

Explorer
Joined
Oct 10, 2014
Messages
67
Same issue on my PowerEdge R620. It’s causing one of my fans to spin so fast I’ve had to remove it. Multiple fan error alerts. Are we thinking this bug will be resolved in an update?


Version:
TrueNAS-SCALE-22.12.2
 

dlaflamme

Dabbler
Joined
Jan 17, 2016
Messages
16
I am kind of surprised there have been reports of this and there still seems to be no solution, or even reasonable information on how to debug. Several people have reported that alerts about fan speed have appeared after an upgrade of TrueNAS. I experienced this as well.

Some people seem to be getting alerts with current timestamps, suggesting a problem that is happening now. Others, myself included, see alerts from far in the past that reappear after the TrueNAS upgrade. Unsure if these are two different problems or different symptoms of the same problem, but the common trigger appears to be upgrading TrueNAS. Note that this happens on SCALE (https://www.truenas.com/community/threads/hundreds-of-fan-fana-deasserted-errors.108408/) as well as CORE (this thread), which suggests the issue is in the TrueNAS middleware rather than in a single underlying OS or driver.

In my case, I observed this after upgrading to TrueNAS-13.0-U5; I dismissed all alerts, logged out and back in, and a couple (not all) re-appeared after a while.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Smokie said:
Same issue on my PowerEdge R620. It’s causing one of my fans to spin so fast I’ve had to remove it. Multiple fan error alerts. Are we thinking this bug will be resolved in an update?

Version:
TrueNAS-SCALE-22.12.2

TrueNAS doesn't have control over your fans (not unless you specifically make that happen). I think you have a failing fan module.

dlaflamme said:
I am kind of surprised there have been reports of this and there still seems to be no solution, or even reasonable information on how to debug. Several people have reported that alerts about fan speed have appeared after an upgrade of TrueNAS. I experienced this as well.

Some people seem to be getting alerts with current timestamps, suggesting a problem that is happening now. Others, myself included, see alerts from far in the past that reappear after the TrueNAS upgrade. Unsure if these are two different problems or different symptoms of the same problem, but the common trigger appears to be upgrading TrueNAS. Note that this happens on SCALE (https://www.truenas.com/community/threads/hundreds-of-fan-fana-deasserted-errors.108408/) as well as CORE (this thread), which suggests the issue is in the TrueNAS middleware rather than in a single underlying OS or driver.

In my case, I observed this after upgrading to TrueNAS-13.0-U5; I dismissed all alerts, logged out and back in, and a couple (not all) re-appeared after a while.

Is there a problem here? I haven't seen it, apart from hardware/firmware bugs that were highlighted by a new feature.

If your BMC is throwing fan errors, you either need to fix the thresholds or take up the issue of the false alarms (if applicable) with your hardware vendor. Or you might just have a dead/dying fan.
 

dlaflamme

Dabbler
Joined
Jan 17, 2016
Messages
16
Ericloewe said:
Is there a problem here? I haven't seen it, apart from hardware/firmware bugs that were highlighted by a new feature.

If your BMC is throwing fan errors, you either need to fix the thresholds or take up the issue of the false alarms (if applicable) with your hardware vendor. Or you might just have a dead/dying fan.

Yeah, not sure. I guess at the minimum there is confusion. What is also confusing is that there seem to be at least two observed issues that may be different:

a) alerts about fans right now
b) alerts about fans from some time in the past

Along with at least one other person in this thread, I am seeing only (b). Others appear to be seeing (a), and for some of the other posts it is unclear.

Picklefish reported here that he is also seeing only old events being re-alerted on after the update. His solution appears to have been to just delete the events in the SEL and move on. Not ready to do that yet... will continue to observe to see if they come back again.
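
For anyone wanting to check whether their alerts are case (b), here is a rough sketch for pulling old fan entries out of the SEL before deciding to delete anything. It assumes `ipmitool sel elist` prints pipe-separated fields with the date second, in MM/DD/YYYY form, which may differ between ipmitool versions. The function name `old_fan_events` is made up for illustration.

```shell
#!/bin/sh
# old_fan_events: read `ipmitool sel elist` output on stdin and print fan
# entries dated before the given year (hypothetical helper name).
# Assumes the date is the second pipe-separated field, as MM/DD/YYYY.
old_fan_events() {
    awk -F'|' -v cutoff="$1" '
        tolower($0) ~ /fan/ {
            n = split($2, d, "/")      # d[3] holds the year
            # Skip entries whose date field is not in the expected form.
            if (n == 3 && d[3] + 0 < cutoff + 0) print
        }
    '
}

# Usage on the NAS itself (needs root and a loaded IPMI driver):
#   ipmitool sel elist | old_fan_events 2023
```

If everything it prints is old, the alerts are probably history being replayed rather than a fan failing right now. Saving a copy first with `ipmitool sel elist > sel-backup.txt` keeps the record if you later decide to run `ipmitool sel clear`.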
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I think they just started being piped into the middleware's alerts functionality.
 

Berkyjay

Contributor
Joined
Nov 7, 2015
Messages
100
OK I just updated to TrueNAS-13.0-U5.3 today and am now getting these fan errors by the hundreds. I have never used the IPMI interface so I am unfamiliar with how to access it. Can someone provide a simple way to resolve this? I know for a fact that there is no issue with my fans so how do I change the thresholds for these alerts?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

Berkyjay

Contributor
Joined
Nov 7, 2015
Messages
100
Ericloewe said:
Are your fans not constantly spinning up and down? Or are they non-PWM?
In any case, you'll want to have a look at this: https://www.truenas.com/community/resources/how-to-change-ipmi-sensor-thresholds-using-ipmitool.35/

Frankly, I set this rig up over a year ago and I don't remember how I set the fans. It's a pretty quiet system and I am sure I set them at a static speed, but I can't verify that. The fact that one specific fan, SYS_FAN2, is causing all of the alerts might point to a fan failure. But I honestly don't know how to identify which fan is SYS_FAN2. I need to take some time and open up my rig.

Code:
CPU_FAN1         | na         |            | na    | na        | na        | na        | na        | na        | na
CPU_FAN2         | na         |            | na    | na        | na        | na        | na        | na        | na
SYS_FAN1         | na         |            | na    | na        | na        | na        | na        | na        | na
SYS_FAN2         | 280.000    | RPM        | nr    | 280.000   | 420.000   | na        | na        | 35560.000 | 35700.000
SYS_FAN3         | 980.000    | RPM        | ok    | 280.000   | 420.000   | na        | na        | 35560.000 | 35700.000
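
A minimal sketch for picking the problem sensor out of output like the above. It assumes the usual ten pipe-separated columns from `ipmitool sensor` (name, reading, unit, status, then the six thresholds from lower non-recoverable up to upper non-recoverable). The function name `flag_fans` is made up for illustration.

```shell
#!/bin/sh
# flag_fans: read `ipmitool sensor` output on stdin and print fan sensors
# whose status is neither "ok" nor "na" (hypothetical helper name).
flag_fans() {
    awk -F'|' '
        {
            # Trim the padding around the four fields we use.
            for (i = 1; i <= 4; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        }
        $1 ~ /FAN/ && $4 != "ok" && $4 != "na" {
            printf "%s %s %s status=%s\n", $1, $2, $3, $4
        }
    '
}

# Usage on the NAS itself (needs root and a loaded IPMI driver):
#   ipmitool sensor | flag_fans
```

On the listing above this would flag only SYS_FAN2, which reads 280 RPM and sits right at its lower non-recoverable threshold of 280.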
 

Ellimist

Dabbler
Joined
Jun 8, 2014
Messages
32
So I stumbled here after also getting all the alerts. While I don't mind new functionality like this, dlaflamme is correct about the alerts coming from past events. I had one from 2023; the rest were from 2022. I've just cleared the log, so I'll see if I get more alerts from the box, but there's definitely an issue with TrueNAS reporting old events after the upgrade.
 

SW77

Dabbler
Joined
Apr 1, 2017
Messages
16
Execute from a shell.

Lower Fan Thresholds:

for i in 1 2 3 4 A ; do ipmitool -U "ipmi user name" -P "ipmi password" -H "ipmi ip address" sensor thresh FAN${i} lower 150 225 300;done

1,2,3,4,A are the fan headers on your mobo


Upper Fan Thresholds:


for i in 1 2 3 4 A ; do ipmitool -U "ipmi user name" -P "ipmi password" -H "ipmi ip address" sensor thresh FAN${i} upper 2300 2400 2500;done

Dismiss alerts and all is good

I cannot remember where I found this information but it worked for me.
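
A hedged variant of the loops above that only prints the commands, so they can be reviewed before anything is changed on the BMC. The helper name `thresh_cmds` and the variables `IPMI_HOST` and `IPMI_USER` are placeholders invented for this sketch. The `-E` option is a real ipmitool flag that reads the password from the `IPMI_PASSWORD` environment variable, which keeps it off the command line.

```shell
#!/bin/sh
# thresh_cmds: print (not run) one ipmitool command per fan sensor.
# Hypothetical helper; arguments: lower|upper, three values, sensor names.
thresh_cmds() {
    side=$1 a=$2 b=$3 c=$4
    shift 4
    # Both "lower" and "upper" take their three values in ascending order.
    if [ "$a" -gt "$b" ] || [ "$b" -gt "$c" ]; then
        echo "thresholds must be ascending: $a $b $c" >&2
        return 1
    fi
    for s in "$@"; do
        # \$IPMI_HOST and \$IPMI_USER stay literal so the printed line can
        # be pasted into a shell where those placeholders are set.
        echo "ipmitool -H \"\$IPMI_HOST\" -U \"\$IPMI_USER\" -E sensor thresh $s $side $a $b $c"
    done
}

# Example:
#   thresh_cmds lower 150 225 300 FAN1 FAN2 FAN3 FAN4 FANA
#   thresh_cmds upper 2300 2400 2500 FAN1 FAN2 FAN3 FAN4 FANA
```

The right numbers depend on your fans. The values here are just the ones from the post above, and sensor names such as FAN1 or SYS_FAN2 vary by board, so check `ipmitool sensor` first.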
 