Fan Scripts for Supermicro Boards Using PID Logic

Fan Scripts for Supermicro Boards Using PID Logic 2019-11-01

Changes for version 2019-11-01 since 2019-10-04. All scripts changed a bit; the bigger changes are:

A. The two fan control scripts are now set up for configuration files, which should go in the same directory as the script. This way it is easier to install new script versions; you won't have to fiddle with settings unless there are new ones.
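
As a rough illustration of the idea, a script can load its settings from a file in its own directory along these lines. This is only a sketch: the function name `load_config`, the file name in the usage comment, and the two setting names are placeholders, not necessarily what the scripts use.

```shell
#!/usr/bin/env bash
# Sketch: load settings from a config file that sits beside the script.
# All names below are illustrative placeholders.

# Built-in defaults; anything the config file sets overrides these.
CPU_T_CHECK=2      # hypothetical setting name
LOG_TO_CONSOLE=1   # hypothetical setting name

# $1 = path of the running script, $2 = config file name
load_config() {
    local script_dir config
    script_dir=$(cd "$(dirname "$1")" && pwd)
    config="$script_dir/$2"
    if [ -f "$config" ]; then
        # The config file is plain shell assignments, so sourcing it is enough.
        . "$config"
        return 0
    fi
    echo "Config file not found: $config" >&2
    return 1
}

# Typical call near the top of a script:
# load_config "$0" spinpid2.config || exit 1
```

Because the settings live outside the script, replacing the script with a newer version leaves them untouched.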

B. In spinpid2.sh, added a setting that tells the script how to determine the current duty cycles (%). The options are to assume they are whatever the script last set (as recent versions have done) or to read them from the board. Caveats and tips are in the config file above the setting.
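
The choice between the two options could look something like the following. It is a minimal sketch, not the script's code: `HOW_DUTY`, `current_duty`, and `read_duty_from_board` are hypothetical names, and the board query is replaced by a stand-in that reports a fixed value.

```shell
#!/usr/bin/env bash
# Sketch: choose between assuming the duty cycle and reading it back.
# HOW_DUTY and both function names are hypothetical.

HOW_DUTY=0   # 0 = assume duty is what was last set; 1 = read it from the board

# Stand-in for a real board query; here it just reports a fixed value.
read_duty_from_board() { echo 37; }

# Print the duty (%) the control loop should work from.
# $1 = the duty the script most recently set.
current_duty() {
    if [ "$HOW_DUTY" -eq 1 ]; then
        read_duty_from_board
    else
        echo "$1"
    fi
}

current_duty 50   # prints 50 while HOW_DUTY=0
```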

C. In spinpid2.sh, there is a completely new and more elaborate approach to the whole BMC reset thing. I've tested as much as I can on my one-zone board, but will be looking for some feedback.
  1. First, each major cycle there is a mismatch test to see if either zone is way off the setting, either too high or too low.
  2. If mismatch, it will try to force the offending zone back into line. Then it reads fan data again, and repeats the mismatch test.
  3. If there is still a mismatch, reset the BMC, wait for it to come back (2 minutes), force-set fans again, read fan data again, and repeat mismatch test.
  4. If still a mismatch, go through one more time (force set fans, read fan data, repeat mismatch test) then give up and move on with the script.

If things are really screwed up and this can't fix them, the BMC will reset again the next major cycle (5 minutes or so). But I don't see any point in killing the script in that case, as I don't think that will help. As always, you need to have your fan settings correct based on running spintest.sh. Those settings are what the script uses to decide whether fan speeds are appropriate or a BMC reset is needed.
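
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in spinpid2.sh: it handles a single zone, the hardware is simulated with shell variables, and every name (`recover_fans`, `mismatch`, `force_set_fans`, `reset_bmc`, `TOLERANCE`, and so on) is made up for the example.

```shell
#!/usr/bin/env bash
# Sketch of the mismatch/reset escalation. Hardware access is simulated with
# shell variables; a real script would talk to the BMC instead.

DUTY_SET=50       # duty (%) the script last requested
DUTY_ACTUAL=100   # duty (%) the board reports (simulated)
TOLERANCE=10      # allowed difference in percentage points (assumed value)
BMC_WAIT=0        # seconds to wait after a reset; 120 on real hardware
STUCK=1           # simulation only: BMC ignores commands until it is reset
RESETS=0          # simulation only: counts BMC resets

# Hypothetical hooks standing in for the real IPMI calls.
read_fan_data()  { :; }   # simulated: DUTY_ACTUAL already holds the reading
force_set_fans() { if [ "$STUCK" -eq 0 ]; then DUTY_ACTUAL=$DUTY_SET; fi; }
reset_bmc()      { STUCK=0; RESETS=$((RESETS + 1)); }

# Succeeds (status 0) when the reported duty is too far from the set duty,
# whether too high or too low.
mismatch() {
    local diff=$((DUTY_SET - DUTY_ACTUAL))
    if [ "$diff" -lt 0 ]; then diff=$((0 - diff)); fi
    [ "$diff" -gt "$TOLERANCE" ]
}

# Returns 0 once the fans match the setting, 1 if every attempt failed.
recover_fans() {
    read_fan_data
    mismatch || return 0             # step 1: nothing wrong
    force_set_fans; read_fan_data    # step 2: force the zone back into line
    mismatch || return 0
    reset_bmc; sleep "$BMC_WAIT"     # step 3: reset the BMC and wait
    force_set_fans; read_fan_data
    mismatch || return 0
    force_set_fans; read_fan_data    # step 4: one last try
    mismatch || return 0
    return 1                         # give up; the caller just carries on
}

if recover_fans; then echo "fans OK"; else echo "gave up"; fi
```

Returning a status instead of exiting matches the point made above: a failed recovery does not kill the script, it simply gets retried on the next major cycle.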

I hope this is the last change for a while, but that depends on feedback.
Discovered that BMC reset on my machine takes 105 seconds, so lengthened the wait after reset from 60 to 120 sec.
In both scripts:
  1. removed Ki and I (integral) term
  2. code tweaks to marginally improve efficiency
  3. added option to output to log only vs. log + console
  4. revised tuning advice at end of scripts to make it easier
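
With the integral term gone, the controller is effectively proportional plus derivative (PD). A generic PD correction looks like the sketch below; the function name, the argument order, and the gain values in the example call are illustrative assumptions, and the scripts' own arithmetic may differ.

```shell
#!/usr/bin/env bash
# Sketch of a textbook PD correction: Kp * error + Kd * (error - previous error).
# Names and gains are illustrative, not taken from the scripts.

# $1 = current error (measured temp minus setpoint), $2 = previous error,
# $3 = Kp, $4 = Kd. Prints the duty change, rounded to a whole percent.
pd_correction() {
    awk -v e="$1" -v p="$2" -v kp="$3" -v kd="$4" \
        'BEGIN { printf "%.0f\n", kp * e + kd * (e - p) }'
}

# Example: error rose from 1 to 2 degrees, with Kp=4 and Kd=40.
pd_correction 2 1 4 40   # prints 48
```
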
In spinpid2.sh (dual-zone script):
  1. because some boards don't report correct fan duty, the script will now try to read it the first time and make needed adjustment, then assume the duty remains as it was set and not read it further
  2. added a setting to control whether interim CPU data are logged
  3. changed BMC reset code to avoid cycling resets that a few people reported.
Note that I don't have a way to test the dual zone script. Please report any issues. Also if you have previous versions, don't forget to copy over your settings.

See discussion for more details.
A couple of slight improvements to the code for getting CPU temps. Thanks to @bestboy for the research, it can now handle any number of cores. Also, the actual temperature reading might be slightly more efficient.
I had earlier switched to the best solution recommended by @Stux for reading CPU temperature in spincheck.sh. Now, with refined, complete bash code suggested by @bestboy, the apparently more efficient method has been incorporated into all 3 of the scripts that read temperature.

If you want the gory details, instead of reading CPU temp from the IPMI, we are now reading it from sysctl. We use the hottest of up to 10 cores as CPU temperature. I used awk instead of cut to get the actual numbers from the sysctl output; I'm not sure that is much of an improvement.
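
For the curious, picking the hottest core out of sysctl output with awk can be done roughly like this. It is a sketch rather than the exact code in the scripts: `hottest_core` is a made-up name, it assumes FreeBSD-style lines such as `dev.cpu.0.temperature: 45.0C`, and it does not limit the number of cores.

```shell
#!/usr/bin/env bash
# Sketch: print the highest core temperature found on stdin, in whole degrees.
# Assumes lines of the form "dev.cpu.N.temperature: 45.0C".

hottest_core() {
    awk -F': ' '
        /temperature/ {
            t = $2 + 0               # "45.0C" + 0 gives 45 (awk drops the C)
            if (t > max) max = t
        }
        END { printf "%d\n", max }   # truncates to an integer
    '
}

# On a FreeBSD system with per-core temperature sysctls available:
# CPU_TEMP=$(sysctl dev.cpu | hottest_core)
```

Reading from stdin keeps the parsing separate from the sysctl call, so it can be checked with canned input on any machine.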

Thanks to Stux and bestboy!
Changed the command for reading CPU temperature. The new command, suggested by @Stux, is probably somewhat more efficient.
In spinpid2.sh, the reading of duty cycles from the motherboard is now commented out (disabled) by default. Some boards report incorrect duty, and this causes the script to go bonkers. I have no idea how widespread this fault is, but there is really no harm in not asking the board for duty cycles - the script assumes the duty is what it sets. There is also some minor code cleanup.

spinpid.sh (the single-zone version) has some improvements in logic that make it work better. It holds drive temps better. Also after a CPU-intensive task, the fans can come down quickly as the CPU cools until the drives need cooling. I finally got it perfect. Also some minor cleanup.
The single-zone script had an issue that caused fan control based on drive temperatures to be inaccurate if the system went through a long period of high CPU use. That's fixed here; the other scripts are not updated.