This is a minor update with no changes to the actual fan control logic. If the fan scripts are working fine, you can safely ignore this. There are some changes to make the scripts compatible with running in a virtual machine, and with SAS drives. Changes:
- Moved the IPMITOOL definition to the config file for spinpid.sh and spinpid2.sh (the other scripts don't use a config file). If you are running in a VM, this makes it easier to add your IP, user, and password and have them survive updates. This is the only real change to the configs. You don't need to redo the whole config; you can just copy the IPMITOOL definition from the new config to your existing one.
- Made compatible with SAS drives, at least those with Hitachi/HGST brand.
- Added more exclusions to device list so we are just reading spinning drives.
- For reading CPU temp, we now use the best method available. If sysctl is available, we use that: read all the cores and use the hottest (this is what we did unconditionally before). Otherwise we use the ipmitool sensor reading, which may be the only method available in a VM.
- Fixed formatting with a leading 0 in spinpid2.sh.
- Added conditional calls to optional user-defined functions after DRIVES_check_adjust() and CPU_check_adjust() (spinpid2.sh only). See barbierimc's post for example.
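The optional hooks in the last item follow a simple pattern: call a function only if the user has defined it. Here is a minimal sketch of that pattern, not the actual spinpid2.sh code. The hook names `DRIVES_check_adjust_after` and `CPU_check_adjust_after` and the helper `call_if_defined` are hypothetical, and the two `*_check_adjust` functions are stubs standing in for the real routines.

```shell
# Stubs standing in for the real routines in spinpid2.sh.
DRIVES_check_adjust() { echo "drives checked"; }
CPU_check_adjust()    { echo "cpu checked"; }

# Call "$1" (with any further arguments) only if it exists.
# Note: command -v also matches an external command of the same name,
# which is assumed not to happen with hook names like these.
call_if_defined() {
    if command -v "$1" > /dev/null 2>&1; then
        "$@"
    fi
}

# Hypothetical user hook; in practice it might be defined in the config file.
DRIVES_check_adjust_after() { echo "user hook after drives"; }
# CPU_check_adjust_after is deliberately left undefined.

run_cycle() {
    DRIVES_check_adjust
    call_if_defined DRIVES_check_adjust_after
    CPU_check_adjust
    call_if_defined CPU_check_adjust_after   # undefined, so silently skipped
}
```

Calling `run_cycle` here runs both stubs and the one defined hook; the undefined hook costs nothing and produces no error, so users who don't need hooks never have to touch them.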
Changes for version 2019-11-01 since 2019-10-04. All scripts changed a bit. Here are the bigger changes:
A. The two fan control scripts are now set up for configuration files, which should go in the same directory as the script. This way it is easier to install new script versions; you won't have to fiddle with settings unless there are new ones.
B. In spinpid2.sh, added a setting to tell the script how to determine what the duty cycles (%) are. The options are assuming they are what the script set (as recent versions have done) or reading the data from the board. Caveats and tips are in the config file above the setting.
C. In spinpid2.sh, there is a completely new and more elaborate approach to the whole BMC reset thing. I've tested as much as I can on my one-zone board, but will be looking for some feedback.
- First, each major cycle there is a mismatch test to see if either zone is way off the setting, either too high or too low.
- If mismatch, it will try to force the offending zone back into line. Then it reads fan data again, and repeats the mismatch test.
- If there is still a mismatch, reset the BMC, wait for it to come back (2 minutes), force-set fans again, read fan data again, and repeat mismatch test.
- If still a mismatch, go through one more time (force set fans, read fan data, repeat mismatch test) then give up and move on with the script.
If things are really screwed up and this can't fix them, the BMC will reset again the next major cycle (5 minutes or so). But I don't see any point in killing the script in that case, as I don't think that will help. As always, you need to have your fan settings correct based on running spintest.sh. That is needed to determine whether fan speeds are appropriate or a BMC reset is needed.
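The escalation sequence in C can be sketched as a small control-flow skeleton. This is only an illustration of the order of steps described above, not the code in spinpid2.sh. All function and variable names are hypothetical, and the stubs simulate a board: `FAILS_LEFT` says how many mismatch tests will still report a mismatch.

```shell
ACTIONS=""       # record of what the sketch did, for inspection
FAILS_LEFT=0     # simulation only: mismatch tests that will still fail

log_action() { ACTIONS="${ACTIONS}${ACTIONS:+ }$1"; }

# Returns 0 (true) if a zone is way off its setting.
mismatch_test() {
    if [ "$FAILS_LEFT" -gt 0 ]; then
        FAILS_LEFT=$((FAILS_LEFT - 1))
        return 0
    fi
    return 1
}

force_set_fans() { log_action force; }   # real version: set duty via ipmitool
read_fan_data()  { log_action read; }    # real version: read fan speeds/duty
reset_bmc()      { log_action reset; }   # real version: reset the BMC, then
                                         # wait about 2 minutes for it to return

# Returns 0 if fans ended up matching, 1 if we gave up.
check_and_recover() {
    mismatch_test || return 0            # no mismatch: nothing to do

    force_set_fans; read_fan_data        # step 1: force the zone back into line
    mismatch_test || return 0

    reset_bmc                            # step 2: reset the BMC and retry
    force_set_fans; read_fan_data
    mismatch_test || return 0

    force_set_fans; read_fan_data        # step 3: one more attempt
    mismatch_test || return 0

    return 1                             # give up; the script carries on
}
```

With `FAILS_LEFT=2`, for example, `check_and_recover` records `force read reset force read` in `ACTIONS` and returns 0. On a return of 1 the caller just continues, and the same sequence runs again on the next major cycle.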
I hope this is the last change for a while, but that depends on feedback.
In both scripts:
- removed Ki and I (integral) term
- code tweaks to marginally improve efficiency
- added option to output to log only vs. log + console
- revised tuning advice at end of scripts to make it easier
In spinpid2.sh (dual-zone script):
- because some boards don't report correct fan duty, the script will now try to read it the first time and make needed adjustment, then assume the duty remains as it was set and not read it further
- added a setting to control whether interim CPU data are logged
- changed BMC reset code to avoid cycling resets that a few people reported.
Note that I don't have a way to test the dual zone script. Please report any issues. Also if you have previous versions, don't forget to copy over your settings.
See discussion for more details.
I had earlier switched to the best solution recommended by @Stux for reading CPU temperature in spincheck.sh. Now, with refined, complete bash code suggested by @bestboy, the apparently more efficient method has been incorporated into all 3 of the scripts that read temperature.
If you want the gory details, instead of reading CPU temp from the IPMI, we are now reading it from sysctl. We use the hottest of up to 10 cores as CPU temperature. I used awk instead of cut to get the actual numbers from the sysctl output, not sure if that is much of an improvement.
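For anyone curious what "hottest core via sysctl and awk" can look like, here is a minimal sketch. It is not the code from the scripts. The function name `hottest_core_temp` is hypothetical, and the sketch assumes sysctl prints lines of the form `dev.cpu.N.temperature: 45.0C`. Unlike the scripts, it does not cap the number of cores.

```shell
# Read `sysctl dev.cpu`-style lines on stdin and print the hottest core
# temperature in whole degrees C (fractions truncated). Prints nothing if
# no temperature lines are found.
hottest_core_temp() {
    awk '/^dev\.cpu\.[0-9]+\.temperature:/ {
             t = $2 + 0                  # "45.0C" -> 45; awk ignores the trailing C
             if (!seen || t > max) { max = t; seen = 1 }
         }
         END { if (seen) printf "%d\n", max }'
}

# Possible usage on a FreeBSD-based system (assumes the temperature OIDs exist):
#   CPU_TEMP=$(sysctl dev.cpu | hottest_core_temp)
```

Doing the filtering, the number extraction, and the maximum in a single awk call avoids a chain of grep, cut, and sort processes. Whether that matters in practice at these polling intervals is another question.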
Thanks to Stux and bestboy!
In spinpid2.sh, the reading of duty cycles from the motherboard is now commented out (disabled) by default. Some boards report incorrect duty, and this causes the script to go bonkers. I have no idea how widespread this fault is, but there is really no harm in not asking the board for duty cycles - the script assumes the duty is what it sets. There is also some minor code cleanup.
spinpid.sh (the single-zone version) has some improvements in logic that make it work better. It holds drive temps better. Also after a CPU-intensive task, the fans can come down quickly as the CPU cools until the drives need cooling. I finally got it perfect. Also some minor cleanup.
The single-zone script had an issue that caused fan control based on drive temperatures to be inaccurate if the system went through a long period of high CPU use. That's fixed here; the other scripts are not updated.