Fan Scripts for Supermicro Boards Using PID Logic 2018-01-01

Motherboards have no way to read drive temperatures, so their built-in fan control can’t really regulate them. The fan control scripts presented here read and respond to both drive and CPU temperatures, which makes them MUCH better than the control built into the boards. Here we also apply the magic of PID control. Mean drive temperature normally stays within 0.3 C of setpoint, or within 0.5 C when there is a disturbance. The scripts ensure that your fans spin only as fast as needed to regulate drive and CPU temperatures at any given time. Two additional scripts help you learn about your temperatures and fans, and should be of interest to any fan fan.

For a detailed description and discussion of the PID logic used to regulate hard drive temperature, some basics on fan control, and early script development, see this forum post.
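For readers new to PID control, here is a minimal sketch of one generic discrete PID step in shell. It is not copied from spinpid.sh or spinpid2.sh; the function names (pid_step, clamp), the argument order, and any gain values you pass in are assumptions made for illustration. The scripts' real logic and tuning notes are in the forum post and at the end of the scripts.

```shell
# Generic discrete PID step (illustrative only; not the scripts' own code).
# All names and argument conventions here are assumptions for the example.

# pid_step SETPOINT TEMP PREV_ERR ERR_SUM DT KP KI KD
# Prints three numbers: "<duty_change> <error> <new_err_sum>"
pid_step() {
    awk -v sp="$1" -v t="$2" -v prev="$3" -v sum="$4" -v dt="$5" \
        -v kp="$6" -v ki="$7" -v kd="$8" 'BEGIN {
        err = t - sp                  # positive when drives are too warm
        sum = sum + err * dt          # accumulated error for the I term
        p = kp * err                  # proportional term
        i = ki * sum                  # integral term
        d = kd * (err - prev) / dt    # derivative term: rate of change of error
        printf "%.2f %.2f %.2f\n", p + i + d, err, sum
    }'
}

# clamp VALUE MIN MAX: keep a duty cycle inside the allowed range,
# dropping any fractional part.
clamp() {
    awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        if (v < lo) v = lo
        if (v > hi) v = hi
        printf "%d\n", v
    }'
}
```

A caller would add the first number printed by pid_step to the current duty cycle, pass the result through clamp, and carry the error and error sum forward to the next drive cycle.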

Four scripts can be downloaded (sorry I had to zip them; only one upload is allowed) for Supermicro motherboards with single or dual fan zones:
  • spincheck.sh reads and logs temperature and fan data at a chosen interval, but does not control fans in any way. Works for both 1- and 2-zone motherboards.
  • spintest.sh is a one-time utility that runs your fans through a range of duty cycles and logs resulting RPMs. Works for both 1- and 2-zone motherboards. The results can be used for some settings in spinpid2.sh.
  • spinpid.sh controls fans for motherboards with one fan zone. It responds to both drive and CPU temperatures, depending on the greatest need. Logs lots of temperature and fan data.
  • spinpid2.sh controls fans for motherboards with dual fan zones, peripheral and CPU/system. The zones can be reversed. Logs like spinpid.sh but with additional CPU log.

Before using the fan control scripts:
  1. Run spincheck.sh for a day or so. Look over the log to get a feel for your temperatures and fan duty cycles and RPMs while using the built-in control with the fan mode you have set.
  2. If you have dual fan zones, decide which zone will do what and connect fans to headers accordingly. For most boards, Supermicro says:
    Zone 0, headers named with a number (e.g., FAN1, FAN2, etc.) --> CPU/System fans
    Zone 1, headers named with a letter (e.g., FANA, FANB, etc.) --> Peripheral fans (presumably including drives).
    Since zone 0 has more headers, and there are usually more peripheral fans than CPU fans, that seems backwards to some people (however it may make sense if you want diverse fans connected to the CPU zone and the peripheral fans are all the same kind). The script defaults to reversing this arrangement, setting ZONE_PER=0 and ZONE_CPU=1, but you can change settings to follow Supermicro's approach.
  3. Make sure your fan thresholds are set properly based on manufacturer's specifications using Ericloewe's guide. The thresholds are assigned to headers, so if you've rearranged fan connections, you may need to reset them. You must have fan(s) connected to header FAN1, and for the dual-zone script, FANA.
  4. If you have dual fan zones, run spintest.sh with no other fan control script running. See the resulting log.
  5. Go through the settings section of the respective script and carefully set them for your system.
  6. When first using the scripts, watch the log ( spinpid2.sh has a main log and a CPU log) to make sure there is not an obvious problem. If you get through a couple of full cycles and things are OK, they will likely stay OK.
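To make the zone settings in step 2 more concrete, here is a small dry-run sketch. ZONE_PER and ZONE_CPU are the setting names mentioned above. The function name duty_cmd is hypothetical, and the raw bytes 0x30 0x70 0x66 0x01 are an assumption based on what is commonly reported for Supermicro BMCs, not something stated on this page. The function only prints the command so you can inspect it before running anything against your board.

```shell
# Zone assignment as described in step 2 (the script's default, which is
# reversed from Supermicro's convention).
ZONE_PER=0   # peripheral/drive fans on headers FAN1, FAN2, ...
ZONE_CPU=1   # CPU/system fans on headers FANA, FANB, ...

# duty_cmd ZONE DUTY: print (do not run) a command that would set a zone's
# duty cycle. ASSUMPTION: the raw bytes below are the ones commonly reported
# for Supermicro boards; confirm them for your board before running.
duty_cmd() {
    zone=$1
    duty=$2
    # Refuse anything outside 0-100 percent.
    if [ "$duty" -lt 0 ] || [ "$duty" -gt 100 ]; then
        echo "duty must be 0-100" >&2
        return 1
    fi
    echo "ipmitool raw 0x30 0x70 0x66 0x01 $zone $duty"
}

duty_cmd "$ZONE_PER" 50   # peripheral zone at 50%
duty_cmd "$ZONE_CPU" 30   # CPU zone at 30%
```

To follow Supermicro's arrangement instead, swap the two values (ZONE_PER=1, ZONE_CPU=0).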

Some things to know about the scripts
  1. The two fan control scripts both use a full cycle, when drives are checked and everything is logged, and shorter, interim CPU cycles.
  2. For the main drive cycle, 5-6 minutes is probably a good interval.
  3. For the CPU interval, 1-15 seconds may be appropriate. For my passively cooled processor with low thermal design power, 15 seconds is OK. More powerful CPUs have faster temperature spikes and need shorter intervals.
  4. spinpid2.sh has a separate log for CPU readings. If you do short CPU intervals, this log will get quite large in normal use; you may want to disable it after testing. In the function CPU_check_adjust, comment out this line: print_interim_CPU | tee -a $CPU_LOG >/dev/null
  5. With a single fan zone, you can get 'phantom' readings for duty cycle of the nonexistent zone.
  6. Some boards (dual zone only?) report incorrect duty cycle. As of the 2017-04-18 update, by default we no longer ask the board for this information, instead assuming it is what we set. If you want to play with it, you can uncomment the four lines that read and transform duty cycle in the function read_fan_data and see if your board replies truthfully.
  7. If you find that your fans are often at maximum or minimum duty cycle, you probably have an inadequate or overly aggressive fan setup (for the ambient temperature).
  8. The code works on drives that are on the motherboard or on a host bus adapter, at least using the LSI chips.
  9. The code will work if you let some or all of your drives go into standby (at least spinpid2.sh will, not sure about spinpid.sh yet). Checking temperatures and state will reset the standby timer, but it won't spin up drives in standby. If you want your drives to spin down, you will need a drive check interval longer than, and maybe twice as long as, the standby time.
  10. You can play with tuning of the PID constants (see instructions in the forum post linked above and the end of the scripts), but it's probably not needed.
  11. If devices that are not spinning drives appear in the log: the script reads 'camcontrol devlist' once to get drive names, then removes devices that are not spinning drives with this line: DEVLIST="$(echo "$DEVLIST1"|sed '/KINGSTON/d;/ADATA/d;/SanDisk/d;/OCZ/d;/LSI/d;/INTEL/d;/TDKMedia/d')". If you have another device that is not handled, add some unique text from its name to that line, following the same pattern. Let me know what you add and I can add it to the current script version.
  12. The code will work with drives from Western Digital, Toshiba, Hitachi, Seagate, and possibly other makers.
  13. In the beginning, the control scripts read your SMART return value and let you know if problems are indicated with any drive. During each cycle, in addition to temperature, they parse the bits of the return value to determine spinning status as accurately as possible, even if disk problems affect other bits. Then they report spinning (*), standby (_), or unknown (?, if the drive is essentially gone).
  14. There is logic to see if fans are out of control and a BMC reset is needed in spinpid2.sh, but not in spinpid.sh. In my 1-zone board, I've never had the fans go out of control without a reset happening automatically, and then it is always due to some crazy experiment I'm doing.
  15. You can create a post-init task in FreeNAS GUI to start the script after booting. Otherwise you can run it short-term (for testing) in an SSH/console session or long-term in a tmux session.
  16. Note that if you stop the fan control scripts (with control-C or by killing the processes), your fans will stay in Full mode at the last set duty cycle, with no fan control at all. It is recommended that you go into the IPMI GUI and change the mode to one that works for you. When spintest.sh finishes, it automatically returns the fan mode to what it was at the start.
  17. Thanks to testers (I don't have a dual-zone system to test it on), including @demob, @Kevin Horton, @lmannyr, and @Reign. Thanks also to @joeschmuck for info on reading drive status. Despite all the testing, since every system is different, I would fully expect issues for some users. You are responsible for your machine. I am not responsible for overheating, overcooling, explosions, nuclear meltdowns or zombie apocalypse.
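As an illustration of item 11, the sketch below applies the same sed filter to a made-up device list with one extra pattern appended. The sample lines only imitate the style of 'camcontrol devlist' output, and "ExampleVendor" is a hypothetical string standing in for whatever unique text identifies your own non-spinning device.

```shell
# Made-up sample in the style of 'camcontrol devlist' output (illustrative).
DEVLIST1='<WDC WD40EFRX-68WT0N0 82.00A82>   at scbus0 target 0 lun 0 (pass0,ada0)
<KINGSTON SV300S37A60G 525ABBF0>  at scbus1 target 0 lun 0 (pass1,ada1)
<ExampleVendor FlashDisk 1.00>    at scbus7 target 0 lun 0 (pass2,da0)'

# The sed expression from item 11, with one extra delete pattern appended.
# "ExampleVendor" is hypothetical; replace it with text unique to your device.
DEVLIST="$(echo "$DEVLIST1" | sed '/KINGSTON/d;/ADATA/d;/SanDisk/d;/OCZ/d;/LSI/d;/INTEL/d;/TDKMedia/d;/ExampleVendor/d')"

# Only the spinning drive should remain in the list.
echo "$DEVLIST"
```

Pick text that cannot match any of your spinning drives, since every line containing the pattern is dropped.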
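Item 13 describes parsing the bits of the SMART return value. The sketch below shows one way that idea could look; it is not the scripts' actual code. It assumes smartctl was run with '-n standby', so that bit 1 of the exit status indicates a drive in a low-power mode, and it treats bit 0 as "no usable answer". The function names drive_state and has_problem are hypothetical.

```shell
# drive_state RC: map a smartctl exit status to the log symbols in item 13.
# ASSUMPTIONS: smartctl was invoked with '-n standby', so bit 1 (value 2)
# set means the drive was in a low-power mode; bit 0 (value 1) set means
# no usable answer came back. The scripts' real logic may differ.
drive_state() {
    rc=$1
    if [ $(( rc & 1 )) -ne 0 ]; then
        echo "?"        # unknown
    elif [ $(( rc & 2 )) -ne 0 ]; then
        echo "_"        # standby
    else
        echo "*"        # spinning, even if other (health) bits are set
    fi
}

# has_problem RC: succeeds if any of bits 3-7 (values 8 through 128) is set.
# smartctl documents those bits as disk health indications.
has_problem() {
    [ $(( $1 & 248 )) -ne 0 ]
}
```

Testing the spin-state bits separately from the health bits is what lets a drive with, say, logged errors (bit 6) still be reported as spinning.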
Author
Glorious1

Latest updates

  1. A little bit better now

    A couple of slight improvements to the code for getting CPU temps. Thanks to @bestboy for the...
  2. More/better efficiency update

    I had earlier switched to the best solution recommended by @Stux for reading CPU temperature in...
  3. Efficiency updates to fan control scripts

    Changed the command for reading CPU temperature. The new command, suggested by @Stux, is...

Latest reviews

Quality commented scripts.
Excellent at keeping CPU and disks temperatures steady by adjusting fans rotation speeds.
Comments in the script are very useful and explanatory but since I'm a noob, I'd love if they were even easier to understand, as if you were explaining to your 5-yr-old child...
Well written scripts and very descriptive, just the way all scripts should be. If you have cooling concerns then you need these scripts as at a minimum they can evaluate your temp specs. BZ !
Phenomenal script. Almost mandatory. Fantastic explanations.