Script to control fan speed in response to hard drive temperatures

lmannyr

Contributor
Joined
Oct 11, 2015
Messages
198
No, both zones would be controlled in one script. Maybe I'll tinker with it in the next few days and send you something to try. The existing script has all the operations, so all the complicated stuff is figured out already; it's just a matter of copying and pasting, plus some minor edits.

Yes, more people have boards with dual zones than ones like mine. I think @Stux may have a script that deals with dual zone control, but I don't think it uses PID. The PID logic works very well for the drive zone.

Your script seems to be the way the board should control the fans from IPMI. It seems so logical, and it's super dynamic. I'm really surprised this isn't something Supermicro or the other motherboard makers already implement.

I was looking over the script again. I don't see where the CPU is assigned a FANx header. I see FAN1, FAN2, etc., but how and where does it assign one fan header to the CPU and another to the HDDs? It's a one-zone fan script, but you're clearly controlling one fan for the CPU and a different set of fans for the HDDs. I'm trying to understand the script.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Your script seems to be the way the board should control the fans from IPMI. It seems so logical, and it's super dynamic. I'm really surprised this isn't something Supermicro or the other motherboard makers already implement.
I wondered the same thing. I guess the big customers are enterprises who don't care about quiet and careful regulation, they just let the fans scream 24-7. The rest of us don't drive Supermicro development.
I don't see where the CPU is assigned a FANx header. I see FAN1, FAN2, etc., but how and where does it assign one fan header to the CPU and another to the HDDs? It's a one-zone fan script, but you're clearly controlling one fan for the CPU and a different set of fans for the HDDs. I'm trying to understand the script.
Remember, this script is for a single fan zone (zone 0).
Reading and setting fan mode is global, for all zones.
Reading and setting duty cycle is done for a zone, not an individual fan header.
The only place I think you see "FAN1" is where I'm reading RPMs using "ipmitool sdr". Each header does have its own reported RPM; I just picked FAN1, as they should all be the same.
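
To make the zone-versus-header distinction concrete, here is a minimal dry-run sketch. The raw byte sequences are the ones commonly reported for Supermicro X9/X10/X11 boards and are an assumption here, as are the helper names; the functions only print the commands so nothing is sent to the BMC.

```shell
#!/bin/sh
# Dry-run sketch: each function only prints an ipmitool command.
# ASSUMPTION: raw codes as commonly reported for Supermicro X9/X10/X11 BMCs.
# Verify them for your own board before running anything for real.

# Fan mode is global: one setting covers every zone.
# $1 = mode number (1 is assumed to mean "Full")
set_fan_mode_cmd() {
    echo "ipmitool raw 0x30 0x45 0x01 $1"
}

# Duty cycle is set per zone (0 or 1), never per fan header.
# $1 = zone, $2 = duty cycle in percent
set_zone_duty_cmd() {
    echo "ipmitool raw 0x30 0x70 0x66 0x01 $1 $2"
}

# RPM is the only value read per header, e.g. FAN1.
# $1 = header name
read_header_rpm_cmd() {
    echo "ipmitool sdr | grep $1"
}

set_fan_mode_cmd 1
set_zone_duty_cmd 0 50
read_header_rpm_cmd FAN1
```

Note that no command takes a header name when setting a speed; a header only shows up when reading RPM.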

Initially the script was regulating fans based only on drive temps. But when I did something CPU-intensive, that didn't work. I thought I borked my CPU but it was OK. Then I added a part where it looks at CPU temp. The cooling needs of the drives and the CPU are then compared, and one is chosen to drive the fan duty cycle ("Driver" in the output). But only one duty cycle is set.
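
The "Driver" choice can be sketched roughly as follows. This is a simplified illustration, not the code from the actual script: it assumes each heat source has already been turned into a wanted duty cycle, and the function name is made up.

```shell
#!/bin/sh
# Sketch of picking a "Driver" for a single fan zone.
# ASSUMPTION: both inputs are duty cycles in percent, already computed
# elsewhere (for example by the PID logic for drives and a CPU temp rule).
# $1 = duty cycle wanted by the drives, $2 = duty cycle wanted by the CPU
pick_driver() {
    if [ "$2" -gt "$1" ]; then
        echo "CPU $2"
    else
        echo "Drives $1"
    fi
}

pick_driver 40 65   # CPU needs more cooling, so it drives the zone
pick_driver 55 30   # drives need more cooling
```

Whichever source asks for more cooling wins, and only that one duty cycle is sent to the zone.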

The new script will control zones 0 and 1 independently, without picking a "Driver".
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
No, both zones would be controlled in one script. Maybe I'll tinker with it in the next few days and send you something to try. The existing script has all the operations, so all the complicated stuff is figured out already; it's just a matter of copying and pasting, plus some minor edits.

Yes, more people have boards with dual zones than ones like mine. I think @Stux may have a script that deals with dual zone control, but I don't think it uses PID. The PID logic works very well for the drive zone.

Yep, I never got around to adding a PID loop, as I found it unnecessary in my scenario.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
$DUTYhex can be in decimal

I.e., "100" means 100%
Thank you, this is new to me. I thought anything here was interpreted as hex. But you're right, 64 is interpreted as 64 decimal, 0x64 as hex (=100%). Nice to know!
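
Shell arithmetic follows the same convention, so it is a quick way to check a value before passing it on. A small sketch (the helper names are made up for illustration):

```shell
#!/bin/sh
# A bare number is decimal; a 0x prefix marks hexadecimal.

# Print the decimal value of a decimal or 0x-prefixed argument.
to_decimal() {
    echo "$(( $1 ))"
}

# Print a percentage as a 0x-prefixed hex byte.
to_hex() {
    printf '0x%02x\n' "$1"
}

to_decimal 0x64   # hex 64 is 100 decimal, i.e. 100%
to_decimal 64     # plain 64 stays 64, i.e. 64%
to_hex 100        # 100% written as a hex byte
```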
 

Bryan Lee

Dabbler
Joined
Jan 4, 2017
Messages
14
What a great thread. I'm currently building my first FreeNAS box using the X11SSL-CF board. I'm currently doing burn-in testing while watching drive and CPU heat, and playing with different fan settings to see how cool and quiet I can get this to run. Once I'm satisfied with the burn-in, I'll start working on implementing one of these control scripts. Thanks to everyone for their great work in here.

Edit:

So now I suppose I need to decide between two options: a 2-zone solution, with my script managing both the CPU and HD fan zones, or using Optimal to regulate the CPU fan, knowing that it will override the HD fan zone periodically, but counting on the script to put the fans back to their desired speed before too long.

I'm guessing one of the scripts in here will have logging to show when the HD fans' duty cycle is set to an unexpected value (due to interference from the BMC). If it happens once an hour or so, I'll probably go with a one-zone script for the HD fans. If it's every five minutes, I may go with a two-zone script.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
So now I suppose I need to decide between two options: a 2-zone solution, with my script managing both the CPU and HD fan zones, or using Optimal to regulate the CPU fan, knowing that it will override the HD fan zone periodically, but counting on the script to put the fans back to their desired speed before too long.

I'm guessing one of the scripts in here will have logging to show when the HD fans' duty cycle is set to an unexpected value (due to interference from the BMC). If it happens once an hour or so, I'll probably go with a one-zone script for the HD fans. If it's every five minutes, I may go with a two-zone script.
Once you, or the script, set the fan mode to full, you shouldn't have any interference from the BMC, and the script will have full control.

For your dual-zone system, you have two script options that I know of. One is Stux's hybrid fan controller script. It is well developed and has a lot of safety mechanisms in case there are problems. It is written in Perl.

The other is the dual-zone version of the PID Bash script that I wrote. It has a different logging system. You might decide based on which works better for you, whether you know Perl or Bash better (so you can play with it yourself), or which type of logs you like. I developed the dual-zone version with another user, as I don't have such a system. I think it is working well on that system, but I'm looking for another guinea pig to try it before publishing it on the forum. If you want to try it, send me a conversation message via my profile and I will send it to you.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
What a great thread. I'm currently building my first FreeNAS box using the X11SSL-CF board. I'm currently doing burn-in testing while watching drive and CPU heat, and playing with different fan settings to see how cool and quiet I can get this to run. Once I'm satisfied with the burn-in, I'll start working on implementing one of these control scripts. Thanks to everyone for their great work in here.

Edit:

So now I suppose I need to decide between two options: a 2-zone solution, with my script managing both the CPU and HD fan zones, or using Optimal to regulate the CPU fan, knowing that it will override the HD fan zone periodically, but counting on the script to put the fans back to their desired speed before too long.

I'm guessing one of the scripts in here will have logging to show when the HD fans' duty cycle is set to an unexpected value (due to interference from the BMC). If it happens once an hour or so, I'll probably go with a one-zone script for the HD fans. If it's every five minutes, I may go with a two-zone script.

I wouldn't bother messing around with Optimal; I'd go straight to a dual-zone script.

It's fairly easy to tune my script for your system, as the parameters are all documented and easily modifiable.
 

ghost reaper

Dabbler
Joined
May 21, 2014
Messages
45
It is working well on that system I think, but I'm looking for another guinea pig to try it before publishing it on the forum.

Hey, I'm willing to offer my system for testing; it is the latest gear I have. I stumbled across this post because I have a bad cooling issue, with HDD temps of roughly 30-40, and it's a good time for testing since it's summer here.

My current fan setup is not wired up the way yours are, but I do have 2 splicer cables.

EDIT: I just ran spincheck.sh, but the room is getting cold now; I had the main AC running with 3 fans to chill the server room.

I noticed 2 things: it reads FAN1, which is the CPU fan in my current setup, and this line:
Drive states: * spinning; _ standby; ? unknown
 

Attachments

  • spincheck.txt
    1.1 KB · Views: 380
  • ipmi sensors.txt
    755 bytes · Views: 405

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Hey, I'm willing to offer my system for testing; it is the latest gear I have. I stumbled across this post because I have a bad cooling issue, with HDD temps of roughly 30-40, and it's a good time for testing since it's summer here.

My current fan setup is not wired up the way yours are, but I do have 2 splicer cables.

EDIT: I just ran spincheck.sh, but the room is getting cold now; I had the main AC running with 3 fans to chill the server room.

I noticed 2 things: it reads FAN1, which is the CPU fan in my current setup, and this line:
Drive states: * spinning; _ standby; ? unknown
Thank you for offering to try the script. You are right, spincheck.sh currently reports RPM of only FAN1. I just updated it in that post so it now reports RPM of FAN1-4 and FANA; that capability is already in the dual-zone control script. It also now reports duty cycle of 2 zones (but that can be wrong depending on the board).

The "Drive states" line you pasted above is just a key to the symbols used in the output.

I sent you a conversation message about testing the dual-zone script.
 
Joined
Dec 2, 2015
Messages
730
I've been curious about how many different fan duty cycles the BMC has implemented. The ipmitool raw command allows the duty cycle to be specified either in hexadecimal or with decimal integers. Hex implies 64 different duty cycles (from 1 to 64) and decimal integers imply 100 different duty cycles. But it is quite possible that the BMC can only command a much smaller number of different duty cycles (e.g. 8, 16 or 32), and that it rounds the command to the closest available duty cycle. If so, fan control scripts implementing a PID control loop might work better if they chose from a smaller list of possible duty cycles.

Today I finally performed an experiment to attempt to investigate this question. My new system with X10SRH-cF motherboard has seven fans, two fans on the CPU cooler (one on each side), two chassis fans, and three hard drive fans. ipmitool only reports fan rpm in even 100s (e.g. 500, 600, 700 etc), but you can get a finer resolution look by looking at the average speed of all fans.

I wrote a perl script to command fan speeds from 100% to 25% duty cycle, decreasing the commanded duty cycle by 1% each step, waiting 15s for the speed to stabilize, then reporting the average fan speed. Then do the same thing all over again from 25% to 100%. I manually put the fans at 100% for several minutes before the test, to get the HD cool enough so they wouldn't over heat during the long period at low duty cycle in the middle of the test. I monitored the HD temps during the test, in case I needed to abort it to protect the drives.
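
For anyone who wants to repeat the test without Perl, here is a rough shell sketch of the averaging step, with the sweep itself left as comments. It is not the script used for the results in this post. The sample sensor lines, the raw duty cycle command and the choice of zone 0 are all assumptions; adjust them to what your board actually reports.

```shell
#!/bin/sh
# Average the RPM of every fan found in `ipmitool sdr`-style text on stdin.
# ASSUMPTION: fan lines look like "FAN1 | 1500 RPM | ok"; headers with
# nothing attached report "no reading" and are skipped.
average_rpm() {
    awk -F'|' '
        $1 ~ /^FAN/ && $2 ~ /RPM/ {
            split($2, parts, " ")      # parts[1] is the number before "RPM"
            sum += parts[1]
            n++
        }
        END {
            if (n > 0) printf "%d\n", sum / n
            else print "no data"
        }
    '
}

# Outline of the downward sweep (left commented out on purpose):
# duty=100
# while [ "$duty" -ge 25 ]; do
#     ipmitool raw 0x30 0x70 0x66 0x01 0 "$duty"   # assumed raw command, zone 0
#     sleep 15                                     # let the speed settle
#     echo "$duty $(ipmitool sdr | average_rpm)"
#     duty=$((duty - 1))
# done

# Demo with canned sensor text instead of a live BMC:
printf 'FAN1 | 1500 RPM | ok\nFAN2 | 1700 RPM | ok\nFANA | no reading | ns\n' | average_rpm
```

Averaging across fans is what gives finer resolution than the 100 RPM steps of a single sensor.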

The results show that the BMC can command at least 50 different duty cycles between 25% and 100%, which may imply that it has 64 duty cycles available from zero to full speed.

Code:
duty cycle (%)	ave fan speed (rpm)
100	   1642
99	   1642
98	   1628
97	   1642
96	   1585
95	   1585
94	   1571
93	   1557
92	   1528
91	   1528
90	   1514
89	   1514
88	   1514
87	   1485
86	   1457
85	   1457
84	   1428
83	   1414
82	   1400
81	   1385
80	   1385
79	   1371
78	   1357
77	   1328
76	   1314
75	   1314
74	   1300
73	   1300
72	   1271
71	   1242
70	   1228
69	   1214
68	   1214
67	   1185
66	   1185
65	   1171
64	   1142
63	   1128
62	   1085
61	   1085
60	   1085
59	   1071
58	   1057
57	   1028
56	   1028
55	   1000
54		985
53		942
52		942
51		942
50		928
49		914
48		871
47		871
46		857
45		842
44		814
43		771
42		771
41		771
40		742
39		742
38		700
37		657
36		642
35		642
34		642
33		614
32		585
31		542
30		542
29		542
28		485
27		485
26		457
25		442
		
		
25		442
26		457
27		485
28		485
29		542
30		542
31		557
32		585
33		614
34		642
35		642
36		642
37		657
38		714
39		742
40		742
41		771
42		771
43		771
44		814
45		842
46		857
47		857
48		871
49		914
50		928
51		942
52		942
53		942
54		985
55	   1000
56	   1028
57	   1028
58	   1042
59	   1071
60	   1085
61	   1085
62	   1100
63	   1114
64	   1142
65	   1171
66	   1185
67	   1185
68	   1214
69	   1214
70	   1228
71	   1228
72	   1257
73	   1300
74	   1300
75	   1314
76	   1314
77	   1328
78	   1371
79	   1371
80	   1385
81	   1385
82	   1400
83	   1414
84	   1414
85	   1457
86	   1457
87	   1485
88	   1514
89	   1514
90	   1514
91	   1528
92	   1528
93	   1557
94	   1571
95	   1585
96	   1585
97	   1642
98	   1642
99	   1642
100	   1642
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Today I finally performed an experiment to attempt to investigate this question.
Wow, that looks like quite an involved experiment! It is a question I hadn't considered before. I guess you're saying, if there are really only 64 duty cycles, the algorithm should bump to the next when a change is needed, rather than from one decimal number to the next, which may be the same duty cycle?

But the way I see it, all the changes you make are proportional to the range you have to play with. If the calculated change is too small to actually change the duty cycle, so be it. A change to the next duty cycle in this case is too big a change.

As you say, the fan RPMs are rounded to 100. The mean of fan RPMs tells you more than the RPM of one fan in the zone, but I wonder if it can be relied on to this extent. If the fans are all around 1060 RPM at one duty cycle, and all around 1080 in the next, the mean will be the same for both duty cycles.

That said, without any further evidence, I would say most likely you are right, and there are just the 64 steps. I'm just not sure I would change the logic because of that.
 
Joined
Dec 2, 2015
Messages
730
That said, without any further evidence, I would say most likely you are right, and there are just the 64 steps. I'm just not sure I would change the logic because of that.
I agree - don't change the scripts. There isn't enough resolution in the fan speed data to differentiate between 64 or 100 available duty cycles. And, those two values are close enough that it really doesn't matter in the big picture.

Worst case - there are only 64 available duty cycles, but the script is using decimal increments (i.e. it assumes 100 available steps). If one PID loop cycle asks for a speed change of 1, and it happens to round off to the same hex value as the current, then the fan speed won't change. But, the next time around the loop, the PID controller will likely ask for another 1 step increment, and this one will round to the next hex value for sure.
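
That worst case can be checked with a few lines of shell. This is only a sketch for a hypothetical BMC with 64 internal levels; the rounding rule is invented for illustration and nothing here is known about real firmware.

```shell
#!/bin/sh
# Hypothetical BMC with only 64 internal levels: map a 1-100 percent
# command to the nearest level using integer arithmetic.
# ASSUMPTION: this rounding rule is made up for the example.
level_for() {
    echo $(( ($1 * 64 + 50) / 100 ))
}

# Walk 1..100 in 1% steps and report the longest run of consecutive
# commands that land on the same internal level.
longest_run() {
    pct=1
    prev=0
    run=0
    longest=0
    while [ "$pct" -le 100 ]; do
        lvl=$(level_for "$pct")
        if [ "$lvl" -eq "$prev" ]; then
            run=$((run + 1))
        else
            run=1
        fi
        if [ "$run" -gt "$longest" ]; then
            longest=$run
        fi
        prev=$lvl
        pct=$((pct + 1))
    done
    echo "$longest"
}

longest_run
```

Under this assumed rule the longest run is 2, so a second 1% step always reaches a new level, which matches the reasoning above.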
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Btw, 0x64 == 100

(6*16 + 4)

There are 100 values between 0x01 and 0x64

0xF == 15
0x10 == 16
0xFF == 255
0x100 == 256

(Hope I'm not teaching you to suck eggs)
 
Joined
Dec 2, 2015
Messages
730
Btw, 0x64 == 100

(6*16 + 4)

There are 100 values between 0x01 and 0x64

0xF == 15
0x10 == 16
0xFF == 255
0x100 == 256

(Hope I'm not teaching you to suck eggs)
Duh! You are right of course, now that I actually think about it. Fortunately, I'm a mechanical engineering grad, not a computer engineer :)

I should have worded my description differently. The motivation for doing the test, the results, and the conclusion are still valid.
 

ghost reaper

Dabbler
Joined
May 21, 2014
Messages
45
Hey,

I got this error when I ran spinpid2.sh from Glorious1:
MB: X11SSL-F
./spinpid2.sh: line 124: printf: stdin:2:: invalid number
./spinpid2.sh: line 124: printf: syntax: invalid number
./spinpid2.sh: line 124: printf: error:: invalid number
./spinpid2.sh: line 124: printf: newline: invalid number
./spinpid2.sh: line 124: printf: unexpected: invalid number
./spinpid2.sh: line 124: printf: bc:: invalid number
./spinpid2.sh: line 124: printf: stdin:2:: invalid number
./spinpid2.sh: line 124: printf: syntax: invalid number
./spinpid2.sh: line 124: printf: error:: invalid number
./spinpid2.sh: line 124: printf: ): invalid number
./spinpid2.sh: line 124: printf: unexpected: invalid number

I'm not sure why yet. I'm still working through your code, and I've been busy with work.

Also, a note for users with this board: you must plug the CPU fan into FAN1 and the HDD fans into FANA. Without the CPU fan in the right header, I get random speeds and then the fans turn off (while spinpid is not running).
 
Joined
Dec 2, 2015
Messages
730
Also, a note for users with this board: you must plug the CPU fan into FAN1 and the HDD fans into FANA. Without the CPU fan in the right header, I get random speeds and then the fans turn off (while spinpid is not running).

This looks like what you'd see if the fan speed thresholds in IPMI were not set correctly. See this thread on how to set them.
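
As a rough idea of what setting the thresholds involves, here is a dry-run sketch that only prints the commands. The RPM values are placeholders rather than recommendations, and the helper name is made up; pick values below the slowest speed your own fans actually reach.

```shell
#!/bin/sh
# Dry-run sketch: prints ipmitool commands instead of running them.
# ASSUMPTION: the RPM numbers are placeholders for illustration only.
# $1 = sensor name
# $2 $3 $4 = lower non-recoverable, lower critical, lower non-critical
lower_thresh_cmd() {
    echo "ipmitool sensor thresh $1 lower $2 $3 $4"
}

for fan in FAN1 FANA; do
    lower_thresh_cmd "$fan" 100 200 300
done
```

If a fan's normal low speed sits under its lower critical threshold, the BMC may treat it as failed and take over the fans, which would fit the random speeds described above.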


Hey,

I got this error when I ran spinpid2.sh from Glorious1:
MB: X11SSL-F
./spinpid2.sh: line 124: printf: stdin:2:: invalid number
./spinpid2.sh: line 124: printf: syntax: invalid number
./spinpid2.sh: line 124: printf: error:: invalid number
./spinpid2.sh: line 124: printf: newline: invalid number
./spinpid2.sh: line 124: printf: unexpected: invalid number
./spinpid2.sh: line 124: printf: bc:: invalid number
./spinpid2.sh: line 124: printf: stdin:2:: invalid number
./spinpid2.sh: line 124: printf: syntax: invalid number
./spinpid2.sh: line 124: printf: error:: invalid number
./spinpid2.sh: line 124: printf: ): invalid number
./spinpid2.sh: line 124: printf: unexpected: invalid number

I'm not sure why yet. I'm still working through your code, and I've been busy with work.
I'm not sure which version of the script you are using, so the line numbers may not match. But, in the version of @Glorious1's script I just downloaded, line 104 gets the HD fan speed used by the script:
Code:
RPM=$($IPMITOOL sdr | grep "FAN1" | grep -Eo '[0-9]{2,5}')

If your HD fan is connected somewhere other than FAN1, change that line to match where your fan is hooked up.

The errors you report above look like what you would get if the script did not have valid data. It is possibly related to the fan speed thresholds not being set correctly, or perhaps because the script was looking for fan speed on the wrong header. These errors may go away if you fix the two issues above.
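
One way to make this kind of failure easier to spot is to check a value before handing it to printf. The sketch below uses made-up helper names and is not taken from either script; it just shows the idea.

```shell
#!/bin/sh
# Hypothetical guard: accept only plain non-negative integers.
# Leading zeros are rejected because printf %d would read them as octal.
is_integer() {
    case "$1" in
        ''|*[!0-9]*|0?*) return 1 ;;
        *) return 0 ;;
    esac
}

# Stand-in for reporting a sensor value.
# $1 = whatever the sensor read returned (possibly empty or an error text)
report_rpm() {
    if is_integer "$1"; then
        printf 'RPM: %d\n' "$1"
    else
        echo "RPM: no valid reading"
    fi
}

report_rpm 1200            # normal case
report_rpm ""              # header with no fan attached
report_rpm "syntax error"  # error text captured instead of a number
```

With a check like this, a missing reading produces one clear message instead of a string of "invalid number" errors.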
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
I'm not sure which version of the script you are using, so the line numbers may not match.
Sorry for the confusion. ghost reaper is testing an experimental dual-zone version of the PID script. We'll confine our back and forth about it to the separate conversation.

I think you're right about the thresholds, and also there is a problem with the script reading CPU temps for some reason.
 

demob

Dabbler
Joined
Dec 1, 2015
Messages
18
Sorry for the confusion. ghost reaper is testing an experimental dual-zone version of the PID script. We'll confine our back and forth about it to the separate conversation.

I think you're right about the thresholds, and also there is a problem with the script reading CPU temps for some reason.
Do you need any more dual-zone beta testers? I've just picked up a SC826E16 / X10SLL-F combo and would love to try some custom fan control per HDDs/CPU fan.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Do you need any more dual-zone beta testers? I've just picked up a SC826E16 / X10SLL-F combo and would love to try some custom fan control per HDDs/CPU fan.
Yes thanks. I'll send you a private message later.
 