Fan Scripts for Supermicro Boards Using PID Logic

Fan Scripts for Supermicro Boards Using PID Logic 2020-08-20, previous one was missing a file

glauco

Guru
Joined
Jan 30, 2017
Messages
526
So with both fans connected with the 4th pin, I'm curious what kind of readout you're getting from the spincheck.sh script. Can you run that for a bit and paste the log?

I'm going to have to review the script and see how it should react if there are different fans reporting speeds in a zone, and how that interacts with the two variables you mentioned - I just don't remember at the moment.
Thank you!
This is it. Let me know if I can help with testing!
Code:
===== OUTPUT OF SPINTEST.SH ===============================================

								 ___Duty%___  Curr_RPM____________________
						MODE	 Zone0 Zone1  FANA  FAN1  FAN2  FAN3  FAN4
Conditions before test  Full	   100   100  2000   ---  2000   ---  1500

								 ___Duty%___  Curr_RPM____________________
						MODE	 Zone0 Zone1  FANA  FAN1  FAN2  FAN3  FAN4
Duty cycle 100%		 Full	   100   100  2000   ---  2000   ---  1500
Duty cycle 90%		  Full		90	90  1900   ---  1800   ---  1400
Duty cycle 80%		  Full		80	80  1700   ---  1700   ---  1300
Duty cycle 70%		  Full		70	70  1500   ---  1500   ---  1100
Duty cycle 60%		  Full		60	60  1300   ---  1300   ---  1000
Duty cycle 50%		  Full		50	50  1100   ---  1100   ---   900
Duty cycle 40%		  Full		40	40   900   ---   900   ---   700
Duty cycle 30%		  Full		30	30   700   ---   700   ---   500
Duty cycle 20%		  Full	   100   100   400   ---   500   ---   400


===== OUTPUT OF SPINCHECK.SH ===============================================================================

		  ada1 ada2 ada3 ada4 ada5 ada6 Tmax Tmean  ERRc CPU  FAN1  FAN2  FAN3  FAN4  FANA Fan%0 Fan%1 MODE
16:37:13  *37  *37  *37  *35  *38  *37  ^38  36.83  3.26  45   ---  2000   ---  1500  2000   100   100 Full
16:38:14  *37  *37  *37  *35  *37  *37  ^37  36.67  3.10  45   ---  2000   ---  1500  2000   100   100 Full
16:39:15  *37  *37  *36  *35  *37  *37  ^37  36.50  2.93  44   ---  2000   ---  1500  2000   100   100 Full
16:40:16  *37  *37  *36  *34  *37  *37  ^37  36.33  2.76  48   ---  2000   ---  1500  2000   100   100 Full
16:41:17  *36  *36  *36  *34  *37  *37  ^37  36.00  2.43  45   ---  2000   ---  1500  2000   100   100 Full
16:42:18  *36  *36  *36  *34  *37  *36  ^37  35.83  2.26  48   ---  2000   ---  1500  2000   100   100 Full


===== OUTPUT OF SPINPID2.SH =======================================================================================================

****** SETTINGS ******
CPU zone 1; Peripheral zone 0
CPU fans min/max duty cycle: 30/100
PER fans min/max duty cycle: 30/100
CPU fans - measured RPMs at 30 0x1.2cp+9nd 100 2000uty cycle: /
PER fans - measured RPMs at 30 0x1.c2p+8nd 100 1500uty cycle: /
Drive temperature setpoint (C): 38
Kp=4, Ki=0, Kd=40
Drive check interval (main cycle; minutes): 5
CPU check interval (seconds): 2
CPU reference temperature (C): 38
CPU scalar: 4

Key to drive status symbols:  * spinning;  _ standby;  ? unknown							  Version 2018-01-01

Tuesday, Aug 14																CPU		 New_Fan%  New_RPM_____________________
		  ada1 ada2 ada3 ada4 ada5 ada6 Tmax Tmean   ERRc	  P	 I	  D TEMP MODE	CPU PER   FANA  FAN1  FAN2  FAN3  FAN4
16:54:43  *36  *35  *35  *34  *37  *36  ^37  35.50  -2.50 -10.00  0.00 -20.00   42 Full	 50  30   1100   ---   700   ---   500
17:00:59  *36  *36  *36  *35  *37  *36  ^37  36.00  -2.00  -8.00  0.00   4.00   42 Full	 46  30   1000   ---   700   ---   500
17:07:16  *37  *37  *37  *36  *38  *37  ^38  37.00  -1.00  -4.00  0.00   8.00   50 Full	 78  34   1700   ---   800   ---   600
17:13:32  *38  *38  *38  *36  *39  *38  ^39  37.83  -0.17  -0.68  0.00   6.64   47 Full	 66  40   1400   ---   900   ---   700
17:19:47  *38  *39  *38  *36  *39  *39  ^39  38.17   0.17   0.68  0.00   2.72   49 Full	 74  43   1600   ---  1000   ---   700
17:26:02  *39  *39  *38  *36  *40  *39  ^40  38.50   0.50   2.00  0.00   2.64   45 Full	 58  48   1300   ---  1100   ---   800
17:32:17  *39  *39  *39  *37  *40  *39  ^40  38.83   0.83   3.32  0.00   2.64   50 Full	 78  54   1700   ---  1200   ---   900
17:38:30  *39  *39  *39  *37  *40  *39  ^40  38.83   0.83   3.32  0.00   0.00   43 Full	 50  57   1100   ---  1300   ---  1000
17:44:43  *39  *39  *39  *37  *40  *39  ^40  38.83   0.83   3.32  0.00   0.00   43 Full	 50  60   1100   ---  1300   ---  1000
17:50:57  *39  *39  *39  *37  *39  *38  ^39  38.50   0.50   2.00  0.00  -2.64   43 Full	 50  59   1100   ---  1300   ---  1000
17:57:13  *39  *39  *39  *37  *39  *38  ^39  38.50   0.50   2.00  0.00   0.00   50 Full	 78  61   1700   ---  1300   ---  1000
Tuesday, Aug 14																CPU		 New_Fan%  New_RPM_____________________
		  ada1 ada2 ada3 ada4 ada5 ada6 Tmax Tmean   ERRc	  P	 I	  D TEMP MODE	CPU PER   FANA  FAN1  FAN2  FAN3  FAN4
18:03:26  *39  *39  *39  *37  *39  *38  ^39  38.50   0.50   2.00  0.00   0.00   45 Full	 58  63   1300   ---  1400   ---  1000
18:09:42  *39  *39  *38  *37  *39  *38  ^39  38.33   0.33   1.32  0.00  -1.36   45 Full	 58  63   1300   ---  1400   ---  1000
18:15:56  *39  *39  *38  *37  *39  *38  ^39  38.33   0.33   1.32  0.00   0.00   46 Full	 62  64   1300   ---  1400   ---  1100
18:22:09  *39  *39  *38  *37  *39  *38  ^39  38.33   0.33   1.32  0.00   0.00   45 Full	 58  65   1300   ---  1400   ---  1100
18:28:25  *39  *39  *38  *37  *39  *38  ^39  38.33   0.33   1.32  0.00   0.00   46 Full	 62  66   1400   ---  1400   ---  1100
18:34:38  *39  *39  *38  *37  *39  *38  ^39  38.33   0.33   1.32  0.00   0.00   45 Full	 58  67   1300   ---  1500   ---  1100
18:40:53  *39  *39  *38  *37  *39  *39  ^39  38.50   0.50   2.00  0.00   1.36   45 Full	 58  70   1300   ---  1500   ---  1100
18:47:07  *39  *39  *38  *37  *39  *38  ^39  38.33   0.33   1.32  0.00  -1.36   46 Full	 62  70   1400   ---  1500   ---  1100
18:53:22  *39  *38  *38  *37  *39  *38  ^39  38.17   0.17   0.68  0.00  -1.28   53 Full	 90  69   1900   ---  1500   ---  1100
18:59:36  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00  -1.36   45 Full	 58  68   1300   ---  1500   ---  1100
19:05:50  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   49 Full	 74  68   1600   ---  1500   ---  1100
19:12:05  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   45 Full	 58  68   1300   ---  1500   ---  1100
19:18:19  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   49 Full	 74  68   1600   ---  1500   ---  1100
19:24:33  *39  *38  *38  *37  *39  *38  ^39  38.17   0.17   0.68  0.00   1.36   44 Full	 54  70   1200   ---  1500   ---  1100
19:30:50  *39  *39  *38  *37  *39  *38  ^39  38.33   0.33   1.32  0.00   1.28   53 Full	 90  73   1900   ---  1600   ---  1200
19:37:08  *39  *38  *38  *37  *39  *38  ^39  38.17   0.17   0.68  0.00  -1.28   51 Full	 82  72   1700   ---  1500   ---  1200
19:43:24  *39  *38  *38  *37  *39  *38  ^39  38.17   0.17   0.68  0.00   0.00   44 Full	 54  73   1200   ---  1600   ---  1200
19:49:38  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00  -1.36   43 Full	 50  72   1100   ---  1500   ---  1200
19:55:52  *38  *38  *37  *36  *39  *38  ^39  37.67  -0.33  -1.32  0.00  -2.64   48 Full	 70  68   1500   ---  1500   ---  1100
20:02:06  *38  *38  *37  *36  *38  *37  ^38  37.33  -0.67  -2.68  0.00  -2.72   46 Full	 62  63   1300   ---  1400   ---  1000
20:08:20  *38  *38  *37  *36  *38  *37  ^38  37.33  -0.67  -2.68  0.00   0.00   46 Full	 62  60   1300   ---  1300   ---  1000
20:14:45  *38  *38  *37  *36  *38  *37  ^38  37.33  -0.67  -2.68  0.00   0.00   53 Full	 90  57   1900   ---  1300   ---  1000
20:21:10  *38  *38  *37  *36  *38  *37  ^38  37.33  -0.67  -2.68  0.00   0.00   53 Full	 90  54   1900   ---  1200   ---   900


=====SPINPID2.SH SETTINGS======================

ZONE_CPU=1
ZONE_PER=0
DUTY_PER_MIN=30
DUTY_PER_MAX=100
DUTY_CPU_MIN=30
DUTY_CPU_MAX=100
RPM_CPU_30=600
RPM_CPU_MAX=2000
RPM_PER_30=450
RPM_PER_MAX=1500
SP=38
Kp=4
Ki=0
Kd=40
CPU_T=2
CPU_REF=38
CPU_SCALE=4


===== OUTPUT OF IPMITOOL SENSOR ==========================================================================================

FAN1			 | na		 |			| na	| na		| na		| na		| na		| na		| na
FAN2			 | 900.000	| RPM		| ok	| 300.000   | 400.000   | 500.000   | 2000.000  | 2100.000  | 2200.000
FAN3			 | na		 |			| na	| na		| na		| na		| na		| na		| na
FAN4			 | 700.000	| RPM		| ok	| 200.000   | 300.000   | 400.000   | 1500.000  | 1600.000  | 1700.000
FANA			 | 1500.000   | RPM		| ok	| 300.000   | 400.000   | 500.000   | 2000.000  | 2100.000  | 2200.000
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
OK @glauco, thanks for all the data! First thing you have to do is move one of the fans to the FAN1 header. As noted in the Overview > Before using the fan control scripts > #3, FAN1 must be occupied. That is the header the script uses to read RPM.

Unfortunately that means you may have to set the fan thresholds for that header as you did for the header where you have it now.

Then just set the two variables you asked about based on the spintest results for that fan. E.g., if you put your fast fan there, which is currently in FAN2, you would set as follows:

RPM_PER_30=700
RPM_PER_MAX=2000

The other fan should be regulated proportionally.
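For a rough idea of what "proportionally" means here: both fans in a zone receive the same duty cycle, so each one lands at its own RPM for that duty. The sketch below estimates that RPM by linear interpolation between the measured 30% and 100% values from the spintest output. The linear fan curve is an assumption (real fans only approximate it), and the function name `est_rpm` is hypothetical, not part of the scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of spinpid2.sh): estimate RPM at a duty cycle
# by linear interpolation between the measured 30% and 100% values.
est_rpm() {
    local duty=$1 rpm_30=$2 rpm_max=$3
    echo $(( rpm_30 + (duty - 30) * (rpm_max - rpm_30) / 70 ))
}

# spintest values from the post above: fast fan 700/2000 RPM, slow fan 500/1500 RPM
for duty in 30 50 100; do
    printf 'duty %3d%%: fast fan ~%d RPM, slow fan ~%d RPM\n' \
        "$duty" "$(est_rpm "$duty" 700 2000)" "$(est_rpm "$duty" 500 1500)"
done
```

The estimates are only a sanity check against the spintest table; the measured values there are rounded to the nearest 100 RPM.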

I see there is an error in the settings display at the head of the spinpid2.sh log - these variables are not printing correctly. Assuming you have numbers in there and the problem persists, I will look at that and ask you to test a new version - I can't run that script properly on my machine.
 

glauco

Guru
Joined
Jan 30, 2017
Messages
526
Thank you @Glorious1, I completely missed the part about connecting (at least one of) the HD fans to the FAN1 header!
It was easy because the FAN1 header is right next to the FAN2 header on my mobo.

Then I set the correct thresholds for the FAN1 header:
# ipmitool sensor thresh "FAN1" lower 300 400 500
# ipmitool sensor thresh "FAN1" upper 2000 2100 2200
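When the same thresholds have to be applied to more than one header, it can help to generate the two commands from one place. This is a minimal sketch that only prints the commands (a dry run) rather than executing them; the function name `fan_thresh_cmds` is hypothetical. The argument order follows the ipmitool convention of lower non-recoverable, lower critical, lower non-critical, then upper non-critical, upper critical, upper non-recoverable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the ipmitool commands that set
# thresholds for one fan header.
# Args: sensor name, LNR LCR LNC (lower), UNC UCR UNR (upper).
fan_thresh_cmds() {
    local fan=$1
    printf 'ipmitool sensor thresh "%s" lower %s %s %s\n' "$fan" "$2" "$3" "$4"
    printf 'ipmitool sensor thresh "%s" upper %s %s %s\n' "$fan" "$5" "$6" "$7"
}

# Same values as used for FAN1 above
fan_thresh_cmds FAN1 300 400 500 2000 2100 2200
```

Review the printed lines before running them as root on the actual machine.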

===== OUTPUT OF IPMITOOL SENSOR ==========================================================================================

FAN1 | 2000.000 | RPM | nc | 300.000 | 400.000 | 500.000 | 2000.000 | 2100.000 | 2200.000
FAN2 | na | | na | na | na | na | na | na | na
FAN3 | na | | na | na | na | na | na | na | na
FAN4 | 1500.000 | RPM | nc | 200.000 | 300.000 | 400.000 | 1500.000 | 1600.000 | 1700.000
FANA | 2000.000 | RPM | nc | 300.000 | 400.000 | 500.000 | 2000.000 | 2100.000 | 2200.000


Opened spinpid2.sh and set the measured fan RPMs at 30% duty cycle and 100% duty cycle:
RPM_PER_30=700
RPM_PER_MAX=2000

and rebooted.

spinpid2.sh was launched at startup as a Tasks > Post Init script and this is the tail of the log file:
Code:
===== SPINPID.LOG ================================================================================================================

****** SETTINGS ******
CPU zone 1; Peripheral zone 0
CPU fans min/max duty cycle: 30/100
PER fans min/max duty cycle: 30/100
CPU fans - measured RPMs at 30 0x1.2cp+9nd 100 2000uty cycle: /
PER fans - measured RPMs at 30 0x1.2cp+9nd 100 2000uty cycle: /
Drive temperature setpoint (C): 38
Kp=4, Ki=0, Kd=40
Drive check interval (main cycle; minutes): 5
CPU check interval (seconds): 2
CPU reference temperature (C): 38
CPU scalar: 4

Key to drive status symbols:  * spinning;  _ standby;  ? unknown							  Version 2018-01-01

Wednesday, Aug 15															  CPU		 New_Fan%  New_RPM_____________________
		  ada1 ada2 ada3 ada4 ada5 ada6 Tmax Tmean   ERRc	  P	 I	  D TEMP MODE	CPU PER   FANA  FAN1  FAN2  FAN3  FAN4
09:50:39  *35  *36  *35  *33  *36  *35  ^36  35.00  -3.00 -12.00  0.00 -24.00   47 Full	 50  30   1100   700   ---   ---   500
09:56:54  *36  *36  *36  *34  *36  *36  ^36  35.67  -2.33  -9.32  0.00   5.36   38 Full	 30  30	700   700   ---   ---   500
10:03:09  *36  *36  *36  *34  *37  *36  ^37  35.83  -2.17  -8.68  0.00   1.28   38 Full	 30  30	700   700   ---   ---   500
10:09:23  *36  *37  *36  *35  *37  *36  ^37  36.17  -1.83  -7.32  0.00   2.72   37 Full	 30  30	700   700   ---   ---   500
10:15:38  *36  *37  *36  *35  *37  *36  ^37  36.17  -1.83  -7.32  0.00   0.00   39 Full	 34  30	700   700   ---   ---   500
10:21:53  *37  *37  *36  *35  *37  *37  ^37  36.50  -1.50  -6.00  0.00   2.64   38 Full	 30  30	700   700   ---   ---   500
10:28:08  *37  *37  *37  *35  *38  *37  ^38  36.83  -1.17  -4.68  0.00   2.64   38 Full	 30  30	700   700   ---   ---   500
10:34:22  *37  *38  *37  *35  *38  *37  ^38  37.00  -1.00  -4.00  0.00   1.36   39 Full	 34  30	700   700   ---   ---   500
10:40:36  *37  *38  *37  *35  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   40 Full	 38  30	800   700   ---   ---   500
10:46:51  *37  *38  *37  *35  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   38 Full	 30  30	700   700   ---   ---   500
10:53:05  *37  *38  *37  *36  *38  *37  ^38  37.17  -0.83  -3.32  0.00   1.36   38 Full	 30  30	700   700   ---   ---   500
10:59:20  *37  *38  *37  *36  *38  *37  ^38  37.17  -0.83  -3.32  0.00   0.00   38 Full	 30  30	700   700   ---   ---   500
11:05:34  *37  *38  *37  *36  *38  *37  ^38  37.17  -0.83  -3.32  0.00   0.00   38 Full	 30  30	700   700   ---   ---   500
11:11:49  *37  *38  *37  *36  *38  *37  ^38  37.17  -0.83  -3.32  0.00   0.00   38 Full	 30  30	700   700   ---   ---   500
11:18:04  *37  *38  *37  *36  *38  *37  ^38  37.17  -0.83  -3.32  0.00   0.00   39 Full	 34  30	700   700   ---   ---   500
11:24:19  *37  *38  *37  *36  *38  *37  ^38  37.17  -0.83  -3.32  0.00   0.00   38 Full	 30  30	700   700   ---   ---   500
11:30:34  *38  *38  *37  *36  *38  *37  ^38  37.33  -0.67  -2.68  0.00   1.28   39 Full	 34  30	700   700   ---   ---   500
11:36:49  *38  *38  *37  *36  *38  *38  ^38  37.50  -0.50  -2.00  0.00   1.36   38 Full	 30  30	700   700   ---   ---   500
11:43:04  *38  *39  *38  *36  *38  *38  ^39  37.83  -0.17  -0.68  0.00   2.64   40 Full	 38  32	800   700   ---   ---   600
11:49:20  *38  *39  *38  *36  *38  *38  ^39  37.83  -0.17  -0.68  0.00   0.00   40 Full	 38  31	800   700   ---   ---   500
11:55:35  *38  *38  *38  *36  *38  *38  ^38  37.67  -0.33  -1.32  0.00  -1.28   38 Full	 30  30	700   700   ---   ---   500
12:01:50  *38  *38  *38  *36  *38  *38  ^38  37.67  -0.33  -1.32  0.00   0.00   43 Full	 50  30   1100   700   ---   ---   500
12:08:06  *38  *39  *38  *36  *38  *38  ^39  37.83  -0.17  -0.68  0.00   1.28   39 Full	 34  31	700   700   ---   ---   500
12:14:21  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   1.36   39 Full	 34  32	700   700   ---   ---   600
12:20:36  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   41 Full	 42  32	900   700   ---   ---   600
12:26:51  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   38 Full	 30  32	700   700   ---   ---   600
12:33:06  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   39 Full	 34  32	700   700   ---   ---   600
12:39:21  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   40 Full	 38  32	800   700   ---   ---   600

Seems to be working quite well, doesn't it? Ambient temperature varies a lot. Today is a pretty cool day, but some days it gets as hot as 31°C = 88°F!
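As a cross-check on the CPU column: most rows in the log above are consistent with a simple proportional rule, where the duty starts at the minimum and rises by the CPU scalar for each degree above the CPU reference temperature (the first row after startup is an exception). This is inferred from the logged numbers only, not from the script source, and the function name `cpu_duty` is hypothetical.

```shell
#!/usr/bin/env bash
# Settings as shown in the log header above
DUTY_CPU_MIN=30
DUTY_CPU_MAX=100
CPU_REF=38
CPU_SCALE=4

# Hypothetical reconstruction: duty rises CPU_SCALE points per degree above
# CPU_REF and is clamped to the min/max duty cycle.
cpu_duty() {
    local temp=$1
    local duty=$(( DUTY_CPU_MIN + (temp - CPU_REF) * CPU_SCALE ))
    if [ "$duty" -lt "$DUTY_CPU_MIN" ]; then duty=$DUTY_CPU_MIN; fi
    if [ "$duty" -gt "$DUTY_CPU_MAX" ]; then duty=$DUTY_CPU_MAX; fi
    echo "$duty"
}

cpu_duty 40   # prints 38, as in the 10:40:36 row
cpu_duty 37   # prints 30 (clamped to the minimum), as in the 10:09:23 row
```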
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
So far so good @glauco. Keep an eye on it as the temperature and workload vary and let me know how it goes.

If you want to repair that cosmetic issue in the settings printed at startup, go to about lines 314-315 and double the '%' characters after 30 and 100, as follows:
Code:
printf "CPU fans - measured RPMs at 30%% and 100%% duty cycle: %s/%s\n" $RPM_CPU_30 $RPM_CPU_MAX
printf "PER fans - measured RPMs at 30%% and 100%% duty cycle: %s/%s\n" $RPM_PER_30 $RPM_PER_MAX

I will post an updated script soon, but it's no big deal.
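For context on why the doubled '%' matters: in a printf format string a single '%' starts a conversion, so "% a" and "% d" were parsed as conversions and consumed the two RPM arguments, leaving nothing for the final %s/%s. That is where the 0x1.2cp+9 text in the log came from (600 printed as a hexadecimal float). A minimal sketch of the corrected form, wrapped in a hypothetical function `print_rpm_line` so it can be tried on its own:

```shell
#!/usr/bin/env bash
# '%%' prints a literal percent sign; each '%s' consumes the next argument.
print_rpm_line() {
    # $1 = label, $2 = RPM at 30% duty, $3 = RPM at 100% duty
    printf '%s fans - measured RPMs at 30%% and 100%% duty cycle: %s/%s\n' "$1" "$2" "$3"
}

print_rpm_line CPU 600 2000
print_rpm_line PER 700 2000
```

Quoting the variables passed to printf is not required for plain numbers, but it keeps an empty variable from shifting the remaining arguments.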
 

glauco

Guru
Joined
Jan 30, 2017
Messages
526
Thanks, I've added an extra '%' after 30 and 100 in those two lines of the spinpid2.sh script.
Why are there two running instances of the spinpid2.sh?
# ps aux | grep spinpid2.sh
root 9663 0.0 0.0 8180 3560 v0 S+ 09:50 0:12.21 /usr/local/bin/bash /mnt/MyVolume/scripts/spinscripts/scripts/spinpid2.sh
root 9664 0.0 0.0 8180 3416 v0 I+ 09:50 0:00.00 /usr/local/bin/bash /mnt/MyVolume/scripts/spinscripts/scripts/spinpid2.sh

How do you suggest I go about reloading the configuration? Should I kill those two processes and launch the script again in the background?
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
I don't know why there are two instances. I'm guessing the 09:50 indicates the starting time, because that's when the log you showed above started. So they started at the same time. You may want to check your post-init task.

On the other hand, the log doesn't look like there are two instances writing to it, so I don't know what might be going on there.

I wouldn't bother restarting the edited script just to implement the edits, since the change only affects the beginning of the log.
 

glauco

Guru
Joined
Jan 30, 2017
Messages
526
Ok, I've scrubbed my 3.3TB ZFS pool to heat up the disks and see how hard disk fans (FAN1 and FAN4) react. It took 1 hour and 49 minutes.
I'm going to scrub it again on a hotter day because I want to check if I get BMC resets when the fans reach top speeds.
I am pretty happy with the results, but is there something I can do to keep the mean drive temperature closer to the setpoint (38°C)? Perhaps mess with Kp, Ki or Kd?
Code:
		  ada1 ada2 ada3 ada4 ada5 ada6 Tmax Tmean   ERRc	  P	 I	  D TEMP MODE	CPU PER   FANA  FAN1  FAN2  FAN3  FAN4
14:19:40  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   40 Full	 38  39	800   900   ---   ---   700
14:25:55  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   40 Full	 38  39	800   900   ---   ---   700
14:32:10  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   40 Full	 38  39	800   900   ---   ---   700
14:38:24  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   42 Full	 46  39   1000   900   ---   ---   700
14:44:39  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   41 Full	 42  39	900   900   ---   ---   700
14:50:53  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   1.36   43 Full	 50  41   1100   900   ---   ---   700
14:54:??   Started scrubbing ZFS pool
14:57:10  *39  *39  *38  *36  *39  *39  ^39  38.33   0.33   1.32  0.00   1.28   41 Full	 42  44	900  1000   ---   ---   800
15:03:25  *39  *40  *39  *37  *40  *39  ^40  39.00   1.00   4.00  0.00   5.36   43 Full	 50  53   1100  1200   ---   ---   900
15:09:39  *40  *40  *39  *37  *40  *40  ^40  39.33   1.33   5.32  0.00   2.64   43 Full	 50  61   1100  1300   ---   ---  1000
15:15:54  *40  *40  *39  *37  *40  *40  ^40  39.33   1.33   5.32  0.00   0.00   46 Full	 62  66   1300  1400   ---   ---  1100
15:22:09  *39  *40  *39  *37  *40  *39  ^40  39.00   1.00   4.00  0.00  -2.64   47 Full	 66  67   1400  1400   ---   ---  1100
15:28:24  *39  *39  *39  *37  *40  *39  ^40  38.83   0.83   3.32  0.00  -1.36   47 Full	 66  69   1400  1500   ---   ---  1100
15:34:40  *39  *39  *39  *37  *40  *39  ^40  38.83   0.83   3.32  0.00   0.00   47 Full	 66  72   1400  1500   ---   ---  1200
15:40:55  *39  *39  *38  *37  *40  *39  ^40  38.67   0.67   2.68  0.00  -1.28   49 Full	 74  73   1600  1600   ---   ---  1200
15:47:09  *39  *39  *38  *37  *40  *39  ^40  38.67   0.67   2.68  0.00   0.00   48 Full	 70  76   1500  1600   ---   ---  1200
15:53:23  *39  *39  *38  *37  *40  *38  ^40  38.50   0.50   2.00  0.00  -1.36   48 Full	 70  77   1500  1600   ---   ---  1200
15:59:38  *39  *38  *38  *37  *40  *38  ^40  38.33   0.33   1.32  0.00  -1.36   47 Full	 66  77   1400  1600   ---   ---  1200
16:05:51  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00  -2.64   45 Full	 58  74   1300  1600   ---   ---  1200
16:12:04  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   43 Full	 50  74   1100  1600   ---   ---  1200
16:18:17  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   44 Full	 54  74   1200  1600   ---   ---  1200
16:24:30  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   47 Full	 66  74   1400  1600   ---   ---  1200
16:30:44  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   47 Full	 66  74   1400  1600   ---   ---  1200
16:36:56  *38  *38  *38  *37  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   43 Full	 50  74   1100  1600   ---   ---  1200
16:43:48  Scrub ended
16:43:08  *38  *38  *37  *37  *39  *38  ^39  37.83  -0.17  -0.68  0.00  -1.36   42 Full	 46  72   1000  1500   ---   ---  1200
16:49:24  *37  *37  *37  *36  *38  *37  ^38  37.00  -1.00  -4.00  0.00  -6.64   39 Full	 34  61	800  1400   ---   ---  1000
16:55:39  *37  *37  *37  *36  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   41 Full	 42  57	900  1300   ---   ---  1000
17:01:54  *37  *37  *37  *36  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   43 Full	 50  53   1100  1200   ---   ---   900
17:08:09  *37  *37  *37  *36  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   40 Full	 38  49	800  1100   ---   ---   800
17:14:23  *37  *37  *37  *36  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   50 Full	 78  45   1700  1000   ---   ---   800
17:20:41  *37  *37  *37  *36  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   41 Full	 42  41	900   900   ---   ---   700
17:26:59  *37  *38  *37  *36  *38  *38  ^38  37.33  -0.67  -2.68  0.00   2.64   42 Full	 46  41   1000   900   ---   ---   700
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Interesting @glauco. My impression is your fan cooling capacity is lower than mine, relative to the cooling need. So right, some tuning might be in order. One thing you might try first, though, is reducing the time of the main drive cycle, so it checks the drives and adjusts every 2 or 3 minutes instead of 5. This might allow it to adjust more quickly.

If that isn't satisfactory, you can play with increasing Kp and/or Kd. See the brief tuning suggestions at the end of the script, and you can get more gory details here: https://forums.freenas.org/index.ph...d-drive-temperatures.41294/page-4#post-285668

Did you figure out why/if the script was running twice?
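To make the tuning knobs concrete: the P and D columns in the logs above are consistent with P = Kp × ERRc and D = Kd × (ERRc minus the previous ERRc) / drive interval in minutes, with the peripheral duty then changed by the rounded sum (Ki is 0 here, so the I term drops out). That relationship is read off the logged numbers, not taken from the script source, so treat the following as a sketch; `pd_terms` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper reproducing one log row's P, D and duty change.
# Args: Kp Kd interval_minutes ERRc previous_ERRc
pd_terms() {
    awk -v kp="$1" -v kd="$2" -v t="$3" -v err="$4" -v prev="$5" 'BEGIN {
        p = kp * err                    # proportional term
        d = kd * (err - prev) / t       # derivative term, per minute of interval
        sum = p + d
        # round to nearest integer (int() truncates toward zero)
        if (sum >= 0) change = int(sum + 0.5); else change = int(sum - 0.5)
        printf "P=%.2f D=%.2f change=%d\n", p, d, change
    }'
}

pd_terms 4 40 5 -2.00 -2.50   # 17:00:59 row: P=-8.00 D=4.00 change=-4
pd_terms 4 40 5 -1.00 -2.00   # 17:07:16 row: P=-4.00 D=8.00 change=4
pd_terms 4 40 3 -1.33 -1.50   # same Kd with a 3-minute interval
```

Under this reading, shortening the interval makes the same temperature step produce a larger D term, so the fans respond both sooner and harder to a change, even with Kp and Kd untouched.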
 

glauco

Guru
Joined
Jan 30, 2017
Messages
526
Thank you! After setting DRIVE_T to 3 (down from 5), ERRc strays just a tiny bit from 0, a huge improvement!
Code:
****** SETTINGS ******
CPU zone 1; Peripheral zone 0
CPU fans min/max duty cycle: 30/100
PER fans min/max duty cycle: 30/100
CPU fans - measured RPMs at 30% and 100% duty cycle: 600/2000
PER fans - measured RPMs at 30% and 100% duty cycle: 700/2000
Drive temperature setpoint (C): 38
Kp=4, Ki=0, Kd=40
Drive check interval (main cycle; minutes): 3
CPU check interval (seconds): 2
CPU reference temperature (C): 38
CPU scalar: 4

Key to drive status symbols:  * spinning;  _ standby;  ? unknown							  Version 2018-01-01

Thursday, Aug 16															   CPU		 New_Fan%  New_RPM_____________________
		  ada1 ada2 ada3 ada4 ada5 ada6 Tmax Tmean   ERRc	  P	 I	  D TEMP MODE	CPU PER   FANA  FAN1  FAN2  FAN3  FAN4
08:45:57  *37  *37  *37  *34  *37  *37  ^37  36.50  -1.50  -6.00  0.00 -20.00   46 Full	 50  30   1100   700   ---   ---   500
08:49:45  *37  *37  *37  *34  *37  *37  ^37  36.50  -1.50  -6.00  0.00   0.00   38 Full	 30  30	700   700   ---   ---   500
08:53:31  *37  *37  *37  *35  *37  *37  ^37  36.67  -1.33  -5.32  0.00   2.27   38 Full	 30  30	700   700   ---   ---   500
08:57:19  *37  *37  *37  *35  *37  *37  ^37  36.67  -1.33  -5.32  0.00   0.00   37 Full	 30  30	700   700   ---   ---   500
09:01:05  *37  *37  *37  *35  *37  *37  ^37  36.67  -1.33  -5.32  0.00   0.00   39 Full	 34  30	700   700   ---   ---   500
09:04:52  *37  *37  *37  *35  *37  *37  ^37  36.67  -1.33  -5.32  0.00   0.00   39 Full	 34  30	700   700   ---   ---   500
09:08:40  *37  *38  *37  *35  *37  *37  ^38  36.83  -1.17  -4.68  0.00   2.13   38 Full	 30  30	700   700   ---   ---   500
09:12:26  *37  *38  *37  *35  *38  *37  ^38  37.00  -1.00  -4.00  0.00   2.27   38 Full	 30  30	700   700   ---   ---   500
09:05:10  Scrub started
09:16:13  *37  *38  *37  *35  *38  *37  ^38  37.00  -1.00  -4.00  0.00   0.00   38 Full	 30  30	700   700   ---   ---   500
09:20:04  *38  *39  *38  *36  *38  *38  ^39  37.83  -0.17  -0.68  0.00  11.07   44 Full	 54  40   1200   900   ---   ---   700
09:23:51  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   2.27   42 Full	 46  42   1000  1000   ---   ---   700
09:27:38  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   42 Full	 46  42   1000  1000   ---   ---   700
09:31:25  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   43 Full	 50  42   1100  1000   ---   ---   700
09:35:13  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   2.27   46 Full	 62  45   1300  1000   ---   ---   800
09:39:00  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   0.00   43 Full	 50  46   1100  1000   ---   ---   800
09:42:49  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   0.00   45 Full	 58  47   1300  1100   ---   ---   800
09:46:36  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   0.00   46 Full	 62  48   1300  1100   ---   ---   800
09:50:24  *39  *39  *38  *36  *39  *39  ^39  38.33   0.33   1.32  0.00   2.13   44 Full	 54  51   1200  1100   ---   ---   900
09:54:12  *39  *39  *38  *36  *39  *39  ^39  38.33   0.33   1.32  0.00   0.00   45 Full	 58  52   1300  1100   ---   ---   900
09:57:59  *39  *39  *38  *36  *39  *39  ^39  38.33   0.33   1.32  0.00   0.00   46 Full	 62  53   1300  1200   ---   ---   900
10:01:46  *39  *39  *38  *36  *39  *39  ^39  38.33   0.33   1.32  0.00   0.00   45 Full	 58  54   1300  1200   ---   ---   900
10:05:33  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00  -2.13   46 Full	 62  53   1300  1200   ---   ---   900
10:09:20  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   0.00   43 Full	 50  54   1100  1200   ---   ---   900
10:13:09  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   0.00   46 Full	 62  55   1300  1200   ---   ---   900
10:16:55  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00  -2.27   44 Full	 54  53   1200  1200   ---   ---   900
10:20:42  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   41 Full	 42  53	900  1200   ---   ---   900
10:24:30  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   43 Full	 50  53   1100  1200   ---   ---   900
10:28:16  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   45 Full	 58  53   1300  1200   ---   ---   900
10:32:03  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   46 Full	 62  53   1300  1200   ---   ---   900
10:35:49  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   2.27   42 Full	 46  56   1000  1200   ---   ---   900
10:39:36  *39  *39  *38  *36  *39  *38  ^39  38.17   0.17   0.68  0.00   0.00   44 Full	 54  57   1200  1300   ---   ---  1000
10:43:22  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00  -2.27   44 Full	 54  55   1200  1200   ---   ---   900
10:47:09  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   43 Full	 50  55   1100  1200   ---   ---   900
10:50:56  *38  *39  *38  *36  *39  *38  ^39  38.00   0.00   0.00  0.00   0.00   41 Full	 42  55	900  1200   ---   ---   900
10:54:42  *38  *38  *38  *36  *39  *38  ^39  37.83  -0.17  -0.68  0.00  -2.27   42 Full	 46  52   1000  1200   ---   ---   900
10:58:28  *38  *38  *38  *36  *39  *38  ^39  37.83  -0.17  -0.68  0.00   0.00   45 Full	 58  51   1300  1100   ---   ---   900
11:02:06  Scrub ended
11:02:14  *38  *38  *38  *36  *39  *38  ^39  37.83  -0.17  -0.68  0.00   0.00   41 Full	 42  50	900  1100   ---   ---   900

Regarding my fans, I had to pick those with quietness in mind because I live in a small apartment (the FreeNAS box sits right under my TV). That must be why they look weak to you.

I haven't investigated the double instance issue because I wouldn't know how to.
All I can tell you is that sometimes there are as many as 4 instances running simultaneously!
It would be interesting to hear from other users who have set your script to start as a post-init task in the web UI.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Yes, I don't think any tuning will get better precision than that.

You may want to create a new help thread in another forum section and see if you can get help with the multiple instances. Hopefully it's nothing - maybe there are some things the script does that spawn additional processes for part of it.
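One harmless way a single bash script can show up more than once in ps: any subshell (a pipeline stage, a backgrounded group, a command substitution) is a forked copy of the shell and carries the same command line. Whether spinpid2.sh does this is not confirmed here; the sketch below only demonstrates the general bash behaviour.

```shell
#!/usr/bin/env bash
# $$ keeps the PID of the main script even inside a subshell, while
# $BASHPID (bash 4 and later) reports the PID of the process actually running.
main_pid=$$
sub_pid=$(echo "$BASHPID")    # command substitution runs in a forked subshell

echo "main shell PID: $main_pid"
echo "subshell PID:   $sub_pid"
```

If the extra entries in ps come and go and have PIDs close to the main one, short-lived subshells like this are a plausible explanation.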
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
@dak180 thank you very much for posting that script! I have been monitoring this thread hoping someone would make the proper changes to work with ASRock boards (beyond my capabilities)!

I have a slightly different ASRock board than you. How did you find out all the values to put in your script that are below the line in your script "everything below here is MB specific"?

Thanks!
 

dak180

Patron
Joined
Nov 22, 2017
Messages
310
nojohnny101 said:
I have a slightly different ASRock board than you. How did you find out all the values to put in your script that are below the line in your script "everything below here is MB specific"?
There are really only three things below that line: the names of the temp sensors (these are easily obtained from the ipmi web interface), and the raw commands to read and write the fan speed settings; these I found through extensive web searches and verified experimentally and by contacting ASRock Rack support and asking.

I should also note that, in addition to being specific to a given MB, these values also depend on which role each connected fan is meant to play.
 

zvans18

Dabbler
Joined
Sep 6, 2016
Messages
23
I currently have 8x 7200 rpm drives I'm planning on phasing out for 5400 rpm due to fan noise and heat. I don't know much about PID math, but would I be right in assuming the script wouldn't behave quite right if I changed the set point to a more correct average temp? (like 6 equalizing at 34, the other two would be at, say, 28 for a set point of 32.5)
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
The scripts will bring the disks to whatever mean temperature you set, if the fan capacity and ambient temperature allow. The temp of individual disks will vary; disks that inherently generate more heat will be hotter and so on.
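As a worked example of the mean the scripts regulate to, here is the hypothetical drive mix from the question above (six drives at 34 and two at 28), with ERRc taken as Tmean minus the setpoint, which matches the logs earlier in the thread. The function name `mean_and_error` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mean of the drive temperatures and the error relative
# to the setpoint. Args: setpoint, then one temperature per drive.
mean_and_error() {
    local sp=$1
    shift
    printf '%s\n' "$@" | awk -v sp="$sp" '
        { sum += $1; n++ }
        END { printf "Tmean=%.2f ERRc=%.2f\n", sum / n, sum / n - sp }'
}

mean_and_error 32.5 34 34 34 34 34 34 28 28   # Tmean=32.50 ERRc=0.00
mean_and_error 38 36 35 35 34 37 36           # Tmean=35.50 ERRc=-2.50 (16:54:43 row)
```

Only the mean is compared with the setpoint, so the six warmer drives would settle above it and the two cooler ones below it.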
 

zvans18

Dabbler
Joined
Sep 6, 2016
Messages
23
I thought there was some part of the math that incorporated single disk delta, but if not then that makes things easier
 

noprobs

Explorer
Joined
Aug 12, 2012
Messages
53
Just built a new FreeNAS server with Supermicro mobo and found this script. Like it!
I note that the function that gets the HD temperature handles SATA drives only. I have a mix of SATA and SAS drives, so I will need to modify it to check the type of disk and then read the correct field from smartctl.

Before I start to amend it, has anybody already done this?

Thanks in advance.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
noprobs said:
Just built a new FreeNAS server with Supermicro mobo and found this script. Like it!
I note that HD get temp function is based on SATA drives only. I have a mix of SATA and SAS drives so will need to modify to check type of disk and then read the correct field from smartctl.
I don't know anything about SAS drives, but if they give temp in a smartctl query, it shouldn't be too complicated. If you need help send a query and response and I'll see what I can suggest.
 

noprobs

Explorer
Joined
Aug 12, 2012
Messages
53
Thanks for the quick response. Editing line 221 to " grep "Current Drive Temperature:" /var/tempfile | awk '{print $4}' " works.

I will just need to create a test for HD type, probably by searching for 'SAS' or 'SATA' in the smartctl output. I will write it this week and post back for anyone else interested.
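A sketch of how such a test could look, working on smartctl output read from stdin so it can be tried without touching a drive. Instead of searching for 'SAS' or 'SATA', it keys on whichever temperature line is present. The function name `drive_temp` and the sample lines are illustrative; the SATA branch assumes the usual attribute 194 (Temperature_Celsius) row with the raw value in the tenth column, which may not hold for every drive model.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read smartctl output on stdin and print the drive
# temperature, handling both the SAS and the SATA report layout.
drive_temp() {
    awk '
        /^Current Drive Temperature:/      { print $4;  found = 1; exit }  # SAS layout
        $1 == 194 && /Temperature_Celsius/ { print $10; found = 1; exit }  # SATA layout
        END { if (!found) print "?" }      # no temperature line recognised
    '
}

# Illustrative sample lines, not taken from a real drive in this thread
printf 'Current Drive Temperature:     35 C\n' | drive_temp
printf '194 Temperature_Celsius 0x0022 036 045 000 Old_age Always - 36\n' | drive_temp
```

In the script this could be fed from the temporary file mentioned above, for example `drive_temp < /var/tempfile`.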
 

StarkJohan

Explorer
Joined
Mar 27, 2015
Messages
62
Great script! I just needed to add support for FAN5 which was easy thanks to the well commented script. One question comes to mind though.

If a fan breaks or is removed/hot-swapped, the BMC automatically sets all fans to full as a safety measure. This of course causes the script to reset the BMC, as the RPM does not match the last set duty cycle, and it all ends up in a loop, as others have noticed. What approach could we take to tackle a fan failure?

Say a fan fails, perhaps a warning email could be sent?
Assuming it is one of the peripheral fans and not the CPU fan, instead of just looping the BMC reset, the script could continue doing its thing using the remaining fans.

Is this not a good way of handling it? Am I missing something?

Btw, I'm running spinpid2 with 1 cpu + 2 exhaust + 3 middle "hdd" fans.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Best way of handling it is by detecting that a fan is gone (RPM drops to zero) and either letting the BMC handle it or manually setting everything to 100%. Probably better to let the BMC handle it until the fan is back.
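A minimal sketch of the detection step, assuming the pipe-separated `ipmitool sensor` layout shown earlier in the thread. Because an unpopulated header also reads "na", the caller passes the list of headers that are supposed to have a fan. The function name `failed_fans` is hypothetical and not part of the scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given `ipmitool sensor` output on stdin and the fan
# headers expected to be populated, print those reading 0 RPM or "na".
failed_fans() {
    awk -F'|' -v expected="$*" '
        BEGIN {
            n = split(expected, names, " ")
            for (i = 1; i <= n; i++) want[names[i]] = 1
        }
        {
            name = $1; reading = $2
            gsub(/[ \t]/, "", name); gsub(/[ \t]/, "", reading)
            if ((name in want) && (reading == "na" || reading + 0 == 0)) print name
        }'
}

# Illustrative input; on a live system: ipmitool sensor | failed_fans FAN1 FAN4 FANA
sample='FAN1 | na | | na
FAN2 | 900.000 | RPM | ok
FAN4 | 0.000 | RPM | ok'
printf '%s\n' "$sample" | failed_fans FAN2 FAN4
```

If the function prints anything, the script could skip its own duty-cycle changes and BMC resets for that cycle and leave the fans to the BMC, along the lines suggested above, and optionally send the warning email.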
 