SMART Temperature Notification Emails

Status
Not open for further replies.

xviruz

Dabbler
Joined
Oct 11, 2015
Messages
12
I very recently set up a FreeNAS box and am running into an error with SMART temperature email notifications. I'm on FreeNAS-9.3-STABLE-201509282017 and, in /var/log/messages, I'm seeing:

freenas alert.py: [common.system:210] Failed to send email: [Errno 1] _ssl.c:510: error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac

I'm triggering notifications with these SMART temperature settings:

Check interval: 3
Power mode: Idle
Difference: 2
Informational: 20
Critical: 25
Email to report: foo@gmail.com (where "foo" is my actual email)

I've also tried setting the above parameters to "3, Never, 0, 20, 25, foo@gmail.com"; "30, Never, 0, 40, 50, foo@gmail.com".
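
For anyone following along, here is a minimal Python sketch of how the three temperature values are usually understood. It assumes FreeNAS hands them to smartd as a `-W DIFF,INFO,CRIT` directive in smartd.conf, and that only the CRIT limit leads to a warning email (INFO is only logged), which is my reading of the smartd.conf man page. The function names are hypothetical, and the logic is simplified: real smartd also tracks new min/max temperatures.

```python
def smartd_temperature_directive(difference, informational, critical):
    """Build the smartd.conf '-W DIFF,INFO,CRIT' fragment (0 disables a value)."""
    return "-W {},{},{}".format(difference, informational, critical)


def classify_reading(temp, last_reported, difference, informational, critical):
    """Return the events a single reading would raise under simplified rules.

    Assumption: this ignores smartd's min/max tracking and only models the
    DIFF, INFO and CRIT comparisons.
    """
    events = []
    # DIFF: temperature changed by at least `difference` since the last report.
    if difference > 0 and last_reported is not None:
        if abs(temp - last_reported) >= difference:
            events.append("difference")
    # CRIT takes precedence over INFO when both limits are reached.
    if critical > 0 and temp >= critical:
        events.append("critical")
    elif informational > 0 and temp >= informational:
        events.append("informational")
    return events


if __name__ == "__main__":
    print(smartd_temperature_directive(2, 20, 25))
    print(classify_reading(28, 26, 2, 20, 25))
```

With the test values above (2, 20, 25), nearly any drive at room temperature should sit at or above the critical limit on every check.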

When temps pass the critical threshold, an alert is triggered in the web GUI and logged to /var/log/messages. The "Failed to send email" error does not consistently appear after each polling interval. I'm not sure why, but I don't receive emails either way. Leaving the email field blank also generates the error.

I'm pretty sure my email is set up correctly, as "Send Test Mail" works, both the mail and sendmail commands work, and I get UPS and security notification emails. My email settings:

From email: foo@gmail.com
Outgoing mail server: smtp.google.com
Port to connect to: 465
TLS/SSL: SSL
Use SMTP Authentication: true
Username: foo@gmail.com
Password: app password (as I'm using 2-factor auth)

I've set both the root and my own user's emails to foo@gmail.com.
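
To rule out the mail server itself, the SSL settings above can be exercised outside the FreeNAS alert system. This is a minimal sketch using Python 3's standard `smtplib`, not the code FreeNAS uses. The function names are hypothetical, and since FreeNAS 9.3 ships an older Python, it is meant to be run from another machine on the same network.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Create a small plain-text test message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "SMTP over SSL test"
    msg.set_content("Test message sent outside of the FreeNAS alert system.")
    return msg


def send_over_ssl(msg, host, port, username, password, timeout=30):
    """Send msg over an implicit-SSL SMTP connection (typically port 465)."""
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(host, port, context=context, timeout=timeout) as server:
        server.login(username, password)
        server.send_message(msg)


# Example call (fill in the values from the email settings above):
# send_over_ssl(build_test_message("foo@gmail.com", "foo@gmail.com"),
#               "<outgoing mail server>", 465, "foo@gmail.com", "<app password>")
```

If this succeeds repeatedly while the alert emails still fail, the problem is more likely in how the alert path opens its SSL connection than in the account settings.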

I've also tried rebooting and powering down, to no avail. If it matters, my hardware is an ASRock C2550D4I, 4x8GB Kingston ECC RAM, 8x4TB HGST NAS. Any ideas?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Why the heck do you have a check interval of 3 minutes with a differential of 2 degrees? Depending on the thermodynamics of your system, either you're going to be seeing wide enough fluctuations to get constant triggers, or you'll never get a trigger from it. Why not simply... leave it at the default???

I can almost guarantee you that when you first power up the system from cold iron, you *will* trigger that 2 degree differential.

In any case I think those values are unsustainable long term and devalue the function that SMART monitoring is supposed to provide.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I noticed it doesn't work on my server either, but all the other emails work perfectly. I've given up on finding the source of the problem, but it would be useful to know how to fix it.
 

xviruz

Dabbler
Joined
Oct 11, 2015
Messages
12
cyberjock said:
Why the heck do you have a check interval of 3 minutes with a differential of 2 degrees? Depending on the thermodynamics of your system, either you're going to be seeing wide enough fluctuations to get constant triggers, or you'll never get a trigger from it. Why not simply... leave it at the default???

I can almost guarantee you that when you first power up the system from cold iron, you *will* trigger that 2 degree differential.

In any case I think those values are unsustainable long term and devalue the function that SMART monitoring is supposed to provide.

I did it as a test to trigger it constantly (after the system was up), as I hoped would be clear from setting the critical threshold at 25C. My normal settings are 30, Never, 10, 40, 50. I've also tried with longer check intervals and I don't get emails either...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So are we saying that differential emails are somehow different (and broken)? Or that all SMART emailing is broken?

My emailing seems to work fine (I don't know if I've tested with the current build though), but I have differential set to zero.
 

xviruz

Dabbler
Joined
Oct 11, 2015
Messages
12
cyberjock said:
So are we saying that differential emails are somehow different (and broken)? Or that all SMART emailing is broken?

My emailing seems to work fine (I don't know if I've tested with the current build though), but I have differential set to zero.

That all SMART emailing is broken. I remember also testing with "3, never, 0, 20, 25" and "30, never, 0, 40, 50" and had the same problem. I'll retest it tonight. Any other settings you think would be insightful to try?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I haven't had a bad sector or anything like that so far, so I don't know if ALL SMART emails are broken. What I do know is that every other email works (weekly, scrub start, new update, custom scripts, ...) but not the ones regarding the temp limits.

I also tried different settings (long and short intervals, temp limits close to and far from the normal temp, ...) and didn't receive any email. I gave up because I know my thermal design is OK and I have my custom script with SMART values, but it would be great if it worked of course...
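
As an illustration of the custom-script approach (this is a hypothetical sketch, not the script mentioned above), drive temperature can be read from the attribute table that `smartctl -A` prints. It assumes the usual ATA attribute layout, where the raw value is the tenth column and the temperature is reported as attribute 194 or 190; drives that report temperature differently would need other handling.

```python
import subprocess


def parse_temperature(smartctl_output):
    """Return the raw value of attribute 194 or 190 as an int, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; RAW_VALUE starts at index 9.
        if len(fields) >= 10 and fields[0] in ("194", "190"):
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_temperature(device):
    """Run 'smartctl -A <device>' and parse the temperature from its output."""
    # smartctl's exit status is a bitmask and can be non-zero even when the
    # attribute table is printed, so the return code is not checked here.
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return parse_temperature(result.stdout)


# Example (requires smartmontools and sufficient privileges):
# print(read_temperature("/dev/da2"))
```

A script like this could be run from cron and send its own mail when a limit is exceeded, independently of the built-in temperature notifications.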
 

xviruz

Dabbler
Joined
Oct 11, 2015
Messages
12
Oops, yes, I meant that SMART temperature notifications are broken (i.e., both for >temp difference and >critical threshold). I've not had any SMART errors/tests/scrubs yet so I can't say anything about non-temperature SMART notifications.
 

xviruz

Dabbler
Joined
Oct 11, 2015
Messages
12
Okay I tested it with temperature difference disabled (set to 0) and it has the same problem. I've tried with intervals of 3, 5, 10, 30. I did manage to receive one email but it was clobbered with alerts from different intervals.

Again, the thing I notice is that the errors don't show up after every check interval with critical temps. I can't tell if that's because it outright fails to send (not even logging an error), or if it thinks it succeeded, or if there's a separate time interval between emails (e.g., max one email per 20 mins or something).
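
One way to narrow this down is to count how many critical temperature events and how many send failures actually appear in the log over the same period. The sketch below is a rough aid with hypothetical names. It assumes the failure lines contain the text "Failed to send email" as quoted earlier, and that smartd logs the same "reached critical limit" wording to /var/log/messages that appears in the alert text.

```python
import re

FAILURE_MARKER = "Failed to send email"
# Assumption: smartd's syslog message uses this wording for critical temps.
TEMPERATURE_PATTERN = re.compile(r"Temperature \d+ Celsius reached critical limit")


def summarize_log(lines):
    """Count critical temperature events and email send failures."""
    counts = {"critical_alerts": 0, "send_failures": 0}
    for line in lines:
        if TEMPERATURE_PATTERN.search(line):
            counts["critical_alerts"] += 1
        if FAILURE_MARKER in line:
            counts["send_failures"] += 1
    return counts


# Example:
# with open("/var/log/messages", errors="replace") as log:
#     print(summarize_log(log))
```

If there are many more critical events than send failures (and no emails), that would suggest either silent failures or some rate limiting between emails rather than an error on every attempt.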
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
This seems pretty scary to me (that SMART emails may not work). I think I may need to test this one today. :/
 

platinumjsi

Dabbler
Joined
Oct 13, 2015
Messages
49
I'm having the same issue. When I first set the box up a week or so ago, I got emails when the drives went over 40 degrees. Nothing now :/
IIRC there was a recent update I applied, so maybe that broke it?
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I have filed a detailed bug on the SMART issues. The bug number is 12046, but it is internal so nobody can see it but iX employees. I will update this thread as appropriate.

Thanks to everyone that has been involved.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
It was marked as "cannot reproduce" without actual investigation. Long story, but basically I'm done trying to fight this bug. If you have the issue (and I have no doubt others do, since this has been identified in the forum by dozens of users), feel free to open a new bug ticket. Maybe if someone besides me opens it then it'll get the attention it needs. :/
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
I just set my critical threshold low enough to generate a critical alert, and I did get an email. 9.10
 

biGdada

Dabbler
Joined
Oct 24, 2012
Messages
44
Glorious1 said:
I just set my critical threshold low enough to generate a critical alert, and I did get an email. 9.10

What kind of email? There are 2 emails that should be sent: one about a critical alert, sent to root, and another sent to the address(es) listed in the S.M.A.R.T. configuration dialog.
I'm not getting the second one.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
biGdada said:
What kind of email? There are 2 emails that should be sent: one about a critical alert, sent to root, and another sent to the address(es) listed in the S.M.A.R.T. configuration dialog.
I'm not getting the second one.

I don't know which one it is, since I would be the recipient of both. But the subject is "Critical Alerts" and the content of the email says "Device: /dev/da2 [SAT], Temperature 36 Celsius reached critical limit of 36 Celsius (Min/Max ??/36)". That sounds like all you need to know.
 

biGdada

Dabbler
Joined
Oct 24, 2012
Messages
44
Yep, that's the one being sent to root. I'm getting it too. The problem is with the one being sent to "Email to report" in the S.M.A.R.T. dialog.
 

rm-r

Contributor
Joined
Jan 7, 2013
Messages
166