SOLVED Failure Alert Notifications via Email

Status
Not open for further replies.

phonoflux

Dabbler
Joined
Aug 23, 2012
Messages
21
Hi Guys

I've spent the better part of the afternoon searching on how I can get FreeNAS-8.2.0-RELEASE-p1-x64 to alert me via email if something's up.

'Something up' would include:
- Drive removed
- Drive with S.M.A.R.T. errors
- Anything else other than perfect operation

It seems that through the GUI the only alert you can set up is a SMART check on temperature: whether it has gone up by x, or is over x or over y.

I'm testing out FreeNAS in the hope of using it going forward and one of the tests I have done is probably the most basic, pulling out a disk while it is running to simulate worst case epic failure. The console reports this straight away and the alert light will eventually go to yellow indicating something is up but that's about it.

I've got email set up, my address is in the root account, and test emails are sending.

It seems I'm not alone in this requirement, as others have resorted to scripting checks that can be run as cron jobs every so often. However, finding a good one is proving difficult, and the ones I found were written back in 2009 for older versions.

Reporting on a failure or a SMART error has been a staple of every RAID solution I have ever come in contact with: software on an add-in card, onboard RAID, enterprise RAID (Hyper-V or ESX). I'm just so confused as to why there is nothing at all available for FreeNAS. Some users are saying "it's a bug in FreeBSD" and that's it; people seem to accept that.

My fear is that I use this and everything is sweet for, say, a year, so I never bother to check the NAS or log onto the web UI because it's up and running. In the examples above, a disk could be on its way out, unplugged, dead, etc. and I wouldn't even know about it. Then another disk dies and it's all over (I'm wanting to use RAID-Z).

Someone please tell me "Phono you fool, there's all that under xyz in the GUI" or "Ah, here, load this script up in cron and change your email, you'll be fine". I'm pretty damn good at using Google, and I've tried the search functions on here, going through pages and pages of posts, and I'm constantly coming up with nothing.
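
For anyone after the kind of cron check described above, here is a minimal sketch rather than a tested FreeNAS script. It assumes a Python 3.7+ interpreter is available on the box (a stock install may not have one), that a local mail transfer agent accepts mail on localhost, and that `zpool status -x` prints "all pools are healthy" when nothing is wrong. The function names are made up for illustration.

```python
"""Cron-style pool check: mail root when `zpool status -x` reports a problem."""
import smtplib
import subprocess
from email.message import EmailMessage

# Message printed by `zpool status -x` when every pool is fine.
HEALTHY_MARKER = "all pools are healthy"


def pools_need_attention(status_output: str) -> bool:
    """Return True unless the output is the 'all pools are healthy' message."""
    return HEALTHY_MARKER not in status_output.lower()


def build_alert(status_output: str, recipient: str = "root") -> EmailMessage:
    """Wrap the raw zpool output in a plain-text email addressed to recipient."""
    msg = EmailMessage()
    msg["Subject"] = "ZFS pool problem detected"
    msg["From"] = "root"
    msg["To"] = recipient
    msg.set_content(status_output)
    return msg


def main() -> None:
    """Run the check once; intended to be called from a cron job."""
    result = subprocess.run(
        ["zpool", "status", "-x"], capture_output=True, text=True
    )
    output = result.stdout + result.stderr
    if pools_need_attention(output):
        # Assumption: an MTA is listening on localhost port 25.
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(build_alert(output))
```

To use it from cron you would add a `main()` call at the bottom of the file and schedule it at whatever interval suits you. Keeping the parsing and the mail building in separate functions means they can be tried out on a desktop without a pool attached.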

Cheers
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
If you searched, which it seems like you did to a point, you should have found that this is a known issue that probably won't be resolved until FreeNAS 9.1, LATE this year or early next year. We're waiting for zfsd to be added to FreeBSD.
 

phonoflux

Dabbler
Joined
Aug 23, 2012
Messages
21
The posts I found stating this will be resolved in later versions were indeed there, but they were written by you too. I'm almost shocked that you're happy with that reply. Even now I'm looking over at the console and it's still complaining about the unavailable disk, offline uncorrectable sectors, etc.

So there's not even any workaround? The console knows about it; is there no way for it to send an email out to root?

Not even with a dirty hack? Doesn't have to be fancy. Recommendations to existing scripts if there are any? Frankly it seems like if I want any warning that something is up I need to leave a monitor hooked up to the nas and check it every so often?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
If a drive is disconnected you WILL receive an email with the 3am nightly email, if email is set up. It will show the zpool as degraded. It will NOT send you an email 30 seconds after the drive is disconnected.

If your drive has SMART errors you WILL get an email when the drive's SMART status is checked. Again, this is if the email address is provided.

I have personally tested both of these. I documented the first case in a thread somewhere on here within the last 30 days.
 

phonoflux

Dabbler
Joined
Aug 23, 2012
Messages
21
Ahh noobsauce80, excellent news to hear there is some reporting, thanks for that! Is this a standard email that sends to root at 3am with any issues I need to be made aware of?

By that theory, if root email is set up (and tested/working) I'll get these notifications?

I'll test this myself too tonight by:
1. Leaving the raid degraded
2. Plugging in a drive I know has issues and probably won't pass the SMART tests, and seeing what lovely joy I have in my inbox tomorrow.

Thanks for the speedy reply!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
"Is this a standard email that sends to root at 3am with any issues I need to be made aware of?"

You will receive the standard emails. The only indication of the degraded status will be that it says "DEGRADED". You will have to notice this yourself. It is the only indicator, unfortunately :( If your hard drives are connected to a RAID controller with an audible alarm you will obviously get the alarm within a few seconds of disconnecting the drive. Keep in mind that you should NEVER disconnect a hard drive from an array without first making FreeNAS aware of your intent to unplug the drive, per the manual. A loss of data may occur.

As for the SMART tests, you will likely set up a SMART check every 30 minutes or so. When that check is performed, if your drive has certain characteristics you will immediately get an email. My emails were for uncorrectable read errors. I can only assume that if the disk had failed the SMART test you would also have received an email immediately. My drive had the errors but did not fail the SMART test.
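
To make "certain characteristics" a little more concrete, here is a rough sketch of picking the worrying counters out of a `smartctl -A` attribute table. This is not how FreeNAS itself decides when to email; it is only an illustration, and it assumes the usual ten-column table layout with the raw value last. Attribute names vary between drive vendors, so the `WATCHED` list and the function name are assumptions.

```python
# Attributes where any non-zero raw value is usually worth a closer look.
# Assumption: names as commonly printed by `smartctl -A`; vendors differ.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def nonzero_watched_attributes(smartctl_output: str) -> dict:
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found
```

An empty dictionary means none of the watched counters has moved; anything else is the sort of thing worth mailing to root.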

Hope this clears things up.
 

phonoflux

Dabbler
Joined
Aug 23, 2012
Messages
21
Hey, once every 24h is a lot better than 'never', which is what I was led to believe.

The test RAID is currently in a degraded state; I rebooted to make sure. I have also plugged in the 'dodgy' drive, which the BIOS is all up in my business about for attempting to use, so FreeNAS should also be most unimpressed with it when I schedule a SMART test.

You mention I should let FreeNAS know of my intent for drives before doing anything to them? In a scheduled replacement scenario due to SMART errors, I fully agree. However, I was wanting to test the robustness of the system, as all too many people are happy to dive on in with the mindset of "oh I'm covered by RAID!", and while researching whether there was a solution like this in the first place I found a few people who were indeed caught off guard. Probably due to an earlier version and not being told to set up root with an email address.

In short, everything you have mentioned so far leads me to believe there is indeed something in place to notify admins with a "no news is good news" mindset (like myself).
 

phonoflux

Dabbler
Joined
Aug 23, 2012
Messages
21
Hi Guys, reporting back.

Can confirm that with email set up and working, and root having an email address set, you get two emails each night at 3am.

1. Listing the state of all connected drives and the detail I was wanting to see was in there:

Checking status of zfs pools:
pool: vol1
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-2Q
scrub: none requested
config:
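
Since nothing in that email flags the problem other than the word DEGRADED, a report like the one above could be scanned automatically. The following is only a sketch under the assumption that the report keeps the "pool:" and "state:" lines shown here; the function names are hypothetical.

```python
import re


def pool_states(report: str) -> dict:
    """Map each pool name in a zpool status report to its state string."""
    states = {}
    current_pool = None
    for line in report.splitlines():
        match = re.match(r"\s*(pool|state):\s*(\S+)", line)
        if not match:
            continue
        key, value = match.groups()
        if key == "pool":
            current_pool = value
        elif current_pool is not None:
            # A "state:" line belongs to the most recent "pool:" line.
            states[current_pool] = value
            current_pool = None
    return states


def unhealthy_pools(report: str) -> dict:
    """Return only the pools whose state is something other than ONLINE."""
    return {name: s for name, s in pool_states(report).items() if s != "ONLINE"}
```

Fed the report above, `unhealthy_pools` would return `{"vol1": "DEGRADED"}`, which is easier to act on than reading the whole email every morning.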

I also got the S.M.A.R.T. errors as they happen which is excellent.

All is well and my worries are at bay. Cheers for the advice.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
FreeNAS sends the nightly emails partly because if you learn to expect them and then suddenly don't get them, you'll know something is wrong. So no news is bad news then. ;) Just keep in mind the intent of each email. Some only warn of blatant issues (for instance, any SMART error is NEVER good); others give you (the admin) the info to make your own determination of what is "good" or "bad".
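
The "no news is bad news" idea can be sketched as a simple overdue check. This assumes you record somewhere when the last nightly email arrived, and that it runs on a machine other than the NAS, since a dead NAS cannot report its own silence. The 26 hour allowance and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta


def report_overdue(
    last_report: datetime,
    now: datetime,
    max_gap: timedelta = timedelta(hours=26),
) -> bool:
    """Return True when the nightly email is later than the allowed gap.

    The default gap is 24 hours plus some slack for a slow mail run.
    """
    return now - last_report > max_gap
```

If this returns True, the missing email is itself the alert and the box deserves a look.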
 