getting spammed by mail alerts

Joined
Jan 27, 2020
Messages
577
After setting up my FreeNAS and a first pool, I am getting a stream of email alerts which I struggle to understand.
Some of them I receive once a day, some every hour.
Is there any documentation on what these alerts mean, so I can get a better understanding?

According to the GUI my pool status is healthy.

Here are some of the messages I got:

These two are getting really obnoxious; they alternate every 10 minutes and spam my mail account:
  • "Failed to check for alert BootPoolStatus" - I checked my boot pool in the GUI, all seems fine
  • "Boot pool status is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected.." - as above, nothing indicates a problem with the boot pool. I scrubbed the boot pool once or twice and all is green; even the SMART long tests are not showing me anything odd. These are two newly bought drives.
Then there are these multiple times a day:
  • "Failed to check for alert quota"
  • "Pool state is OFFLINE: None" - When I check the GUI, again all seems fine.

I doubt this is normal behaviour or that it is intended by the devs.
In total, I received 46 emails yesterday from my NAS!
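
For context on the boot pool alert above: its wording looks like the `state:` and `status:` fields of `zpool status` joined together, so a pool can be ONLINE and still report data errors. Below is a minimal Python sketch for pulling those fields out of the command's text output. The sample text is illustrative and not taken from this system, `freenas-boot` is assumed to be the boot pool name, and the function names are made up for this example.

```python
import subprocess

# Field labels that start a new section in `zpool status` output.
KNOWN_KEYS = ("pool", "state", "status", "action", "scan", "see", "config", "errors")


def parse_zpool_status(text):
    """Collect the labelled fields of `zpool status` output into a dict."""
    fields = {}
    current = None
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        key = head.strip()
        if sep and key in KNOWN_KEYS:
            current = key
            fields[key] = rest.strip()
        elif current in ("status", "action") and line.strip():
            # Wrapped continuation line of a multi-line status/action text.
            fields[current] += " " + line.strip()
    return fields


def reports_data_errors(fields):
    """True when the errors field is anything other than the all-clear text."""
    return not fields.get("errors", "").startswith("No known data errors")


def read_pool_status(pool="freenas-boot"):
    """Run `zpool status -v` for one pool (pool name is an assumption)."""
    result = subprocess.run(
        ["zpool", "status", "-v", pool],
        capture_output=True, text=True, check=True,
    )
    return parse_zpool_status(result.stdout)


# Illustrative sample, not output from the poster's machine.
SAMPLE = """\
  pool: freenas-boot
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
errors: 2 data errors, use '-v' for a list
"""

fields = parse_zpool_status(SAMPLE)
print(fields["state"], "|", reports_data_errors(fields))
```

Running `read_pool_status()` from a shell on the NAS and comparing the `errors` field with what the GUI shows might help tell a real problem apart from a failing alert check.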
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
I see your boot pool is on a WD Green SSD (3D NAND models).

It looks to me like you may be hitting the issue where, with TRIM enabled, files on these drives get corrupted. https://redmine.ixsystems.com/issues/35065

You can either disable TRIM (system-wide setting) or go with a different boot SSD to fix the problem if that applies to you.
 
Joined
Jan 27, 2020
Messages
577
Huh, interesting. The very first install had issues with an immediately degraded boot pool after a fresh install. The second try went without issues.
Maybe I should try disabling TRIM and see if the alerts get better.

Do I have to re-install for the tunable to take effect?
Where exactly do I put the tunable?
Any SSD recommendations that work reliably?

Thanks!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Do I have to re-install for the tunable to take effect?
No (although you may want to take a config backup and reinstall, as the files are actually corrupted).

Where exactly do I put the tunable?
Under System, Tunables.

Any SSD recommendations that work reliably?
Kingston seems to be the consensus for a combination of working well and being cheap.
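
To confirm that a TRIM tunable actually took effect, the current value can be read back with `sysctl`. This is only a sketch: the variable name `vfs.zfs.trim.enabled` is an assumption based on FreeBSD 11-era ZFS and is not stated in this thread, so check the linked ticket for the exact name and type. If it turns out to be a loader-type tunable, it would normally only apply after a reboot.

```python
import subprocess

# Assumed sysctl name (FreeBSD 11-era ZFS); verify against the linked ticket.
TRIM_SYSCTL = "vfs.zfs.trim.enabled"


def parse_flag(output):
    """Interpret the output of `sysctl -n` for a 0/1 setting."""
    return output.strip() == "1"


def trim_enabled():
    """Ask the running system whether ZFS TRIM is currently enabled."""
    result = subprocess.run(
        ["sysctl", "-n", TRIM_SYSCTL],
        capture_output=True, text=True, check=True,
    )
    return parse_flag(result.stdout)
```

Calling `trim_enabled()` on the NAS after the reboot should return `False` if the tunable was picked up, under the naming assumption above.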
 
Joined
Jan 27, 2020
Messages
577
I switched the boot device a while ago, but I am still frequently getting spammed by the mail alert saying:

"Failed to check for alert quota", followed by this traceback:

Code:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 95, in main_worker
    res = loop.run_until_complete(coro)
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 51, in _run
    return await self._call(name, serviceobj, methodobj, params=args, job=job)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 34, in _call
    with Client('ws+unix:///var/run/middlewared-internal.sock', py_exceptions=True) as c:
  File "/usr/local/lib/python3.7/site-packages/middlewared/client/client.py", line 360, in __init__
    self._ws.connect()
  File "/usr/local/lib/python3.7/site-packages/middlewared/client/client.py", line 181, in connect
    rv = super(WSClient, self).connect()
  File "/usr/local/lib/python3.7/site-packages/ws4py/client/__init__.py", line 239, in connect
    self.protocols, self.extensions = self.process_handshake_header(headers)
  File "/usr/local/lib/python3.7/site-packages/ws4py/client/__init__.py", line 332, in process_handshake_header
    raise HandshakeError("Invalid challenge response: %s" % value)
ws4py.exc.HandshakeError: Invalid challenge response: b'rovxj2o+uhdisfumqirnovwn2iq='
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/alert.py", line 650, in __run_source
    alerts = (await alert_source.check()) or []
  File "/usr/local/lib/python3.7/site-packages/middlewared/alert/base.py", line 204, in check
    return await self.middleware.run_in_thread(self.check_sync)
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/io_thread_pool_executor.py", line 25, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/../alert/source/quota.py", line 35, in check_sync
    datasets = self.middleware.call_sync("zfs.dataset.query_for_quota_alert")
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1143, in call_sync
    io_thread=True, job_on_progress_cb=job_on_progress_cb,
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1166, in run_coroutine
    return fut.result()
  File "/usr/local/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/usr/local/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1074, in _call
    return await self._call_worker(name, *args)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1094, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1029, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1003, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
ws4py.exc.HandshakeError: Invalid challenge response: b'rovxj2o+uhdisfumqirnovwn2iq='
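
As background on the last line of the traceback: "Invalid challenge response" comes from the WebSocket opening handshake, where the client checks the server's `Sec-WebSocket-Accept` header against a value derived from the key it sent. So the failure appears to happen while the alert check connects to the middleware socket, before any quota data is read, which would fit with the GUI showing nothing wrong. The sketch below shows the check as defined in RFC 6455; it is not ws4py's or middlewared's actual code, and the function names are invented for illustration.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value a server must send back."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def challenge_ok(sec_websocket_key, accept_header):
    """True when the server's accept header matches the client's key."""
    return accept_header.strip() == expected_accept(sec_websocket_key)


# Example key/accept pair from RFC 6455, section 1.3.
print(challenge_ok("dGhlIHNhbXBsZSBub25jZQ==", "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="))
```

A mismatch here means the reply did not correspond to the key the client sent for that connection; why that happens on this system cannot be told from the traceback alone.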
 
Joined
Jan 27, 2020
Messages
577
Just found this in the release notes of 11.3:

  • Periodic alert scripts have been replaced by the Alert framework. Periodic alert emails are disabled by default and previous email alert conditions have been added to the FreeNAS alert system. E-mail or other alert methods can be configured in Alert Services.

It may have something to do with this change in 11.3.
Each alert item is set to IMMEDIATELY by default. Although the alerts in the GUI behave normally, the email alerts are going crazy.
 