PC backup to TrueNAS server filled disk and crashed...

GlueFactoryBJJ

Dabbler
Joined
Oct 15, 2015
Messages
32
Hi Everyone!

To start, I've been using computers since the Radio Shack TRS-80 (4K RAM) and had a 25+ year career in IT, but never really got into using Unix/Linux. Especially not the command line.

My system is a home network backup server. It had been pretty reliable until the beginning of December 2022. Because of the following problems and "life", I haven't been able to even try to address this until now.

What happened is that a backup process filled up the "Backup" pool on the server and then crashed without deleting the incomplete backup. Because of this, there are 0 bytes free on that share. I can see this from the Dashboard, which is confusing given the rest of the story...

Since then, I have been unable to access that share. Windows doesn't report that it even exists (I'm assuming this is because of an SMB issue), even though I CAN connect to the administrative IP address.

I'm on TrueNAS-12.0-U8 and I've been wanting to update, but I can't because the server says there is no room for the download.

Next, I get the following error. I'm assuming that 192.168.1.nnn is the SMB share IP address?

WARNING
The Web interface could not bind to 192.168.1.nnn. Using 0.0.0.0 instead.
2023-01-21 13:26:52 (America/Chicago)

On top of all this, I'm getting the following error on three drives (no warnings before all this started to happen):

CRITICAL
Device: /dev/da0 [SAT], not capable of SMART self-check.
2023-01-21 21:57:02 (America/Chicago)

Surely this isn't telling me I've lost three drives at the same time? The Dashboard screenshot at the end of the message says there are no errors on the drives. Confusing...

Finally, this message showed up today (the attached monitor has TONS of error messages):

CRITICAL
Failed to check for alert HasUpdate:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/alert.py", line 740, in __run_source
    alerts = (await alert_source.check()) or []
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/base.py", line 211, in check
    return await self.middleware.run_in_thread(self.check_sync)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/alert/source/update.py", line 67, in check_sync
    path = self.middleware.call_sync("update.get_update_location")
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1272, in call_sync
    return self.run_coroutine(methodobj(*prepared_call.args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1312, in run_coroutine
    return fut.result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/update.py", line 412, in get_update_location
    os.chmod(path, 0o755)
OSError: [Errno 28] No space left on device: '/var/db/system/update'
2023-01-21 19:27:48 (America/Chicago)

Is the server FUBAR? Do I just reinstall and hope it will rebuild the existing share (it has in the past, on new installs)? Or will accessing the shell allow me to delete some files on the pool so it has some free space, and will that fix the problem?

The ironic thing is that I had ordered 4 more 4TB drives (to give me some sorely needed expansion room) a couple of days before this happened.

Any help would be appreciated!

Scott

PS: I've attached a pic I took of what's on the continuously scrolling monitor that is directly attached to the server. I've also attached a screenshot of the server's Dashboard.
 

Attachments

  • PXL_20230122_095855330_2.jpg (535.2 KB)
  • 2023-01-22 - TrueNAS - Dashboard screen.jpg (141.4 KB)

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Please post the output of zpool status -v here in code tags, like this:

Code:
root@NAS4[~]# zpool status -v
  pool: freenas-boot
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:01:29 with 0 errors on Fri Jan 20 03:46:29 2023
config:


        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          ada0p2      ONLINE       0     0     0


errors: No known data errors


  pool: tank
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 06:18:21 with 0 errors on Sun Jan 15 10:18:32 2023
config:


        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/712205ba-4e81-11ea-ac5f-0cc47a786989  ONLINE       0     0     0
            gptid/71d36306-9a1c-11ea-be24-0cc47a786989  ONLINE       0     0     0
            gptid/71aa6b24-4e81-11ea-ac5f-0cc47a786989  ONLINE       0     0     0
            gptid/7191d206-4e81-11ea-ac5f-0cc47a786989  ONLINE       0     0     0
            gptid/18c7baa6-f63a-11eb-a693-000f530e6898  ONLINE       0     0     0
            gptid/7235612a-4e81-11ea-ac5f-0cc47a786989  ONLINE       0     0     0


errors: No known data errors
root@NAS4[~]#
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
These error messages are pretty harmless:
Code:
WARNING
The Web interface could not bind to 192.168.1.nnn. Using 0.0.0.0 instead.
2023-01-21 13:26:52 (America/Chicago)

CRITICAL
Device: /dev/da0 [SAT], not capable of SMART self-check.
2023-01-21 21:57:02 (America/Chicago)

The first simply says that there is a problem with the Web GUI binding to a specific IP, so it will bind to any IP instead. You may have a network problem, like you were using DHCP and the DHCP server is not supplying an IP. But if you can get into the Web GUI, you are good.
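
If you want to double-check, a quick look from the shell shows what addresses the box actually holds right now; the GUI setting named in the comment is from memory of the 12.0 UI, so treat it as a pointer rather than gospel:
Code:
# Show every address currently configured on the NICs
ifconfig -a | grep inet

# Compare against the address the Web GUI is told to bind to
# (System > General > Web Interface IPv4 Address in the 12.0 GUI, if I recall correctly)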

The second is likely because "/dev/da0" is a USB device, most of which don't allow SMART self-checks.

Is your boot device a USB drive?
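
If you want to confirm what /dev/da0 actually is before assuming, smartctl (part of the smartmontools that TrueNAS ships) will tell you; a quick sketch:
Code:
# Print the device identity and whether SMART is available/enabled
smartctl -i /dev/da0

# If the identity line looks like a USB bridge, forcing SAT passthrough is worth a try
smartctl -i -d sat /dev/da0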


As for your real problem, first get the zpool status -v as @Redcoat suggests. I'd add a zfs list -t all -r as well.
 

GlueFactoryBJJ

Dabbler
Joined
Oct 15, 2015
Messages
32
Here is the "zpool status -v" output:

Code:
Warning: settings changed through the CLI are not written to
the configuration database and will be reset on reboot.

root@Goober:~ # zpool status -v
  pool: Backup
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 12:29:02 with 0 errors on Sun Jan 22 12:29:03 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        Backup                                          ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/79f82934-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7a2dc47f-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7a6ee531-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7ac24e8f-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7b5c6dfa-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7af3496c-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7b6b44b9-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7b932e76-553e-11ea-9846-408d5c425011  ONLINE       0     0     0
            gptid/7b1d16d6-553e-11ea-9846-408d5c425011  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 00:00:49 with 8 errors on Sun Jan 22 03:45:49 2023
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  DEGRADED     0     0     0
          ada0p2      DEGRADED     0     0    17  too many errors

errors: Permanent errors have been detected in the following files:

        freenas-boot/ROOT/12.0-U8@2020-11-08-13:25:42:/usr/local/lib/python3.7/site-packages/zettarepl/snapshot/__pycache__/name.cpython-37.opt-1.pyc
        freenas-boot/ROOT/12.0-U8@2020-11-08-13:25:42:/usr/local/lib/python3.7/site-packages/pysnmp/proto/secmod/rfc3414/auth/__pycache__/__init__.cpython-37.opt-1.pyc
        freenas-boot/ROOT/12.0-U8@2020-11-08-13:25:42:/usr/local/lib/python3.7/site-packages/aiohttp/__pycache__/hdrs.cpython-37.opt-1.pyc
        freenas-boot/ROOT/12.0-U8@2020-10-02-05:44:38:/usr/local/lib/python3.7/site-packages/aiohttp/__pycache__/http_exceptions.cpython-37.opt-1.pyc
        freenas-boot/ROOT/12.0-U8@2020-10-02-05:44:38:/usr/local/lib/python3.7/site-packages/boto3/do
 

GlueFactoryBJJ

Dabbler
Joined
Oct 15, 2015
Messages
32
Arwen said:
Is your boot device a USB drive?

As for your real problem, first get the zpool status -v as @Redcoat suggests. I'd add a zfs list -t all -r as well.
The boot drive is a 120GB SSD. I haven't had very good luck with the reliability of USB drives.

Here is the output for "zfs list -t all -r":

Code:
root@Goober:~ # zfs list -t all -r
NAME  USED  AVAIL  REFER  MOUNTPOINT
Backup  24.6T  0B  201K  /mnt/Backup
Backup/.system  1003M  0B  786M  legacy
Backup/.system/configs-5ece5c906a8f4df886779fae5cade8a5  93.8M  0B  93.8M  legacy
Backup/.system/cores  201K  0B  201K  legacy
Backup/.system/rrd-5ece5c906a8f4df886779fae5cade8a5  117M  0B  117M  legacy
Backup/.system/samba4  5.43M  0B  1.33M  legacy
Backup/.system/samba4@update--2020-11-08-19-27--11.3-U5  594K  -  850K  -
Backup/.system/samba4@update--2021-03-17-04-26--12.0-RELEASE  466K  -  850K  -
Backup/.system/samba4@update--2021-04-19-02-52--12.0-U2.1  484K  -  887K  -
Backup/.system/samba4@update--2021-05-10-19-42--12.0-U3  484K  -  832K  -
Backup/.system/samba4@update--2022-02-19-23-34--12.0-U3.1  256K  -  887K  -
Backup/.system/samba4@wbc-1645313894  311K  -  887K  -
Backup/.system/samba4@wbc-1646926329  521K  -  905K  -
Backup/.system/samba4@wbc-1653800860  567K  -  887K  -
Backup/.system/services  219K  0B  219K  legacy
Backup/.system/syslog-5ece5c906a8f4df886779fae5cade8a5  201K  0B  201K  legacy
Backup/.system/webui  201K  0B  201K  legacy
Backup/Backup  24.6T  0B  24.6T  /mnt/Backup/Backup
Backup/iocage  630M  0B  1.78M  /mnt/Backup/iocage
Backup/iocage/download  138M  0B  201K  /mnt/Backup/iocage/download
Backup/iocage/download/11.3-RELEASE  137M  0B  137M  /mnt/Backup/iocage/download/11.3-RELEASE
Backup/iocage/images  201K  0B  201K  /mnt/Backup/iocage/images
Backup/iocage/jails  704K  0B  201K  /mnt/Backup/iocage/jails
Backup/iocage/jails/PlexMediaServer  503K  0B  210K  /mnt/Backup/iocage/jails/PlexMediaServer
Backup/iocage/jails/PlexMediaServer/root  292K  0B  487M  /mnt/Backup/iocage/jails/PlexMediaServer/root
Backup/iocage/log  201K  0B  201K  /mnt/Backup/iocage/log
Backup/iocage/releases  489M  0B  201K  /mnt/Backup/iocage/releases
Backup/iocage/releases/11.3-RELEASE  489M  0B  201K  /mnt/Backup/iocage/releases/11.3-RELEASE
Backup/iocage/releases/11.3-RELEASE/root  489M  0B  487M  /mnt/Backup/iocage/releases/11.3-RELEASE/root
Backup/iocage/releases/11.3-RELEASE/root@PlexMediaServer  1.12M  -  487M  -
Backup/iocage/templates  201K  0B  201K  /mnt/Backup/iocage/templates
freenas-boot  10.7G  81.8G  23K  none
freenas-boot/ROOT  10.7G  81.8G  23K  none
freenas-boot/ROOT/11.3-U1  260K  81.8G  1016M  /
freenas-boot/ROOT/11.3-U2  290K  81.8G  1019M  /
freenas-boot/ROOT/11.3-U3.2  228K  81.8G  1018M  /
freenas-boot/ROOT/11.3-U5  216K  81.8G  1.01G  /
freenas-boot/ROOT/12.0-U2.1  209K  81.8G  1.15G  /
freenas-boot/ROOT/12.0-U3  184K  81.8G  1.16G  /
freenas-boot/ROOT/12.0-U3.1  186K  81.8G  1.16G  /
freenas-boot/ROOT/12.0-U8  10.7G  81.8G  1.20G  /
freenas-boot/ROOT/12.0-U8@2020-02-22-06:09:15  8.26M  -  1018M  -
freenas-boot/ROOT/12.0-U8@2020-03-02-02:05:19  8.54M  -  1018M  -
freenas-boot/ROOT/12.0-U8@2020-04-21-13:16:27  1015M  -  1016M  -
freenas-boot/ROOT/12.0-U8@2020-06-22-21:08:46  1019M  -  1019M  -
freenas-boot/ROOT/12.0-U8@2020-10-02-05:44:38  1018M  -  1018M  -
freenas-boot/ROOT/12.0-U8@2020-11-08-13:25:42  1.01G  -  1.01G  -
freenas-boot/ROOT/12.0-U8@2021-03-16-23:24:41  1.06G  -  1.06G  -
freenas-boot/ROOT/12.0-U8@2021-04-18-21:50:31  1.15G  -  1.15G  -
freenas-boot/ROOT/12.0-U8@2021-05-10-14:40:26  1.16G  -  1.16G  -
freenas-boot/ROOT/12.0-U8@2022-02-19-17:31:47  1.16G  -  1.16G  -
freenas-boot/ROOT/
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Your boot SSD is failing. Make sure you back up your configuration (and check afterwards to see whether the config was listed in the corrupt file list). Some SSDs don't have all the SMART info.
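
As a quick sanity check on the config file you download: it's a SQLite database (a .tar containing the .db if you export the secret seed with it), so something along these lines on any machine with sqlite3 installed should come back "ok" if the file itself is intact. The filename below is just a stand-in for whatever the GUI saved for you:
Code:
# Read every page of the saved config database and verify it
# (replace the filename with whatever your downloaded config is called)
sqlite3 truenas-config-backup.db "PRAGMA integrity_check;"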

It does appear your Backup pool is completely full, which is not a good place to be. There can be problems cleaning out items to make some free space (ZFS is copy-on-write, so even deleting files needs a little free space to complete).
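
Once you can get to a shell, here is a minimal sketch of the kind of cleanup that usually gets a full pool breathing again. The dataset and snapshot names are taken from your zfs list output above, but the file path in the last command is hypothetical, so substitute whatever the crashed backup job actually left behind:
Code:
# Show where the space went, including snapshot usage, per dataset
zfs list -o space -r Backup

# Destroying a snapshot you no longer need returns its blocks to the pool;
# this name is one of the old .system/samba4 snapshots from your listing
zfs destroy Backup/.system/samba4@update--2020-11-08-19-27--11.3-U5

# Then remove the partial backup the crashed job left on the share
# (hypothetical path - check what is actually there first with ls)
rm -rf "/mnt/Backup/Backup/incomplete_backup_folder"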
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
So, yes, as @Arwen indicated: since you apparently have access to the GUI, back up your configuration with Save Configuration on System > General. Make a new boot drive (use a USB stick if that's more expedient; you can always make an SSD boot drive "later"), reboot, and upload your saved config file. Then you can investigate making space on your pool by deleting stuff you don't need.

If you can't save your config for some reason, come back and we can advise how to find the config that is automagically saved daily at 03h15.
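
For reference, those automatic backups live in the configs directory of the system dataset, which your zfs list shows is on the Backup pool (Backup/.system/configs-5ece5c906a8f4df886779fae5cade8a5). A rough sketch, assuming it is mounted under /var/db/system as usual; the exact subdirectory and file names vary by version and date, so look before you copy:
Code:
# List the automatically saved config databases, newest first
ls -lRt /var/db/system/configs-*

# Then copy the newest .db somewhere safe off the boot device, e.g.:
# cp /var/db/system/configs-<uuid>/<version>/<date>.db /root/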

Also, tell us about your hardware, please.

You mentioned having bought 4 new 4TB drives to increase capacity. How did you intend to deploy them?
 