Lost Data in Upgrade to 11.2

jchan94

Explorer
Joined
Jul 30, 2015
Messages
55
So, before upgrading everyone should take a snapshot?

Let me clarify:

zfs list shows that there is data in the dataset. However, when I go to the shell and change into the mountpoint (cd /mnt/data/photos.set), there are no files there.

With my snapshot I was able to restore them. The other datasets, which weren't snapshotted, are gone. I don't personally worry about those datasets, but of course it sucks to have lost that data.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Just wanted to jump in here, because it looks like I've been affected by this too, but I have snapshots.

My boot drive with 9.11 had crashed, and I figured I'd upgrade in the process.

After upgrading to 11.2, I imported the pool, and upgraded it.

The dataset that had snapshots is still intact, but my other ones are nuked. When trying to read those via the shell, no data shows up.
Are you saying that your dataset that has snapshots was unaffected? Or did you simply recover the data by rolling back?


So, before upgrading everyone should take a snapshot?
You should always use snapshots for a variety of reasons.
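
For anyone reading along before their own upgrade, a minimal sketch of what that looks like from the shell (the pool name "tank" and the snapshot name are placeholders, not anything from this thread):

# Recursive snapshot of every dataset in the pool before upgrading
zfs snapshot -r tank@pre-11.2-upgrade
# Confirm the snapshots exist
zfs list -t snapshot -r tank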
 

jchan94

Explorer
Joined
Jul 30, 2015
Messages
55
To completely clarify, and set an example:

Dataset A has a recursive snapshot.
Dataset B does not have a snapshot.

After a fresh install on a new USB stick, I imported the pool and upgraded the ZFS pool.

Dataset A showed 3TB of data via zfs list, but when creating an SMB share, it did not show any files.
Dataset B showed no data at all. The dataset was there, but no space was used.

Then I cloned the Dataset A snapshot and promoted it, and Dataset A's files showed up again. Dataset B was a flop and is simply gone.
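
For reference, a rough sketch of that clone-and-promote recovery (the pool, dataset, and snapshot names here are placeholders, not the actual ones from this system):

# Clone the last good snapshot of the affected dataset under a new name
zfs clone tank/datasetA@auto-20190218 tank/datasetA-recovered
# Promote the clone so it no longer depends on the origin dataset
zfs promote tank/datasetA-recovered

zfs rollback -r tank/datasetA@auto-20190218 would be the other common route, though it discards anything written to the dataset after that snapshot.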
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It's a bit of a stretch, but are your logs intact? If so, we definitely want to take a look.
 

doorstop

Cadet
Joined
Feb 19, 2019
Messages
1
Has there been any progress on this issue? I think my friend has just encountered this going from 11.1-U5 to 11.2-RELEASE-U1. The system hung when it rebooted as part of the install. Rebooting into 11.1-U5 showed most of the data missing. File_Pool was nearly 60% full, but is now 0%.



NAME          SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
File_Pool     36.2T  2.67G  36.2T  -         10%   0%   1.00x  ONLINE  -
freenas-boot  7.38G  3.09G  4.29G  -         -     41%  1.00x  ONLINE  -
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
No progress, but logs and a detailed description of the environment are welcome.
 

eddyanm

Cadet
Joined
Jul 29, 2014
Messages
6
I just upgraded from 11.1-U7 to 11.2-U2 on Feb 19, 2019 and encountered the same issue. Several datasets from different pools are showing as empty but are actually using space in the pool.

Here's my zfs list output. Note that volume1/data, volume1/public, and volume2/backup with its child datasets all refer to almost no data yet use a large amount of space. Those folders all show as empty. My regular snapshots also show little to no data referenced starting Feb 19.

Strangely, my volume1/media and volume1/datastore datasets still have all of their content.

NAME                      USED   AVAIL  REFER  MOUNTPOINT
volume1                   1.94T  1.58T  122K   /mnt/volume1
volume1/data              102G   1.58T  54K    /mnt/volume1/data
volume1/datastore         98.4G  1.58T  98.4G  /mnt/volume1/datastore
volume1/dloads            66K    1.58T  66K    /mnt/volume1/dloads
volume1/media             1.33T  1.58T  1.33T  /mnt/volume1/media
volume1/public            414G   1.58T  54.5K  /mnt/volume1/public
volume2                   516G   1.25T  152K   /mnt/volume2
volume2/backup            516G   1.25T  128K   /mnt/volume2/backup
volume2/backup/data       102G   1.25T  128K   /mnt/volume2/backup/data
volume2/backup/datastore  88K    1.25T  88K    /mnt/volume2/backup/datastore
volume2/backup/public     414G   1.25T  128K   /mnt/volume2/backup/public
 

ben-efiz

Cadet
Joined
Feb 21, 2019
Messages
9
@eddytheflow you could try executing, for example, zfs get all volume1/data. It gives you more details about the dataset. It could be that the data is actually held by a snapshot; see usedbysnapshots and usedbydataset in the output.
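
A narrower query along those lines (using the dataset name from the listing above) shows at a glance whether the space is held by the live dataset or only by its snapshots:

# Break down where the space accounting comes from for a suspect dataset
zfs get used,usedbydataset,usedbysnapshots,usedbychildren volume1/data
# List that dataset's snapshots with their sizes
zfs list -t snapshot -r -o name,used,refer volume1/data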

And just to add, I did the same upgrade and have the same situation. My datasets' size actually comes from snapshots...
 

eddyanm

Cadet
Joined
Jul 29, 2014
Messages
6
@ben-efiz Unfortunately, I couldn't wait and had to restore my datasets from snapshots prior to the upgrade. But you are probably right.

Today, I was checking each dataset one by one and found that my volume1/dloads was completely lost. I didn't have any snapshot taken earlier so it's completely empty. And I lost a 200GB file within the volume1/datastore dataset.

I should've mentioned that when I first attempted the 11.2-U2 update from 11.1-U4, it failed and FreeNAS didn't boot. I reverted to 11.1-U4 before upgrading to U7 and then to 11.2-U2 again. My jails disappeared after the first attempt, but I didn't check my datasets at that time. Most likely the data loss happened during that first upgrade attempt too.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Not that I have high hopes, but are any of the logs intact?
 

ben-efiz

Cadet
Joined
Feb 21, 2019
Messages
9
@ben-efiz Unfortunately, I couldn't wait and had to restore my datasets from snapshots prior to the upgrade. But you are probably right.

Today, I was checking each dataset one by one and found that my volume1/dloads was completely lost. I didn't have any snapshot taken earlier so it's completely empty. And I lost a 200GB file within the volume1/datastore dataset.

I should've mentioned that when I first attempted the 11.2-U2 update from 11.1-U4, it failed and FreeNAS didn't boot. I reverted to 11.1-U4 before upgrading to U7 and then to 11.2-U2 again. My jails disappeared after the first attempt, but I didn't check my datasets at that time. Most likely the data loss happened during that first upgrade attempt too.

@eddyanm The whole situation sounds very similar to me! See my thread https://forums.freenas.org/threads/empty-datasets-after-upgrade-to-11-2-u2-and-volume-import.74056/

I hope for the sake of others that this very critical issue will be resolved (e.g. with your external logs). I guess for us it's just lost data. It's the first time this has happened since I started running and updating FreeNAS back at 9.3. There were always issues with configuration, jails, plugins, etc., but at least the data was always safe.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
@Ericloewe I still have the boot environments for 11.1U4 and U7. Which log files should I pull? I can mount them and look.
The logs aren't stored on the boot device; they're in the system dataset. The current logging environment is mounted as /var/log, but other folders may exist, depending on how the whole upgrade process failed.
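
A rough sketch of where to look, assuming a typical FreeNAS 11 layout where the system dataset lives under <pool>/.system and is mounted below /var/db/system (the pool name and the machine-specific identifier on the syslog child are placeholders):

# Find the system dataset and its syslog children
zfs list -r -o name,mountpoint volume1/.system
# The live logs usually sit in a "log" directory under the syslog dataset
ls /var/db/system/syslog-*/log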
 

eddyanm

Cadet
Joined
Jul 29, 2014
Messages
6
@ben-efiz I hope so too. Going forward, I am going to add an extra step to my workflow: snapshot all the datasets, replicate them to a separate volume, and take that volume offline before an upgrade.
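
A minimal sketch of such a pre-upgrade routine, assuming a second pool named "backup" to receive the replica (all pool and snapshot names here are placeholders):

# Snapshot everything recursively, then replicate to a second pool
zfs snapshot -r volume1@pre-upgrade
zfs send -R volume1@pre-upgrade | zfs receive -F backup/volume1-pre-upgrade
# Export the backup pool so the upgrade can't touch it
zpool export backup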

@Ericloewe I have these logs in /var/log. Looks like some do go back to before the upgrade. Which ones would be useful and where can I upload them?

total 20859
-rw-r--r-- 1 root wheel 1 Feb 22 03:01 3ware_raid_alarms.today
-rw-r--r-- 1 root wheel 1 Feb 21 03:01 3ware_raid_alarms.yesterday
-rw------- 1 root wheel 1573636 Feb 22 08:58 auth.log
-rw------- 1 root wheel 103419 Feb 22 00:00 auth.log.0.bz2
-rw------- 1 root wheel 20687 Feb 21 00:00 auth.log.1.bz2
-rw------- 1 root wheel 86911 Feb 20 00:00 auth.log.2.bz2
-rw------- 1 root wheel 102153 Feb 17 00:00 auth.log.3.bz2
-rw------- 1 root wheel 100776 Feb 16 00:00 auth.log.4.bz2
-rw------- 1 root wheel 101848 Feb 15 00:00 auth.log.5.bz2
-rw------- 1 root wheel 100417 Feb 14 00:00 auth.log.6.bz2
-rw------- 1 root wheel 99356 Feb 22 08:58 cron
-rw------- 1 root wheel 9388 Feb 22 00:00 cron.0.bz2
-rw------- 1 root wheel 9324 Feb 21 00:00 cron.1.bz2
-rw------- 1 root wheel 10997 Feb 20 00:00 cron.2.bz2
-rw------- 1 root wheel 81916 Feb 22 06:19 daemon.log
-rw-r----- 1 root wheel 9782 Feb 20 00:00 daemon.log.0.bz2
-rw-r----- 1 root wheel 2942 Feb 17 00:00 daemon.log.1.bz2
-rw-r----- 1 root wheel 3509 Feb 15 00:00 daemon.log.2.bz2
-rw-r----- 1 root wheel 2800 Feb 12 00:00 daemon.log.3.bz2
-rw-r----- 1 root wheel 3291 Feb 10 00:00 daemon.log.4.bz2
-rw------- 1 root wheel 411123 Feb 22 08:58 debug.log
-rw------- 1 root wheel 25401 Feb 22 00:00 debug.log.0.bz2
-rw------- 1 root wheel 35897 Feb 21 00:00 debug.log.1.bz2
-rw------- 1 root wheel 103609 Feb 22 03:01 dmesg.today
-rw------- 1 root wheel 88872 Feb 21 03:01 dmesg.yesterday
-rw-r--r-- 1 root wheel 4422 Feb 19 16:10 iocage.log
-rw-r--r-- 1 root wheel 130 Jul 21 2014 lpd-errs
-rw------- 1 root wheel 2293 Feb 22 04:58 maillog
-rw-r----- 1 root wheel 679 Feb 22 00:00 maillog.0.bz2
-rw-r----- 1 root wheel 750 Feb 21 00:00 maillog.1.bz2
-rw-r----- 1 root wheel 920 Feb 20 00:00 maillog.2.bz2
-rw-r----- 1 root wheel 495 Feb 17 00:00 maillog.3.bz2
-rw-r----- 1 root wheel 375 Feb 16 00:00 maillog.4.bz2
-rw-r----- 1 root wheel 379 Feb 15 00:00 maillog.5.bz2
-rw-r----- 1 root wheel 121 Feb 14 00:00 maillog.6.bz2
-rw------- 1 root wheel 205078 Feb 21 20:46 mdnsresponder.log
-rw------- 1 root wheel 88493 Feb 22 04:58 messages
-rw-r----- 1 root wheel 13666 Feb 20 00:00 messages.0.bz2
-rw-r----- 1 root wheel 4022 Feb 17 00:00 messages.1.bz2
-rw-r----- 1 root wheel 5609 Feb 6 00:00 messages.2.bz2
-rw-r----- 1 root wheel 4128 Jan 1 00:00 messages.3.bz2
-rw-r----- 1 root wheel 15942 Apr 24 2018 messages.4.bz2
-rw-r----- 1 root wheel 15712 Apr 22 2018 messages.5.bz2
-rw-r----- 1 root wheel 10665 Mar 9 2018 messages.6.bz2
-rw-r----- 1 root wheel 11007 Mar 6 2018 messages.7.bz2
-rw-r----- 1 root wheel 17090 Mar 3 2018 messages.8.bz2
-rw-r----- 1 root wheel 10502 Mar 1 2018 messages.9.bz2
-rw-r--r-- 1 root wheel 1708591 Feb 22 08:52 middlewared.log
-rw-r--r-- 1 root wheel 90716044 Jan 29 2018 middlewared.log.1
-rwxr-xr-x 1 minio minio 0 Mar 1 2018 minio.log
-rw------- 1 root wheel 7217 Feb 22 03:01 mount.today
-rw------- 1 root wheel 5423 Feb 21 03:01 mount.yesterday
drwxr-x--- 2 netdata netdata 3 Jan 29 2018 netdata
drwxr-xr-x 2 root wheel 5 Apr 23 2018 nginx
-rw-r--r-- 1 www www 83 Feb 21 09:18 nginx-access.log
-rw-r--r-- 1 www www 10773 Dec 22 2016 nginx-access.log.0.bz2
-rw-r--r-- 1 www www 30346 Feb 21 09:18 nginx-error.log
-rw-r--r-- 1 root wheel 162 Feb 19 13:56 pbid.log
-rw------- 1 root wheel 0 Jul 4 2014 pf.today
-rw-r----- 1 root network 130 Jul 21 2014 ppp.log
drwxr-xr-x 2 root wheel 2 Jul 4 2014 proftpd
drwxr-xr-x 2 root wheel 10 Jan 29 2018 samba4
-rw------- 1 root wheel 130 Jul 21 2014 security
drwxr-xr-x 2 root wheel 2 Jul 7 2016 sssd
-rw-r--r-- 1 root wheel 14 Feb 19 03:05 telemetry.json.bz2
-rw-r--r-- 1 uucp uucp 130 Jul 21 2014 ups.log
-rw------- 1 root wheel 141061 Feb 21 20:07 userlog
-rw-r--r-- 1 root wheel 197 Feb 20 22:43 utx.lastlogin
-rw-r--r-- 1 root wheel 2902 Feb 21 09:18 utx.log
-rw-r----- 1 root wheel 49489 Feb 21 09:18 uwsgi.log
-rw-r--r-- 1 root wheel 53022 Feb 21 09:18 vmware-vmsvc.log
-rw-r--r-- 1 root wheel 0 Jul 1 2016 wtmp
-rw-r--r-- 1 root wheel 0 Jul 1 2016 wtmp.0
-rw-r--r-- 1 root wheel 0 Jun 1 2016 wtmp.1
-rw-r--r-- 1 root wheel 0 May 1 2016 wtmp.2
-rw------- 1 root wheel 130 Jul 21 2014 xferlog
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Looks promising. The easiest option is probably to PM me the debug tarball (you could post it here, but it's a bit borderline from a privacy standpoint).

Also, do you have a decent estimate of when the upgrade happened, to filter out the rest from the logs?
 

ben-efiz

Cadet
Joined
Feb 21, 2019
Messages
9
I wonder if there is a general functionality in ZFS to quickly wipe data (but not snapshots)? Maybe that's something one can look for during the bug analysis? I find it very scary that all the ZFS data is gone so quickly and easily... Even an rm -R would take some time if we talk about terabytes of data.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I wonder if there is a general functionality in ZFS to quickly wipe data (but not snapshots)?

zfs destroy is pretty immediate unless you have a seriously large dataset, but that is definitely not what happened here: it would have shown up in the pools' histories, and it did not. There is little doubt that we're looking at POSIX filesystem operations.


Even an rm -R would take some time if we talk about terabytes of data.
It's not about the size, but the number of files. The actual freeing takes place in the background.
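
For anyone who wants to check their own pool, dataset and snapshot destroys are recorded in the pool history; for example, using the pool name mentioned earlier in the thread:

# Look for any destroy operations in the pool's history
zpool history File_Pool | grep -i destroy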
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
zfs destroy is pretty immediate unless you have a seriously large dataset, but that is definitely not what happened here: it would have shown up in the pools' histories
It also would have taken the snapshots with it, which hasn't happened.
 