Hello everyone. I'm seeing some strange behavior in my FreeNAS setup and was hoping that someone could help me to understand what is going on.
Background:
I currently have a FreeNAS setup consisting of a pool of six 12TB drives configured as RAIDZ2. Underneath the top-level dataset, I have a number of other datasets organized mostly by media type (photos, music, video, backups, and so on).
I need to grow this pool and am in the process of doing so. I have six more 12TB drives ready to go, and my plan was to:
- Replicate all of my data to a completely different pool
- Destroy the original pool
- Add the new drives to the system
- Recreate the original pool with the new drives added in
- Replicate from the backup pool to the newly created pool.
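For reference, the replication steps in the plan above boil down to something like the following. This is only a sketch: the pool names come from the zdb output further down, and the snapshot name "migrate" is made up.

```shell
# Recursive snapshot of everything in the source pool
zfs snapshot -r Goliath@migrate

# Full replication stream (-R preserves child datasets, properties, and snapshots)
zfs send -R Goliath@migrate | zfs recv -F GoliathBackup

# ...destroy Goliath, add the six new drives, recreate the pool, then replicate back:
zfs send -R GoliathBackup@migrate | zfs recv -F Goliath
```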
So, I set up a separate backup pool of 18 drives (also RAIDZ2), created snapshots of the original pool, and started the replication process.
The Problem:
Replication is getting close to finishing, but it is going to fail. Why? Because most (but not all) of the replicated data seems to have grown to roughly 150% of its original size, and I just don't know why.
Drilling down a bit, I took a look at my "Photos" data and focused on one file object in particular using zdb. I cranked up the verbosity with -ddddd and dumped the stats for that file before replication and after. This is where things start to get confusing. On the surface, everything looks the same: the block size is the same, the structure of the file seems to be the same, and the logical and physical sizes of each of the blocks seem to be the same. The only things that appear different are:
- the physical locations of each of the blocks in storage (this makes sense to me)
- the overall `dsize` of the file (this is the strange part)
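For anyone who wants to reproduce this, the dumps below came from invocations along these lines (object 12406 is the file in question; the dataset names are as shown in the output):

```shell
# -ddddd dumps per-object details, including the indirect block pointers
zdb -ddddd Goliath/photos 12406
zdb -ddddd GoliathBackup/photos 12406
```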
File in original pool
Code:
Dataset Goliath/photos [ZPL], ID 113, cr_txg 2204, 27.1G, 4726 objects, rootbp DVA[0]=<0:36c1192a9000:3000> DVA[1]=<0:ca0acf6f000:3000> [L0 DMU objset] fletcher4 lz4 LE contiguous unique double size=800L/200P birth=597612L/597612P fill=4726 cksum=11a86feea6:64aff8ab68f:12c2edd2d32da:26d6e0c097824f

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
     12406    2    32K    16K  2.86M  2.83M  100.00  ZFS plain file (K=inherit) (Z=inherit)
                                        168   bonus  System attributes
        dnode flags: USED_BYTES USERUSED_ACCOUNTED
        dnode maxblkid: 180
        path    <redacted>
        uid     1000
        gid     1000
        atime   Wed Jan  9 23:56:10 2019
        mtime   Sat May 15 13:08:22 2004
        ctime   Wed Jan  9 23:56:10 2019
        crtime  Sat May 15 13:08:22 2004
        gen     3402
        mode    100660
        size    2959582
        parent  10911
        links   1
        pflags  40800000004
Indirect blocks:
               0 L1  0:103a6d2e000:6000 0:d8004185000:6000 8000L/1e00P F=181 B=3402/3402
               0  L0 0:183844a7000:6000 4000L/4000P F=1 B=3402/3402
            4000  L0 0:183844b9000:6000 4000L/4000P F=1 B=3402/3402
            8000  L0 0:183844bf000:6000 4000L/4000P F=1 B=3402/3402
<snip>
          2cc000  L0 0:183850dd000:6000 4000L/4000P F=1 B=3402/3402
          2d0000  L0 0:183850e3000:6000 4000L/4000P F=1 B=3402/3402

        segment [0000000000000000, 00000000002d4000) size 2.83M
Replicated file in backup pool
Code:
Dataset GoliathBackup/photos [ZPL], ID 492, cr_txg 3139, 36.1G, 4726 objects, rootbp DVA[0]=<0:25d26af74000:3000> DVA[1]=<0:94337b3f000:3000> [L0 DMU objset] fletcher4 lz4 LE contiguous unique double size=800L/200P birth=3168L/3168P fill=4726 cksum=dbae30877:49c3d186b91:d391e2f87026:1ad0922f3402e4

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
     12406    2    32K    16K  3.81M  2.83M  100.00  ZFS plain file (K=inherit) (Z=inherit)
                                        168   bonus  System attributes
        dnode flags: USED_BYTES USERUSED_ACCOUNTED
        dnode maxblkid: 180
        path    <redacted>
        uid     1000
        gid     1000
        atime   Wed Jan  9 23:56:10 2019
        mtime   Sat May 15 13:08:22 2004
        ctime   Wed Jan  9 23:56:10 2019
        crtime  Sat May 15 13:08:22 2004
        gen     3402
        mode    100660
        size    2959582
        parent  10911
        links   1
        pflags  40800000004
Indirect blocks:
               0 L1  0:255a89f8f000:6000 0:a0391560000:6000 8000L/1e00P F=181 B=3168/3168
               0  L0 0:25d226f4c000:6000 4000L/4000P F=1 B=3168/3168
            4000  L0 0:25d226f6a000:6000 4000L/4000P F=1 B=3168/3168
            8000  L0 0:25d226f76000:6000 4000L/4000P F=1 B=3168/3168
<snip>
          2cc000  L0 0:25d26af23000:6000 4000L/4000P F=1 B=3168/3168
          2d0000  L0 0:25d26af29000:6000 4000L/4000P F=1 B=3168/3168

        segment [0000000000000000, 00000000002d4000) size 2.83M
So, before replication, the logical size of the file is 2.83MB while the on-disk size (dsize) is 2.86MB; the on-disk size is ~101% of the logical size.
After replication, the logical size of the file is still 2.83MB, but the on-disk size is now 3.81MB, or ~134.6% of the logical size.
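Sanity-checking those ratios from the dsize/lsize values in the two dumps:

```shell
# dsize-to-lsize ratios, values in MiB taken from the zdb output above
awk 'BEGIN {
    printf "original:   %.1f%%\n", 100 * 2.86 / 2.83
    printf "replicated: %.1f%%\n", 100 * 3.81 / 2.83
}'
```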
The recordsize of the top-level datasets for the original and backup pools is 128KB, while the recordsize for the "photos" dataset is 16KB. If I go into the "edit settings" for the "photos" dataset on the backup pool, it warns me that 64KB is the optimal recordsize. I assume that the 16KB value was replicated from the original dataset (which also shows 16KB, but gives no warning).
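To double-check what the replication actually carried over, the property can be compared directly on both sides (dataset names as in the dumps above); the "source" column shows whether the value is local, inherited, or received:

```shell
zfs get -o name,value,source recordsize Goliath/photos GoliathBackup/photos
```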
So, I'm pretty confused. I understand that there can be wasted space if only a portion of the underlying record/block size is used by a file, but the dumps I have here seem to imply that the block structure of the files is functionally identical, as is the recordsize of the dataset.
Does anyone have any idea what I am doing wrong here? Why does this file take up so much extra space after replication to a pool made up of a larger number of smaller drives? Is there anything I can do to prevent this? I'm not particularly concerned with maximizing performance during the backup-and-restore phase of this plan; I just want to make sure that I don't lose anything, and that the rebuilt original pool performs well.
Thanks in advance for any help!