"copy" parameter in 3-way mirror or raidz - question

Sara

Dabbler
Joined
Jun 26, 2022
Messages
18
Hello everyone,

One thing confuses me about the "copies" property. In Oracle's ZFS documentation I read that "These copies are in addition to any pool-level redundancy", and this is where my confusion begins.

E.g.:
# zfs set copies=2 users/home
# zfs get copies users/home
or the equivalent setting in the TrueNAS GUI.

- If I have a pool built from vdevs, whether mirror or RAIDZ-x, does this property affect the number of copies on one particular disk in the vdev, or on the vdev as a whole?

- The default is 1. For a 3-way mirror, I understand that there is then 1 copy of the data on each of the 3 disks. If I set it to 2, will there be 2 copies of the data on each of the 3 disks (reducing capacity by 50% as a result, but making the data more secure)? Do I understand this correctly?

- And does the same apply to RAIDZ-x? That is, apart from the checksum-based data recovery, will ZFS simply keep 1, 2 or 3 copies of the same data somewhere on the RAIDZ disks, and in the case of a mirror, that number of copies on each disk in the mirror? Do I understand this correctly?

- My mind exploded :D

Thx.
Best Regards,
Sara
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Sara said:
- The default is 1. For a 3-way mirror, I understand that there is then 1 copy of the data on each of the 3 disks. If I set it to 2, will there be 2 copies of the data on each of the 3 disks (reducing capacity by 50% as a result, but making the data more secure)? Do I understand this correctly?

Maybe. It has nothing to do with the redundancy you've chosen. If you have, for example, a three-way mirror vdev, then each block stored on that vdev is stored on each of the three component drives. If you have two three-way mirror vdevs, the same is true, but copies= becomes easier to understand with two vdevs.

If you use copies=2, then ZFS will store two copies of your data blocks, and it will try to ensure that they are not in the same vdev. For the dual three-way mirror vdev we are discussing, this means that ZFS will end up storing a copy on each vdev, and each vdev will store a copy on each of its three component drives.

The important bit to get your head around is that these are two entirely separate data redundancy strategies. The mirror one is much better/safer.

Sara said:
- And does the same apply to RAIDZ-x? That is, apart from the checksum-based data recovery, will ZFS simply keep 1, 2 or 3 copies of the same data somewhere on the RAIDZ disks, and in the case of a mirror, that number of copies on each disk in the mirror? Do I understand this correctly?

This is a bit garbled. Again, copies= refers to the number of copies of your blocks of data that ZFS will store on the pool. It functions against the pool as a whole. When you set, for example, a dataset to copies=10 (a hypothetical value used purely for illustration; ZFS accepts at most 3), this means that ZFS will attempt to store 10 copies of your block in the pool. If you have ten vdevs, it will try to store one in each. If you have five vdevs, it will try to store two in each. If you have one vdev, it will store all 10 on your single vdev.

How the vdevs are constructed is not considered. It would be a mistake to consider mirrors as additional copies of the blocks from a copies= perspective; the mirroring is redundancy for recovery from a disk failure.
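
A rough way to picture the spreading described above is a round-robin over vdevs. This is only a toy model, assuming ZFS spreads copies as evenly as it can; the real allocator is best-effort and makes no such guarantee. The function name spread_copies is made up for illustration, and copies=10 is used only to mirror the numbers above.

```python
def spread_copies(copies: int, vdev_count: int) -> list[int]:
    """Toy model: hand out `copies` block copies round-robin across vdevs.

    Hypothetical helper, not ZFS code. It deliberately ignores how each
    vdev is built (mirror, RAIDZ, single disk), just as copies= does.
    """
    if copies < 1 or vdev_count < 1:
        raise ValueError("copies and vdev_count must be at least 1")
    per_vdev = [0] * vdev_count
    for i in range(copies):
        # Copy number i lands on the next vdev in turn.
        per_vdev[i % vdev_count] += 1
    return per_vdev


if __name__ == "__main__":
    for vdevs in (10, 5, 1):
        print(vdevs, "vdev(s):", spread_copies(10, vdevs))
```

Whatever each vdev receives is then protected again by that vdev's own mirroring or parity, which is the separate layer of redundancy.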
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Regarding the reduction of usable space to 50%: YES. This assumes all new data. Any pre-existing data keeps whatever "copies=" was set to when it was written (defaulting to 1).

There is a maximum of 3 data copies, unless native ZFS encryption is used, in which case only 2 data copies are allowed.

Next, standard metadata, like directory entries, has at least 2 copies, even on a single-disk pool. If you set "copies=2", then standard metadata gets yet another copy, for 3 total. And if you set "copies=3", then standard metadata still only gets 3 copies, if I understand it correctly.

Last, you can fine-tune the redundancy of metadata with "redundant_metadata", similar to "copies". See the "zfsprops" manual page for details.
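
To put numbers on the points above, here is a small sketch. It assumes the default metadata redundancy, takes the "3 total" cap for metadata from the description above, and treats a mirror as writing every logical copy to every member disk. All helper names are invented for illustration and are not part of any ZFS API.

```python
def data_copies(copies: int, encrypted: bool = False) -> int:
    """Validate a copies= value: at most 3, or 2 with native encryption."""
    limit = 2 if encrypted else 3
    if not 1 <= copies <= limit:
        raise ValueError(f"copies must be between 1 and {limit}")
    return copies


def metadata_copies(copies: int) -> int:
    """Standard metadata gets one extra copy, capped at 3 in total."""
    return min(data_copies(copies) + 1, 3)


def physical_copies_on_mirror(copies: int, mirror_width: int) -> int:
    """Each logical copy is written to every disk of the mirror."""
    return data_copies(copies) * mirror_width


def usable_fraction(copies: int) -> float:
    """Share of the normal usable space left for newly written data."""
    return 1 / data_copies(copies)


if __name__ == "__main__":
    for c in (1, 2, 3):
        print(f"copies={c}: metadata copies={metadata_copies(c)}, "
              f"on a 3-way mirror={physical_copies_on_mirror(c, 3)} physical copies, "
              f"usable={usable_fraction(c):.0%}")
```

For example, copies=2 on a 3-way mirror means six physical copies of each new data block and half the usual usable space for that dataset.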
 

Sara

Dabbler
Joined
Jun 26, 2022
Messages
18
Thx a lot, jgreco and Arwen. This is how I understand it now :smile:: the "copies" property only affects the number of additional copies spread across the different vdevs in the pool; it does not affect the native redundancy of each vdev :) I think I got it, thanks, but I will ask a few more questions to feed my inner detail paranoia :smile:

Please correct me if I still don't understand something:

"copies" in mirror:

- Assuming we have 1 vdev in the pool: regardless of the copies= setting, ZFS stores at least one copy of the file on each of the 2 disks of a mirror vdev, and in a 3-way mirror on all 3 disks of the vdev?

- Assuming we have 2 vdevs in the pool, each consisting of 3 disks: with copies=1 it keeps a copy on each of the 3 disks of 1 vdev; with copies=2 or more it keeps a copy on every disk of all vdevs, unless we have more than 3 vdevs? Correct? :)

So if we have, for example, 3 vdevs that are each a 3-way mirror and copies=2, it will use 2 of the pool's vdevs to store the copies, with a copy on each disk of those vdevs, rotating the additional copy between vdevs? And with copies=3, a copy on every disk of every vdev in the pool in that case?

"copies" in raidZx:

- Assuming we have 1 RAIDZ2 vdev in the pool: copies=1 stores 3 copies (original + 2 copies) on 3 drives of the 1 vdev in the pool, and with copies=2 we still store 2+1 copies on 2+1 drives of the 1 vdev (because we only have 1 vdev in the pool).

Arwen said:
Last, you can fine-tune the redundancy of metadata with "redundant_metadata", similar to "copies". See the "zfsprops" manual page for details.

Thx, I will read more about this.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Sara said:
- Assuming we have 2 vdevs in the pool, each consisting of 3 disks: with copies=1 it keeps a copy on each of the 3 disks of 1 vdev; with copies=2 or more it keeps a copy on every disk of all vdevs, unless we have more than 3 vdevs? Correct? :)

Possible, but not guaranteed. Where the extra copies are stored is ZFS's internal business.

The use case for "copies=2" is when backing up a dataset to an external drive, say a USB drive. Here "copies=2" forces redundancy on a single-disk vdev (even a single-drive pool) and ensures that ZFS can correct data in the event of "bit rot" on the disk. Of course, it reduces the usable space by 50% and does not protect against a complete failure of the drive.
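
The single-drive case above can be illustrated with a toy in-memory model. It is not how ZFS lays out blocks on disk: SHA-256 merely stands in for whatever checksum the dataset uses, and write_block and read_block are hypothetical names. The sketch only shows why a second copy plus a checksum lets a read recover from bit rot, while a single copy cannot.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in for the block checksum ZFS keeps alongside the block pointer."""
    return hashlib.sha256(data).hexdigest()


def write_block(data: bytes, copies: int) -> dict:
    """Store `copies` replicas of the block together with one checksum."""
    return {
        "checksum": checksum(data),
        "replicas": [bytearray(data) for _ in range(copies)],
    }


def read_block(block: dict) -> bytes:
    """Return the first replica matching the checksum and repair the others."""
    for replica in block["replicas"]:
        if checksum(bytes(replica)) == block["checksum"]:
            good = bytes(replica)
            # Self-heal: overwrite every replica with the verified data.
            block["replicas"] = [bytearray(good) for _ in block["replicas"]]
            return good
    raise OSError("all replicas are corrupt; the block is unrecoverable")


if __name__ == "__main__":
    block = write_block(b"important document", copies=2)
    block["replicas"][0][0] ^= 0xFF  # simulate bit rot in one replica
    print(read_block(block))
```

If the whole drive dies, both replicas go with it, which is why this is no substitute for a mirror or a second backup.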
 
Joined
Oct 22, 2019
Messages
3,641
The only use I could ever see for copies=2 is for small files whose integrity is especially crucial to safeguard. Maybe wills and testaments, important receipts, financial and legal documents, etc.

A dedicated dataset for such files and documents with copies=2 makes sense. (Redundancy from the vdevs + extra protection from copies=2.)

The reason I say "small files" is because of what @Etorix alluded to. You effectively double the space consumed by every file that's written to the dataset.

For a dataset comprised of small, important files? Sure. For a dataset comprised of large files? Multimedia? No thanks!

 

Sara

Dabbler
Joined
Jun 26, 2022
Messages
18
I am thinking of Git repos with LFS: many small files and a lot of medium-sized files. Thx for the answer :smile:
 