pedz
Dabbler
- Joined
- Jan 29, 2022
- Messages
- 35
I have a Mini X+ with 32 GB of memory. The user interface really seems to discourage enabling deduplication because it is “memory intensive”, so I chose not to enable it.
I am currently going through a car load (literally) of old disk drives and copying their contents to the NAS. I am sure there are a ton of duplicate files. In the past I wrote a program to find duplicate files. One of its features was that I could easily get the full path to every duplicate, so if I found a file that I no longer wanted, I could delete it along with all of its duplicates and free up the space.
The first question: is 32 GB enough memory to safely run deduplication? The NAS currently has 40 TB of raidz2 storage, and it runs no VMs or other load.
The second question: does the built-in deduplication system offer a way to tell the user where the duplicates are so I can delete those too? I get the impression that it finds duplicate blocks, not duplicate files, but I’m only guessing about that.
The last question is about building my own system to find duplicates. This is a BSD system. I have not looked around yet, but I assume I can find packages to add for a simple database, etc. If nothing else, I can build a VM and run the app I create there. Finding duplicate files is relatively simple (but slow).
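To make the "find duplicate files" idea concrete, here is a minimal Python sketch of the general approach. It is not the original program, and the names `file_digest` and `find_duplicates` are made up for illustration. It assumes that grouping files by size first and then comparing SHA-256 digests is good enough, and it keeps everything in memory rather than in a database.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups; each group is a sorted list of full paths
    to files under `root` that have identical size and SHA-256 digest."""
    # Pass 1: group by size. Files with a unique size cannot have a
    # duplicate, so they never need to be read.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file; ignore it

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_hash[(size, file_digest(path))].append(path)
            except OSError:
                continue

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicates("/some/dataset")` (the path is a placeholder) returns the full paths grouped by identical content, which is the piece needed to delete a file together with all of its copies. Note that hard links to the same file would show up as duplicates in this sketch even though they share storage.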