Verify data integrity after copy (check for client errors). Create server-side list of checksums?

Status
Not open for further replies.

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
Maybe I am paranoid, but: Say I copy a few hundred thousand files to the NAS over the network via a Windows CIFS share, on quality hardware (specs below). Once the data is safely on the ZFS NAS, I'm not too worried about its integrity. But how do I know that a bit hasn't flipped during data transmission, say because of a bug in the router or a flaw in a NIC?

Is this something to worry about, or will error checking catch this sort of mistake along the way?

One idea is to run a checksum on all of the files, and verify it against the original data (this also could be saved and used to check integrity of my backups). On Windows this is easy enough using a program like ExactFile, but being new to Unix, I'm not sure how to do this server-side on the NAS.


Hardware:

FreeNAS:
Lenovo ThinkServer TS140
4x WD Red 3 TB in RAID-Z2
8 GB ECC RAM

Client (Windows 10 Pro):
Lenovo ThinkServer TS140
8 GB ECC RAM

Network:
Ubiquiti US-8-60W Unifi Switch
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
You can do it with a command pipeline or script, or if you're clever and a little patient learning about it, even with mtree.

The commandline solution can be molded to meet your needs a little more easily. This will create a list of the SHA256 for files in the current directory and subdirectories:

% find . -type f -print0 | xargs -0 sha256
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
Thanks, I'm looking into this and will experiment.

How does one check on the progress of this hashing, and of other commands that run in the background, e.g. smartctl?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
If you do it the way I showed at the command line, it just prints it to the terminal. You can redirect output to a file and stick it in the background if you want, which is probably more useful if you're going to compare to generated numbers on the PC, in which case you can use "tail -f file" to monitor its progress.
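As a rough sketch of the redirect-and-background idea above: the snippet below assumes a Bourne-style shell (sh or bash) rather than the csh-style `%` prompt shown in this thread, and the names `hash_cmd`, `hash_tree` and `/tmp/checksums.txt` are made up for illustration. It picks `sha256` (FreeBSD/FreeNAS) or `sha256sum` (GNU/Linux, Cygwin), whichever is installed, and writes one line per file to a list.

```shell
#!/bin/sh
# Sketch only: hash every file under a directory into a list file.
# hash_cmd and hash_tree are hypothetical names, not part of any tool.

# FreeBSD/FreeNAS ships `sha256`; GNU/Linux and Cygwin ship `sha256sum`.
if command -v sha256sum >/dev/null 2>&1; then
    hash_cmd="sha256sum"
else
    hash_cmd="sha256 -r"    # -r prints the hash first, then the file name
fi

# hash_tree DIR OUTFILE
# Writes one line per file under DIR (hash, then ./relative/path) to OUTFILE.
# Keep OUTFILE outside DIR so the list does not end up hashing itself.
hash_tree() {
    ( cd "$1" && find . -type f -print0 | xargs -0 $hash_cmd ) > "$2"
}

# Typical use, run in the background and watched with tail:
#   hash_tree . /tmp/checksums.txt &
#   tail -f /tmp/checksums.txt    # Ctrl-C stops tail, not the hashing job
```

Because the paths are recorded relative to the directory being hashed, a list made the same way on the client should line up with the one made on the NAS, provided both sides start from the matching top-level folder.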

Honestly there are so many ways you can manipulate stuff into useful formats, I didn't really try to do anything complicated.
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
Thanks again. I've spent a few hours playing with this and made some headway. Most notably, I learned how to use "ps" and "top" to view the running processes (find, sha256, etc.), and the kill command to terminate them once I realized that the verbose hashing process was going to dominate my terminal for quite some time! And I am learning my way around other basic commands as well.

Can you expand upon how to direct the output to a file? "% find . -type f -print0 | xargs -0 sha256 tail -f file" worked to create the verbose checksums, and I think I understand how to use "tail -f file" to check progress once underway.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Redirect into a file, just like in DOS. DOS takes a large bit of its syntax from UNIX.

% echo hi > newfile

If you want it to go to both a file AND to the console you can use tee.

% find . -type f -print0 | xargs -0 sha256 | tee newfile

Now the thing is, UNIX is a great text processing system and you can pretty much reformat output into more usable formats in any one of a dozen ways, but this may be a bit beyond what is sane for a beginner to attempt to do. However, I will note that you can install Cygwin on Windows and use it to install some basic UNIX utilities on Windows, at which point you can do a neat trick... run a similar "find" command on the Windows side, redirect it into a file, run sort on both files, and then run diff on them. Getting it "just right" might be a little tricky, but if you managed to find your way through ps and top and all that, it's likely within your grasp.
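The sort-then-diff trick might look something like the sketch below. It assumes a Bourne-style shell (sh or bash), and that both lists were already produced in the same layout: hash first, then a relative path, one file per line. `compare_lists` and the file names in the usage comment are hypothetical. In practice the Windows-side list may need some massaging first (path separators, a leading `*` that some `sha256sum` builds print before the file name), which is the "just right" part.

```shell
#!/bin/sh
# Sketch only: compare two checksum lists regardless of the order
# in which find happened to visit the files.
# compare_lists is a hypothetical helper name.

# compare_lists LIST_A LIST_B
# Sorts both lists by path (field 2 onward) and diffs the results.
# Exit status is diff's: 0 if the lists match, 1 if they differ.
compare_lists() {
    # LC_ALL=C keeps the sort order identical on both machines.
    LC_ALL=C sort -k 2 "$1" > "$1.sorted"
    LC_ALL=C sort -k 2 "$2" > "$2.sorted"
    diff "$1.sorted" "$2.sorted"
}

# Typical use:
#   compare_lists nas-checksums.txt pc-checksums.txt && echo "all files match"
```

Any line that diff prints is a file that is missing on one side or whose hash differs, so an empty result is the outcome you are hoping for.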
 