SOLVED moving files to FreeNAS, should I stay away from Windows? (getting CRC mismatches)

Status
Not open for further replies.

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
I have finally built my FreeNAS box and populated it with drives. When I started moving files over from the desktop, I discovered that even though the copy operation finishes without error, I get CRC mismatches in some very large MTS files. I simply copy and paste in Windows 7 Explorer.

Here is how I noticed it.
After copying, I created MD5 and SFV digests of the local folder. When I copied them to the server folder and ran the comparison, I got mismatches. So I fired up Microsoft's SyncToy and set up an echo (mirror) of the local folder to the existing server folder. SyncToy picked up the same files that had shown CRC errors, copied them anew and finished without error. Afterwards I ran the MD5 and SFV digest files again to verify on the server and, lo and behold, the same files showed CRC mismatches again. The files in question (MTS video files) are over 12 GB in size; I don't know if that is relevant. I thought SyncToy verified the transfer when finished, but apparently that is not the case.
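A minimal sketch of the kind of check described above, assuming Python 3 is available on the client and the server share is mounted as an ordinary directory. The function names `file_digests` and `compare_trees` are hypothetical, not part of any tool mentioned in this thread. It computes both an MD5 and a CRC32 (the checksum SFV files use) per file, reading in chunks so that 12 GB files do not need to fit in memory.

```python
import hashlib
import zlib
from pathlib import Path

CHUNK = 1024 * 1024  # read 1 MiB at a time so multi-GB files never sit in RAM


def file_digests(path):
    """Return (md5_hex, crc32_hex) for one file, read chunk by chunk."""
    md5 = hashlib.md5()
    crc = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK), b""):
            md5.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC32, as stored in SFV files
    return md5.hexdigest(), f"{crc:08x}"


def compare_trees(local_root, server_root):
    """Yield relative paths that are missing or differ under server_root."""
    local_root, server_root = Path(local_root), Path(server_root)
    for src in sorted(local_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(local_root)
        dst = server_root / rel
        if not dst.is_file() or file_digests(src) != file_digests(dst):
            yield rel
```

Both copies are read and hashed on the same machine here, so a fault in that machine (disk, RAM, network adapter) can still produce a mismatch even when the data on the server is intact.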

Should I stay away from Windows, and boot into Linux and use scp or rsync to copy my media library to the server?

EDIT: It just got even stranger. Yesterday I copied a huge folder (45 GB), ran an SFV check afterwards, and it passed. I re-ran the same check with the same digest file just now, and it found CRC mismatches in two files that were reported as OK yesterday. ??
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Please post details on your FreeNAS hardware. This could be a software problem or a hardware problem.
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
Sorry, I thought they were visible in the signature, so I didn't want to "clutter".

The RAM has been through 24h MEMtest and the HDDs have been burnt-in with badblocks.

FreeNAS 11.0 U4
X11SSM-F
i3 6100
16GB Samsung ECC RAM (M391A2K43BB1-CRC PC4-19200)
boot drive Sandisk UltraFit 3.0 32GB
6 x 4 TB WD Red in RAIDZ2
Seasonic FOCUS Plus 650W Gold
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Was this mysterious sfv check run over the network? Do you have a different switch, or at least different ports on your switch, that you can test?
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
rs225 said: "Was this mysterious sfv check run over the network? Do you have a different switch, or at least different ports on your switch, that you can test?"

Yes, it was run over the network.
I'll re-run the SFV check from Windows using a different port in the switch (Netgear GS105) once my rsync run with --checksum finishes. I have now booted into Linux Mint.
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
So I have switched from Windows 7 and done some testing in Linux Mint using rsync. I was still on the same switch port at first and changed ports later on (see below).

I did a grsync run (I can't manage to get rsync to work at the command line) of the same directory, and it completed successfully with exit status 0 (which, as far as I can tell, is rsync's way of saying that the copied file matches the source in content, not only in file name and size, i.e., some kind of hash check). Then I created an MD5SUM digest at the local Linux command line and checked it from there as well. The very same three huge files that got CRC mismatches under Windows mismatched now too.

Then I copied the digest to the directory on the server where rsync had put the files, SSHed into the FreeNAS box and ran md5 manually on three files (I couldn't find out how to check the entire digest in one go). One file (let's call it FILE) that belonged to the "mismatch three" had a checksum that did not match the one in the digest created by MD5SUM on Linux. The other two were control files from outside the "mismatch three", and their checksums matched the digest.
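Checking a whole digest in one go could be sketched as follows, assuming a Python 3 interpreter on the machine holding the files and a digest in the plain GNU md5sum layout (32 hex characters, a space, a space or `*`, then the file name). The helper names `md5_of` and `check_digest` are hypothetical, and escaped file names (lines starting with a backslash) are not handled.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """MD5 of a file, read in chunks so large files need little memory."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def check_digest(digest_file):
    """Return {file name: "OK" | "FAILED" | "MISSING"} for every digest entry.

    File names are resolved relative to the directory holding the digest.
    """
    digest_file = Path(digest_file)
    results = {}
    for line in digest_file.read_text(encoding="utf-8").splitlines():
        if len(line) < 35:
            continue  # blank or malformed line: too short to hold hash + name
        expected, name = line[:32].lower(), line[34:]
        target = digest_file.parent / name
        if not target.is_file():
            results[name] = "MISSING"
        elif md5_of(target) == expected:
            results[name] = "OK"
        else:
            results[name] = "FAILED"
    return results
```

Calling `check_digest` on the digest copied next to the files returns one status per entry; printing only the entries that are not "OK" gives the list of files to re-copy.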

I copied FILE two more times using normal drag and drop in Linux, and the checksums of these two copies differed from each other and from the one copied before. All three checksums generated from the three copies of FILE on the server were different from that of the local source.
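Copies of one file that differ from each other on every attempt suggest the corruption is not deterministic. One way to narrow that down is to hash the same source file several times on the client, with no network involved: if the digests disagree between runs, the client itself is reading or computing unreliably. This is a sketch under assumptions: Python 3 on the client, and the hypothetical helper names `repeated_digests` and `is_stable`.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """MD5 of a file, read in chunks so large files need little memory."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def repeated_digests(path, runs=3):
    """Hash the same file `runs` times and return every digest obtained."""
    return [md5_of(path) for _ in range(runs)]


def is_stable(path, runs=3):
    """True if every run produced the same digest for an unchanged file."""
    return len(set(repeated_digests(path, runs))) == 1
```

Repeated reads may be served from the operating system's cache rather than from the disk, so a stable result does not clear the drive on its own, whereas an unstable result on an unchanged file points at the machine doing the hashing.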

I now zipped some completely different files to a zip archive amounting to 16GB, to see if it is file size that messes things up during transfer. And again, after drag&drop transfer there was a CRC mismatch.

At this stage I switched ports on the switch and retransferred my huge zip file, now with grsync, which again gave exit status 0. The result was still a CRC mismatch.

I am completely stumped by this. It would seem that large files (>10 GB) get corrupted in transfer, yet (g)rsync gives exit status 0, which is supposed to be a certificate of integrity. Smaller files (<50 MB) verify okay by the thousands.

The rest of my setup is desktop - Netgear router - Netgear switch - 10 meter Cat6 cable to FreeNAS (I need ten meters to reach).
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
If you can connect the cable directly between the client and the NAS, it would help to determine whether the switch is a problem.
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
wblock said: "If you can connect the cable directly between the client and the NAS, it would help to determine whether the switch is a problem."
Certainly a valid test. If I may, I propose to try that tomorrow, because today I have a tight deadline and I need to read up on how to configure the NAS and desktop to talk to each other directly, in a way that I can revert without locking myself out of the NAS (I read a couple of old threads here in the forum, and it was not immediately obvious to me how to do that).

Yesterday I said the problem was with huge files. This morning I see that 16 MB RAW photo files are also problematic. Only the set of 10 000 files under 100 kB verified correctly time after time.

Anyway, I have run other tests today, and I am beginning to suspect that something in my desktop might be failing (it is over 6 years old, has been opened and extended often, and was recently moved to a new chassis when the FreeNAS box got the old one). Yesterday the computer blue-screened during a long checksum verification, and again during a long checksum creation. Since then I have been hit by a wave of BSODs on boot, and the computer has also begun to false-boot twice before it gets past POST and into Windows. I can hardly remember such behaviour before yesterday. After a BSOD, I think I can see some gibberish flash past in the POST messages for one or two of my disk drives (why aren't the POST messages logged somewhere? It's not like anyone has the time to read them). Also, my C: drive has been auto-checked on boot twice today.

Kind of suspicious, all that... :)

Mind you, I have been working normally on the computer all day, once I stopped copying files and running checksum tests. I am writing from the desktop now as well.

So, I got the brilliant idea to use my laptop instead.
I pulled the backup HDDs out of the cupboard and loaded 100 GB of files that have been problematic during my testing onto a USB SSD. I rebooted the laptop, created an SFV for the 100 GB (small and huge files), and then transferred it all to the NAS. After the transfer and another reboot, everything verifies OK when running the checksum on the laptop over the network (I haven't tried the desktop with the same digest because I can't afford more BSODs today).

Also, as regards the EDIT in my first post, the checksum that passed two days ago on the desktop and that then failed yesterday with the same digest...it passes now when run on the laptop! Twice!

(All this in Windows, no Linux today. So I guess my question in the title is not valid anymore.)

So my guess is that it is not the switch, because the laptop is using the same ports that were problematic yesterday, but rather the desktop and its HDDs, the router, or the two Ethernet cables connecting them to the switch.

First thing in the morning I need to run MEMTEST and some PRIME95 on the desktop, I think...
I'll also try to connect the desktop directly to the NAS to remove router (and switch) from the equation.

In the meantime, I'd be grateful for any thoughts as to good ways to proceed or just loose ideas...

One thing... I have the router first facing the internet. Then from the router I have one port to desktop and another port to the switch. And from the switch to the NAS. Could that be relevant, should I have the desktop connected to the switch, rather?
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
I would put the desktop on the switch so it is one less hop from the NAS, but the difference is probably not measurable.

Windows is not good at dealing with memory errors, but this could also be a power supply problem. If you can swap either, that might locate the problem. The good news is that it sounds like the FreeNAS system is okay.
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
wblock said: "The good news is that it sounds like the FreeNAS system is okay."
Yes, I feel some relief in that sense (still touching wood, though). I'd much rather the desktop be faulty than the new NAS.
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
Update:
This morning I fired up memtest86 and it immediately started blinking red. It's the first time I have ever seen bad RAM. I have now spent many hours working out which of the 6 RAM sticks was bad; in the end it was easily spotted, since the computer would not even POST with it installed. I am on 3 sticks now, hoping to RMA the set of 3 to which the faulty stick belongs, and have begun to generate new hash digests that will be copied over to the NAS with rsync later today or tomorrow and then double-verified. I will also run a 24h pass of memtest86 at some stage, but for now I wanted to get going again. I am putting the test of connecting the desktop directly to the NAS on hold for now; I think the culprit may have been solely the bad RAM stick.

Thanks to everyone who has chimed in!
 