Data corruption cause

Status
Not open for further replies.

Earl

Cadet
Joined
Nov 1, 2013
Messages
2
I hope you guys will be gentle as I have barely any technical knowledge of FreeNAS. Our small business has a FreeNAS box built by a tech who no longer works for us. Based on what I've read here in the forums, I suspect our data is now corrupt and unrecoverable but would like the experts here to confirm and diagnose the cause before I finally give up.


The General Issue: Data Corruption. Almost all files are suspected to be corrupt; only a few were spared.


Symptoms:

• Sample folder shown in Explorer on Windows 7:
FreeNAS-Win7.jpg

Properties of a suspected corrupt file:
FreeNAS-File-Properties.jpg

When opening one of the suspected corrupt files:
FreeNAS-Open-Error.jpg

The error is similar for suspected corrupt PDF, Excel, and Word files.

• Same folder shown on Windows XP, even with "show hidden files and folders" enabled (notice the suspected corrupt files aren't even showing):
FreeNAS-WinXP.jpg



System:
• FreeNAS-8.2.0-BETA4-x64 (r11722)
• AMD E-350 Processor (Dual Core)
• 2x2TB RaidZ1
• 8 GB RAM, non-ECC (Yes, I've read this may be the culprit. I've run memtest, though, and there were no errors.)
FreeNAS-Memtest.jpg



Scrubs: run every 7 days. (recent scrub result below)
FreeNAS-Scrub.jpg



Virtualized: no


No other backups.


What I'm thinking the culprit is:
I hope I'm not taking the statements from this post (http://forums.freenas.org/threads/300-more-worth-it.15689/page-2) out of context, but I suspect one of the computers we're using may have bad memory.


Questions:
1) Would just opening Windows Explorer on an offending client computer with bad memory cause widespread corruption? I ask because I know some of the files had not been accessed for a long time.

2) Also, can you confirm that our data is no longer recoverable?


Thank you for your insights in advance.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'm gonna send you a PM. This is beyond what a forum can help you with...
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
If it were ZFS, I'd expect to see some reporting in the zpool status. ZFS will not be silent if it thinks there's an error. It seems unlikely that it would survive a massive corruption event without also destroying pool integrity.

That suggests that either your files are still there and intact, or something else went through and corrupted them, or they were corrupted when copied.

Depending on how they were copied, several different theories might be reasonable. For that corrupt file detailed above, is there a chance it was created in 2007 and copied to the NAS in 2012?

Regardless, you need to look at the data preferably from a different client.
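One way to make the "look from a different client" check concrete is to hash every file in the share from two machines and compare the results. The following is only a rough sketch: the function names are made up for illustration, and it assumes Python is available on both clients and that the share is mounted or mapped as an ordinary path.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            try:
                results[rel] = sha256_of_file(full)
            except OSError as exc:
                # Record unreadable files instead of stopping the scan.
                results[rel] = "ERROR: %s" % exc
    return results


def diff_manifests(a, b):
    """Return sorted paths that are missing on one side or hash differently."""
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))
```

If two clients produce different digests for the same file, or one client cannot see a file the other can, that points at the client or the transport rather than at the data on the pool. Reading files this way does not write to them.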
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I would like to know the gist of what the resolution was here, just for academic interest, if you don't mind, cyberjock et al.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
He hasn't responded to me. But several other people with similar problems have been traced to the workstations having bad RAM and opening every file to create thumbnails. So they opened up their folder of pictures and about half of them got trashed because of the thumbnailing :(
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Interesting! (unfortunate for the guy it affects).
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Snapshots to the rescue.
 

Earl

Cadet
Joined
Nov 1, 2013
Messages
2
Hello guys. Sorry I wasn't able to reply during the weekend. I forgot the password to my login here and I don't have access to my work e-mail from home.

I just replied to Cyberjock, hopeful that he can help us get the data back.

Just to answer some questions here, our data is on ZFS. Almost all of our data was copied about a year ago. There were no problems opening the files until about a week ago, when most of the files, at first glance, became hidden. Upon checking their properties, they show as archived. Opening them gives an invalid file format error.

Here's what's interesting. I followed jgreco's advice to view the files from different clients. I viewed the files and folders from three (3) Windows XP computers and none of them would show the suspected corrupt files. I viewed them from two (2) Windows 7 computers; one of them showed the corrupt files, albeit hidden, but the other wouldn't show them. All of those computers had 'show hidden files and folders' enabled.

Some of the folders contained exclusively Excel files. Some contained only Word files. Some only PDF files. I believe those files would not generate thumbnails. I understand Excel and Word create temp files, but even the folder that contains only PDF files got corrupted as well.
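A quick way to tell "corrupt" apart from "hidden but intact" for the file types described here is to look at the first few bytes of each file, since PDF and Office documents start with well-known signatures. This is a minimal sketch under stated assumptions: the `classify` helper and its return labels are invented for illustration, and a matching header only shows that the start of the file is plausible, not that the whole file is good.

```python
import os

# Well-known leading bytes ("magic numbers") for the formats mentioned.
SIGNATURES = {
    ".pdf": [b"%PDF-"],
    # Legacy Office files are OLE2 compound documents.
    ".doc": [b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"],
    ".xls": [b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"],
    # Office 2007 and later files are ZIP containers.
    ".docx": [b"PK\x03\x04"],
    ".xlsx": [b"PK\x03\x04"],
}


def classify(path):
    """Return 'ok', 'bad-header', 'empty', or 'unknown-type' for one file."""
    ext = os.path.splitext(path)[1].lower()
    expected = SIGNATURES.get(ext)
    if expected is None:
        return "unknown-type"
    with open(path, "rb") as handle:
        head = handle.read(8)
    if not head:
        return "empty"
    if any(head.startswith(sig) for sig in expected):
        return "ok"
    return "bad-header"
```

A folder where nearly every file comes back as `bad-header` or `empty` would suggest the contents really were overwritten; mostly `ok` results would suggest the problem is with attributes or with how the clients see the share. Some valid PDFs carry a few stray bytes before `%PDF-`, so a `bad-header` result deserves a second look before drawing conclusions.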
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
warri said: Snapshots to the rescue.

Not going to help.

If bad ram results in corruption of the file, then all snapshots are affected too.

Snapshots only track changes by applications, not changes by ZFS's self-healing. Since the change to the file would be the result of ZFS trying to 'fix' your bad hard drive that's actually bad RAM, it will change the blocks on disk that all snapshots refer to. That's exactly why downstream ZFS replicas will be corrupted too. Look at it the other way: if a disk did have silent corruption, and ZFS fixed it, wouldn't you want the fix to propagate to your other ZFS replicated copies?

ZFS has to trust the memory. There's no other way to get around it. That's exactly why ECC memory is listed as a requirement.
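The sharing argument in this post can be illustrated with a toy model. To be clear about the assumptions: `ToyPool` is a made-up class, it models only "snapshots point at the same blocks as the live file", and it says nothing about how real ZFS stores, checksums, or repairs data.

```python
class ToyPool:
    """Toy copy-on-write store: files and snapshots map names to block ids."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes
        self.live = {}        # file name -> block id
        self.snapshots = {}   # snapshot name -> {file name: block id}
        self._next_id = 0

    def write(self, name, data):
        """Application write: allocate a new block, leave old blocks alone."""
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.live[name] = block_id

    def snapshot(self, snap_name):
        """A snapshot copies the name -> block id map, not the data."""
        self.snapshots[snap_name] = dict(self.live)

    def rewrite_in_place(self, name, data):
        """Stand-in for a rewrite of the block that every reference shares."""
        self.blocks[self.live[name]] = data

    def read(self, name, snap_name=None):
        table = self.live if snap_name is None else self.snapshots[snap_name]
        return self.blocks[table[name]]


# Case 1: a client overwrites the file after the snapshot was taken.
pool = ToyPool()
pool.write("report.pdf", b"good")
pool.snapshot("monday")
pool.write("report.pdf", b"client-corrupted")
print(pool.read("report.pdf", "monday"))   # b'good'

# Case 2: the shared block itself is rewritten underneath the snapshot.
pool2 = ToyPool()
pool2.write("report.pdf", b"good")
pool2.snapshot("monday")
pool2.rewrite_in_place("report.pdf", b"bad")
print(pool2.read("report.pdf", "monday"))  # b'bad'
```

In the first case the snapshot still returns the old contents, which is the situation where a snapshot can rescue a file damaged by a client. In the second case the snapshot has nothing older to fall back on, which is the situation this post is describing.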
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
I think cyberjock mentioned corruption caused by bad workstation (=client) RAM. In this case snapshots should have helped to restore a working copy of the files.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
For the record, *NONE* of my CIFS shares include write permissions. If you want to add something to my FreeNAS, you have to come in with sftp or ftp from certain boxes on my LAN that *I* control (no wife, no kids, etc.). So as far as I know, there's no way any of the computers that are served files via Samba shares are going to have any ability to mess with any byte of pool storage. Stories like this one make me think this is a good idea, if it works for your situation (e.g., media serving).

And...of course...I make offline backups to other hard drives every night. Just in case.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
You're right. Your workstations could not directly corrupt ZFS. If your server has bad non-ECC RAM, then the loading of the thumbnails could cause corruption, since the pool will think there is corruption and attempt to fix it. There is no way to bypass this self-healing feature. Even if you mount the pool as read-only, it will try to repair itself when it finds corruption.

But this does make me wonder something. FTP/SFTP doesn't have error correction for the data packets. I've seen many files get corrupted just traveling 50 feet on LAN cables via FTP. So it's quite possible you are getting corruption because you are using FTP. Tonight I have plans, but tomorrow I'll be available (I'll send you a PM in the morning) and we'll go from there. But I'm a little skeptical of your choice of FTP/SFTP now that I know that's what you've used.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I think you misread CJ. I'm not the guy with the problem. I was just adding in that I configured my permissions as read-only from clients.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Oh, you are right! Can you tell I've had a long day? LOL
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
CJ, you're right that there's no error "correction". But both of those protocols are moderated by TCP, which detects corrupted packets, and both FTP and SFTP will request or NAK a bad packet, and it WILL be resent...

So I think with FTP/SFTP, the received and ACK'd packet *WILL* match the sent packet, at least as it occurred at the sender's NIC.
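For anyone curious what "TCP detects corrupted packets" looks like in practice, here is a small sketch of the 16-bit ones'-complement checksum described in RFC 1071, which is the kind of checksum TCP uses. It is simplified on purpose: a real TCP checksum also covers the TCP header and a pseudo-header, and the function name here is just for illustration.

```python
def internet_checksum(data):
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF


payload = b"hello world!"
damaged = bytes([payload[0] ^ 0x01]) + payload[1:]   # flip one bit in transit

# The receiver recomputes the checksum; a mismatch means the segment is
# dropped and the sender retransmits it.
print(internet_checksum(payload) != internet_checksum(damaged))  # True
```

A 16-bit checksum is fairly weak, so certain multi-bit errors can slip past it. SFTP additionally runs over SSH, which authenticates each packet with a MAC, so a damaged packet is much less likely to be accepted there than with plain FTP.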
 