Huge system wide corruption (Flash drive)

Status
Not open for further replies.

getoffmalawn

Cadet
Joined
Nov 26, 2011
Messages
2
I started experiencing strange issues with NFS clients hanging their entire operating system. I thought it might have been the clients, so I restarted them, and everything would work for a little while (about 5-10 minutes) before hanging again. So I logged into the web ui and restarted NFS, thinking it might have gone a little funny. It took an unusually long time (about 1-2 minutes), and although it said it had stopped and started, I still had no access. I logged in via ssh, and, well, I'll let the command do the talking.

Code:
[root@freenas] ~# ps aux | grep nfs 
Segmentation fault


That set off alarm bells. I could run the two commands separately, but piping immediately resulted in seg faults. So I figured, maybe it's worth restarting FreeNAS altogether, and did so through the web ui. After it finished rebooting, I couldn't access the web ui, but I could ssh into the box and ping it. After ssh'ing in, I ran ls, and I'll let the output do the talking again.

Code:
[root@freenas] ~# ls
/libexec/ld-elf.so.1: /lib/libc.so.7: Undefined symbol "environ"


So it would appear my system is royally messed up. I'm running FreeNAS on a cheap 4GB flash drive I bought specifically for it, about 2 weeks ago, so it would certainly seem that something has gone wrong with the drive. What can I do to diagnose the problem, and possibly fix it? I'd rather not go with a full reinstall if possible, although it's looking likely at this point.

Edit: I feel I should mention that I can run certain commands, like 'find .' to list all files in the current directory, I can change directories, etc, so for the most part the system is still functional at a low level, just quite broken.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
It's most likely your flash drive; it happens. There's not really any simple process to diagnose it. If you had a FreeBSD system you could possibly run 'fsck' on it, but you're better off trying another flash drive.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's possibly the flash drive, which, as protosd says, should be replaced. You *can* run fsck against the partitions on the flash drive, but I don't have a free FreeNAS box handy at the moment so I don't have specific directions for you.

However, I should note that I have seen, over the years, problems with the base system occasionally causing garbage to be written out to disk, and things like libc are particularly likely to get corrupted because of how widely that's used.

I'm guessing if you are using NFS then you are familiar enough with FreeBSD to figure out how to accomplish the following with the resources available to you. What you should do is to grab a snapshot of the checksums of your FreeNAS flash, something like

# mtree -c -p / -x -k sha1digest > /data/myflashcksums.1

Then store that file somewhere (NOT on your datastore! Avoid writes to your datastore just in case your system is corrupting things)

Then reinstall FreeNAS on a new flash. Run the same command, writing to a different file

# mtree -c -p / -x -k sha1digest > /data/myflashcksums.2

And then compare the two. Or if you want to be clever with mtree, feed the first output into mtree and let it tell you the differences.
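If mtree isn't convenient for the comparison, the same idea can be sketched in Python. This is a rough stand-in for the checksum-and-compare step, not mtree itself: the function names `sha1_manifest` and `diff_manifests` are made up for illustration, and it only looks at regular file contents (no permissions, owners or timestamps). Like `-x`, it does not descend into other mounted filesystems.

```python
import hashlib
import os


def sha1_manifest(root):
    """Map each regular file under root (relative path) to its SHA-1 hex digest."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not cross mount points, similar in spirit to mtree -x.
        dirnames[:] = [d for d in dirnames
                       if not os.path.ismount(os.path.join(dirpath, d))]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks, devices, sockets
            digest = hashlib.sha1()
            try:
                with open(full, "rb") as handle:
                    for chunk in iter(lambda: handle.read(65536), b""):
                        digest.update(chunk)
            except OSError:
                continue  # unreadable file: leave it out rather than guess
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def diff_manifests(old, new):
    """Return (changed, only_in_old, only_in_new) as sorted lists of paths."""
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    only_old = sorted(p for p in old if p not in new)
    only_new = sorted(p for p in new if p not in old)
    return changed, only_old, only_new
```

You would build one manifest from the suspect flash and one from the fresh install, then pass both to `diff_manifests`. The `changed` list is the interesting one: files present in both installs whose contents differ.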

Basically one of several things will happen.

1) There will be no differences. If that's the case, your system itself is corrupting data, and you need to run memtest and other burn-in utilities to help identify what needs replacing. Do not proceed to use FreeNAS until you have a stable system, or you WILL LOSE DATA.

2) There will be differences all over the place, including in files that you cannot imagine FreeNAS having any reason to have accessed. This probably means defective flash; again, replace it at once.

3) There will be a small number of differences in commonly accessed files. This suggests that your system is not stable, but could also be bad flash. Replace the flash, cheap insurance, but burn in your system as though the situation was 1).
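The three cases above can be written down as a small triage helper. This is only a sketch: the name `triage` and the 1% cutoff between "a small number" and "all over the place" are assumptions made for illustration, since no numbers are given above, and it cannot tell whether the differing files are commonly accessed ones.

```python
def triage(changed_paths, total_files, widespread_fraction=0.01):
    """Classify a checksum comparison into one of the three cases.

    widespread_fraction is a hypothetical threshold, not a value from
    this thread; adjust it to taste.
    """
    if not changed_paths:
        # Case 1: flash contents match, so suspect RAM or other hardware.
        return "system"
    if len(changed_paths) / max(total_files, 1) >= widespread_fraction:
        # Case 2: differences everywhere, so suspect the flash.
        return "flash"
    # Case 3: a few differences; replace the flash AND burn in the system.
    return "either"
```

Whatever it returns, treat the result as a hint for where to look first, not as a verdict.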

If you see any other unusual situation or identify any other pattern, post it and we can see what the possibilities are. No matter what, your system as it stands is not trustworthy and should be regarded with some skepticism.
 

getoffmalawn

Cadet
Joined
Nov 26, 2011
Messages
2
Hey guys, sorry for the late update. The flash drive was screwed, so I spent another 16 dollars to get a better branded one, and it's worked perfectly ever since. Thanks for the help regardless!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Great to hear it. Cheap flash is great when it works, stinks when it doesn't.
 