FreeNAS freezes. Error "ffs_blkfree"


zeromh
Cadet · Joined May 10, 2012 · Messages: 6
I've been having problems with my NAS since I set it up. It worked fine for about 5 days, then froze and gave me a message like this:

dev = ufs/Largedrive, block=72658904, fs = /mnt/Largedriv
panic: ffs_blkfree: freeing free block
cpuid = 1

When this happens, I power the NAS off, reboot, run a disk check (fsck), and then reboot again. After that the NAS will work properly for a few days - sometimes as long as 5 or 6 days, sometimes as short as 1. But inevitably the same error occurs (sometimes naming a different drive; I only have two), and I have to restart the NAS.
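
For reference, the check itself is roughly this from the console (I'm not sure this is the canonical way - the device label is the one from the panic message, and the volume has to be unmounted before checking):

# umount /dev/ufs/Largedrive
# fsck -y /dev/ufs/Largedrive

The -y flag just answers yes to all of fsck's repair prompts.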

Incidentally, I've also noticed that the number of the "block" that it refers to keeps increasing. For example, I've gotten an error regarding block 7260008, then the next time the NAS froze it was block 72658808, then 72658904, etc.

Can someone tell me how to fix whatever this is? I'm not a very advanced FreeNAS user, so any hand-holding explanations would be appreciated.

Here's some System Info that might be useful:
Build: FreeNAS-8.0.3-RELEASE-p1-x86 (9591)
Intel(R) Pentium(R) 4 CPU 3.20GHz
Memory: 3826MB
OS: FreeBSD 8.2-RELEASE-p6

Thanks in advance!
 

jgreco
Resident Grinch · Joined May 29, 2011 · Messages: 18,680
You might want to take it offline for a bit and run some memory and system tests. It would help to know that your hardware didn't have an obvious problem of some sort. You may also want to use your drive manufacturer's tools to see if there are any indications of problems.

Typically, "ffs_blkfree: freeing free block" results from corrupted data. That can happen in several ways. For example, if bits are rotting in RAM, the system may act on corrupted cached filesystem data, and the ffs_blkfree sanity check catches it. Or the disk may be silently losing and remapping blocks on you. Or... well, anyway, the smart first step is to validate the hardware.
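
For the drives, smartctl should already be there on FreeNAS and is a reasonable starting point (the ada0 device name below is just an example - substitute your own disks):

# smartctl -a /dev/ada0
# smartctl -t long /dev/ada0

The first command dumps the SMART health attributes and error log - reallocated or pending sectors are a red flag. The second kicks off the drive's built-in extended self-test, which runs in the background; check the outcome later with another smartctl -a.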
 

cyberjock
Inactive Account · Joined Mar 25, 2012 · Messages: 19,526
4GB is really the minimum for running FreeNAS. There are lots of forum posts from people whose panics were resolved by adding more RAM.
 

zeromh
Cadet · Joined May 10, 2012 · Messages: 6
noobsauce: Yes, as William said, I'm using UFS.

jgreco: Thanks for the reply. Unfortunately, I don't know how to run any tests. This computer seemed to be working fine before (it used to run Windows XP), but of course something could have gone bad with the hardware. Can you explain how, or link to any information about running the tests you suggest?
 

jgreco
Resident Grinch · Joined May 29, 2011 · Messages: 18,680
I'd suggest finding a copy of memtest86 to run for a day or two. You can find standalone boot CDs that run without touching the content on your drives. My guess is that it will turn up something, but it's only one of several possible tools to use.
 

cyberjock
Inactive Account · Joined Mar 25, 2012 · Messages: 19,526
I just noticed the UFS in your original post. I'm not sure why, but I always seem to miss the UFS part. I guess I just always think "why use UFS when ZFS is better?"
 

jgreco
Resident Grinch · Joined May 29, 2011 · Messages: 18,680
Because ZFS isn't better, just different.

ZFS is memory-piggy. UFS has been fine on FreeBSD since the days of systems with 2MB (*M*B) of RAM. ZFS will make very good use of the memory it consumes on a large memory system, however.

ZFS is evolving fairly rapidly. When you upgrade to the next version of FreeNAS with a new version of ZFS, you might not be able to revert. UFS is good, old, stable, faithful UFS.

ZFS is great at managing massive amounts of storage across multiple disks, but is abysmal at managing a small amount of storage on a single disk (at least relative to UFS). FreeBSD's efforts over the years to improve that situation (ccd, Grog's vinum, geom) have not really given UFS the sort of superpowers that people would like. On the flip side, a UFS based system is fairly straightforward, easy to work with, easy to debug and repair, etc.

ZFS performance is unpredictable and relies too much on tuning. I've harped on that a lot. UFS performance is predictable but tends to be less than what it could be, at least in theory.

ZFS and UFS are just different things.
 

zeromh
Cadet · Joined May 10, 2012 · Messages: 6
Thanks again, jgreco.

I ran three passes of memtest with no errors. Is it necessary to run it for longer in order to find something?

I also ran the hard drive check that comes with the BIOS (which only takes a few minutes), and that came up fine as well.
 

jgreco
Resident Grinch · Joined May 29, 2011 · Messages: 18,680
I'd let the memtest run a day. Or two. If your fileserver is busy and it takes five days to crash, that probably means it needs to be pushed harder and longer than if it is lightly loaded and takes five days to crash.

However, there are other things that can go wrong in a system, and it won't pay to get too fixated on any one thing unless you get a clue that something specific is an issue.

You can try doing something fairly passive to test the I/O system. It won't catch many classes of problems, but you can script up some parallel dd's from each disk. Get a list of your disk devices:

# camcontrol devlist
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus1 target 0 lun 0 (pass1,ada1)
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus2 target 0 lun 0 (pass2,ada2)
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus3 target 0 lun 0 (pass3,ada3)

In this case, ada0 through ada3.

You can run a NON-quick (full) read test on each disk:

# dd if=/dev/ada0 of=/dev/null bs=1048576 &

The dd command just copies one device to another, in this case to the null device. The ampersand runs it as a background command. You can stack up and do all your devices simultaneously that way. When each dd finishes, it prints summary statistics; if all your devices are the same, the ending numbers reported should all be reasonably similar. Otherwise you might need to use your head a bit.
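
If typing one line per disk gets tedious, a little sh loop does the same thing (just a sketch - substitute your own device names from camcontrol devlist):

#!/bin/sh
# Read each disk end to end in parallel. Read errors will show up on
# the console and in dmesg; each dd prints its summary statistics
# (blocks transferred, elapsed time, throughput) when it finishes.
for disk in ada0 ada1 ada2 ada3; do
    dd if=/dev/$disk of=/dev/null bs=1048576 &
done
wait    # don't return until every background dd has completed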
 

zeromh
Cadet · Joined May 10, 2012 · Messages: 6
Well, for starters, our fileserver is very lightly loaded, so I don't know how much pushing it needs to crash. Anyway, I let the memory test run for 48 straight hours, and there weren't any errors.

I'm not sure if I did the I/O system check correctly; I just typed in exactly what you wrote for ada0 and ada1 (we only have two disks). The results were [1] 13178 and [2] 13180, respectively. Is that all there is to it?

I've got the NAS running again now. I'm assuming it will eventually freeze up again. Any other ideas for things to test?

I appreciate all your help. Thanks.
 