FreeNAS 8.0.x Upgrades and Data Corruption


kevanbrown

Dabbler
Joined
Mar 19, 2012
Messages
17
Each time I upgrade between FreeNAS 8.0.x builds, one or more of my NFS/ZFS volumes reports data corruption shortly after the nightly system status report is generated. I don't know whether the timing is related, but regardless, the data corruption is a major inconvenience.

Environment:

Dell PowerEdge 1900 with a HighPoint RocketRAID 2314 attached to a Sans Digital 8-bay SATA disk enclosure. There are 8 HDDs in the enclosure; the RocketRAID 2314 is configured with hardware RAID enabled, and four mirrors are created from the 8 HDDs. ZFS volumes are then created atop these hardware RAID mirror logical drives and exported as NFS shares for use with VMware ESX 5.0 running on two Dell PowerEdge T410 servers.
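
Roughly speaking, the layout is equivalent to creating one single-device pool per hardware mirror, something like this (pool and device names here are only illustrative; I actually set everything up through the FreeNAS web GUI):

    # Each hardware RAID mirror shows up to FreeBSD as one logical drive
    # (da0 through da3 are assumed names for illustration)
    zpool create vmstore0 da0
    zpool create vmstore1 da1
    zpool create vmstore2 da2
    zpool create vmstore3 da3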

Process resulting in issue:

I perform a web upgrade of my current FreeNAS 8.0.x build to the latest FreeNAS 8.0.x build (I'm usually only one build behind). After the first nightly system status report is generated (3:00 AM), the backups for one or more of my virtual machines begin to fail and checksum errors are logged by ZFS, resulting in I/O errors and data corruption. I have tried deleting the virtual machine files and restoring them from backup, but this doesn't completely resolve the checksum errors on the volume, and I am ultimately forced to destroy and recreate the underlying ZFS volume and NFS share, then restore the backups; everything is usually fine from then on. As you can imagine, this process requires hours of work and downtime after each FreeNAS upgrade and the corruption that follows.
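
For anyone wanting to see the damage, the checksum counts and the list of affected files can be pulled with something like this (pool name is illustrative):

    zpool status -v vmstore0    # -v also lists files with permanent errors
    zpool scrub vmstore0        # re-read and verify every block's checksum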

My last upgrade (just this past weekend) was from FreeNAS 8.0.3-p2 to 8.0.4. The same data corruption condition eventually reared its ugly head once again.

With this latest build upgrade, I also noticed that while monitoring /var/log/messages via the web console for the barrage of ZFS I/O checksum failure events, I suddenly lost access to the management address of the FreeNAS system. I ran ifconfig bce0 down and ifconfig bce0 up to resolve the issue and was once again able to reach the FreeNAS web console. I'm not sure what caused this, and I haven't seen this particular issue in previous upgrades (just the data corruption).
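
For the record, the exact commands I ran from the console to restore access were:

    ifconfig bce0 down
    ifconfig bce0 up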

Regards,

Kevan
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
So, to make sure I understand: you have 4 mirrors set up in hardware RAID. How is your ZFS set up? Did you use RAIDZ? RAIDZ2? Throw everything into the pool with no redundancy?
 

kevanbrown

Dabbler
Joined
Mar 19, 2012
Messages
17
Hi Louis,

Four hardware RAID mirrors, each with a single ZFS volume; each ZFS volume is exported as a separate NFS share. Redundancy is provided at the hardware RAID level only.
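
So from ZFS's point of view, each pool is a single-device stripe; zpool status on each one looks roughly like this (names illustrative):

      pool: vmstore0
     state: ONLINE
    config:
            NAME        STATE     READ WRITE CKSUM
            vmstore0    ONLINE       0     0     0
              da0       ONLINE       0     0     0

That means ZFS can detect checksum errors, but it has no ZFS-level redundancy to repair them from; any healing has to come from the hardware RAID.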

Kevan
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
OK. Dumb question: why use ZFS? Usually you take a collection of spindles and use ZFS to manage them, but you seem to be doing all the management with the RAID controller. For that scenario, I would probably use UFS. It's less memory intensive.
 

kevanbrown

Dabbler
Joined
Mar 19, 2012
Messages
17
In the FreeNAS 8.0.x beta/release-candidate days, that's exactly what I did; I used ZFS to manage both the RAID and the file system. However, after FreeNAS 8.0 RC3, newer releases no longer worked correctly with my RAID controller and passive port-multiplier SATA enclosure; instead of seeing all four HDDs on each channel, FreeNAS would only see the first HDD on each channel. I reported this as a bug in RC4 and later, but the FreeNAS team closed the bug and said it was intended behavior, with no real explanation of why it had worked correctly in previous builds. So I've stuck with ZFS ever since, even though the RAID is now managed by the hardware controller instead of the file system.

My original intent with ZFS when I first started using FreeNAS was to use it as both the RAID and file system manager, and to take advantage of ZFS snapshots as an additional backup tool.

If switching to UFS will save me from these data corruption issues after each web GUI upgrade, then I'm not necessarily opposed to it, since I'm not currently using ZFS snapshots. However, the FreeNAS documentation pushes ZFS in all scenarios, and I'm still hoping to start using ZFS snapshots eventually.
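
When I do get around to snapshots, my understanding is that it amounts to something like this (dataset name illustrative):

    zfs snapshot vmstore0@pre-upgrade    # point-in-time snapshot before an upgrade
    zfs list -t snapshot                 # list existing snapshots
    zfs rollback vmstore0@pre-upgrade    # roll back to the snapshot if things go bad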

The server has 16GB of RAM, so memory usage should not be an issue in either case.
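
For what it's worth, actual ARC memory usage can be checked from the FreeBSD shell (output values will vary):

    sysctl kstat.zfs.misc.arcstats.size    # current ARC size in bytes
    sysctl vfs.zfs.arc_max                 # configured ARC ceiling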
 

louisk

Patron
Joined
Aug 10, 2011
Messages
441
I don't know (without testing) whether UFS will resolve any corruption issues, but if you're using the RAID card to do the work of managing spindles, UFS will have lower overhead and potentially be faster.
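
On plain FreeBSD, formatting one of those logical drives as UFS would be along these lines (device name and mount point are illustrative; in FreeNAS you'd do it through the GUI):

    newfs -U /dev/da0              # create UFS2 with soft updates
    mount /dev/da0 /mnt/vmstore0   # mount point must already exist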
 