I have a Dell PowerEdge 2950 server: dual Xeon, 32 GB FB-DIMM RAM, 8x 146 GB 10K SAS disks connected via a PERC 6/i. I am running the latest system BIOS and PERC 6/i firmware. This is the second FreeNAS box I am building, and I am running into a weird problem.
I can create eight single-disk RAID 0 volumes with mfiutil and likewise see eight /dev/mfidX devices. Then I create a RAIDZ2 pool using the 8 disks. Next I create a dataset and share it out with CIFS. Then I start a large file transfer to the array, maxing out my gigabit link. Finally, I use mfiutil to fail a disk. The system KERNEL PANICS.
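For reference, the sequence above looks roughly like this. This is a sketch, not my exact commands: the drive IDs (e1:s0 through e1:s7), the pool name "tank", and the dataset name "share" are placeholders; check `mfiutil show drives` for the real IDs, and note that on FreeNAS the pool/CIFS steps would normally be done through the GUI rather than the shell:

```shell
# Create one single-drive RAID 0 volume per physical disk
# (PERC 6/i has no true JBOD/passthrough mode, hence this workaround):
for s in 0 1 2 3 4 5 6 7; do
    mfiutil create raid0 e1:s$s
done

# The eight volumes show up as /dev/mfid0 .. /dev/mfid7; build the pool on them:
zpool create tank raidz2 mfid0 mfid1 mfid2 mfid3 mfid4 mfid5 mfid6 mfid7
zfs create tank/share        # dataset that gets shared out via CIFS

# With a large CIFS transfer saturating the gigabit link, fail one member:
mfiutil fail e1:s0           # under load, this is where the box panics
```

With no I/O running, that last `mfiutil fail` degrades the pool cleanly instead of panicking.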
I can repeat the above steps when no data transfer is occurring, and the system handles it perfectly fine. I can even replace the disk with a new one and resilver the pool with no problems.
This is NO GOOD. My last FreeNAS build was a generic Supermicro box with some sort of LSI card that supported IT mode. I could physically pull two disks from a RAIDZ2 volume during a write. Granted, the system hung for a second, but otherwise it stayed connected and usable with a degraded volume.
I plan on using this Dell box as a SAN volume for VMware over iSCSI. If this box reboots, my ESX hosts connected to the iSCSI LUN will also hang, taking my cluster down.
I have seen WAY TOO MANY posts claiming a reboot is normal when a disk fails... That is not an option.
Is there a bug in the mfi driver where high throughput + disk failure == panic?