Recurring lu_disk_write failed & lu_disk_lbwrite failed

Status
Not open for further replies.

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
A Cisco 3750G with a failed fan is like a little EZ-Bake-Oven. It'll keep chugging along, frying packets along the way, until it eventually burns its brains out.

They're not really liked all that well in some corners of the world; see catalyst-3750s-bad-luck-with-a-cisco-logo. Definitely worth swapping out the switch. If you can connect the ESXi box to the FreeNAS box directly with a cable, that is a great low-cost way to remove one element for testing purposes. This is probably easier than swapping network interfaces, and since you're seeing the same problem on two separate machines with two separate cards, I'd try eliminating the switch first. But you could also easily have gotten a batch of bad cards or, worse, fakes, which is a known problem with Intel cards in particular.
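
If you do try the direct cable, it's also worth watching the error counters on the FreeNAS side while you load the link; a dying switch, bad card, or bad cable usually shows up as input errors or TCP checksum discards rather than any helpful status light. A rough sketch from the FreeNAS shell:

  # link-level counters; Ierrs/Oerrs that keep climbing are a bad sign
  netstat -i

  # TCP-level view; look for retransmits and "discarded for bad checksums"
  netstat -s -p tcp

Run those before and after a transfer that triggers the errors and compare the numbers.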
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
1. The switch could be failing. I'd try a different switch, even if only temporarily. Most failed switches I've seen don't give you any "hey... I am broke" lights; they just don't work right, assuming the switch will even complete its boot-up testing. Very unfortunate, but it's true. :(
2. I'd try a different NIC on each end too, and if you have some spare cables, I'd swap out the network cables as well. I'd be willing to bet this is more of a hardware issue than a driver issue; if it were the driver, there would be tons of people with the problem. ;)

I agree, and I never rule anything out. I've given the switch a once-over, and all indications are good (not that this really means anything).

jgreco said:

A Cisco 3750G with a failed fan is like a little EZ-Bake-Oven. It'll keep chugging along, frying packets along the way, until it eventually burns its brains out.

They're not really liked all that well in some corners of the world; see catalyst-3750s-bad-luck-with-a-cisco-logo. Definitely worth swapping out the switch. If you can connect the ESXi box to the FreeNAS box directly with a cable, that is a great low-cost way to remove one element for testing purposes. This is probably easier than swapping network interfaces, and since you're seeing the same problem on two separate machines with two separate cards, I'd try eliminating the switch first. But you could also easily have gotten a batch of bad cards or, worse, fakes, which is a known problem with Intel cards in particular.

I decided to do just this tonight. I built a crossover cable and connected one Intel NIC to the other with no switch in between. Things actually looked good at first: I powered on one FreeBSD machine and began the install of Windows on another, and it seemed to be humming right along. Then I went ahead and started provisioning another Windows Server box, and that's when I started getting errors again. I hit about 400Mb/s of bandwidth, as reported in FreeNAS, during the heaviest throughput.

I don't have any other supported interfaces to try at the moment, so I'm not sure where to go from here. I may go out on a limb and run memtest on both servers to make sure all is good there.
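
One other thing I may try before memtest: taking iSCSI out of the picture entirely and just hammering the raw link with iperf, which ships with FreeNAS (on the ESXi side it would have to run inside one of the VMs; the address below is just a placeholder):

  # on the FreeNAS box: start an iperf server
  iperf -s

  # from a VM on the ESXi host: push traffic for 60 seconds on 4 parallel streams
  iperf -c 192.168.1.10 -t 60 -P 4

If the link holds near line rate with no errors, that would point at the iSCSI layer rather than the NICs or cable.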
 

benfrank3

Cadet
Joined
Oct 18, 2012
Messages
9
Hello,
Well, it took me a while, but I have moved off all the data (3 TB of 5.5 TB total), destroyed the zvol, made a new one, and attached it via iSCSI. I set the disk to dynamic (in Windows 2003) and did a long format (NTFS) of the volume.
Then I turned off write caching (on by default in Device Manager).
Lastly, I copied all 3 TB of data back.
I never saw any of the write alerts/issues in the FreeNAS console during all this, and so far it's still running well.
Not sure how long it will last, but I'm hoping this is fixed for the long term.
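
In case it helps anyone, the zvol part can also be done from the shell instead of the GUI; roughly something like this, with placeholder pool/zvol names and size, and obviously only after everything is copied off, since the destroy is immediate and unrecoverable:

  # destroy the old zvol backing the iSCSI extent (irreversible)
  zfs destroy tank/iscsivol

  # recreate it at the desired size; -V makes a volume rather than a filesystem
  zfs create -V 4T tank/iscsivol

The dynamic-disk conversion, the long NTFS format, and the write-cache checkbox (disk Properties -> Policies in Device Manager) all happen on the Windows side.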
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Hope yours is fixed.

Mine is set up differently, but I'm not making much progress. I connected both machines (ESXi and FreeNAS) via the Intel PRO/1000 GT NICs (iSCSI traffic only). Initially I routed traffic through the switch, but to take that out of the mix, I connected the NICs back to back. Still got errors.

I then set up ESXi to use the Intel NIC as its main NIC for management and iSCSI traffic. On the FreeNAS side, I used the onboard NIC. I also put load on 3 of the 5 VMs over iSCSI by using HDTune to write a 10GB file to the C: drive. I came pretty close to saturating the gigabit link and got no iSCSI errors. Then, at some point last night, a couple of iSCSI errors were thrown.
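
(For the record, the way I'm catching these is just following the log while the test runs; assuming FreeNAS 8.x, where the iSCSI target is istgt, the failures land in /var/log/messages:

  # follow the system log, showing only iSCSI target messages
  tail -f /var/log/messages | grep istgt

Anything with lu_disk_write or lu_disk_lbwrite in it counts as a hit.)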

Today I swapped Intel NICs in the ESXi box. I expected to see a bunch of problems right off the bat, but I was able to repeat the same test with no errors again.

Why is it that when using the Intel NICs back to back I constantly get iSCSI errors? I'm not sure where to go with this now.
 

benfrank3

Cadet
Joined
Oct 18, 2012
Messages
9
Sorry it took me so long to reply, but I did want to follow up on this because I fixed it.
I formatted the server I was running FreeNAS on, loaded Windows Server 2012, and created the same iSCSI target disk. Never looked back.
Transfer performance improved 10x, and I have had ZERO problems since October. FreeNAS, see ya!
Frank
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'm glad your problems are solved and that you wrote back. Personally, I'd be scared of having a large array on a Windows file system. Even Microsoft's new ReFS, which is only supported in Windows Server 2012, is having... issues. ReFS is, in many ways, Microsoft's homebrew ZFS-type file system, with checksumming, etc. I know several people who have tried ReFS in enterprise environments; they've all abandoned it due to issues that seem to develop out of nowhere, and when they contact Microsoft, the only recommendation they get is to restore from backup. Not a good thing for what is supposed to be a superior new file system for enterprise-class file servers.

Nonetheless, FreeNAS isn't for everyone. I'm glad you have your system working how you want, and I'm betting you are more comfortable with Microsoft stuff than FreeBSD.
 