VMware loses connection for 15 sec

Status
Not open for further replies.

oehTie

Cadet
Joined
Feb 11, 2014
Messages
8
Hello All,

Over the last few days we've seen disconnect messages in VMware. It loses its NFS mounts for 15 seconds and then reconnects. This seems to have something to do with the regular cron jobs: today we had a disconnect at 11:00 and another at 12:00. The logs don't really point to a cause. Is anyone else encountering this issue, or does anyone have suggestions on how to fix it?

Our system is a Supermicro box with 2x 6-core CPUs, 128 GB RAM, 1x dual-port 10 Gbit CX4 NIC, 6x SSDs for caching, and 2x LSI 9207-8e HBAs connected to an external JBOD filled with SAS disks.

Any questions, feel free to ask. Any help would be appreciated.

Thanks!

Theo
 

dlavigne

Guest
/etc/crontab shows the cron jobs. Do you have any scrubs, SMART tests, or periodic snapshot tasks scheduled at the affected times?
 

oehTie

Cadet
Joined
Feb 11, 2014
Messages
8
Only the default...

Just had another crash, at a moment when I'm sure no cron job was running.

No scrubs or SMART tests are running either. Snapshots have all been disabled. There are still some old snapshots; I'll remove these tonight.

The reporting graphs show only a large spike in system load: it goes up to 20, where the normal load is 1.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That's neat. You haven't given a whole lot to go on though. So that leaves me with big questions like "is the box nonresponsive during the event" etc.

So here's what I suggest. Log in some ssh sessions with large scrollback capability (or run them in script(1) or whatever).

First verify that the time on your filer is correct to the second.

Then in one window run a shell script:

Code:
#!/bin/sh
# Print a timestamp once per second; a jump in the output marks a stall.
while true; do
    date
    sleep 1
done


Let that run. It'll tell us a few things, one of which is ultimately whether the userland is getting screwed up, but it will also help identify other problems - such as if the time is suddenly off after an "event," there may be a hardware problem. You should generally see the time increase by one second, possibly two now and then.
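If reading the scrollback by eye gets tedious, a rough sketch like the one below could flag the stalls for you. It is only an illustration: the function name report_gaps and the 2-second threshold are assumptions made up for this example, not anything FreeNAS ships.

```shell
#!/bin/sh
# report_gaps THRESHOLD (hypothetical helper): read one epoch timestamp
# (in seconds) per line on stdin and print a message for every jump
# between consecutive timestamps that is larger than THRESHOLD seconds.
report_gaps() {
    threshold=$1
    prev=""
    while read -r now; do
        if [ -n "$prev" ] && [ $(( $now - $prev )) -gt "$threshold" ]; then
            echo "gap of $(( $now - $prev ))s between $prev and $now"
        fi
        prev=$now
    done
}

# Live use on the filer (runs until interrupted), assuming a 2-second
# threshold is a sensible cut-off:
# while true; do date +%s; sleep 1; done | report_gaps 2
```

A gap of one or two seconds is normal, which is why the threshold is set above that; anything close to the 15 seconds VMware reports would stand out immediately.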

In another window run:

zpool iostat 1

That'll let you get some idea of whether there's some massive I/O event going on.
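To line the I/O numbers up against the date loop afterwards, it may help to prefix each line with the wall-clock time. A minimal sketch, assuming a helper named stamp_lines (made up for this example) and a log path of your choosing:

```shell
#!/bin/sh
# stamp_lines (hypothetical helper): prefix every line read from stdin
# with the current wall-clock time as HH:MM:SS.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Live use on the filer (runs until interrupted); the log path is only
# an example:
# zpool iostat 1 | stamp_lines >> /tmp/zpool-iostat.log
```

Writing to a file means the data survives even if the ssh session itself drops during an event.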
 

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
What version of VMware ESX/ESXi are you running? What version of FreeNAS?
 