VirtualBox continuously "aborts" clients

Status
Not open for further replies.

Chris Deluca

Cadet
Joined
Apr 9, 2016
Messages
8
Since the upgrade to FreeNAS 9.10, I cannot keep any VirtualBox client up and stable (Oracle Red Hat Linux 7.2 and Windows Server 2012 R2 are the two I'm trying right now). I got them both installed, but they are just flaky as hell. I'll be in the middle of doing something and poof, the VM is gone. Under 9.3, no problems whatsoever. Suggestions?
 

Chris Deluca

Cadet
Joined
Apr 9, 2016
Messages
8
So, this was my repro:
1. From a stopped state, start VirtualBox. Start both VirtualBox "clients" (Win2K12 R2, Oracle Red Hat 7.2).
2. Do update activity in both clients (e.g. Windows Update and Software Update)
3. Win2K12R2 aborted (Oracle still running at the moment). The messages captured are:

VirtualBox /var/log/messages:
root@VirtualBox:/ # tail -f /var/log/messages
Apr 16 16:04:30 VirtualBox syslogd: exiting on signal 15
Apr 16 16:11:34 VirtualBox syslogd: kernel boot file is /boot/kernel/kernel
Apr 16 16:14:29 VirtualBox reboot: rebooted by root
Apr 16 16:16:40 VirtualBox syslogd: exiting on signal 15
Apr 16 16:16:44 VirtualBox syslogd: kernel boot file is /boot/kernel/kernel
Apr 16 16:21:17 VirtualBox syslogd: exiting on signal 15
Apr 16 16:21:21 VirtualBox syslogd: kernel boot file is /boot/kernel/kernel
Apr 16 16:45:17 VirtualBox syslogd: kernel boot file is /boot/kernel/kernel
Apr 16 16:59:44 VirtualBox syslogd: exiting on signal 15
Apr 17 07:37:12 VirtualBox syslogd: kernel boot file is /boot/kernel/kernel

FreeNAS /var/log/messages:
[root@freenas ~]# tail -f /var/log/messages
Apr 17 07:37:09 freenas kernel: epair0a: link state changed to UP
Apr 17 07:37:09 freenas kernel: epair0a: link state changed to UP
Apr 17 07:37:09 freenas kernel: epair0b: link state changed to UP
Apr 17 07:37:09 freenas kernel: epair0b: link state changed to UP
Apr 17 07:37:09 freenas kernel: epair0a: promiscuous mode enabled
Apr 17 07:37:09 freenas kernel: ng_ether_ifnet_arrival_event: can't re-name node epair0b
Apr 17 07:37:09 freenas kernel: ng_ether_ifnet_arrival_event: can't re-name node epair0b
Apr 17 07:37:11 freenas mDNSResponder: mDNS_Execute: SendResponses didn't send all its responses; will try again in one second
Apr 17 07:37:11 freenas mDNSResponder: mDNS_Execute: SendResponses didn't send all its responses; will try again in one second
Apr 17 07:38:02 freenas kernel: epair0b: promiscuous mode enabled
Apr 17 07:46:58 freenas kernel: pid 41542 (VBoxHeadless), uid 1001: exited on signal 11

Thanks in advance.
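For anyone who wants to check their own host log for the same kind of abort, here is a minimal Python sketch that picks out the "exited on signal" lines shown in the FreeNAS log above. The function name and the sample list are made up for illustration; only the log line format comes from this post. Signal 11 is SIGSEGV, i.e. the VBoxHeadless process segfaulted.

```python
import re

# Matches FreeBSD kernel lines such as:
#   "... kernel: pid 41542 (VBoxHeadless), uid 1001: exited on signal 11"
SIGNAL_EXIT = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\), uid (?P<uid>\d+): "
    r"exited on signal (?P<signal>\d+)"
)

def find_signal_exits(lines, process_name="VBoxHeadless"):
    """Return (pid, signal) tuples for process_name entries that died on a signal."""
    hits = []
    for line in lines:
        match = SIGNAL_EXIT.search(line)
        if match and match.group("name") == process_name:
            hits.append((int(match.group("pid")), int(match.group("signal"))))
    return hits

# Two lines copied from the log excerpt in this post.
sample = [
    "Apr 17 07:38:02 freenas kernel: epair0b: promiscuous mode enabled",
    "Apr 17 07:46:58 freenas kernel: pid 41542 (VBoxHeadless), uid 1001: exited on signal 11",
]
print(find_signal_exits(sample))
```

To run it against a real log, pass the open file instead of the sample list, e.g. `find_signal_exits(open("/var/log/messages"))`.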
 

Chris Deluca

Cadet
Joined
Apr 9, 2016
Messages
8
1. ASRock C2750D4I Mini ITX Server Motherboard FCBGA1283 DDR3 1600/1333
2. 2 @ Western Digital WD30EZRX (3TB WD Green SATA 64mb cache)
3. 2 @ 64gb SSD (as L2ARC cache)
4. 32gb ECC RAM (8GBx4 DDR3/DDR3L-1600MT/s (PC3-12800) DR x8 ECC UDIMM)

Attached are photos from the reporting function showing overall system load. The SSDs are extras I had lying around - I'm not 100% convinced they are providing the benefit I had hoped. Everything else is brand new.

 

Attachments

  • cpu.png (14.5 KB)
  • disk.png (17.1 KB)
  • phys_ram.png (14.5 KB)
  • swap.png (9.9 KB)
  • interface0_traffic.png (11.6 KB)
  • diskspace.png (10.8 KB)
  • processes.png (14.9 KB)
  • arc_size.png (9.4 KB)
  • arc_hit_ratio.png (9.5 KB)

iBlueDragon

Cadet
Joined
Aug 13, 2015
Messages
4
Nearly same issue here, so I don't think it's hardware related (although I have the same CPU and same amount of RAM as the OP).

Everything was running fine (2 VMs, WHS 2011 & Win 7) even under 9.10 till I created a virtual machine to run Ubuntu (Desktop). The WHS 2011 became unstable to the point that it was not usable anymore. Sometimes the VM got aborted, sometimes it just hung and couldn't be shut down from the VirtualBox control panel (progress bar stopped at 14% or 28%). Trying to restart the jail often led to a reboot of the NAS. The Win 7-VM didn't seem to have problems.

I tried reinstalling the jail and WHS. The VM got aborted twice (once while booting from the ISO image, once after installing the VirtualBox Guest Additions). After that it ran fine while installing hundreds of updates. Everything became totally unstable again after I added the VM for Ubuntu (with the OS not even installed yet).

I am still investigating the issue. I have now deleted the Ubuntu VM again and reverted to a snapshot of the WHS that I took right after the install. So it's installing updates again…

Does anyone have Windows and Linux running properly in parallel in VirtualBox under 9.10?

I will report back when I get the WHS updated and the necessary software installed.

Kind regards,
iBlueDragon
 

Chris Deluca

Cadet
Joined
Apr 9, 2016
Messages
8
I also ran into some of the things iBlueDragon described. For example, today I tried to shut down one of the VBox instances and it got stuck at the 28% mark. Attached are the screenshot and the log from that time. It has been sitting here for almost half an hour.

An interesting note: the system has become more stable after applying both Windows and Oracle patches to the respective client OSs (clearly not 100% yet, but a little better...).
 

Attachments

  • stuck.txt (63.9 KB)
  • Screen Shot 2016-04-18 at 7.43.04 AM.png (154.1 KB)

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
The version of VirtualBox on FreeNAS is now pretty old, and I doubt that it will be updated now that Bhyve support is included in 9.10. Yes, Bhyve will run Windows, you just need to perform a headless install using an unattend XML script. See: https://lists.freebsd.org/pipermail/freebsd-virtualization/2015-October/003832.html

The only reason I can think of for the aborts is lack of RAM. ZFS by default will take 80% of available RAM for ARC.
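As a rough worked example of that RAM argument, the sketch below does the arithmetic for the original poster's box. The 32 GB total and the 2 GB / 6 GB VM allocations are figures reported elsewhere in this thread; the 80% ARC share is simply taken from the sentence above and is treated as an assumption, not a verified default. It also ignores what the OS and the jails themselves need, and the fact that ARC can in principle shrink.

```python
def ram_left_for_vms(total_gb, arc_percent=80):
    """RAM (in GB) left over if ARC grows to arc_percent of total RAM.

    arc_percent=80 is the figure quoted in this thread, used here as an assumption.
    """
    return total_gb * (100 - arc_percent) / 100

total_gb = 32  # RAM in the original poster's system
# VM memory allocations as reported by the original poster in this thread.
vm_allocations_gb = {"Windows Server 2012 R2": 2, "Oracle Linux 7.2": 6}

left_gb = ram_left_for_vms(total_gb)
vm_total_gb = sum(vm_allocations_gb.values())
shortfall_gb = vm_total_gb - left_gb

print(f"left after ARC: {left_gb:.1f} GB, "
      f"VMs want: {vm_total_gb} GB, shortfall: {shortfall_gb:.1f} GB")
```

Under those assumptions about 6.4 GB is left outside ARC while the two VMs want 8 GB, so the host would be roughly 1.6 GB short whenever ARC does not give memory back quickly enough.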

If you're running 9.10, you're better off migrating to Bhyve than continuing to use VirtualBox. I'm not 100% sure, but VirtualBox support may be dropped when FN10 is released.

Details on how to use Bhyve in FreeNAS can be found in the documentation: http://doc.freenas.org/9.10/freenas_jails.html#using-iohyve
 

iBlueDragon

Cadet
Joined
Aug 13, 2015
Messages
4
RAM really seems to be an issue. After installing all the updates my WHS became pretty unstable again. Just got aborted while not doing anything (uptime < 12h). No VM for Linux created at that time, so it seems it's not related to the OSs used.

But reducing the allocated RAM from 8GB to 4GB seems to have helped. Machine already running for over 24 hours without any issues.
 

Chris Deluca

Cadet
Joined
Apr 9, 2016
Messages
8
Very interesting. I spent a bunch of time just getting all of the updates to both clients in and then let them bake. Windows crashed after a day or so (it was allocated at 2gb RAM). My RH Linux client (at 6gb RAM) has run for about 3 days (no activity, just on and whatever it does when it sits there). I'll play with RAM allocation and see what that does.
 

Chris Deluca

Cadet
Joined
Apr 9, 2016
Messages
8
Actually, that value was not checked for the main disk - I didn't notice it and didn't alter it from default. It is checked for the ISO mounted media.
 

iBlueDragon

Cadet
Joined
Aug 13, 2015
Messages
4
I didn't have the I/O caching enabled for the WHS, either. It seems to be disabled by default if you choose a server version of Windows. However, it is enabled for the desktop version.

But at the moment I have 3 machines (2x Win, 1x Ubuntu) running fine with altogether 9 GB of RAM, while before 1 machine with 8 GB always got aborted. So reducing the allocated memory from 8 to 4 GB definitely worked for me.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
that value was not checked for the main disk
For me and others, checking it is the difference between stable and aborted VMs, regardless of the guest OS.
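Assuming "that value" is the "Use Host I/O Cache" setting on the VM's storage controller that the surrounding posts discuss, the same switch can also be flipped from the command line with `VBoxManage storagectl ... --hostiocache on`. The Python sketch below only builds and prints that command; the VM name "WHS2011" and the controller name "SATA" are placeholders, so substitute the names shown in your own VM's storage settings, and change the setting while the VM is powered off.

```python
import shlex

def host_io_cache_command(vm_name, controller_name, enable=True):
    """Build the VBoxManage arguments that toggle host I/O cache on one storage controller."""
    return [
        "VBoxManage", "storagectl", vm_name,
        "--name", controller_name,
        "--hostiocache", "on" if enable else "off",
    ]

# "WHS2011" and "SATA" are hypothetical names used only for illustration.
cmd = host_io_cache_command("WHS2011", "SATA")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell inside the VirtualBox jail, or the list can be handed to `subprocess.run(cmd, check=True)` there.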
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
If I had to guess what is going on, I'd guess that your system doesn't have enough RAM for the workload and ZFS is getting too aggressive with its memory demands. If you don't have enough RAM eventually ZFS asks for free RAM that doesn't exist, and processes start being killed. The processes using the most RAM are often the first to be killed, hence your VMs going down.

I wouldn't be using an L2ARC in a system with only 32GB of RAM, doubly so if you are running VMs, because of RAM limitations.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Is this mentioned somewhere?
Only in just about every thread raising questions about VM stability under VirtualBox. I don't know why it isn't on by default.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
I think cyberjock's theory is correct, except that this is the bug in FreeBSD that prevents ARC from shrinking appropriately. When you enable the host I/O cache, the FreeBSD kernel keeps a larger file cache around that it can raid for memory instead of the process aborting. The issue is more commonly observed as poor system performance (swapping) and unexpected swap usage.
 

crimsondr

Dabbler
Joined
Feb 6, 2015
Messages
42
Only in just about every thread raising questions about VM stability under VirtualBox. I don't know why it isn't on by default.
Well, my first search on Google led me here and fixed the problem.

I think cyberjock's theory is correct, except that this is the bug in FreeBSD that prevents ARC from shrinking appropriately. When you enable the host I/O cache, the FreeBSD kernel keeps a larger file cache around that it can raid for memory instead of the process aborting. The issue is more commonly observed as poor system performance (swapping) and unexpected swap usage.
I have 32gb RAM and 9x4gb in raidz2. I tried to set up Windows 10 in a VM with 2gb RAM and it kept freezing/aborting. This is the only VM on the machine. The only other thing on the machine is another jail where I installed transmission, plex, and flexget. So I don't think I am running out of memory, but I could be wrong.
 