Bhyve jail autorestart if jail goes down?

KevDog

Patron
Joined
Nov 26, 2016
Messages
462
Yesterday I was working within my bhyve Ubuntu VM, but came back this morning to see the VM had been stopped. I had to enter the FreeNAS GUI to restart the VM. I don't really understand how the VM was stopped overnight. Is there a way for FreeNAS to do this monitoring and automatically attempt to restart a VM if stopped? I think autostart only refers to starting the VM on boot.
 

bkvamme

Dabbler
Joined
Jun 26, 2014
Messages
16
Hi,

Here is a simple checker script I wrote for testing if jails are up. Might be able to adapt it for bhyve VMs, given that there is an accessible port that can be checked. The only difference would be the "restart_command".

edit: I added the commands for restarting a bhyve VM on FreeNAS.

Code:
#!/usr/bin/env bash
killswitch=$(<./killswitch) #Simple killswitch. To prevent the script from running, the killswitch file should be set to "1"

log_file=./uptimecheck.log # Set location to your choice.

if [ $killswitch -eq 0 ]; then
    service_shortname="plex"
    vmname="plex"
    jailname="plex"
    host=JAIL_IP
    ports=32400
    ssh_keyfile="./FREENAS_IP_id_rsa" #You need to use a ssh keypair. This should be the location of a private key associated with the root user account
    ssh_username=root
    ssh_host=FREENAS_IP
    ssh_port=22
    restart_command="/usr/local/bin/iocage restart $jailname" # For restarting jails
    #restart_command="/usr/local/bin/iohyve stop $vmname && /usr/local/bin/iohyve start $vmname" # Uncomment this line and comment the line above if this is a VM.
    remote=0
    for port in $ports
    do
        if nc -z -w2 $host $port &> /dev/null; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is OPEN." >> $log_file
        else
            echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is CLOSED. Waiting for 1 minute, and check again."  >> $log_file
            sleep 60
            if nc -z -w2 $host $port &> /dev/null; then
                echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is OPEN. Continuing monitoring." >> $log_file
            else
                echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is still CLOSED. Restarting jail." >> $log_file
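                if [ $remote -eq 0 ]; then
                    # Run through sh -c so that "&&" in the VM restart command is interpreted by a shell.
                    sh -c "$restart_command" >> $log_file 2>&1
                else
                    # Redirect the output of ssh itself; wrapping ssh in $( ) would execute its output as a command.
                    ssh -i $ssh_keyfile -p $ssh_port $ssh_username@$ssh_host "$restart_command" >> $log_file 2>&1
                fi
                echo "$(date '+%Y-%m-%d %H:%M:%S') Restart command issued, waiting 60 seconds and checking again." >> $log_file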
                sleep 60
                if nc -z -w2 $host $port &> /dev/null; then
                    echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is OPEN. All OK." >> $log_file
                else
                    echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is still closed. Trying to restart jail again." >> $log_file
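                    if [ $remote -eq 0 ]; then
                        # Run through sh -c so that "&&" in the VM restart command is interpreted by a shell.
                        sh -c "$restart_command" >> $log_file 2>&1
                    else
                        # Redirect the output of ssh itself; wrapping ssh in $( ) would execute its output as a command.
                        ssh -i $ssh_keyfile -p $ssh_port $ssh_username@$ssh_host "$restart_command" >> $log_file 2>&1
                    fi
                    echo "$(date '+%Y-%m-%d %H:%M:%S') Restart command issued, waiting 60 seconds and checking again." >> $log_file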
                    sleep 60
                    if nc -z -w2 $host $port &> /dev/null; then
                        echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is OPEN. All OK." >> $log_file
                    else
                        echo "$(date '+%Y-%m-%d %H:%M:%S') Checked $service_shortname at $host:$port. Port is still closed. Giving up." >> $log_file
                    fi
                fi
            fi
        fi
    done
else
    echo "$(date '+%Y-%m-%d %H:%M:%S') Killswitch active. Not doing anything" >> $log_file
fi
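
The script expects a file named killswitch in its working directory, containing 0 to allow checks and anything else to suspend them. A minimal sketch of helpers for toggling it; the function names and the `KILLSWITCH_FILE` variable are hypothetical and not part of the script above:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the killswitch file read by the checker script.
KILLSWITCH_FILE=${KILLSWITCH_FILE:-./killswitch}

# Suspend checks: the checker will only log "Killswitch active".
pause_checks() { echo 1 > "$KILLSWITCH_FILE"; }

# Resume checks: the checker runs its port tests again.
resume_checks() { echo 0 > "$KILLSWITCH_FILE"; }

# Prints "running" if the file contains exactly 0, otherwise "paused".
# A missing file counts as paused, since the checker's -eq test fails
# on an empty value and falls through to its else branch.
checks_state() {
    if [ -f "$KILLSWITCH_FILE" ] && [ "$(cat "$KILLSWITCH_FILE")" = "0" ]; then
        echo running
    else
        echo paused
    fi
}
```

Running `resume_checks` once after copying the script into place creates the file, so the first run does not report the killswitch as active.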
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Maybe I missed something, but jails don't run in bhyve.

In my experience, if a jail dies, the whole system is unstable and you're unlikely to be able to restart. I'd be happy to hear if the script proposed above is a success.

In re-reading your original post, I see what you actually meant was the other way... you are wanting to check and restart VMs, not jails at all.

You would need a script similar to the one above, and you could run into the same challenge I mentioned with jails (although there are many more reasons a VM can crash, so who knows).
 

bkvamme

Dabbler
Joined
Jun 26, 2014
Messages
16
The intention of the script above is more to restart the VM/jail if the service goes down. The script checks for a TCP response on the specified port. If there is no response, it waits a minute and checks again; if the port is still closed, it runs a command (locally or over SSH) to restart the VM/jail.
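
The check, wait, restart and re-check sequence can also be written as a loop instead of nested if blocks. A minimal sketch, assuming bash and nc are available; the names `port_open`, `watch_service`, `WAIT_SECONDS` and `MAX_RESTARTS` are hypothetical and do not appear in the script above:

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative only.
WAIT_SECONDS=${WAIT_SECONDS:-60}   # pause between checks
MAX_RESTARTS=${MAX_RESTARTS:-2}    # restart attempts before giving up

# Succeeds if a TCP connection to HOST:PORT can be opened within 2 seconds.
port_open() {
    nc -z -w2 "$1" "$2" > /dev/null 2>&1
}

# watch_service HOST PORT RESTART_COMMAND
# Returns 0 once the port answers, 1 if it is still closed after all restarts.
watch_service() {
    local host=$1 port=$2 restart_command=$3 attempt=0
    port_open "$host" "$port" && return 0
    # Allow for a short outage before restarting anything.
    sleep "$WAIT_SECONDS"
    port_open "$host" "$port" && return 0
    while [ "$attempt" -lt "$MAX_RESTARTS" ]; do
        attempt=$((attempt + 1))
        # sh -c lets a command string containing "&&" work as intended.
        sh -c "$restart_command"
        sleep "$WAIT_SECONDS"
        port_open "$host" "$port" && return 0
    done
    return 1
}
```

A call could look like `watch_service JAIL_IP 32400 "/usr/local/bin/iocage restart plex" || echo "giving up"`, with logging added around it as needed.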

I posted the script as it is pretty generic. For a VM hosted with iohyve, the restart command would be:

Code:
iohyve stop $vmname && iohyve start $vmname


I've added the commands for bhyve to the script now.

I am using this for both FreeNAS and Proxmox VMs and jails/containers. I am away for long periods of time with limited connectivity, so I have the script above running as an hourly cron job, checking the different servers I've set up. This is of course a band-aid and won't fix anything beyond sporadic failures, but it does the trick for me.
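
One thing to watch with cron: the script reads ./killswitch and writes ./uptimecheck.log relative to the current directory, and cron does not start jobs in the script's directory. A minimal sketch of a wrapper function, assuming the checker is saved as uptimecheck.sh (a hypothetical file name) next to the killswitch file; the path in the crontab comment is a placeholder as well:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, e.g. saved as run_uptimecheck.sh next to uptimecheck.sh.
# Example crontab entry for an hourly run (path is a placeholder):
#   0 * * * * /path/to/run_uptimecheck.sh

# run_checker DIR [SCRIPT]
# Changes into DIR inside a subshell so the relative ./killswitch and
# ./uptimecheck.log paths resolve, then runs SCRIPT (default ./uptimecheck.sh).
# The caller's working directory is left unchanged.
run_checker() {
    local dir=$1 script=${2:-./uptimecheck.sh}
    (
        cd "$dir" || exit 1
        bash "$script"
    )
}
```

The wrapper file itself would end with `run_checker "$(dirname "$0")"`, so that it always runs the checker from the directory it lives in.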
 

KevDog

Patron
Joined
Nov 26, 2016
Messages
462
Yeah, my jails sometimes don't start at boot (although specified) and neither do some of my VMs. It's really strange -- sometimes it works and other times it doesn't. Also, if the system has been up a long time -- like 10-12 days -- sometimes the jails or the VMs stop working and I need to restart one or the other. The inconsistency is really problematic. I'll try the script and modify it for bhyve and for jails. You've given me a good start and I likely just need to modify the restart command. OTOH, it would be great if this "watchdog" type of script were actually built into FreeNAS itself.
 