Stupid Long Reboot Times!

Status
Not open for further replies.

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
I have my server set up so that my boot disk is a single high-reliability 76GB 15k 6Gb/s SAS disk. I have had no issues with it; it never SEEMS to bog down, and for over 3 years now it has been my OS disk without even a hiccup.

HOWEVER, my reboot time, from the moment I click "reboot" to the moment the GUI comes back up, is over 30 minutes. It used to be around 10, but since updating to the newest version, FreeNAS 11, it is taking FOREVER.

I also noticed that even with 3 boot volumes, I am maxing out the space on the disk. In the past, I have tried to start fresh by re-installing FreeNAS and then uploading my backed-up configs. This seemed to work perfectly, as far as everything being the way it was BEFORE the wipe, but then a few anomalous bugs come back, like slow performance and a warning about an old IP address, "WARNING: Sept. 28, 2017, 8:29 a.m. - The WebGUI Address could not bind to 10.0.30.1; using wildcard", while my current IP range is 192.168.*.*.
upload_2017-9-28_9-1-31.png


So, I actually have a lot of questions.
1. How are people using 32GB USB thumb drives, when my 2 boot volumes alone use 39GiB????
upload_2017-9-28_8-59-31.png

2. Would there really be ANY appreciable difference between a USB and a SAS disk? I feel like most SAS disks are faster than USB anyway.
3. Am I doing something wrong?
4. Should I start fresh without importing my configs? If so, how can I redo all my configs, without losing data?

I am OK with manually doing the work, but the data is the concern.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
So, I actually have a lot of questions.
1. How are people using 32GB USB thumb drives, when my 2 boot volumes alone use 39GiB????
Something is wrong with your configuration. My boot drive is 32GB and it has room to spare. It was less than half used last I looked.
2. Would there really be ANY appreciable difference between a USB and a SAS disk? I feel like most SAS disks are faster than USB anyway.
There is no advantage to a SAS disk at all. I use a mirrored pair of 2.5" laptop hard drives and it works great. Reboot time (estimated) is less than 10 min. It doesn't bother me to sit and wait for it.
3. Am I doing something wrong?
I am not saying that you are doing something wrong, but there is something wrong.
4. Should I start fresh without importing my configs? If so, how can I redo all my configs, without losing data?
Your config has nothing to do with your data. The data is safely stored on the pool and none of it is on the boot drive. When you get to the point of creating a pool, import the existing pool instead of creating a new one. As long as you don't wipe your data drives, you should be fine.
I would wipe that boot drive and do some testing on it because a defect may have developed.
 

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
Something is wrong with your configuration. My boot drive is 32GB and it has room to spare. It was less than half used last I looked.

Found the cause of at least #1 because of this thread.

Still have 20GiB of used space though. But much better!
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
I have a 16GB usb drive as my boot and I am using 1.4GB for 3 different boot environments.
  1. Initial install
  2. default
  3. 11.0-U3 (on Reboot, Now)
so yeah, something is wrong in your setup which is taking that much space for the boot drives. Check your folders again, maybe you have some other artifacts than the 19GB folder you found. Some cron job or rsync task putting data in the boot pool rather than the data pool.
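To follow up on checking the folders, here is a rough Python sketch that totals file sizes per directory and lists the biggest ones. It stays on the filesystem of the starting directory, so pointing it at / should not wander into the data pool under /mnt. Assumptions: a Python 3 interpreter is available on the box, the function names are my own, and it only counts live files, so space held by snapshots or older boot environments will not show up here.

```python
import os
import stat
import sys


def own_sizes(root):
    """Map each directory under root to the bytes of regular files directly in it.

    Directories on a different filesystem than root are skipped.
    """
    root = os.path.abspath(root)
    root_dev = os.lstat(root).st_dev
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories that live on another filesystem (e.g. a data pool).
        keep = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    keep.append(d)
            except OSError:
                pass  # vanished or unreadable; ignore it
        dirnames[:] = keep

        total = 0
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
        sizes[dirpath] = total
    return sizes


def tree_sizes(root):
    """Like own_sizes, but each directory also includes everything below it."""
    root = os.path.abspath(root)
    own = own_sizes(root)
    totals = dict.fromkeys(own, 0)
    for path, size in own.items():
        p = path
        while True:  # add this directory's bytes to itself and every ancestor
            totals[p] += size
            if p == root:
                break
            p = os.path.dirname(p)
    return totals


def largest(root, n=10):
    """Return the n biggest (directory, bytes) pairs, largest first."""
    ranked = sorted(tree_sizes(root).items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__" and len(sys.argv) > 1:
    for path, size in largest(sys.argv[1], 15):
        print(f"{size / 2**20:10.1f} MiB  {path}")
```

Run it as root with the directory to inspect as the only argument. A stray rsync or cron target should stand out near the top of the list.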
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
but then I see a few anomalous bugs that come back, like an old IP address "WARNING: Sept. 28, 2017, 8:29 a.m. - The WebGUI Address could not bind to 10.0.30.1; using wildcard"
I see this on my machine too and would love to have FreeNAS forget about the old IP. I don't know how though.

I have 2 NICs on my board and initially I had tried 1 port, but then switched to using the other. But I keep getting that warning about not being able to bind to the old IP. Why won't you let it go, FreeNAS!!
 

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
I see this on my machine too and would love to have FreeNAS forget about the old IP. I don't know how though.

I have 2 NICs on my board and initially I had tried 1 port, but then switched to using the other. But I keep getting that warning about not being able to bind to the old IP. Why won't you let it go, FreeNAS!!

It is very Irksome! lol

Whatever was in 11.0-U3 was taking up all the space. Removed it, and down to 791.2 MiB

upload_2017-9-28_14-2-23.png


Soooo, now that my disk size problem has been solved, WTH is up with the slow boot times? I am keeping FreeNAS pretty busy: 3 VMs, 8 jails, 11 disks in RAID-Z3, and 4 pools. But I can't imagine that all of those things would bog it down that badly. I mean, the R410 server undoubtedly can handle all of that without breaking a sweat, and I almost never see the CPU go above 20%.
 

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
Why does it matter? How often do you reboot your server?

In the past, only on updates. The last 2 weeks, it seems like once or twice a day.

During my resilver I was getting very inconsistent write times to the new disks, and reboots would solve the problem and speed it up.

Also, the GUI keeps hanging up, and stopping and restarting the django and nginx services from SSH terminal does not seem to fix the problem. So I reboot to fix that.

Then I have a VM that I shut down and start up from time to time, but it almost always hangs on shutdown. I can't use iohyve to shut it down, because it is in the new GUI VM manager, and I haven't yet tried to figure out or learn how to force kill a VM created and started from the GUI. When I run
Code:
ps -a
I never see it running either, so I just reboot to fix that as well.
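
One possible reason the VM never shows up: on FreeBSD, `ps -a` adds other users' processes but still skips anything without a controlling terminal, and `-x` is the flag that includes those. A VM started from the GUI would presumably have no terminal attached. Below is a minimal Python sketch of the wider search, assuming the GUI VMs run as bhyve processes. The helper names are made up for illustration, and this has not been tried on this server.

```python
import subprocess


def bhyve_lines(ps_output):
    """Return the lines of `ps` output (minus the header) that mention bhyve."""
    lines = ps_output.splitlines()
    return [ln.strip() for ln in lines[1:] if "bhyve" in ln]


def running_bhyve():
    # -a: all users' processes
    # -x: also processes without a controlling terminal
    # -o pid,command: print only the PID and command columns
    result = subprocess.run(
        ["ps", "-axo", "pid,command"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return bhyve_lines(result.stdout)


if __name__ == "__main__":
    for line in running_bhyve():
        print(line)
```

If a hung VM does turn up in that list, its PID is at least something to work with before falling back to a full reboot.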

So, that's ultimately why. I find myself rebooting so much lately that it is annoying.
 

styno

Patron
Joined
Apr 11, 2016
Messages
466
Then I have a VM that I shut down and start up from time to time, but it almost always hangs on shutdown.
A reboot will try to properly shut down everything. My guess is that this is causing your delays.
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
Whatever was in 11.0-U3 was taking up all the space. Removed it, and down to 791.2 MiB
But now you have no other boot environment to go to in case U4 craps out. Make backups of the config.

It might be that the upgrade from U3 to U4 wrote some files which were not cleared, or the process somehow got interrupted, which resulted in the boot pool ballooning to 39GB. I have not yet upgraded. I have read about many people having issues with the U4 upgrade, so I am going to wait a while before upgrading.
 

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
A reboot will try to properly shut down everything. My guess is that this is causing your delays.

It is possible, however the actual shutdown is pretty quick. If I watch from the server monitor, the actual boot takes a long time, and there are always what appear to be errors. I will try to get some pictures and post them, maybe that is the issue. I will also time each phase. Good ideas, lol.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Soooo, now that my disk size problem has been solved, WTH is up with the slow boot times?
I just got to where I can look at my system and, even though this part is solved, I thought it might be useful to share. I have eleven boot environments on my boot pool, going all the way back to version 9.10.1-U1, and they use only 5.9 GiB (15%) of my boot pool, which is listed as 37.2 GiB in size. As you can tell from the number of versions I have, I have not concerned myself with cleaning the old ones off, because they take so little space that I just have not worried about it. The oldest one is dated September 27th, 2016.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Also, the GUI keeps hanging up, and stopping and restarting the django and nginx services from SSH terminal does not seem to fix the problem. So I reboot to fix that.
I am keeping FreeNAS pretty busy: 3 VMs, 8 jails, 11 disks in RAID-Z3, and 4 pools. But I can't imagine that all of those things would bog it down that badly.
I am not saying that you are overburdening the server, but all those things take time to start up. Each of the jails and the VMs has to start in addition to the host OS. That (to me) explains the long boot time, but I could be all wrong on that.
As for the GUI hanging, that is perplexing. The only time my GUI has frozen was because of a hardware fault that ultimately crashed the server.
You mentioned resilvering, what was going on with that? Multiple failed drives? What kind of drive controller are you using?
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
HOWEVER, my reboot time, from the moment I click "reboot" to the moment the GUI comes back up, is over 30 minutes. It used to be around 10, but since updating to the newest version, FreeNAS 11, it is taking FOREVER.

Just for comparison: Some timestamps from a reboot of my C2750 Avoton (2.4 GHz 8 Core Silvermont Server Atom) booting off of 2 SanDisk X300s SSDs (mirrored). FreeNAS-11.0-U4, no jails/plugins. Perhaps this helps to sort out what exactly takes that long on your system.

From /var/log/messages:
Code:
Sep 28 23:02:03 blunzn shutdown: reboot by root:
[...]
Sep 28 23:03:07 blunzn syslog-ng[1932]: syslog-ng shutting down; version='3.7.3'
Sep 28 23:05:55 blunzn syslog-ng[1932]: syslog-ng starting up; version='3.7.3'
Sep 28 23:05:55 blunzn Copyright (c) 1992-2017 The FreeBSD Project.
[...]
Sep 28 23:06:00 blunzn ntpd[2411]: ntpd 4.2.8p10-a (1): Starting
[...]
Sep 28 23:06:12 blunzn root: /etc/rc: WARNING: failed precmd routine for minio

The ntpd message comes after hardware device detection (USB, ada?, GEOM_ELI, ...) and the minio warning is the very last message in /var/log/messages related to startup.

From /var/log/debug.log:
Code:
Sep 28 23:06:53 blunzn uwsgi: [middleware.notifier:198] Popen()ing: route -nv show default|grep 'interface:'|awk '{ print $2 }'
Sep 28 23:06:53 blunzn uwsgi: [middleware.notifier:198] Popen()ing: route -nv show -inet6 default|grep 'interface:'|awk '{ print $2 }'
Sep 28 23:06:53 blunzn uwsgi: [freeadmin.navtree:114] Unable to import 'freenasUI.documentation' 'nav': No module named 'freenasUI.documentation.nav'
Sep 28 23:06:53 blunzn uwsgi: [freeadmin.navtree:402] App freenasUI.documentation has no nav.py module, skipping

Last (at least for me) identifiable traces in that file from restarting FreeNAS middleware/GUI.
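
The timestamps quoted above can be turned into per-phase durations with a few lines of Python, which may make it easier to compare against a slow system. This is only a sketch: the phase labels are my own shorthand, and the year is an assumption because syslog lines do not carry one.

```python
from datetime import datetime

# Timestamps copied from the /var/log/messages and /var/log/debug.log
# excerpts above; the labels are informal descriptions, not log fields.
EVENTS = [
    ("reboot requested", "Sep 28 23:02:03"),
    ("syslog-ng shutting down", "Sep 28 23:03:07"),
    ("syslog-ng starting up", "Sep 28 23:05:55"),
    ("ntpd starting", "Sep 28 23:06:00"),
    ("last rc message (minio warning)", "Sep 28 23:06:12"),
    ("last middleware/GUI trace", "Sep 28 23:06:53"),
]


def parse(stamp, year=2017):
    """Parse a syslog-style timestamp; the year is assumed, not read from the log.

    %b expects English month abbreviations under the default C/English locale.
    """
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def phase_durations(events):
    """Return (from_label, to_label, seconds) for each pair of consecutive events."""
    times = [(label, parse(stamp)) for label, stamp in events]
    return [
        (a_label, b_label, int((b_time - a_time).total_seconds()))
        for (a_label, a_time), (b_label, b_time) in zip(times, times[1:])
    ]


if __name__ == "__main__":
    durations = phase_durations(EVENTS)
    for start, end, seconds in durations:
        print(f"{seconds:4d} s  {start} -> {end}")
    print(f"{sum(s for _, _, s in durations):4d} s  total")
```

For the reboot above this works out to 64 s of shutdown, 168 s with nothing logged (BIOS, controller and kernel start), and roughly a minute until the last GUI trace, 290 s in total. Swapping in timestamps from a slow system should show which phase eats the time.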

If the GUI does indeed hang, this would be evidence of a serious problem. If the problem is sluggish behavior (maybe due to swap usage), there is a chance that this could be easily worked around.
 

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
As for the GUI hanging, that is perplexing. The only time my GUI has frozen was because of a hardware fault that ultimately crashed the server.
You mentioned resilvering, what was going on with that? Multiple failed drives? What kind of drive controller are you using?

The GUI is a bit perplexing to me as well. I am not an entire noob to FreeNAS, and really not a noob to Linux, so I really need to dig into that one a bit more myself. It does seem to mostly occur when I remove a disk from my NAS. The controller is an LSI Logic SAS3801E L3-01123-04E SAS PCI-E.

It has served me well, and I bought it used from fleabay for $15 in May of 2015. I was hesitant to use it, because it had soot ALL OVER IT! Looked like it had been in a datacenter fire or something. So I took some electronics solvent and cleaned it up, let it dry, and used it. Here we are, 2 and a half years later still kickin! But, it could be the controller, especially since I keep getting very inconsistent resilver rates, unless I reboot, and when I remove a disk it causes FreeNAS to go haywire.

When I remove a disk, if I turn on the monitor, I see errors scrolling across the screen so fast that it looks like white noise, and usually that is when the GUI hangs up and requires a reboot. Now that you have mentioned it, maybe I should buy another and swap them out, just to see, since they are only about $9 now, with free shipping! Unless you have a better suggestion for a SFF-8088 external SAS card?

As for resilvering, it was for an upgrade from 1TB to 2TB disks. I did have two bad disks in that lot of 11, so I am in the process of swapping them out. I opted for SAS disks, and even those were getting about 35MiB/s until I rebooted. After the reboot I am getting 250MiB/s. So there is definitely something going on. I don't want to make assumptions just yet, because one disk is throwing errors at a staggering rate.
 

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
UPDATE: So, when I use mptutil I do not see my LSI card. It only shows the backplane for the server. When I run sas2flash -listinstall I get "No LSI SAS adapters found"...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
When I remove a disk, if I turn on the monitor, I see errors scrolling across the screen so fast that it looks like white noise, and usually that is when the GUI hangs up and requires a reboot.
Do you just pull the disk, or do you offline it first? If you just pull it, swap on the disk could be in use, which would cause all manner of problems. We're supposed to see mirrored swap in 11.1, which would avoid this problem.
 

siconic

Explorer
Joined
Oct 12, 2016
Messages
95
Do you just pull the disk, or do you offline it first? If you just pull it, swap on the disk could be in use, which would cause all manner of problems. We're supposed to see mirrored swap in 11.1, which would avoid this problem.

Although not best practice, I usually just pull it. That's good to know, but I thought swap was only being used by the OS disk???
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I thought swap was only being used by the OS disk???
Not sure why you'd think that; there's no swap on the boot device by default. @Stux has written up some instructions for putting it there, which you can do if you like, but it isn't that way in a stock installation.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
No, there's no swap on the boot drive.

When I run sas2flash -listinstall I get "No LSI SAS adapters found"...
Well, since it's an SAS1 card, sas2flash is expected to fail. I guess the equivalent you'd need is sasflash, but I'm not sure that exists.
 