Network recovery

Status
Not open for further replies.

mrreload

Dabbler
Joined
Jun 20, 2014
Messages
17
FreeNAS-9.2.1.9-RELEASE-x64
Dell T5400
onboard NIC (broadcom?)
24GB ECC RAM
CyberPower 900VA UPS

Issue:
Today I had a brief power outage (off, then right back on). My entire network came right back online. The FreeNAS box (the only device on a UPS) stayed on as expected, BUT it was not reachable over the network. I was forced to reboot it, and then it was fine.
I want to avoid rebooting. How can I get FreeNAS to react better to a power and/or network outage? Would configuring it with a static IP help? It is currently configured with a reserved DHCP IP from my pfSense firewall.

As a side note:
Does anyone know if 9.3 is any better with its support of IPv6?

Thanks for continuing to make this excellent product!!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So ... your FreeNAS box saw the network link drop, then the network link came back, it probably tried to re-DHCP and could not get a DHCP lease, so it probably ended up over in 169.254.* or something?

Configure a static IP, or put the switch and DHCP server on a UPS too.
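
A quick way to check that guess is to look at the address the interface currently holds. The following is only a sketch: `first_inet` and `classify_ipv4` are hypothetical helper names, and the parsing assumes FreeBSD-style `ifconfig` output in which the IPv4 line starts with `inet`.

```shell
#!/bin/sh
# Read ifconfig output on stdin and print the first IPv4 address found.
first_inet() {
    awk '$1 == "inet" { print $2; exit }'
}

# Classify an IPv4 address string so a lost lease is easy to spot.
classify_ipv4() {
    case "$1" in
        ""|0.0.0.0) echo "none" ;;        # no usable address at all
        169.254.*)  echo "link-local" ;;  # the 169.254.* range mentioned above
        *)          echo "assigned" ;;    # anything else
    esac
}

# Demo with a made-up ifconfig line, so nothing on the real system is touched.
sample='	inet 169.254.12.7 netmask 0xffff0000 broadcast 169.254.255.255'
classify_ipv4 "$(printf '%s\n' "$sample" | first_inet)"
```

On the NAS itself the equivalent would be `classify_ipv4 "$(ifconfig <interface_name> | first_inet)"`, with the real interface name substituted.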
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Just reset the interface next time, eg:
Code:
ifconfig <interface_name> down
ifconfig <interface_name> up


Edit: What J said is best, static is the way to go.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Just reset the interface next time, eg:
Code:
ifconfig <interface_name> down
ifconfig <interface_name> up

That probably won't fix it if it is dhcpd that caused the trouble. You'd effectively need to make it re-DHCP.
 

mrreload

Dabbler
Joined
Jun 20, 2014
Messages
17
so dhclient wouldn't work? Assigning static IPs just sucks. I prefer to manage my IPs from a central location, hence using DHCP server to assign static addresses to certain clients.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
so dhclient wouldn't work?

dhclient is already running. You can maybe kill and restart it, but I can't say whether or not the process is supervised, and if it is, then you're opening up more trouble.
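
For illustration only, here is what a manual "kill and restart" could look like. This is a sketch that assumes the stock FreeBSD rc script for dhclient, which takes the interface name as an argument; whether FreeNAS manages dhclient that way is exactly the open question above. The `redhcp` helper and the `bge0` interface name are hypothetical, and by default the function only prints the command instead of running it.

```shell
#!/bin/sh
# Print (default) or run the command that restarts the DHCP client on one
# interface. Set DRY_RUN=0 to actually run it; that needs root and assumes
# the interface is configured for DHCP in the rc configuration.
redhcp() {
    iface=$1
    if [ -z "$iface" ]; then
        echo "usage: redhcp <interface_name>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "service dhclient restart $iface"
    else
        service dhclient restart "$iface"
    fi
}

# Dry run: shows what would be executed for the assumed interface bge0.
redhcp bge0
```

If the process turns out to be supervised by something else, restarting it by hand like this may fight with that supervisor, so treat it as a one-off recovery step rather than a fix.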

Assigning static IPs just sucks. I prefer to manage my IPs from a central location, hence using DHCP server to assign static addresses to certain clients.

Right. So. You have some options.

One is to fix your busted DHCP environment. Because you're expecting the DHCP service to work properly when things fail. And you've encouraged gratuitous failure by not protecting the switch and DHCP server. So you should put the switch and DHCP server on a UPS too, and if this were a professional environment, you'd add a backup DHCP server to the mix. This still isn't foolproof but it's a lot closer. Most professional environments would analyze the risk of the remaining edge cases and would make a call whether to simply own the possibility of failure for those edge cases, or decide to ditch and do things the right way. It *is* possible to have a high availability DHCP environment, but what you have ... isn't.

Another is to fix your busted concept of network address allocation. DHCP is best used to serve transient clients. It is possible to use it to allocate server IP or infrastructure IP space, but you're opening yourself up to a whole range of bootstrap failures when you choose to let the DHCP server be the sole authority there, as you have chosen. It is better to use DHCP as a bootstrap aid to get a server, networking, or infrastructure device planted properly on the network and then wire it down at that address through static configuration of the device. This allows your network to be resilient in the face of various failures. Nobody ever said infrastructure management was supposed to be easy, and it isn't.

I'm not really interested in debating the correctness of this beyond what I've said here. Your strategy is demonstrably wrong because of what happened to you; you get to own that failure and decide how to remedy it (or not). That's the harsh reality. Sorry.
 

mrreload

Dabbler
Joined
Jun 20, 2014
Messages
17
As this is not an enterprise environment, just my home, I am not really concerned with debating correctness either.
You are absolutely correct that the DHCP server and multiple switches should be protected by UPS also but I have to decide how best to spend my hard earned money. I need a whole new FreeNAS box first and foremost.
The enterprise networks I have set up and managed are/were all widely protected with UPS and failovers. I just don't have that luxury here.
Just looking to see if there was any way to get FreeNAS to behave differently when it sees a link drop. Just dropping the IP completely seems like bad form to me, and most other OSes I have dealt with do not drop like this unless a reboot occurs before the DHCP server and switches come back online. I actually just tested this on Ubuntu, unplugged the switch, and it did not drop the IP. What gives?
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
It doesn't seem to me this is expected behavior. I can certainly drop power to my router without affecting my FN box in the slightest. It has a dhcp reservation with a 24h ttl. Can't say I've tested the edge cases as frankly I couldn't care less at home. That's on a bone stock config. Midrange linksys with nothing interesting set up.

You can whack power on anything anytime and everything just pops back up. Pretty common to reset the router at my house if wifi is acting glitchy. No fear or mercy is shown. Can't comment on pfSense behavior; mine only routes VPN traffic and doesn't serve DHCP.

I'd be more inclined to think you had an address conflict or something when the router cycled without touching the FN box. I'd ensure my static dhcp addresses are well outside the regular scope so race conditions or extra devices can't screw anything up.
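
A rough way to check that a reservation sits outside the dynamic pool is to compare the addresses numerically. This is a sketch only: the helper names are hypothetical and the addresses are invented examples, not anyone's real configuration.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into one integer so it can be compared.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) when address $1 lies inside the pool [$2, $3].
ip_in_pool() {
    addr=$(ip_to_int "$1")
    pool_lo=$(ip_to_int "$2")
    pool_hi=$(ip_to_int "$3")
    [ "$addr" -ge "$pool_lo" ] && [ "$addr" -le "$pool_hi" ]
}

# Example: a reserved address checked against an assumed dynamic pool.
if ip_in_pool 192.168.1.50 192.168.1.100 192.168.1.199; then
    echo "reservation overlaps the dynamic pool"
else
    echo "reservation is outside the dynamic pool"
fi
```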

Unfortunately the only topology mentioned is pfSense and switch... which could translate to anything. It shouldn't take much testing to get the behavior you want. FN does not drop addresses in a unique fashion; it pretty much acts exactly as expected.

Might be something unique in your hardware mix. But your experience is not common as far as I've seen. Good luck; it should be easy to fix and test.

Apologies for odd grammar and auto correct.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So you disabled your DHCP server, then dropped link to a client, and re-established link, and the client held the address? That's broken.

dhclient is expected to monitor link status and if the link goes down, and then comes back, it is supposed to assume that it may now be plugged into another network. There isn't a physical sensor available to tell it that it is plugged into the same switch.

If it is plugged into a different network but retains the old address, it could be stomping on an address that's already allocated and in use on the new network. So it is supposed to go back into DHCP mode and acquire a new address. Those of us who work with lots of gear are quite used to going to a switchport and doing "shutdown" followed by "no shutdown" to force a re-DHCP without physically detaching a device. See for example https://community.extremenetworks.c...sers_to_release_or_renew_their_dhcp_addresses etc. Downing link should trigger a re-DHCP and I consider devices where that does not happen to be broken.
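
The rule described above can be sketched as a tiny decision function. This only illustrates the expected behavior; it is not how dhclient is actually implemented, and `on_link_change` is a hypothetical name.

```shell
#!/bin/sh
# Decide what a DHCP client should do for one link-state transition.
# $1 = previous link state, $2 = new link state ("up" or "down").
on_link_change() {
    case "$1:$2" in
        down:up) echo "re-dhcp" ;;    # link came back: may be a different network
        *)       echo "no-action" ;;  # link unchanged, or just went down
    esac
}

# Walk through a short outage: link is up, drops, stays down, then returns.
prev=up
for state in up down down up up; do
    echo "$prev -> $state: $(on_link_change "$prev" "$state")"
    prev=$state
done
```

Only the down-to-up transition triggers a new DHCP exchange, which is why the outcome depends on whether a DHCP server is answering at that moment.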

My guess is that, unlike the OP's, your DHCP is actually managed by your "router", so you never disabled your DHCP server, and rebooting the router made a DHCP server available very quickly, so that when your FreeNAS re-DHCP'd, it got a renewal in a very short timeframe. This could be different from the OP's situation, where the booting of a pfSense box could potentially take a much longer period of time.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Interesting. Obviously I need some more complex gear to play with. Never had cause to test all this. Makes sense, though: my Windows box definitely cycles the NIC and likely reacquires its reservation.

Heh, now I'll have to watch some things. Honestly never had a need to complicate it. The reservations get hit as expected. Of course half the junk is virtual; some addresses are static, some are DHCP, some are reserved. One thing I do know is that a power cycle is a non-event. But there may have been lots of DHCP traffic I never noticed.

Op also noted a jail and host ip conflict in another thread. So he could have lots of things going on.

I'm interested now in my ESXi server's behavior. If I have a dozen VMs and its link gets unplugged and replugged, does that cascade a bunch of re-DHCP? I don't think I've ever bothered paying attention to it, but figured I'd have noticed if it did. Probably options I've never bothered with.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Op also noted a jail and host ip conflict in another thread. So he could have lots of things going on.

Hadn't noticed.

I'm interested now in my esxi server behavior. If I have a dozen vm's and its link gets unplugged replugged does that cascade a bunch of re-dhcp. I don't think I've ever bothered paying attention to it, but figured I'd have noticed if it did. Probably options I've never bothered with.

You're going to feel real sheepish in about five seconds:

The dozen VM's do not see link drop because they retain link to the vSwitch. Only the link between the vSwitch and your real switch flaps, which isn't likely to cause much of anything from a DHCP point of view.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
You're going to feel real sheepish in about five seconds:

The dozen VM's do not see link drop because they retain link to the vSwitch. Only the link between the vSwitch and your real switch flaps, which isn't likely to cause much of anything from a DHCP point of view.

Heh. Not too sheepish. Normally I'd have just "done it", but was just lounging in bed on my phone. Of course since I am on the naughty list and my vSwitch keeps things happy I haven't watched that FreeNAS box bounce much. Couldn't remember the last IP iteration on the ESXi server so I didn't speculate.

I always do appreciate a nice link and a heads up. I take donations of 10Gb managed switches as well. Hope Christmas treated you well.
 

mrreload

Dabbler
Joined
Jun 20, 2014
Messages
17
I think I understand better now. Quick and dirty solution for now will be to put the FN local switch on the same UPS.
No link drop = no dhclient retry. Works for me.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
How about that, I run extreme networks gear too.

Well, actually we don't here. It was just a convenient first-hit-on-Google because sometimes I get the feeling someone might be about to argue a point about which they're wrong. Preemptive strike.

We've had exceedingly good luck with the Dell PowerConnect (now Dell Networking) gear over the years. The Dell 5012 (circa 2001) was the first reasonably priced gigabit switch but it wasn't full wirespeed. The Dell 5212 (2003) was also good but its big brother the 5224 suffered a firmware flaw that scared the crap out of everyone; it was made out of two switch ASICs that could lose sync with each other. This was actually an Accton switch that many mfrs (3Com, SMC, Foundry, etc) all OEM'd. And that worked great right up until we tried to do VLANs on it and found that it had significant issues with broadcast domain leakage. So we moved onto the 5324 and beat the crap out of those for a decade. No 10G uplinks, no L3 switching, only 24 ports, but the list of hardware I've used that's been problem-free for a decade is very small. Back in 2005-2006 Dell was giving them away to some of their larger customers who made big server purchases, and I was able to pick up maybe a dozen more of them for chump change because those customers usually already had their own preference for (and investment in) switch fabric. During the jump from 100M to 1G, we got rid of a bunch of Synoptics 28115's, and Bay Networks 100M switches. Pretty sure we'd gotten rid of the Netstar/Ascend GRF400's before that point but we still have those because I have this weird thing about not wanting to sell off hardware that cost nearly six figures new.

We have stacked x460-48t's with 10gig uplinks for client switches and stacked x670v-48x's for our servers/storage.

We've been reasonably happy with the Dell 7048's for edge and the Dell 8132F's for core, which I guess kinda makes us a Force10 shop, which is cool because of the historical NetBSD angle. If the 8132's work out well I'll probably continue to have us move in that direction and kill off more 5324's.

I've always preferred smaller individual switches to a larger chassis, because it's easier to have a spare or work around a failure in a crisis. For that same sort of reason I've avoided stacking over the years. Two smaller cheaper switches over a more expensive bigger switch.

I love their protocols, especially elrp for the client stacks.

So... trying to fix STP's brokenness through proprietary protocols. Ain't technology grand.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I take donations of 10Gb managed switches as well.

If you're looking for something small, I'll note that the Dell Networking 5524 is often available on eBay for $500-$600. It's a nice small managed switch with two 10G SFP+ ports. You're a few days late for Santa based delivery, though I hear it's never too early to start being good for next Christmas.

Hope Christmas treated you well.

Despite my wish list of parts for a new hypervisor and VM storage NAS, Santa failed to deliver. Guess I'll have to order the stuff myself.
 