SOLVED Server no longer boots after I changed boot image to an older one

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
I have serious network problems at home and am trying to solve them.

Basic info here:

I have 2 servers at home: NITRO is the one I store stuff on, and NAZZE rsyncs everything every 3 days.
Both run FreeNAS and serve files mostly via Samba, as I only have Windows (7 & 10) at home for now.

Yesterday, I totally lost contact with NAZZE, and only one Win10 machine can connect to NITRO via SMB.
I have no idea what caused it. Win10 decided to do some upgrades as usual, which might be it, except that my 2 Win7 machines can no longer reach the servers or the Win10 machines' shared folders either.
Win10 can see Win7 shares just fine.
Rebooted the router, no changes.

Connecting to the servers via http works, just not via smb.
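
The "http works, smb doesn't" check can be repeated from any client with a minimal Python sketch like the one below. It only tests whether a TCP connection can be opened, which is not the same as a working share. The function names and the port choices (80 for the web UI, 445 for SMB) are assumptions for illustration, not anything taken from FreeNAS.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_services(host, ports=None, timeout=2.0):
    """Return {service_name: reachable} for the given host.

    The default ports are assumptions: 80 for the web UI, 445 for SMB.
    """
    if ports is None:
        ports = {"http": 80, "smb": 445}
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Running `check_services("nitro")` and `check_services("nazze")` from each client would show which machines can open which ports. If http is reachable and smb is not from some clients only, the problem is more likely on the network or client side than in the server itself.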

Now to the main issue.

I upgraded NAZZE from 10.0.U7 (I think) to 10.2-U3. Didn't help.
I read in this forum that a downgrade to an earlier version might help, so I made the previous version active and rebooted.
Now it can't figure out where to boot from.
Did the upgrade to 10.2 also upgrade the ZFS version on my mirrored USB sticks, or what?

NAZZE is an HP MicroServer Gen8 with 16 GB RAM and five 4 TB disks, and boots from mirrored USB sticks.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
SSH is turned on in the settings, yes.
Never tried to access my servers that way though.
Only have windows running at home, no linux or bsd.
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
In the Windows 10 April 2018 Update there is an OpenSSH client enabled by default. You can reach it from both cmd and PowerShell. For better copy-paste ability I like to use cmder.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
In the Windows 10 April 2018 Update there is an OpenSSH client enabled by default. You can reach it from both cmd and PowerShell. For better copy-paste ability I like to use cmder.

Goody. I didn't know that, but how does that help me?
NAZZE halts with some kind of text beginning with BTX yadda yadda and then stops.

I tested SSH into the working server; this is the result:

jasse@nitro:~ % pwd
/nonexistent

Seems that I need to elevate my rights in the server ;-)
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
You either need to give your user sudo rights or log in as root with the webUI password.

Then check zpool status -v and post the printout in [ code] tags
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I like using PuTTY for SSH and everything else; many here do. It's a small, simple-to-use program, and you can capture your session to a text file if you desire.
Read in this forum that a downgrade to an earlier version might help so I made the previous version active and rebooted.
Now it can't figure out where to boot from.
So what I'm reading is that your computer NAZZE doesn't know which drive to bootstrap from. This is a BIOS function of the computer, so you have a few things to try here, but I seriously doubt any of your data is lost. You may also have a failing BIOS battery if you can reconfigure the boot drive without issue but the BIOS fails to retain that configuration; normally the system date/time is lost as well when this happens.

1. When the computer is bootstrapping press the keyboard key that gets you into the BIOS boot menu or boot order.
2. Select your USB flash drive if it sees it. If it is not seen then maybe you have a failing USB flash drive.
3. If this fails to work, get a single clean/new USB flash drive and reinstall FreeNAS with the version you desire. Reboot and restore your backup configuration file. All should be good. If you don't have a backup configuration file, you might be able to pull it from one of the failed/corrupt USB drives, but if you had a fairly simple configuration, it might be easier to just reconfigure it and add your pool.

And please troubleshoot one problem at a time, the problems are likely not related.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
In the Windows 10 April 2018 Update there is an OpenSSH client enabled by default. You can reach it from both cmd and PowerShell. For better copy-paste ability I like to use cmder.
I didn't know this, I'll have to give this a try on my laptop, it's good to learn something new.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
You either need to give your user sudo rights or log in as root with the webUI password.

Then check zpool status -v and post the printout in [ code] tags

Oki, here it comes

Code:
jasse@nitro:~ % zpool status -v
  pool: NITRO_dev
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:01 with 0 errors on Sun Apr  7 00:00:01 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        NITRO_dev                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/b4b8222b-b841-11e8-8106-0cc47a86669a  ONLINE       0     0     0
            gptid/b5797e6c-b841-11e8-8106-0cc47a86669a  ONLINE       0     0     0

errors: No known data errors

  pool: NITRO_tank
 state: ONLINE
  scan: scrub repaired 0 in 0 days 03:57:45 with 0 errors on Mon Apr  1 04:57:46 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        NITRO_tank                                      ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/88e47662-b83f-11e8-8106-0cc47a86669a  ONLINE       0     0     0
            gptid/89ea27f9-b83f-11e8-8106-0cc47a86669a  ONLINE       0     0     0
            gptid/8af1a4a1-b83f-11e8-8106-0cc47a86669a  ONLINE       0     0     0
            gptid/8c16a93d-b83f-11e8-8106-0cc47a86669a  ONLINE       0     0     0
            gptid/8d2981e1-b83f-11e8-8106-0cc47a86669a  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:01:42 with 0 errors on Fri Apr  5 03:46:42 2019
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da1p2   ONLINE       0     0     0
            da0p2   ONLINE       0     0     0

errors: No known data errors
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
So what I'm reading is that your computer NAZZE doesn't know which drive to bootstrap from. This is a BIOS function of the computer, so you have a few things to try here, but I seriously doubt any of your data is lost. You may also have a failing BIOS battery if you can reconfigure the boot drive without issue but the BIOS fails to retain that configuration; normally the system date/time is lost as well when this happens.

1. When the computer is bootstrapping press the keyboard key that gets you into the BIOS boot menu or boot order.
2. Select your USB flash drive if it sees it. If it is not seen then maybe you have a failing USB flash drive.
3. If this fails to work, get a single clean/new USB flash drive and reinstall FreeNAS with the version you desire. Reboot and restore your backup configuration file. All should be good. If you don't have a backup configuration file, you might be able to pull it from one of the failed/corrupt USB drives, but if you had a fairly simple configuration, it might be easier to just reconfigure it and add your pool.

And please troubleshoot one problem at a time, the problems are likely not related.

I have some vague memories that the BTX thingie is part of the bootstrap procedure.
I'm quite sure that the zpool is intact, I just can't reach it.
I'll check the battery, can't hurt to put in a new one.

About no 1. This is a battle at least half an hour long, where the key to winning is to press F9 as hard as possible; at least it worked last night.
About no 2. Replaced one that was failing last week. It resilvered just fine.
About no 3. I don't have any more USB2 sticks, and the MicroServer apparently doesn't like the USB3 kind. I have tested that. At least it didn't like a mix of USB2 and USB3 sticks.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I have some vague memories that the BTX thingie is part of the bootstrap procedure.
I'm quite sure that the zpool is intact, I just can't reach it.
I'll check the battery, can't hurt to put in a new one.

About no 1. This is a battle at least half an hour long, where the key to winning is to press F9 as hard as possible; at least it worked last night.
About no 2. Replaced one that was failing last week. It resilvered just fine.
About no 3. I don't have any more USB2 sticks, and the MicroServer apparently doesn't like the USB3 kind. I have tested that. At least it didn't like a mix of USB2 and USB3 sticks.
Use a single USB flash drive; mirrored flash drives don't really gain you anything. Just make a backup of your configuration file after you make any real changes to your FreeNAS configuration and you will be able to restore very easily. I of course never promote using dual USB boot drives because they are not true RAID and thus do not have failover capability. But if you prefer it, I'm not going to talk you out of it. It comes down to personal preference, provided you know the technical details of how the system will work and its limitations.

I wouldn't change the BIOS battery unless you feel it needs it. But since you mentioned that you changed the USB stick recently, it could be that you just need to reconfigure the BIOS to boot from that device. You may have replaced the device that was the actual configured boot device, and this is what is causing your boot issue. Remember, you want to configure the boot device within the BIOS configuration screens, not select a one-time boot device.

Odd that a USB 3 flash drive fails to work; it should be backwards compatible, but you know your hardware better than I do. Then again, I do recommend USB 2.0 devices anyway because many USB 3.x devices have problems dissipating heat and are prone to premature failure. I think devices have been getting better over the past few years, but that is again a personal decision.

And your pools look fine as you said.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
The pools are on NITRO, the server that's alive, although I can only reach it from one win10 machine.
I'm copying all the files from that one to my other computers for backup right now. Yeah, I'm a bit paranoid, like most people who visit this forum.

Mirrored flash drives help: if one fails you just have to detach it, replace it, and attach the new device. Resilver happens automatically. No reinstalls needed. I have done that at least twice already in a few years. That feature comes in handy.

New BIOS battery didn't change a thing.
Tried the F9 trick on two different keyboards until I remembered that the Gen8 MicroServer does not bother with the USB3 ports when booting, nor with USB3 devices (Kingston) in the USB2 slot.
Might as well restart all the switches in the house when I have copied all the files, not that I think that matters.

I have to rummage through the house to see if I have more usb2 sticks and try a new install.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I suspect that if you didn't reconfigure your BIOS when you replaced the last USB Flash drive, that is your problem. Find that keyboard and get into the BIOS to fix your boot priority.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
Had an idea. Rummaged through my trash bin (the one used for electronic scrap, old ignition coils etc) and found the usb2 stick that was replaced last week.
Now the sucker boots in 11.1-U7 and the pool is there.
Still can't reach it from other computers via smb, that was actually the main question for this thread.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
It worked for 15 minutes, then got stuck on some kind of flash drive write error ("read protected", I think it was), but the server shut down of its own accord, so I only saw it for about 5 seconds.
SSH seemed to work, but I had just started a scrub on the flash/boot pool, so the system wasn't really responsive.
The SSH session kinda got stuck in no man's land after I wrote "sudo zpool status -v".
No response at all.
That got me off the chair to go check the monitor attached to the server.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
Booted successfully again with the USB stick from the trash.
SSH'd into the machine.
This is the requested output:

Code:
jasse@nazze:~ % zpool status -v
  pool: NAZZE_tank
 state: ONLINE
  scan: scrub repaired 0 in 0 days 04:01:53 with 0 errors on Sat Apr 13 04:02:55 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        NAZZE_tank                                      ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/a1ea6f29-b5ac-11e7-bf39-b05ada8782a4  ONLINE       0     0     0
            gptid/a2ed926d-b5ac-11e7-bf39-b05ada8782a4  ONLINE       0     0     0
            gptid/a3f41827-b5ac-11e7-bf39-b05ada8782a4  ONLINE       0     0     0
            gptid/a4f8f2fc-b5ac-11e7-bf39-b05ada8782a4  ONLINE       0     0     0
            gptid/a6027bd5-b5ac-11e7-bf39-b05ada8782a4  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0 days 00:10:53 with 0 errors on Sun Apr 14 16:16:23 2019
config:

        NAME                     STATE     READ WRITE CKSUM
        freenas-boot             DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            da0p2                ONLINE       0     0     0
            5336685993595117704  UNAVAIL      0     0     0  was /dev/da0p2

errors: No known data errors
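
With several pools, the state lines are the first thing to look at. Below is a rough Python sketch that pulls the pool names and states out of pasted `zpool status` text. The function names are hypothetical and the sample is abridged from the output above. It reads only the pool: and state: lines and does not inspect individual devices.

```python
import re

# Abridged from the `zpool status -v` output posted above.
SAMPLE = """\
  pool: NAZZE_tank
 state: ONLINE
  scan: scrub repaired 0 in 0 days 04:01:53 with 0 errors on Sat Apr 13 04:02:55 2019

  pool: freenas-boot
 state: DEGRADED
status: One or more devices could not be opened.
"""


def pool_states(status_text):
    """Return {pool_name: state} parsed from `zpool status` text."""
    states = {}
    current = None
    for line in status_text.splitlines():
        m = re.match(r"\s*pool:\s+(\S+)", line)
        if m:
            current = m.group(1)
            continue
        # "state:" does not match the "status:" line, so only the
        # pool-level state is picked up.
        m = re.match(r"\s*state:\s+(\S+)", line)
        if m and current is not None:
            states[current] = m.group(1)
            current = None
    return states


def unhealthy_pools(status_text):
    """Return a sorted list of pools whose state is not ONLINE."""
    return sorted(
        name for name, state in pool_states(status_text).items()
        if state != "ONLINE"
    )


print(unhealthy_pools(SAMPLE))  # ['freenas-boot']
```

Here the data pool is fine and only the boot mirror is degraded, which matches one of the two USB sticks being missing.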
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
I have wiped everything in: /var/db/samba4
and deleted: /usr/local/etc/smb4.conf
then had FreeNAS create new files using the GUI, to no avail.

I'm beginning to suspect RAM failure, but that doesn't explain why my servers (and more) just disappeared from my network.
I'll reboot everything network related tomorrow.
If that doesn't fix it then it's shotgun time.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
If you suspect a hardware failure such as RAM, then run the burn-in tests and verify your hardware before taking another step forward. I'd run the RAM tests for a few days minimum, a week if I suspect a failure. Don't forget CPU testing as well. Do you have your system on a UPS? If not, then any power glitch can cause serious damage/corruption.
 

Jasse Jansson

Explorer
Joined
Mar 19, 2017
Messages
71
Servers, router, one computer and most switches are behind 2 UPSes that are verified to work fine.

Shut down everything: servers, computers, switches and router.
Startup sequence: switches, router, servers, more switches, computers.
Now everything works and all computers see each other.

NAZZE still can't read /usr/local/etc/smb4.conf according to the scrolling boot screen. Works fine anyway.

Feel free to explain this to me. Switch hiccup maybe.
Many thanks anyway.
 