Degraded disk replacement problem: boot hangs after clearing CMOS

NAS-Carl

Cadet
Joined
Mar 10, 2023
Messages
4
Hello,

Build information

Motherboard: Asrock E3C236D4U
Hard Disks: 4x Seagate ST 4000NM0033
Problem statement
TrueNAS SCALE alerted me that one of my drives was degraded. I offlined the drive, powered down my server, and replaced the bad drive with a new one. When I turned the server back on, it hung with the message "DXE PCI Bus Enumeration 93". I cleared CMOS, then restarted the server and was able to get as far as the BIOS. From there, TrueNAS started to load, but the serial console cut out at a certain point (I am managing my system via IPMI). The boot does not appear to have been successful, as I cannot reach the web interface.

I know there is more information I should provide, but I don't know how. I'm an amateur and not very experienced.

Any insights would be appreciated. Thank you.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well, let's start with the basics:
  • The serial console cuts out, but what about the graphical console?
  • Update the system firmware, if an update is available
  • Replace the battery with a fresh one
Also, do you have any PCIe cards in the system?
 

NAS-Carl

Cadet
Joined
Mar 10, 2023
Messages
4
Thank you for your reply.

The system firmware is up to date, and the CMOS battery is new.

I do have a couple of PCIe cards in the system. One is a graphics card so I could have an HDMI connection; the motherboard only has VGA. When the serial console blanked out, I tried plugging in the HDMI but got no signal.

I should also have mentioned in my original post that I have one additional Western Digital hard disk (WD40EZRZ) in the system, plus a Samsung SSD (840 EVO) and a SK Hynix NVMe drive (BC711). The NVMe drive is my boot drive.

Here's a screenshot of the boot menu I'm able to get to:

Thanks again for your help.

Screenshot 2023-03-10 115608.png
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I do have a couple of PCIe cards in the system. One is a graphics card so I could have an HDMI connection; the motherboard only has VGA. When the serial console blanked out, I tried plugging in the HDMI but got no signal.
What about the VGA output (via the physical connector or the iKVM functionality)?
 
Joined
Jun 15, 2022
Messages
674
Disconnect the data cable from the new drive and see if the system boots; a firmware incompatibility (drive firmware vs. system BIOS) can cause these symptoms. (I've actually run into this, though it's pretty rare in my experience.)
 

NAS-Carl

Cadet
Joined
Mar 10, 2023
Messages
4
I found a workable solution, thanks to your input.

After clearing CMOS, I entered BIOS and disabled the onboard VGA output. @Ericloewe's question about VGA output brought this to mind as something to try. I then proceeded with the OS boot as I had before, but this time, the console remained on IPMI, so I was able to see what I was doing. TrueNAS did indeed boot up, though for some reason my network configurations no longer worked.

After resetting the network configuration, I was able to get back onto the web interface, at which point I redid the drive replacement. This time, I didn't shut down the server; I realized that I had misread the instructions in the documentation (https://www.truenas.com/docs/scale/scaletutorials/storage/pools/disks/replacingdisks/). I thought it said to shut down the system, but it doesn't. Instead, I just offlined the faulty drive, then swapped it out for the replacement. Then I selected replace drive, and it worked! The system is currently resilvering the RAIDZ2.
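
For anyone who wants to see the same order of operations from the shell side, here is a minimal sketch in Python. It only builds the corresponding zpool commands and prints them; it does not run anything. The pool name "tank" and the device names "sdc" and "sdf" are hypothetical placeholders, not values from this system, and the function name is made up for illustration. On TrueNAS the web interface remains the documented way to replace a disk, so treat this as an illustration of the sequence rather than something to paste in.

```python
from typing import List


def replacement_commands(pool: str, old_disk: str, new_disk: str) -> List[List[str]]:
    """Return the zpool commands for a disk replacement, in order.

    All three arguments are placeholders supplied by the caller; nothing
    here inspects the real system or executes a command.
    """
    return [
        # 1. Take the failing disk offline so ZFS stops using it.
        ["zpool", "offline", pool, old_disk],
        # (Physically swap the disk here; no shutdown is implied.)
        # 2. Tell ZFS to rebuild onto the new disk, which starts the resilver.
        ["zpool", "replace", pool, old_disk, new_disk],
        # 3. Check resilver progress.
        ["zpool", "status", pool],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in replacement_commands("tank", "sdc", "sdf"):
        print(" ".join(cmd))
```

The commands are returned as argument lists rather than executed, so the sequence can be reviewed before anything touches a pool.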

Thanks again for your input; I appreciate it.
 