Server unavailable all of a sudden

Status
Not open for further replies.

cuvy

Dabbler
Joined
Jun 12, 2015
Messages
40
Hi there,

I woke up this morning and my server was unavailable. I connected to the remote console via IPMI and this is what I'm seeing over and over again:
[Screenshot: Screen Shot 2015-06-28 at 12.25.13 PM.png]

I can't reach the FreeNAS control panel and I can't SSH into the server; nothing responds. I tried resetting the server as well.

I researched this a little, but honestly I'm not sure what to search for. Any help would be appreciated.

Thanks!

Hardware:
FreeNAS-9.3-STABLE-201505130355;
RAIDZ2 (5x 9x3TB);
Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz;
34GB RAM
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Looks like an SAS-related issue.

Is your SAS controller flashed to P16 IT mode?

Also, please list all the hardware and FreeNAS version, as required by the forum rules.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I concur that it is probably a SAS-related issue. Please list all of your hardware, including the type of HBA and what it's been flashed to, if anything.
 

cuvy

Dabbler
Joined
Jun 12, 2015
Messages
40
Hi there, sorry it took me a while to post the hardware information: http://f.uberfive.com/3g0B0u2g0X0Z

Quick update: I've rebooted both the server and the 45-drive chassis attached to it via SAS, and everything came back up correctly. I have 4 hard drives with errors (in 3 arrays), so I'm slowly replacing them. Here are the hard drive errors I'm seeing:

  • CRITICAL: Device: /dev/da21 [SAT], 16 Currently unreadable (pending) sectors
  • CRITICAL: Device: /dev/da21 [SAT], 16 Offline uncorrectable sectors
  • CRITICAL: Device: /dev/da39 [SAT], ATA error count increased from 297 to 298
  • CRITICAL: Device: /dev/da45 [SAT], 8 Currently unreadable (pending) sectors
  • CRITICAL: Device: /dev/da45 [SAT], 8 Offline uncorrectable sectors
  • CRITICAL: Device: /dev/da52 [SAT], Failed SMART usage Attribute: 184 End-to-End_Error.
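
To see at a glance which disks these alerts point to, the lines can be grouped by device. The following is a minimal Python sketch, assuming the alerts keep the exact text format shown above. The function name and the regular expression are illustrative and not part of FreeNAS or smartd.

```python
import re
from collections import defaultdict

# Alert lines copied from the list above.
ALERTS = """\
CRITICAL: Device: /dev/da21 [SAT], 16 Currently unreadable (pending) sectors
CRITICAL: Device: /dev/da21 [SAT], 16 Offline uncorrectable sectors
CRITICAL: Device: /dev/da39 [SAT], ATA error count increased from 297 to 298
CRITICAL: Device: /dev/da45 [SAT], 8 Currently unreadable (pending) sectors
CRITICAL: Device: /dev/da45 [SAT], 8 Offline uncorrectable sectors
CRITICAL: Device: /dev/da52 [SAT], Failed SMART usage Attribute: 184 End-to-End_Error.
"""

# Assumed line format: "<LEVEL>: Device: /dev/<name> [<type>], <message>"
ALERT_RE = re.compile(
    r"^(?P<level>\w+): Device: /dev/(?P<dev>\w+) \[\w+\], (?P<msg>.+)$"
)


def group_alerts_by_device(text):
    """Return {device: [messages]} for every line matching the assumed format."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = ALERT_RE.match(line.strip())
        if match:
            # Drop a trailing period so messages compare cleanly.
            grouped[match.group("dev")].append(match.group("msg").rstrip("."))
    return dict(grouped)


if __name__ == "__main__":
    for dev, msgs in sorted(group_alerts_by_device(ALERTS).items()):
        print(f"{dev}: {len(msgs)} alert(s)")
        for msg in msgs:
            print(f"  - {msg}")
```

Grouped this way, the six alerts come down to four disks: da21, da39, da45 and da52.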
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
That would definitely explain why your server went unavailable. In fact, you may find yourself unable to rebuild the zpool because of the number of failed disks. :(
 

cuvy

Dabbler
Joined
Jun 12, 2015
Messages
40
Thankfully, they are not all in the same array; I have 5 different arrays of 9 drives each. One thing I am wondering: can I replace more than one drive at a time if they are not in the same RAID array?

For example
raidz2-0
da1
da2
da3
da4
da5
da6
da7
da8
da9

raidz2-1
da10
da11
da12
da13
da14
da15
da16
da17
da18

Could I replace da1 and da10 at the same time? My gut tells me yes, but I'd rather have someone else's opinion before doing so.

Thanks :)
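
On the question of replacing da1 and da10 together: redundancy in ZFS is per vdev, so what matters is how many disks each RAID-Z2 vdev is missing at any one moment. Below is a minimal Python sketch of that bookkeeping, assuming the two-vdev layout from the example above. The function names are made up for illustration, and nothing here talks to ZFS.

```python
RAIDZ2_PARITY = 2  # a RAID-Z2 vdev survives the loss of up to two member disks

# Layout taken from the example in the post (illustrative device names).
VDEVS = {
    "raidz2-0": [f"da{i}" for i in range(1, 10)],   # da1 .. da9
    "raidz2-1": [f"da{i}" for i in range(10, 19)],  # da10 .. da18
}


def remaining_parity(vdevs, offline, parity=RAIDZ2_PARITY):
    """Return {vdev: parity left} if every disk in `offline` is pulled at once."""
    offline = set(offline)
    return {
        name: parity - len(offline.intersection(disks))
        for name, disks in vdevs.items()
    }


def pool_survives(vdevs, offline, parity=RAIDZ2_PARITY):
    """The pool stays available only if no vdev loses more disks than its parity."""
    return all(
        left >= 0 for left in remaining_parity(vdevs, offline, parity).values()
    )


if __name__ == "__main__":
    print(remaining_parity(VDEVS, ["da1", "da10"]))     # {'raidz2-0': 1, 'raidz2-1': 1}
    print(pool_survives(VDEVS, ["da1", "da10"]))        # True
    print(pool_survives(VDEVS, ["da1", "da2", "da3"]))  # False
```

With one disk out of each vdev, both vdevs still have one disk of parity in hand. The model only counts disks, so it says nothing about another marginal drive in the same vdev failing during the resilver.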
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Wait a second.

You have a JBOD? On a FreeNAS? Explain.
 

cuvy

Dabbler
Joined
Jun 12, 2015
Messages
40
I have a 36-drive server running FreeNAS and I have a 45-drive chassis connected via SAS.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I understand that, but you are running them as a JBOD? You cannot run a JBOD in FreeNAS.

Perhaps you are just using JBOD in some other sense than I am used to hearing it.
 

cuvy

Dabbler
Joined
Jun 12, 2015
Messages
40
Oh, maybe it's a French Canadian thing haha. We call our big 4U chassis "jbod" servers. I understand how it can confuse things, so I've edited my post above. For the record, all hard drives are using ZFS RAID-Z2, with 9 hard drives per array (vdev).
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Some people call a chassis full of disks attached to an SAS expander which connects to the host a JBOD.
Probably because it has little logic beyond the disks.
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
I think JBOD in the true sense is exactly what you can run on FreeNAS. Perhaps you are thinking of the false 'JBOD' mode some RAID controllers have when they present each disk separately abstracted as an 'array' of one.
 
Johnny Fartpants

Joined
Jul 3, 2015
Messages
926
I've seen this twice before, and both times it was the result of the wiring in the backplane being configured incorrectly. I say incorrectly, but Supermicro says there are a few different ways to wire the backplane depending on which OS you're running. The issue appeared only when a 45-bay JBOD was used and disks were inserted into the back, generally in multipath mode. You may want to check your wiring.

The original wiring had the HBA input going into Pri J0 and Sec J0, with the cascade cable to the rear going out from Pri J1 and Sec J1. After moving the cascade cable from J1 to J2 on both Pri and Sec, everything worked great.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Well, if that fixes the OP's problem, then that's a real nice contribution from a new forum user. Thanks Johnny Fartpants.
 

cuvy

Dabbler
Joined
Jun 12, 2015
Messages
40
This is very interesting. There are terms I do not fully understand:

* Currently my 45-bay JBOD has hard drives in the back only; the front is still empty, and I'm slowly adding new hard drives every week. It seems I have the same setup that you had. That being said, what does "multipath mode" mean? Is this a setting in FreeNAS? Does it mean the hard drives in my arrays are not next to each other physically?
* When you talk about the wiring method, I'm not sure I understand the terms. What do J0, J1 and J2 mean? Are those the SAS inputs on the JBOD? Do Pri and Sec mean Primary and Secondary?

I have 1 SAS plug on my FreeNAS server (http://f.uberfive.com/image/2x323d3P0I0K) and my Supermicro JBOD has 4 (http://f.uberfive.com/image/0Z2o1R0G233O). Would it mean the cable that goes from the JBOD to the server isn't going out from the correct output?
 
Johnny Fartpants

Joined
Jul 3, 2015
Messages
926
Multipath is when you connect your JBOD to your server twice using multiple HBAs for resilience (not all JBODs support this). The issue may not be exclusive to multipath, but both times I've seen it, it's been on multipath setups.

The wiring I'm talking about is inside your 45-bay JBOD. The connections I refer to will be visible once you've opened up your JBOD.

As a test, you may want to slot your rear drives into the front and see what happens since, like I said, it only affects the rear.

Supermicro ships its JBODs by default in the initial config I mentioned. On Solaris and Nexenta it appears not to be an issue; however, FreeNAS doesn't like it and wants it wired the other way.
 

cuvy

Dabbler
Joined
Jun 12, 2015
Messages
40
This is very helpful, thank you very much!

If I move the hard drives to the front of the JBOD, will this mess up my current pool and vdevs (i.e. cause data loss)?
 
Johnny Fartpants

Joined
Jul 3, 2015
Messages
926
I don't see why it would, but perhaps let some others on the forum validate my comment before trying it, as I wouldn't want to be responsible for you losing your data.
 
Johnny Fartpants

Joined
Jul 3, 2015
Messages
926
Naturally, the system would need to be shut down before you move the drives.
 
Johnny Fartpants

Joined
Jul 3, 2015
Messages
926
In theory, you should be able to take all your disks out and put them into my JBOD, and your pool would still be fine, so long as you export the pool before you import it; however, this is done during shutdown as standard.
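
The order of operations described here can be written down as a short checklist. This is a rough Python sketch that only prints the steps and runs nothing. It uses the standard `zpool export`, `zpool import` and `zpool status` commands; the pool name `tank` and the function name are hypothetical. FreeNAS normally drives these operations from its web interface, so the commands are shown only to make the sequence explicit.

```python
def relocation_checklist(pool):
    """Return ordered (step, command) pairs; command is None for manual steps."""
    return [
        ("Export the pool", ["zpool", "export", pool]),
        ("Shut down, then move the disks to their new bays", None),
        ("Power on and import the pool", ["zpool", "import", pool]),
        ("Check that every vdev reports ONLINE", ["zpool", "status", pool]),
    ]


if __name__ == "__main__":
    # "tank" is a hypothetical pool name used only for illustration.
    for number, (step, command) in enumerate(relocation_checklist("tank"), start=1):
        suffix = f"  ->  {' '.join(command)}" if command else ""
        print(f"{number}. {step}{suffix}")
```

ZFS identifies pool members by the labels written on the disks rather than by bay position, which is why moving drives between slots is expected to be harmless once the pool has been cleanly exported.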
 