SOLVED A lot of errors and reboots, unusable

Status
Not open for further replies.

HertogArjan

Dabbler
Joined
Oct 16, 2016
Messages
30
EDIT: I found something else which is interesting, to say the least. Memtest reported that ECC was disabled. I wanted to check this, and sure enough, in the BIOS all 4 installed modules were labeled as NON-ECC, even though they definitely are ECC modules. I am no longer entirely sure whether they were labeled ECC a while back, but I seem to recall they were. I looked in the BIOS for an option I could enable or disable but could not find anything. Can anyone explain this? My memory modules are all Crucial CT102472BD160B.


I took a few photos, but I doubt you will be able to tell from any of them whether something is shorting. I'd have to remove the motherboard again. I do have another case and could try running the board in that, but that would take some time. It is currently 2:45 AM here and I'll be going to bed now, so if I'm going to try that it will have to happen tomorrow. Meanwhile the memtest will run.

I apologize for the mess of cables. I would have made it a little neater, but after swapping out the motherboard so often I just stopped caring about it. No wires are touching the chassis and airflow is barely obstructed. Note that the drives are currently not powered even though they look connected: I disconnected their cables from the PSU. I don't want to risk the memtest failing because of issues with the SAS controller, even though that would be unlikely.

I also added some photos showing the actual errors, which might be useful in determining the source of the problem. They are old screenshots, but they show the exact same error. The only difference is that it no longer happens during boot, because the system is no longer trying to import the volumes during boot. Here you go:

First the crash:
Crash1.jpg
Crash2.jpg
Crash3.jpg


System photos:
IMG_3737.JPG
IMG_3738.JPG
 

HertogArjan

Dabbler
Joined
Oct 16, 2016
Messages
30
That's because you have an i5 installed.

Ooh, good point! I cannot believe I had not thought of that. I guess that is just temporary then.

I might just place the motherboard in another chassis and see if that helps. I am also going to look more closely at the board to see whether anything is shorted out.

EDIT: Moving the motherboard to another chassis did not help. Since my CPU is probably defective, I thought the SAS controller firmware might have been corrupted when I flashed it before. Unfortunately, reflashing with the i5 installed did not make a difference.

EDIT 4-11: Does anyone have any more suggestions? I installed an SSD with Windows on it in the system and ran some disk benchmarks to get the data flowing. It has been stable for hours, and I am starting to think that either the SAS controller is not unstable with all drives, or Windows somehow does not care when the SAS controller resets. I have been in contact with Supermicro support, and they now also suggest that the volumes might somehow be the cause and that I should try rebuilding them. However, I used these volumes in my older system for a long time and never experienced any issues. Since this is more FreeNAS or ZFS territory, I hoped someone here could shed some light on the issue.

EDIT: Right after I edited this message I got a BSOD when starting Windows on the system, claiming that 'a required device isn't connected or cannot be accessed' and that the installation needs to be repaired. I am pretty sure Windows is not going to be able to repair anything. It seems the SSD isn't stable either.
 

HertogArjan

Dabbler
Joined
Oct 16, 2016
Messages
30
It has been almost two months since I last edited this topic, so I would like to bump it once more. I RMA'd the motherboard and the CPU again, but a problem still remains. The "mps0: IOC FAULT" error still occurs when I try to connect my drives to the SAS ports. My system has been stable on the ordinary SATA ports for nearly 300 hours of uptime. I have purposely been stressing the disks in ways that would normally have caused problems, but it did not fail. This is the most progress I have made so far. Still, the board is of little use to me if I cannot get the SAS ports to work.

Supermicro seems reluctant to offer any support since FreeBSD is not an "officially supported OS" for the X10SL7-F-P, and unfortunately they are within their rights to refuse. I cannot fully prove that FreeNAS is not to blame. However, I have seen so many people on this forum use this board for FreeNAS that I do not believe FreeNAS is incompatible with it.

I would really like to hear from members who own this board whether they see something wrong with my configuration. I looked at an old X10SL7-F topic, specifically at the server configurations in the posters' signatures. I noticed two things: every configuration I could see paired the board with a Xeon, and BIOS 2.0 seemed to be popular, while the latest version is 3.0a. Could it be as simple as my i3 CPU being incompatible with the SAS controller? Or does the latest BIOS actually break it? The latter seems very unlikely.
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
Could you not remove your pool drives (to protect them), load up some version of a supported OS, and test the controller ports with the help of SM tech support?
 

HertogArjan

Dabbler
Joined
Oct 16, 2016
Messages
30
I actually had the same idea already, and in my last reply to SM tech support I asked specifically which OS is officially supported, but I have not received an answer yet, probably because of the holiday season. I will have to wait for their reply.

EDIT 25-3-17:
A quick update: about a month ago I decided to boot FreeNAS on my newer hardware again and, strangely enough, it booted without any problems. Note that I haven't changed anything hardware-wise, nor have I performed any BIOS or firmware updates. It was a little too soon to declare the system stable back then, but now that a month has passed without any problems, I am confident that it is stable. It must have been fixed in a FreeNAS/FreeBSD update, because that is the only thing I have updated. I don't think I am going to understand it any better than I do now. It's an inconclusive ending, but an ending nonetheless.
 