
FYI, Intel C2000 family of processors: System Fault may lead to dead system.

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,139
Thanks
3,867
#21

millst

FreeNAS Experienced
Joined
Feb 2, 2015
Messages
127
Thanks
14
#22
Hmm, I'd really like to see the new revision boards Supermicro is supposedly shipping. I'd bet they're adding an external clock.
Supermicro approved the RMA for my A1SRi-2758F. They will apply a fix, but wouldn't give me any details on it (even after I asked twice). I'm just passing 17 months of use, so I'll take some scheduled downtime over the chance of getting hosed.

-tm
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,139
Thanks
3,867
#23
Supermicro approved the RMA for my A1SRi-2758F. They will apply a fix, but wouldn't give me any details on it (even after I asked twice). I'm just passing 17 months of use, so I'll take some scheduled downtime over the chance of getting hosed.

-tm
Can you post some pictures once you have the board in your hands?
 

fta

FreeNAS Experienced
Joined
Apr 6, 2015
Messages
148
Thanks
33
#24
My Mini (ASRock C2750D4I) died on Saturday night. It lasted 2 years and 1 week. I'm currently waiting on iX to RMA me an advance replacement. I'm getting antsy, as everything in my home runs off my NAS.
 

millst

FreeNAS Experienced
Joined
Feb 2, 2015
Messages
127
Thanks
14
#25
I'll try to remember. I doubt I'll ship until the end of the week and then it will be a couple weeks before I get it back.

-tm
 

Arwen

FreeNAS Expert
Joined
May 17, 2014
Messages
1,120
Thanks
547
#26
From what I've read, and from my computer electronics experience (I programmed and helped design embedded micro-controllers),
the LPC clock appears to have too high a load attached to it, which wears out the transistor driver over time. A simple board
change to use an external amplifier (even a simple TTL one) would prevent the problem.

So I doubt they are adding an external clock, unless it's 100% synchronized to the built-in one.

From a former:
- Chief programmer (for embedded boards)
- Assistant electrical engineer (for embedded boards)
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,139
Thanks
3,867
#27
From what I've read, and from my computer electronics experience (I programmed and helped design embedded micro-controllers),
the LPC clock appears to have too high a load attached to it, which wears out the transistor driver over time. A simple board
change to use an external amplifier (even a simple TTL one) would prevent the problem.

So I doubt they are adding an external clock, unless it's 100% synchronized to the built-in one.

From a former:
- Chief programmer (for embedded boards)
- Assistant electrical engineer (for embedded boards)
Yeah, there were some additional details that took a while to show up - I tend to agree with that assessment now. Some theories of rework being possible on assembled units were floating around, but I don't see that happening.
 

Arwen

FreeNAS Expert
Joined
May 17, 2014
Messages
1,120
Thanks
547
#28
Yeah, there were some additional details that took a while to show up - I tend to agree with that assessment now. Some theories of rework being possible on assembled units were floating around, but I don't see that happening.
It may be possible to rework a board to prevent the problem (if I understand it correctly).
A cut trace, an added amp / TTL chip, some tiny wires (perhaps hot-glued to the board afterwards), and
all is good. I've seen it a million times on computer boards in the past.

Not saying that's what Supermicro, ASRock Rack or any vendor will do. And it almost certainly
won't fix the problem on a failed board once the problem rears its ugly head.

Wish Intel would give us a new generation Avoton. Not the Xeon Ds, as they are expensive. Or
the vaporware Denvertons. But something with changes along these lines:
  • PCIe 3.x instead of version 2.0
  • USB 3.1 Gen 2 on at least 1 port, preferably 2 or more ports
  • Still within the 20 watt power envelope (14nm fabrication?)
  • DDR4 memory
  • All 6 SATA ports running at 6 Gbps
  • Misc. CPU instruction updates (ones added since 2013, the original Avoton's release year)
Should not take much engineering.
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,139
Thanks
3,867
#29
Wish Intel would give us a new generation Avoton.
They finally announced C3000 parts, with a vague "later this year" promise.

  • PCIe 3.x instead of version 2.0
  • USB 3.1 Gen 2 on at least 1 port, preferably 2 or more ports
  • Still within the 20 watt power envelope (14nm fabrication?)
  • DDR4 memory
  • All 6 SATA ports running at 6 Gbps
  • Misc. CPU instruction updates (ones added since 2013, the original Avoton's release year)
Except for USB, all those are confirmed. Up to 16 SATA ports, too (at the expense of PCIe lanes).
10GbE is also confirmed.

Some parts do have a larger TDP, but others stay at 20W.
 
Joined
Sep 5, 2015
Messages
12
Thanks
4
#30
My ASRock C2550D4I stopped working. It failed to respond after running overnight, so I used the IPMI to check it out.
This is a self-built device, in use since about October 2015. In most reported cases the motherboard dies at power-up. Mine died while running.
This was the result:


According to the CPU sensor, it was on fire:


(Strange that the management software does not react to such odd values?)

The motherboard fails to boot now. Hard drives are spinning. No (VGA) output.
Only the IPMI Management interface works.



I hope someone has answers to the following questions:
* I am missing the 3, 5 and 12 volt values. Is this normal in a standby situation? In other words, can I rule out the PSU?
* Is there a solution for this issue? Just replacing it with the same revision is asking for trouble in the future.
* How do I know/prove that I get a fixed motherboard/CPU?
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,139
Thanks
3,867
#31
I am missing the 3, 5 and 12 volt values. Is this normal in a standby situation? In other words, can I rule out the PSU?
Yes. The only standby rail in the ATX spec is +5VSB. For some reason, the board seems to monitor a +3.3VSB rail, which has to be locally regulated, since no ATX PSU supplies such a thing.

As for your problem, the CPU temperature reading could just be a fluke. As for the board dying mid-operation, the buggy bus failing while the system is running is a possibility if it's being used for low-speed I/O or something (which I think someone said it was).
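
For anyone who wants to sanity-check readings like that automatically, here is a minimal sketch in Python. It assumes sensor rows in a pipe-delimited "name | value | unit | status" form; the sample rows, the plausibility limits and the function names are all made up for illustration and are not taken from any datasheet or from the poster's board.

```python
# Hypothetical plausibility limits per unit (assumed values, not from a datasheet).
PLAUSIBLE_RANGE = {
    "degrees C": (0.0, 110.0),
    "Volts": (0.0, 14.0),
}


def parse_sensor_line(line):
    """Split one pipe-delimited row into (name, value, unit).

    Returns None for rows with fewer than three fields; value is None
    when the reading is not numeric (e.g. "na" for a rail that is off).
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 3:
        return None
    name, raw, unit = fields[0], fields[1], fields[2]
    try:
        value = float(raw)
    except ValueError:
        value = None
    return name, value, unit


def implausible_readings(lines):
    """Return the (name, value, unit) rows whose value is outside the assumed range."""
    flagged = []
    for line in lines:
        parsed = parse_sensor_line(line)
        if parsed is None:
            continue
        name, value, unit = parsed
        if value is None or unit not in PLAUSIBLE_RANGE:
            continue  # nothing to judge: unreadable value or unknown unit
        low, high = PLAUSIBLE_RANGE[unit]
        if not (low <= value <= high):
            flagged.append((name, value, unit))
    return flagged


# Made-up sample rows, only to exercise the functions above.
SAMPLE = [
    "CPU Temp | 255.000 | degrees C | cr",
    "5VSB | 5.040 | Volts | ok",
    "+12V | na | Volts | na",
]

if __name__ == "__main__":
    for name, value, unit in implausible_readings(SAMPLE):
        print(f"{name}: {value} {unit} looks implausible")
```

A flagged value only says the number is outside the assumed range. It cannot tell a sensor glitch from a real fault, so treat it as a prompt to look closer rather than a diagnosis.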
 

fta

FreeNAS Experienced
Joined
Apr 6, 2015
Messages
148
Thanks
33
#32
My ASRock C2550D4I stopped working. It failed to respond after running overnight, so I used the IPMI to check it out.
This is a self-built device, in use since about October 2015. In most reported cases the motherboard dies at power-up. Mine died while running.
Mine died in the exact same way a week ago. I'm still waiting for iX to ship my RMA.
 
Joined
Oct 8, 2014
Messages
62
Thanks
2
#33
I purchased my ASRock C2750D4I for my home FreeNAS build in November 2014. In January 2016 I had to RMA it because it died with these symptoms. (IPMI worked, but nothing else.) Now, my new one died the same way two days ago and I'm waiting to get an RMA before I can send mine in. Two failures in a little over two years - so frustrating. What good is a 3-year warranty when you have to be in constant fear of losing all your data and possibly having to revert to backups? If it weren't for this reliability problem, this would have been the perfect board for a 24x7 home-based NAS. I'm not sure what to do for the long term if and when I get my RMA replacement. I'm not sure if there are any other low-power, multi-core options out there, which is sad! You would think someone like AMD would see an opportunity here.

As a side note, Intel and the rest of these manufacturers are really dropping the ball by apparently trying to cover this up. At my office, we have multiple high-end Cisco firewalls and switches that use this chipset across our entire network. Cisco has yet to contact us regarding this. You would think they would have at least proactively notified us, rather than leaving us to find out only because my little home NAS died.
 
Joined
Mar 11, 2016
Messages
55
Thanks
4
#34
I also have a C2750D4I board, purchased in January 2016. I've already contacted my dealer, who told me that the board is END OF LIFE and I can't get a new one. I could at least send the board back to get my money back, but then my production FreeNAS system is gone :(

So I've now contacted ASRock directly to see if they have another board or a solution for my dilemma.

Today I got a response from ASRock. Basically, they left me with a potential risk of sudden failure, and I have to look for a way to mitigate the impact if the board should suddenly decide to die :( I'm not sure how high the chances are, but I expect a 50% chance of failure in the next 12 months.


Original-Response
Code:

We are aware of the unexpected failure from Intel's processors. Our R&D engineering
team has been working tirelessly with Intel to provide them the boards that failed
and coordinate with Intel to implement fixes. However, there is no evidence if this
failure occurs on all Intel-based server boards. If your Avoton-based server board
appears to have failed, and the cause of failure is confirmed to be related to Intel
silicon, we will help you replace it. 

 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,139
Thanks
3,867
#37
Probably between the processor and the back panel.
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,139
Thanks
3,867
#39
Thanks for the picture, but I meant the top side. The back is mostly passives and what I'd guess to be switching regulators, at first glance.
 
Joined
Aug 7, 2016
Messages
5
Thanks
0
#40
I have been hit by the same issue (see post here).
So is there a better MB option for homemade FreeNAS servers to replace the ASRock Mini (C2550D4I)?

There is no point in just replacing it with the same board if the issue is going to hit you again.
 