RAM Fried!

Status
Not open for further replies.

DrMuffin

Dabbler
Joined
Feb 8, 2014
Messages
11
Out of nowhere my server went down. The fans and hard drives were spinning, but I couldn't access the GUI; I could only reach the board via IPMI. A forced shutdown and reboot gave me beep codes for a memory error. Here is what I found:

http://imgur.com/a/gWjYt

It looks to me that it burnt somehow, which I've never seen before! I'm getting a refund and switching to Samsung from now on, I just hope my data and board are OK. The board boots fine and I can access it via IPMI still, so I think it survived. I won't know for sure till some new RAM comes in though.

The RAM was a Crucial stick model CT102472BA160B so if you have the same you may want to take a look. Or I could have just got a lemon, which is likely because Crucial has been good in the past.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
So, Cyberjock, D4nthr4x, and I are all looking over these photos. If your camera can do super-macro mode or something, and you can really get in close on these surface mount components, we can give you more pointers. We all feel like we see evidence of some of the surface mount components shorting out, but it's hard to see. It would be very very very unusual for Crucial/Micron server memory DIMMs like this (assuming you bought them new) to have this kind of screw-up. Almost unheard of. Not sure what happened.

Also, your IPMI has its own whole deal. Think of it as a second autonomous computer on the motherboard. Just because that works doesn't mean the main motherboard functionality is still alive.

We'd be interested to find out exactly what does and does not work once you try it with new memory.
 

D4nthr4x

Explorer
Joined
Feb 28, 2014
Messages
95
So on the middle picture, is that blue color copper corrosion on the stick? If you can get a better picture that would be awesome. Also, can you see if there is any corrosion on your motherboard? Is this server stored in a very humid climate? And what is your PSU? I'm starting to think it's not the memory but possibly a different issue.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'd make this kind of assumption....

1. If the system works with the new RAM it was probably a short or something in the RAM.
2. If your system won't POST with the new RAM, it's also obvious that the system has been damaged. That could be from overvolting and/or a short that applied voltage somewhere it shouldn't have.

Normally I'd expect that if there was an overvolt condition leading to overcurrent of this magnitude you'd see current damage on the motherboard itself. I do have concerns considering the actual pins on the RAM appear to have been heat stressed. So this *may* be a dead system. Does the IPMI log any overvoltage or anything in the logs? That would be the place to look.

On the plus side, your hard drives shouldn't have been damaged unless it was a PSU problem. And my guess is it wasn't since you can power it on enough to get into the IPMI without more components smoking themselves.

If you can get high-detail pictures of the RAM (maybe from an angle too), that might be interesting. Corrosion can also occur rather quickly when metal is heated to the point that it wants to bond with the oxygen in the air. So the corrosion could be a symptom or a cause. But, considering the conditions your server was in, I'd think it is more likely a cause.
 

DrMuffin

Dabbler
Joined
Feb 8, 2014
Messages
11
Alright guys, I remembered I have a jeweler's magnifying glass, so I used that with my phone's camera to try and get some good macro shots:

http://imgur.com/a/q4CSe

Looking at these you can really see the corrosion, or at least what looks like corrosion. The environment is not humid or really hot, the hard drives stay between 27C-30C for what it's worth.

The IPMI event log doesn't show anything, just some chassis intrusion which doesn't help.

My power supply is a Rosewill RV350-2, not the greatest PSU in the world, but it does the job. I'm still baffled at this whole corrosion deal though!
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Definitely looks like corrosion.

As I said, almost IMPOSSIBLE to conceive of Micron shipping DIMMs like this. This is almost certainly from some kind of humidity problem or liquid spillage. Could be some kind of electrolyte from one of the surface mount components or something; I don't know enough to be certain.
 

D4nthr4x

Explorer
Joined
Feb 28, 2014
Messages
95
$30, non-80+, Rosewill PSU. I wouldn't put that in any computer, let alone a storage server. The failure could certainly have come from other components, but the likely poor regulation of that PSU probably contributed to your problems. I had corrosion on a mobo once, and I'm not sure what caused it then either. I would recommend pulling the PSU cables out of the mobo and checking for signs of singeing on them as well. Sometimes that is hidden by the connectors. But my view on PSUs is: if the ram dies my pc shuts off data is fine, if the cpu dies my pc shuts off data is fine, if the mobo dies the pc shuts off data is fine, if a hdd dies I replace it because I have redundancy, if the psu dies it potentially takes everything with it and you lose all of your data, or you lose every other component.
 

DrMuffin

Dabbler
Joined
Feb 8, 2014
Messages
11
Very true, my PSU was a leftover and is next on my upgrade list. I had planned on getting the Antec EarthWatts EA-380D some time soon, unless someone has a better recommendation on an efficient and quiet PSU?
 

D4nthr4x

Explorer
Joined
Feb 28, 2014
Messages
95
I like Seasonic 80+ Gold PSUs. They don't appear to be on sale. But maybe try the new RAM first, then we can go from there. Possibly get a PSU tester (Antec makes one) and see what the outputs read at. Not that a tester will be a real-world test.
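If you do take readings with a tester or multimeter, here is a rough sketch of how you might judge them. It assumes the usual ATX tolerance of plus or minus 5% on the +3.3V, +5V, and +12V rails. The function name and the sample readings are made up for illustration, and an unloaded reading still won't tell you how the PSU regulates under real load.

```python
# Nominal ATX rail voltages and the allowed deviation as a fraction of nominal.
# Assumption: +/-5% on the positive rails, per the common ATX guideline.
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
}


def check_rails(readings):
    """Return {rail: (measured, in_spec)} for each measured rail."""
    results = {}
    for rail, measured in readings.items():
        nominal, tolerance = ATX_RAILS[rail]
        low = nominal * (1 - tolerance)
        high = nominal * (1 + tolerance)
        results[rail] = (measured, low <= measured <= high)
    return results


# Hypothetical readings, not taken from the PSU in this thread.
sample = {"+3.3V": 3.31, "+5V": 5.08, "+12V": 11.2}

for rail, (measured, ok) in check_rails(sample).items():
    status = "in spec" if ok else "OUT OF SPEC"
    print(f"{rail}: {measured:.2f} V ({status})")
```

A rail that reads out of range even with no load is a good reason to retire the PSU. A rail that reads fine only tells you it is fine at that moment.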
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
D4nthr4x said:
But my view on PSUs is: if the ram dies my pc shuts off data is fine, if the cpu dies my pc shuts off data is fine, if the mobo dies the pc shuts off data is fine, if a hdd dies I replace it because I have redundancy, if the psu dies it potentially takes everything with it and you lose all of your data, or you lose every other component.

Quoted for truth. That's exactly how I look at it.

Anyway, I took a look at those new pictures. Some of the corrosion looks like some kind of chemical (or food) has interacted with the metal. Some things are very caustic when evaporating and can set off chemical reactions that cause corrosion like what you are seeing. Not sure if you've used some kind of paint, paint thinner, or other really nasty chemical that might cause something like this. Acetone would potentially look like that too, but you'd probably know better if you were using that in an industrial environment.

If you have similar corrosion on your motherboard you should consider replacing it. I'm not sure what kind of environment your server is in, but I keep mine in my basement. It stays much more damp than I'd like, but I don't even see corrosion like that. If I knew I would be using any kind of chemical in my basement I'd definitely ventilate my basement before, during and after using said chemical.

At first the pins on your RAM stick that were damaged looked like heat damage. But the new pictures look like some had the gold plating eaten right off the pins. Almost as if it had been dipped in acid. That is surely not the case, since then all of the pins would have damage. On other pins it looks like the plating may have melted off. I'm not sure if it would have gotten *that* hot though, but who knows.

In any case, it definitely looks to me like the blue corrosion probably resulted in some kind of short that led to the catastrophic failure of the RAM stick you are seeing. Assuming the other components are still in usable condition, I'd expect everything to boot. But again, if any other components have that kind of corrosion, then you not only know that this is somehow self-inflicted (and you should seek out the problem), but they very likely will fail at some point in the future and should be replaced proactively.

Personally, I don't even consider Rosewill PSUs to be something I'd ever use in any computer except one I want to burn up. They're just not exactly high-end stuff, and I'm all about buying the best because of the exact mentality D4nthr4x mentioned above.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
DrMuffin said:
Very true, my PSU was a leftover and is next on my upgrade list. I had planned on getting the Antec EarthWatts EA-380D some time soon, unless someone has a better recommendation on an efficient and quiet PSU?

You can't go wrong with Seasonic. You could get along with something like a budget Corsair, but at low wattages, Seasonic has no competition (in ATX consumer PSUs - I'm sure Delta could run circles around Seasonic's G-Series, but the price would be prohibitive).
For a small server, I typically recommend the Seasonic G-series. A G-360 should be plenty for up to 10 5400RPM disks, but you might want to upgrade to the G-450, since the G-360 isn't modular (the G-450 is semi-modular).

For reference, 80+ is a sort of minimum indicator for quality. If it doesn't manage that, something is seriously wrong (some exceptions apply for extra-large PSUs that would draw too much current at 110V, which means they can't be tested at 110V and thus cannot be certified). 80+ Gold typically gives you some confidence that the product is decent enough to trust.
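To put rough numbers on what efficiency means in practice, here is a small sketch of the arithmetic: wall draw is the DC load divided by efficiency, and the difference is lost as heat inside the PSU. The 100 W load and the 70% figure for an uncertified unit are illustrative assumptions, not measurements of any PSU mentioned here.

```python
def wall_draw_watts(dc_load_watts, efficiency):
    """Power pulled from the wall to deliver dc_load_watts to the components."""
    return dc_load_watts / efficiency


def waste_heat_watts(dc_load_watts, efficiency):
    """Power lost as heat inside the PSU at the given efficiency."""
    return wall_draw_watts(dc_load_watts, efficiency) - dc_load_watts


# Assumed steady load for a small server; pick your own figure.
load = 100.0

# 0.70 is a guess at an uncertified unit, 0.80 is the 80+ threshold,
# 0.90 is roughly what a Gold-rated unit reaches at mid load.
for efficiency in (0.70, 0.80, 0.90):
    wall = wall_draw_watts(load, efficiency)
    waste = waste_heat_watts(load, efficiency)
    print(f"{efficiency:.0%} efficient: {wall:.1f} W from the wall, {waste:.1f} W as heat")
```

For a box that runs around the clock, that waste heat is both extra power cost and extra thermal stress on the PSU itself.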

For greater peace of mind, read PSU reviews. They're almost predictable after you've read a few. There are few surprises these days.
 