Ryzen Stability on 11.0-U4

Status
Not open for further replies.

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,377

ezra

Contributor
Joined
Jan 15, 2015
Messages
124
Unfortunately, there is no new BIOS update that addresses this issue. It will probably be resolved in the new AMD chips (Zen2). I am trying a new possible solution.
http://forum.asrock.com/forum_posts...inux-freezes-on-asrock-x370-taichi-c6-enabled

C6 State - Package - Disabled
C6 State - Core - Enabled

Disabling the C6 Package
We know it's an idle power issue, so maybe this will work.

Any update on this?

What speeds and voltages is everyone running on the RAM?
Auto.

I tried updating the ASRock BIOS to 4.60 and tinkering with the various new power state settings, to no avail...
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
Any update on this?
It lasted for 3 weeks, but sadly it still blacked out in the end. The computer is just collecting dust right now; I might try with a Ryzen 2 CPU. It would still be a very powerful workstation for someone in the future.

EDIT: My Flash drive actually failed... I will have to restart the test when I get a new drive. Did not reboot it before posting
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
It does not look like they solved the issue in 2xxx. The hardware error only happens at complete idle so anything that keeps it from touching idle should prevent the error from happening. Disabling C6 completely, running a program continuously or overclocking will solve the issue. I might do a slight overclock on my system and report back, weird for a server, but the issue is still not acknowledged by AMD. :(
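
A minimal sketch of the "run a program continuously" workaround mentioned above, in Python. It spins every core at a fixed duty cycle so the CPU has less chance to sit at complete idle. The function names `burn` and `keep_awake`, the 50% duty cycle, and the 10 ms period are all assumptions for illustration; whether short sleeps like these actually keep a given board out of C6 has not been verified here.

```python
import multiprocessing as mp
import time


def burn(duty: float, period: float, duration: float) -> int:
    """Busy-loop for `duty` of each `period` (seconds), sleep the rest.

    Runs for roughly `duration` seconds and returns the number of
    completed periods.
    """
    if not 0.0 < duty <= 1.0:
        raise ValueError("duty must be in (0, 1]")
    deadline = time.monotonic() + duration
    periods = 0
    while time.monotonic() < deadline:
        busy_until = time.monotonic() + duty * period
        while time.monotonic() < busy_until:
            pass  # spin: this is the artificial load
        time.sleep((1.0 - duty) * period)
        periods += 1
    return periods


def keep_awake(duty=0.5, period=0.01, duration=60.0, workers=None):
    """Run `burn` in one process per core (hypothetical helper)."""
    workers = workers or mp.cpu_count()
    with mp.Pool(workers) as pool:
        return pool.starmap(burn, [(duty, period, duration)] * workers)
```

To use it, call `keep_awake()` from under an `if __name__ == "__main__":` guard and restart it in a loop or from a scheduler. The obvious cost is power: the machine never reaches its low idle draw, which is the same trade-off as disabling C6.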
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
I think someone should do some tests with the RAM timings. I built a Ryzen 5 1600 HTPC recently and I had to underclock the RAM. I used the spec'd timings and voltage but set a slower frequency. My system went from random crashes and other issues to fully stable.
 

ezra

Contributor
Joined
Jan 15, 2015
Messages
124
This is just too weird. I haven't touched my system (with regard to these lockups) for over a month, I guess. I still run the Monero mining app on all cores at 50%, when idle only. No freezes whatsoever.

I did install the latest mobo firmware, and it had a lot of new options like the C6 ones, but... it's just one big magic show for me in the BIOS... I tried various configs but can't really say any one of them made a big difference.
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
My usb drive failed during the test, I will get a new one and restart the test.

Edit: Started the test 3 days ago on an SSD.
I will check again in a month... cheers
 

tekdad

Cadet
Joined
Jul 1, 2017
Messages
1
I got my ryzen 1700 CPU at launch and decided to try freenas, but would get the random lock ups with nothing helpful in the logs. So I switched to Ubuntu and had the same issues until I used this https://github.com/r4m0n/ZenStates-Linux


Perhaps this may help with freenas too? I've been looking at coming back to freenas.
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
I got my ryzen 1700 CPU at launch and decided to try freenas, but would get the random lock ups with nothing helpful in the logs. So I switched to Ubuntu and had the same issues until I used this https://github.com/r4m0n/ZenStates-Linux


Perhaps this may help with freenas too? I've been looking at coming back to freenas.

Update the BIOS and play with Advanced > AMD CBS > Zen Common Options > Power Supply Idle Control. There should be three options there. You need to select the option that disables (some) C6 power saving.

Still testing, at day 11. I will wait a full 30 days before making any proclamations. On Nightly builds with my 1700. After 30 days I will test again with a full hard drive and daily active use before making any recommendations to the community.
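
For long soak tests like this, where a freeze leaves nothing helpful in the logs, a heartbeat file can at least show when the box stopped responding. A minimal sketch in Python, assuming a hypothetical log path and a 60-second interval; the names `write_heartbeat`, `find_gaps`, and `run` are made up for illustration and are not part of FreeNAS.

```python
import os
import time
from datetime import datetime, timezone
from pathlib import Path


def write_heartbeat(path: Path) -> None:
    """Append the current UTC time as one ISO-8601 line and force it to disk."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(datetime.now(timezone.utc).isoformat() + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # so the last line survives a hard freeze


def find_gaps(path: Path, interval_s: float, slack: float = 3.0):
    """Return (before, after) timestamp pairs spaced more than slack * interval_s apart."""
    stamps = [
        datetime.fromisoformat(line.strip())
        for line in path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    limit = slack * interval_s
    return [(a, b) for a, b in zip(stamps, stamps[1:])
            if (b - a).total_seconds() > limit]


def run(path: Path, interval_s: float = 60.0) -> None:
    """Write a heartbeat forever; stop it with Ctrl-C or a service manager."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_s)
```

After a lockup and reboot, `find_gaps(path, 60.0)` lists the stretches where no heartbeat was written. The first timestamp of a pair is the last moment the system was known to be alive.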
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
Good news everyone it has been 31 days running, no problems! I have been running the 11-Nightlies. So I am optimistic, but I am not ready to declare victory. I will load up a complete FREENAS setup and test for another 30 days. And if all goes well I will post a fresh post and declare it stable. This has been almost a year since I have tried to get a stable Freenas setup, I had to move my business setup to another solution while I battled Ryzen Freenas, but hopefully this signals a new start.

This all stems from AMD's faulty hardware deep sleep (C6) mode, which has affected all OSes (Windows, Linux...). Before the latest BIOS, the solutions were to overclock (so the system never entered deep sleep), run a bare-metal hypervisor on top, or change the C6 settings somewhere else. The BIOS update disabled some C6 features, so you are actually losing some really low power usage (<10 watts), a fair trade for servers that have periods of inactivity. I don't know what solutions they developed for the EPYC processor line, but Ryzen and Threadripper were confirmed to be affected. I don't know for sure on 2xxx. Anyway, I hope to get the next 30-day test started soon on a regular FreeNAS branch.


-WM
 

Sejrup

Cadet
Joined
Oct 2, 2016
Messages
1
Big thanks to you for testing this out. I was in the same situation with a Ryzen build freezing for no apparent reason. Unfortunately I found this thread after having replaced mobo and CPU with an Intel. But perhaps I will revert to Ryzen again :smile:
 

ykhodo

Explorer
Joined
Oct 19, 2017
Messages
52
Good news everyone it has been 31 days running, no problems! I have been running the 11-Nightlies. So I am optimistic, but I am not ready to declare victory. I will load up a complete FREENAS setup and test for another 30 days. And if all goes well I will post a fresh post and declare it stable. This has been almost a year since I have tried to get a stable Freenas setup, I had to move my business setup to another solution while I battled Ryzen Freenas, but hopefully this signals a new start.

This all stems from AMD's faulty hardware deep sleep (C6) mode, which has affected all OSes (Windows, Linux...). Before the latest BIOS, the solutions were to overclock (so the system never entered deep sleep), run a bare-metal hypervisor on top, or change the C6 settings somewhere else. The BIOS update disabled some C6 features, so you are actually losing some really low power usage (<10 watts), a fair trade for servers that have periods of inactivity. I don't know what solutions they developed for the EPYC processor line, but Ryzen and Threadripper were confirmed to be affected. I don't know for sure on 2xxx. Anyway, I hope to get the next 30-day test started soon on a regular FreeNAS branch.


-WM
So update my BIOS to the latest? c6 is disabled by default?
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
So update my BIOS to the latest? c6 is disabled by default?
Yes, update to the latest BIOS. It does not disable C6 completely unless you do some other tricks. AMD CBS > Zen Common Options > Power Supply Idle Control -- I used the "typical power settings"; it might have a different name or be in a different area depending on the manufacturer.

Edit 1: Just got some Windows VMs working properly (it's not straightforward: use 2 vCPUs for the install [NOT 1 vCPU], then install the VirtIO network driver for more than 2 vCPUs). Eh, the 11.2 beta is nice, but there are still quite a few bugs.

Edit 2: Plex doesn't work in the beta.

Edit 3: The VirtIO network driver is "sometimes" not required for more than 2 vCPUs, depending on how you installed Windows. There is a current bug in the e1000 driver with Windows (restricted to 2 processors) (ref: https://redmine.ixsystems.com/issues/24350)

Edit 4: Just for reference, only 1 vCPU doesn't work right, because Windows selects the wrong HAL. However, due to a bug that was just fixed in the 11.2 beta, AMD CPUs could only install with 1 vCPU, resulting in a lagging, unstable mess. The 11.2 beta has a bunch of bugs but no crashes. I will probably try to do my 30-day test on the 11.2 beta; I already completed another 7-day test on 11.1.

Edit 5: My windows VMs keep crashing randomly... :(
 

ykhodo

Explorer
Joined
Oct 19, 2017
Messages
52
So this is still broken for me but in a new way. It's not a soft lockup, the system just freezes.

I enabled `typical power settings`, disabled `cool 'n quiet` and disabled all c states.

The way to reproduce is to start a scrub on the main data store. The only thing that fixes this issue is disabling SMT.
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
ykhodo, I am sorry you are still having problems. What is the difference between your soft lockup and the freezes? What branch?
I reset all my BIOS settings to default except for SVM (VMs).
I have started a scrub, I will edit this when it finishes. Should take less than a day. ~ 4 hours actual estimate 2*4TB

Edit 1: late on edit, but it finished after 4 hours.. no problem. I am on the 11.2 beta though

At least you actually have errors now, the devs can start fixing some other bugs now
 

ykhodo

Explorer
Joined
Oct 19, 2017
Messages
52
ykhodo, I am sorry you are still having problems. What is the difference between your soft lockup and the freezes? What branch?
I reset all my BIOS settings to default except for SVM (VMs).
I have started a scrub, I will edit this when it finishes. Should take less than a day. ~ 4 hours actual estimate 2*4TB

The soft lockups would just turn the screen black. The current freezes I can actually see STDERR messages on tty1 from some of my jails, but I can no longer ping the system and I can't switch ttys. No kernel panics or anything.

I am on FreeNAS-11.1-U5

I just checked the logs and saw the following:

Code:
Jul  6 04:37:31 hadrian /middlewared[245]: dnssd_clientstub DNSServiceRefSockFD called with invalid DNSServiceRef 0x82001a8a0 FFFFFFFF DDDDDDDD
Jul  6 04:37:31 hadrian /middlewared[245]: dnssd_clientstub DNSServiceProcessResult called with invalid DNSServiceRef 0x82001a720 FFFFFFFF DDDDDDDD
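
When hunting for the last thing logged before a freeze, it can help to split lines like the two above into fields. A rough sketch in Python, assuming the classic syslog layout seen here (month, day, time, host, process[pid], message); the function name `parse_syslog_line` is made up for illustration, and lines in any other layout simply return `None`.

```python
import re
from typing import Optional

# Matches e.g. "Jul  6 04:37:31 hadrian /middlewared[245]: message"
_SYSLOG_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s*(?P<msg>.*)$"
)


def parse_syslog_line(line: str) -> Optional[dict]:
    """Split one syslog-style line into fields, or return None if it does not match."""
    m = _SYSLOG_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    fields = m.groupdict()
    fields["day"] = int(fields["day"])
    # The pid is optional in this layout, so keep None when it is absent.
    fields["pid"] = int(fields["pid"]) if fields["pid"] is not None else None
    return fields
```

Note that this layout carries no year, so the sketch leaves the date as separate fields rather than guessing one.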
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
I am on FreeNAS-11.1-U5
I hate to tell you to wait, but 11.2 has a ton of bug fixes in it, and some Ryzen fixes. The 11.2 beta is pretty buggy, but it hasn't crashed on me so far. I still want a stable Windows VM, so I am not going to start a 30-day countdown until I can stabilize it.
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
I have a Ryzen 5 1600 and an ASRock AB350M Pro4 paired with G.Skill Trident Z DDR4 3200, running Windows. I've found that even though I have the latest BIOS, I am unable to run at 3200 MHz, as Windows will crash and reboot. I can, however, get a stable system at 2800 MHz. Have you all tried dialing down the memory frequency?
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
@wackymole was running 11.2 nightlies to get ryzen stability.
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
I have a Ryzen 5 1600 and an ASRock AB350M Pro4 paired with G.Skill Trident Z DDR4 3200, running Windows. I've found that even though I have the latest BIOS, I am unable to run at 3200 MHz, as Windows will crash and reboot. I can, however, get a stable system at 2800 MHz. Have you all tried dialing down the memory frequency?

You are talking about Windows only, no VM, right? Ryzen 1 had some serious RAM incompatibility at the start because of the new architecture. I think there are still some issues, but this has nothing to do with FreeNAS or Windows VMs. That would be between the motherboard manufacturer and the RAM manufacturer. I am running server ECC at 2400 MHz. There are plenty of resources to help you out with RAM compatibility when buying now. I think it has improved a lot since launch and with 2xxx.

The issues I am facing now have to do with the new interface and bhyve in the 11.2 nightlies. One of the big bugs is that shutdown does not seem to actually do anything to the VM: it says it's off, but it continues running. And the VNC ports are completely wonky...

-WM
 