Ryzen Stability on 11.0-U4

Status
Not open for further replies.

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
Just wanted to post an update on the stability of my Ryzen 1700 system. I was getting random crashes on the NAS between 2 and 6 am every 3-5 days or so (it would lock up and display nothing on the monitor, with no error logs). Anyway, I have been stable for 10 days so far.

~ I disabled C6. (I don't think this did anything, because some people reported a crash after disabling it.)
~ I turned off SMT. (I don't think it has anything to do with the SMT bug, because there are no high CPU loads in the middle of the night, but I did want to test it. I don't need the threads anyway; cores > threads.)
~ I disabled Cool'n'Quiet. (I think this is the cause of the issue: the chip is dropping to a very low power state and crashing.)

Next step is to turn on C6 and SMT and test for another 10 days, then upgrade to 11.1. I expect 11.1 to be a lot more stable; it appears that Ryzen might get official support in 11.1 as well (?) (documentation).

Anyway I hope this helps some other people. Thank you everyone for working so hard on making FreeNAS amazing. Keep up the great work Devs.
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
Thanks for the update! Good to hear. I vote for trying SMT first as that's the feature I'd want back first.
 

ahze

Cadet
Joined
Apr 10, 2016
Messages
1
Thanks for the update. I will try disabling Cool'n'Quiet, as I have already tried disabling both C6 and SMT without success on a 1700.
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
So good news! FreeNAS-11.0-U4 Ryzen has been stable for another 10 days! I enabled everything except Cool'n'Quiet and it seems to be working as intended. My best guess is that when the NAS gets very inactive the CPU tries to go into a super low powered mode that crashes FreeNAS. So if you are getting random crashes in the middle of the night with no output or logs on a Ryzen system, I would recommend disabling Cool'n'Quiet.
Of course this only treats a symptom; it is not a solution. So a bug report should probably be filed with either FreeNAS or FreeBSD.

I am very happy to say that other than this one issue (and Windows 10 under bhyve), Ryzen (1700) has been performing perfectly on FreeNAS.

Keep up the great work!
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
Well this is unfortunate, I crashed today. Black screen, completely random, maybe on a write cycle. I still have not set up a remote syslog server, so still no logs... Hoping for 11.1; otherwise I might have to rebuild the server with some Intel parts. :(

Ryzen not stable on 11.0-U4 -- Cheers

Edit 2: https://redmine.ixsystems.com/issues/25987 Might have to reopen
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
See what 11.1 brings. It's supposed to be released next week.
 

ykhodo

Explorer
Joined
Oct 19, 2017
Messages
52
Mine is crashing on 11.1 as well. In fact, it seems to be more frequent than on 11.0 with Cool'n'Quiet disabled.
 
Joined
Apr 9, 2015
Messages
1,258
For those having issues with Ryzen and FreeNAS I would think about getting a copy of FreeBSD and doing some stability testing. Ryzen is VERY new and while things are being worked on it has not been recommended for use with FreeNAS as it can not be considered stable yet. Even though it should work OK with 11.1 it has not been tested.

By using straight FreeBSD and running a bunch of benchmarks you can find out whether it is (a) stable or (b) unstable. If it is stable on FreeBSD, then file a bug report with iXsystems; if it isn't, then file a bug report with FreeBSD.

But honestly there are going to be a bunch of issues. It's a new architecture, and it will probably be at least another 6 months before it is good enough for use in FreeNAS. I would not look at using it until around the time FreeNAS 12 comes out.
 

ykhodo

Explorer
Joined
Oct 19, 2017
Messages
52
I finally ran an rsyslog server and these are the last messages that came before the black screen of death.

Dec 29 08:32:20 htpc ZFS: vdev state changed, pool_guid=9753480041264982941 vdev_guid=17184261080127710914
Dec 29 08:33:00 htpc /usr/sbin/cron[3096]: (root) CMD (/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1)
Dec 29 08:33:00 htpc /usr/sbin/cron[3097]: (operator) CMD (/usr/libexec/save-entropy > /dev/null 2>&1)

There was only one other time "vdev state changed" appeared in the log during the last week of uptime, and it was not quickly followed by autosnap.py or save-entropy.
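For anyone who wants to repeat that check on their own captured log, here is a minimal sketch in Python. It assumes classic BSD-style syslog lines that start with a "Dec 29 08:32:20" timestamp and are in time order; the function names, the 60-second window, and the default year are my own choices for illustration, not anything FreeNAS provides.

```python
from datetime import datetime, timedelta

def parse_ts(line, year=2017):
    # Classic syslog timestamps ("Dec 29 08:32:20") carry no year,
    # so the year is an assumption supplied by the caller.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def followers(lines, marker="vdev state changed", window_s=60, year=2017):
    """For each line containing marker, collect the lines logged
    within window_s seconds after it. Assumes lines are in time order."""
    parsed = [(parse_ts(l, year), l.rstrip("\n")) for l in lines if l.strip()]
    window = timedelta(seconds=window_s)
    hits = []
    for i, (ts, text) in enumerate(parsed):
        if marker in text:
            near = [t for (s, t) in parsed[i + 1:] if s - ts <= window]
            hits.append((text, near))
    return hits

# The three lines posted above, used as sample input.
SAMPLE = [
    "Dec 29 08:32:20 htpc ZFS: vdev state changed, pool_guid=9753480041264982941 vdev_guid=17184261080127710914",
    "Dec 29 08:33:00 htpc /usr/sbin/cron[3096]: (root) CMD (/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1)",
    "Dec 29 08:33:00 htpc /usr/sbin/cron[3097]: (operator) CMD (/usr/libexec/save-entropy > /dev/null 2>&1)",
]

for event, near in followers(SAMPLE):
    print(event)
    for line in near:
        print("    followed by:", line)
```

To run it against a real capture, pass `open("your.log")` instead of `SAMPLE` (the file name is whatever your rsyslog server writes to). It only shows what was logged near each event; it cannot say whether the vdev state change is the cause of the crash.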
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
Interesting ykhodo, I just updated to 11.1 in the last 2 days, so I have not crashed yet, but I will also try to get my rsyslog server going to capture some more logs.

-- A dev will probably have to explain those logs to me because I am not sure what is going on.
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
Made it 5 days.... Got this from my rsyslog, but I think it is meaningless.

Code:
Jan  2 03:25:00 nas /usr/sbin/cron[48050]: (root) CMD (/usr/libexec/atrun > /dev/null 2>&1)
Jan  2 03:25:00 nas /usr/sbin/cron[48049]: (root) CMD (/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1)
Jan  2 03:26:00 nas /usr/sbin/cron[48211]: (root) CMD (/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1)

Well Ryzen, I tried really hard to use you and make you acceptable, but you are still not stable after a year. --- I will probably gut the server and rebuild. Save Ryzen for personal computers; it's not FreeNAS stable. Maybe Threadripper or EPYC is different, but 11.1 completely crashes randomly on the Ryzen 1700. Cheers

Edit 1: -- opened bug report -- https://redmine.ixsystems.com/issues/27537
 

ykhodo

Explorer
Joined
Oct 19, 2017
Messages
52
Ugh I really don't want to scrap my RAM, motherboard, and CPU. It's $770 in parts
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
Well I am moving everything important off it, but I think I will keep it doing Plex duties right now... but yeah I don't want to replace it with equivalent Intel parts.. especially since I would probably have to get RDIMM sticks. -- I was so happy earlier in the morning because I finally got Windows 10 in a VM on it. Live and learn I guess. Thankfully ZFS storage is pretty robust, so my movies would be fine.
 

ezra

Contributor
Joined
Jan 15, 2015
Messages
124
Same here, it ran for over a day, then after some burn-in tests it failed, roughly an hour into letting some commands run. No logs, etc. How could one set up a remote rsyslog server, or the client from within FreeNAS? Any guides?
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
In System -> General you can specify a remote syslog server, though the output looks mostly useless right now. I think it is some type of memory leak, so watch the memory reports. Disabling Cool'n'Quiet definitely helped in 11.0-U4. I am working with the devs right now; you might try disabling the watchdog. It might take a little while, but eventually we will find the bug.
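If you don't have an rsyslog box handy, a throwaway receiver on another machine is enough to catch the last messages before a lockup. The following is only a rough sketch in Python, not a replacement for rsyslog: it assumes the NAS sends plain BSD-style syslog over UDP, and the port (5514, chosen so it runs without root) and log file name are placeholders I made up. Whether your FreeNAS version lets you point the syslog field at a non-default port is something to check on your side; otherwise bind to 514 as root.

```python
import socket

def split_pri(datagram: bytes):
    """Split a BSD-syslog datagram into (facility, severity, message).
    The leading <N> priority encodes facility * 8 + severity."""
    text = datagram.decode("utf-8", errors="replace").rstrip("\n")
    if text.startswith("<") and ">" in text[:6]:
        pri, _, rest = text[1:].partition(">")
        if pri.isdigit():
            n = int(pri)
            return n // 8, n % 8, rest
    # No recognisable priority header: keep the raw text.
    return None, None, text

def serve(path="ryzen-nas.log", host="0.0.0.0", port=5514):
    """Append every received syslog datagram to path. Runs until interrupted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    # buffering=1 is line-buffered, so each message reaches the file promptly.
    with open(path, "a", buffering=1) as log:
        while True:
            data, (addr, _port) = sock.recvfrom(8192)
            fac, sev, msg = split_pri(data)
            log.write(f"{addr} fac={fac} sev={sev} {msg}\n")
```

Call `serve()` on the receiving machine and leave it running, then enter that machine's address as the remote syslog server on the NAS. Because the receiver is a separate box, whatever the NAS managed to send before freezing survives the crash.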
 

ezra

Contributor
Joined
Jan 15, 2015
Messages
124
Got it fixed just after my post... sorry for not looking. Got a thread on the watchdog? It seems the failures in my previous post weren't related to FreeNAS/Ryzen.

It seems that one of my new SSDs was outputting a massive number of errors, and the freezes occurred while burning in my SSDs/HDDs. I took the failed SSD out and it has been running for some hours now without any error messages.

I suppose it will fail sometime soon, but I'll keep going and wait for the fix. Please tell me if I can do anything to help debug this.
 

ezra

Contributor
Joined
Jan 15, 2015
Messages
124
Right. I've only disabled Cool'n'Quiet so far. It only froze up on me once, yesterday, after about a week. No black screen, just a frozen screen. I have to check the logs later today.
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
This thread is specifically about the black screen instability that affects all Ryzen processors. Is your issue related?
 

wackymole

Explorer
Joined
Aug 21, 2017
Messages
59
So, the devs are still investigating, but have no leads right now. So I will continue to search for symptoms and potential causes.

1) I have strong reason to believe that it is memory (RAM) related. (My system seems to die when I use all my RAM. The memory leak bug is not helping; it should be patched in the next update.)
2) Power management or motherboard power saving could also be related. (Cool'n'Quiet and power optimizations.)

I have three primary questions:
1) Is everyone running ECC RAM in a compatible motherboard?
2) Is anyone getting crashes running non-ECC RAM?
3) What motherboard are you using? -- (mine: ASRock AB350 PRO4)

Have a great week!-- Cheers
 

Attachments

  • lastpic(freenas).png (17.6 KB)