Daken00 Posted February 19, 2021
Hi all! I'm running unRAID 6.8.3 on a server with a Gigabyte Tech B450 I AORUS PRO motherboard, an AMD Ryzen 5 2600 processor, 16 GB DDR4, and RAID 4: one 8TB parity drive, one 8TB storage drive, and a 256GB cache drive.
Plug-ins are CA Auto Update Applications, CA Backup / Restore Appdata, Community Applications, Dynamix Cache Directories, Dynamix SSD TRIM, Dynamix System Temperature, Fix Common Problems, Nerd Tools, Tips and Tweaks, Unassigned Devices, Unassigned Devices Plus and User Scripts. Docker containers are Plex, Tautulli, Sonarr, Radarr, Jackett, DelugeVPN and rTorrentVPN. All are binhex or linuxserver.
I have been running unRAID for about a year. In December, I started getting server freezes (no Web UI, console unresponsive) that required a hard reboot. These seemed to happen every 3-5 days, with no consistent timing. I've worked with the guy who built my server, and we've tried a few tweaks, but nothing lasts more than a week. I had even gotten to the point of daily reboots, which I wasn't happy about but which did seem to keep the freezes at bay.
I had suspected that DelugeVPN might be causing issues during downloads, since most of the server freezes were overnight, but I've had a couple during the day, including this morning, when nothing was downloading. Today all that was happening on the server was a local Plex stream.
I mirror my syslog to the flash drive, so I have full logs back to the beginning of the month (attached). That includes four separate freezes (2/9, 2/13, 2/16, 2/19). Any help you guys can provide would be greatly appreciated.
Attachments: syslog.zip, fett-diagnostics-20210219-1534.zip
ChatNoir Posted February 20, 2021
Hello, did you check this section of the FAQ?
JorgeB Posted February 20, 2021
Also run memtest; multiple programs are segfaulting.
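For readers who want to see which programs are segfaulting in a mirrored syslog, here is a minimal sketch. It assumes the usual Linux kernel message shape `name[pid]: segfault at ...`; the function name and the sample lines are made up for illustration and are not taken from the attached logs.

```python
import re
from collections import Counter

# Assumed kernel message shape: "<program>[<pid>]: segfault at <addr> ip ..."
# Program names containing spaces would be truncated to their last word.
SEGFAULT_RE = re.compile(r"(\S+)\[(\d+)\]: segfault at ")


def segfaults_by_program(lines):
    """Count segfault messages per program name."""
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, not from the attached syslog.
SAMPLE_LINES = [
    "Feb  9 02:11:40 Tower kernel: python[4321]: segfault at 8 ip 00007f1c2a3b4c5d "
    "sp 00007ffd11223344 error 4 in libexample.so[7f1c2a300000+1000]",
    "Feb  9 02:15:02 Tower kernel: mono[5555]: segfault at 0 ip 0000000000401000 "
    "sp 00007ffd55667788 error 6",
    "Feb  9 02:16:00 Tower kernel: python[4400]: segfault at 10 ip 00007f1c2a3b4c5d "
    "sp 00007ffd99887766 error 4",
    "Feb  9 02:17:00 Tower kernel: mdcmd (36): check",
]

if __name__ == "__main__":
    for program, count in segfaults_by_program(SAMPLE_LINES).most_common():
        print(program, count)
```

To run it against a real log, pass an open file instead of the sample list. Segfaults spread across several unrelated programs point more toward hardware (RAM, CPU, power) than toward a single bad container.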
Daken00 Posted February 20, 2021 (Author)
@ChatNoir I had not seen that question in the FAQ. Working down through it now.
@JorgeB I had run memtest a few months ago while trying to figure this out, but only ran 1 or 2 passes and had no errors. How many passes should I let it run?
JorgeB Posted February 21, 2021
16 hours ago, Daken00 said: How many passes should I let it run?
The usual recommendation is 24 hours, though if there's a problem, a few hours are usually enough to find it.
Daken00 Posted February 21, 2021 (Author)
9 hours ago, JorgeB said: The usual recommendation is 24 hours, though if there's a problem, a few hours are usually enough to find it.
Okay, it ran over 24 hours and a full 8 passes with no errors. Hopefully we are good there. Now that that test is done, I'll start looking at the FAQ stuff.
Daken00 Posted February 22, 2021 (Author)
Server locked up again overnight. I took a pic of the console before I restarted it (attached).
@ChatNoir I've gone through the article on the lockups, but I can't find the Power Supply Idle Control setting in the BIOS (version F50), and I don't believe the RAM is overclocked. I am going to try upgrading the BIOS to the newest version (F60h) and see if that helps. Let me know if you see anything in what was spit out before the lockup.
ChatNoir Posted February 22, 2021
From the manual: https://download.gigabyte.com/FileList/Manual/mb_manual_b450-aorus-pro-wifi_1002_e_190528.pdf
I see mention of:
Global C-state Control
Power Supply Idle Control
It looks like they are in the Advanced CPU Core Settings section, but this could change depending on BIOS version.
I agree that with only 2 sticks of RAM at 2133MT/s you are good for a 2600.
Vr2Io Posted February 22, 2021
It looks like an idle-related problem. If every call trace contains "RIP: ... cpuidle_enter_state ...", then it is probably related to this.
My suggestion is to try without starting the array or mounting any storage, so the system is as idle as possible, then check whether a trace appears within 1 hour.
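The check described above (does every call trace stop in `cpuidle_enter_state`?) can be sketched as a small log scan. This is only an illustration: it assumes kernel lines of the form `RIP: 0010:symbol+0x.../0x...`, and the function names and sample lines are hypothetical rather than taken from the attached syslog.

```python
import re
from collections import Counter

# Matches kernel lines such as "RIP: 0010:cpuidle_enter_state+0xe8/0x141".
# User-space lines like "RIP: 0033:0x14f2..." carry no symbol and are skipped.
RIP_RE = re.compile(r"RIP:\s+(?:[0-9a-fA-F]{4}:)?([A-Za-z_][\w.]*)\+0x")


def rip_symbols(lines):
    """Count the kernel symbols named on RIP lines."""
    counts = Counter()
    for line in lines:
        match = RIP_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def all_traces_in_idle(lines, symbol="cpuidle_enter_state"):
    """True only if at least one RIP line exists and all of them name `symbol`."""
    counts = rip_symbols(lines)
    return bool(counts) and set(counts) == {symbol}


# Hypothetical sample lines for illustration only.
SAMPLE_TRACE = [
    "Feb 22 03:10:01 Tower kernel: Call Trace:",
    "Feb 22 03:10:01 Tower kernel: RIP: 0010:cpuidle_enter_state+0xe8/0x141",
    "Feb 22 03:10:01 Tower kernel: RIP: 0033:0x14f2c3a1b2d7",
    "Feb 22 04:42:17 Tower kernel: RIP: 0010:cpuidle_enter_state+0xe8/0x141",
]

if __name__ == "__main__":
    print(rip_symbols(SAMPLE_TRACE))
    print(all_traces_in_idle(SAMPLE_TRACE))
```

If other symbols show up alongside `cpuidle_enter_state`, the idle-state explanation is less certain and the remaining traces deserve a closer look.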
Daken00 Posted February 23, 2021 (Author)
I found the Power Supply Idle Control and Global C-States settings this morning. The Power Supply setting was already set to Typical current idle (I think I had changed this previously), but Global C-States was still at Auto. I disabled that, so now I'll wait and see if the server stays up.
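After changing a BIOS idle setting, one way to see which idle states the running kernel exposes is to read the Linux cpuidle sysfs entries. The sketch below assumes the standard layout under `/sys/devices/system/cpu/cpu0/cpuidle/stateN/name`; the function name is hypothetical, and which states actually appear with Global C-States disabled depends on the firmware and kernel driver, which this thread does not cover.

```python
from pathlib import Path


def list_idle_states(cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return the idle state names the kernel exposes for one CPU, in index order."""
    base = Path(cpuidle_dir)
    if not base.is_dir():
        return []  # cpuidle not available here, or not running on Linux
    # Keep only "state<N>" directories and sort them numerically (state2 before state10).
    state_dirs = [p for p in base.glob("state*") if p.name[5:].isdigit()]
    state_dirs.sort(key=lambda p: int(p.name[5:]))
    return [(p / "name").read_text().strip() for p in state_dirs]


if __name__ == "__main__":
    print(list_idle_states())
```

Comparing the output before and after the BIOS change gives a rough confirmation that the setting took effect, without waiting days for the next freeze.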
Daken00 Posted March 3, 2021 (Author)
After almost 8 days of uptime, I'm cautiously optimistic that disabling the Global C-States setting has fixed my issue. Thanks for all the assistance!