Optico89, posted April 19, 2023:

Hello,

Now that I'm back on Unraid 6.11.5 with a fresh install, the server is constantly restarting/crashing, resulting in unclean shutdowns. This happens even if I don't start the array at all and just let it sit after it restarts. It appears to be random: it can happen back to back, or after 5 minutes, or 10, etc.

In the hour or so I've been troubleshooting, it has restarted around 5 times already, and out of those 5 I've only started the array once, at the beginning, to install the following apps: Community Apps, Nvidia Driver, Auto Turbo Write Mode, GPU Statistics, My Servers.

I've done a clean reboot to force a clean system reset, with no resolution. The server keeps randomly restarting.

Attached is the latest syslog file from about 2:05 PM. It's 2:15 PM right now.

Equipment / Setup:
Motorola DOCSIS 3.1 cable modem
pfSense router, built box, Version 2.60 Stable
Netgear MS108EUP Managed Ultra60 PoE Multi-Gig Plus switch
No parity drives
4x 10TB HDDs
1x 1TB SSD cache drive
1x 1TB NVMe Samsung 980 Pro cache drive
HBA controller: LSI SAS 9300-8i 8-port in IT mode
GPU: Nvidia Quadro P4000
CPU: Ryzen 9 5900X 12-core 24-thread
Motherboard: ASRock X570 Pro 4 P3.10
RAM: TeamGroup T-Create 32GB DDR4 3600MHz PC4-28800 CL18

Attachment: syslog

(Edited April 19, 2023 by Optico89)
Optico89, posted April 19, 2023:
Duration before self-restart: @ 1520hrs - 1hr 27min. Looking at the syslog right after, this is what the errors/warnings show. Any idea what this could be, and whether it may be related to the random restarts?
Optico89, posted April 19, 2023:
Another random restart: 15:57hrs - 37 minutes since the last. This time, here's what the syslog shows for all errors/warnings.
Squid, posted April 19, 2023:
Can you post your entire diagnostics?
Optico89, posted April 19, 2023:
Hey Squid, glad to see you! Another random restart. Here's the syslog. The server restarted at 1631hrs - 34 minutes since the last. See attached for the diagnostics, which I just ran via SSH, as well as the current syslog as of now.
Attachments: optimusprime-diagnostics-20230419-1647.zip, syslog
(Edited April 19, 2023 by Optico89: added system log)
Optico89, posted April 20, 2023:
Another random restart @ 1721hrs - 50 minutes since the last. Here's a broader look at the syslog surrounding the error.
Attachment: optimusprime-diagnostics-20230419-1729.zip
(Edited April 20, 2023 by Optico89: added broader look from syslog error & added updated diagnostics file)
Optico89, posted April 20, 2023:
Random crash @ 2035hrs - 3hrs 14mins since the last. The system log shows no errors this time.
Ran the following Docker containers for approximately 3 hours: NGINX, Wireguard, duckDNS, Unifi Video, Jellyfin.
Started the following Docker containers at the 3hr mark: Radarr, Sonarr, Nzbget.
The server crashed approximately 12 minutes after.
Optico89, posted April 20, 2023:
Restarted again @ 2144hrs - 1hr 14 minutes since the last. The system logs show the following error. Didn't run any containers this time.
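The gaps between the restarts reported so far can be tallied with a short script. This is only a sketch: it assumes the clock times quoted in the posts above all fall on the same day, and the function name `gaps_in_minutes` is made up for illustration.

```python
from datetime import datetime

# Restart times as quoted in the posts above (24-hour clock).
# Assumption: all of them fall on the same calendar day.
restart_times = ["15:20", "15:57", "16:31", "17:21", "20:35", "21:44"]


def gaps_in_minutes(times):
    """Return the minutes elapsed between consecutive HH:MM timestamps."""
    parsed = [datetime.strptime(t, "%H:%M") for t in times]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]


# Pair each restart (from the second one on) with the gap that preceded it.
for when, gap in zip(restart_times[1:], gaps_in_minutes(restart_times)):
    print(f"restart at {when}: {gap} min after the previous one")
```

Gaps computed from clock times may differ slightly from the durations quoted in the posts, for example if those were read from the server's uptime instead.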
JorgeB, posted April 20, 2023:
Start by running memtest.
Optico89, posted April 20, 2023:
I attempted to start a memtest, but as soon as I select the option on the Unraid boot menu the server restarts itself. Is there an alternative way to run the memtest via SSH or through the console?
JorgeB, posted April 20, 2023:
Memtest only works with legacy/CSM boot. If you can only boot UEFI, use the free Passmark Memtest.
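For anyone unsure which mode a running Linux system was booted in, one common check is whether the kernel exposes the `/sys/firmware/efi` directory. The sketch below assumes Python 3 is available on the machine (it may need to be installed separately on Unraid); the function name `boot_mode` is hypothetical.

```python
import os


def boot_mode(efi_dir="/sys/firmware/efi"):
    """Return "UEFI" if the kernel exposes EFI firmware data, else "Legacy/CSM".

    On Linux, the /sys/firmware/efi directory is only present when the
    system was booted through UEFI.
    """
    return "UEFI" if os.path.isdir(efi_dir) else "Legacy/CSM"


print(boot_mode())
```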
Optico89, posted April 20, 2023:
The memtest from Passmark has been running. Here's where it's currently at. It shows no errors so far.
Optico89, posted April 20, 2023:
Pass #2 of 4 just completed. Here are the results: no errors so far.
JorgeB, posted April 20, 2023:
Leave it for a few more hours. 24 is the recommended amount, though usually if there's an error to be found it's found sooner. Unfortunately memtest is only conclusive if an error is found, but if it doesn't find any errors after an additional couple of passes, the next suspects would be the board/CPU.
Optico89, posted April 20, 2023:
OK, I'll let it continue to run and finish all 4 passes, then follow up with the results. If no errors are found on the RAM, is there software similar to Memtest that can be used to diagnose/test the CPU? I'd like to prepare it in advance.
(Edited April 20, 2023 by Optico89)
Optico89, posted April 20, 2023:
All passes completed, everything passed with 0 errors. See below.
EDIT: Ordered a replacement CPU. It should arrive the day after tomorrow. From researching MCE errors and decoding the errors recorded above, they all appear to point at the CPU threads. Hopefully this is the fix and the mobo is just fine.
(Edited April 21, 2023 by Optico89)
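For readers who want to pull machine check lines out of a saved syslog themselves, a minimal sketch follows. The search pattern is an assumption (the exact wording of kernel MCE messages varies by kernel version), and the names `find_mce_lines` and `scan_file` are invented for this example.

```python
import re

# Case-insensitive match for kernel machine-check messages.
# Assumption: relevant lines contain "mce" as a word or "machine check";
# this is not an exhaustive list of what the kernel may log.
MCE_PATTERN = re.compile(r"\bmce\b|machine check", re.IGNORECASE)


def find_mce_lines(lines):
    """Return (line_number, text) for every line that mentions a machine check."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if MCE_PATTERN.search(line)
    ]


def scan_file(path):
    """Scan a log file on disk and return its machine-check lines."""
    with open(path, errors="replace") as handle:
        return find_mce_lines(handle)
```

Calling `scan_file("syslog")` on the syslog extracted from a diagnostics zip would list candidate lines with their line numbers. Decoding what each error means (which core, which bank) is a separate step this sketch does not attempt.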
Optico89, posted April 22, 2023 (Solution):
Installed a new AMD Ryzen 9 5900X 12/24 CPU. Running diagnostics to see if the problem recurs. Will update either way with results.
Edit: Forgot to mention the logs look clean and clear, with 0 errors at all.
(Edited April 22, 2023 by Optico89)