December 7, 2023 This server has been stable for 2-3 years, with no changes to hardware or configuration since this began. The server becomes unresponsive to all remote connectivity. I'm not sure about local access: I was running headless, so I'm waiting for another failure to verify whether anything is accessible locally. Docker containers stop responding, and the UI and SSH are completely dead. A restart brings it back up. I'm including the diagnostics from just after the last 2 failures and recoveries. I don't see anything in the logs, but I'm hoping there's a clue in the files.
ASRock X570 Taichi
Ryzen 9 5950X
128 GiB DDR4 G.SKILL memory
Unraid 6.12.4
unimatrixzero-diagnostics-20231207-1722.zip
unimatrixzero-diagnostics-20231205-1604.zip
December 7, 2023 What is your current USB? I was having really similar issues where my syslog was showing corrupted data. Same behavior as you're describing, after a few days things would just gradually stop working, requiring a reboot.
December 8, 2023 Author While scrubbing the syslog for personal details, I found this.
Quote
[Hardware Error]: Corrected error, no action required.
[Hardware Error]: CPU:1 (19:21:0) MC21_STATUS[-|CE|MiscV|-|PCC|-|CECC|-|-|-]: 0x8b48c03108508948
[Hardware Error]: IPID: 0x0000000000000000
[Hardware Error]: Bank 21 is reserved.
[Hardware Error]: cache level: RESV, tx: GEN
I'm exploring the possibility that this is the cause, and will be changing the "typical idle current" setting, as I've read that it can cause random stability issues. If this turns out to be the case, I will post for posterity.
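For anyone who wants to check their own syslog for the same kind of entry, the `[Hardware Error]` status line quoted above can be picked out mechanically. This is only a minimal Python sketch: the function name `find_mce_events` and the dictionary keys are made up for illustration, and the regular expression is modeled solely on the one status line quoted in this thread, so it may not match the format other kernels or CPUs print. It extracts the fields; it does not interpret what the status value means.

```python
import re

# Pattern modeled on the single MCx_STATUS line quoted in this thread.
# Assumption: other kernels/CPUs may format this line differently.
MCE_LINE = re.compile(
    r"\[Hardware Error\]: CPU:(?P<cpu>\d+) \((?P<fms>[0-9A-Fa-f:]+)\) "
    r"MC(?P<bank>\d+)_STATUS\[(?P<flags>[^\]]*)\]: (?P<status>0x[0-9A-Fa-f]+)"
)


def find_mce_events(log_text):
    """Return one dict per MCx_STATUS line found in log_text."""
    events = []
    for match in MCE_LINE.finditer(log_text):
        # Flags are '|'-separated; '-' marks a flag that is not set.
        flags = [f for f in match.group("flags").split("|") if f != "-"]
        events.append(
            {
                "cpu": int(match.group("cpu")),
                "bank": int(match.group("bank")),
                "flags": flags,
                "status": int(match.group("status"), 16),
            }
        )
    return events


# The log excerpt exactly as quoted in the post above.
SAMPLE = (
    "[Hardware Error]: Corrected error, no action required.\n"
    "[Hardware Error]: CPU:1 (19:21:0) "
    "MC21_STATUS[-|CE|MiscV|-|PCC|-|CECC|-|-|-]: 0x8b48c03108508948\n"
    "[Hardware Error]: IPID: 0x0000000000000000\n"
    "[Hardware Error]: Bank 21 is reserved.\n"
    "[Hardware Error]: cache level: RESV, tx: GEN\n"
)

if __name__ == "__main__":
    for event in find_mce_events(SAMPLE):
        print(event)
```

To run it against a real log, read the file into a string and pass it to `find_mce_events`. Counting how often each CPU and bank appears over several days may help show whether the corrected errors line up with the freezes.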
December 8, 2023 Author 15 hours ago, Scheev said: What is your current USB? I was having really similar issues where my syslog was showing corrupted data. Same behavior as you're describing, after a few days things would just gradually stop working, requiring a reboot. SAMSUNG MUF-32AB/AM FIT Plus 32GB
December 8, 2023 1 hour ago, Jryski said: SAMSUNG MUF-32AB/AM FIT Plus 32GB To my understanding and research, USB 3+ keys are not ideal for use with Unraid. I would recommend moving to a USB 2.0 key.
December 8, 2023 Community Expert 17 minutes ago, Scheev said: To my understanding and research, USB 3+ keys are not ideal for use with Unraid. I would recommend moving to a USB 2.0 key. This is not always that easy any more, as USB2 drives are getting harder to find. If a USB3.x drive has to be used, it is definitely worth seeing whether a USB2 port on the server can be used. Even if the server has no external USB2 port, I think most motherboards still have an internal USB2 header that can be used with the appropriate adapter/cable.
December 21, 2023 Author Well, after running memtest and getting passes, rebuilding the Docker image, checking the SMART data on all drives, changing all BIOS settings to the recommended settings, and removing the XMP profile on the RAM, it's still freezing. I've checked all the logs, and all I see is mover running and mover completing; nothing else is written to the logs, and the server simply stops responding. I've had this happen both under load and while idle. The flash drive is only 3 months old; I had one fail on me a few months ago. I'm at a loss here.
December 22, 2023 Community Expert One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
December 22, 2023 Author I finally feel like I've found something in the logs, attaching for anyone who might understand what I'm seeing. Syslog Errors.txt
December 22, 2023 Community Expert There are a lot of call traces, can't see what's causing them though, but they look more hardware related to me, you can still try what I mentioned above.
December 22, 2023 Author 49 minutes ago, JorgeB said: There are a lot of call traces, can't see what's causing them though, but they look more hardware related to me, you can still try what I mentioned above. Thanks, I will try that and report back.
December 26, 2023 Author Solution I made the move to IPVLAN. I also made these changes:
Host access to custom networks: Enabled
Preserve user defined networks: Yes
All issues have ceased. This is not an ideal solution, but it's working for me for now. I'm now at 48 hours stable and error free for the first time in months. I'm not sure why they call this issue resolved in this version, as it's clearly still an issue, but I'm stable for now. Thanks for the help along the way.