November 15, 2020

Hi there,

I've been running Unraid 6.8.3 since March on my self-built server with all-new hardware. It ran smoothly and without any glitches until now. Over the last couple of days, though, I've been facing more and more problems that lead to a total freeze of the system. I had to do a hard reboot by switching the system off with the power button; neither the local CLI nor SSH was available anymore.

Today I was lucky enough to get a syslog and a diagnostics file 5 minutes prior to the crash, and I installed a syslog server on my notebook to catch the log remotely. The last entry before the crash was on Sunday, 15 November 2020 16:56:21. I also attached a screenshot of the console from a former crash.

Before this crash I saw all 4 cores of the CPU rising to 100% step by step.

I tried to run memtest86+ from the boot menu, but this always forced a reboot with no further action.

Can anyone please check whether it's a hardware problem or whether some Docker container, plugin, or other software is causing all this trouble?

Thanks in advance,
unrno

unr-server-diagnostics-20201115-1652.zip
unr-server-syslog-20201115-1553.zip
2020-11-15.txt

Edited November 15, 2020 by unrno.spam: some more information
November 15, 2020 Author

I tried to run memtest from the Unraid menu at boot but didn't succeed (the system just rebooted). I'll try to get a live Linux CD to do a memtest...

I will also check the BIOS settings for "Power Supply Idle Control"...
November 15, 2020

3 minutes ago, unrno.spam said: "Tried to do memtest but didn't succeed from the Unraid Menu at boot (system just did a reboot)"

I think you have to boot in legacy mode instead of UEFI for memtest on the Unraid boot menu to work.

4 minutes ago, unrno.spam said: "Try to get a live linux CD to do a memtest..."

Or you can make a bootable memtest flash drive.
November 15, 2020 Author

I just had a new crash on Sunday, 15 November 2020 18:54:02. I'm flashing a USB stick for memtest right now.

What does the page fault in the log mean? I guess this is related to memory, not to the CPU, right?

2020-11-15.txt
November 22, 2020 Author

I haven't done a memtest so far, because since the last crash the server has been running 24/7 for almost a full week with no glitches at all. So I'll wait until the next time I have to reboot.

What else did I do? I entered the BIOS at the last reboot to check some entries. I didn't change anything, although I saved the settings when I left, so I can't really count that as a fix.

Then I installed the "Tips and Tweaks" plugin and used it to change these settings:

vm.dirty_background_ratio = 1
vm.dirty_ratio = 2

I don't know if this solved my problem, but at least I've had no crashes since. I guess a memtest is still a good idea...
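For readers wondering what the two values above control: both are percentages of available memory. `vm.dirty_background_ratio` is the level of dirty (not yet written) page-cache data at which the kernel starts flushing in the background, and `vm.dirty_ratio` is the level at which a writing process is made to flush data itself. The following is a minimal Python sketch that turns those percentages into rough byte thresholds. The function names and the 16 GiB figure are assumptions for illustration only; the thread does not state how much RAM the server has, and the kernel calculates against available (free plus reclaimable) memory rather than installed RAM, so the real thresholds will differ.

```python
def parse_sysctl_lines(text):
    """Parse 'key = value' lines (sysctl.conf style) into a dict of ints."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = int(value.strip())
    return settings


def dirty_thresholds_bytes(settings, available_bytes):
    """Return (background, blocking) thresholds in bytes.

    This is an approximation: it applies the percentages to whatever
    figure is passed in as available_bytes.
    """
    background = settings["vm.dirty_background_ratio"] * available_bytes // 100
    blocking = settings["vm.dirty_ratio"] * available_bytes // 100
    return background, blocking


# The values reported in the post above.
POSTED_SETTINGS = """
vm.dirty_background_ratio = 1
vm.dirty_ratio = 2
"""

if __name__ == "__main__":
    settings = parse_sysctl_lines(POSTED_SETTINGS)
    assumed_available = 16 * 1024**3  # ASSUMPTION: 16 GiB, not from the thread
    background, blocking = dirty_thresholds_bytes(settings, assumed_available)
    print(f"background flush starts near {background / 1024**2:.0f} MiB of dirty data")
    print(f"writers are throttled near {blocking / 1024**2:.0f} MiB of dirty data")
```

With values this low, far less unwritten data can pile up in RAM before it is flushed to disk. Whether that is related to the crashes in this thread is not established.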
November 22, 2020

I wouldn't wait for another hard crash. It's much nicer to avoid file system errors with a clean boot than to risk issues.

It looks like bad memory is the most likely cause. Do the memtest now; if there is bad RAM, it often shows up pretty quickly and you can get on with a warranty RMA.

You should also do a file system check on your array and cache.
November 22, 2020

1 hour ago, unrno.spam said: "Guess a memtest is still a good idea..."

memtest never hurts. Running with bad RAM can hurt quite a bit, including data loss.