ImBadAtThis Posted March 5, 2019 (edited)

Version: 6.6.7 UNRAID Server Plus
MB: ASRock X370 Taichi
CPU: Ryzen (1st gen) 1700X
BIOS: v4.8
Mem: 16GB G.Skill TridentZ DDR4 3200
Cache: 250GB Samsung 970 EVO NVMe SSD
Array: Assorted WD & Seagate
Unassigned Device: 250GB Samsung 840 EVO SSD

Hello all, I am at my wit's end with this issue. I have been running my current setup for nearly 3 years without a hiccup, but in January I started experiencing odd behavior. The first sign of something amiss was that my server froze, and upon restart all of my dockers and VMs were gone. It ran fine for a few weeks until, seemingly out of the blue, it began freezing every 12 hours or so. I have a monitor and keyboard attached, but the terminal was completely unresponsive. I restarted and entered GUI mode to see if I could gather more info or would be given a warning before freezing, but got nothing. The GUI froze completely, with no keyboard or mouse input accepted. Fix Common Problems reports nothing.

Since then, the following actions have been taken with no improvement in outcome:
1. Uninstalled all VMs, docker containers, and plugins -- system still froze
2. Reinstalled OS -- system still froze
3. Reinstalled all SATA devices -- system still froze

Attached to this post is a .zip containing diagnostics and an image of the syslog when the system was frozen. Please feel free to let me know if additional information is required. AND THANK YOU FOR ANY HELP. I'm at the end of my rope with this.

Attachment: tower-diagnostics-20190305-1304.zip

Edited March 22, 2019 by ImBadAtThis

dlandon Posted March 5, 2019

I would start by removing the NerdPack, preclear, and S3 Sleep plugins. See if removing any of these plugins helps. If removing these doesn't solve the problem, reboot in safe mode. If that doesn't help, you should test your memory.

ImBadAtThis (Author) Posted March 5, 2019

Thanks for the response. I've actually run a memory test and everything was fine. I will try uninstalling those plugins again, and then Safe Mode. If I don't have any issues in Safe Mode, what is that telling me? How should I proceed from there?

Additionally, I've attached an image of the most recent lockup from just a few minutes ago.

John_M Posted March 5, 2019

Nearly three years? AMD only launched Ryzen 7 on 2 March 2017. Either way, that's a very early Ryzen 1700X you're using, so how are you mitigating its tendency to lock up when idling? I don't see either the boot option tweak or the go file tweak in your diagnostics, so maybe you've disabled the C6 state in your BIOS, or you've configured normal power idle instead of the default low power idle? Perhaps after a recent BIOS update you forgot to re-select this particular customisation.

John_M Posted March 5, 2019 (edited)

You have virtualisation support disabled in your BIOS, which is why kvm_amd can't load and your VMs won't work. That is the default in the ASRock BIOS, which suggests that the defaults have been restored or the CMOS has been corrupted. Look for SVM (secure virtual machine, a.k.a. AMD-V) and enable it.

    Mar 5 12:47:23 Tower kernel: kvm: disabled by bios
    Mar 5 12:47:23 Tower root: modprobe: ERROR: could not insert 'kvm_amd': Operation not supported

Edited March 5, 2019 by John_M: added error messages from syslog

ImBadAtThis (Author) Posted March 5, 2019

To your first post: 2 years, then. I couldn't remember when it was released, but I purchased the CPU soon after the initial release. Also, I've never experienced instability issues with this system, and thus never disabled C6 or applied the boot options tweak or the go file tweak.

Yes, I've simply reset CMOS and haven't yet re-enabled virtualization. I've been focused on just getting the system stable as-is.

John_M Posted March 5, 2019

26 minutes ago, ImBadAtThis said:
    Yes, I've simply reset CMOS and haven't yet re-enabled virtualization. I've been focused on just getting the system stable as-is.

You didn't mention clearing the CMOS in the list of things you'd done to try to fix your problem. In that case, disable low power idle in the BIOS, enable SVM, and see how it goes. You'll also want to enable XMP so that you can set the RAM speed to DDR4-2666 (assuming you have two 8 GB modules).

Vr2Io Posted March 6, 2019 (edited)

    Mar 5 12:47:01 Tower kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
    ### [PREVIOUS LINE REPEATED 15 TIMES] ###

I suggest checking the BIOS setting (try enable/disable/auto); this is about the Ryzen C6 issue. Besides, I have never tried the S3 Sleep plugin on my Ryzen. From my point of view, your problem is related to the system dying in a sleep state.

Edited March 6, 2019 by Benson

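For anyone checking their own logs for the two kernel messages quoted so far in this thread, a minimal shell sketch follows. The function name `scan_syslog` is hypothetical, and the path to the saved syslog is passed in as an argument because the layout of the diagnostics .zip is not described here. It only greps for the literal strings quoted above and does not diagnose anything beyond that.

```shell
#!/bin/bash
# Sketch only: report whether a saved syslog contains the two messages
# quoted in this thread. scan_syslog is a hypothetical helper name.
scan_syslog() {
    local log="$1"   # path to a saved syslog file (supplied by the caller)
    local found=0

    # Message quoted by John_M: virtualisation (SVM / AMD-V) disabled in BIOS.
    if grep -qF 'kvm: disabled by bios' "$log"; then
        echo 'SVM (AMD-V) appears to be disabled in the BIOS'
        found=1
    fi

    # Message quoted by Vr2Io: firmware bug related to C-states.
    if grep -qF 'ACPI MWAIT C-state' "$log"; then
        echo 'ACPI MWAIT C-state firmware bug reported; review the C6 / idle settings'
        found=1
    fi

    if [ "$found" -eq 0 ]; then
        echo 'Neither message found'
    fi
}
```

A call would look like `scan_syslog /path/to/syslog.txt`, with the path replaced by wherever the extracted log actually lives.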
ImBadAtThis (Author) Posted March 10, 2019

Thanks for the feedback, all! John_M, can you point me to the text for that go file tweak? I can't seem to find it. Not 6 months ago I couldn't avoid it, but now that I need it...

As an update, I've modified the Power Supply Idle Control setting, as this was apparently introduced in a recent BIOS as a way to avoid disabling C6 altogether. If this doesn't work, I'm going to disable C6 in the BIOS. If that doesn't work, I'm going to use the go file tweak. If that doesn't work, I will investigate disabling RCU callbacks per spaceinvaderone's video.

If any of this sounds like a garbage plan, let me know. Thanks

John_M Posted March 10, 2019

I'd try the Power Supply Idle Control first. If that doesn't work, then try this zenstates go file tweak before you resort to disabling C-states in the BIOS. These are the instructions:

Quote:
    Edit your \\tower\flash\config\go script (using a good editor like Notepad++ (not Notepad)) and add the "zenstates" command right before "emhttp", like this:

    /usr/local/sbin/zenstates --c6-disable
    /usr/local/sbin/emhttp &

This is the post (it's the first message - scroll down a bit to find it):

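The manual edit described above could also be done from the server's own shell. The following is a rough sketch, not a tested procedure: the function name `add_zenstates` is hypothetical, and it assumes the go script is reachable on the server at a path you pass in (on Unraid the flash share is normally mounted at /boot, so /boot/config/go is the assumed location). It inserts the zenstates line once, immediately before the first emhttp line, and keeps a `.bak` copy of the original.

```shell
#!/bin/bash
# Sketch only: add the zenstates line to a go script, once, just before emhttp.
# add_zenstates is a hypothetical helper; the go file path is supplied by the caller.
add_zenstates() {
    local go_file="$1"
    local tweak='/usr/local/sbin/zenstates --c6-disable'

    # Do nothing if the tweak is already present, so repeated runs are harmless.
    if grep -qF -- "$tweak" "$go_file"; then
        return 0
    fi

    # Keep a backup, then print the tweak on the line before the first emhttp call.
    # If no emhttp line exists, the file is rewritten unchanged.
    cp -- "$go_file" "$go_file.bak"
    awk -v tweak="$tweak" '
        !done && /\/usr\/local\/sbin\/emhttp/ { print tweak; done = 1 }
        { print }
    ' "$go_file.bak" > "$go_file"
}
```

It would be invoked as `add_zenstates /boot/config/go` (assumed path). The go script only runs at boot, so the change would take effect after the next restart.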
ImBadAtThis (Author) Posted March 22, 2019

Thanks for everyone's help. Changing the Power Supply Idle Control setting in the BIOS seems to have done the trick. No issues since.