RBoots Posted March 14, 2021 (edited)

For the past couple of weeks I've occasionally been finding that my server has rebooted or is unresponsive. (For the past week it's been basically unusable.) As far as I can tell there is no single thing that causes this to happen. It just happens after anywhere between 10 minutes and 5 hours. I have tried many different things in an attempt to narrow down what the issue is, but nothing fixes the problem.

Before I list everything I've done so far I want to ask about something I found in a syslog file. (For some reason it only just crossed my mind to search for "error" in the log files.)

.... node #0, CPUs: #1 #2 #3
mce: [Hardware Error]: Machine check events logged
mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 5: bea0000000000108
mce: [Hardware Error]: TSC 0 ADDR 1ffff816f2e4a MISC d012000100000000 SYND 4d000000 IPID 500b000000000
mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1615731921 SOCKET 0 APIC 6 microcode 8001138
 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15
smp: Brought up 1 node, 16 CPUs

And another time:

.... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14
mce: [Hardware Error]: Machine check events logged
mce: [Hardware Error]: CPU 14: Machine Check: 0 Bank 5: bea0000000000108
mce: [Hardware Error]: TSC 0 ADDR 1ffff8152f996 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1615699503 SOCKET 0 APIC d microcode 8001138
 #15
smp: Brought up 1 node, 16 CPUs

Is this a good indicator that my CPU is failing? I do have a CPU in my main desktop that will work in the server (the motherboard was in my desktop until recently). Is it worth the effort to install it in my server for now? Or am I somehow misinterpreting this message?

Extra note: Some variation of these same messages has appeared in all of the log files I just searched.

Solution was found here:

Edited March 16, 2021 by RBoots: Issue was solved
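For anyone searching their own syslog for the same messages, here is a minimal sketch that pulls the CPU, bank, and status value out of lines shaped like the ones above and names the generic flag bits set in the status value. The function name `parse_mce_events` and the flag table are illustrative, not part of Unraid or the kernel, and the bit positions assume the commonly documented x86 machine-check status layout. It only labels the flags; it does not say which component is at fault or whether the event is harmless.

```python
import re

# Assumed layout: generic flag bits in the top byte of a machine-check
# status value, as commonly documented for x86 CPUs.
STATUS_FLAGS = {
    63: "VAL",    # the register holds a valid error record
    62: "OVER",   # an earlier error record was overwritten
    61: "UC",     # the error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # the MISC value is valid
    58: "ADDRV",  # the ADDR value is valid
    57: "PCC",    # processor context may be corrupt
}

# Matches lines such as:
# mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 5: bea0000000000108
MCE_LINE = re.compile(
    r"mce: \[Hardware Error\]: CPU (\d+): Machine Check: \d+ "
    r"Bank (\d+): ([0-9a-fA-F]+)"
)


def parse_mce_events(lines):
    """Return one dict per machine-check line found in `lines`."""
    events = []
    for line in lines:
        match = MCE_LINE.search(line)
        if match is None:
            continue
        status = int(match.group(3), 16)
        flags = [
            name
            for bit, name in STATUS_FLAGS.items()
            if (status >> bit) & 1
        ]
        events.append(
            {
                "cpu": int(match.group(1)),
                "bank": int(match.group(2)),
                "status": match.group(3),
                "flags": flags,
            }
        )
    return events
```

Feeding it the lines of a syslog file (for example `open(path).read().splitlines()`, with `path` pointing at wherever your syslog is stored) gives a list you can group by CPU or bank to see whether the same event repeats on every boot.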
Squid Posted March 14, 2021

During initialization of the cores, certain combinations of hardware issue an MCE. This can be safe to ignore. What you really want to do is configure the syslog server (mirror to flash), and after your next crash post the syslog it generated and a set of diagnostics.
RBoots (Author) Posted March 14, 2021

6 minutes ago, Squid said: During initialization of the cores, certain combinations of hardware issue an MCE. Can be safe to ignore. What you really want to do is configure the syslog server (mirror to flash), and after your next crash post the syslog generated and a set of diagnostics

I already have that turned on. I guess I don't really know what I'm looking for, but it doesn't seem to show any errors before it freezes. There's just a gap between the normal logs and the next boot. If it doesn't appear to be related to a CPU issue, should I go on with listing the things I've tried to fix the problem?
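When the only sign of a crash is a gap followed by a fresh boot, it can help to pull out the last few lines logged before each boot and compare them across crashes. The sketch below does that for a mirrored syslog. The function names are hypothetical, and it assumes each boot begins with a kernel line containing "Linux version"; adjust `boot_marker` if your log starts differently.

```python
def load_lines(path):
    """Read a log file into a list of lines, tolerating odd bytes."""
    with open(path, errors="replace") as handle:
        return handle.read().splitlines()


def lines_before_each_boot(lines, context=5, boot_marker="Linux version"):
    """Return, for each boot found, the `context` lines logged just before it.

    A boot is assumed to start at any line containing `boot_marker`.
    A marker on the very first line is skipped, since nothing precedes it.
    """
    tails = []
    for index, line in enumerate(lines):
        if boot_marker in line and index > 0:
            start = max(0, index - context)
            tails.append(lines[start:index])
    return tails
```

Calling `lines_before_each_boot(load_lines(path))` with `path` set to your saved syslog returns one short list per reboot. If those lists show nothing unusual, as in this thread, that by itself suggests a hard lockup or power event rather than something the system had time to log.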
trurl Posted March 14, 2021

Post Diagnostics ZIP so we can get a better idea of your hardware and configuration.
RBoots (Author) Posted March 14, 2021

15 minutes ago, trurl said: Post Diagnostics ZIP so we can get a better idea of your hardware and configuration

rbootsserver-diagnostics-20210314-1739.zip
JorgeB Posted March 15, 2021

Start here: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173
RBoots (Author) Posted March 16, 2021

On 3/15/2021 at 4:56 AM, JorgeB said: Start here: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173

That appears to have fixed my issues. My server has been on for over 17 hours now, which is much longer than it has stayed up at any point in the past week. (Maybe I should give it another day before I declare it officially fixed.) At least it was an easy fix; I just wish I had come across this much earlier. Thanks for your help.