darrenyorston Posted October 29, 2017
Fix Common Problems is reporting a Machine Check Events error with the following suggested fix:
"Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged."
I turned on mcelog in NerdPack, but where do I find the diagnostics file to post?
Sean M. Posted October 30, 2017
1 hour ago, darrenyorston said: Fix Common Problems is reporting a Machine Check Events error with the following fix Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged I turned on the mcelog in Nerdpack but where do I find the diagnostics file to post?
This came up for me the other day; here's a post with all the various ways. I used:
http://<Insert IP Address>/log/syslog
Replace <Insert IP Address> with your server's IP address. You can use Tower instead if that name normally works for you.
Edited October 30, 2017 by Sean M.
Squid Posted October 30, 2017
Tools - Diagnostics is the simplest way, through the GUI.
trurl Posted October 30, 2017
24 minutes ago, Sean M. said: This came up for me the other day, here's a post with all the various ways. I used - http://<Insert IP Address>/log/syslog Replace the <Insert IP Address> with your respective server. You can use Tower if that works for you in general.
That is for V5, and we really, really wish you would not use those methods for V6, because V6 gives us much more information if you post the Diagnostics (which include the syslog). If we want just a syslog, we will ask for it.
darrenyorston Posted October 30, 2017 (Author)
Here is the file that was downloaded: tower-diagnostics-20171031-0642.zip
bonienl Posted October 30, 2017
20 hours ago, trurl said: If we want just a syslog we will ask for it.
In V6, besides Diagnostics, which gives a full report, syslogs and SMART reports can also be obtained individually from the GUI. None of the complicated V5 methods are required any longer.
darrenyorston Posted October 31, 2017 (Author)
So should I just ignore this warning?
TJOPTJOP Posted December 25, 2019
Hi guys, I also received the message "Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged", so I am uploading my diagnostics here. Hopefully someone can help me troubleshoot. In my unRaid log I also see errors about failing RAM. I use ECC RAM, so the errors get corrected, but I need to find the bad modules.
bigmama-diagnostics-20191225-2151.zip
macbentosh Posted March 16, 2020
Got an alert from the FCP plugin. The only error I see is this:
Mar 15 18:53:28 Rinzler kernel: mce: [Hardware Error]: Machine check events logged
Mar 15 18:53:28 Rinzler kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: f600000000070f0f
Mar 15 18:53:28 Rinzler kernel: mce: [Hardware Error]: TSC 0 ADDR 419fdc570
Mar 15 18:53:28 Rinzler kernel: mce: [Hardware Error]: PROCESSOR 2:610f01 TIME 1584323579 SOCKET 0 APIC 0 microcode 6001119
Mar 15 18:53:28 Rinzler kernel: Performance Events: Fam15h core perfctr, AMD PMU driver.
Any help would be awesome. Thanks
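For anyone who wants to pull lines like these out of a syslog by hand, here is a minimal sketch in Python. The pattern and the function name mce_lines are illustrative assumptions based only on the log format shown in this thread; they are not part of unRaid, mcelog, or Fix Common Problems.

```python
import re

# Assumption: machine-check and EDAC messages are tagged the way they
# appear in the logs quoted in this thread.
MCE_PATTERN = re.compile(r"kernel: (?:mce: \[Hardware Error\]|EDAC )")


def mce_lines(syslog_text):
    """Return the syslog lines that report machine check or EDAC events."""
    return [line for line in syslog_text.splitlines() if MCE_PATTERN.search(line)]


sample = """\
Mar 15 18:53:28 Rinzler kernel: mce: [Hardware Error]: Machine check events logged
Mar 15 18:53:28 Rinzler kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: f600000000070f0f
Mar 15 18:53:28 Rinzler kernel: Performance Events: Fam15h core perfctr, AMD PMU driver.
"""

for line in mce_lines(sample):
    print(line)
```

This only narrows down what to read. The full Diagnostics zip is still what the forum asks for, since it carries much more than the syslog.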
trurl Posted March 16, 2020
5 hours ago, macbentosh said: Got an alert from FCP plugin...
Go to Tools - Diagnostics and attach the complete Diagnostics zip file to your NEXT post.
macbentosh Posted March 21, 2020
It happened again, and the server started a parity check.
Edited March 22, 2020 by macbentosh
Squid Posted March 21, 2020
Certain combinations of CPUs / motherboards / poltergeists deciding to just hang out issue an MCE when initializing the processor during boot-up. This is what's happening to you; it is normal (assuming the poltergeists weren't involved) and can be safely ignored.
macbentosh Posted March 21, 2020
4 minutes ago, Squid said: Certain combinations of cpus / motherboards / poltergeists deciding to just hang out issue an MCE when initializing the processor during boot up. This is what's happening to you, and is normal (assuming the poltergeists weren't involved) and can be safely ignored.
Now I'm wondering why it rebooted and started a parity check. What caused the shutdown!?
Squid Posted March 21, 2020
First thoughts are power problems or bad memory. Have you run a memtest for a couple of passes recently from the boot menu?
Squid Posted March 21, 2020
Also, while it's not necessarily your problem, the utmost stability in any computer system comes from using matching memory sticks (which you're not). If you do have to mix and match between manufacturers (which you are), then try to ensure that their CL specs match (which yours don't). It's not a given, but it is a distinct possibility.
macbentosh Posted March 21, 2020
8 minutes ago, Squid said: Also, while it's not necessarily your problem, but the utmost in stability in any computer system comes when using matching memory sticks (which you're not). If you do have to mix and match between manufacturers (which you are), then try and ensure that the CL spec on them match (which yours don't). It's not a given, but it is a distinct possibility.
Gotcha. Much of this machine was given to me. I'll order another stick of the same kind to replace the one I received with it.
macbentosh Posted March 22, 2020
13 hours ago, Squid said: Also, while it's not necessarily your problem, but the utmost in stability in any computer system comes when using matching memory sticks (which you're not). If you do have to mix and match between manufacturers (which you are), then try and ensure that the CL spec on them match (which yours don't). It's not a given, but it is a distinct possibility.
It just rebooted at random again. Any thoughts on where I should start looking for the issue?
macbentosh Posted March 22, 2020
13 hours ago, Squid said: Also, while it's not necessarily your problem, but the utmost in stability in any computer system comes when using matching memory sticks (which you're not). If you do have to mix and match between manufacturers (which you are), then try and ensure that the CL spec on them match (which yours don't). It's not a given, but it is a distinct possibility.
No thoughts that it might be my HBA?
Hill1023 Posted July 21, 2020
Hello, I had this error come up recently on my server during a scan. The server has been running for a while, so I thought it was strange for the error to appear now. My zip file is attached. I appreciate the help very much.
hpsvr-diagnostics-20200721-0928.zip
mhowland24 Posted August 5, 2020
Just got the same error; I would appreciate some help figuring out what the issue is. Thanks.
tower-diagnostics-20200805-0607.zip
Squid Posted August 5, 2020
Aug 5 01:09:44 Tower kernel: mce: [Hardware Error]: Machine check events logged
Aug 5 01:09:44 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Aug 5 01:09:44 Tower kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
Aug 5 01:09:44 Tower kernel: EDAC sbridge MC0: TSC 40a13e7678cb1
Aug 5 01:09:44 Tower kernel: EDAC sbridge MC0: ADDR f380c8640
Aug 5 01:09:44 Tower kernel: EDAC sbridge MC0: MISC 1402e0086
Aug 5 01:09:44 Tower kernel: EDAC sbridge MC0: PROCESSOR 0:306e4 TIME 1596600584 SOCKET 0 APIC 0
Aug 5 01:09:44 Tower kernel: EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xf380c8 offset:0x640 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:2 rank:1)
Bad memory. The system event log in the BIOS would hopefully provide a little more insight into which DIMM is starting to go, beyond Channel 1, DIMM 0.
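The channel and slot quoted above come straight from the last EDAC line. As a rough sketch of how to read them out programmatically, the Python below parses that one line format. The regular expression and the name parse_edac are assumptions built from the single log line in this post; other platforms and kernel versions may word the message differently, and mapping channel/slot to a physical DIMM socket still depends on the motherboard.

```python
import re

# Assumption: the summary line looks like the "EDAC MC0: 1 CE ..." line
# quoted in this thread. CE = corrected error, UE = uncorrected error.
EDAC_PATTERN = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) (?P<msg>.+?) "
    r"on (?P<label>\S+) \(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
)


def parse_edac(line):
    """Extract controller, error kind, channel and slot from an EDAC summary line."""
    m = EDAC_PATTERN.search(line)
    if m is None:
        return None  # not an EDAC summary line in the assumed format
    return {
        "controller": int(m["mc"]),
        "count": int(m["count"]),
        "kind": "corrected" if m["kind"] == "CE" else "uncorrected",
        "message": m["msg"],
        "label": m["label"],
        "channel": int(m["channel"]),
        "slot": int(m["slot"]),
    }


line = (
    "Aug 5 01:09:44 Tower kernel: EDAC MC0: 1 CE memory read error on "
    "CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xf380c8 "
    "offset:0x640 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 "
    "socket:0 ha:0 channel_mask:2 rank:1)"
)
print(parse_edac(line))
```

For the line in this post the parser reports a corrected memory read error on channel 1, slot 0, which matches the Channel 1, DIMM 0 reading given above.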
TheShadowDuke Posted January 25, 2021
Well, I got the same message and it told me to post my log. I'm honestly not sure what I'm looking at or looking for. I'm pretty new to this side of things.
mars-diagnostics-20210124-1821.zip
Edited January 25, 2021 by TheShadowDuke
truthforwho Posted January 25, 2021
I've seen this error pop up several times in the past few months. Finally decided to ask for help. Any insight is appreciated.
tower-diagnostics-20210124-2206.zip
spacer00ster Posted January 31, 2021
Hello. Can anyone please help me figure out why my server keeps rebooting?
unspace-diagnostics-20210131-1239.zip