comfox Posted August 24, 2018

FCP is telling me that I have a hardware error: "Machine Check Events detected on your server". I downloaded the diagnostics and looked at the syslog. I can see the MCEs, but I can't make heads or tails of what they mean. Any help?

Aug 3 10:21:24 Tower kernel: mce: CPU supports 9 MCE banks
Aug 3 10:21:24 Tower kernel: mce: CPU supports 9 MCE banks
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: Machine check events logged
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: be00000000800400
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: TSC 0 ADDR fffff80205102395 MISC fffff80205102395
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1533306062 SOCKET 0 APIC 0 microcode 22
Aug 3 10:21:24 Tower kernel: Performance Events: PEBS fmt2+, Haswell events, 16-deep LBR, full-width counters, Intel PMU driver.
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: Machine check events logged
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 3: be00000000800400
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: TSC 0 ADDR fffff80205102395 MISC fffff80205102395
Aug 3 10:21:24 Tower kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1533306062 SOCKET 0 APIC 4 microcode 22

Attachment: tower-diagnostics-20180824-0814.zip
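For readers trying to make sense of the status word in the log above (be00000000800400), here is a minimal Python sketch that splits it into its flag bits and error codes. It assumes the architectural MCi_STATUS bit layout described in the Intel SDM; the function name `decode_mci_status` and the `FLAGS` table are made up for this illustration and are not part of mcelog or the kernel.

```python
# Sketch: decode an MCi_STATUS value as printed in a "Machine Check" log line.
# Assumption: bit positions follow the architectural layout in the Intel SDM.
# decode_mci_status and FLAGS are hypothetical names used only for this example.

FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register contents are valid
    58: "ADDRV",  # ADDR register contents are valid
    57: "PCC",    # processor context may be corrupt
}

def decode_mci_status(status_hex):
    """Return the set flags and the two 16-bit error code fields."""
    status = int(status_hex, 16)
    return {
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "mca_error_code": status & 0xFFFF,               # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }

result = decode_mci_status("be00000000800400")
print(result["flags"])
print(hex(result["mca_error_code"]))
print(hex(result["model_specific_code"]))
```

For the value in this thread, the sketch reports the flags VAL, UC, EN, MISCV, ADDRV and PCC, with an MCA error code of 0x400. The SDM lists that bit pattern among its simple error codes as an internal timer error, but treat that reading as a starting point and let mcelog give the authoritative decode.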
John_M Posted August 24, 2018

You have a hardware fault that's showing up early in the boot sequence, while the multiple cores of your CPU are being initialised. I'd run memtest first (select it from the boot menu, or download the newer version, install it on its own separate USB flash drive and boot from that) for a good long time (48 hours, say). If it passes, consider reseating the CPU and checking for bent pins. You can install mcelog using the Nerd Pack plugin. Other worthwhile things you can do include checking for a newer BIOS and updating to unRAID 6.5.3.
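One way to check when the events were recorded is to convert the TIME field from the log (1533306062) into a date. The sketch below assumes that field is a Unix timestamp in seconds, which is how the kernel prints it as far as I know; the variable names are only for illustration.

```python
from datetime import datetime, timezone

# TIME field copied from the "PROCESSOR 0:306c3 TIME ..." log line.
# Assumption: it is a Unix timestamp in seconds.
mce_time = 1533306062

logged_at = datetime.fromtimestamp(mce_time, tz=timezone.utc)
print(logged_at.isoformat())  # 2018-08-03T14:21:02+00:00
```

The thread does not state the server's timezone. If its clock is four hours behind UTC, that is 10:21:02 local time, about 22 seconds before the 10:21:24 syslog entries.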
comfox Posted August 25, 2018

I have mcelog installed from Nerd Pack and have been running it for a while. I can try the memtest, but this is a pretty heavily used server: it runs a Win 10 VM that I game on, as well as many dockers, including Plex, which transcodes. I would think that if I had a stability issue or a memory issue I would have run into it by now, no?
Squid Posted August 25, 2018

Upgrading to 6.5.3 may also help, as it will include later microcode updates than your 6.5.0.
comfox Posted August 26, 2018

22 hours ago, Squid said:
Upgrading to 6.5.3 may also help as it will include later microcode updates than your 6.5.0

Thanks for the heads up @Squid. I didn't realize there was a new version out. I do not like the new update feature of unRAID. It used to be easy to tell if there was a new version; now I find I never get notified or can see it. I will see if the issues continue with the new version.
Squid Posted August 26, 2018

1 hour ago, comfox said:
now I find I never get notified or can see it.

You just need to go to Settings - Notification settings, enable Check for OS updates (on whatever schedule you want), and set up your email or notification agent.