Machine Check Events

Hi,

 

I have MCEs:

"Machine Check Events detected on your server. Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged"

 

I installed mcelog through the Nerd Tools plugin but don't know what to do with it. Diagnostics are attached.

 

Background: Last week I had a disk (disk8) whose reallocated sector count kept increasing.

I shrank the array, and the disk is now mounted as a UD that I use for unimportant data.

Its reallocated sector count is still the same.

 

This is my IPMI KVM terminal screen. There are no timestamps, however, so I don't know when these errors were generated.

 

zz6e36b.jpg

 

 

Any advice on the MCEs?

juno-diagnostics-20180716-0913.zip

  • Community Expert

The MCEs look memory related. Check the board's SEL (system event log) in the BIOS; there might be more info.

 

You also need to run xfs_repair on disk8.

  • Author

I guess this is not the right SEL. 

 

G6a1iOU.jpg

 

This one looks like it has been full since January. I have cleared it now.

 

XmzWpJh.jpg

  • Author

I restarted the server yesterday. The errors appeared again today. Another diagnostics file is attached.

How can I read the board's SEL? Do I restart and enter the BIOS?

The IPMI event log I cleared yesterday is still empty,

and the following "System event log" has no meaningful data.

 

DuFF0u7.jpg

 

 

juno-diagnostics-20180718-2046.zip

  • Community Expert
1 minute ago, Gico said:

and the following "System event log" has no meaningful data.

That's why I said it might have more info. If it doesn't, remove some of the DIMMs, i.e., try with just one, and if there are no errors, start adding more one at a time.
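
As a rough illustration of that one-at-a-time procedure, here is a minimal Python sketch. It is only a model of the logic, not a tool that talks to the hardware: the DIMM labels are made up, and `has_errors` is a hypothetical stand-in for "boot the server with these DIMMs installed and see whether MCEs show up again".

```python
from typing import Callable, List, Optional, Sequence


def find_suspect_dimm(
    dimms: Sequence[str],
    has_errors: Callable[[List[str]], bool],
) -> Optional[str]:
    """Install DIMMs one at a time and return the first one whose
    addition makes errors appear, or None if no errors are seen.

    `has_errors` is a hypothetical callback: in practice it means
    running the server with the given DIMMs and checking for MCEs.
    """
    installed: List[str] = []
    for dimm in dimms:
        installed.append(dimm)
        # Pass a copy so the callback cannot alter our state.
        if has_errors(list(installed)):
            return dimm
    return None


if __name__ == "__main__":
    # Simulated run: pretend the stick labelled "DIMM_C1" is the bad one.
    bad = {"DIMM_C1"}
    labels = ["DIMM_A1", "DIMM_B1", "DIMM_C1", "DIMM_D1"]
    suspect = find_suspect_dimm(labels, lambda inst: any(d in bad for d in inst))
    print("Suspect:", suspect)
```

Note that if errors already appear with the first DIMM alone, this only tells you that stick or its slot is suspect; swapping it into a different slot would help tell the two apart.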

  • Author

Yeah, I thought of that, but I was hoping for some kind of info before I start tinkering with the hardware.

Thanks. I will try.
