  • MCELOG ERROR on AMD systems

    • Annoyance

    I'm not sure why mcelog is erroring out, nor why a machine check event is logged to dmesg on an AMD system... As you can see here, the SGC Unraid box eventually hits a machine check event. mcelog complains about a module to use, even though that module is available and in use, as seen via Tools > System Drivers. Is there a different mcelog command to see the system error?

    This system was up for longer than a week and then errored with a machine check event...
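One way to see what the kernel itself logged, without mcelog, is to filter the kernel log for machine-check lines. This is only a rough sketch: the patterns are a heuristic based on the usual "mce: [Hardware Error]" wording, and `mce_lines` is a made-up helper name, not part of any tool.

```python
import re
import shutil
import subprocess

# Heuristic patterns for kernel machine-check lines (an assumption, not a spec).
MCE_RE = re.compile(r"\bmce:|\[Hardware Error\]|Machine check", re.IGNORECASE)


def mce_lines(log_text):
    """Return the lines of a kernel log that look like machine-check reports."""
    return [line for line in log_text.splitlines() if MCE_RE.search(line)]


if __name__ == "__main__":
    # dmesg may need root, depending on the kernel.dmesg_restrict setting.
    if shutil.which("dmesg"):
        result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
        for line in mce_lines(result.stdout):
            print(line)
```

Run as root on the server, this should print only the lines around the machine check event, which is usually enough to see which CPU and bank reported it.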


    I find that on any AMD system with a Zen 1 to Zen 3 processor, attempting to use mcelog gives an unsupported-processor error: use module xyz...


    This continues from a few forum posts: from the system error, to an FCP post, to a Nerd Tools post for mcelog, to general support, and now this bug report, as something doesn't seem right here.


    A reboot seems to fix all issues, except that mcelog still errors out with unsupported processor...

    ... Can't seem to find the old FCP support forum post that needs to be updated ...



    22 hours ago, bmartino1 said:

    So my friend's system today started to randomly throw this error. FCP reports a machine check event. So I decided to run diagnostics.


    Rebooted the server and got a kernel panic. Booting into safe mode succeeded; I fixed the mod changes and rebooted to a normal boot, and it's up with no errors...

    Not sure what setting changed, or what issue with keeping a system up on an AMD platform is causing this.


    My friend also has 3 Unraid licenses and multiple computer systems, but doesn't have a forum account...

    This machine is called SGC (the names come from Star Trek).


    There seems to be a problem with this kernel module driver...



    Unraid 6.12.8 > System Drivers:



    But the terminal shows an error:


    Thoughts, ideas?


    after_reboot_all_systems_go_-_sgc-diagnostics-20240326-1658.zip (191.46 kB)
    sgc-diagnostics-20240326-1157.zip (181.73 kB)



    User Feedback

    Recommended Comments

    To be clear, this is the issue for the bug report:


    root@BMM-Unraid:~# mcelog
    mcelog: ERROR: AMD Processor family 25: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
    CPU is unsupported
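The message above names two things worth checking: which CPU family mcelog rejected, and whether the suggested edac_mce_amd module is really loaded. Family 25 is 19h in hex, which covers Zen 3. Below is a minimal Python sketch, assuming the driver is built as a loadable module (a built-in driver would not show up in /proc/modules). The helper names are made up for illustration.

```python
import re
from pathlib import Path

# Pulls the family number out of mcelog's "unsupported processor" message.
FAMILY_RE = re.compile(r"AMD Processor family (\d+)")


def amd_family_from_mcelog(message):
    """Return the AMD family number found in mcelog's error text, or None."""
    match = FAMILY_RE.search(message)
    return int(match.group(1)) if match else None


def module_loaded(proc_modules_text, name):
    """True if `name` is the first field of any line of /proc/modules text."""
    return any(line.split()[:1] == [name] for line in proc_modules_text.splitlines())


if __name__ == "__main__":
    error = ("mcelog: ERROR: AMD Processor family 25: mcelog does not support "
             "this processor.  Please use the edac_mce_amd module instead.")
    print("family:", amd_family_from_mcelog(error))
    modules = Path("/proc/modules")
    if modules.exists():
        print("edac_mce_amd loaded:",
              module_loaded(modules.read_text(), "edac_mce_amd"))
```

If the module shows as loaded and mcelog still refuses to run, that would fit the report: mcelog rejects the processor regardless of whether the EDAC driver is present.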




    ? may need to look into rasdaemon...




    jeffbee on Jan 3, 2021, on: ECC matters


    Newer linux have replaced mcelog with edac-util. I think most shops operating systems at that scale are getting their ECC errors out of band with IPMI SEL, though.


    gsvelto on Jan 3, 2021


    It's rasdaemon these days: https://www.setphaserstostun.org/posts/monitoring-ecc-memory-on-linux-with-rasdaemon/
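Until something like rasdaemon is available, the EDAC driver's own counters can be read straight from sysfs. The sketch below assumes the standard EDAC layout under /sys/devices/system/edac/mc, with `ce_count` and `ue_count` files per memory controller; `read_edac_counts` is a hypothetical helper. These counters only cover memory (ECC) errors, not every kind of machine check event, and this is not how rasdaemon itself works.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": n, "ue": n}} for each mc* directory under root."""
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts  # EDAC driver not loaded, or no memory controllers exposed
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / filename).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for name, entry in read_edac_counts().items():
        print(name, "corrected:", entry["ce"], "uncorrected:", entry["ue"])
```

An empty result means no EDAC memory controller is exposed on that machine, which can happen without ECC memory even when edac_mce_amd is loaded.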


    5 hours ago, bmartino1 said:

    ? may need to look into rasdaemon...

    LT is already aware of this, I think they were looking to use rasdaemon for v6.13


