MCE errors and random freezes

My server had been running fine for some time. Here are the specs:

Unraid version: 6.9.2

Asus Prime B350-PLUS

Ryzen 7 1700 @ 3000 MHz

32 GB DDR4 in 4 DIMMs (2x 8 GB @ 3000 MHz + 2x 8 GB @ 3200 MHz), all running at 3000 MHz

The CPU was previously overclocked to 3.7 GHz, as I ran my gaming setup as a VM on the server. After moving to a dedicated gaming rig, I reset all overclocking settings in the BIOS to stock values.

After this, the server started freezing at random, usually daily. When this happens it is apparently still running (case lights are on ;) ) but is not accessible in any way, since the network stack just stops working. The only way to bring it back is a hard reset.

Since this behavior started, I have been getting the following error messages in the syslog:

Mar  3 08:39:49 Nexus kernel: mce: [Hardware Error]: Machine check events logged
Mar  3 08:39:49 Nexus kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 5: bea0000000000108
Mar  3 08:39:49 Nexus kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff813c3054 MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Mar  3 08:39:49 Nexus kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1646293169 SOCKET 0 APIC 6 microcode 8001138

After seeing this, I ran a memtest check overnight, which did not report any errors.

I attached diagnostics. They are, however, from a running system, i.e. NOT taken after a crash: as I said, when the server crashes, it crashes for good and I cannot access any logs.

The only changes between a perfectly running system and one that crashes often are reverting the CPU to stock settings and replacing the crappy PSU with a good one. Maybe one more thing: I briefly used two of the RAM sticks in my new gaming rig before the new RAM arrived. After that, the sticks were put back into the server. Then again, memtest did not detect any errors - I do know this does not mean there are none, but still.

My ideas for further troubleshooting are:
- run the server with only 2 RAM sticks at a time to see if this changes anything
- reset the BIOS settings to defaults, in case I f*** something up while removing the overclock

Any further ideas? Especially about the error message, as I don't really get what it is trying to tell me ;) 
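
As a rough aid for reading the status value in the log above, here is a minimal Python sketch that splits it into the generic machine-check status flags. The names `STATUS_FLAGS`, `decode_status` and `parse_mce_line` are made up for illustration. The bit positions are assumed from the common x86 machine-check status layout (bits 63 to 57), and the vendor-specific bits and the meaning of the low 16-bit error code are deliberately not interpreted.

```python
import re
from datetime import datetime, timezone

# Assumed layout: generic x86 machine-check status flags, bits 63..57.
# Vendor-specific bits below 57 are intentionally left undecoded.
STATUS_FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (error was not corrected by hardware)",
    60: "EN (error reporting enabled for this bank)",
    59: "MISCV (MISC register is valid)",
    58: "ADDRV (ADDR register is valid)",
    57: "PCC (processor context may be corrupt)",
}


def decode_status(status_hex):
    """Split a hex status value into set flags and the low 16-bit error code."""
    status = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {"flags": flags, "error_code": status & 0xFFFF}


def parse_mce_line(line):
    """Pull CPU, bank and status out of a 'Machine Check' syslog line."""
    m = re.search(r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-f]+)", line)
    if m is None:
        return None
    info = {"cpu": int(m.group(1)), "bank": int(m.group(2))}
    info.update(decode_status(m.group(3)))
    return info


def mce_time_utc(epoch_seconds):
    """Convert the TIME field (Unix epoch seconds) to an ISO timestamp in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    line = ("Mar  3 08:39:49 Nexus kernel: mce: [Hardware Error]: "
            "CPU 3: Machine Check: 0 Bank 5: bea0000000000108")
    result = parse_mce_line(line)
    print("CPU", result["cpu"], "bank", result["bank"])
    for flag in result["flags"]:
        print("  ", flag)
    print("error code: 0x%04x" % result["error_code"])
    print("logged at:", mce_time_utc(1646293169))
```

Under these assumptions, the value from my log has UC and PCC set and OVER clear, so it looks like a single uncorrected error rather than a corrected one. That would at least be consistent with the hard freezes.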

nexus-diagnostics-20220303-2019.zip

Edited by Namarath
