
Server keeps crashing

Featured Replies

I initially installed an M.2 Google Coral and the Coral Accelerator Module Driver plugin, but my server started crashing. Assuming that this was the issue, I removed the Google Coral and uninstalled the plugin. My server is still crashing and I have no idea why. I'm hoping someone could look at the diagnostics and let me know.

Fix common problems states that there is a hardware issue:

Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the Unraid forums (which is what I did).

 

 

Edited by clowncracker

Solved by clowncracker

  • Community Expert
5 hours ago, clowncracker said:

Your server has detected hardware errors.

This usually means just that: a hardware problem. Start by running memtest.

  • Author
12 hours ago, JorgeB said:

This usually means just that: a hardware problem. Start by running memtest.

 

Memtest has completed four passes with no issues.

 

The weird thing is that it isn't an instant crash. The server is fine for 3ish hours and then the UI just stops working. I cannot access it from the webpage and I have to manually restart the computer.

Edited by clowncracker

  • Author

Just crashed again during a parity check, after being online for about 3 hours and 45 minutes.  I needed to manually restart my server to get it to be responsive again.

I have a notification popup that says Parity check finished (0 errors) with a duration of over 19 hours, even though the server was online for less than four hours.

I'd like to note that Fix Common Problems (and the syslog) no longer indicate that this is a hardware issue. I've attached the syslog.

 

 

Edited by clowncracker

  • Author
19 hours ago, JorgeB said:

This usually means just that: a hardware problem. Start by running memtest.

Another update, I've had it running in safe mode with all VMs and Dockers disabled for 7 hours with no crashes yet.

Is the RAM ECC?... Clutching at straws

 

13 hours ago, clowncracker said:

The server is fine for 3ish hours and then the UI just stops working. 

 

Again, clutching at straws... Does the server keep working while the UI stops working?

 

Hope it helps.

 

MGrey.

 

 

  • Author
8 minutes ago, MrGrey said:

Is the RAM ECC?... Clutching at straws

 

 

Again, clutching at straws... Does the server keep working while the UI stops working?

 

Hope it helps.

 

MGrey.

 

 

All of the VMs and Dockers stop working. I think it just crashes, but the computer doesn't turn off.

 

Not ECC RAM. Considering it's been working for about 8 hours at this point in safe mode with no Dockers/VMs running, I'm fairly certain the hardware error was a false alarm.

 

This all started when I installed the M.2 Google Coral and the driver plugin, so I think the driver plugin messed something up. Even after I uninstalled the plugin and removed the Google Coral, the issue persisted.

Seems you have a more or less stable mode now, which means you can start narrowing down what exactly causes the error. It's going to take a lot of effort, since you have to wait many hours each time, but you can at least start re-enabling things bit by bit and see how the server reacts.

 

Otherwise, maybe try limiting every Docker container and VM to just one CPU core via pinning and check again. Maybe it's just one container going berserk and taking up 100% CPU on all cores, so nothing else can work anymore?
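
If you want to check the "one container going berserk" theory before pinning anything, here is a rough Python sketch that compares two snapshots of per-process CPU time from /proc and lists the biggest consumers in between. It assumes a Linux /proc layout and that Python is available on the server; the function names (parse_stat, snapshot, top_consumers) are made up for this example and are not part of Unraid or any plugin.

```python
import glob
from typing import Dict, List, Tuple


def parse_stat(text: str) -> Tuple[int, str, int]:
    """Parse the contents of /proc/<pid>/stat into (pid, name, cpu_ticks).

    The process name sits in parentheses and may itself contain spaces or
    parentheses, so we split around the first "(" and the last ")".
    """
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    name = text[left + 1:right]
    rest = text[right + 1:].split()
    # rest[0] is the state (field 3), so utime (field 14) and stime
    # (field 15) are at indexes 11 and 12.
    ticks = int(rest[11]) + int(rest[12])
    return pid, name, ticks


def snapshot() -> Dict[int, Tuple[str, int]]:
    """Read the CPU ticks used so far by every process visible in /proc."""
    result: Dict[int, Tuple[str, int]] = {}
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                pid, name, ticks = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or unexpected format
        result[pid] = (name, ticks)
    return result


def top_consumers(
    before: Dict[int, Tuple[str, int]],
    after: Dict[int, Tuple[str, int]],
    limit: int = 5,
) -> List[Tuple[int, str, int]]:
    """Return (pid, name, ticks_used) for the busiest processes in between."""
    deltas = []
    for pid, (name, ticks) in after.items():
        if pid in before:  # skip processes that started after the first snapshot
            deltas.append((ticks - before[pid][1], pid, name))
    deltas.sort(reverse=True)
    return [(pid, name, delta) for delta, pid, name in deltas[:limit]]
```

To use it, call snapshot(), wait (for example time.sleep(30)), call snapshot() again, and pass both results to top_consumers(). Processes inside containers show up under their host PIDs, so you would still have to map a PID back to its container yourself.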

 

 

  • Author

Is there nothing in the system log or diagnostics that might help determine the cause?

@JorgeB any chance you can look at the latest diagnostics and system log I attached? Nothing has changed in my config at this point and I have no idea how to diagnose this issue.

 

Edited by clowncracker

  • Community Expert

There are call traces and a segfault logged, but those by themselves don't point to a culprit; they just suggest a hardware problem. RAM and/or board would be my main suspects.
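
For anyone who wants to look for these entries in their own log, below is a minimal Python sketch that flags lines mentioning segfaults, call traces, or machine check events. The keyword patterns are my own guess at common kernel log markers and are certainly not exhaustive, and the function name find_suspect_lines is made up for this example. Finding such lines only tells you where to look, not what is at fault.

```python
import re
from typing import Iterable, List, Tuple

# Assumed markers of trouble in a kernel log; extend as needed.
PATTERNS = {
    "segfault": re.compile(r"segfault", re.IGNORECASE),
    "call_trace": re.compile(r"call trace", re.IGNORECASE),
    "machine_check": re.compile(r"machine check|\bmce\b", re.IGNORECASE),
}


def find_suspect_lines(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_number, label, text) for each line matching a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label, line.rstrip("\n")))
                break  # report each line once, under the first matching label
    return hits
```

A typical use would be to open the syslog file from the diagnostics zip (or /var/log/syslog on the running server, assuming that is where your log lives), pass the file object to find_suspect_lines(), and print the results.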

  • Author
5 hours ago, JorgeB said:

There are call traces and a segfault logged, but those by themselves don't point to a culprit; they just suggest a hardware problem. RAM and/or board would be my main suspects.

 

I believe the server crashes when the CPU gets near 100% utilization. If memtest didn't give me any errors, do you think that means it's the motherboard?
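
One way to test the "crashes near 100% CPU" hunch is to log overall CPU utilization to a file that survives the reboot and check the last entries after the next crash. Here is a rough Python sketch that computes utilization from the aggregate "cpu" line of /proc/stat. It assumes a Linux /proc layout; the function names and the idea of pointing the log at persistent storage (for example somewhere on the flash drive) are my own assumptions, not something from the diagnostics.

```python
import os
import time
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into its time counters.

    Only the first eight counters are kept (user, nice, system, idle,
    iowait, irq, softirq, steal); guest time is already included in user.
    """
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("not the aggregate cpu line")
    return [int(value) for value in parts[1:9]]


def busy_fraction(before: List[int], after: List[int]) -> float:
    """Fraction of time the CPUs were not idle between two samples."""
    deltas = [new - old for old, new in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = sum(deltas[3:5])  # idle + iowait
    return (total - idle) / total


def read_cpu_counters(stat_path: str = "/proc/stat") -> List[int]:
    with open(stat_path) as handle:
        return parse_cpu_line(handle.readline())


def log_forever(log_path: str, interval_seconds: float = 5.0) -> None:
    """Append a timestamped utilization line every interval, until killed."""
    previous = read_cpu_counters()
    with open(log_path, "a") as log:
        while True:
            time.sleep(interval_seconds)
            current = read_cpu_counters()
            percent = 100.0 * busy_fraction(previous, current)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} cpu_busy={percent:.1f}%\n")
            log.flush()
            os.fsync(log.fileno())  # push to disk so a hard crash loses little
            previous = current
```

Calling log_forever() with a path on storage that persists across reboots would leave a trail up to the moment of the hang. If the last lines show high utilization every time, that supports the theory; if not, load is probably not the trigger.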

  • Community Expert

Could be, can never say for sure.

  • Author
  • Solution

The issue ended up being the motherboard.
