[SOLVED] multiple server crashes

hello community,

 

I set up my first Unraid server on mostly new hardware a few days ago and keep getting server crashes during parity checks or when copying data from a USB device to the disks on the server using Krusader.

 

Things I did so far:

Ran a memtest yesterday without errors.

Disabled C-states in BIOS

no OC (CPU/ RAM)

 

This time I could save the syslog when it crashed, see attached.

Also, it seems that copying the syslog file to the flash drive doesn't work. At least that is my interpretation of these lines:

Sep  9 09:10:07 Mittelerde rsyslogd:  Could not find template 1 'flash' - action disabled [v8.2002.0 try https://www.rsyslog.com/e/3003 ]
Sep  9 09:10:07 Mittelerde rsyslogd: error during parsing file /etc/rsyslog.conf, on or before line 66: errors occured in file '/etc/rsyslog.conf' around line 66 [v8.2002.0 try https://www.rsyslog.com/e/2207 ]
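
Lines like the two above can be pulled out of a saved syslog with a small filter. This is only a minimal sketch, assuming the syslog is available as plain text. The function name `find_suspect_lines` and the keyword patterns are hypothetical choices for illustration, not anything that ships with Unraid, and the third sample line is a made-up placeholder for an unrelated message.

```python
import re

# Hypothetical patterns; adjust them to whatever you are hunting for.
SUSPECT_PATTERNS = [
    r"rsyslogd:.*(error|action disabled)",  # rsyslog config problems
    r"\bNMI\b",                             # non-maskable interrupts
    r"Kernel panic",                        # kernel panics
]

def find_suspect_lines(lines, patterns=SUSPECT_PATTERNS):
    """Return every line that matches at least one of the patterns."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [
        line.rstrip("\n")
        for line in lines
        if any(c.search(line) for c in compiled)
    ]

sample = [
    "Sep  9 09:10:07 Mittelerde rsyslogd:  Could not find template 1 'flash' - action disabled [v8.2002.0 try https://www.rsyslog.com/e/3003 ]",
    "Sep  9 09:10:07 Mittelerde rsyslogd: error during parsing file /etc/rsyslog.conf, on or before line 66: errors occured in file '/etc/rsyslog.conf' around line 66 [v8.2002.0 try https://www.rsyslog.com/e/2207 ]",
    "placeholder line with nothing of interest",  # made-up filler, not from a real log
]

hits = find_suspect_lines(sample)
for hit in hits:
    print(hit)
```

To run it against a real file, something like `find_suspect_lines(open("syslog.txt"))` should work, where `syslog.txt` is whatever name the extracted log has.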

 

I'm not using a cache drive at the moment as I first wanted to upload all data.

 

HW specs:

MSI x570 Tomahawk

AMD Ryzen 7 3700x

MSI GT 710

4x8GB HyperX Fury RAM

Ziyituod PCIe SATA 4 port card

4x WD Red 3TB (used)

4x WD Blue 1TB (used)

ADATA XPG SX8200 Pro 512GB

1x Crucial 128GB SSD (used)

 

No VMs

 

I'm using the beta version as it enables the onboard network (as far as I read in the forum).

 

I hope someone can help me find the root cause of the crashes or has suggestions to improve the setup.

 

 

syslog_20200909-1119.zip

  • Community Expert

Ryzen and Linux are not the best combination, though it does work OK for some. There are NMIs logged, but I have no idea what is causing them. Make sure the RAM is not running above the max officially supported speed for your config; I can't see that without the full diags.

  • Author

thanks for the reply.

RAM is currently running at 2400

  • Community Expert
3 minutes ago, die3zehn said:

RAM is currently running at 2400

That's fine for 3rd gen with a full load. Look for a BIOS update; unfortunately, other than that I don't have many suggestions. A future Unraid release with a newer kernel might also help.

  • Author

updated bios, will test for some time again.

 

do you think the additional SATA card causes issues?

  • Community Expert
5 minutes ago, die3zehn said:

do you think the additional SATA card causes issues?

I don't know that model, complete diags would give more info.

  • Author

attached the complete diags while server was running.

In the meantime it crashed again. Unfortunately I don't have a syslog file after that. Just saw a kernel panic on the screen. Was just running parity.

 

mittelerde-diagnostics-20200909-1410.zip

  • Community Expert

The add-on SATA controller uses a SATA port multiplier, and those are not recommended. I doubt it's the reason for the current issues, but if you can test without it, do so.

  • Author

Tested the server without the SATA card, same issue. Another "Kernel panic - not syncing: Attempted to kill the idle task! Shutting down cpus with NMI"

 

Does anyone have another idea? Guess using the stable version is no option, as its kernel is even older?

  • Community Expert
12 hours ago, die3zehn said:

Guess using the stable version is no option

It won't hurt to try.

  • Author

The stable version didn't really make a difference, unfortunately.

 

MSI released a new BIOS which contains "Updated AMD AGESA ComboAm4v2PI 1.0.8.1" and some improvements to S3 wake-up issues. I have been testing it since yesterday.

The parity check went through overnight, and the server has been up for 17 hours without a crash 😀. However, I saw a CPU load of 90 - 100% when I checked this morning.

The WebUI was very slow due to the CPU load. Not sure what caused the load, as nothing was actively running at that moment.

 

Attached the syslog from this morning.

 

mittelerde-syslog-20200911-0714.zip

  • Community Expert

Please post the complete diags instead.

  • Author

I don't have one from the time of the high CPU load.

I've got one from last evening, and I can pull one now if that helps.

Next time I will try to remember to pull one.

  • Community Expert

System load average is very high, but nothing in top is using the CPU.

  • Author

Not sure if this is still due to using a Ryzen CPU, as it seems that some cores tend to stay at 100% after they stop working. It still happens during file transfers. I'm no longer using Krusader, but now SMB seems to crash, and the only thing that works is a manual reboot with the reset button.

 

diags attached

mittelerde-diagnostics-20200913-1004.zip

  • Community Expert

Does the same happen after booting in safe mode?

  • Author

Tried it overnight and got a kernel panic after some cpuidle_enter_state messages. I couldn't take a diag file, I just saw it on the screen connected to the server.

 

Could this be RAM related as well? I read that memtest isn't that reliable. I would now try it with one RAM module at a time to see if one of them is causing issues.

Although most of the error messages point to CPU issues.

 

Maybe I should just accept that unraid and my HW setup just don't work together.....

  • Community Expert

Did you set the correct "power supply idle control" option as described here? If yes try disabling c-states completely.

  • Author

thanks JorgeB

I did set it to "Typical Current Idle", just double checked.

Will now set the "Global C-State Control" to disabled as well and test it again

 

 

  • Author

After getting crashes even with those settings, I moved the USB flash drive to another USB port, as I saw I had it on a USB 3.1 port instead of a USB 2.0 port.

I still got a crash while copying data.

Now I have removed 3 RAM modules and left just one in. I could copy my data without a crash of Krusader or the system. So everything that caused a crash before has gone through so far.

 

I'm using two 16GB kits of 2x8GB RAM, and I read somewhere that this might cause issues, as the two kits are not tested together. Or maybe I just mixed up the modules and put them in the "wrong" slots, as the kit pairs should be in A2 & B2 and A1 & B1. I will try that out as well and put the modules back one after the other to see if I get more crashes. I just hope that this really is the root cause of my issues.

Will report back my experience after testing. Any idea how long I should run the server with the different combinations of RAM to get a rough idea if it is stable or not?
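
The test order described above (one module alone, then each kit as a pair, then both kits together) can be written down as a checklist. This is just a rough sketch: the module labels and the function name `build_test_plan` are hypothetical, and it does not say how long each configuration needs to run.

```python
# Hypothetical labels for two kits of two modules each.
kits = {
    "kit1": ["kit1-a", "kit1-b"],
    "kit2": ["kit2-a", "kit2-b"],
}

def build_test_plan(kits):
    """List RAM configurations to try, from smallest to largest."""
    plan = []
    # Step 1: every module on its own.
    for modules in kits.values():
        for module in modules:
            plan.append((module,))
    # Step 2: each kit as a matched pair.
    for modules in kits.values():
        plan.append(tuple(modules))
    # Step 3: all modules from all kits together.
    plan.append(tuple(m for modules in kits.values() for m in modules))
    return plan

plan = build_test_plan(kits)
for step, config in enumerate(plan, start=1):
    print(f"Test {step}: {', '.join(config)}")
```

If a configuration crashes while all smaller ones were stable, the module or kit added last is the first suspect.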

 

 

  • 2 weeks later...
  • Author

So after testing for a while now with only one kit installed, it seems to be running.

I think one of the RAM modules was not working correctly. Memtest showed no errors, but whenever I added this module the server crashed after a while.

I returned this kit.

Guess this thread can be closed now.

Thanks a lot for your support JorgeB

  • JorgeB changed the title to [SOLVED] multiple server crashes

Archived

This topic is now archived and is closed to further replies.
