Random Unraid Server Crashes (No Logs, Inconsistent Intervals)

Hi all,

I'm at my wits' end after spending an entire day troubleshooting this with ChatGPT and other resources.

I’m running into recurring stability issues with my Unraid server that I can’t seem to pin down. The system randomly crashes or resets without leaving anything useful in the syslog. The interval is very inconsistent: crashes can be hours, days, or even weeks apart. I have another Unraid server set up to receive the logs remotely, and the forwarded log simply cuts off before the crash.
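For anyone chasing the same symptom, one way to pin down when each crash happened is to look for silent gaps in the log that the second server received. The following is a minimal sketch, not an Unraid tool: it assumes the remote log uses the traditional syslog timestamp prefix (for example `Sep 12 03:14:15`), which carries no year, so the year is passed in. The function name `find_log_gaps` and the 10-minute threshold are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def find_log_gaps(lines, year, min_gap=timedelta(minutes=10)):
    """Return (last_seen, next_seen) pairs where consecutive log
    timestamps are at least min_gap apart."""
    gaps = []
    previous = None
    for line in lines:
        # Assumption: each line starts with a 15-character stamp
        # such as "Sep 12 03:14:15" (no year included).
        stamp = line[:15]
        try:
            current = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None and current - previous >= min_gap:
            gaps.append((previous, current))
        previous = current
    return gaps
```

Each pair gives the last line logged before the silence and the first line after the reboot, which can then be compared against temperature, load, or scheduled-task times. A quiet server can produce long gaps without crashing, so the threshold needs tuning to the normal logging rate.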

This started happening again about a week or two ago. Around that time, I had added a new Docker container and disabled another, but otherwise made very few changes. I already tried disabling the new container, but the crashes continue.

A bit of history:

  • About a year ago, I had the exact same problem (random crashing, no logs).

  • I replaced the PSU at that time, but it didn’t solve it.

  • I then adjusted some BIOS settings to avoid AMD CPU C-state and power idle issues, updated the BIOS, and added another fan to bring cooler air into the room. After making those changes, the problem went away for quite a while.

  • Fast-forward to now, the random crashes are back. I swapped in another PSU again just to rule it out, but no improvement.

So far, the system just drops dead with no warning and nothing in the logs. I’m trying to determine if this is hardware-related (CPU, RAM, motherboard, etc.), BIOS/firmware settings, or something within Unraid/Docker that’s triggering instability.

BIOS Settings I’ve Adjusted (for stability)

Since this seems to be hardware/firmware related, here are the BIOS settings I’ve already reviewed or changed while troubleshooting:

CPU / Power Management:

  • Disabled Global C-States and set Power Idle Control to a non-low power mode (to avoid Ryzen deep idle bugs).

  • Disabled Core Performance Boost for stability (trading performance for consistency).

  • Disabled CPU Watchdog Timer (server was crashing even with it off).

Memory / DRAM:

  • Enabled/disabled Power Down Enable (tried both).

  • Left Gear Down Mode and Cmd2T at Auto for compatibility.

  • Experimented with DRAM refresh modes and scrubber controls (kept mostly on Auto to avoid instability).

  • XMP/DOCP profile currently disabled (running JEDEC defaults for max stability).

Interconnect / DF Options:

  • Tweaked Memory Interleaving and related settings (left at Auto eventually).

  • Disabled DF Sync Flood Propagation (to prevent cascading errors).

  • Left DF C-States mostly disabled for consistency.

PCIe / GPU / IOMMU:

  • PCIe ASPM Mode set to Disabled (to prevent link power management from destabilizing GPUs).

  • Multiple GPUs: WX 5100 passed through to a VM (VFIO), plus the iGPU and WX 2100 for Docker containers. I have since stopped using the WX 2100 (Docker now uses only the iGPU) and stopped the VM that uses the WX 5100.

  • UMA Frame Buffer Size set manually to avoid auto-adjustment issues (1G).

Security / Virtualization:

  • TSME (Transparent Secure Memory Encryption) disabled (or left at default; I forget which, but it didn't matter).

  • SEV / SEV-ES options left off.

  • CSM (Compatibility Support Module) can’t be enabled anymore after BIOS update (seems forced UEFI).

  • Secure Boot disabled.

Misc / Timers:

  • High Precision Event Timer (HPET): currently disabled.

  • Power Loading: enabled to stabilize low-power PSU states (Typical Current).

  • Spread Spectrum: disabled (to avoid clock modulation issues).

Other Notes:

  • System previously stable after disabling C-states and forcing power idle workarounds.

  • Crashes now occur regardless of these settings.

System Specs

  • Motherboard: Gigabyte X570 AORUS ULTRA (latest BIOS, F39g)

  • CPU: AMD Ryzen 7 5700G with Radeon Graphics @ 3800 MHz (with iGPU – Cezanne Vega)

  • Memory: 128 GB non-ECC (running at JEDEC defaults, XMP off)

  • GPUs:

    • AMD Radeon WX 5100 (passthrough to VM)

    • AMD Radeon WX 2100 (Docker use / compute)

    • Integrated GPU (Cezanne Vega, used for Docker/host tasks)

  • PSU: Recently replaced (Seasonic 800W)

  • Cooling: Added extra intake fan for airflow in server room (which was part of the year-ago effort)

  • Boot: Unraid on USB flash, UEFI-only (CSM disabled by firmware), Secure Boot off

It may or may not be related to running a separate Ubuntu VM, which is currently shut down, and I am waiting to see if it crashes.

Has anyone else experienced this pattern of random, log-less crashes on Unraid, especially with AMD platforms? What would be the most systematic way to isolate the cause?

Thanks in advance for any ideas or troubleshooting steps I might have missed.

Solved by Unsealed0019

  • Community Expert

If nothing relevant is logged, this can also be a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled, and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem. If it doesn't, start turning the other services back on one by one, including the individual Docker containers.
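The one-by-one approach above can be written down as a simple schedule so it is clear which service was added most recently when a crash occurs. This is only an illustrative sketch: the function names, the service names in the usage note, and the default three-day soak period are hypothetical and not part of Unraid.

```python
def staged_enable_plan(services, soak_days=3):
    """Return a list of (start_day, enabled_services) stages.

    Stage 0 is the bare baseline (safe mode, no containers or VMs);
    each later stage adds exactly one more service.
    """
    plan = [(0, [])]
    for index in range(1, len(services) + 1):
        plan.append((index * soak_days, list(services[:index])))
    return plan


def suspect_after_crash(plan, crash_day):
    """Return the service enabled most recently before crash_day,
    or None if the crash happened on the bare baseline."""
    enabled = []
    for start_day, stage_services in plan:
        if start_day <= crash_day:
            enabled = stage_services
    return enabled[-1] if enabled else None
```

For example, `staged_enable_plan(["plex", "ubuntu-vm"])` yields stages starting on days 0, 3, and 6. A crash during stage 0 returns `None`, which points at hardware rather than a service. Since the crashes in this thread were sometimes weeks apart, a longer soak period than a few days may be needed before a stage can be considered clean.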

  • Author

Thanks JorgeB - I will try that. I'm not sure how to tell which hardware component would be most likely to be having the problem. I'm leaning towards replacing the memory first but would likely get ECC this time which also requires a different CPU anyway.

One other thing I forgot to mention: when I had Tips and Tweaks installed, it did not show any CPU governor, and I wasn't sure whether that was caused by the BIOS tweaks or by a software issue that could be resolved.

This System Driver is not present on the crashing server:

acpi_cpufreq (ACPI Processor P-States Driver) : cpufreq
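Whether a frequency-scaling driver is actually registered can be checked directly from the Linux sysfs cpufreq files, independent of any plugin. The sketch below assumes the standard sysfs layout under `/sys/devices/system/cpu/cpuN/cpufreq/`; the function name `read_cpufreq_info` and the `sysfs_root` parameter (there only so the function can be pointed at a test directory) are hypothetical.

```python
from pathlib import Path


def read_cpufreq_info(cpu=0, sysfs_root="/sys"):
    """Read the scaling driver and governor for one CPU from sysfs.

    A value of None means the file is absent, which usually indicates
    that no cpufreq driver is registered for that CPU.
    """
    base = Path(sysfs_root) / "devices/system/cpu" / f"cpu{cpu}" / "cpufreq"
    info = {}
    for name in ("scaling_driver", "scaling_governor"):
        path = base / name
        info[name] = path.read_text().strip() if path.is_file() else None
    return info
```

Calling `read_cpufreq_info()` on the server and getting `None` for both entries would be consistent with the missing driver reported above. It does not by itself say whether BIOS settings or the kernel configuration is responsible.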

What is the likelihood this is a failing USB Flash drive - can I test for that?

  • Community Expert

If you have multiple sticks, try running the server with just one. If it still crashes, try a different one. That will basically rule out bad RAM.

  • Community Expert
1 hour ago, Unsealed0019 said:

What is the likelihood this is a failing USB Flash drive - can I test for that?

That will usually leave something in the syslog.

  • Author
2 hours ago, JorgeB said:

That will usually leave something in the syslog.

If it says 'errors detected, scan and fix' when I plug it into a Windows PC, would that indicate an issue, and should I perform that action?

  • Community Expert

That could just be from the server crashing, since it won't be cleanly unmounting the flash drive, but you can try using a different one to make sure.

  • Author

I’ve hit a new wall in troubleshooting:

  • Memory:

    • I pulled 2 RAM sticks (down to 64 GB, 2×32 GB). With this setup, the system won’t boot at all.

    • I cannot run the bundled Unraid memtest (it freezes at “Loading... /something OK”).

    • Currently running PassMark MemTest86 instead (UEFI), since at least that one launches.

  • Boot behavior:

    • Attempting to boot Unraid from the flash now also freezes during startup (USB devices shut off, must hard reset).

    • Safe mode resets on its own.

    • Normal mode hangs at the boot menu screen, which feels less like an OS/driver issue and more like a hardware or flash media issue.

  • Flash drive suspicion:

    • Plugging the Unraid USB stick into a Windows machine triggers “errors detected, scan and fix”.

    • This makes me suspect the flash drive may be at least part of the problem, though as you mentioned this is also likely to happen from unclean shutdowns.

  • BIOS:

    • I restored to optimized defaults and only made a couple of edits:

      • Typical Current Idle

      • Enabled SVM and IOMMU

      • Secure Boot off

Will let memtest go for a while then will try booting the new USB drive. I do have some older smaller memory sticks I could revert back to just to outright rule out the current sticks.

  • Author

PassMark MemTest86 passed 100% with no errors, and it booted without issue.

Booting from a restored backup on a brand-new USB drive runs into the same boot failure (won't get past Loading /bzroot... OK). The USB devices turn off afterwards and the system hangs.

New USB sticks show exact same symptoms (Unraid or Ubuntu).

After removing all GPUs, reverting memory to some older sticks, unplugging power to hard drive cages, behavior is still about the same.

After a CMOS reset, CSM would remain enabled, which is more motherboard strangeness since the BIOS update.

It seems like I need a new motherboard or PSU, but the PSU swap in the past didn't resolve the issue. Only tweaking some BIOS power settings made it go away, and only temporarily (for about a year).

  • Author
  • Solution

Swapped the motherboard - that wasn't it. Swapped the CPU and could then boot anything under the sun again.

Edited by Unsealed0019
