[Unraid 6.9.2] Complete lock-up, cannot identify possible cause



Hey there,

I have been running Unraid for a bit over a year now. It had been perfectly stable until mid-December 2021, when seemingly random lock-ups started. I've had a total of 4-5 lock-ups so far, and every one of them has behaved the same way:

I only ever notice that the server has locked up when Plex goes offline or the WebGui becomes unresponsive. At that point the WebGui times out and the server no longer responds to pings, so I have to do a hard reset. The power/reset button doesn't work; I have to switch off the PSU.

 

The start of the lock-ups correlates with me getting tdarr to work on my machine. I had tdarr and tdarr_node running when the first lock-ups occurred, which seemed to point to this bug report. However, I couldn't confirm that for my server, since I didn't have the Syslog Server set up (properly) at the time.

With both tdarr containers stopped, the server ran fine for ~3 weeks. Yet I still got a lock-up last night (early Feb 1, 2022).

 

Power Supply Idle Control and Global C-States have been set accordingly in the BIOS (I don't remember the exact wording in the Asus BIOS of my MB). The RAM speed has also been set to 2400 MT/s.

 

I am at my wits' end. Maybe someone can point me in the right direction. Thanks!

 

Syslog and diagnostics have been attached. Please let me know if there's any info missing.
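
For anyone going through a syslog like the one attached: since the server stops logging when it hangs, the last line before a long silence is usually the first place to look. Below is a minimal sketch that lists such silences. It assumes each line starts with a classic `Mmm dd HH:MM:SS` timestamp without a year (so the year has to be supplied), and the names `parse_timestamp` and `find_gaps` as well as the 10-minute threshold are made up for illustration, not part of Unraid.

```python
from datetime import datetime, timedelta


def parse_timestamp(line, year=2022):
    """Parse a leading 'Mmm dd HH:MM:SS' timestamp; return None if the line has none.

    Assumption: the log uses the classic syslog format, which carries no year.
    """
    # Collapse the double space used for single-digit days ("Feb  1").
    stamp = " ".join(line[:15].split())
    try:
        return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, min_gap=timedelta(minutes=10), year=2022):
    """Return (gap, last_line_before, first_line_after) for every silence >= min_gap.

    Lines without a parsable timestamp are skipped. A year rollover
    (Dec -> Jan) yields a negative difference and is therefore not reported.
    """
    gaps = []
    prev_time, prev_line = None, None
    for line in lines:
        ts = parse_timestamp(line, year)
        if ts is None:
            continue
        if prev_time is not None and ts - prev_time >= min_gap:
            gaps.append((ts - prev_time, prev_line.rstrip(), line.rstrip()))
        prev_time, prev_line = ts, line
    return gaps
```

To try it, open the log with `open("syslog-192.168.2.105.log", errors="replace")` and pass the file object to `find_gaps`; each returned tuple holds the length of the silence, the last line logged before it, and the first line logged after it. A quiet system can also produce long gaps, so the threshold may need adjusting.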

 

System specs:


Unraid 6.9.2

 

Hardware:

  • CPU: AMD Ryzen 7 2700X
  • MB: ASUS ROG Strix B450-F Gaming
  • RAM: Crucial Ballistix 32GB (2x 16GB), DDR4-3200, CL16 | running at 2400
  • PSU: Corsair HX750 80+ Plat
  • HBA: LSI SAS 9201-16i | on top-most PCIe slot, cooled with an extra fan
  • USB: SanDisk Cruzer Fit 16GB USB 2.0
  • SSD (SATA): 1x Crucial MX500 250GB, 3x Crucial MX500 1TB, 1x Patriot Burst 480GB (pass-through) | all connected through MB SATA ports
  • SSD (NVMe): 1x WD SN750 1TB | M.2 on MB
  • HDD (SATA): 6x Exos X14 12TB, 6x Exos X16 16TB | all connected through HBA
  • UPS: APC Back-UPS 700VA

 

1x Win 10 VM running, with half of the CPU cores/HT pinned and isolated (see pic)

cpu_pinning.png

 

no CPU core pinning for Docker

 

Docker:

  • vm_custom_icons
  • JDownloader2*
  • binhex-krusader
  • lidarr (linuxserver)
  • netdata*
  • pihole
  • plex (linuxserver)*
  • tdarr
  • tdarr_node

where * denotes containers that were running when the server locked up (to my knowledge)

 

Plugins:

  • Fix Common Problems
  • CA Cleanup Appdata
  • CA Mover Tuning
  • Community Applications
  • Dynamix Active Streams
  • Dynamix Auto Fan Control
  • Dynamix Cache Directories
  • Dynamix S3 Sleep
  • Dynamix SSD TRIM
  • Dynamix System Information
  • Dynamix System Statistics
  • Dynamix System Temperature
  • File Activity
  • Nerd Tools
  • Preclear Disks
  • Unassigned Devices
  • Unassigned Devices Plus
  • unBALANCE
  • User Scripts

 

moria-diagnostics-20220201-1430.zip syslog-192.168.2.105.log

Edited by NeptuneSpear0205
random link was inserted, Unraid version