Help troubleshooting total system freeze with Nvidia 3090

Hey guys, I'm desperately hoping someone can help me troubleshoot a system stability problem. I think I've confirmed it's related to the 3090 TI in my system and the nvidia-driver plugin. I say this because I had a similar issue when this system was running Ubuntu natively, and downgrading the drivers to some version (can't remember which) ultimately solved it.

 

I'm on the latest 7.0.1 build of Unraid Server.

Currently I've removed the power cables from the GPU, but it's still physically plugged in. I want to wait another day or two for the parity checks to complete, and also ensure the system runs stable with the GPU disconnected.

 

The short version: At some point within 24 hours, my system will completely lock up, to the point where it won't even respond on console. My only solution is to physically power cycle the system, which we all know is terrible. And yes, I've run a memory test. The initial pass completed just fine, and after almost 24 hours there were still no errors.

 

The longer version: As I mentioned, this physical server used to run Ubuntu natively and it was built to be an AI server. I'm months into troubleshooting this issue now, and after swapping out my HBA etc., I finally plugged the GPU back in and enabled some of the docker containers that will use the 3090 (Ollama and Plex). Since I can't easily replicate the error, I simply have to wait 24 hours or so to see if it fails or not, which makes this just the most painful thing I've ever had to troubleshoot.

 

The lack of logs also makes this incredibly difficult. I did set up a syslog server on my network, but the logs don't have any information on what might have happened. The only message within the timeframe of the lockup is:

 

Mar 10 03:57:39 123.123.123.64 monitor_nchan: Stop running nchan processes
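
For anyone digging through a remote syslog capture like this, here is a minimal sketch of pulling out every line within a few minutes of a suspected freeze time. It assumes classic `Mon DD HH:MM:SS` syslog timestamps with no year, as in the line above. The function names, the 10-minute window, and the first sample line are hypothetical, made up for illustration only.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a classic syslog line."""
    stamp = " ".join(line.split()[:3])
    # Classic syslog stamps carry no year, so the caller supplies one.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def lines_near(lines, freeze_time, minutes=10):
    """Return the lines stamped within `minutes` before or after freeze_time."""
    window = timedelta(minutes=minutes)
    selected = []
    for line in lines:
        try:
            when = parse_syslog_time(line, freeze_time.year)
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if abs(when - freeze_time) <= window:
            selected.append(line)
    return selected


sample = [
    # Hypothetical line, invented for illustration; not from the real log.
    "Mar 10 03:40:00 123.123.123.64 example: hypothetical earlier line",
    # The one real line quoted in the post.
    "Mar 10 03:57:39 123.123.123.64 monitor_nchan: Stop running nchan processes",
]

# Assumed freeze time: the timestamp of the last message received.
freeze = datetime(2025, 3, 10, 3, 57, 39)
for line in lines_near(sample, freeze):
    print(line)
```

With a real capture, `sample` would be replaced by the lines read from the syslog server's file. Widening `minutes` may help if the last messages before a hard lockup never made it over the network.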

 

I have tried using the different driver versions available within the nvidia-drivers plugin and it seems that all of them fail. I can't remember what version I used on Ubuntu that eventually cleaned this up.

 

Ultimately I'm hoping someone can either find something in the diagnostics, or has first-hand experience with a working 3090 TI in their Unraid server.

 

serenity-diagnostics-20250310-0838.zip

Edited by ChadDa3mon

  • Community Expert

If you confirm the issue really is the GPU I would recommend asking for help here:

 

  • Author

Thanks. I created a post there as well.

  • ich777 locked this topic
