[Plugin] Linuxserver.io - Unraid Nvidia



I started running into this error after (I believe) the last driver update was pushed out:

 

Oct 21 12:17:32 Unraid1 kernel: NVRM: GPU 0000:0d:00.0: Failed to copy vbios to system memory.

Oct 21 12:17:32 Unraid1 kernel: NVRM: GPU 0000:0d:00.0: RmInitAdapter failed! (0x30:0xffff:794)

Oct 21 12:17:32 Unraid1 kernel: NVRM: GPU 0000:0d:00.0: rm_init_adapter failed, device minor number 1

 

Running 6.9.0-Beta30 with nvidia driver 450.80.02

unraid1-diagnostics-20201021-1232.zip

Link to comment

Hello everyone:
Please bear with me, as I'm going to be very detailed about my issue. I did search and found nothing related to what I'm experiencing. I'm fairly new to Unraid, but my build has been working fine since it was completed this past April. All components have been the same since then, and nothing has changed software-wise. I should mention that I am using the Unraid Nvidia plugin; I followed Spaceinvader's video for configuration, and it worked perfectly to pass the GPU to the Plex docker while still having video output from the card in the Unraid GUI. I've also had transcodes running while keeping video in the GUI.
This past Saturday, I shut down my server to move it out of the way for some work I was going to do. As mentioned, nothing was changed hardware- or software-wise. The PC was turned off and moved. When I booted it, it started normally: the motherboard brand logo came on, then the Unraid boot menu, where I picked the OS with GUI option as always. It goes through the boot commands, but right at the moment the "Login" page is supposed to come up, the video dies. The monitor/TV says there is no input from the device.
I did some troubleshooting: I turned off the Plex docker, since that is what uses the GPU, and even restarted it with the docker not auto-starting, but the result is the same: no video when the GUI is supposed to be visible.

Last night I uninstalled the Unraid Nvidia plugin, rebooted, and voilà! Video worked as it did before. I got the login page and everything. So I went ahead and reinstalled the plugin to see if things normalized and, sure enough, I lost video again. I'm not a guru at this or even at reading the logs, and I don't know what could be causing this.
I'm asking if anyone has any ideas or possible steps to troubleshoot this some more before I do a full wipe and start from scratch. TBH, I won't lose much, as I already have backups of everything.
Can anyone help?

Much appreciated

Specs:
Asus Prime X570-Pro AMD motherboard with the latest BIOS version
Ryzen 9
Nvidia Quadro P2000
64 GB RAM

do-minion-syslog-20201021-2117.zip

Link to comment

Hey everyone,

 

All of a sudden I'm having issues with unRAID Nvidia not recognizing my GPU. I've got a Quadro P400 running in a Dell R510. I had the whole system running perfectly, and then all of a sudden, my plex installation crashed. After having to hard boot my system and a subsequent failure of plex to start, I narrowed the culprit down to the GPU. I disabled the additional arguments in the docker shell that allowed it to use the GPU for hardware accelerated transcoding and then began my troubleshooting.

I moved the GPU into my desktop computer and it seemed to run just fine. I then plugged the GPU into the riser cable I had, and plugged that into my desktop - no go. Assuming I had found the issue, I ordered a new riser cable. It arrived today and so I eagerly shut down the server, plugged the riser cable and GPU in and booted the server. At this time, unRAID Nvidia does not recognize it however when I run the command "lspci | grep -i nvidia" my output is as follows: 

03:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)

 

In the system log, I see this:

Oct 21 18:58:01 PowerEdgeR510 kernel: NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x31:0xffff:973)
Oct 21 18:58:01 PowerEdgeR510 kernel: NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
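
For anyone collecting the same evidence, here is a minimal sketch of pulling the NVIDIA kernel module (NVRM) failure lines out of a log file. The function name `nvrm_failures` is made up for illustration, and the default path `/var/log/syslog` is an assumption about where the syslog lives on your system; pass a different file if yours is elsewhere.

```shell
#!/bin/bash
# Print NVRM (NVIDIA kernel module) log lines that report a failure.
# Usage: nvrm_failures [logfile]
# Assumption: the syslog is at /var/log/syslog unless a path is given.
nvrm_failures() {
    local logfile="${1:-/var/log/syslog}"
    # First keep only NVRM lines, then only those mentioning a failure.
    grep 'NVRM:' "$logfile" | grep -i 'fail'
}
```

Running `nvrm_failures` next to `lspci | grep -i nvidia` and `nvidia-smi -L` shows in one place whether the card is visible on the PCI bus, whether the driver can enumerate it, and what the driver logged when it tried to initialise it.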

 

Any insight would be fantastic as I'm unsure of what to do next.

Link to comment

error-nvidia.thumb.jpg.eeda5948fdc457b00604e4f9edbde207.jpg

 

OK, I know this has been posted a ton. I've read the past 60+ pages in here.

 

I've deleted my network.cfg & network-rules.cfg, totally fresh settings. No Pi-Hole or dns filter. I even put my server on the DMZ for a moment to be sure.

 

Has anyone run into these conditions before?

Link to comment

It's having problems downloading the sha256 files for unraid 6.8.3

 

Now installing Nvidia version 6.8.3

Base URL: https://lsio.ams3.digitaloceanspaces.com/unraid-nvidia/6-8-3/nvidia

TO AVOID CORRUPTION
DO NOT CLOSE THIS WINDOW UNTIL YOU SEE THE DONE PROMPT

Downloading: /tmp/mediabuild/bzimage ... done
Downloading: /tmp/mediabuild/bzroot ... done
Downloading: /tmp/mediabuild/bzroot-gui ... done
Downloading: /tmp/mediabuild/bzmodules ... done
Downloading: /tmp/mediabuild/bzfirmware ... done
Downloading: /tmp/mediabuild/bzimage.sha256 ...

I left it there for tens of minutes and it doesn't pass this step.
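
A stalled download like this can be checked by hand outside the plugin. The sketch below is not what the plugin does internally; it only shows one way to fetch a file with a hard time limit and to compare a file against its `.sha256` companion. The function names are hypothetical, and it assumes the first whitespace-separated field of the `.sha256` file is the hex digest.

```shell
#!/bin/bash
# Fetch a URL with a hard time limit so a stalled transfer fails instead of hanging.
fetch_with_timeout() {
    local url="$1" dest="$2"
    curl --fail --silent --show-error --location \
         --connect-timeout 15 --max-time 120 \
         --output "$dest" "$url"
}

# Compare a file's SHA-256 digest with the digest stored in "<file>.sha256".
# Assumption: the first field of the .sha256 file is the hex digest.
# Returns 0 on match, 1 on mismatch, 2 if a digest could not be read.
verify_sha256() {
    local file="$1" expected actual
    expected=$(awk '{print $1; exit}' "${file}.sha256") || return 2
    actual=$(sha256sum "$file" | awk '{print $1}') || return 2
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

For example, `fetch_with_timeout "<Base URL>/bzimage.sha256" /tmp/bzimage.sha256`, using the Base URL printed in the log above, should either finish or fail within two minutes. The remote file name is assumed to match the local one shown in the log. If the fetch times out from the server but works from another machine on the same network, that points at the server's network path and not at the plugin.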

Link to comment
On 10/16/2020 at 10:18 PM, saarg said:

Looks good. I think there have been a few other 1650 owners having the same issue as you. I don't remember if they managed to solve it somehow or not. Check the previous post and you might find out.

Okay, thanks for the response but unfortunately that doesn't help.

Searching this thread, I did find the previous posts about issues with the 1650, I have tried the suggestions of moving the card to an alternate PCI slot, and connecting a monitor. Neither of those have helped the plugin to detect the card :(

 

What irks me is that it worked before, transcoding in Plex and all, and all I did to break it was to boot into the GUI. Once that was done, even after rebooting back into CLI mode, no graphics cards can be found by the plugin despite it showing up in IOMMU groups...

 

Is it possible that booting into the GUI has set up some boot-surviving configuration for the Xorg whereby it claims the card, even if it's not actually booting into the GUI? If so, I am guessing I would need to find the configuration files for that to remove such an entry?

Sorry I do not have a good understanding of the UNRAID booting process, or Xorg in general. I may be way off here, but grasping at straws seems to be my only option here.

 

Does anyone have any new suggestions?

 

Many thanks

 

EDIT:

Poking around on the filesystem, I found this file:

image.png.3a089a4a1c3037b69c993ecc1b83c2db.png

Am I barking up a tree or does this information help at all? Does it confirm my suspicion that Xorg is getting in my way, or is this a relatively normal, expected configuration?

Edited by Ninjadude101
Link to comment
21 minutes ago, Ninjadude101 said:

Okay, thanks for the response but unfortunately that doesn't help.

Searching this thread, I did find the previous posts about issues with the 1650, I have tried the suggestions of moving the card to an alternate PCI slot, and connecting a monitor. Neither of those have helped the plugin to detect the card :(

 

What irks me is that it worked before, transcoding in Plex and all, and all I did to break it was to boot into the GUI. Once that was done, even after rebooting back into CLI mode, no graphics cards can be found by the plugin despite it showing up in IOMMU groups...

 

Is it possible that booting into the GUI has set up some boot-surviving configuration for the Xorg whereby it claims the card, even if it's not actually booting into the GUI? If so, I am guessing I would need to find the configuration files for that to remove such an entry?

Sorry I do not have a good understanding of the UNRAID booting process, or Xorg in general. I may be way off here, but grasping at straws seems to be my only option here.

 

Does anyone have any new suggestions?

 

Many thanks

 

EDIT:

Poking around on the filesystem, I found this file:

image.png.3a089a4a1c3037b69c993ecc1b83c2db.png

Am I barking up a tree or does this information help at all? Does it confirm my suspicion that Xorg is getting in my way, or is this a relatively normal, expected configuration?

I never use the Unraid GUI together with an Nvidia GPU passed through to a container, but I don't think that should get in the way. You can easily check that by booting in non-GUI mode.

 

Did you also run the command to manually find the GPU UUID that is somewhere in this thread, posted by chbmb?

 

Have you updated your bios or changed any settings lately?

Link to comment
2 minutes ago, saarg said:

I never use the Unraid GUI together with an Nvidia GPU passed through to a container, but I don't think that should get in the way. You can easily check that by booting in non-GUI mode.

 

Did you also run the command to manually find the GPU UUID that is somewhere in this thread, posted by chbmb?

 

Have you updated your bios or changed any settings lately?

To clarify, I was not booted into GUI mode when I ran that command. In fact, I will be staying far away from GUI mode from now on!

I did run the command yes - nvidia-smi -L

Unfortunately that took a few seconds and then reported the same - "No devices found."
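
When `nvidia-smi -L` reports no devices even though the card shows up on the PCI bus, one further check is which kernel driver, if any, has claimed the card. A minimal sketch: the helper name `driver_in_use` is made up, and it simply extracts the "Kernel driver in use" field from `lspci -k` output.

```shell
#!/bin/bash
# Read `lspci -k` style output on stdin and print the kernel driver bound to
# each listed device, one per line. Prints nothing if no driver is bound.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}
```

Usage would be something like `lspci -k -s 03:00.0 | driver_in_use`, substituting the card's own bus address. A result of `nvidia` means the NVIDIA module has claimed the card, `vfio-pci` means it is reserved for VM passthrough, and no output means nothing is bound. This only narrows things down; it does not by itself explain an `RmInitAdapter failed` message.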

 

I have not touched BIOS settings since installing UNRAID, as I don't want to cause an issue where it won't boot. I appreciate that this thread is about transcoding, Plex, etc., but I have a lot of other, more important, stuff on the server which would be devastating to lose (I suffered many data-loss incidents in the past before I started using UNRAID).

 

As for "other" settings, it's difficult to say. I don't go around messing with settings unless there's a reason but I have definitely made some changes recently.

Most notably it's network settings - I have some docker containers which run network level caching, handle a bit of routing, that kind of thing.

I have also been trying to get this plugin to work once again, which has involved many cycles of un-installing (not the plugin, the UNRAID build with Nvidia support) and reinstalling, and inserting, relocating, and removing the graphics card from PCI slots etc.

 

Would you be able to use any information in the Diagnostics zip UNRAID creates if I upload that here?

 

Thanks

Dan

Link to comment

If you have not already done it you might also want to try power-cycling the server in case the GPU has got into a state that means it needs to be restarted from cold.

 

In principle, Unraid always unpacks itself afresh from the archives on the flash drive, so there should be no remnant of using GUI mode when you reboot after a power-cycle.

Link to comment
12 hours ago, MadeOfCard said:

error-nvidia.thumb.jpg.eeda5948fdc457b00604e4f9edbde207.jpg

 

OK, I know this has been posted a ton. I've read the past 60+ pages in here.

 

I've deleted my network.cfg & network-rules.cfg, totally fresh settings. No Pi-Hole or dns filter. I even put my server on the DMZ for a moment to be sure.

 

Has anyone run into these conditions before?

It resolved itself in the morning. IDK

Link to comment
5 hours ago, MadeOfCard said:

It was a vanilla build that got reformatted again and again. Of course, don't expose your personal server to the DMZ. I was just trying to nail down the issue.

It's not only that server. You open up your whole network if the attackers breach your unraid server.

Link to comment

Hi Everyone,

 

I've recently purchased and installed a GeForce GTX 1650 (after initially setting up the plugin on the existing GeForce GT 1030, which I found out afterwards doesn't 'do' transcoding). I've changed the UUID (to the new card), rebooted/powered off a few times, and changed from the Nvidia 6.8.3 to the Nvidia 6.9.0 beta 30 build, all in an effort to get it working (which I would expect to show up in nvidia-smi as a running process and a change in percentage load while I watch a video that is transcoding in PLEX). All to no avail: nothing I do seems to get the GPU to engage (it is constantly at 0%).

Is it because I'm running the plexinc/pms-docker rather than the LinuxServer.io version? If so, that would seem strange, but can someone provide a how-to on switching a PLEX setup (with a few external users) from one to the other?

Is there something I can share (logs, etc) that might help?

 

Thanks in advance - just trying to get it to work and at my wits' end.

Link to comment
1 hour ago, ds679 said:

Hi Everyone,

 

I've recently purchased and installed a GeForce GTX 1650 (after initially setting up the plugin on the existing GeForce GT 1030, which I found out afterwards doesn't 'do' transcoding). I've changed the UUID (to the new card), rebooted/powered off a few times, and changed from the Nvidia 6.8.3 to the Nvidia 6.9.0 beta 30 build, all in an effort to get it working (which I would expect to show up in nvidia-smi as a running process and a change in percentage load while I watch a video that is transcoding in PLEX). All to no avail: nothing I do seems to get the GPU to engage (it is constantly at 0%).

Is it because I'm running the plexinc/pms-docker rather than the LinuxServer.io version? If so, that would seem strange, but can someone provide a how-to on switching a PLEX setup (with a few external users) from one to the other?

Is there something I can share (logs, etc) that might help?

 

Thanks in advance - just trying to get it to work and at my wits' end.

Did you activate hardware transcoding in plex and do you have plex pass?

 

Did you follow the first posts for setting up the plex container?

 

Post your docker run command.

Link to comment
4 minutes ago, saarg said:

Did you activate hardware transcoding in plex and do you have plex pass?

 

Did you follow the first posts for setting up the plex container?

 

Post your docker run command.

Checkboxes within PLEX Settings --> Transcoder: Disable video stream transcoding - unchecked; Use hardware acceleration when available - checked; Use hardware-accelerated video encoding - checked.

Yes, PLEX Lifetime pass.

Setting up the PLEX container --> added --runtime=nvidia under 'Extra Parameters', and two variables (with 'all' and the UUID).

docker run command:

root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker run -d --name='PlexMediaServer' --net='host' --privileged=true -e TZ="America/Chicago" -e HOST_OS="Unraid" -e 'PLEX_CLAIM'='Insert Token from https://plex.tv/claim' -e 'PLEX_UID'='99' -e 'PLEX_GID'='100' -e 'VERSION'='latest' -v '/mnt/user/appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Cache/Transcode/':'/transcode':'rw' -v '/mnt/user/appdata/PlexMediaServer/':'/data':'rw' -v '/mnt/user/Recorded_TV-Archive2/Kindle_Movies/':'/media/Kindle_Movies':'rw,slave' -v '/mnt/user/Recorded_TV-Archive2/':'/media/TV_Shows-Tower3':'rw,slave' -v '/mnt/user/Music-Library/':'/media/music_redo':'rw,slave' -v '/mnt/user/Pictures/':'/media/Photos':'rw,slave' -v '/mnt/user/appdata/PlexMediaServer':'/config':'rw' --runtime=nvidia 'plexinc/pms-docker'

8201a247e5df38516ddff953b83e577e35e2391d02d50adf2f4203eaa35b0c3e

 

Link to comment
1 hour ago, ds679 said:

Checkboxes within PLEX Settings --> Transcoder: Disable video stream transcoding - unchecked; Use hardware acceleration when available - checked; Use hardware-accelerated video encoding - checked.

Yes, PLEX Lifetime pass.

Setting up the PLEX container --> added --runtime=nvidia under 'Extra Parameters', and two variables (with 'all' and the UUID).

docker run command:

root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker run -d --name='PlexMediaServer' --net='host' --privileged=true -e TZ="America/Chicago" -e HOST_OS="Unraid" -e 'PLEX_CLAIM'='Insert Token from https://plex.tv/claim' -e 'PLEX_UID'='99' -e 'PLEX_GID'='100' -e 'VERSION'='latest' -v '/mnt/user/appdata/PlexMediaServer/Library/Application Support/Plex Media Server/Cache/Transcode/':'/transcode':'rw' -v '/mnt/user/appdata/PlexMediaServer/':'/data':'rw' -v '/mnt/user/Recorded_TV-Archive2/Kindle_Movies/':'/media/Kindle_Movies':'rw,slave' -v '/mnt/user/Recorded_TV-Archive2/':'/media/TV_Shows-Tower3':'rw,slave' -v '/mnt/user/Music-Library/':'/media/music_redo':'rw,slave' -v '/mnt/user/Pictures/':'/media/Photos':'rw,slave' -v '/mnt/user/appdata/PlexMediaServer':'/config':'rw' --runtime=nvidia 'plexinc/pms-docker'

8201a247e5df38516ddff953b83e577e35e2391d02d50adf2f4203eaa35b0c3e

 

You have not added the GPU to the container. There are no env variables for the UUID and the compute options. So no wonder it's not working.

Also, why do you volume mount the transcode folder of the appdata to /transcode? It's already in the appdata volume mount.
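
To make this kind of check repeatable, here is a small sketch that scans a pasted docker run command for the three GPU-related settings. The function name `missing_gpu_settings` is hypothetical. `NVIDIA_DRIVER_CAPABILITIES` is the variable named later in this thread; `NVIDIA_VISIBLE_DEVICES` is the NVIDIA container runtime's variable for selecting a GPU (by UUID, for instance) and is assumed here to be the "UUID" variable being referred to.

```shell
#!/bin/bash
# Report which GPU-related settings are absent from a docker run command line.
# Prints one "missing: ..." line per absent setting; prints nothing if all
# three are present. This is a plain substring check, not a real parser.
missing_gpu_settings() {
    local cmd="$1" needle
    for needle in '--runtime=nvidia' 'NVIDIA_VISIBLE_DEVICES' 'NVIDIA_DRIVER_CAPABILITIES'; do
        case "$cmd" in
            *"$needle"*) ;;                    # setting found, nothing to report
            *) echo "missing: $needle" ;;
        esac
    done
}
```

Run against the command posted above, this would report both environment variables as missing while `--runtime=nvidia` is present, which matches the diagnosis: the runtime is selected but no GPU is actually handed to the container.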

Link to comment
40 minutes ago, saarg said:

You have not added the GPU to the container. There are no env variables for the UUID and the compute options. So no wonder it's not working.

Also, why do you volume mount the transcode folder of the appdata to /transcode? It's already in the appdata volume mount.

 

So strange... I have it listed as a PLEX variable entry?! I've also been setting up the LinuxServer.io version and noticed that it was present in the docker run command there... so, progress!

 

When you say 'compute options', do you mean NVIDIA_DRIVER_CAPABILITIES = 'all', or something else?

The transcode mount must be part of PlexInc's setup... should I remove it?

 

Thanks again - this really helps!

Link to comment
On 10/21/2020 at 7:06 PM, Scroopy Noopers said:

Hey everyone,

 

All of a sudden I'm having issues with unRAID Nvidia not recognizing my GPU. I've got a Quadro P400 running in a Dell R510. I had the whole system running perfectly, and then all of a sudden, my plex installation crashed. After having to hard boot my system and a subsequent failure of plex to start, I narrowed the culprit down to the GPU. I disabled the additional arguments in the docker shell that allowed it to use the GPU for hardware accelerated transcoding and then began my troubleshooting.

I moved the GPU into my desktop computer and it seemed to run just fine. I then plugged the GPU into the riser cable I had, and plugged that into my desktop - no go. Assuming I had found the issue, I ordered a new riser cable. It arrived today and so I eagerly shut down the server, plugged the riser cable and GPU in and booted the server. At this time, unRAID Nvidia does not recognize it however when I run the command "lspci | grep -i nvidia" my output is as follows: 

03:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)

 

In the system log, I see this:

Oct 21 18:58:01 PowerEdgeR510 kernel: NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x31:0xffff:973)
Oct 21 18:58:01 PowerEdgeR510 kernel: NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0

 

Any insight would be fantastic as I'm unsure of what to do next.

As a follow-up to this, I have attempted to downgrade to 6.8.3, but Unraid Nvidia is still not seeing the GPU. It is, however, seeing my cache drive as an unassigned device. When I try to assign it, it comes up as a new drive, so I am going to reinstall beta 30 (again).

 

This being said, I am able to assign the GPU for passthrough to my Win Server VM.

Link to comment
  • trurl locked this topic