[Plugin] Nvidia-Driver


ich777


2 hours ago, ich777 said:

As you can see here you have a Code 79 XID error (like it was before):

NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.

(You can read more about that here in the table of possible causes)

 

Sadly, this error is a bit hard to troubleshoot since it can mean almost anything: a faulty card (frame buffer error), an error on the PCI bus, issues with the power supply, a thermal issue, or an issue with the driver (the last one is actually unlikely, because the plugin always uses the same pre-compiled driver package and it is working as expected on other systems).

 

Did you change anything in your system: a BIOS setting, added hardware, or similar?

 

Are you also sure that your power supply is still up to the task and can deliver enough juice to run the card properly (IIRC the last user had a faulty power supply)?

 

Please also make sure that you enable Above 4G Decoding and Resizable BAR Support (if you have that option) in your BIOS.

Have you tried booting with UEFI instead of Legacy mode yet (don't forget to allow UEFI boot)?

 

You could also try, if this is possible for you, to put the card in another system, install the driver and put some 3D load on it for half an hour and see if the card is working properly.
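For the half-hour load test suggested above, it can help to log temperature and power while the load is running. The following is only a minimal sketch: it assumes `nvidia-smi` is on the PATH and that the card reports the queried fields (low-end cards may return `[N/A]`, which is stored as `None`). The function names `parse_sample`, `read_sample` and `watch` are made up for this example.

```python
import subprocess
import time

# Fields requested from nvidia-smi, in this order.
QUERY = "temperature.gpu,power.draw,utilization.gpu"
FIELDS = ("temperature_c", "power_w", "utilization_pct")


def parse_sample(row):
    """Turn one CSV row from nvidia-smi into a dict; unreadable values become None."""
    values = [v.strip() for v in row.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}: {row!r}")
    sample = {}
    for key, value in zip(FIELDS, values):
        try:
            sample[key] = float(value)
        except ValueError:
            # e.g. "[N/A]" on cards that do not expose the sensor
            sample[key] = None
    return sample


def read_sample():
    """Query the first GPU once. Raises if nvidia-smi fails or hangs for 30 s."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True, timeout=30,
    ).stdout
    return parse_sample(out.strip().splitlines()[0])


def watch(duration_s=1800, interval_s=10):
    """Print one sample every interval_s seconds for duration_s seconds."""
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        print(time.strftime("%H:%M:%S"), read_sample())
        time.sleep(interval_s)
```

Calling `watch()` while the 3D load is running prints one line every ten seconds. If `read_sample()` starts raising partway through, note the time and compare it with the syslog.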

With this GT 710 it seems to work, but Plex doesn't stream anything to it.

Thu Dec 14 18:26:07 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.199.02   Driver Version: 470.199.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 N/A |                  N/A |
| 50%   41C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
docker run \
  -d \
  --name='Plex-Media-Server' \
  --net='host' \
  -e TZ="Europe/Berlin" \
  -e HOST_OS="Unraid" \
  -e HOST_HOSTNAME="nas" \
  -e HOST_CONTAINERNAME="Plex-Media-Server" \
  -e 'PLEX_UID'='99' \
  -e 'PLEX_GID'='100' \
  -e 'VERSION'='latest' \
  -e 'NVIDIA_VISIBLE_DEVICES'='GPU-7e5696c2-b6d8-ed4b-a534-a7b6c28ce8d4' \
  -e 'NVIDIA_DRIVER_CAPABILITIES'='all' \
  -l net.unraid.docker.managed=dockerman \
  -l net.unraid.docker.webui='http://[IP]:[PORT:32400]/web' \
  -l net.unraid.docker.icon='https://raw.githubusercontent.com/plexinc/pms-docker/master/img/plex-server.png' \
  -v '/tmp/':'/transcode':'rw' \
  -v '/mnt/user/Media/':'/data':'rw' \
  -v '/mnt/user/Plex_Config/':'/config':'rw' \
  --runtime=nvidia 'plexinc/pms-docker'

560b98e904be23bfb575b4d1815d4a08d5c4c0f4ad0749c23694e160d706ff47
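Since `NVIDIA_VISIBLE_DEVICES` in the template above is set to a GPU UUID, and every card has its own UUID, it is worth confirming after a card swap that the UUID in the template is still one the driver reports. Below is a minimal sketch of that check, assuming `nvidia-smi` is on the PATH. The helper names `parse_gpu_list`, `uuid_is_present` and `query_gpus` are made up for this example and are not part of the plugin.

```python
import subprocess


def parse_gpu_list(csv_text):
    """Parse rows of 'index, name, uuid' as printed by nvidia-smi in CSV mode."""
    gpus = []
    for row in csv_text.strip().splitlines():
        if row.count(",") < 2:
            continue  # skip blank or malformed rows
        # Split the index off the front and the UUID off the back,
        # so a comma inside the name would not break parsing.
        index, rest = row.split(",", 1)
        name, uuid = rest.rsplit(",", 1)
        gpus.append({"index": int(index), "name": name.strip(), "uuid": uuid.strip()})
    return gpus


def uuid_is_present(gpus, wanted_uuid):
    """True if the UUID from the Docker template matches a detected GPU."""
    return any(gpu["uuid"] == wanted_uuid for gpu in gpus)


def query_gpus():
    """Ask the driver which GPUs it currently sees."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name,uuid", "--format=csv,noheader"],
        capture_output=True, text=True, check=True, timeout=30,
    ).stdout
    return parse_gpu_list(out)
```

For example, `uuid_is_present(query_gpus(), "GPU-7e5696c2-b6d8-ed4b-a534-a7b6c28ce8d4")` returns `False` if the card the template refers to is no longer installed, in which case the template needs the UUID of the new card.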

 

Edited by VolzanIT
Link to comment
3 minutes ago, VolzanIT said:

With this GT 710 it seems to work, but Plex doesn't stream anything to it.

I think you mean that Plex isn't transcoding, correct?

Please note that the GT710 is a pretty low end card and is only capable of transcoding h264 (AVC) and not h265 (HEVC), so this might be the issue here.

 

Maybe look for something like an Nvidia T400: it is pretty low power, a recent card (Turing based), doesn't need external power, and you can get it pretty cheap.

Link to comment
16 minutes ago, ich777 said:

I think you mean that Plex isn't transcoding, correct?

Please note that the GT710 is a pretty low end card and is only capable of transcoding h264 (AVC) and not h265 (HEVC), so this might be the issue here.

 

Maybe look for something like an Nvidia T400: it is pretty low power, a recent card (Turing based), doesn't need external power, and you can get it pretty cheap.

Yes, it is just for testing; I was transcoding on a 1650 previously.

Link to comment
Dec 20 14:25:54 nas kernel: NVRM: GPU at PCI:0000:01:00: GPU-458b205b-492c-8589-395a-9599271b9ce7
Dec 20 14:25:54 nas kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
Dec 20 14:25:54 nas kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
Dec 20 14:25:54 nas kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x22:0x56:762)
Dec 20 14:25:54 nas kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

So I swapped the GPU for a 1660 with an external PSU and it throws the same error. Next step is a BIOS update; after that I think I need to change the motherboard. I hope not :( It is too old, and with an i5 6600K, finding a motherboard that supports this CPU will be a pain.

Link to comment
30 minutes ago, VolzanIT said:
Dec 20 14:25:54 nas kernel: NVRM: GPU at PCI:0000:01:00: GPU-458b205b-492c-8589-395a-9599271b9ce7
Dec 20 14:25:54 nas kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
Dec 20 14:25:54 nas kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
Dec 20 14:25:54 nas kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x22:0x56:762)
Dec 20 14:25:54 nas kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

So I swapped the GPU for a 1660 with an external PSU and it throws the same error. Next step is a BIOS update; after that I think I need to change the motherboard. I hope not :( It is too old, and with an i5 6600K, finding a motherboard that supports this CPU will be a pain.

As said, this could also be a firmware issue, that is, a BIOS error, which most of the time isn't solved with a BIOS update.
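For anyone who wants to check their own syslog for the same messages, here is a minimal sketch that picks out the Xid and RmInitAdapter lines. The two patterns are written against the log lines quoted in this topic, so other driver versions may format them differently. The names `NvrmEvent`, `parse_nvrm_line` and `scan` are made up for this example.

```python
import re
from typing import NamedTuple, Optional

# Patterns modelled on the NVRM kernel lines quoted in this topic.
XID_RE = re.compile(r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<code>\d+),")
INIT_RE = re.compile(
    r"NVRM: GPU (?P<pci>[0-9a-fA-F:.]+): RmInitAdapter failed! \((?P<detail>[^)]+)\)"
)


class NvrmEvent(NamedTuple):
    kind: str    # "xid" or "init_failed"
    pci: str     # PCI address as printed in the log
    detail: str  # Xid code, or the bracketed RmInitAdapter status


def parse_nvrm_line(line: str) -> Optional[NvrmEvent]:
    """Return an event for Xid / RmInitAdapter lines, None for anything else."""
    match = XID_RE.search(line)
    if match:
        return NvrmEvent("xid", match.group("pci"), match.group("code"))
    match = INIT_RE.search(line)
    if match:
        return NvrmEvent("init_failed", match.group("pci"), match.group("detail"))
    return None


def scan(lines):
    """Collect all recognised NVRM events from an iterable of log lines."""
    return [event for event in map(parse_nvrm_line, lines) if event is not None]
```

Typical use would be `scan(open("/var/log/syslog"))`; the log path is an assumption and may differ on your system. An Xid code of 79 corresponds to the "GPU has fallen off the bus" message discussed above.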

Link to comment

My Nvidia GTX 1060 card is not showing up in the Driver Plugin. I can see the card in System Devices, but in the plugin it says no devices were found.

 


 

I do however see this error in the logs:

Dec 21 14:11:15 Unraid kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
Dec 21 14:11:15 Unraid kernel: nvidia-uvm: Loaded the UVM driver, major device number 239.
Dec 21 14:11:16 Unraid kernel: NVRM: GPU 0000:05:00.0: RmInitAdapter failed! (0x24:0x72:1436)
Dec 21 14:11:16 Unraid kernel: NVRM: GPU 0000:05:00.0: rm_init_adapter failed, device minor number 0
Dec 21 14:11:16 Unraid kernel: NVRM: GPU 0000:05:00.0: RmInitAdapter failed! (0x24:0x72:1436)
Dec 21 14:11:16 Unraid kernel: NVRM: GPU 0000:05:00.0: rm_init_adapter failed, device minor number 0

 

I have read that the RmInitAdapter failed error can sometimes be related to power. I am using the 6-pin PCIe power cable from my PSU. I have also tried a 15-pin SATA power to 6-pin PCIe adapter and saw the same error.

 

Attached are my diagnostics. Any suggestions? 

 

unraid-diagnostics-20231221-1417.zip

Edited by hovee
Link to comment
9 hours ago, hovee said:

I have read that the RmInitAdapter failed error can sometimes be related to power. I am using the 6-pin PCIe power cable from my PSU. I have also tried a 15-pin SATA power to 6-pin PCIe adapter and saw the same error.

Are you really sure that this card is supported in your server? HP servers often have a device lock and they only allow certain cards to work.

 

However, the lockups that you are experiencing are almost certainly caused by the plugin calling nvidia-smi.

 

Are you sure that the card is working properly?

Please check if you have Above 4G Decoding and Resizable BAR support enabled in your BIOS.

Are you also sure that your power supply is able to deliver enough power to the card?

Can you swap the card to another PCIe slot?

Are you on the latest BIOS version for your motherboard?

Link to comment

 

10 hours ago, ich777 said:

Are you really sure that this card is supported in your server? HP servers often have a device lock and they only allow certain cards to work.

 

However, the lockups that you are experiencing are almost certainly caused by the plugin calling nvidia-smi.

 

Are you sure that the card is working properly?

Please check if you have Above 4G Decoding and Resizable BAR support enabled in your BIOS.

Are you also sure that your power supply is able to deliver enough power to the card?

Can you swap the card to another PCIe slot?

Are you on the latest BIOS version for your motherboard?

I believe the card is supported, as I've read posts and watched videos of people using a 1060 and a 1080 in my server, an HP Z620.

 

I upgraded the BIOS to the latest version today. I didn't see any options for Above 4G Decoding or Resizable BAR in the BIOS. I did try changing a setting for 'Video Options Rom' from Legacy to EFI. That didn't seem to make a difference.

The power supply is 800 watts and should supply enough power.

An older Quadro card I had works in the PCI slot and displays video output of Unraid booting, etc., so I know the PCI slot is OK.

 

I'm going to try the 1060 card in another PC I have to see if it works over there as I'm wondering if the card is bad. 

Link to comment
4 hours ago, hovee said:

I'm going to try the 1060 card in another PC I have to see if it works over there as I'm wondering if the card is bad. 

I tried it in another PC that has Windows 10 installed. The card doesn't work in that machine either. No video output. I think the card may be bad. I'll see if I can get it replaced and try again with a new card.

Link to comment
23 hours ago, hovee said:

I tried it in another PC that has Windows 10 installed. The card doesn't work in that machine either. No video output. I think the card may be bad. I'll see if I can get it replaced and try again with a new card.

I put in a different GTX 1060 and it was recognized instantly in the Nvidia Plugin. It was definitely a bad card. Thank you for your help!

Link to comment
48 minutes ago, cyruspy said:

Would you consider adding support for Tesla cards?

Which card do you have? Some are supported.

 

48 minutes ago, cyruspy said:

Wouldn't going with vGPU/SR-IOV allow using the card for containers (emby/shinobi) and VMs (Windows) at the same time?

Can't do that, because the driver is proprietary and you can only get access to it if you are registered. Even if I had the driver, redistribution is not allowed by Nvidia.

Link to comment
1 hour ago, cyruspy said:

Thanks. Will give it a try.

but consider this driver is then not meant to be used with

 

On 12/24/2023 at 1:11 PM, cyruspy said:

2. Wouldn't going with vGPU/SR-IOV allow using the card for containers (emby/shinobi) and VMs (Windows) at the same time?

That's a different story. As a note, you can use the P4 on the host (Docker containers), but NOT simultaneously in a VM and Docker. It's either/or usage: driver for the host (Docker containers) OR passed through to one (1) VM.

Link to comment

Howdy all,

I'm having an issue with my new Tesla P4. It hums along happily enough, but then I get the dreaded "GPU has fallen off the bus" issue that I've seen throughout this topic.

I've messed around in my BIOS (mainly setting PCIe to Gen 2, which I've seen mentioned here), but my X79-UD3 doesn't have either the Above 4G Decoding or Resizable BAR options, which I fear might be my issue.

Anyhoo, I'm hoping someone can poke at my logs and see something obvious.

Thanks in advance.

 

nvidia-bug-report.log.gz squirrelserver-diagnostics-20231230-1601.zip

Link to comment
