[Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...



5 hours ago, 1812 said:

Shout-out to @ich777. We'd had some problems patching the ProLiant edition of Unraid (https://forums.unraid.net/topic/72681-unraid-hp-proliant-edition-rmrr-error-patching/). After spending some time with this Docker container and updating the patch, it looks like ProLiant owners will get to continue unRaiding for a bit longer. Much appreciated!

Much appreciated. ;)

Can you share exactly what you did? Then I may be able to update the container to support it natively. :)

Link to comment
6 hours ago, ich777 said:

Much appreciated. ;)

Can you share exactly what you did? Then I may be able to update the container to support it natively. :)

native support would be fantastic!

this is all a work in progress but this is what I've gotten so far:

 

The patch needed updating to account for changes to the iommu/intel-iommu.c driver. We used to compile this using another script, but it stopped working sometime during the 6.8 series. Once the patch was updated, it was just dropped in the patched folder.

 

The patch still needs a little more work: it is mostly working on my system but causing problems on another tester's. It's a simple patch and hopefully it will all be figured out by today.

 

For integration, would we just pull the patch (which may need updating from time to time) from a GitHub repository? If so, it may need versioning in case they make any more changes to the driver.

Link to comment
1 hour ago, 1812 said:

native support would be fantastic!

this is all a work in progress but this is what I've gotten so far:

 

The patch needed updating to account for changes to the iommu/intel-iommu.c driver. We used to compile this using another script, but it stopped working sometime during the 6.8 series. Once the patch was updated, it was just dropped in the patched folder.

 

The patch still needs a little more work: it is mostly working on my system but causing problems on another tester's. It's a simple patch and hopefully it will all be figured out by today.

 

For integration, would we just pull the patch (which may need updating from time to time) from a GitHub repository? If so, it may need versioning in case they make any more changes to the driver.

If you figure it all out, please let me know and I will update the container and also the template so that you can set a variable (let's say something like 'RMRR_ERROR_PATCH') to 'true' and the container will automatically pull the patch, integrate it and build everything.

 

Also, please provide a link to the patch and tell me what needs to be patched (I think one of the downloads at the bottom of the first post should do the trick: Click).

 

 

EDIT: Maybe this is something that solves another issue: Click (I could also implement that in the container).

 

EDIT2: So you simply drop the .patch file in the user-patches folder and everything works as expected? Am I understanding that right? :)

Link to comment
3 hours ago, ich777 said:

EDIT2: So you simply drop the .patch file in the user-patches folder and everything works as expected? Am I understanding that right? :)

Well, that's how it's supposed to work. I can verify that the patch does modify the correct driver, but it doesn't quite fix everything. I am able to do GPU passthrough on beta 25 where I could not before, but when I try to pass through a 10GbE adapter, it is still blocked.

 

3 hours ago, ich777 said:

EDIT: Maybe this is something that solves another issue: Click (I could also implement that in the container).

This may be needed to fix the issue with the network card, but its implementation is a bit over my head. If anyone wants to help, it would be appreciated!

 

 

3 hours ago, ich777 said:

Also, please provide a link to the patch and tell me what needs to be patched (I think one of the downloads at the bottom of the first post should do the trick: Click).

 

I re-ran with the v2 from that post and can confirm that it does patch the driver and is essentially the same as what I found to work.

Edited by 1812
Link to comment
1 hour ago, 1812 said:

Well, that's how it's supposed to work. I can verify that the patch does modify the correct driver, but it doesn't quite fix everything. I am able to do GPU passthrough on beta 25 where I could not before, but when I try to pass through a 10GbE adapter, it is still blocked.

So it is working partially... :D

 

1 hour ago, 1812 said:

This may be needed to fix the issue with the network card, but its implementation is a bit over my head. If anyone wants to help, it would be appreciated!

That's no problem. I can build it for you and send you the images if you want; just send me a short PM.

 

1 hour ago, 1812 said:

I re-ran with the v2 from that post and can confirm that it does patch the driver and is essentially the same as what I found to work.

So you need the image with v2 and also with the kernel modules? But I think if I compile it with the kernel modules, the patch file isn't necessary (I haven't had time to look into it closely).

Link to comment
1 hour ago, ich777 said:

So it is working partially... :D

Yeah, but it may also be because I have always used the ACS override (multifunction) to break things up into the smallest IOMMU groups. If the sound portion of the GPU triggers it (which happens frequently), then perhaps it's not working at all and I'm just working around it.
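
For anyone following along, here is a small sketch for checking how devices are actually split into IOMMU groups. It assumes the usual Linux sysfs layout under /sys/kernel/iommu_groups; the function name list_iommu_groups is just made up for this example, and the optional argument only exists so it can be pointed at a different directory.

```shell
#!/bin/bash
# List every PCI device together with the IOMMU group it belongs to.
# Assumption: groups are exposed as <root>/<group>/devices/<pci-address>,
# which is the standard sysfs layout on Linux.
list_iommu_groups() {
  local root=${1:-/sys/kernel/iommu_groups} dev group
  for dev in "$root"/*/devices/*; do
    # If the glob matched nothing, skip the literal pattern.
    [ -e "$dev" ] || continue
    group=${dev#"$root"/}   # strip the root prefix
    group=${group%%/*}      # keep only the group number
    printf 'group %s: %s\n' "$group" "${dev##*/}"
  done
}

list_iommu_groups
```

If the GPU and its audio function show up in the same group as other devices, the ACS override is probably still doing the splitting.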

 

 

1 hour ago, ich777 said:

That's no problem. I can build it for you and send you the images if you want; just send me a short PM.

Will contact you!

 

1 hour ago, ich777 said:

So you need the image with v2 and also with the kernel modules? But I think if I compile it with the kernel modules, the patch file isn't necessary (I haven't had time to look into it closely).

I don't know if it requires both. The post isn't clear on whether they applied the patch and then built the modules to go with it, or whether the modules are fine on their own. It may take trial and error, but you will have tons of appreciation from me and the growing ProLiant community!

Link to comment

Hello,

 

Firstly - thank you for this brilliant development. Baking in these drivers has made UnRaid even more versatile.

 

I've been having issues with GPU passthrough to Plex. I used your compiled files for Unraid 6.9.0 beta 25 with Nvidia and ZFS. Via the plugin, I can see my Quadro P600 with a GPU ID. However, when I insert the usual parameters into Plex (documented previously in this thread, and in the linuxserver thread as well), I get the following error:

 

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused "process_linux.go:432: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: device error: GPU-8762e169-fd10-da16-81e8-d34aa3b03103: unknown device\\n\""": unknown.

 

I can provide other information if it helps. Thanks again!
 


Link to comment
15 minutes ago, AlexHuang said:

Hello,

 

Firstly - thank you for this brilliant development. Baking in these drivers has made UnRaid even more versatile.

 

I've been having issues with GPU passthrough to Plex. I used your compiled files for Unraid 6.9.0 beta 25 with Nvidia and ZFS. Via the plugin, I can see my Quadro P600 with a GPU ID. However, when I insert the usual parameters into Plex (documented previously in this thread, and in the linuxserver thread as well), I get the following error:

 

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused "process_linux.go:432: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: device error: GPU-8762e169-fd10-da16-81e8-d34aa3b03103: unknown device\\n\""": unknown.

 

I can provide other information if it helps. Thanks again!
 


Is it possible that you have a space (or, more likely, a hidden newline) in front of or at the end of the UUID?

 

If you're not sure, please paste the UUID into a text editor (Notepad is the best choice if you are on Windows), delete all spaces or other stray characters, and copy the result into the UUID field in the template.
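
If you'd rather do the cleanup in a terminal, something like the following should work. This is only a sketch: clean_uuid is a made-up helper name, and the UUID is the one from the error message above.

```shell
#!/bin/bash
# Remove every whitespace character (spaces, tabs, CR, LF) from a string,
# e.g. a GPU UUID that picked up a hidden newline during copy/paste.
clean_uuid() {
  printf '%s' "$1" | tr -d '[:space:]'
}

# Example: a UUID with a leading space and a trailing newline.
raw=$' GPU-8762e169-fd10-da16-81e8-d34aa3b03103\n'
clean_uuid "$raw"; echo
```

The printed value is what should go into the UUID field of the template, with nothing before or after it.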

 

Edit: Have you built the image yourself or did you download a precompiled one?

Link to comment
19 minutes ago, ich777 said:

Is it possible that you have a space (or, more likely, a hidden newline) in front of or at the end of the UUID?

 

If you're not sure, please paste the UUID into a text editor (Notepad is the best choice if you are on Windows), delete all spaces or other stray characters, and copy the result into the UUID field in the template.

 

Edit: Have you built the image yourself or did you download a precompiled one?

Annndd I'm embarrassed to say that this was exactly the issue... I really thought I had removed all the spaces, etc.


Thank you! I will be testing out the HW transcoding soon!

Link to comment

I'm currently looking for some users to help test my custom build with iSCSI built into Unraid (v6.9.0 beta25).

EDIT: Build for Unraid v6.8.3 is also now available.

 

Currently the creation of the iSCSI target is command-line only (I will write a plugin for that, but for now it should also work this way; it takes only a few commands in targetcli).

 

The configuration is stored on the boot drive and loaded/unloaded with the array start/stop.

If somebody is willing to test the build please contact me.

 

As always, I will release the complete source code and also implement it in my 'Unraid-Kernel-Helper Docker Container' so that everyone can build their own version with other features like Nvidia, ZFS and DVB also built in.

 

Also made a post here:

 

 

Link to comment
1 hour ago, Omri said:

Hi

Any chance that I can enable thunderbolt networking with this tool?

Thanks in advance

I already tried it, but without success; I don't have any Thunderbolt-capable hardware...

Also, I think Thunderbolt should be supported natively by Unraid; correct me if I'm wrong.

I think you also have to authorize a device that is plugged in.

I also talked with @Dtrain about it, but without success.

Link to comment

Hi, thank you for the reply.

AFAIK, it's supported natively in the Linux kernel.

Regarding the authorization, you are correct.

In my case, I disabled the security in the BIOS and it's working without any need to authorize.

Tested with Debian (OMV) and Ubuntu (20.04).

Link to comment
3 hours ago, Omri said:

Hi, thank you for the reply.

AFAIK, it's supported natively in the Linux kernel.

Regarding the authorization, you are correct.

In my case, I disabled the security in the BIOS and it's working without any need to authorize.

Tested with Debian (OMV) and Ubuntu (20.04).

So is it then working in Unraid?

Is the 'thunderbolt' kernel module loaded?
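
A quick way to check is to look for the module name in the output of lsmod. The helper below is only a sketch (module_in_list is a made-up name); it reads lsmod-style output from stdin so it can also be tried against saved output.

```shell
#!/bin/bash
# Succeeds if the named module appears in lsmod-style output on stdin.
# lsmod prints one header line, then the module name in the first column.
module_in_list() {
  awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# On the server itself:
#   lsmod | module_in_list thunderbolt && echo "loaded" || echo "not loaded"
```

If it reports "not loaded", 'modprobe thunderbolt' will show whether the module was built for the running kernel at all.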

Link to comment

Updated the container for both 6.8.3 and 6.9.0 so it can create builds with iSCSI Targets (this feature is currently command-line only; please read the manuals on how to create a target).

I also uploaded prebuilt images for 6.8.3 and 6.9.0beta25 with iSCSI Target built in; they are in the first post of this thread.

 

ATTENTION: If you mount a block volume, always do it with the path '/dev/disk/by-id/...'; otherwise you risk data loss. Please download and read the manual in the first post!
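
Some background on the warning: names like /dev/sdb are handed out in detection order and can change between boots, while the symlinks under /dev/disk/by-id/ are derived from the model and serial number and stay attached to the same disk. The check below is just a sketch to guard a script against using the wrong kind of path; is_stable_disk_path and the disk name in the comment are made up for this example.

```shell
#!/bin/bash
# Succeeds only for paths below /dev/disk/by-id/, which keep pointing at the
# same physical disk across reboots (unlike /dev/sdX names).
is_stable_disk_path() {
  case "$1" in
    /dev/disk/by-id/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# To see which by-id name belongs to which sdX device on the server:
#   ls -l /dev/disk/by-id/
# Hypothetical use before creating a block backstore:
#   is_stable_disk_path "/dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL123" || exit 1
```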

Manual Block Volume.txt Manual FileIO Volume.txt

Link to comment
On 8/1/2020 at 12:30 AM, Omri said:

Sorry for the misunderstanding

It's not working in Unraid

I meant Linux kernel in general

This is exactly my point: on one hand, Unraid is advertising Thunderbolt as fully supported...

 

CONFIG_THUNDERBOLT: Thunderbolt support

CONFIG_INTEL_WMI_THUNDERBOLT: Intel WMI thunderbolt force power driver

 

But in real life... no one cares about it. I have tried many options and wasted days of work on this matter.

 

ich777 has been very, very helpful so far, but there is still no green light in sight.

 

Edited by Dtrain
Link to comment

@Dtrain & @Omri I recently took a deep dive into Thunderbolt again, and it should be possible with Unraid, but I also have to say that I have no Thunderbolt device. Sure, I could mod the BIOS of my motherboard to support PCIe bifurcation, then buy a TitanRidge controller from Gigabyte (which is not available anywhere) and then mod it to power up in my motherboard (please note that this is all possible; as I wrote above, I took a deep dive into Thunderbolt again).

I even got 'boltd' working with unraid (partially I think).

 

Also, please understand that @limetech is busy with the new release of Unraid and, I think, has to focus on other things at the moment.

Maybe they can look into it after they have released 6.9.0.

 

I think the main problem is the authentication of new devices, but that should be no problem with 'boltd', since I've already compiled it and it's working with Unraid (but keep in mind that the security level set for Thunderbolt in the BIOS also matters).

Link to comment
2 hours ago, Omri said:

@ich777

Thank you

any way to test your compiled version of boltd?

 

 

@Dtrain

Yes and no... :D

I think the basics work now, but I don't know if the service starts as intended; that would be a minor tweak.

Please keep in mind that you must authorize the device once you have connected it, and if a special driver is needed, you have to integrate it into the build.

 

Which devices do you want to use?

 

Contact me if you want the finished build, currently I've built it with Unraid v6.9.0beta25.

 

I'm on vacation now, so please keep in mind that my response times are a little longer...

 

EDIT: Contact me via PM

Link to comment

@ich777

I tried doing my own Nvidia custom build with Mellanox, and the Nvidia drivers won't load.

 

I see Nvidia updated their tools in July, maybe something broke?

 

Also, where are the Mellanox tools installed?

 

I want to flash firmware with Flint, so I need to drop the .bin files in that install dir :)

Edited by Dazog
Link to comment
3 hours ago, Dazog said:

@ich777

I tried doing my own Nvidia custom build with mellanox and the nvidia drivers won't load.

 

I see Nvidia updated their tools in July, maybe something broke?

 

Also, where are the Mellanox tools installed?

 

I want to flash firmware with Flint, so I need to drop the .bin files in that install dir :)

A little bit more information please... :D
Which version, 6.8.x or 6.9.0beta25?

What exactly did they update? (By the way, I wanted to make a new build for my server in a few minutes anyway, so I will look into it.) ;)

 

These should be the steps for flashing the Mellanox cards:

 

  1. Download the firmware for your card: https://www.mellanox.com/support/firmware/connectx2en
  2. Extract the bin file to your server, let's say to one of your shares (in this example 'firmware').
  3. Open up a terminal and go to the share that you copied the firmware to (in this example 'firmware'): 'cd /mnt/user/firmware'
  4. Then type in '/sbin/lspci -d 15b3:' and you should get something like: '07:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)'
  5. Then type in 'mstflint -d 07:00.0 -i firmware.bin burn' (replace '07:00.0' with the device ID from the output of step 4 and 'firmware.bin' with the name of the extracted bin file from step 2). This should start burning/flashing the firmware.

(It's a little different since it's based on the open source files)
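
The lspci and mstflint steps above can be sketched as a small script. It only builds and prints the flash command instead of running it, so you can check it before flashing anything. The helper names pci_address and flash_command are made up for this example, and the sample line is the lspci output quoted in the steps.

```shell
#!/bin/bash
# Read `lspci -d 15b3:` output on stdin and print the PCI address
# (first field) of the first Mellanox device listed.
pci_address() {
  awk 'NR == 1 { print $1 }'
}

# Print (do not run) the mstflint command for an address and firmware file.
flash_command() {
  printf 'mstflint -d %s -i %s burn\n' "$1" "$2"
}

# Sample line taken from the steps above; on the server you would use
#   addr=$(/sbin/lspci -d 15b3: | pci_address)
sample='07:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)'
addr=$(printf '%s\n' "$sample" | pci_address)
flash_command "$addr" firmware.bin
```

With the sample line this prints 'mstflint -d 07:00.0 -i firmware.bin burn'. If more than one Mellanox card is installed, only the first one is picked, so check the address by hand in that case.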

 

Have you also installed the Unraid-Kernel-Helper Plugin? Can you also send a screenshot from it?

 

EDIT: Opened an issue on GitHub because of this error:

tar -C /usr/src/libnvidia-container/deps/src/nvidia-modprobe-396.51 --strip-components=1 -xz nvidia-modprobe-396.51/modprobe-utils
######################################################################### 100.0%curl: (28) Failed to connect to codeload.github.com port 443: Connection timed out


gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
make[1]: *** [/usr/src/libnvidia-container/mk/nvidia-modprobe.mk:34: /usr/src/libnvidia-container/deps/src/nvidia-modprobe-396.51/.download_stamp] Error 2
make[1]: Leaving directory '/usr/src/libnvidia-container'
make: *** [Makefile:223: deps] Error 2

 

EDIT2: In the meantime I can send you my image; it's built with Nvidia, Mellanox Tools, DigitalDevices DVB and also iSCSI support.

 

EDIT3: I think GitHub has some problems with redirection; if I try the command multiple times, it works every time on the 2nd or 3rd attempt.
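
For a download that only succeeds on the 2nd or 3rd attempt, a generic retry wrapper is one possible workaround. This is only a sketch and not necessarily what the container does internally; retry is a made-up helper, and the URL variable in the usage comment is a placeholder.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical usage (DOWNLOAD_URL is a placeholder, not a real address):
#   retry 5 10 curl -fsSL -o deps.tar.gz "$DOWNLOAD_URL"
```

The '-f' flag makes curl return a non-zero exit code on HTTP errors, so those count as failed attempts too.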

 

EDIT4: Everything is now working again. Please also redownload or update the container, since I've added a check so that this doesn't happen again and also implemented multicore and better compression of the images themselves.

Link to comment