[Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...


Recommended Posts

2 hours ago, Phil85 said:

Another question!

Can I use DigitalDevices, TBS and other TV cards with the LibreELEC DVB driver package?

Because the Hauppauge quad card I ordered is not available at the moment and I have to look for a different solution.

 

Thanks

Possibly...

DigitalDevices should also work.

It depends on which modules are needed.

If you are using a TBS card I recommend the TBS build.

Link to comment

I finally made time to try this Docker container.

Got it to work: the Win10 VM works perfectly, reset bug patch included; the Linux VM not really (the second boot crashed at the 'efi vga to amdgpudrmfb' notification).

But I found out that the GPU that unRAID is using stays in the P0 state; I tried switching GPUs to see if that would help, no luck...

It's also spamming the log with:

kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
kernel: caller _nv000745rm+0x1af/0x200 [nvidia] mapping multiple BARs

This is with unRAID 6.8.3.

I am not sure whether the 6.9 beta has this issue as well... (tried Nvidia v450.66, v450.57 and v440.100)

 

I did update the Vega 64 driver on Win10 (finally it works!) and for now switched back to the LS.io version.

Edited by sjaak
Link to comment
1 minute ago, sjaak said:

I finally made time to try this Docker container.

Got it to work: the Win10 VM works perfectly, reset bug patch included; the Linux VM not really (the second boot crashed at the 'efi vga to amdgpudrmfb' notification).

But I found out that the GPU that unRAID is using stays in the P0 state; I tried switching GPUs to see if that would help, no luck...

It's also spamming the log with:

kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
kernel: caller _nv000745rm+0x1af/0x200 [nvidia] mapping multiple BARs

This is with unRAID 6.8.3.

I am not sure whether the 6.9 beta has this issue as well... (tried Nvidia v450.66, v450.57 and v440.100)

Can't tell what's wrong, but you know that the driver you install into the image with this container is mainly meant for use in Docker containers themselves and not for a VM?

I think the error could indicate that the graphics card is used for a VM and also for a Docker container, or at least also reserved for the Docker service.

 

What hardware are you using?

 

If you are using AMD Ryzen hardware I would strongly recommend trying the Unraid beta (but keep in mind that this is a beta and not everything is working properly).

Link to comment
2 minutes ago, ich777 said:

Can't tell what's wrong, but you know that the driver you install into the image with this container is mainly meant for use in Docker containers themselves and not for a VM?

I think the error could indicate that the graphics card is used for a VM and also for a Docker container, or at least also reserved for the Docker service.

 

What hardware are you using?

 

If you are using AMD Ryzen hardware I would strongly recommend trying the Unraid beta (but keep in mind that this is a beta and not everything is working properly).

I am not using the Nvidia GPU for the VM; that uses the Vega 64 (this is why I needed the AMD Reset Bug Patch ;) ), it's passed through.

unRAID uses the GeForce GT 710 (GUI boot), the Plex media server uses the GeForce 1050 Ti. With the build from this container the GT 710 stays in the P0 state and spams the log with that message.

If I switch in the BIOS from the GT 710 to the 1050 Ti (and use it for unRAID and Plex), the GT 710 still stays in the P0 state and keeps spamming the log.

 

I am on Threadripper gen 1, still hesitant to try the beta build...

Link to comment
4 minutes ago, sjaak said:

I am not using the Nvidia GPU for the VM; that uses the Vega 64 (this is why I needed the AMD Reset Bug Patch ;) ), it's passed through.

unRAID uses the GeForce GT 710 (GUI boot), the Plex media server uses the GeForce 1050 Ti. With the build from this container the GT 710 stays in the P0 state and spams the log with that message.

If I switch in the BIOS from the GT 710 to the 1050 Ti (and use it for unRAID and Plex), the GT 710 still stays in the P0 state and keeps spamming the log.

 

I am on Threadripper gen 1, still hesitant to try the beta build...

I'll have to look this up myself since I don't have multiple graphics cards in my system (I personally have a Xeon and not AMD) and I only use the nVidia GPU for my Docker containers.

Link to comment

...quick question, as your container sounds like a very powerful tool.

Is it possible to change existing modules as well, instead of adding new ones?

 

I have a problem with an unRAID build based on 10th gen Intel motherboards with the i219-V NIC.

As it turns out, many Socket 1200 boards come with a newer revision of the onboard NIC, and the default driver/module (e1000e) in 6.8.3 does not detect these, while the driver in 6.9b25 does fine.

Maybe one could use your container to backport the driver/module from 6.9b25 to 6.8.3?

 

Many thanks in advance and keep up the good work!

 

regards,

ford

Link to comment
13 minutes ago, Ford Prefect said:

...quick question, as your container sounds like a very powerful tool.

Is it possible to change existing modules as well, instead of adding new ones?

 

I have a problem with an unRAID build based on 10th gen Intel motherboards with the i219-V NIC.

As it turns out, many Socket 1200 boards come with a newer revision of the onboard NIC, and the default driver/module (e1000e) in 6.8.3 does not detect these, while the driver in 6.9b25 does fine.

Maybe one could use your container to backport the driver/module from 6.9b25 to 6.8.3?

 

Many thanks in advance and keep up the good work!

 

regards,

ford

Yes, this is possible.

But you have to find someone who backports the driver and then implement it; the implementation is not the problem. ;)

Why are you not running beta25?

Link to comment
Just now, Ford Prefect said:

...because it's a beta ;-)

And what's bad about that?

I run it on my main server, and so does a buddy of mine on an Intel 10th gen platform.

 

1 minute ago, Ford Prefect said:

(I hope)

You could make a stable version of Unraid worse than a beta if you don't know what you are doing, keep that in mind... Just saying...

 

2 minutes ago, Ford Prefect said:

So, how do I go about it, assuming the latest driver is here: https://sourceforge.net/projects/e1000/files/e1000e stable/  ?

If you want to do that you must extend the build script or do every build step manually, and then you also need to compile the driver manually, I think (please don't ask me for details since I don't have such a card to test, and I don't want to destroy anything on your server, or your entire server).

Link to comment

Yes, I fully understand.

I'd rather trust a backported 6.8.3 instead ;-) (in this special case)

 

I may be a bit rusty, but I do have experience in building a kernel from the early 0.8 versions.

 

These drivers have existed since Linux kernel 2.4, and the build structure hasn't changed for years, as far as I can tell from the Makefiles of the current and older versions.

Hence I assume this also holds for the way the driver build structure is set up in 6.8.3 and in 6.9b25.

So the ideal way could be to pull the source tree from your 6.9b25 Docker, copy it into an instance of your 6.8.3 Docker, overwriting the source tree of the e1000e driver there, and walk through the build steps.

Link to comment
6 minutes ago, Ford Prefect said:

Yes, I fully understand.

I'd rather trust a backported 6.8.3 instead ;-) (in this special case)

 

I may be a bit rusty, but I do have experience in building a kernel from the early 0.8 versions.

 

These drivers have existed since Linux kernel 2.4, and the build structure hasn't changed for years, as far as I can tell from the Makefiles of the current and older versions.

Hence I assume this also holds for the way the driver build structure is set up in 6.8.3 and in 6.9b25.

So the ideal way could be to pull the source tree from your 6.9b25 Docker, copy it into an instance of your 6.8.3 Docker, overwriting the source tree of the e1000e driver there, and walk through the build steps.

Wait, the basic container is for 6.8.3.

I think you haven't read the first post of this thread. You don't have to pull anything from GitHub; I built in a custom mode where the container copies over all necessary files to the main directory, so you can edit the build script and then run it from the Docker console.

Just add it after the kernel is built and exported; if you do everything right, it should compile and pack everything into the new images.

 

Edit: Please don't pull anything from beta25 into 6.8.3, these are two different kernels.

You have to compile the driver that you linked with the 6.8.3 version of the container.

Link to comment
7 minutes ago, Ford Prefect said:

So the ideal way could be to pull the source tree from your 6.9b25 Docker, copy it into an instance of your 6.8.3 Docker, overwriting the source tree of the e1000e driver there, and walk through the build steps.

There is also a readme in the linked file on how to compile the drivers.

So as I said, it would be best to append those steps to the build script after the bzImage is built.

This is right before the nVidia build stage.

Link to comment

Hmmm, not quite sure if I understand what you are saying.

 

I assumed that your container can be configured to either produce a kernel, including modules, for 6.8.3 or 6.9b25.

Yes, I don't want to pull binaries from beta to stable, but rather the source-tree, including its build config/makefile - only for this specific driver.

 

I guess I'll have to check how your container does that internally... starting with the readme ;-)

 

Many thanks for your support... good night! ;-)

Link to comment
7 hours ago, Ford Prefect said:

Hmmm, not quite sure if I understand what you are saying.

Set the variable CUSTOM_BUILD to 'true' and the container copies the build script for the kernel and every additional build stage (nVidia, ZFS, DVB,...) over to the main directory.

 

7 hours ago, Ford Prefect said:

I assumed that your container can be configured to either produce a kernel, including modules, for 6.8.3 or 6.9b25.

Correct, that's described in the first post of this thread. ;)

 

7 hours ago, Ford Prefect said:

Yes, I don't want to pull binaries from beta to stable, but rather the source-tree, including its build config/makefile - only for this specific driver.

I think you are talking about my build script, and then including your linked driver in the script.

 

 

7 hours ago, Ford Prefect said:

Many thanks for your support... good night!

No problem, it was a little late yesterday... :D

 

I would do it this way:

  1. Download the container from the CA App
  2. Make your configuration by setting the things you need to 'true', or simply leave them empty if you don't need them
  3. Set 'Custom Build' to 'true' (the container then copies over the build script and some other files and will sleep indefinitely)
  4. Go to your appdata directory, download your linked driver and place it there
  5. In that folder you will also find the file 'buildscript.sh'
  6. Open that file and edit it
  7. Add the necessary build step commands (extract, make,...) there; the best place is after '## Copy Kernel image to output folder', simply search for that text in 'buildscript.sh' (see the sketch below)
  8. After you have made the changes/additions, save and close 'buildscript.sh' and open up a console terminal (click on the icon on the Docker page and select 'Console')
  9. Go to the data directory (type in 'cd $DATA_DIR') and execute 'buildscript.sh' (with './buildscript.sh')

 

If everything went well, you'll have everything ready in the output folder. ;)

NOTE: the path to the working directory is set with the variable '$DATA_DIR' (it points to '/usr/src' inside the container) and you can also use it in your additional build steps for the driver.
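For illustration, a minimal sketch of what such an addition to 'buildscript.sh' could look like for the e1000e driver. The version number is a placeholder, the kernel tree name is the 6.8.3 one, and both the KSRC override and the final module path are assumptions that depend on the driver's own readme and on how the build script packs the modules:

cd $DATA_DIR
tar zxf e1000e-<x.x.x>.tar.gz
cd e1000e-<x.x.x>/src
# Build against the Unraid kernel tree the container prepared, not against the host's running kernel
make KSRC=$DATA_DIR/linux-4.19.107-Unraid
# Replace the module from the stock build so the new one ends up in the packed images
cp e1000e.ko /lib/modules/4.19.107-Unraid/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko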

 

Hope this helps.

Link to comment

Thank you very much for your detailed response.

I have understood the basic concept so far, I think.

 

But I still cannot get my head around something.

 

Your Docker is for building new kernel modules, but in my special case I am talking about an existing (standard kernel) module.

So, when your build script pulls the kernel source (which should also include the standard modules), I'd assume that the e1000e module will already be present in the source tree.

With your build script, you are building all standard modules anyway, without the need for an additional section in the script.

Let's assume I only enable the ZFS module in the Docker config; I will end up with a new, complete kernel (re-)build, including all standard modules plus the ZFS module(s).

 

Is this assumption correct, so far?

Then why would I need a new section for my module in the build script?

 

In my mind, I'd use three steps to build a new kernel with my standard module:

 

1) Basically split your build script into two steps, i.e. by inserting a stop/wait into the build script in the section where it has just pulled the kernel source, and confirm that my desired driver source (in this case the old version) is there in the container.

 

2) Then I would replace that with the source of the newer driver version... and, instead of using the external SourceForge link, I'd use a tree collected the same way from another instance of your container running in beta (and custom) mode... because I know that this instance works, including its build config (which I assume did not change much between kernel 4.x and 5.x).

 

3) The third step then is to let the rest of the build script continue.

 

...step (2) I am going to verify during the day... fingers crossed ;-)

 

Edit: I'd make the script wait before starting the (re-)build, at the 'make oldconfig' section.

Edit 2: OK, the driver source for 6.8.3 is in "/usr/src/linux-4.19.107-Unraid/drivers/net/ethernet/intel/e1000e"
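Purely as an illustration of the swap in step (2), a rough sketch with placeholder paths; the beta kernel tree name is hypothetical and this only mirrors the idea described above, not a tested procedure (see also ich777's warning below about mixing beta and stable sources):

# Hypothetical: the newer in-tree e1000e source, copied over from an instance of the beta container
NEWTREE=/usr/src/linux-5.x.x-Unraid
KTREE=/usr/src/linux-4.19.107-Unraid
# Overwrite the stock in-tree driver sources with the newer ones, then let the normal kernel/module build continue
cp $NEWTREE/drivers/net/ethernet/intel/e1000e/* $KTREE/drivers/net/ethernet/intel/e1000e/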

Edited by Ford Prefect
Link to comment
17 minutes ago, Ford Prefect said:

Is this assumption correct, so far?

Correct

 

19 minutes ago, Ford Prefect said:

2) Then I would replace that with the source of the newer driver version... and, instead of using the external SourceForge link, I'd use a tree collected the same way from another instance of your container running in beta (and custom) mode... because I know that this instance works, including its build config (which I assume did not change much between kernel 4.x and 5.x).

I think the driver's make script should be built for that and should remove the old driver and replace it with the new one (most drivers do this when you compile them).

 

I'm strictly against that; my beta container uses a different gcc compiler and a lot of other different stuff.

 

Do it the way you want, but if something breaks I can't help; I would recommend that you compile it from the SourceForge driver.

 

 

Btw, have you read the instructions that come with the driver? They describe how to update the driver if it's already installed; I would strongly recommend starting there...

 

Also, what's wrong with building it in the script? From what I've read, these are the commands:

  1. tar zxf e1000e-<x.x.x>.tar.gz
  2. cd e1000e-<x.x.x>/src/
  3. make install

 

29 minutes ago, Ford Prefect said:

Edit: I'd make the script wait before starting the (re-)build, at the 'make oldconfig' section.

You could also do it that way.

 

 

Like I said, you can do it your way, but I have no hardware to test with and can't help.

Hope you get it to work.

Link to comment

Yes, I understand.

But I first wanted to check what the differences between the two versions of the driver are, in terms of config & Makefile etc.

Also, I cannot be sure that the SourceForge link is actually the driver I have in mind.

 

I know that this is not the proper way, but just a first step.

I am just curious; nothing can break at this time. I am just gathering info for a future build based on S1200/10th gen.

My workstation has that NIC, so I can simply test with it this way... using the Docker on my existing, older 6.8.3.

 

Edit: the structure in both versions is the same... although the source files obviously contain changes, the version string is identical in both... so much for doing things the correct way ;-)

 

I might create a patch from the two versions and inject that into your Docker then 🙃

Edited by Ford Prefect
Link to comment
8 hours ago, ich777 said:

Like I said, you can do it your way, but I have no hardware to test with and can't help.

Hope you get it to work.

...it worked... your way ;-)

A patch file almost worked, but a single header had been moved out of the tree and had massive changes, so I did not pursue that route.

I basically set the build script to sleep for a while after the standard kernel and modules had been created, then built the module from the SourceForge link manually from a second command line in the Docker, before the build script continued to run and finally created the images with the new module included.
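For anyone trying to reproduce this, a rough sketch of the pause that was inserted into 'buildscript.sh' right after the stock kernel and modules are built; the wait time is a placeholder, and the module itself is then compiled from a second Docker console using the commands from the driver's readme quoted earlier in the thread:

echo "Stock kernel and modules are built - compile the e1000e module from a second console now"
sleep 900   # placeholder wait time; the rest of the script (nVidia stage, image packing) continues afterwards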

 

Thanks again for your support and the fine docker!

Edited by Ford Prefect
Link to comment

Hello,

 

This is an awesome tool to have, thank you!

 

I'm trying to patch the AMD HD audio, but whenever I compile it all, it gives me this error when adding the nvidia drivers (v450.66): "Symbol version dump ./Module.symvers is missing, modules will have no dependencies and modversions". It still goes through with everything else, but then the server won't boot since that is missing. Any help on how to resolve this issue?

 

Thanks

Edited by PickleRick
Link to comment
6 hours ago, PickleRick said:

Hello,

 

This is an awesome tool to have, thank you!

 

I'm trying to patch the AMD HD audio, but whenever I compile it all, it gives me this error when adding the nvidia drivers (v450.66): "Symbol version dump ./Module.symvers is missing, modules will have no dependencies and modversions". It still goes through with everything else, but then the server won't boot since that is missing. Any help on how to resolve this issue?

 

Thanks

Have you tried to compile it with the nvidia driver alone and without your patch file?

 

Have you installed a cache drive in your system?

Which unRAID version are you on?

Can you provide a full log? You can enable the 'save to log' option to save the output to a log file.

 

Something seems wrong; the file should be created when the modules are compiled.

Link to comment
On 9/2/2020 at 11:32 PM, cybrnook said:

Are you on a clean IOMMU grouping, meaning you are not trying to pass through a group with multiple GPUs or something?

The GPUs are not sharing the same group. With the stock and LS.io versions it's working fine...

 

After some more reading on the beta version, I made a full backup of the flash drive and upgraded to the 6.9 beta25. It works without problems, so I decided to build a beta version... no luck, same behaviour: the unRAID GPU stays in the P0 state and the screen only shows a blinking stripe in the upper left corner; only the startup and shutdown output is visible, local login is not possible...

kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
kernel: caller _nv000745rm+0x1af/0x200 [nvidia] mapping multiple BARs

 

I tried the prebuilt images from the TS, no luck either.

Maybe there is a glitch in my hardware, or maybe this is one of the reasons why Limetech didn't include the patch in the official images...

For now I'm sticking to the LS.io version (the beta) and have to accept that the reset bug will stay...

Link to comment
2 hours ago, sjaak said:

The GPUs are not sharing the same group. With the stock and LS.io versions it's working fine...

 

After some more reading on the beta version, I made a full backup of the flash drive and upgraded to the 6.9 beta25. It works without problems, so I decided to build a beta version... no luck, same behaviour: the unRAID GPU stays in the P0 state and the screen only shows a blinking stripe in the upper left corner; only the startup and shutdown output is visible, local login is not possible...

kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
kernel: caller _nv000745rm+0x1af/0x200 [nvidia] mapping multiple BARs

 

I tried the prebuilt images from the TS, no luck either.

Maybe there is a glitch in my hardware, or maybe this is one of the reasons why Limetech didn't include the patch in the official images...

For now I'm sticking to the LS.io version (the beta) and have to accept that the reset bug will stay...

Can you try to disable all your VMs and also your Docker containers at startup, reset all your IOMMU assignments, reboot, and then try to assign it again and see if it works?

 

Can't imagine why this isn't working...

 

EDIT: You also tried the prebuilt one from the first post in this thread?

Link to comment
4 hours ago, ich777 said:

Can you try to disable all your VMs and also your Docker containers at startup, reset all your IOMMU assignments, reboot, and then try to assign it again and see if it works?

 

Can't imagine why this isn't working...

 

I tried a 'clean' flash drive (read: a new setup, only the license key was reused, no configs), so a config conflict isn't the problem.

 

4 hours ago, ich777 said:

EDIT: You also tried the prebuilt one from the first post in this thread?

yep:

6 hours ago, sjaak said:

I tried the prebuilt images from the TS, no luck either.

I think my hardware has a glitch...

Link to comment
  • ich777 changed the title to [Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...
