AMD GPU Reset Bug?


On 1/25/2021 at 6:58 PM, snolly said:

Hello all,

 

Owner of a 5700 XT here. I need to clarify a few things.

 

1. I have read that if a VM is shut down gracefully (clean shutdown), the GPU gets reset as it should. That is not the case for me. Even with a clean shutdown I get the AMD reset error if I try to start the VM again.

 

This has nothing to do with clean or forced shutdowns. Older AMD cards (i.e. Polaris all the way up to Navi, except Big Navi) don't support FLR (Function Level Reset) as specified in the PCIe specification. Without some form of patch you will not get it to work properly.
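
To see what a given card advertises, one option is to look at the `DevCap` section that `lspci -vv` prints for the device: it contains `FLReset+` when the function advertises Function Level Reset and `FLReset-` when it does not. Below is a minimal Python sketch that parses such text. The helper name `advertises_flr` and the shortened sample lines are made up for illustration, and this only reports what the device advertises, not whether a reset actually works.

```python
import re
from typing import Optional


def advertises_flr(lspci_text: str) -> Optional[bool]:
    """Return True for FLReset+, False for FLReset-, None if DevCap has no such flag."""
    start = lspci_text.find("DevCap:")
    if start == -1:
        return None
    # Only look at the DevCap section; DevCtl can carry its own FLReset flag.
    end = lspci_text.find("DevCtl:", start)
    devcap = lspci_text[start:] if end == -1 else lspci_text[start:end]
    match = re.search(r"FLReset([+-])", devcap)
    if match is None:
        return None
    return match.group(1) == "+"


# Hypothetical, shortened excerpts in the style of `lspci -vv` output
sample_without_flr = "DevCap: MaxPayload 256 bytes, PhantFunc 0\n\tExtTag+ RBE+ FLReset-\nDevCtl: ExtTag+"
sample_with_flr = "DevCap: MaxPayload 256 bytes, PhantFunc 0\n\tExtTag+ RBE+ FLReset+\nDevCtl: ExtTag+"

print(advertises_flr(sample_without_flr))
print(advertises_flr(sample_with_flr))
```

On a real system you would feed it the output of `lspci -vv -s <address>` for the GPU, run as root so the capability lines are visible.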

 

On 1/25/2021 at 7:03 PM, ich777 said:

Exactly, I had to remove support for previous beta or RC versions since there were some pretty nasty posts, so I decided to support only the current version. RC2 is really stable and only has visual bugs for some users (temps not displayed after a reboot).

 

I can recommend RC2 since, as I said above, @giganode & @derpuma run RC2 with the patch and everything works just fine. ;)

 

Totally agree with that :) 

Edited by giganode
Link to post
Popular Posts

You can now switch to the new 'feature/audio-reset' of the vendor-reset. Just use @ich777's docker to compile your custom build as you need it.   I can boot between Windows 10 20H2, Ubuntu 2

I can also confirm ich777's build is working with the reset patch with a RX 5600XT. Thanks so much!

Thanks to ich777 and his Kernel Helper container I was able to build an Unraid kernel with the navi reset patch super easily! I have an RX 5600XT and everything seems to be working. The container also

On 1/25/2021 at 8:03 PM, ich777 said:

Exactly, I had to remove support for previous beta or RC versions since there were some pretty nasty posts, so I decided to support only the current version. RC2 is really stable and only has visual bugs for some users (temps not displayed after a reboot).

 

I can recommend RC2 since, as I said above, @giganode & @derpuma run RC2 with the patch and everything works just fine. ;)

 

14 hours ago, giganode said:

 

This has nothing to do with clean or forced shutdowns. Older AMD cards (i.e. Polaris all the way up to Navi, except Big Navi) don't support FLR (Function Level Reset) as specified in the PCIe specification. Without some form of patch you will not get it to work properly.

 

 

Totally agree with that :) 

 

OK then RC2 here I come this weekend!

 

Question about patching. What happens with future Unraid updates? Do I install the official update and then repatch?

Link to post
29 minutes ago, snolly said:

Question about patching. What happens with future Unraid updates? Do I install the official update and then repatch?

Exactly, this is the plan. It should also work if you are on the old RC, but I don't recommend it.

EDIT:

  1. Upgrade to the new stock version of Unraid
  2. Redownload the Unraid-Kernel-Helper from the CA App
  3. Move the generated bz* files to your USB Boot device
  4. Reboot
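
The copy step in the list above can be sketched in Python. This is only an illustration under a few assumptions: the boot device is mounted at `/boot`, the build directory you pass in is whatever folder your Unraid-Kernel-Helper build wrote its files to (the path in the usage comment is a placeholder, not the container's real output path), and the function name `install_bz_files` is made up here. It backs up the existing bz* files before overwriting them so they can be copied back later.

```python
import shutil
from pathlib import Path


def install_bz_files(build_dir, boot_dir, backup_dir):
    """Copy bz* files from build_dir to boot_dir, backing up any file that gets replaced."""
    build_dir, boot_dir, backup_dir = Path(build_dir), Path(boot_dir), Path(backup_dir)
    new_files = sorted(p for p in build_dir.glob("bz*") if p.is_file())
    if not new_files:
        raise FileNotFoundError(f"no bz* files found in {build_dir}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in new_files:
        dest = boot_dir / src.name
        if dest.exists():
            # Keep the file currently on the boot device before replacing it.
            shutil.copyfile(dest, backup_dir / src.name)
        shutil.copyfile(src, dest)
        copied.append(src.name)
    return copied


# Hypothetical usage; all three paths are placeholders to adjust:
# install_bz_files("/path/to/kernel-helper/output", "/boot", "/boot/bz-backup")
```

A reboot is still needed afterwards, as in step 4.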

 

The patch is not mature enough to be integrated into Unraid itself...

 

I do my best to update the Unraid-Kernel-Helper as quickly as possible so that it works with any future version, but it should also work out of the box with newer Unraid versions as long as nothing major changes.

Link to post
1 minute ago, snolly said:

Is there hope that it will ever be?

You have to keep in mind that this patch has to be completely finished first before it can even make it into Unraid.

Also keep in mind that this will increase the kernel size significantly (even if it doesn't look like much to the naked eye).

 

Another thing is that this patch could lead to other problems. For example, if you build the patch and install the Nvidia Plugin from the CA App afterwards, the server would crash on boot. But if you build the patch and the Nvidia drivers together with the Unraid-Kernel-Helper, so that both are integrated in the images, it will boot up just fine.

 

This is a little bit more complicated than it seems at first sight...

Link to post
19 minutes ago, ich777 said:

You have to keep in mind that this patch has to be completely finished first before it can even make it into Unraid.

Also keep in mind that this will increase the kernel size significantly (even if it doesn't look like much to the naked eye).

 

Another thing is that this patch could lead to other problems. For example, if you build the patch and install the Nvidia Plugin from the CA App afterwards, the server would crash on boot. But if you build the patch and the Nvidia drivers together with the Unraid-Kernel-Helper, so that both are integrated in the images, it will boot up just fine.

 

This is a little bit more complicated than it seems at first sight...

 

Gotcha, thanks for the detailed response. If one runs a rig with an AMD GPU only, I reckon it's safe to use?

Also, is going back and forth between the patched and official kernel possible? I guess it's a matter of copying the patched/unpatched bz* files back and forth onto the USB stick, right?

Edited by snolly
Link to post
4 minutes ago, snolly said:

If one runs a rig with an AMD GPU only, I reckon it's safe to use?

You can also run an Nvidia and an AMD GPU with the patch, but be sure to also select the Nvidia driver in the Unraid-Kernel-Helper. And yes, if you are only running an AMD GPU with this patch it's also safe. :)

 

5 minutes ago, snolly said:

Also, is going back and forth between the patched and official kernel possible? I guess it's a matter of copying the patched/unpatched bz* files back and forth onto the USB stick, right?

Exactly, plus a reboot.

But why would you do that? That's really not how you should use this patch or even a prebuilt image that is built with the Unraid-Kernel-Helper.

 

Btw I also use a prebuilt image from my Unraid-Kernel-Helper, but I've integrated other things like Intel iGPU, iSCSI & Mellanox Firmware Tools. I like to have everything in one image, and this also speeds up the boot process...

Link to post
14 minutes ago, ich777 said:

But why would you do that? That's really not how you should use this patch or even a prebuilt image that is built with the Unraid-Kernel-Helper.

 

I would not do that. I would just like to know whether booting with a patched kernel modifies Unraid permanently somehow, so that I wouldn't be able to go back if needed. It's my insane cautiousness that kicked in :) Again, thanks for the replies.

Link to post
1 minute ago, snolly said:

I would not do that. I would just like to know whether booting with a patched kernel modifies Unraid permanently somehow, so that I wouldn't be able to go back if needed.

No, you can always download the default bz* images and replace the custom-built ones from the Unraid-Kernel-Helper on your USB boot device, and you will be back to stock Unraid, since the images are the main thing that makes Unraid, well, Unraid... ;)
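
One way to confirm you are really back on stock after swapping the files is to compare checksums of the bz* files on the boot device against the ones from the stock download. Here is a rough Python sketch, assuming the stock files have been extracted to some folder and the boot device is mounted at `/boot`. The function names and the paths in the usage comment are placeholders for illustration, not part of Unraid or the Unraid-Kernel-Helper.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_bz_files(stock_dir, boot_dir):
    """Map each stock bz* file name to True if the copy in boot_dir is identical."""
    stock_dir, boot_dir = Path(stock_dir), Path(boot_dir)
    result = {}
    for stock_file in sorted(stock_dir.glob("bz*")):
        if not stock_file.is_file():
            continue
        on_boot = boot_dir / stock_file.name
        result[stock_file.name] = (
            on_boot.is_file() and sha256_of(on_boot) == sha256_of(stock_file)
        )
    return result


# Hypothetical usage; both paths are placeholders to adjust:
# print(compare_bz_files("/path/to/extracted-stock-release", "/boot"))
```

Any entry that comes back `False` means that file on the boot device is missing or differs from the stock one.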

 

3 minutes ago, snolly said:

It's my insane cautiousness that kicked in :)

No problem, just asking. :)

Link to post

Do the AMD 6000 series cards only work on Big Sur? If not, I'd sell my 5700 XT; this AMD reset bug has almost driven me mad. I need to reboot or restart my Mac VM without having to restart the whole server. I have 10 other VMs running at the same time with services that I cannot stop, but at the same time I need to work with Java 8, and Big Sur was not working with it.

 

I don't get what the problem with AMD is, or why they refuse to help solve this issue.

Edited by mSedek
Link to post
4 minutes ago, mSedek said:

Do the AMD 6000 series cards only work on Big Sur? If not, I'd sell my 5700 XT; this AMD reset bug has almost driven me mad. I need to reboot or restart my Mac VM without having to restart the whole server. I have 10 other VMs running at the same time with services that I cannot stop, but at the same time I need to work with Java 8, and Big Sur was not working with it.

 

I don't get what the problem with AMD is, or why they refuse to help solve this issue.

First of all, they did not refuse. Instead, gnif and AMD had a few conversations, and there were conversations with others on Reddit etc. as well.

Secondly, they fixed the issue on Big Navi cards... So they definitely heard what the customers said.

 

Can't tell you if or when Big Sur will run with 6000 cards. @derpuma told me that there is already something in Big Sur that belongs to Big Navi.

 

Nevertheless, my 5700 XT runs just fine in Big Sur...

Link to post
