
AMD GPU Reset Bug?


Just to let you know, we are going to remove this patch in the next Unraid OS release, 6.8.0-rc5. It is turning out to be unreliable.


Thanks. It would have been nice if it had worked consistently across the board, but it seems more like an unreliable bandage that in some cases kills the patient.
Stability is more important for a core storage solution.


It's not implemented yet, and I'm too newb to build my own kernel to test 


I am running an AMD Ryzen 3700 and an AMD RX 5700 XT and am stuck with the reset bug. I want to buy Unraid but can't, as my VMs don't release the GPU and I then need to restart the whole host PC. Not viable. My Unraid trial is running out, with 3 days left, and I am still trying to get it to work with hacks. Is there a fix in the foreseeable future for this bug where the GPU is not released?

 

Really would love to run Unraid on my box as I am running it on my small HP NAS PC and love it.

25 minutes ago, righardt.marais said:

Is there a fix in the foreseeable future for this bug where the GPU is not released?

This question should be posed to AMD.

19 minutes ago, limetech said:

This question should be posed to AMD.

Yes, you're 100% correct. How do we beg them to put this on their radar as a priority fix?

 

24 minutes ago, righardt.marais said:

Yes, you're 100% correct. How do we beg them to put this on their radar as a priority fix?

 

My theory (and no evidence to back this up, just pure speculation) is that they don't want to fix it.

13 hours ago, limetech said:

My theory (and no evidence to back this up, just pure speculation) is that they don't want to fix it.

Possibly because there is no profit in it and Joe Public doesn't give $0.02 about GPU passthrough and virtualisation. It's all about trying to compete with Nvidia for the "faster than yours" GPU crown. If only AMD could do for its GPU division the excellent work it has done for CPUs.

22 hours ago, righardt.marais said:

I am running an AMD Ryzen 3700 and an AMD RX 5700 XT and am stuck with the reset bug. I want to buy Unraid but can't, as my VMs don't release the GPU and I then need to restart the whole host PC. Not viable. My Unraid trial is running out, with 3 days left, and I am still trying to get it to work with hacks. Is there a fix in the foreseeable future for this bug where the GPU is not released?

 

Really would love to run Unraid on my box as I am running it on my small HP NAS PC and love it.

What board are you using? The stability of the Navi reset patch seems completely dependent on your hardware. It works flawlessly for many; on other hardware, not so much. If you have an X570 board I would give the reset patch a shot, as it works great on this platform.

14 hours ago, Skitals said:

What board are you using? The stability of the Navi reset patch seems completely dependent on your hardware. It works flawlessly for many; on other hardware, not so much. If you have an X570 board I would give the reset patch a shot, as it works great on this platform.

Hey Skitals

 

I have spoken to you before regarding my rig. I have narrowed my issues down to the reset issue.

Running Gigabyte X570 Aorus Ultra WiFi + Ryzen 3700 + Gigabyte RX 5700 XT Gaming OC + 2 x 16GB Corsair Vengeance RAM + 2 x 2TB NVMe.
 

I have managed to restart from scratch and get UnRaid running with Win 10 VM:

Win 10 VM (1909) installed with latest win updates

Updated devices in Device Manager with virtio drivers:

2 x system devices

1 x network adapter

This is all via using VNC to connect for the first time.

Thereafter,

I edit the VM properties to select the 5700 XT as the graphics card and its audio device for passthrough, and add the ROM for the 5700 XT downloaded from techpowerup.com.

Booting the VM passes the display - which is great!

I install the AMD Adrenalin drivers to grant proper gpu usage - replacing the Microsoft Basic display.

Display looks nice and sitting at full resolution.

 

Reboot the Windows 10 VM: error 127 is displayed because of the AMD Navi reset issue (as per the Space Invader One videos).

Reboot the UnRaid server and start the VM - it works fine.

 

I followed the Space Invader One YouTube video [ https://www.youtube.com/watch?v=0uZODoPQH9c ] for the reset script. This works, BUT there is one drawback. It's all well and good that you place the server in a suspended state and then press the power button again, but wouldn't this bounce any other Docker containers/VMs you have running on the box?
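For readers who have not seen the video, here is a minimal sketch of what a suspend-based reset of this kind generally does: detach the GPU and its audio function from the PCI bus, suspend the host to RAM, then rescan the bus after wake-up. It is not the script from the video. The PCI addresses `0000:0a:00.0` and `0000:0a:00.1` are placeholders, and the function names are made up for illustration; only the sysfs files (`remove`, `rescan`, `/sys/power/state`) are standard Linux interfaces. Because the whole host suspends, everything else running on it is paused until it wakes.

```python
from pathlib import Path

SYSFS = Path("/sys")


def reset_steps(gpu_addr, audio_addr, sysfs=SYSFS):
    """Return the ordered (path, value) writes for a remove/suspend/rescan cycle."""
    devices = sysfs / "bus" / "pci" / "devices"
    return [
        (devices / gpu_addr / "remove", "1"),     # detach the GPU function
        (devices / audio_addr / "remove", "1"),   # detach its audio function
        (sysfs / "power" / "state", "mem"),       # suspend to RAM; the host sleeps here
        (sysfs / "bus" / "pci" / "rescan", "1"),  # after wake-up, rediscover devices
    ]


def run(steps, dry_run=True):
    """Print the writes by default; perform them only when dry_run is False (needs root)."""
    for path, value in steps:
        if dry_run:
            print(f"would write {value!r} to {path}")
        else:
            path.write_text(value)


if __name__ == "__main__":
    # Placeholder addresses: look up your own card's addresses before using this.
    run(reset_steps("0000:0a:00.0", "0000:0a:00.1"))
```

Run as written, it only prints the four writes. Passing `dry_run=False` would actually suspend the machine, so treat it as an illustration rather than a tested tool.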

 

If you say that the patch should work, could you give me a step-by-step guide, or point me to where I can read up on it? I am a noob when it comes to Linux patching.

 

Then, on the array/cache/parity drive setup: I have 2 x 2TB NVMe drives, 1 x 500GB mechanical 2.5" drive and 1 x 256GB SSD.

My need is a Win10 gaming PC with GPU passthrough, another Win10 VM for work (VNC into it) and a Linux VM (SSH into it). I may play around with a few Docker containers as well.

I just wanted to know what the best drive allocation is for my needs, as I have set up the 2 NVMe drives in the array for now and Unraid is complaining that SSDs are not supported in the array.

 

Thanks

 

EDIT: Would it be best to pass one NVMe through as dedicated hardware for the gaming PC and install Win10 on that, then use the other NVMe as Unraid storage for the general VMs and Docker containers?

 

Edited by righardt.marais

6 hours ago, righardt.marais said:

If you say that the patch should work, could you give me a step-by-step guide, or point me to where I can read up on it? I am a noob when it comes to Linux patching.

EDIT: Would it be best to pass one NVMe through as dedicated hardware for the gaming PC and install Win10 on that, then use the other NVMe as Unraid storage for the general VMs and Docker containers?


Get on 6.8.0-rc5. It is the newest build with a prebuilt kernel with the reset patch. Then download the 6.8.0-rc5 kernel here. Extract the files into the root of your unraid usb. That's it.
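The extraction step could be sketched as follows, with a backup of whatever gets overwritten so the stock kernel can be restored. This assumes the download is a zip archive, which the thread does not state. The function name, the archive name and the backup folder are hypothetical; `/boot` is where Unraid mounts the USB flash drive.

```python
import shutil
import zipfile
from pathlib import Path


def install_kernel(archive, flash_root, backup_dir):
    """Back up files the archive would overwrite, then extract it into flash_root.

    Returns the names of the files that were backed up.
    """
    archive, flash_root, backup_dir = Path(archive), Path(flash_root), Path(backup_dir)
    backed_up = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            target = flash_root / name
            if target.is_file():
                # Keep a copy of the stock file before it is replaced.
                dest = backup_dir / name
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copyfile(target, dest)
                backed_up.append(name)
        zf.extractall(flash_root)
    return backed_up


if __name__ == "__main__":
    # Hypothetical file names; adjust to the archive you actually downloaded.
    if Path("kernel-6.8.0-rc5.zip").exists():
        print(install_kernel("kernel-6.8.0-rc5.zip", "/boot", "/boot/kernel-backup"))
```

To go back to the stock kernel, copy the files from the backup folder over the ones in the flash root and reboot.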

 

I would recommend doing as you say, pass through one of the nvmes for win10 and use the other for unraid cache.
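Before passing an NVMe controller through, it can help to check which IOMMU group it sits in, since a device that shares a group with other devices is generally harder to hand to a VM on its own. A minimal sketch that reads the groups from sysfs follows. The function names and the example address `0000:01:00.0` are made up for illustration; `/sys/kernel/iommu_groups` is the standard Linux location.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    root = Path(root)
    groups = {}
    if not root.is_dir():
        return groups  # IOMMU disabled, or not a Linux host
    for group in root.iterdir():
        devices = group / "devices"
        if devices.is_dir():
            groups[group.name] = sorted(d.name for d in devices.iterdir())
    return groups


def is_isolated(groups, pci_addr):
    """True if pci_addr is the only device in its IOMMU group."""
    return any(devs == [pci_addr] for devs in groups.values())


if __name__ == "__main__":
    found = iommu_groups()
    for number in sorted(found, key=int):
        print(number, found[number])
    # Placeholder address: substitute your NVMe controller's address.
    print(is_isolated(found, "0000:01:00.0"))
```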

 

 

 

1 hour ago, Skitals said:

Get on 6.8.0-rc5. It is the newest build with a prebuilt kernel with the reset patch. Then download the 6.8.0-rc5 kernel here. Extract the files into the root of your unraid usb. That's it.

 

I would recommend doing as you say, pass through one of the nvmes for win10 and use the other for unraid cache.

 

 

 

Where do I download 6.8.0-rc5? The manual downloads page does not provide it, and neither does the automated USB creator.

 

"pass through one of the nvmes for win10"  -  Is this covered in a Space Invader One youtube?

"use the other for unraid cache"  -  Ok so the second one would be cache where you would put unraid "shares" - the domains, system and appdata shares. These are for the use of other vms and dockers etc....

 

On the placement of the NVMe drives in Unraid: one now goes in the cache location. Do I need to put the other one into the cache or the array, or do I just leave it unassigned and let the passthrough take care of it? (Just trying to make sense of the placement of the drives.)

 

Thanks for the help - APPRECIATE IT !!!!!


Would it be an idea to put the patch back into Unraid, so that to use it you need to add a command to the "syslinux.cfg" file? I would love to test it...

My Win10 VM can't update to any driver newer than 17.3.x (the latest driver that Asus provided on their website for the Vega 64 Strix); every driver after 17.x results in a black screen. On the other hand, with v17 I can reboot the VM without any problems.

My Linux VM still suffers from this "bug".

1 hour ago, sjaak said:

Would it be an idea to put the patch back into Unraid, so that to use it you need to add a command to the "syslinux.cfg" file?

The patch modifies source code of the Linux kernel.  To make these changes "configurable" would take way too much effort.  Theoretically someone could create and maintain a "amd-reset-bug-build" version of the kernel with this patch much like the unofficial "nvidia-build" of the kernel.

1 hour ago, limetech said:

The patch modifies source code of the Linux kernel.  To make these changes "configurable" would take way too much effort.  Theoretically someone could create and maintain a "amd-reset-bug-build" version of the kernel with this patch much like the unofficial "nvidia-build" of the kernel.

 

Thanks for the reply.

16 hours ago, righardt.marais said:

Where do I download 6.8.0-rc5? The manual downloads page does not provide it, and neither does the automated USB creator.

 

"pass through one of the nvmes for win10" - Is this covered in a Space Invader One youtube?

"use the other for unraid cache" - Ok so the second one would be cache where you would put unraid "shares" - the domains, system and appdata shares. These are for the use of other vms and dockers etc....

 

On the placement of the NVMe drives in Unraid: one now goes in the cache location. Do I need to put the other one into the cache or the array, or do I just leave it unassigned and let the passthrough take care of it? (Just trying to make sense of the placement of the drives.)

 

Thanks for the help - APPRECIATE IT !!!!!

Could someone help me with my questions?

1. Location of the older downloads (6.8.0-rc5) on the Unraid site.

2. NVMe placement in my scenario: 1 as the cache disk and 1 as an unassigned drive to pass through.

   But then the array is empty, and I can't start Unraid without an array. Does this mean I need to add another drive to the array section just as a placeholder, and is there a SIZE REQUIREMENT for that dummy array drive? I ask because I want to run my gaming VM on the passed-through NVMe controller and disk, and the other Docker containers/VMs on the NVMe in the cache (shares: domains/appdata/system).

 

 

EDIT: Regarding point #2, I found two 1TB HDDs that I am now using as array disks, so the unpopulated array is now sorted.

Still need that location to download the 6.8.0-rc5 - then I can use the patch and give it a go.

 

Thanks guys

 

Edited by righardt.marais

2 hours ago, righardt.marais said:

is there a SIZE REQUIREMENT for that array dummy drive?

Nope. Theoretically I suppose you could assign a 1GB USB thumb drive as disk1.

10 hours ago, Benjamin Picard said:

Someone just shared his kernel for 6.8.2 with navi patch. Is there a way to include this patch in the official release?

Is it a new patch or is it the same patch that was included and later removed from the RC because it broke other things that worked?

 

The OP of the topic you quoted even shared the procedure for compiling your own kernel, so perhaps it's best to compile it yourself.

