VFIO Gotcha (PCI IDs Change) (VM Lockout due to autostart)


Solved by JorgeB


Hi Team,

 

I would like to raise a concern about my VFIO passthrough experience. Maybe this is a bug that can be addressed, or maybe this experience will help someone else avoid the same pitfall.

 

I have a pfSense VM with 4 NICs passed through to it. The VM is set to autostart, of course.

The four NICs are listed below.

1308162091_Screenshot2022-11-25at3_31_20am.thumb.png.c12d0daa0ed661b5cf7eb99a8c83e97e.png

 

Now I decided to add a new Asus PCI M.2 X16 card which supports 4 M.2 drives with PCI bifurcation.

When I plugged in this device, my system would boot but freeze on startup. The error reported was a filesystem corruption error, shown in the screenshot below.

640571823_Screenshot2022-11-25at3_32_42am.thumb.png.4c5c4b6a00a701e3efb24ea4fa4fbc46.png

 

After multiple hours of troubleshooting, I found that the PCI IDs (bus addresses) of the NICs had changed.

The four NICs are now listed as below.

984522553_Screenshot2022-11-25at3_33_54am.thumb.png.fb232c864b310d72138c2bdcfbf91c62.png

 

And if you look carefully, the old PCI IDs are now occupied by the SATA controllers.

1188351911_Screenshot2022-11-25at3_36_33am.thumb.png.8deaf54115e96caea7f26cd73a8b39f2.png

 

I disabled array autostart, but could not disable VM autostart. I had to pull out the new Asus card, go back to the old configuration, and set the pfSense VM not to autostart. Then I plugged the Asus card back in and was finally able to start the VM Manager. At this point, if I manually started the VM without editing the passthrough devices, the same BTRFS corruption error occurred and everything froze. So I hard-rebooted the machine, edited the passthrough devices before starting the pfSense VM, and everything works. Yay!

 

Can this be fixed in any way? The lockout situation is very bad; a check before starting the VM to verify that the correct devices are being passed through would be good. Then at least the VM Manager would start and the VM config could be edited. My situation was especially bad because it was throwing out the array disks themselves.
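For anyone who wants to check by hand before starting a VM, here is a minimal sketch of how the host PCI addresses a VM passes through could be pulled out of its libvirt domain XML. It assumes the usual libvirt `<hostdev type='pci'>` layout with a `<source><address .../>` child; the function name `passthrough_addresses` and the sample XML are made up for illustration and are not part of Unraid.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, trimmed-down domain XML for illustration only.
SAMPLE_XML = """
<domain type='kvm'>
  <name>pfSense</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def passthrough_addresses(domain_xml: str) -> List[str]:
    """Return host PCI addresses (dddd:bb:ss.f) passed through by a domain."""
    root = ET.fromstring(domain_xml)
    addresses = []
    for hostdev in root.iter("hostdev"):
        if hostdev.get("type") != "pci":
            continue  # skip USB and other host device types
        # The host-side address lives under <source>; the guest-side
        # address (if any) is a direct child of <hostdev> and is ignored.
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        domain = int(addr.get("domain", "0"), 16)
        bus = int(addr.get("bus", "0"), 16)
        slot = int(addr.get("slot", "0"), 16)
        function = int(addr.get("function", "0"), 16)
        addresses.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return addresses


if __name__ == "__main__":
    print(passthrough_addresses(SAMPLE_XML))
```

The resulting list can then be compared by eye against what is actually sitting at those addresses after a hardware change.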

3 minutes ago, Jayant P Saikia said:

But as soon as I enabled VM Manager, the error happened.

That should not happen. VMs should never autostart if array autostart is disabled. It's a constant user complaint, i.e., "VMs don't start with autostart enabled", because they only start at first boot if array autostart is enabled.

  • Solution
38 minutes ago, Jayant P Saikia said:

Well that's not what I experienced. 

You are correct, the behavior changed in the current release. I'm going to find out when and report it, since, as you've experienced, this is a problem.

 

32 minutes ago, Jayant P Saikia said:

Also, is there any way to compare [8086:10d3] 07:00.0 with [8086:1d6b] 07:00.0, detect the mismatch, and stop the PCI device reset?

Not really my area but I'll also inquire about that, it would make things easier if that's possible to do.
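As a rough illustration of the kind of check being asked about, the sketch below reads the vendor and device IDs that Linux exposes under `/sys/bus/pci/devices/<address>/` and compares them with the IDs the VM config was written for. This is only a sketch of the idea, not something Unraid does today; the function names and the `expected` mapping are hypothetical, and the IDs are the ones quoted above.

```python
from pathlib import Path
from typing import Dict, Optional


def read_pci_id(address: str,
                sysfs_root: str = "/sys/bus/pci/devices") -> Optional[str]:
    """Return 'vvvv:dddd' for the device at a PCI address, or None if absent."""
    dev = Path(sysfs_root) / address
    try:
        # sysfs stores these as hex strings such as "0x8086".
        vendor = int((dev / "vendor").read_text().strip(), 16)
        device = int((dev / "device").read_text().strip(), 16)
    except (OSError, ValueError):
        return None
    return f"{vendor:04x}:{device:04x}"


def find_mismatches(expected: Dict[str, str],
                    sysfs_root: str = "/sys/bus/pci/devices"
                    ) -> Dict[str, Optional[str]]:
    """Map each address whose current ID differs from the expected one
    to the ID actually found there (None if nothing is at that address)."""
    mismatches = {}
    for address, want in expected.items():
        found = read_pci_id(address, sysfs_root)
        if found != want.lower():
            mismatches[address] = found
    return mismatches


if __name__ == "__main__":
    # Hypothetical expectation: the NIC the VM was configured for.
    expected = {"0000:07:00.0": "8086:10d3"}
    bad = find_mismatches(expected)
    if bad:
        print("Refusing to start VM, device mismatch:", bad)
    else:
        print("All passthrough devices match.")
```

A non-empty result would be the signal to skip the VM start and leave the device alone, so the VM Manager still comes up and the config can be edited.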
