Current Status of ASPEED IPMI KVM and iGPU



After some discussion in the ASUS W680 board specific thread, I want to find out if other boards have the same problem with the current Unraid version (or rather specific Linux kernel).

What is the matter?

If you have an IPMI/BMC that uses the ast driver in Unraid (the problem turned up with the ASPEED AST2600, but it could apply to other BMCs as well) and want to use the CPU's iGPU (e.g. for a VM or Docker), you run into the following situation:

 

- If no device (or dummy plug) is inserted, you can follow the complete boot process of Unraid via the hardware VGA or the KVM screen in your IPMI GUI, but you lose the iGPU in Unraid (e.g., when you want to use the Intel SR-IOV plugin). For me, it even crashes the boot process.

- If a device (or dummy plug) is inserted, you keep the iGPU in Unraid, but the VGA/KVM stops updating after the blue boot loader screen of Unraid.

 

This seems to be an issue that has popped up (at least with the board from the thread) only AFTER Unraid 6.12.4 (so 6.12.5 and beyond). I have found this kernel commit which seems to be describing exactly this behavior and could correlate with the kernel updates in the relevant Unraid releases.

 

To find out if this is a general issue or specific to this board, I would now like to find people with the following setup:

 

- Updated Unraid to a release >6.12.4

- Use of an ASPEED BMC (or other BMC using the ast driver)

- Use of BMC VGA/KVM (with full output until the CLI login) AND a detected iGPU in Unraid (with or without dummy plug)

 

For reference: My setup is an ASUS Pro WS W680M-ACE SE with a 12600K. Multi-Monitor (and iGPU) is activated in the BIOS, but BIOS settings don't seem to matter. I am currently on Unraid 6.12.8 (Linux kernel 6.1.74) and can use SR-IOV with a dummy plug, but KVM drops out after the bootloader. Other users with the exact same board report that they have KVM + iGPU/SR-IOV. However, they are still on 6.12.4 (Linux kernel 6.1.49).
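If you want to check quickly whether your running kernel is newer than the one from 6.12.4, a rough sketch like the following might help. `kernel_newer_than` is a made-up helper name, and the sketch assumes that GNU `sort -V` is available and that any suffix after the first "-" in the release string (such as "-Unraid") can be ignored. It only compares version numbers; it does not tell you whether your board is actually affected.

```shell
# Hypothetical helper: succeeds if kernel release $1 is newer than baseline $2.
# Assumes GNU `sort -V` is available and that any suffix after the first "-"
# (for example "-Unraid") can be ignored for the comparison.
kernel_newer_than() {
    running=${1%%-*}
    baseline=${2%%-*}
    # Newer means: not equal, and the running version sorts last.
    [ "$running" != "$baseline" ] &&
        [ "$(printf '%s\n%s\n' "$running" "$baseline" | sort -V | tail -n 1)" = "$running" ]
}

# Example on a live system (6.1.49 is the kernel reported for Unraid 6.12.4):
# if kernel_newer_than "$(uname -r)" 6.1.49; then echo "newer than the 6.12.4 kernel"; fi
```

Passing the version as an argument instead of calling `uname -r` inside the function makes it easy to try with the kernel versions mentioned in this thread.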

 

If you want to find out whether your BMC in Unraid is driven by ast, you can run

lspci -v

 

Then look for your BMC. The last lines of its entry show which kernel driver is in use.

For me, the (relevant) output looks like this:

06:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52) (prog-if 00 [VGA controller])
        Subsystem: ASPEED Technology, Inc. ASPEED Graphics Family
        Flags: medium devsel, IRQ 19, IOMMU group 18
        Memory at 84000000 (32-bit, non-prefetchable) [size=64M]
        Memory at 88000000 (32-bit, non-prefetchable) [size=256K]
        I/O ports at 4000 [size=128]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
        Kernel driver in use: ast
        Kernel modules: ast
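To avoid scrolling through the full output, something like the following sketch could narrow it down to the VGA controllers and the driver bound to each. `vga_drivers` is a made-up helper name, not part of Unraid or pciutils. It assumes awk is available and that the output has the same layout as above (device header at the start of the line, details indented below it).

```shell
# Hypothetical helper: print "<slot> <driver>" for every VGA controller
# found in `lspci -v` output. Reads the lspci text from stdin so it also
# works on saved output.
vga_drivers() {
    awk '
        # Any new device header (starts with a PCI slot) resets the state.
        /^[0-9a-f:.]+ / { slot = "" }
        # Remember the slot if the device is a VGA controller.
        /^[0-9a-f:.]+ VGA compatible controller/ { slot = $1 }
        # Inside a VGA entry, report the bound kernel driver.
        slot != "" && /Kernel driver in use:/ { print slot, $NF; slot = "" }
    '
}

# Usage on a live system:
# lspci -v | vga_drivers
```

For the output above, this would print `06:00.0 ast`. A VGA controller without a bound driver is simply not listed.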

 

Edited by HumanTechDesign
Added ast check
Link to comment

Thank you for starting this thread. Hopefully it gets some attention so that the issue can be addressed (and hopefully it is something that actually can be addressed).

 

Adding my +1 as a data point: I have an ASUS Pro WS W680M-ACE SE and a 13500 and am impacted by this issue. As such, I have remained on 6.12.4 at this time.

Link to comment
9 hours ago, firstTimer said:

everything is working just fine. I actually use the iGPU for Plex and a GTX 1050 for my VM and everything is fine

Thank you for your report. If I remember correctly, you had to dummy plug the dGPU to get everything working correctly together - right? Did/could you try without the dGPU put into the server? I could imagine that a dGPU changes the behavior in this instance.

Link to comment

@HumanTechDesign I corrected the answer above. Anyway, yes, 6.12.8 is installed and everything runs. For example, now I can see the data about the dGPU (frequency, temperature) because only Docker containers use it. When I pass the dGPU to the VM, I get a message on the panel of the dGPU saying that it is bound to VFIO or similar.

Link to comment
1 minute ago, firstTimer said:

For example, now I can see the data about the dGPU (frequency, temperature) because only Docker containers use it.

But based on that comment, it seems like your iGPU is not used at all in this configuration. Do you see the iGPU in the system devices in that setup? I believe the problem arises if only the BMC VGA device and the iGPU are available. Then (based on the above commit) the iGPU (or the BMC) gets blocked out. I would imagine that the dGPU is not even part of this chain and therefore gets treated differently. Therefore, it would be interesting to know whether you still have the BMC (KVM via Web) AND the iGPU available if you take out your dGPU (with Unraid after 6.12.4).

Sorry if I sound so unconvinced. I am just really trying to understand whether the Linux kernel, Unraid, or the BIOS/board is the issue.

Link to comment

Interesting. Maybe it has something to do with the fact that on the ATX version of the board, the IPMI is handled via the external card instead of a BMC that is directly soldered to the board. Nevertheless, your setup contradicts my assumption in the first post. It would be interesting to hear about other boards with soldered BMCs (such as Supermicro etc.).

Link to comment
On 3/12/2024 at 5:41 AM, HumanTechDesign said:

I have found this kernel commit which seems to be describing exactly this behavior and could correlate with the kernel updates in the relevant Unraid releases.

 

 

Not 100% sure, but from what I can tell, this commit is talking about physical video connections on the BMC itself. The Asus boards have a VGA connection that goes through the BMC.

 

I'm one of the people still on 6.12.4, and the iGPU is working for me without a dummy plug attached. I'll try the upgrade to the latest Unraid version at some point. I did try upgrading to 6.12.6 a while back, but the SR-IOV plugin started causing kernel panics on boot, so I had to revert.

Link to comment
  • 2 weeks later...

OK, so I am in the same boat as you guys, same setup. Has anyone updated to 6.12.9?

 


00:02.0 VGA compatible controller: Intel Corporation AlderLake-S GT1 (rev 0c) (prog-if 00 [VGA controller])
        DeviceName: Onboard - Video
        Subsystem: ASUSTeK Computer Inc. AlderLake-S GT1
        Flags: bus master, fast devsel, latency 0, IRQ 180, IOMMU group 0
        Memory at 6004000000 (64-bit, non-prefetchable)
        Memory at 4000000000 (64-bit, prefetchable)
        I/O ports at 7000
        Expansion ROM at 000c0000 [virtual] [disabled]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [100] Process Address Space ID (PASID)
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [300] Page Request Interface (PRI)
        Capabilities: [320] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: i915
        Kernel modules: i915

 

Edited by DivideBy0
Link to comment

Updated to 6.12.9 and experienced the same problem: I cannot have the iGPU and remote KVM at the same time. My combo is a 12th gen Intel and WS W680M-ACE SE. I also updated to the latest BIOS (3401) with no difference. I disabled the iGPU again in the BIOS after testing.

Link to comment

Updated to 6.12.10 last night and the problem remains. However, I would have been surprised by a fix, because 6.12.10 specifically rolled back the kernel to 6.1.79 (which is lower than the kernel from 6.12.9 but still higher than the 6.1.49 kernel of Unraid 6.12.4).

 

I will wait for the next major kernel jump (AFAIK it should happen with the upcoming major Unraid release). If that does not change the situation, I will file a bug report. Hopefully, Limetech can figure it out.

Link to comment