mavrrick Posted September 28, 2022
I had a working VMware ESXi environment until a few months ago. For various reasons it was abandoned for a time. Now I can't get ESXi to start any VMs at all. ESXi itself installs and starts up fine, but it isn't really usable. Whenever I try to start a VM I get an error message indicating that VMware has a problem with the system it is running on also running other virtualization software. I have tried rebuilding it several times and followed whatever tutorials I could find online for installing VMware 7.x. I have tried 7.01 and 7.03g. I have tried SeaBIOS, which now seems to cause an error during startup, and OVMF with and without TPM. The error is "vcpu-0:Invalid VMCB". Has anyone else had issues getting VMware to work with the latest version of Unraid?
SimonF Posted September 28, 2022
Have you enabled nesting in the syslinux boot options?
mavrrick Posted September 28, 2022
I had thought of this. I checked that nesting is enabled, and it is: cat /sys/module/kvm_amd/parameters/nested returns 1, which means nested virtualization is enabled.
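For anyone repeating the check above from a script, here is a minimal sketch in Python. It only reads the same sysfs file mentioned in the post. The function names are made up for illustration, and treating "Y" as enabled is an assumption to cover kernels that expose the parameter as a boolean rather than as 0/1.

```python
from pathlib import Path

# Same path as in the post; it only exists when the kvm_amd module is loaded.
NESTED_PARAM = Path("/sys/module/kvm_amd/parameters/nested")


def nested_enabled(raw):
    """Interpret the raw parameter text ("1"/"0", or "Y"/"N" as an assumption)."""
    return raw.strip().lower() in {"1", "y"}


def read_nested(path=NESTED_PARAM):
    """Return True/False for the nested flag, or None if the file is unreadable."""
    try:
        return nested_enabled(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    messages = {
        True: "nested virtualization is enabled",
        False: "nested virtualization is disabled",
        None: "kvm_amd nested parameter not found",
    }
    print(messages[read_nested()])
```

As the rest of the thread shows, a value of 1 here only confirms that the host allows nesting. It does not by itself guarantee that the guest hypervisor accepts what it is given.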
mavrrick Posted October 25, 2022
I have tried a few more things since I last posted. I tried loading VMware ESXi 8.x since it is available now. It gives two error messages:
1. Module "MonitorMode" failed to start.
2. AMD-V is supported by the platform, but is implemented in a way that is incompatible.
Were there any changes in QEMU that could affect how the CPUs are passed to underlying environments? It seems to me that a feature isn't being passed through, which stops the nested environment from working. I have also tried a few of the options I have seen for enabling nested virtualization.
afc_rich Posted November 17, 2022
Have you managed to get any further with this? I am in the same boat: I am unable to start any VMs inside ESXi. In vmkernel.log I can see entries containing the same error, "vcpu-0:Invalid VMCB".
I have the following set up for the ESXi guest:
AMD Ryzen 9 processor, passthrough
Q35-7.1
OVMF
16GB RAM
SATA disks
I am running Unraid 6.11.3, and the nested flag is set to 1 via a user script (spaceinvader). I have also checked /sys/module/kvm_amd/parameters/nested and the flag is set. I will continue to look into it myself and update here with any findings.
EDIT: esxcfg-info | grep "HV Support" reports back "3" (VT-x / AMD-V is enabled in the BIOS and can be used). So AMD-V is getting passed through to the ESXi guest VM.
Edited November 17, 2022 by afc_rich (update)
mavrrick Posted December 2, 2022
I have not made any progress. I tried again today to get this working by manipulating a few things. For now I am going to focus on ESXi 8.0. It generates that strange message: "AMD-V is supported by the platform, but is implemented in a way that is incompatible." What is strange is that I can't get ESXi 7.x working either, and that was working a while back.
I have confirmed the platform works as expected when ESXi runs on bare metal instead of under Unraid. For testing I switched my box to boot into ESXi first and then ran Unraid underneath ESXi. This kind of worked for testing, but complications with PCIe passthrough requirements made me move away from running ESXi as the main hypervisor.
My suspicion is that some of the updates applied to KVM in the last few releases may have broken something in VMware compatibility as a nested hypervisor. ESXi 8 flat out says it can't access AMD-V features when being installed. Unraid clearly can, based on what it is reporting. It is almost acting as if the virtual BIOS the VM is using has it disabled. I am not sure that is possible, but that is what it looks like.
When I ran esxcfg-info | grep "HV Support" I got a line that says
|----HV Support .....................................................1
From what I can find, that means it is supported but disabled in the BIOS.
For reference, my hardware is:
Ryzen 5950X
Asus ROG Strix MB with latest BIOS
128GB RAM
I have tried both SeaBIOS and OVMF in the settings. I have also tried adding SVM as required for the profile.
Edited December 2, 2022 by mavrrick
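The "HV Support" value is easier to compare across ESXi versions if it is pulled out of the esxcfg-info output programmatically. The following is a rough Python sketch that extracts the number from a line in the format quoted in this thread. The function name and regular expression are illustrative assumptions, and the sketch deliberately does not interpret the number beyond what posters here have reported.

```python
import re

# Matches lines such as "|----HV Support ......................1".
HV_LINE = re.compile(r"HV Support\s*\.*\s*(\d+)")


def hv_support(esxcfg_output):
    """Return the HV Support value as an int, or None if no such line exists."""
    match = HV_LINE.search(esxcfg_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Sample line copied from the post above.
    sample = "|----HV Support .....................................................1"
    print(hv_support(sample))
```

Feeding it the saved output from each ESXi version gives one number per install, which makes a difference between versions easy to spot.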
mavrrick Posted December 5, 2022
I got out my old Unraid server hardware, got it into a working state, and got Unraid running on it. I tested it by running VMware ESXi 7.0 on it, as this older hardware doesn't support ESXi 8.0, and it seems to have worked. That doesn't say much though, as the older hardware is a Sandy Bridge Intel chip that is rather dated at this point. I think this is likely something between KVM and the AMD Ryzen CPU, and which features are being allowed through to the nested hypervisor. I tried forcing the CPU profile to EPYC and EPYC-IBM, and both of these profiles were found to automatically disable the Monitor CPU feature for the lower VM.
afc_rich Posted December 6, 2022
You have the same hardware as I do: Ryzen 5950X with a ROG Strix board. I performed the same tests as you did above:
ESXi 6.7 - |----HV Support .....................................................3
ESXi 7.0 - |----HV Support .....................................................3
ESXi 8.0 - |----HV Support .....................................................1
Interesting that the result is different for ESXi 8.0 🤔
XML output for the CPU is:
<cpu mode='host-passthrough' check='none' migratable='on'>
  <topology sockets='1' dies='1' cores='2' threads='2'/>
  <cache mode='passthrough'/>
  <feature policy='require' name='topoext'/>
</cpu>
I have attempted to change the CPU mode and also the features, to no avail thus far. The next step is to configure a Linux VM to do further tests on the CPU flags. Fingers crossed I can find one that is missing and narrow the search down.
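When experimenting with different CPU modes and feature policies, it can help to summarize what a given <cpu> block actually requests before booting the guest. Below is a small Python sketch using only the standard library. The XML is copied from the post above, while the function name and the shape of the returned dictionary are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# CPU block as posted above.
CPU_XML = """<cpu mode='host-passthrough' check='none' migratable='on'>
  <topology sockets='1' dies='1' cores='2' threads='2'/>
  <cache mode='passthrough'/>
  <feature policy='require' name='topoext'/>
</cpu>"""


def summarize_cpu(xml_text):
    """Return the CPU mode, the feature policies, and the vCPU count from <topology>."""
    cpu = ET.fromstring(xml_text)
    features = {f.get("name"): f.get("policy") for f in cpu.findall("feature")}
    topology = cpu.find("topology")
    vcpus = None
    if topology is not None:
        vcpus = 1
        for attr in ("sockets", "dies", "cores", "threads"):
            # Attributes that are absent are treated as 1.
            vcpus *= int(topology.get(attr, "1"))
    return {"mode": cpu.get("mode"), "features": features, "vcpus": vcpus}


if __name__ == "__main__":
    print(summarize_cpu(CPU_XML))
```

Running it against each variant of the VM definition gives a compact record of which mode and feature combinations have already been tried.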
ryanm91 Posted December 6, 2022
I will also add that I cannot get nested virtualization (SVM) to pass through to the guest on an AMD Ryzen Zen 3 CPU (5900X).
afc_rich Posted December 6, 2022
Further update: I have built a CentOS 7.8 VM and looked at the contents of /proc/cpuinfo. The output is below:
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 33
model name : AMD Ryzen 9 5950X 16-Core Processor
stepping : 0
microcode : 0xa201016
cpu MHz : 3393.622
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm art rep_good nopl extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core retpoline_amd ssbd ibrs ibpb stibp vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt lbrv nrip_save tsc_scale vmcb_clean pausefilter pfthreshold v_vmsave_vmload vgif umip pku ospke vaes vpclmulqdq spec_ctrl intel_stibp arch_capabilities
bogomips : 6787.24
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:
As you can see, the SVM flag (along with many others) is being passed through to the VM. I'm going to dive deeper to see whether any other flags are missing that could be causing the errors in ESXi.
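One way to look for missing flags is to compare the flags line of /proc/cpuinfo on the Unraid host with the same line inside the guest. This is a minimal Python sketch of that comparison. The function names are hypothetical, and it assumes both files have been saved as text. It only reports differences and makes no claim about which flags ESXi actually requires.

```python
def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_in_guest(host_text, guest_text):
    """Return the sorted flags that the host reports but the guest does not."""
    return sorted(parse_flags(host_text) - parse_flags(guest_text))


if __name__ == "__main__":
    # Shortened, made-up inputs purely to show the call; use real captures instead.
    host = "processor : 0\nflags : fpu svm npt example_flag\n"
    guest = "processor : 0\nflags : fpu svm npt hypervisor\n"
    print(missing_in_guest(host, guest))
```

Some differences are expected (for example, the guest gains the hypervisor flag), so the output is a starting list to investigate rather than a diagnosis.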
ryanm91 Posted December 6, 2022
Quoting afc_rich: "I have built a CentOS 7.8 VM and have looked at the contents of /proc/cpuinfo [...] As you can see, the SVM flag (along with many others) is being passed through to the VM."
@afc_rich what does your XML for the VM look like? Anything special? I added require SVM to my Windows 11 VM and could not launch a VM in Hyper-V Manager, and Coreinfo did not detect SVM. Options from CPU below. It has been confirmed by others that they have been able to get nested virtualization to work on Windows 11.
afc_rich Posted December 7, 2022
More info on CentOS:
[root@centos /]# sudo virt-host-validate
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'memory' controller mount-point : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpu' controller mount-point : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller mount-point : PASS
QEMU: Checking for cgroup 'cpuset' controller support : PASS
QEMU: Checking for cgroup 'cpuset' controller mount-point : PASS
QEMU: Checking for cgroup 'devices' controller support : PASS
QEMU: Checking for cgroup 'devices' controller mount-point : PASS
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for cgroup 'blkio' controller mount-point : PASS
QEMU: Checking for device assignment IOMMU support : WARN (No ACPI IVRS table found, IOMMU either disabled in BIOS or not supported by this hardware platform)
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup 'memory' controller support : PASS
LXC: Checking for cgroup 'memory' controller mount-point : PASS
LXC: Checking for cgroup 'cpu' controller support : PASS
LXC: Checking for cgroup 'cpu' controller mount-point : PASS
LXC: Checking for cgroup 'cpuacct' controller support : PASS
LXC: Checking for cgroup 'cpuacct' controller mount-point : PASS
LXC: Checking for cgroup 'cpuset' controller support : PASS
LXC: Checking for cgroup 'cpuset' controller mount-point : PASS
LXC: Checking for cgroup 'devices' controller support : PASS
LXC: Checking for cgroup 'devices' controller mount-point : PASS
LXC: Checking for cgroup 'blkio' controller support : PASS
LXC: Checking for cgroup 'blkio' controller mount-point : PASS
LXC: Checking if device /sys/fs/fuse/connections exists : FAIL (Load the 'fuse' module to enable /proc/ overrides)
Could it possibly be IOMMU detection causing the issue? I haven't attempted to pass through any devices.
Edited December 7, 2022 by afc_rich (typo)
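With this many checks, the non-passing ones are easy to miss. The sketch below filters virt-host-validate output down to anything that is not PASS. It is written in Python against the line format shown in the post. The function name and regular expression are illustrative assumptions, and NOTE is included as a possible status only as a guess.

```python
import re

# "<driver>: <check description> : <STATUS> <optional detail>"
RESULT_LINE = re.compile(r"^\s*(\w+):\s+(.*?)\s*:\s+(PASS|WARN|FAIL|NOTE)\b\s*(.*)$")


def non_passing(output):
    """Return (driver, check, status, detail) tuples for every non-PASS result."""
    results = []
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match and match.group(3) != "PASS":
            results.append(match.groups())
    return results


if __name__ == "__main__":
    # Three lines taken from the output above.
    sample = (
        "QEMU: Checking for hardware virtualization : PASS\n"
        "QEMU: Checking for device assignment IOMMU support : WARN (No ACPI IVRS table found, "
        "IOMMU either disabled in BIOS or not supported by this hardware platform)\n"
        "LXC: Checking if device /sys/fs/fuse/connections exists : FAIL "
        "(Load the 'fuse' module to enable /proc/ overrides)\n"
    )
    for driver, check, status, detail in non_passing(sample):
        print(status, driver, check, detail)
```

On the output posted here it would leave just the IOMMU warning and the LXC fuse failure, which are the two items worth comparing against a known-good system.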
ryanm91 Posted December 7, 2022
Okay, so I noticed that my CPU shows SVM in Windows 11, but as soon as I turn on Hyper-V, SVM is no longer detected and performance suffers.
mavrrick Posted December 7, 2022
@afc_rich What is the BIOS version of your motherboard? I wonder if something changed in the AGESA AM4 part of the BIOS. I am on the latest BIOS for my board, which is 4408. Just to make sure it is the same board: it is an "Asus ROG Strix X570-E Gaming" with BIOS 4408 dated 10/27/22, but I also tried a few versions back. Your comment about IOMMU is interesting. I do use a Windows 10 VM that has a few devices passed through to it.
@ryanm91 In my earlier research trying to get this working, I saw a lot of references to VMware Workstation having problems when Hyper-V was turned on. That may be related to what you are experiencing. From what I found, I think you will need Hyper-V turned off if you are nesting VMware.
Edited December 7, 2022 by mavrrick
afc_rich Posted December 7, 2022
@mavrrick I am running ROG-STRIX-X570-E-GAMING-ASUS-4403. Prior to this I was running a version 4-5 iterations back with the same issue.
mavrrick Posted December 7, 2022
Yeah, I just upgraded from Asus 4403 to 4408 a few days ago hoping it might help. I was running an earlier version as well. I just ran virt-host-validate on my older test system and it gets a similar message. It is different since it is Intel, but this may indicate that the concern about IOMMU and ACPI is not the cause.
b0n3v Posted December 22, 2022
Same issue here. I have one VM with VMware that I use about once every 3 months, and the last time, possibly after the 6.11.x update, I hit the same error. Nothing was changed in the VM or its config; it stays stopped and is only powered on when needed. I tried some custom XML config with no success. It is important to note that I did a BIOS update and an Unraid update; before that everything worked fine.
har8ing3r Posted January 21, 2023
I also had this problem. For the time being I have reverted all the way back to Unraid 6.9.2, which exhibits none of these issues. I have both ESXi running as a guest on Unraid and a Windows 10 VM that runs VMware Workstation (where my vCenter is installed). I went to the trouble of spinning up a Windows 11 VM and testing compatibility in there as well.
The primary behavior I noticed is that the error message when running ESXi 7 and VMware Workstation 16.5 is along the lines of "vcpu0:invalid VMCB". I tested VMware Workstation 17 as well and got "AMD-V is supported by the platform, but is implemented in a way that is incompatible."
After some searching, it turns out that pre-2011 versions of AMD-V botched the VMCB flags and didn't include the proper virtualization parameters. My best guess at the moment is that the QEMU version in Unraid 6.11.x is for some reason passing through an extremely outdated implementation of AMD-V, when it was being done properly before. No amount of XML flags seems to fix the issue. Can anyone chime in on QEMU regression changes?
mavrrick Posted February 2, 2023
Is there someone in development we can tag or link to this thread to review this issue?
CONTINUUM Posted June 23, 2023
Hi, exact same problem here with an AMD EPYC 7763. Any kernel below 5.19 works just fine, but only with ESXi 7.0. It is a no-go for ESXi 8.0. Any updates on this issue, please? Thanks.
ve_tower Posted July 23, 2023
Has a fix been found for this issue? I am using a Ryzen 7950X in Unraid and I am trying to set up a nested VM with ESXi 7. It installed OK, but I am not able to start any VMs. Version 8 of ESXi reports the same CPU error.
har8ing3r Posted November 15, 2023
I concur with mavrrick and CONTINUUM. Does anyone know of a way to raise this issue? Has anyone tried upgrading recently?
mikeyosm Posted January 21
Just tried to run ESXi 7 and 8 on Unraid 6.12, same issue. I have an X570 board and a 3950X CPU. Has anyone managed to get this working?
SimonF Posted January 21
Quoting mikeyosm: "Just tried to run esxi 7 and 8 on unraid 6.12, same issue. I have x570 and 3950x cpu. Anyone managed to get this working?"
Do you have nested VM enabled?
mikeyosm Posted January 21
Quoting SimonF: "Do you have nested vm enabled"
Yep.