VM tab becomes unresponsive



OK, so I've got an issue that is puzzling me.

Everything else works fine, but when I try to open the VM tab, the VM settings, or the main dashboard (basically anything to do with VMs), my webUI becomes unresponsive.

Running /etc/rc.d/rc.php-fpm restart and /etc/rc.d/rc.php-fpm reload gives me access to the webUI again, but I still can't open those same tabs.

I have tried hard resetting my server and that did not help.

Is my USB going bad? Please help.

blackbox-diagnostics-20220622-1306.zip


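It looks like the kernel hits an oops (a NULL pointer dereference) while the radeon driver is being loaded for your HD 7450, probably when you open the VM tab. The trace below shows rpc-libvirtd triggering a driver probe that ends in fbcon_del_cursor_timer: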

Jun 22 06:36:50 BlackBox kernel: fbcon: radeondrmfb (fb1) is primary device
Jun 22 06:36:50 BlackBox kernel: fbcon: Remapping primary device, fb1, to tty 1-63
Jun 22 06:36:50 BlackBox kernel: BUG: kernel NULL pointer dereference, address: 0000000000000194
Jun 22 06:36:50 BlackBox kernel: #PF: supervisor read access in kernel mode
Jun 22 06:36:50 BlackBox kernel: #PF: error_code(0x0000) - not-present page
Jun 22 06:36:50 BlackBox kernel: PGD 0 P4D 0 
Jun 22 06:36:50 BlackBox kernel: Oops: 0000 [#1] SMP PTI
Jun 22 06:36:50 BlackBox kernel: CPU: 5 PID: 30416 Comm: rpc-libvirtd Not tainted 5.15.46-Unraid #1
Jun 22 06:36:50 BlackBox kernel: Hardware name: GIGABYTE MX31-BS0/MX31-BS0, BIOS R12 01/09/2020
Jun 22 06:36:50 BlackBox kernel: RIP: 0010:fbcon_del_cursor_timer+0x1a/0x39
Jun 22 06:36:50 BlackBox kernel: Code: 5e e9 cd e7 c3 ff 58 5b 5d 41 5c 41 5d 41 5e c3 0f 1f 44 00 00 48 81 bf e8 01 00 00 74 14 47 81 75 26 53 48 8b 9f 40 03 00 00 <f6> 83 94 01 00 00 02 74 13 48 8d bb d8 00 00 00 e8 9f f9 c5 ff 83
Jun 22 06:36:50 BlackBox kernel: RSP: 0018:ffffc900019379a8 EFLAGS: 00010246
Jun 22 06:36:50 BlackBox kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jun 22 06:36:50 BlackBox kernel: RDX: 000000000000003e RSI: 0000000000000000 RDI: ffff888105baf800
Jun 22 06:36:50 BlackBox kernel: RBP: ffff88868521d000 R08: 0000000000000001 R09: 0000000000000001
Jun 22 06:36:50 BlackBox kernel: R10: 0000000000aaaaaa R11: ffffc900077d8420 R12: 000000000000003e
Jun 22 06:36:50 BlackBox kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
Jun 22 06:36:50 BlackBox kernel: FS:  000014b0084e7640(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000
Jun 22 06:36:50 BlackBox kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 22 06:36:50 BlackBox kernel: CR2: 0000000000000194 CR3: 00000001a2132005 CR4: 00000000003726e0
Jun 22 06:36:50 BlackBox kernel: Call Trace:
Jun 22 06:36:50 BlackBox kernel: <TASK>
Jun 22 06:36:50 BlackBox kernel: con2fb_release_oldinfo.constprop.0+0x81/0x129
Jun 22 06:36:50 BlackBox kernel: set_con2fb_map+0x146/0x27b
Jun 22 06:36:50 BlackBox kernel: fbcon_fb_registered+0x11f/0x12e
Jun 22 06:36:50 BlackBox kernel: register_framebuffer+0x25d/0x2ae
Jun 22 06:36:50 BlackBox kernel: __drm_fb_helper_initial_config_and_unlock+0x3bf/0x494 [drm_kms_helper]
Jun 22 06:36:50 BlackBox kernel: radeon_fbdev_init+0xdd/0x102 [radeon]
Jun 22 06:36:50 BlackBox kernel: radeon_modeset_init+0x89b/0x8bd [radeon]
Jun 22 06:36:50 BlackBox kernel: radeon_driver_load_kms+0xf2/0x197 [radeon]
Jun 22 06:36:50 BlackBox kernel: drm_dev_register+0xf1/0x1b9 [drm]
Jun 22 06:36:50 BlackBox kernel: radeon_pci_probe+0xc8/0xf3 [radeon]
Jun 22 06:36:50 BlackBox kernel: local_pci_probe+0x44/0x85
Jun 22 06:36:50 BlackBox kernel: pci_device_probe+0x145/0x197
Jun 22 06:36:50 BlackBox kernel: really_probe+0x11e/0x2e8
Jun 22 06:36:50 BlackBox kernel: __driver_probe_device+0x91/0xc0
Jun 22 06:36:50 BlackBox kernel: driver_probe_device+0x1e/0x73
Jun 22 06:36:50 BlackBox kernel: __device_attach_driver+0x76/0x85
Jun 22 06:36:50 BlackBox kernel: ? driver_allows_async_probing+0x45/0x45
Jun 22 06:36:50 BlackBox kernel: bus_for_each_drv+0x8a/0xb0
Jun 22 06:36:50 BlackBox kernel: __device_attach+0xba/0x132
Jun 22 06:36:50 BlackBox kernel: bus_rescan_devices_helper+0x3a/0x65
Jun 22 06:36:50 BlackBox kernel: drivers_probe_store+0x34/0x4c
Jun 22 06:36:50 BlackBox kernel: kernfs_fop_write_iter+0x126/0x152
Jun 22 06:36:50 BlackBox kernel: new_sync_write+0x7f/0xb7
Jun 22 06:36:50 BlackBox kernel: vfs_write+0xd9/0x124
Jun 22 06:36:50 BlackBox kernel: ksys_write+0x76/0xbe
Jun 22 06:36:50 BlackBox kernel: do_syscall_64+0x83/0xa5
Jun 22 06:36:50 BlackBox kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 22 06:36:50 BlackBox kernel: RIP: 0033:0x14b009c5b4df
Jun 22 06:36:50 BlackBox kernel: Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 89 b8 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 dc b8 f8 ff 48
Jun 22 06:36:50 BlackBox kernel: RSP: 002b:000014b0084e6610 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Jun 22 06:36:50 BlackBox kernel: RAX: ffffffffffffffda RBX: 000000000000000c RCX: 000014b009c5b4df
Jun 22 06:36:50 BlackBox kernel: RDX: 000000000000000c RSI: 000014afb41afac0 RDI: 0000000000000029
Jun 22 06:36:50 BlackBox kernel: RBP: 000014afb41afac0 R08: 0000000000000000 R09: 000014affc0063e0
Jun 22 06:36:50 BlackBox kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000029
Jun 22 06:36:50 BlackBox kernel: R13: 0000000000000029 R14: 0000000000000000 R15: 000014b00a3dcbb0
Jun 22 06:36:50 BlackBox kernel: </TASK>
Jun 22 06:36:50 BlackBox kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xt_addrtype br_netfilter xfs md_mod amdgpu gpu_sched it87 hwmon_vid iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding igb x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel radeon kvm ast drm_vram_helper drm_ttm_helper ttm crct10dif_pclmul ipmi_ssif crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd drm_kms_helper rapl intel_cstate drm mpt3sas intel_uncore ahci i2c_i801 raid_class input_leds i2c_smbus scsi_transport_sas i2c_algo_bit led_class libahci intel_pch_thermal acpi_ipmi agpgart video i2c_core ipmi_si backlight
Jun 22 06:36:50 BlackBox kernel: syscopyarea sysfillrect sysimgblt fb_sys_fops button thermal fan [last unloaded: igb]
Jun 22 06:36:50 BlackBox kernel: CR2: 0000000000000194
Jun 22 06:36:50 BlackBox kernel: ---[ end trace 123781c430b71e14 ]---
Jun 22 06:36:50 BlackBox kernel: RIP: 0010:fbcon_del_cursor_timer+0x1a/0x39
Jun 22 06:36:50 BlackBox kernel: Code: 5e e9 cd e7 c3 ff 58 5b 5d 41 5c 41 5d 41 5e c3 0f 1f 44 00 00 48 81 bf e8 01 00 00 74 14 47 81 75 26 53 48 8b 9f 40 03 00 00 <f6> 83 94 01 00 00 02 74 13 48 8d bb d8 00 00 00 e8 9f f9 c5 ff 83
Jun 22 06:36:50 BlackBox kernel: RSP: 0018:ffffc900019379a8 EFLAGS: 00010246
Jun 22 06:36:50 BlackBox kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jun 22 06:36:50 BlackBox kernel: RDX: 000000000000003e RSI: 0000000000000000 RDI: ffff888105baf800
Jun 22 06:36:50 BlackBox kernel: RBP: ffff88868521d000 R08: 0000000000000001 R09: 0000000000000001
Jun 22 06:36:50 BlackBox kernel: R10: 0000000000aaaaaa R11: ffffc900077d8420 R12: 000000000000003e
Jun 22 06:36:50 BlackBox kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
Jun 22 06:36:50 BlackBox kernel: FS:  000014b0084e7640(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000
Jun 22 06:36:50 BlackBox kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 22 06:36:50 BlackBox kernel: CR2: 0000000000000194 CR3: 00000001a2132005 CR4: 00000000003726e0

 

But unfortunately I don't know how to fix it...
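
If you want to check a syslog for the same kind of crash yourself, here is a minimal sketch that pulls out the marker lines of a kernel oops like the one above. The function name find_oops is made up for illustration, and the default path /var/log/syslog is an assumption; point it at the syslog from your diagnostics zip if that is where your log lives.

```shell
#!/bin/bash
# find_oops [LOGFILE]
# Print the lines that mark a kernel oops in a syslog-style file.
# find_oops is a hypothetical helper name; the default path is an assumption.
find_oops() {
    local logfile="${1:-/var/log/syslog}"
    # "BUG: kernel" and "Oops:" open the report, "RIP: 0010:" names the
    # kernel function that was executing when the fault happened.
    grep -E 'BUG: kernel|Oops: |RIP: 0010:' "$logfile"
}

# Example: find_oops /path/to/extracted/syslog.txt
```

If this prints nothing, the log has no oops of this kind and the lockup probably has a different cause.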


That doesn't help.  If you're passing through the GPU, then you shouldn't be using it for anything else at all.  The two uses are incompatible with each other, and the GPU should also be isolated from the rest of the system via Tools - System Devices (the same as any other device that gets passed through).


Is there a way to stop / delete the vm causing the issue from the command line? Or should I just power off and pull the GPU? I’ve tried commenting out the amdgpu line in my go file, and I have checked the boxes for the GPU in Tools - System Devices, but the lockups still happen.

4 hours ago, SavageAUS said:

Is there a way to stop / delete the vm causing the issue from the command line?

You could use the virsh command:

virsh destroy vmname

to force stop the vm

 

virsh shutdown vmname

to gracefully shutdown the vm

 

virsh undefine --nvram vmname

to delete the vm (note the --nvram option, which is required for OVMF vms)

 

virsh list --all

to list the status of all vms

 

PS: if 'vmname' contains spaces, wrap it in double quotes, for example: virsh destroy "Windows 10"
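
If you want to combine the graceful and forced stop, a rough sketch could look like the following. It only uses the shutdown and destroy commands from above plus virsh domstate to poll the state. The helper name stop_vm and the 60 second default timeout are my own assumptions, and the "shut off" string assumes virsh is running with English messages.

```shell
#!/bin/bash
# stop_vm NAME [TIMEOUT_SECONDS]
# Ask the VM to shut down gracefully; if it is still running when the
# timeout expires, force it off. stop_vm is a hypothetical helper name.
stop_vm() {
    local name="$1" timeout="${2:-60}" waited=0
    virsh shutdown "$name" || return 1
    while [ "$waited" -lt "$timeout" ]; do
        # domstate prints "shut off" once the guest has stopped
        if [ "$(virsh domstate "$name")" = "shut off" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # Still running after the timeout: same as pulling the power plug
    virsh destroy "$name"
}

# Example: stop_vm "Windows 10" 120
```

Quoting "$name" everywhere is what lets names with spaces work without extra escaping.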

Edited by ghost82

Prior to my last hard reboot I force-deleted the vdisks of the passed-through VM, which may have helped. Running those commands would usually cause a lockup, but this time they worked and now I can access the VM tab again. Thank you heaps for the help.
Before I try another VM with passthrough, what steps do I need to take first? Should I leave amdgpu out of the go file and, in Tools - System Devices, stub the audio and video functions of the GPU?

41 minutes ago, SavageAUS said:

Before I try another VM with passthrough, what steps do I need to take first? Should I leave amdgpu out of the go file and, in Tools - System Devices, stub the audio and video functions of the GPU?

You should not edit the go file at all for gpu passthrough.

General steps are:

- bind all the components of the GPU (video and audio) to vfio in Tools - System Devices

- set up the VM by passing through all the bound devices as a multifunction device

- if you get a black or garbled screen, add extra entries to the syslinux configuration, for example video=efifb:off
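
To confirm that the bind actually took effect after a reboot, you can check which kernel driver owns each function of the GPU. This is only a sketch: pci_driver is a made-up helper name, the address 0000:03:00.0 in the example is a placeholder for your own GPU's address, and the optional second argument exists only so the function can be pointed at a different sysfs root.

```shell
#!/bin/bash
# pci_driver ADDRESS [SYSFS_ROOT]
# Print the kernel driver currently bound to a PCI device, or "none".
# pci_driver is a hypothetical helper; ADDRESS is the full PCI address,
# for example 0000:03:00.0 (placeholder, use your own).
pci_driver() {
    local addr="$1" root="${2:-/sys}"
    local link="$root/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # The "driver" entry is a symlink whose last component is the driver name
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Example: pci_driver 0000:03:00.0
```

For a GPU that is ready for passthrough, I would expect vfio-pci for both the video and the audio function, not radeon or amdgpu.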

