slimshizn Posted November 12, 2018
As stated, this happens and then I can not connect to my server at all. Only thing I can do is reboot (unclean) and try again. Seems to be happening only when I shut down or reboot the VM.
Quote
Nov 12 15:07:46 Gaming php-fpm[17706]: [WARNING] [pool www] server reached max_children setting (20), consider raising it
Nov 12 15:07:55 Gaming kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Nov 12 15:07:55 Gaming kernel: 27-...0: (1 ticks this GP) idle=c32/0/1 softirq=1547981/1547981 fqs=14983
Nov 12 15:07:55 Gaming kernel: 35-...0: (1 GPs behind) idle=9fa/1/4611686018427387904 softirq=1679893/1679896 fqs=14983
Nov 12 15:07:55 Gaming kernel: (detected by 11, t=60002 jiffies, g=869761, c=869760, q=77420)
Nov 12 15:07:55 Gaming kernel: Sending NMI from CPU 11 to CPUs 27:
Nov 12 15:07:55 Gaming kernel: NMI backtrace for cpu 27
Nov 12 15:07:55 Gaming kernel: CPU: 27 PID: 0 Comm: swapper/27 Tainted: G O 4.18.17-unRAID #1
Nov 12 15:07:55 Gaming kernel: Hardware name: Cirrascale VB1416/GA-7PESH2, BIOS R17 06/26/2018
Nov 12 15:07:55 Gaming kernel: RIP: 0010:native_queued_spin_lock_slowpath+0xfd/0x16d
Nov 12 15:07:55 Gaming kernel: Code: e1 eb 7b 31 c9 eb 36 c1 e9 12 83 e0 03 ff c9 48 c1 e0 04 48 63 c9 48 05 c0 17 02 00 48 03 04 cd 00 17 da 81 48 89 10 8b 42 08 <85> c0 75 04 f3 90 eb f5 48 8b 0a 48 85 c9 74 c9 0f 18 09 8b 07 66
Nov 12 15:07:55 Gaming kernel: RSP: 0018:ffff88046fc43dc0 EFLAGS: 00000046
Nov 12 15:07:55 Gaming kernel: RAX: 0000000000000001 RBX: 0000000000000100 RCX: 0000000000000023
Nov 12 15:07:55 Gaming kernel: RDX: ffff88046fc617c0 RSI: 0000000000700000 RDI: ffff88046f421dc0
Nov 12 15:07:55 Gaming kernel: RBP: ffff88046fc43e28 R08: 0000000000900000 R09: ffff88046f421dc0
Nov 12 15:07:55 Gaming kernel: R10: 0000000000000214 R11: 0000000000000084 R12: 0000000000000850
Nov 12 15:07:55 Gaming kernel: R13: ffff88046f421dc0 R14: 0000000000000084 R15: ffff88046f427400
Nov 12 15:07:55 Gaming kernel: FS: 0000000000000000(0000) GS:ffff88046fc40000(0000) knlGS:0000000000000000
Nov 12 15:07:55 Gaming kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 12 15:07:55 Gaming kernel: CR2: fffff804ac8bf070 CR3: 0000000004e0a004 CR4: 00000000001626e0
Nov 12 15:07:55 Gaming kernel: Call Trace:
Nov 12 15:07:55 Gaming kernel: <IRQ>
Nov 12 15:07:55 Gaming kernel: _raw_spin_lock+0x16/0x19
Nov 12 15:07:55 Gaming kernel: qi_submit_sync+0x265/0x2db
Nov 12 15:07:55 Gaming kernel: qi_flush_iotlb+0x66/0x80
Nov 12 15:07:55 Gaming kernel: iommu_flush_iova+0x5c/0xa7
Nov 12 15:07:55 Gaming kernel: iova_domain_flush+0x18/0x22
Nov 12 15:07:55 Gaming kernel: fq_flush_timeout+0x2e/0x90
Nov 12 15:07:55 Gaming kernel: call_timer_fn+0x12/0x6f
Nov 12 15:07:55 Gaming kernel: ? fq_ring_free+0x96/0x96
Nov 12 15:07:55 Gaming kernel: expire_timers+0x7f/0x8e
Nov 12 15:07:55 Gaming kernel: run_timer_softirq+0x72/0x120
Nov 12 15:07:55 Gaming kernel: ? __hrtimer_run_queues+0xbd/0x105
Nov 12 15:07:55 Gaming kernel: ? recalibrate_cpu_khz+0x1/0x1
Nov 12 15:07:55 Gaming kernel: ? ktime_get+0x3a/0x8d
Nov 12 15:07:55 Gaming kernel: __do_softirq+0xce/0x1c8
Nov 12 15:07:55 Gaming kernel: irq_exit+0x56/0x95
Nov 12 15:07:55 Gaming kernel: smp_apic_timer_interrupt+0x7e/0x89
Nov 12 15:07:55 Gaming kernel: apic_timer_interrupt+0xf/0x20
Nov 12 15:07:55 Gaming kernel: </IRQ>
Nov 12 15:07:55 Gaming kernel: RIP: 0010:cpuidle_enter_state+0xe8/0x141
Nov 12 15:07:55 Gaming kernel: Code: ff 45 84 ff 74 1d 9c 58 0f 1f 44 00 00 0f ba e0 09 73 09 0f 0b fa 66 0f 1f 44 00 00 31 ff e8 e2 b7 be ff fb 66 0f 1f 44 00 00 <48> 2b 1c 24 b8 ff ff ff 7f 48 b9 ff ff ff ff f3 01 00 00 48 39 cb
Nov 12 15:07:55 Gaming kernel: RSP: 0018:ffffc90003353ea0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Nov 12 15:07:55 Gaming kernel: RAX: ffff88046fc60c00 RBX: 000008bfdf1b91be RCX: 000000000000001f
Nov 12 15:07:55 Gaming kernel: RDX: 000008bfdf1b91be RSI: 0000000000000000 RDI: 0000000000000000
Nov 12 15:07:55 Gaming kernel: RBP: ffff88046fc69370 R08: 00001a75706d8f8a R09: 00000000000002e9
Nov 12 15:07:55 Gaming kernel: R10: 00000000001feeac R11: 071c71c71c71c71c R12: 0000000000000004
Nov 12 15:07:55 Gaming kernel: R13: 0000000000000004 R14: ffffffff81e588d8 R15: 0000000000000000
Nov 12 15:07:55 Gaming kernel: do_idle+0x192/0x20e
Nov 12 15:07:55 Gaming kernel: cpu_startup_entry+0x6a/0x6c
Nov 12 15:07:55 Gaming kernel: start_secondary+0x197/0x1b2
Nov 12 15:07:55 Gaming kernel: secondary_startup_64+0xa5/0xb0
Nov 12 15:07:55 Gaming kernel: Sending NMI from CPU 11 to CPUs 35:
Nov 12 15:07:55 Gaming kernel: NMI backtrace for cpu 35
Nov 12 15:07:55 Gaming kernel: CPU: 35 PID: 36818 Comm: CPU 15/KVM Tainted: G O 4.18.17-unRAID #1
Nov 12 15:07:55 Gaming kernel: Hardware name: Cirrascale VB1416/GA-7PESH2, BIOS R17 06/26/2018
Nov 12 15:07:55 Gaming kernel: RIP: 0010:_raw_spin_lock+0xb/0x19
Nov 12 15:07:55 Gaming kernel: Code: ff 48 29 e8 48 3d 24 f4 00 00 77 aa b8 c9 00 00 00 eb cb 89 d8 5b 5d c3 90 90 90 90 90 90 90 31 c0 ba 01 00 00 00 f0 0f b1 17 <85> c0 74 09 89 c6 e8 fc 9a a4 ff 66 90 c3 fa 66 0f 1f 44 00 00 31
Nov 12 15:07:55 Gaming kernel: RSP: 0018:ffffc900042c3b48 EFLAGS: 00000087
Nov 12 15:07:55 Gaming kernel: RAX: 0000000000600000 RBX: 0000000000000100 RCX: ffff88046fc617c0
Nov 12 15:07:55 Gaming kernel: RDX: 0000000000000001 RSI: 0000000000900000 RDI: ffff88046f421dc0
Nov 12 15:07:55 Gaming kernel: RBP: ffffc900042c3ba8 R08: 00000000006c0000 R09: ffff88046f421dc0
Nov 12 15:07:55 Gaming kernel: R10: 000000000000020c R11: 0000000000000082 R12: 0000000000000830
Nov 12 15:07:55 Gaming kernel: R13: ffff88046f421dc0 R14: 0000000000000082 R15: ffff88046f427400
Nov 12 15:07:55 Gaming kernel: FS: 00001495e7ee3700(0000) GS:ffff88087fbc0000(0000) knlGS:0000000000000000
Nov 12 15:07:55 Gaming kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 12 15:07:55 Gaming kernel: CR2: ffffb38e68cfcc08 CR3: 0000000412c4a002 CR4: 00000000001626e0
Nov 12 15:07:55 Gaming kernel: Call Trace:
Nov 12 15:07:55 Gaming kernel: qi_submit_sync+0x265/0x2db
Nov 12 15:07:55 Gaming kernel: modify_irte+0xe3/0x129
Nov 12 15:07:55 Gaming kernel: intel_irq_remapping_deactivate+0x2d/0x47
Nov 12 15:07:55 Gaming kernel: __irq_domain_deactivate_irq+0x27/0x33
Nov 12 15:07:55 Gaming kernel: irq_domain_deactivate_irq+0x15/0x22
Nov 12 15:07:55 Gaming kernel: __free_irq+0x10a/0x216
Nov 12 15:07:55 Gaming kernel: free_irq+0x42/0x59
Nov 12 15:07:55 Gaming kernel: vfio_msi_set_vector_signal+0x72/0x233
Nov 12 15:07:55 Gaming kernel: ? kvm_fast_pio+0x10a/0x147 [kvm]
Nov 12 15:07:55 Gaming kernel: vfio_msi_set_block+0x64/0x96
Nov 12 15:07:55 Gaming kernel: vfio_msi_disable+0x61/0xa0
Nov 12 15:07:55 Gaming kernel: vfio_pci_set_msi_trigger+0x44/0x228
Nov 12 15:07:55 Gaming kernel: ? pci_bus_read_config_word+0x44/0x66
Nov 12 15:07:55 Gaming kernel: vfio_pci_ioctl+0x4e1/0x974
Nov 12 15:07:55 Gaming kernel: ? vfio_msi_config_write+0x7b/0x89
Nov 12 15:07:55 Gaming kernel: ? __seccomp_filter+0x39/0x1ed
Nov 12 15:07:55 Gaming kernel: vfs_ioctl+0x19/0x26
Nov 12 15:07:55 Gaming kernel: do_vfs_ioctl+0x518/0x540
Nov 12 15:07:55 Gaming kernel: ksys_ioctl+0x39/0x58
Nov 12 15:07:55 Gaming kernel: __x64_sys_ioctl+0x11/0x14
Nov 12 15:07:55 Gaming kernel: do_syscall_64+0x57/0xe6
Nov 12 15:07:55 Gaming kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 12 15:07:55 Gaming kernel: RIP: 0033:0x1495f20e0427
Nov 12 15:07:55 Gaming kernel: Code: 00 00 90 48 8b 05 69 0a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 0a 0d 00 f7 d8 64 89 01 48
Nov 12 15:07:55 Gaming kernel: RSP: 002b:00001495e7ee0be8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Nov 12 15:07:55 Gaming kernel: RAX: ffffffffffffffda RBX: 000014936573ec00 RCX: 00001495f20e0427
Nov 12 15:07:55 Gaming kernel: RDX: 00001495e7ee0bf0 RSI: 0000000000003b6e RDI: 000000000000003e
Nov 12 15:07:55 Gaming kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 000000000000006c
Nov 12 15:07:55 Gaming kernel: R10: 000000000000006c R11: 0000000000000246 R12: 0000000000000002
Nov 12 15:07:55 Gaming kernel: R13: 000000000000006a R14: 0000000000000080 R15: 0000000000000002
Quote
M/B: GIGABYTE - GA-7PESH2
CPU: Intel® Xeon® CPU E5-2690 v2 @ 3.00GHz
HVM: Enabled
IOMMU: Enabled
Cache: 320 kB, 2560 kB, 25600 kB
Memory: 32 GB Single-bit ECC (max. installable capacity 768 GB)
Network: bond0: fault-tolerance (active-backup), mtu 1500
eth0: 1000 Mb/s, full duplex, mtu 1500
eth1: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.18.17-unRAID x86_64
OpenSSL: 1.1.1
Using a GTX 1080 as the GPU
If any more information is needed please let me know.
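Editor's note: the two NMI backtraces in the log above are easier to compare once they are reduced to a list of function names per CPU. The following is a minimal Python sketch for doing that, assuming the syslog is available with one entry per line (as in the original syslog file). The function name `summarize_stall` and the `SAMPLE` text are illustrative only; the sample lines are copied from the log above.

```python
import re

# Matches the header of each backtrace, e.g. "NMI backtrace for cpu 35".
NMI_RE = re.compile(r"NMI backtrace for cpu (\d+)")
# Matches call-trace frames such as "qi_submit_sync+0x265/0x2db" or
# "? kvm_fast_pio+0x10a/0x147 [kvm]" directly after the "kernel:" tag.
FRAME_RE = re.compile(r"kernel:\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_stall(log_text):
    """Return {cpu_number: [function names]} for each NMI backtrace found."""
    traces = {}
    current = None
    for line in log_text.splitlines():
        header = NMI_RE.search(line)
        if header:
            current = int(header.group(1))
            traces[current] = []
            continue
        if current is None:
            continue  # ignore anything before the first backtrace
        frame = FRAME_RE.search(line)
        # Frames prefixed with "?" are marked unreliable by the kernel, so skip them.
        if frame and not frame.group(1):
            traces[current].append(frame.group(2))
    return traces


# A few lines taken from the log above, used only to demonstrate the parser.
SAMPLE = """\
Nov 12 15:07:55 Gaming kernel: NMI backtrace for cpu 35
Nov 12 15:07:55 Gaming kernel: RIP: 0010:_raw_spin_lock+0xb/0x19
Nov 12 15:07:55 Gaming kernel: Call Trace:
Nov 12 15:07:55 Gaming kernel: qi_submit_sync+0x265/0x2db
Nov 12 15:07:55 Gaming kernel: modify_irte+0xe3/0x129
Nov 12 15:07:55 Gaming kernel: ? kvm_fast_pio+0x10a/0x147 [kvm]
Nov 12 15:07:55 Gaming kernel: vfio_msi_disable+0x61/0xa0
"""

if __name__ == "__main__":
    for cpu, frames in summarize_stall(SAMPLE).items():
        print(cpu, " <- ".join(frames))
```

Run against the full log, this shows that both CPU 27 and CPU 35 pass through `qi_submit_sync`, with CPU 35 getting there from the `vfio_msi_disable` path.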
John_M Posted November 12, 2018
15 minutes ago, slimshizn said: If any more information is needed please let me know.
Much more useful than the snippet you posted: Tools -> Diagnostics. Attach the resulting zip file.
slimshizn Posted November 12, 2018
gaming-diagnostics-20181112-1558.zip
slimshizn Posted November 12, 2018
I noticed I'm using the VirtIO driver ISO virtio-win-0.1.141-1.iso, and virtio-win-0.1.160-1.iso is available. Could this be the cause of the issue I'm having?
jonp Posted November 13, 2018
Possibly. Try updating the drivers in your VM and see if the problem goes away.
slimshizn Posted November 13, 2018
1 hour ago, jonp said: Possibly. Try updating the drivers in your VM and see if the problem goes away.
Thanks for the reply. I also found some information about MSI interrupts and enabling unsafe interrupts. I'm backing up the vdisks as we speak before I do anything drastic.
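Editor's note: before changing the interrupt settings, it can help to confirm what the host is currently using. This is a minimal Python sketch, assuming the "unsafe interrupts" option mentioned here corresponds to the Linux `vfio_iommu_type1` module parameter `allow_unsafe_interrupts`, which the kernel exposes under `/sys/module` when the module is loaded. The helper name `unsafe_interrupts_enabled` is hypothetical.

```python
from pathlib import Path

# Assumed location of the parameter; present only while vfio_iommu_type1 is loaded.
PARAM = Path("/sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts")


def unsafe_interrupts_enabled(param_path=PARAM):
    """Return True or False for the current setting, or None if it cannot be read."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        return None  # module not loaded, or sysfs not available
    # Boolean module parameters are reported as "Y" or "N".
    return value in ("Y", "1")


if __name__ == "__main__":
    print("allow_unsafe_interrupts:", unsafe_interrupts_enabled())
```

Recording this value before and after each change makes it easier to tell whether a later crash happened with the option on or off.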
slimshizn Posted November 13, 2018
Something was corrupt in the XML, or something else was causing weird issues when trying to start the VM. Thankfully I had backed up the VirtIO ISO and all of my vdisks. I re-created the VM and it's running fine. I'll update if it causes a crash again.
slimshizn Posted November 20, 2018
Went back to 6.6.3. Everything's working, and after a few restarts of the VM there have been no crashes. Not sure if this is related to the 6.6.5 update or not, but I'll be staying here, at least for a while.
slimshizn Posted November 27, 2018
So today, after 6 days of having no problems restarting the VM, I had to restart it after an NVIDIA GameStream issue. I restarted from inside the VM and all of a sudden the server just took a nosedive. Couldn't connect at all; it was completely locked up. Did an unclean reboot and everything is working fine again. Is there anything else I should be doing here with the W10 VM and the NVIDIA GTX 1080? I don't want to have to worry about a VM reboot causing a hard crash.
slimshizn Posted December 5, 2018
Been posting here as well about this. I do need some help with this, as MSI interrupts did not alleviate the issue. I need some guidance on getting a Windows 10 VM to work correctly with a GTX 1080 without causing crashes.
jonp Posted December 5, 2018
After reviewing your logs and hardware configuration, if I were a betting man, I'd say this is the result of using a Gigabyte motherboard. I'm sorry to say it, but I've had nothing but problems, both on my own and with other users, trying to use that garbage. For whatever reason, Gigabyte hardware (motherboards, GPUs, etc.) all seems to have issues that mainline providers like Asus, ASRock, and Supermicro just don't seem to have.
All I can suggest is that you check for a BIOS update, but I'd bet heavily on switching motherboards resolving the issue. That, or you'll have to wait for the 4.20 kernel to see if there are any fixes in there specific to your hardware/drivers, but I doubt that is going to solve your issue.
If there were something more specific in the logs / call trace, I would gladly investigate, but the generic messages and lack of any smoking gun lead me to look at the hardware. Without having seen it yet, I was already betting that either the GPU or the motherboard was Gigabyte-based, and it appears at least the motherboard is guilty of that.
If you can reproduce the issue using a different motherboard, then I think we'd have something to look at, but considering how many people here are using Xeon CPUs and NVIDIA GPUs with VMs and no issues, I doubt that's going to be the case.
slimshizn Posted December 5, 2018
That's very unfortunate to hear. I do have an Asus MOBO and can use the Gigabyte as a backup only. Just to make sure: I could use the same USB on another MOBO, correct? Any other changes I should/need to make other than the obvious?
Edit: I wanted to add that I'm also passing through a USB 3.0 hub with the board. I just read elsewhere that others had issues when the hub was included in the passthrough. Might be worth investigating.
Edited December 5, 2018 by slimshizn
slimshizn Posted December 5, 2018
I wanted to reply here in case anyone was following or reading this. As a workaround for now, I was able to log out of the VM completely instead of selecting reboot or shutdown, and THEN shut down. Doing this allowed me to avoid a server lockup/crash. I am still troubleshooting absolutely everything I can before I use this MB for file serving and non-passthrough VMs.
Edit: after removing VFIO allow interrupts, rebooting the server, starting it back up, and shutting down the VM in the same fashion, I had another crash. I'm enabling VFIO allow interrupts again to see if that was the culprit in this workaround or just a coincidence.
Edit 2: Seems to have been the USB controller I had passed through. Took that out of the equation and had a single error which I'm still working out.
Edited December 8, 2018 by slimshizn
hmoney007 Posted February 16, 2020
On 12/5/2018 at 5:31 PM, slimshizn said: I wanted to reply here in case anyone was following or reading this. As a work around for now, I was able to log out of the VM completely instead of selecting reboot or shutdown, and THEN shut down. Doing this allowed me to not run into a server lock up/crash. I am still troubleshooting absolutely everything I can before I use this MB for file serving and non-pass through VM's. Edit: after removing VFIO allow interrupts, rebooting server, starting back up and starting the server back up and shutting down the VM in the same fashion, I had another crash. I'm enabling VFIO allow interrupts again to see if that was the culprit in this work around or just a coincidence. Edit 2: Seems to have been the USB controller I had passed through. Took that out of the equation and had a single error which I'm still working out.
I have the same mobo and am having the same issues. What USB controller were you passing through?
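Editor's note: the thread never names the USB controller involved. For anyone trying to identify their own, one common starting point is to list which PCI devices share an IOMMU group, since devices in the same group are generally passed through together. The sketch below is a minimal Python example, assuming a Linux host with the IOMMU enabled and the standard sysfs layout `/sys/kernel/iommu_groups/<group>/devices/<pci-address>`. The function name `iommu_groups` is hypothetical, and the output is only PCI addresses; mapping an address to a device name (for example with `lspci`) is left out.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return groups  # IOMMU disabled, or sysfs not available on this system
    for group_dir in root_path.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


if __name__ == "__main__":
    for number, devices in sorted(iommu_groups().items()):
        print(f"group {number}: {' '.join(devices)}")
```

A USB controller that shares its group with other devices is a candidate to check first when a passthrough VM hangs the host on shutdown.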