February 26, 2017

Hi, I have been getting multiple call traces in the syslog since I upgraded from 6.2.4 to 6.3.0. I then upgraded to 6.3.2 and the problem still persists. I never had any of these issues on 6.2.4.

I also noticed that after the upgrade all my Windows VMs take longer to start up and use 100 percent of their cores for a very long time. Most of the time they idle at 100 percent usage according to the Unraid dashboard, and performance inside the Windows VM is sluggish after startup. Both Windows VMs have a video card passed through; both cards are AMD Radeon 5450s.

I made some adjustments in the BIOS to see if I could correct the problem, mainly making sure hyperthreading was enabled and adjusting memory timings, but that didn't help at all. I also use isolcpus for my main Windows 10 VM. I tried switching to different cores and that did not help either. Any suggestions would be helpful.

The diagnostics from Unraid are attached. This is the main call trace being reported:

Feb 19 22:29:40 DNASON kernel: CPU: 6 PID: 6544 Comm: CPU 0/KVM Not tainted 4.9.10-unRAID #1
Feb 19 22:29:40 DNASON kernel: Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X99P-SLI-CF, BIOS F2 11/12/2015
Feb 19 22:29:40 DNASON kernel: ffffc9002078fc68 ffffffff813a353e 0000000000000000 ffffffffa07e6f58
Feb 19 22:29:40 DNASON kernel: ffffc9002078fca8 ffffffff8104d00c 000005592078fd50 ffff880a505b8000
Feb 19 22:29:40 DNASON kernel: ffff880f55685e00 0000000000000034 0000000000325b59 ffff880d863f0cc0
Feb 19 22:29:40 DNASON kernel: Call Trace:
Feb 19 22:29:40 DNASON kernel: [<ffffffff813a353e>] dump_stack+0x61/0x7e
Feb 19 22:29:40 DNASON kernel: [<ffffffff8104d00c>] __warn+0xb8/0xd3
Feb 19 22:29:40 DNASON kernel: [<ffffffff8104d0d4>] warn_slowpath_null+0x18/0x1a
Feb 19 22:29:40 DNASON kernel: [<ffffffffa07d359a>] kvm_lapic_expired_hv_timer+0x2a/0x63 [kvm]
Feb 19 22:29:40 DNASON kernel: [<ffffffffa08f4041>] handle_preemption_timer+0x9/0x10 [kvm_intel]
Feb 19 22:29:40 DNASON kernel: [<ffffffffa08fe011>] vmx_handle_exit+0xfc7/0x105f [kvm_intel]
Feb 19 22:29:40 DNASON kernel: [<ffffffffa07c0777>] kvm_arch_vcpu_ioctl_run+0x357/0x1165 [kvm]
Feb 19 22:29:40 DNASON kernel: [<ffffffffa07bb040>] ? kvm_arch_vcpu_load+0x162/0x1a0 [kvm]
Feb 19 22:29:40 DNASON kernel: [<ffffffffa07b0d6a>] kvm_vcpu_ioctl+0x178/0x499 [kvm]
Feb 19 22:29:40 DNASON kernel: [<ffffffff8112fe72>] vfs_ioctl+0x13/0x2f
Feb 19 22:29:40 DNASON kernel: [<ffffffff811303a2>] do_vfs_ioctl+0x49c/0x50a
Feb 19 22:29:40 DNASON kernel: [<ffffffff81138f7f>] ? __fget+0x72/0x7e
Feb 19 22:29:40 DNASON kernel: [<ffffffff8113044e>] SyS_ioctl+0x3e/0x5c
Feb 19 22:29:40 DNASON kernel: [<ffffffff8167d2b7>] entry_SYSCALL_64_fastpath+0x1a/0xa9
Feb 19 22:29:40 DNASON kernel: ---[ end trace 05e7f5d44214fff3 ]---

dnason-diagnostics-20170226-1427.zip
February 27, 2017

Call traces weren't reported before 6.3.0, so it's likely you would find them in earlier syslogs too. Can you post a diagnostics or syslog from 6.2.4, and syslogs from before that?

The one above is the first, and you should probably have rebooted at that time. It's the only one that is "not tainted"; the rest are tainted. All of them occur unrelated to anything else before or after. All of them appear related to KVM instability, and almost all appear to be related to some kind of KVM timing issue. I'd recommend rebooting daily. I suspect you'll find the same call traces in earlier syslogs too.

In addition to the call traces, there are periods of USB port or controller issues, with VGA and bridge issues too, all related to VM usage, I think. I can't tell what's wrong, but it seems that either something is wrong with the hardware or a VM is not configured correctly. Also, check for a motherboard BIOS update.

A general comment: I'm tempted to speak up about how fast we're moving lately, too close to the leading edge. That means we are getting the latest upgrades with the newest features, but it also means we are running the least tested and therefore most buggy code. I'd rather stay farther behind and let others find the bugs and get them fixed. We might therefore have a KVM with fewer features but better patched.
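For anyone who wants to check the tainted / not tainted status across a whole syslog rather than by eye, here is a minimal Python sketch. It assumes every trace header looks like the "CPU: 6 PID: 6544 Comm: ..." line quoted in the first post; the function name and the regular expression are my own illustration, not part of Unraid or the kernel, and other kernel versions may format this line differently.

```python
import re

# Assumed header shape, based only on the line quoted in this thread:
#   "... kernel: CPU: 6 PID: 6544 Comm: CPU 0/KVM Not tainted 4.9.10-unRAID #1"
# A tainted kernel prints "Tainted: <flags>" in place of "Not tainted".
HEADER = re.compile(
    r"kernel: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) Comm: (?P<comm>.+?) "
    r"(?P<taint>Not tainted|Tainted: .+?) +(?P<release>\d+\.\d+\S*) #"
)


def find_trace_headers(lines):
    """Return one dict per trace header line found in the given syslog lines."""
    found = []
    for line in lines:
        match = HEADER.search(line)
        if match is None:
            continue
        found.append({
            # Everything before " kernel:" (timestamp and hostname).
            "prefix": line.split(" kernel:")[0].strip(),
            "comm": match.group("comm"),
            "tainted": match.group("taint") != "Not tainted",
            "release": match.group("release"),
        })
    return found


if __name__ == "__main__":
    sample = ("Feb 19 22:29:40 DNASON kernel: CPU: 6 PID: 6544 "
              "Comm: CPU 0/KVM Not tainted 4.9.10-unRAID #1")
    for header in find_trace_headers([sample]):
        print(header)
```

To run it over a real log, pass the lines of the syslog file from the diagnostics zip instead of the single sample line. The first header that reports tainted as False marks the trace that happened before any earlier warning had tainted the kernel.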
February 27, 2017 Author

Thanks for the reply. I just reseated the processor, GPU and RAM, so let's see if that helps. I don't have a syslog from 6.2.4, so that's probably why I never noticed it before.

All of the VMs should be configured properly. They were all configured before 6.3.2 and worked without issues. One thing I noticed is that when passing through my GPU and a USB controller to my Windows VM, it takes a really long time before I see the SeaBIOS line. This only started after the 6.3.0 update. The log only says:

qemu-system-x86_64: vfio: Cannot reset device 0000:00:11.4, no available reset mechanism

But that has always been there, even with 6.2.4, so I'm not sure what's causing the holdup. Windows 10 reboots also take a very long time in 6.3.2.
February 27, 2017

Just making sure, did you read the UnRAID OS version 6 Upgrade Notes when you upgraded to 6.3, specifically the section about poor VM performance?
February 27, 2017 Author

Yeah, I saw that and tried upgrading the machine type, but that didn't help. It actually crashed Windows on the first boot after I did it; then I rebooted Windows and it was fine.

The problem, I think, is that on startup the VM uses 100 percent of all the cores I assign to it. It does that for a while and then slowly comes down. It's almost like it has a hard time passing through the devices it's assigned.

I had quite a few issues after the upgrade. For instance, directly after the upgrade I had to balance my btrfs pool because I was getting an error message. After that was done, my pfSense virtual disk got wiped somehow and I had to restore from backup. Then, after I upgraded to 6.3.2, my pfSense virtual disk got wiped again, so we will see whether it gets wiped after the next update.

But the main issue was the call trace and the poor VM performance. I guess I will just have to wait for future updates to fix the problem.
February 27, 2017 Author

I also get this error sometimes when trying to reboot my Windows VM. Do you have any idea what could be causing it?

KVM internal error. Suberror: 3
extra data[0]: 80000b0e
extra data[1]: 31
KVM internal error. Suberror: 3
extra data[0]: 80000b0e
extra data[1]: 31
RAX=0000000000000001 RBX=ffffdd80b6de5750 RCX=ffffdd80b6de5750 RDX=fffff80599fb27f0
RSI=ffffdd80b6d71a80 RDI=ffffdd80b6d6b180 RBP=ffffdd80b71ce900 RSP=ffffdd80b6bfcf20
R8 =ffffdd80b6de5750 R9 =ffffdd80b6de56c0 R10=0000000000000006 R11=ffffdd80b71ceb40
R12=0000001bb3335abe R13=ffffaf0e7ef50f70 R14=ffffdd80b6be0180 R15=fffff802e247f000
RIP=fffff80599fb2812 RFL=00010202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 00000000 00409300 DPL=0 DS [-WA]
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =0053 0000000000000000 0000bc00 0040f300 DPL=3 DS [-WA]
GS =002b ffffdd80b6d6b000 ffffffff 00c0f300 DPL=3 DS [-WA]
LDT=0000 0000000000000000 ffffffff 00000000
TR =0040 ffffdd80b6d71b40 00000067 00008b00 DPL=0 TSS64-busy
GDT= ffffdd80b6d78c00 0000006f
IDT= ffffdd80b6d78c70 00000fff
CR0=80050031 CR2=ffffdd80b6de5754 CR3=00000000001a8000 CR4=001506f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=c1 01 83 f8 01 74 1c 8b 41 04 85 c0 74 0d 0f 1f 40 00 f3 90 <8b> 41 04 85 c0 75 f7 33 c0 48 83 c4 20 5b c3 48 8b 41 08 48 8b 51 18 48 8b 49 10 ff 15 7d
RAX=0000000000000001 RBX=ffffdd80b6de5750 RCX=ffffdd80b6de5750 RDX=fffff80599fb27f0
RSI=ffffdd80b6d8ea80 RDI=ffffdd80b6d88180 RBP=ffffdd80b71c6900 RSP=ffffdd80b7069f20
R8 =ffffdd80b6de5750 R9 =ffffdd80b6de56c0 R10=0000000000000006 R11=ffffdd80b71c6b40
R12=0000001bb32dcd7a R13=ffffaf0e7ef50f70 R14=ffffdd80b6be0180 R15=fffff802e247f000
RIP=fffff80599fb2812 RFL=00010202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 00000000 00409300 DPL=0 DS [-WA]
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =0053 0000000000000000 0000fc00 0040f300 DPL=3 DS [-WA]
GS =002b ffffdd80b6d88000 ffffffff 00c0f300 DPL=3 DS [-WA]
LDT=0000 0000000000000000 ffffffff 00000000
TR =0040 ffffdd80b6d8eb40 00000067 00008b00 DPL=0 TSS64-busy
GDT= ffffdd80b6d95c00 0000006f
IDT= ffffdd80b6d95c70 00000fff
CR0=80050031 CR2=ffffdd80b6de5754 CR3=00000000001a8000 CR4=001506f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=c1 01 83 f8 01 74 1c 8b 41 04 85 c0 74 0d 0f 1f 40 00 f3 90 <8b> 41 04 85 c0 75 f7 33 c0 48 83 c4 20 5b c3 48 8b 41 08 48 8b 51 18 48 8b 49 10 ff 15 7d
2017-02-27T02:26:44.776652Z qemu-system-x86_64: terminating on signal 15 from pid 13806
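The two "extra data" values in the error above can be unpacked mechanically, which may help when searching for similar reports. The Python sketch below rests on assumptions that are not confirmed anywhere in this thread: that Suberror 3 is KVM_INTERNAL_ERROR_DELIVERY_EV from linux/kvm.h, that extra data[0] is the VMX IDT-vectoring information field, and that extra data[1] is the VM exit reason, with bit layouts as I understand them from the Intel SDM. The function name and lookup tables are my own.

```python
# Assumed layout of the VMX IDT-vectoring information field (Intel SDM):
#   bits 7:0  vector, bits 10:8 event type, bit 11 error code valid, bit 31 valid
EVENT_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    4: "software interrupt",
    5: "privileged software exception",
    6: "software exception",
}

# Only the entries relevant to this thread; both tables are deliberately partial.
VECTORS = {14: "#PF (page fault)"}
EXIT_REASONS = {48: "EPT violation", 49: "EPT misconfiguration"}


def decode_delivery_ev(data0_hex, data1_hex):
    """Decode the two hex strings printed after 'KVM internal error. Suberror: 3'."""
    info = int(data0_hex, 16)
    reason = int(data1_hex, 16) & 0xFFFF  # basic exit reason is the low 16 bits
    vector = info & 0xFF
    event_type = (info >> 8) & 0x7
    return {
        "valid": bool((info >> 31) & 1),
        "vector": vector,
        "vector_name": VECTORS.get(vector, "unknown"),
        "event_type": EVENT_TYPES.get(event_type, "reserved"),
        "error_code_valid": bool((info >> 11) & 1),
        "exit_reason": reason,
        "exit_reason_name": EXIT_REASONS.get(reason, "unknown"),
    }


if __name__ == "__main__":
    # Values copied from the error quoted in this post.
    for key, value in decode_delivery_ev("80000b0e", "31").items():
        print(f"{key}: {value}")
```

Under those assumptions, 80000b0e decodes to vector 14 delivered as a hardware exception and 31 decodes to exit reason 49. That is only a reading of the numbers and says nothing by itself about whether the cause is hardware, the host kernel, or the VM configuration.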
Archived
This topic is now archived and is closed to further replies.