Foxyll Posted November 28, 2019 (edited)

Hello, first post here, sorry if it's not very understandable.

I've been using unRAID on a trial for a week or so now, and I'm really loving it and have a lot of projects planned with it. I've just been having an issue that isn't annoying enough to make me stop using the OS, but is annoying enough that I want to find a fix.

For a day or so, my server is fine. But after around 24-36 hours it becomes pretty unstable, especially when it comes to my VMs. At that point it is generally slow but still usable; when I then go to start a VM, it slows to basically a halt and a forced reset is almost always required. I had started to just reboot my server gracefully when I got home every day, but I have been getting jealous of the nearly year-long uptimes I've seen on here lol.

Today when I got home, I forgot to restart gracefully and tried to boot up my VM. This time, though, I managed to pull the syslog before it became completely unresponsive. Here are my system specs, and the syslog file is attached. I have tried to look at the syslog, but I can't make much sense of it. I'm pretty new to this kind of thing, as I've only ever worked with Windows and basic Linux use with some bash scripting. Can anyone give some insight into what might be going on here?

Sys specs:
CPU: AMD Ryzen 3600
Memory: 16GB Corsair Vengeance DDR4-3000MHz
GPU: XFX Radeon RX 580-8GB Factory OC'd
Mobo: Asus TUF Gaming - X570 Plus (Wi-Fi)

EDIT: Figured I should add a couple of the steps I've taken after googling around.
I've tried changing VM settings around, as that seems to be the main cause of instability, but changing Machine, CPU mode, and core count doesn't seem to help.
I've added to my syslinux file "vfio-pci.ids=1002:67df,1002:aaf0 video=efifb:off"
I've made sure that IOMMU is set to Enabled in my BIOS, instead of Auto.
These are about all the steps I've tried. After these didn't work I figured that I wasn't on the right track with my searching/troubleshooting and gave up, as it wasn't annoying me too badly.

Attached: syslog

Edited November 28, 2019 by Foxyll: Added steps taken
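For readers following along, the syslinux change described in the post above amounts to adding kernel parameters to the "append" line of the boot entry in syslinux.cfg on the flash drive. Below is a minimal Python sketch of that edit. It assumes the stock entry contains a line of the form "append initrd=/bzroot"; the function name add_kernel_params is hypothetical, and it works on a single line of text rather than on the real file.

```python
def add_kernel_params(append_line, params):
    """Return a syslinux 'append' line with any missing params added at the end."""
    tokens = append_line.split()
    if not tokens or tokens[0] != "append":
        raise ValueError("expected a syslinux 'append' line")
    # Preserve the original leading whitespace so the file layout is unchanged.
    indent = append_line[: len(append_line) - len(append_line.lstrip())]
    for param in params:
        if param not in tokens:  # skip parameters that are already present
            tokens.append(param)
    return indent + " ".join(tokens)


# Parameters quoted in the post above; the IDs belong to that poster's GPU
# and will differ on other hardware.
PARAMS = ["vfio-pci.ids=1002:67df,1002:aaf0", "video=efifb:off"]

if __name__ == "__main__":
    # Assumed stock line; check your own syslinux.cfg before editing anything.
    print(add_kernel_params("  append initrd=/bzroot", PARAMS))
```

Running the function a second time on its own output leaves the line unchanged, so it is safe to re-apply.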
trurl Posted November 28, 2019

Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post.
Foxyll Posted November 28, 2019

My server has currently only been up for ~7 hours, and I haven't noticed any slowdown yet, but here are the current diags. If you don't notice anything in these, I'll skip restarting tomorrow so that I can try to get diagnostics while it's struggling.

Attached: vixnet-diagnostics-20191128-0105.zip
trurl Posted November 28, 2019

Have you done memtest?
Foxyll Posted November 28, 2019

No, all of my components (minus GPU) are new so I didn't think to. I'll run it in a bit and post results.
Foxyll Posted November 28, 2019

Ok, that seems worrying. Trying to run Memtest86+ from my unRAID flash drive results in the system simply resetting. Perhaps I will grab another flash drive tomorrow and put only memtest on it to try that.
JorgeB Posted November 28, 2019

6 hours ago, Foxyll said: "Trying to run Memtest86+ from my unRAID flash drive results in the system simply resetting."

That's normal if booting UEFI, memtest only works with legacy boot.
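To tell which mode a running Linux system was booted in, a common check is whether the kernel exposes the /sys/firmware/efi directory, which is only present after a UEFI boot. A minimal Python sketch follows; the function name booted_via_uefi and its sysfs_root parameter are hypothetical and exist only to keep the example testable.

```python
import os


def booted_via_uefi(sysfs_root="/sys"):
    """Return True if the running Linux kernel was started by UEFI firmware."""
    # The kernel creates <sysfs>/firmware/efi only when booted through UEFI.
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


if __name__ == "__main__":
    mode = "UEFI" if booted_via_uefi() else "legacy BIOS (or not a Linux system)"
    print(f"Boot mode: {mode}")
```

If this reports UEFI, the reset described above is the expected behaviour, and switching the board to legacy/CSM boot for the memory test is the usual workaround.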
Foxyll Posted November 28, 2019

Thank you for the help so far. I realized that UEFI could be an issue and managed to run a memtest before going to bed; pictures attached. If you want a longer run let me know, but I did not want to let it run while I slept in case the machine shut off before I could get the results.
Foxyll Posted November 30, 2019

Hello again, sorry that I haven't been posting; I hadn't really captured anything new. When the server got to its "unstable" state I decided to poke around a little more to see if I could force out anything new or isolate the cause. I did so, but I still can't really make sense of the results.

The file named messing_with_vms covers my messing around with my VM settings. First I tried to boot my normal Windows VM with my graphics card, then the Windows VM with VNC, then my Arch Linux VM with my graphics card, then Arch with VNC. The only combination that produced anything was Arch on VNC; the rest just sat at a blinking cursor.

I tried Spaceinvader One's AMD reset bug fix, but the same issue persisted. On my next boot, for some reason the computer would not POST until I pulled my unRAID USB out. I then tried something else I found on the internet, iommu=soft, but this just resulted in my GPU not showing up at all to be passed to a VM.

iommu=pt is now causing me an issue. I can't boot up, because it autostarts my VMs, but then the computer locks up, and the syslog named hardcrash is the only thing I can grab. (As I was finishing writing this I managed to boot up and force stop the VM before it could crash the system, so I fixed the iommu setting in syslinux and am at least back to normal now.)

Any help is much appreciated, you guys!

Attached: messing_with_vms, hardcrash
trurl Posted November 30, 2019

59 minutes ago, Foxyll said: "I can't boot up, because it autostarts my VMs"

You can edit config/domain.cfg on flash to disable the VM service. Similarly, config/docker.cfg for Docker.
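The files mentioned above are plain KEY="value" text files, so the edit can be done in any text editor with the flash drive plugged into another computer. As an illustration only, here is a minimal Python sketch of the same change. The key name SERVICE and the value "disable" are assumptions about what domain.cfg contains, so check the actual file first; set_cfg_value is a hypothetical helper, not part of unRAID.

```python
import re


def set_cfg_value(cfg_text, key, value):
    """Set KEY="value" in shell-style config text, adding the key if it is absent."""
    new_line = f'{key}="{value}"'
    # Match the whole existing assignment but leave the line ending untouched.
    pattern = re.compile(rf"^{re.escape(key)}=[^\r\n]*", re.MULTILINE)
    if pattern.search(cfg_text):
        return pattern.sub(lambda match: new_line, cfg_text)
    separator = "" if cfg_text == "" or cfg_text.endswith("\n") else "\n"
    return cfg_text + separator + new_line + "\n"


if __name__ == "__main__":
    # Assumed sample content; the real domain.cfg holds more settings.
    sample = 'SERVICE="enable"\n'
    print(set_cfg_value(sample, "SERVICE", "disable"), end="")
```

Keeping a copy of the original file before changing it makes it easy to turn the service back on later.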
Foxyll Posted December 1, 2019 (edited)

Hello again, another day, another crash. Today I noticed some new-looking errors in the syslog after trying to start my Windows VM. They seemed to be drive related, so I was going to stop and then start the array again, but I am unable to, as it says "Disabled -- BTRFS operation is running". I tried to look this up, and found an old forum post that never seemed to get solved.

I cannot grab the diagnostics; I have been trying for the past few minutes but it just hangs, and my processor is showing odd utilization. I have attached the syslog, however, as well as an image of my processor's utilization.

EDIT: I don't have much to do today, so I can just use my laptop and leave my server in this state. Any troubleshooting you want me to try, I'm all ears.

Thank you in advance.

Attached: syslog(4)

Edited December 1, 2019 by Foxyll
trurl Posted December 1, 2019

Looks like it is having problems with your cache SSD. Check connections.
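For anyone trying to spot this kind of problem in their own syslog, one rough approach is to filter the log for lines that mention the SATA layer, I/O errors, or BTRFS errors. The Python sketch below does that. The pattern list is an assumption chosen for illustration and is not exhaustive, and a match only narrows down where to look (harmless link messages at boot will match too); it does not diagnose anything by itself.

```python
import re
import sys

# Substrings that commonly appear in kernel messages about storage trouble.
# This list is an illustrative assumption, not a complete set.
STORAGE_PATTERNS = re.compile(
    r"BTRFS (error|warning|critical)|I/O error|\bata\d+(\.\d+)?: ",
    re.IGNORECASE,
)


def storage_related_lines(syslog_text):
    """Return the syslog lines that match any of the storage patterns."""
    return [line for line in syslog_text.splitlines() if STORAGE_PATTERNS.search(line)]


if __name__ == "__main__":
    # Usage: python filter_syslog.py /path/to/syslog
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for line in storage_related_lines(handle.read()):
            print(line)
```

Repeated errors that all name the same ataN port or the same device are a reasonable hint to check that drive's cable and connectors first.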
Foxyll Posted December 1, 2019

Ah, the clip is broken on the connector of the SATA cable that runs from the SSD to my motherboard. I don't have another cable, so I will get one later.

Another thing: for some reason, when I got back into unRAID after booting back up, I had to re-enable VMs. They had become disabled.
trurl Posted December 1, 2019

5 minutes ago, Foxyll said: "for some reason when I launched back into unRAID after booting back up I had to re-enable VMs, they had become disabled."

Earlier in this thread you were having a problem because your VMs were autostarting, and I told you how to edit a file on flash to disable the VM Manager. If you did edit that file, then that is the reason.