DCG

Members
  • Posts

    5

  1. It's been a while, and the only crash since then happened just now (the server ran for more than a month before I wanted to add a different GPU and reseat the NVMe drives). I was tinkering with a Windows VM today (I can't seem to get the GPU to identify itself consistently...) when the entire system crashed.
I bound the GPU (12:00.0) to VFIO, but Windows kept throwing an error 31, which I managed to "fix" into an error 43 by installing a hidden QEMU device in Device Manager. GPU-Z wouldn't show manufacturer or chip data, only the BIOS and chip family. I had dumped my own VBIOS from a bare-metal install, but I suspected something might be wrong with it. When I tried to dump it again through the Windows VM, I lost the RDP connection (which I thought might just be a side effect of the dump), but when I refreshed the VM window in Unraid, I got kicked to the Unraid login screen. The server had also almost finished a parity check, which has automatically been restarted.
Fix Common Problems reported a device error and prompted me to install "mcelog". I've attached both the longer-running syslog and the diagnostics. One thing to note is that I reset my router every morning at 3 AM, so that might show up in the logs as well. A couple of days ago I also noticed that the time on the server was incorrect again (I got African times, whilst I'm in GMT+1).
My best guess is that I messed something up with the GPU in the VM, causing it to take part of the rest of the system down with it, but I'm not certain.
Edit: Something I did manage to fix on my own was the NIC. Flow control seems to have been the culprit there (even though I had tested that previously...).
syslog-192.168.1.170.log nas2-diagnostics-20210630-1711.zip
  2. I have enabled the syslog server and confirmed that it is writing a file. When the server crashes again, I'll post an update containing the file.
  3. I've run into this issue a couple of times now and I can't wrap my head around what could be causing it... When I first used Unraid, I ran it on an Asrock X570 Pro4, paired with an R5 3600 and 2x 16 GB ECC RAM. It worked fine, so I added an LSI SAS 9211-8i to be able to connect more HDDs in the future. The next upgrade was an Intel X550-T2 and an X570 Phantom Gaming X, to increase the speed to my desktop (I only seem to get half the speed, but that's another thing I need to figure out).
After swapping the motherboard I experienced seemingly random system hangs, during which the data on the Unraid unit was inaccessible. Sometimes I could log into the Unraid unit, other times I couldn't, and I could only "fix" it by shutting it down via the power button. Even when I could get into the webgui, I couldn't shut the server down via the shutdown or reboot command. This went on for about 3 months, with a hang about every 9 or 14 days.
Since I had upgraded to 6.9.0 in the meantime and had started using the X550-T2 at the same time I swapped the motherboard, I wanted to make sure neither of those was causing the issue. I put the older X570 Pro4 back in and had it running for a month without any issues, so I swapped the Phantom X back in again, this time with a change to power supply idle control as per:
It had been running fine for about 14 days (with daily checks to see if anything weird was up), and this evening I noticed 2 cores at 100%. Thinking this was weird, I tried to run a diagnostic, but it wouldn't do anything for about an hour... I checked whether I could access the data on the HDDs and I could, both through Samba and directly through the webgui. Oddly, the webgui didn't show the drives spinning up. The command line didn't respond either. One other thing I noticed was that the unassigned drive was unmounted, whilst I'm quite certain it was mounted this morning (it is used for the VM, which is off by default).
The attached diagnostics were taken with the Unraid unit up for only 4 hours, so I'm not too certain they are useful, but maybe there's something obvious in there that I'm missing. I'm also running Fix Common Problems, but other than reporting that I don't have auto updates enabled for Docker and plugins, it doesn't find anything.
nas2-diagnostics-20210519-2359.zip
  4. It's most likely under "advanced -> AMD CBS -> CPU Common Options", on the same page as Global C-State Control (at least it was for me).
  5. @trurl It seems I've been running into this issue for the past few months, ever since upgrading to a motherboard with more PCIe lanes... The funny thing is that the old motherboard (Asrock X570 Pro4) would run fine for months on end, whilst the newer one (Asrock X570 Phantom Gaming X) hangs around every 9 days, with everything identical save for the motherboard. I'll give the C-States a look tomorrow after work.
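
For anyone going through a saved syslog after a hang like the ones described above, a small script can pull out the lines that deserve a closer look. This is only a rough sketch under my own assumptions: the keyword list (`PATTERNS`) and the function names `find_suspect_lines` and `scan_file` are made up for illustration and are not something Unraid or Fix Common Problems provides, and the keywords that matter for a given crash may well be different.

```python
import re
import sys
from typing import Iterable, List, Tuple

# Hypothetical keyword list (an assumption, not an official set of markers);
# adjust it to whatever your own crash actually writes to the log.
PATTERNS = ("machine check", "hardware error", "call trace", "vfio")
_REGEX = re.compile("|".join(re.escape(p) for p in PATTERNS), re.IGNORECASE)


def find_suspect_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing a keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if _REGEX.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan a syslog file on disk, tolerating any odd bytes in the log."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_suspect_lines(handle)


# Only runs when a log path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for number, text in scan_file(sys.argv[1]):
        print(f"{number}: {text}")
```

Run it with the path to the syslog file as the only argument. The matching is a plain case-insensitive substring search, so it will produce false positives (for example every routine VFIO message); the point is just to narrow a month-long log down to something readable.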