NerdyGriffin

Everything posted by NerdyGriffin

  1. I have found a temporary workaround in Unraid's routing table which allows me to restore the connection between Unraid and the VMs. If I add an entry to the Unraid routing table using the VM IP addresses as the destination and the IP of the router as the gateway, it fixes the connection issue until the next time I reboot Unraid. For now I have added the following lines to my /boot/config/go to add these routes at startup:

     ```
     # Workaround for Unraid VMs unable to connect to host server
     ip route add 10.0.0.8/29 via 10.0.0.1 metric 1
     ip route add 10.0.0.16/29 via 10.0.0.1 metric 1
     ```

     This results in the following routing table:

     ```
     UNRAID:~$ ip route list
     default via 10.0.0.1 dev bond0 metric 1
     default via 10.10.128.1 dev bond0.10 metric 10
     10.0.0.0/16 dev bond0 proto kernel scope link src 10.0.0.2 metric 1
     10.0.0.8/29 via 10.0.0.1 dev bond0 metric 1
     10.0.0.16/29 via 10.0.0.1 dev bond0 metric 1
     10.10.128.0/17 dev bond0.10 proto kernel scope link src 10.10.128.2 metric 1
     172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
     172.18.0.0/16 dev br-52fc2ba579fe proto kernel scope link src 172.18.0.1 linkdown
     172.19.0.0/16 dev br-3be2b68f3367 proto kernel scope link src 172.19.0.1 linkdown
     192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
     ```

     Notice that the entry for the `10.0.0.0/16` subnet does not include a `via`, which would imply that traffic to that subnet should be using the `default via 10.0.0.1 ...` route. In practice, however, packets are not being routed between Unraid and the VMs (as explained in my original post). By adding routing table entries that explicitly route the VM IPs through the router (`via 10.0.0.1`), I am able to restore the connection. This lets me continue using the VMs as normal for now, but it does not identify or fix the root cause of the problem. I would like to understand why this is happening in the first place.
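     In case it helps anyone comparing before and after, one way to confirm which table entry the kernel actually picks for a given VM address is `ip route get`; a quick sketch (10.0.0.18 below is just a placeholder for one of the VM IPs):

     ```
     # Ask the kernel which route it would use for a specific VM address.
     # 10.0.0.18 is a placeholder; substitute one of your actual VM IPs.
     ip route get 10.0.0.18

     # Without the workaround this shows the on-link 10.0.0.0/16 route (no "via");
     # with the /29 entries from my go file added, it should show "via 10.0.0.1".
     ```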
  2. Starting last night, my two Ubuntu VMs are unable to connect to Unraid and vice versa. In the following examples, I will be replacing the real IP address of the Unraid server with 10.0.0.XX and that of the VM with 10.0.0.YY.

     If I try to ssh from Unraid to a VM, I get the following error:

     ```
     ssh: connect to host 10.0.0.YY port 22: No route to host
     ```

     If I ping the VM IP from Unraid, I get the following:

     ```
     ping -c 4 10.0.0.YY
     PING 10.0.0.YY (10.0.0.YY) 56(84) bytes of data.
     From 10.0.0.XX icmp_seq=1 Destination Host Unreachable
     From 10.0.0.XX icmp_seq=2 Destination Host Unreachable
     From 10.0.0.XX icmp_seq=3 Destination Host Unreachable
     From 10.0.0.XX icmp_seq=4 Destination Host Unreachable

     --- 10.0.0.YY ping statistics ---
     4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3041ms
     pipe 2
     ```

     The result of running traceroute on the Unraid server:

     ```
     UNRAID:~$ traceroute 10.0.0.YY
     traceroute to 10.0.0.YY (10.0.0.YY), 30 hops max, 60 byte packets
      1  unraid.lan (10.0.0.XX)  3055.969 ms !H  3055.943 ms !H  3055.933 ms !H
     ```

     If I ssh from the VM to Unraid, it hangs for several minutes before giving me `Connection timed out`.

     If I ping from the VM to Unraid:

     ```
     VM:~$ ping -c 4 10.0.0.XX
     PING 10.0.0.XX (10.0.0.XX) 56(84) bytes of data.

     --- 10.0.0.XX ping statistics ---
     4 packets transmitted, 0 received, 100% packet loss, time 3057ms
     ```

     If I traceroute from the VM to Unraid:

     ```
     VM:~$ traceroute 10.0.0.XX
     traceroute to 10.0.0.XX (10.0.0.XX), 30 hops max, 60 byte packets
      1  * * *
      2  * * *
      3  * * *
      4  * * *
      5  * * *
      6  * * *
      7  * * *
      8  * * *
      9  * * *
     10  * * *
     11  * * *
     12  * * *
     13  * * *
     14  * * *
     15  * * *
     16  * * *
     17  * * *
     18  * * *
     19  * * *
     20  * * *
     21  * * *
     22  * * *
     23  * * *
     24  * * *
     25  * * *
     26  * * *
     27  * * *
     28  * * *
     29  * * *
     30  * * *
     ```

     However, when I ssh to the VMs from my Windows desktop or from a Raspberry Pi, it works just fine. The same goes for ssh to the Unraid server from those physical devices. It seems like all the physical devices on the network can connect to Unraid and the VMs as normal, but the VMs and Unraid are unable to connect to each other.
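     For anyone hitting the same thing, here is a rough sketch of the kind of checks that can narrow this down from the Unraid console; the bond0 interface name and the 10.0.0.YY placeholder are just from my setup, and tcpdump may or may not be present on your box:

     ```
     # Does Unraid have a failed ARP/neighbor entry for the VM?
     # "FAILED" or "INCOMPLETE" here suggests ARP requests are going unanswered.
     ip neigh show 10.0.0.YY

     # Which interface does the kernel think it should use to reach the VM?
     ip route get 10.0.0.YY

     # If tcpdump is available, watch whether ARP and ICMP for the VM actually
     # appear on the wire while a ping runs in another terminal.
     tcpdump -ni bond0 "arp or (icmp and host 10.0.0.YY)"
     ```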
  3. Thank you so much! I have been trying to fix this problem ever since I updated to 6.12
  4. When I try to edit the config of any of my docker containers, the WebUI loads a template file from the path /boot/config/plugins/dockerMan/templates-user/bak-2022-10-02/my-*.xml rather than using /boot/config/plugins/dockerMan/templates-user/my-*.xml

     If I make changes and hit Apply, the changes are saved to the file in /boot/config/plugins/dockerMan/templates-user/my-*.xml and it appears Unraid is actually reading that version of the file, but if I try to edit it again the WebUI still shows me the version from /boot/config/plugins/dockerMan/templates-user/bak-2022-10-02/my-*.xml

     I only noticed this issue today, so I don't know when it truly started. Based on the date of that "bak" folder, I am guessing it started after the last Unraid version update.

     I can edit the files in /boot/config/plugins/dockerMan/templates-user/my-*.xml through the command line, and it appears those changes are applied when I restart a container, but why is the Unraid WebUI pulling from a subfolder with old versions of my docker templates? I also tried removing the `bak-2022-10-02` folder, which just makes the WebUI display a blank/broken template page.

     Does anyone know why the WebUI is using this different path for the docker templates, and is there any way I can change it back to normal?
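     In case it helps anyone else debugging the same thing, one way to compare the two copies from the command line is something like the following (the paths are the ones from my system above; adjust the date in the bak folder name to match whatever yours is called):

     ```
     TEMPLATES=/boot/config/plugins/dockerMan/templates-user
     BAK="$TEMPLATES/bak-2022-10-02"

     # Show which fields differ between the live templates and the old copies
     # that the WebUI keeps loading.
     for f in "$TEMPLATES"/my-*.xml; do
         name=$(basename "$f")
         echo "=== $name ==="
         diff "$BAK/$name" "$f"
     done
     ```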
  5. I have found the cause: it appears to be a recent update to iptables, which is a dependency of these Privoxy docker containers. The solution was found here (see Q24 and Q25).
  6. I know this thread is over a year old, but I am just now encountering this same issue and have not yet found a solution. Everything was working fine until a few weeks ago, and now all the containers routed through the VPN seem to be inaccessible. I will continue troubleshooting; I am just throwing this out there in case anyone sees it or has any suggestions.
  7. On my system, I have always had PCIe ACS override set to "Both" and never had issues with the NVMe cache, but today I tried setting PCIe ACS override to "Disabled" and also tried "Downstream", and in both of those modes I got the "disk missing" error from the cache drive every time I tried to start the array. This makes some sense, since NVMe is a PCIe device, so I just wanted to add this in case it's helpful to someone.

     In summary, if you get those "device ... Missing" errors with an NVMe drive as your cache drive, it might help to try each of the possible options for the "PCIe ACS override" setting to see if one of them fixes the cache issue. The results will probably be different for everyone depending on your motherboard and probably also your arrangement of other PCIe devices.
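     For reference, the "PCIe ACS override" setting is (as far as I know) just a kernel parameter added to the append line in syslinux.cfg, so you can double-check what the current boot is actually using from the console; a quick sketch:

     ```
     # Show whether, and how, the ACS override patch is active on the current boot.
     grep -o 'pcie_acs_override=[^ ]*' /proc/cmdline || echo "pcie_acs_override not set (Disabled)"

     # Values I believe correspond to the GUI options:
     #   pcie_acs_override=downstream                 -> "Downstream"
     #   pcie_acs_override=multifunction              -> "Multi-function"
     #   pcie_acs_override=downstream,multifunction   -> "Both"
     ```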
  8. I have been having these exact same latency problems ever since I upgraded to 6.9.2, but everything was flawless before that update. When I roll the server back to 6.8.3, the problem goes away. I have tried all the things mentioned here, but unfortunately they made no difference for me. The "vcpusched" change actually caused worse latency for me. I have also tried a fresh Windows 10 install, but the latency issues remained.

     Oddly enough, I happened to try playing around with a Windows 11 VM (following the @SpaceInvaderOne video) and the Windows 11 VM has had absolutely no latency problems, despite passing through the exact same hardware. In fact, the XML for the Windows 11 and Windows 10 VMs is nearly identical, and yet the Windows 10 VM shows all these latency issues while the other does not.

     Edit: Upon further testing, the same latency issues do persist in the Windows 11 Insider Build.
  9. The new drive arrived, everything swapped over smoothly, and it is back to normal. Now I just have to figure out how/where to responsibly recycle or dispose of the old drive...
  10. @tommykmusic I'm no expert in this, but based on the message in that image I would guess it is either the HTPC share having a problem in its config/settings, or an issue with the user login details not matching what the share is expecting.

      After re-reading more closely, I wonder if it is more of a network-related issue, since you said the HTPC doesn't seem to be connecting to any SMB stuff, in or out.

      If you want to explore the "user login" angle, you might try manually adding an entry to the Credential Manager in Windows (if you haven't tried that already). Let me know if you want me to explain what I mean by that in more detail. I can try to take some screenshots and walk through that process if you want. (I doubt the Credential Manager thing will fix your particular issue, but there is no harm in trying.)

      Also a quick clarifying question: is that a picture of the HTPC attempting to connect to an Unraid share, or a picture of another PC trying to connect to the HTPC share?
  11. Starting yesterday, while I was moving a lot of data around, I received a lot of warnings about SMART reporting "Current pending sector" and "Offline uncorrectable" on one of my parity drives. The raw value for both of these was 16 at first, then about half an hour later it changed to 24, then another half hour later it was 46 or something like that.

      So, as per a recommendation I read in this thread, I ran a SMART extended self-test on the drive, which eventually ended with the result "Errors occurred - Check SMART report". It looks to me like the extended self-test failed, which I assume means I should replace the drive as soon as possible, but I am not entirely sure how to interpret the massive pile of data in the SMART report. So I figured I would share the report here to hopefully get a second opinion from someone who knows more about this stuff, before I commit to spending $100+ on a new hard drive.

      griffinunraid-smart-20210509-1602.zip
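      For anyone reading their own report, the two attributes in question and the self-test results can also be checked directly from the Unraid console with smartctl; a minimal sketch (replace /dev/sdX with the parity drive's device):

      ```
      # Full SMART report, including attribute 197 (Current_Pending_Sector),
      # attribute 198 (Offline_Uncorrectable), and the self-test log.
      smartctl -a /dev/sdX

      # Start another extended (long) self-test, then check its result later.
      smartctl -t long /dev/sdX
      smartctl -l selftest /dev/sdX
      ```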
  12. As a small update: after downgrading to 6.8.3, the performance of everything is much, much better than it was before. I don't know if or how that helps, but clearly the issues I have been having are related to something that changed in 6.9.x. While I do love the new features added in 6.9, I guess I will be sticking with 6.8.3, at least until I can figure out what was wrong or until I get way too curious and try experimenting again.
  13. I don't mean to be annoying, but might I recommend updating the title to clarify that this thread is specifically about issues connecting to Unraid shares from a Windows machine? I only mention this because I often see the thread, and it is not obvious from the title alone that it is not about Windows (VM) issues with Unraid, which is an equally common use case/topic. I personally think it would be helpful to make the title more specific, but maybe I'm wrong and that is not actually a common point of confusion for most people.
  14. I tried looking for existing posts about this but couldn't find any, so hopefully this isn't a duplicate.

      It would be really helpful to be able to set `multifunction='on'` for certain passthrough devices, such as a NIC for pfSense.

      Use case: In order to get my pfSense VM to recognize the NIC ports as 4 separate interfaces (rather than seeing the whole card as only one interface), I have to manually edit the XML so that the first of the 4 "devices" that make up the NIC has multifunction='on' and the others have the correct matching bus and function address values. The real problem is that if I forget and make any edit via the VM template in the WebUI, it will "undo" this multifunction option.

      Since this "multifunction" thing is actually (usually) a capability of the device, it should be possible for Unraid to detect when it should offer this option by noticing that the addresses of the devices are all of the form 04:00.0, 04:00.1, 04:00.2, 04:00.3, etc. (a rough detection sketch follows at the end of this post). As an alternative or additional method of detection, you may also notice that they should all have the same device IDs. For example, mine shows 8086:1521 on all 4 of these because they are the 4 interfaces of the same physical PCIe card:

      ```
      IOMMU group 20: [8086:1521] 04:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
      IOMMU group 21: [8086:1521] 04:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
      IOMMU group 22: [8086:1521] 04:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
      IOMMU group 23: [8086:1521] 04:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
      ```

      Of course, I understand that just because this looks simple in theory doesn't mean it will be easy to implement in practice. I don't fully know how the UI-to-XML generation works, but I have been involved in software development long enough to know that there may be other obstacles or limitations of the system that I am completely unaware of. So, I completely understand if this request/idea gets rejected.

      Lastly, I would like to provide an example of the desired XML with multifunction enabled vs. the default XML generated by the VM template.

      Here is the XML with multifunction enabled:

      ```
      ...
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
        </source>
        <alias name='hostdev0'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0' multifunction='on'/>
      </hostdev>
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
        </source>
        <alias name='hostdev1'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </hostdev>
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x2'/>
        </source>
        <alias name='hostdev2'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x2'/>
      </hostdev>
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x3'/>
        </source>
        <alias name='hostdev3'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x3'/>
      </hostdev>
      ...
      ```

      Here is the XML generated by the WebUI:

      ```
      ...
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
        </source>
        <alias name='hostdev0'/>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </hostdev>
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
        </source>
        <alias name='hostdev1'/>
        <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </hostdev>
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x2'/>
        </source>
        <alias name='hostdev2'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </hostdev>
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <driver name='vfio'/>
        <source>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x3'/>
        </source>
        <alias name='hostdev3'/>
        <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
      </hostdev>
      ...
      ```
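      And here is the rough detection sketch mentioned above: plain shell, not a proposal for how Unraid should actually implement it, just to show that grouping PCI functions by bus:slot makes multi-function devices like my 04:00.0-04:00.3 NIC stand out:

      ```
      # Count how many functions share each PCI bus:slot. Anything with more
      # than one function is a multi-function device and would be a candidate
      # for multifunction='on' on its first hostdev entry.
      lspci | cut -d' ' -f1 | cut -d. -f1 | sort | uniq -c | awk '$1 > 1 { print $2 ": " $1 " functions" }'
      ```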
  15. Would it be possible to experiment with some of these ideas as a plugin or something? If so, I would be very interested in trying to create a plugin/addon for that myself. Now that I think about it, I know there is already a system for themes, either built-in or available as an addon, but I never really bothered to play around with it or look into the limitations of what can already be done with those tools...
  16. Last night I decided to give in and downgrade back to 6.8.3, and I posted some stuff here about it, but I will try to keep the performance troubleshooting discussion in this thread since the other thread is specifically about the downgrade process.

      I have not tried running games yet, but I can already tell that after the downgrade I am seeing a huge improvement in performance just in the Unraid GUI mode and the WebUI in general. The entire WebUI experience was super laggy for me after upgrading to 6.9, so the issue was not just VM performance; it was all of Unraid performing poorly for some reason. As I mentioned in the linked comment, even just the startup/login screen of the Windows VM was super responsive, whereas with Unraid 6.9 that Windows login screen was laggy and typing in the password had a huge delay before it would actually show the inputs on screen. It never missed a keypress along the way, so everything was getting received and processed correctly eventually; it was basically just absurd CPU or I/O latency across the board, and I have no idea why.

      If it continues to run smoothly in 6.8 after more testing, then I guess I will leave it at that version until I have more free time to try upgrading and troubleshooting again. (Otherwise it will have to wait until I can afford to buy or build another computer, at which point I will likely leave the current one as Unraid and use the new one as a normal desktop/workstation.)

      However, @SimonF, feel free to ask additional questions about the setup, either its current state or its state before the downgrade, because I have backups of all the config files and the VM XML files, taken before the upgrade, after the upgrade, after troubleshooting, etc. (I made backups a few times a month after upgrading to 6.9 because I was experimenting and trying different settings so much that I didn't want to forget what I had changed or what config I had started with before tinkering.) On that note, let me know if you want me to share any of those XML files, if it helps in any way.
  17. I am using OVMF and i440fx-## [whatever the highest version number is in the list]
  18. I have seen this suggested in several places, and unfortunately it made no difference in my case. I was already using the "performance" setting before the upgrade, and for troubleshooting's sake I also tried all the other options for it, which did not change my problem. I also tried changing the overclock in the BIOS, resetting everything back to stock speeds in the BIOS, and all sorts of variations. None of it made any difference; the problem still persists (in 6.9).

      Since I haven't finished the downgrade yet, I do not know whether downgrading to 6.8 makes the problem go away, but the first time the Windows VM booted up it was running way better than before. Even just the login screen during startup was way more responsive than it used to be, so I think the downgrade probably made a difference.
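      For completeness, a quick way to confirm from the console which governor is actually active on each core, independent of what the GUI says (just a sanity check, nothing clever):

      ```
      # Show the scaling governor currently applied to every CPU core.
      cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | sort | uniq -c

      # Show the frequency-scaling driver in use (intel_pstate, acpi-cpufreq, etc.),
      # which affects which governors are even available.
      cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
      ```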
  19. My downgrade process hit a snag many times along the way, so I will leave the following information here for other people who try to downgrade while using a very non-stock configuration.

      At first I was hesitant to start a downgrade because I was worried it would involve a lot of troubleshooting along the way, but then I saw this thread and it sounded like it would be simple: just download the zip of the version I want, copy over the `bz*` files, reboot, and reassign the cache drive. In hindsight, I guess I should have listened to my instincts telling me that it would turn into hours of troubleshooting, because that is exactly what happened. However, I will admit that most of these problems are my fault for over-complicating my Unraid setup. I have a pfSense VM as my router, and the VM domains are on the cache drive, so...

      First, I had to find a way to reassign the cache drive... but since the pfSense VM was on the cache drive, the router VM did not start, so I had no way to access Unraid over the network. I had to hook up my keyboard and monitor directly and reboot Unraid into GUI mode, but at least I was prepared for that hardware-wise.

      Next, I was finally inside the Windows VM and thought everything was all set, until I realized that the VM had no internet and was failing to get an IP from DHCP. So I went poking around in Unraid again... I saw that pfSense was running, but all the other Linux VMs had failed to start. I realized they failed because they were trying to use newer versions of Q35 or i440fx that were available in Unraid 6.9 but not in 6.8, so I had to fix all of those.

      And THEN, with the pfSense VM still running, Unraid still had no internet because I forgot to re-add the `vfio-pci.ids=...` stubbing to the syslinux.cfg, so Unraid was trying to use the Ethernet NIC that was being passed through to pfSense! Somehow that network card was still passed through successfully, because pfSense was working fine (the Wi-Fi network could access the internet!), but at the same time the Unraid OS still thought it was using that same interface, so Unraid and all the other VMs had no internet. After fixing the syslinux.cfg, I had to reboot twice for some reason before Unraid finally booted up with the motherboard Ethernet as `eth0` rather than trying to hold onto the pfSense network card.

      Now it seems like I have the array and the VMs up and running correctly in 6.8, and I will try re-enabling the dockers next. (I had disabled the dockers because I kept getting hung up on things due to either the cache drive being missing, or the network being missing, or whatever else happened while I was troubleshooting.)

      So I guess the TL;DR here is: be careful about trying a downgrade if your Unraid setup is very non-standard or very customized. Be prepared to re-do any workarounds that you had previously used before Unraid 6.9 (such as anything replaced by the new vfio-pci binding improvements); otherwise you will have to fight your way into the server to set all those things up again after the downgrade.
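      Related to the vfio-pci.ids part of that mess, a quick way to sanity-check after a reboot that the stubbing is back in place (the 8086:1521 ID below is just my NIC as an example):

      ```
      # Confirm the stubbing made it back into the kernel command line.
      grep -o 'vfio-pci.ids=[^ ]*' /proc/cmdline

      # Confirm each NIC function reports "Kernel driver in use: vfio-pci"
      # instead of the normal igb driver.
      lspci -nnk -d 8086:1521
      ```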
  20. @Jokerigno and @Crimson Unraider, it sounds like you are having the same problem as me, with VMs that used to run perfectly in 6.8 but are now super laggy with 6.9. I have not figured out a solution or even a cause, so I ended up here after I decided to bite the bullet and downgrade back to 6.8. I doubt this will help, but here is another thread where I have posted some comments discussing the issues I'm seeing. There has been some back-and-forth about similar issues, but no universal solution, it seems.
  21. I tried various changes to the VM settings and nothing has made any difference yet, but I did try turning down the polling rate of the mouse and it improved the in-game issues; not to a playable extent, but the drop in performance is less dramatic. (Importantly, that only affects the in-game issues. The issues seen on the desktop with "normal" programs are completely unchanged by that, so there is some other root cause.) I also noticed that the CPU usage reported in Task Manager drops from 60% to below 30% during the moments of in-game lag, so I would guess that means something is stuck waiting an abnormal amount of time on I/O for some reason.
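      In case it points anyone in a useful direction, one way to watch the host side while reproducing the lag is something like the following (vmstat should be present on a stock Unraid install as far as I know; mpstat is part of sysstat and may not be):

      ```
      # Watch host-wide CPU stats once per second while reproducing the lag in the VM.
      # A spike in the "wa" (iowait) column while the in-game FPS drops would point
      # at I/O wait rather than raw CPU load.
      vmstat 1

      # Per-core view, if mpstat happens to be installed; "%iowait" is the column to watch.
      mpstat -P ALL 1
      ```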
  22. I tried that, in various configurations, but it actually made the problems in the VM worse. I was actually using it with 6 cores for many months before the update to 6.9, and then the performance problems started after the 6.9 update, so I tried reducing the load from everything else (turning off extra VMs, turning off extra dockers, limiting docker CPU usage, pinning docker CPU cores to leave more unused cores for the host OS, etc.) so that I could rebalance things enough to give the Windows VM more cores. In 6.8 it was perfectly smooth with 6 cores, but now in 6.9 the VM runs so badly with 6 cores that it is completely unusable for daily work, and with 8 cores it is slightly better but nowhere near how it performed on Unraid 6.8.
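      For anyone who wants to reproduce the "limit the docker side" part of that experiment from the command line rather than the WebUI pinning page, the docker flags look roughly like this (the container name and core numbers are placeholders, and this is only a sketch of the idea, not necessarily exactly what the WebUI does):

      ```
      # Restrict a running container to cores 10-11 and cap it at roughly two cores
      # of CPU time, leaving the lower-numbered cores free for the Windows VM and the host.
      # "some-container" is a placeholder name.
      docker update --cpuset-cpus="10,11" --cpus="2" some-container
      ```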
  23. To add some new information: I have since done some more serious testing using benchmarks such as 3DMark, and the results were possibly informative. Qualitatively looking at the graphs, the GPU benchmarks showed completely normal performance, but the CPU benchmarks showed a massive and obvious drop compared to what I have seen on the same computer in the past. By massive, I mean the scores/FPS were about 20-40% of how this system performed the last time I tested it (which was probably a year or more ago). My interpretation: the benchmarks suggest it is most likely a CPU bottleneck or emulation overhead of some kind, because the performance only drops/stutters during CPU-focused workloads (load inside the VM only).
  24. 3.0 (qemu...). I also tried 3.0 nec, and I tried 2.0; it made no difference to these issues.
  25. I am actually running a game right now as a test, and if I don't touch anything at all it sits smoothly at 60 FPS, but if I move the mouse around it freezes/stutters...

      Update: Oh wow, I tried moving the character in game with the keyboard while intentionally keeping the mouse still, and there is absolutely no lag at all with keyboard input. So somehow it is slightly better than before, in the sense that only the mouse triggers game lag? This still does not explain the lag problems outside of games though... It still lags just typing in the PowerShell terminal, web browsers, etc.

      Edit: This is of course just one game and not a very rigorous test, so it's not enough to make assumptions about the cause of the problem.