bwnautilus Posted December 3, 2023

This is especially noticeable when selecting the Dashboard, Plugins or Docker tabs. Sometimes it takes up to 45 sec to render the page. On the Dashboard tab there are always at least 3 CPUs that are pegged at 100%. The page will finish rendering when the CPUs go back to normal load. I also notice this process in htop that pops to the top when the page is rendering:

/usr/local/bin/unraid-api/unraid-api /snapshot/api/dist/unraid-api.cjs start

I will be rolling back to 6.11.5. Diags attached. Thanks in advance.

mediatower-diagnostics-20231203-1452.zip
ljm42 Posted December 3, 2023

It looks like you've been with us for a while, so I'd recommend navigating to Settings > Docker, switching to advanced view, and changing the "Docker custom network type" from macvlan to ipvlan. This is the default setting that Unraid 6.11 ships with. This will prevent crashes related to macvlan call traces, though I'm not sure whether the slowdowns you are seeing are related to macvlan issues.

If that doesn't help, since htop is pointing at the unraid-api, I'd suggest uninstalling the Connect plugin to see if that helps. It may be a symptom of the problem and not the cause, but we might as well rule it out.
bwnautilus Posted December 4, 2023 (Author)

@ljm42 Thanks for your suggestions. After rolling back to 6.11.5 (GUI was back to normal) I changed the Docker settings to ipvlan and removed Connect. Downloaded 6.12.6 and rebooted. With the array not started, the GUI is still unresponsive on Dashboard and Plugins tabs. I will roll back to 6.11.5 again and wait for an updated Unraid release.
ljm42 Posted December 5, 2023

This feels like something specific to your environment that may not be automatically solved by a new release. It is up to you, but if you'd like to keep going, here's what I'd recommend...

Set up a new flash drive with 6.12.6 and boot into a default config. Navigate around the webgui and see how it responds. If this all works fine, then that points to a configuration issue with your server that we can work to isolate.

If it is still slow, I would start by focusing on your client. Try accessing the server from private/incognito mode, or a different browser, or even a different computer. The webgui did change, so it is possible that browser extensions or security software on the client could be causing issues with the updated webgui.

Either way, be sure to grab diagnostics while in this state.
aje14700 Posted January 8

@bwnautilus It seems like I had something similar. My dashboard would take 1-2 minutes to load, and 1 CPU thread would be locked to 100%; the process would die, and then pop back up on a logical CPU. It _seems_ like it was related to CPU temperature monitoring. The process that was locking up the CPU was `sensors`. It would switch between `sensors -A`, `sensors -u -A`, and `sensors -u -c /tmp/sensors.conf`. When I tried turning off CPU temp detection, it would settle down for about 20 seconds before it started back up.

My CPU is an AMD Ryzen 5 5600G, and I'm using the `k10temp` driver. For me, `6.12.4` is fine, but `6.12.6` has this issue. You could try updating to `6.12.4` and see if the issue still occurs, or try disabling CPU temperature sensing (if turned on) before updating.
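One way to check whether the `sensors` calls named in the post above are the slow part is to run each of them under a time limit from a terminal. The following is only a rough sketch: the helper name `probe` and the 10-second limit are illustrative choices, not something from the thread, and it assumes the coreutils `timeout` command is available.

```shell
#!/bin/sh
# probe LIMIT CMD [ARGS...]
# Runs CMD under a time limit and prints one summary line.
# "probe" is a hypothetical helper name used only for this sketch.
probe() {
  limit="$1"
  shift
  start=$(date +%s)
  # timeout (coreutils) exits with status 124 when the limit is reached.
  timeout "$limit" "$@" >/dev/null 2>&1
  status=$?
  end=$(date +%s)
  if [ "$status" -eq 124 ]; then
    echo "TIMEOUT after ${limit}s: $*"
  else
    echo "exit $status in $((end - start))s: $*"
  fi
}
```

For example, `probe 10 sensors -A` and `probe 10 sensors -u -A` would each print either a timeout line or the exit status and elapsed seconds, which makes it easier to tell a slow sensor read from a slow webgui.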
aje14700 Posted January 16

After switching back to 6.12.4, I had 0 issues until about 8 days later. The UI would hang and take forever, and 1 logical core was always pegged at 100%. htop still showed `sensors` sucking up CPU.

I believe I have figured out the issue. At some point, the sensors can no longer read particular attributes from my CPU (`amdgpu-pci-0900`). When running `sensors -A`, this was my output:

    ~# sensors -A
    amdgpu-pci-0900
    vddgfx:           N/A
    vddnb:            N/A
    edge:             N/A
    PPT:              N/A

    k10temp-pci-00c3
    MB Temp:      +33.8°C

    nvme-pci-0800
    Composite:    +34.9°C  (low  = -60.1°C, high = +89.8°C)
                           (crit = +94.8°C)

And it would take forever on the amdgpu chip. Running with JSON output gave some extra clues:

    # sensors -j
    {
       "amdgpu-pci-0900":{
          "Adapter": "PCI adapter",
          "vddgfx":{
    ERROR: Can't get value of subfeature in0_input: Can't read
          },
          "vddnb":{
    ERROR: Can't get value of subfeature in1_input: Can't read
          },
          "edge":{
    ERROR: Can't get value of subfeature temp1_input: Can't read
          },
          "PPT":{
    ERROR: Can't get value of subfeature power1_average: Can't read
          }
       },
       "k10temp-pci-00c3":{
          "Adapter": "PCI adapter",
          "MB Temp":{
             "temp1_input": 33.750
          }
       },
       "nvme-pci-0800":{
          "Adapter": "PCI adapter",
          "Composite":{
             "temp1_input": 32.850,
             "temp1_max": 89.850,
             "temp1_min": -60.150,
             "temp1_crit": 94.850,
             "temp1_alarm": 0.000
          }
       }
    }

I ended up modifying `/boot/config/plugins/dynamix.system.temp/sensors.conf` to include the following:

    chip "amdgpu-pci-0900"
    ignore "in0"
    ignore "in1"
    ignore "temp1"
    ignore "power1"

And then, for it to take effect without a reboot:

    cp /boot/config/plugins/dynamix.system.temp/sensors.conf /etc/sensors.d/sensor.conf

Once the file was copied, the issue immediately went away. So the issue seems to be related to the chip timing out for sensor readings. I'm sure there's a better approach, but hopefully this helps @bwnautilus and anyone else running into this issue.
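For anyone trying to find out which chip to ignore on their own hardware, the `sensors -A` output shown in the post above can be scanned for chips that report `N/A`. This is a minimal sketch, assuming the plain-text layout shown above (a chip name alone on a line, followed by `label: value` lines); the function name `list_unreadable_chips` is made up for illustration.

```shell
#!/bin/sh
# Reads `sensors -A` text on stdin and prints each chip name that has
# at least one reading shown as N/A.
# list_unreadable_chips is a hypothetical helper name for this sketch.
list_unreadable_chips() {
  awk '
    # A chip header is a single word with no colon, e.g. amdgpu-pci-0900.
    NF == 1 && index($0, ":") == 0 { chip = $1; next }
    # A reading line contains a colon; flag it when the value is N/A.
    index($0, ":") > 0 && $NF == "N/A" && chip != "" && !(chip in seen) {
      seen[chip] = 1
      print chip
    }
  '
}
```

It could be used as `sensors -A | list_unreadable_chips`. With the output from the post it would print `amdgpu-pci-0900`, the chip that the `ignore` lines target.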
RoTalk Posted January 16

I am watching this because I might be experiencing the same thing. I kept seeing 1 logical core pegged in the same environment, and noticed that I suddenly can't connect or get into the server; I'd try from another laptop, only for it to freeze. I went as far as the Docker/vlan settings and replaced my thumb drive. I will attempt your fix and report back.
aje14700 Posted January 16

A note on my solution above: if you need to change any settings in System Temp, the UI will take a while to load, as it runs `sensors` without the configuration. Additionally, it will overwrite any changes made to that file on save (since it's not aware of our changes). I set up a user script that appends my ignore lines onto the conf and copies it to the running configuration in /etc/sensors.d, so I can manually trigger it if needed.
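The user script itself was not posted, so the following is only a guess at what such a script might look like. The paths and ignore lines are copied from the earlier post; the function name `apply_sensor_ignores`, the marker comment, and the run-once check are assumptions added for this sketch.

```shell
#!/bin/sh
# Paths as given in the earlier post; pass other paths as arguments to test safely.
SRC_DEFAULT=/boot/config/plugins/dynamix.system.temp/sensors.conf
DEST_DEFAULT=/etc/sensors.d/sensor.conf
# Marker comment (an assumption of this sketch) used to detect an earlier run.
MARKER='# user-script: ignore unreadable amdgpu readings'

# apply_sensor_ignores [SRC] [DEST]  (hypothetical helper name)
apply_sensor_ignores() {
  src="${1:-$SRC_DEFAULT}"
  dest="${2:-$DEST_DEFAULT}"
  # Append the ignore stanza only if the marker line is not already present,
  # so triggering the script repeatedly does not duplicate it.
  if ! grep -qxF "$MARKER" "$src" 2>/dev/null; then
    cat >> "$src" <<EOF

$MARKER
chip "amdgpu-pci-0900"
ignore "in0"
ignore "in1"
ignore "temp1"
ignore "power1"
EOF
  fi
  # Copy to the running configuration so it takes effect without a reboot.
  cp "$src" "$dest"
}
```

Adding a final line that calls `apply_sensor_ignores` would make the script act on the default paths. Because the System Temp settings page rewrites the file on save, the marker disappears along with the ignore lines, and the next run appends them again.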
bwnautilus Posted January 24 (Author)

On 1/16/2024 at 11:35 AM, aje14700 said:
"And once the file copied, the issue immediately went away. So the issue seems to be related to the chip timing out for sensor readings. I'm sure there's a better approach, but hopefully this helps @bwnautilus and anyone else running into this issue."

Thanks for looking into this. My Unraid system that's experiencing this problem is Xeon-based and I do not see any CPU spikes when running 'sensors -A'. But as I mentioned previously, I'm back on 6.11.5 - don't want to do the upgrade/downgrade thing again. Glad the solution worked for you.