[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

jbartlett · March 2, 2019

I did an "Inspect Element" on the bar and it's doing something odd. The other CPU graphs update once a second and show only integers, but this one goes from 99% to 100% in about half that time, passing through several decimal values along the way.

Animation of the change: https://gyazo.com/88ebee4954d9d3b7e9655c6f7e9f2a80

Edited March 2, 2019 by jbartlett

jbartlett · March 9, 2019

Rebooting makes the "stuck bar" go away. I've been really busy over the past few days and haven't been monitoring it closely, but now CPU 27 is pegged in the Dashboard, which is not reflected in htop. Only CPUs 2-15 are in use; the rest are available for Unraid's use. The average load includes the pegged VM. This issue has not appeared on my Intel backup system with a hex core CPU.

Side by side of the Dashboard & htop video

https://gyazo.com/ff4b45e4173b48ed88667d637be33ad4

Edited March 9, 2019 by jbartlett

jbartlett · March 10, 2019

Now it's CPU 27 & 28 doing it. New diagnostic file attached. You can ignore the USB errors at the end of the syslog, was testing a failing USB security dongle.

nas-diagnostics-20190310-0902.zip

phbigred · March 28, 2019

Experienced the same issue, phantom CPU in dashboard. Running rc-6

jbartlett · April 11, 2019

Happening in RC7 after about 5 days uptime.

IamSpartacus · April 15, 2019

This happened to me the other day.

I just noticed this behavior as well. Running RC7. To make things even weirder you can see the one CPU thread that is showing as pinned at 100% is isolated to a VM that is powered off. I also noticed something else which could be completely unrelated. But while this is happening if I switch tabs in Chrome so that Unraid is in the background, when I go back to that tab I'm at a blank (black) web page and I have to reload the page to access the WebGUI again.

spe-unraid01-diagnostics-20190413-0224.zip

jbartlett · April 16, 2019

On your HTOP, the matching CPU from the Dashboard graph is also maxed which isn't the scenario I reported here.

I'd recommend excluding the VM CPUs in the syslinux config to keep the OS away from them.

IE: append isolcpus=12,13,14,15,28,29,30,31 initrd=/bzroot

phbigred · April 16, 2019

Are we just seeing a theme with Ryzen? Might be an unidentified bug.

Edited April 16, 2019 by phbigred

jbartlett · April 16, 2019

I had that same thought. I just built a 2950 system but it hasn't been on long enough at once to see if it shows up. It has not shown up on my Intel build.

IamSpartacus · April 17, 2019

19 hours ago, jbartlett said:

On your HTOP, the matching CPU from the Dashboard graph is also maxed which isn't the scenario I reported here.

I'd recommend excluding the VM CPUs in the syslinux config to keep the OS away from them.

IE: append isolcpus=12,13,14,15,28,29,30,31 initrd=/bzroot

I already have the CPUs isolated, as you can see from my screenshot, but my syslinux file shows isolcpus=12-15,28-31 instead of isolcpus=12,13,14,15,28,29,30,31 like you described. Is there a difference there?

jbartlett · April 17, 2019

Other than I didn't know about being able to hyphenate the range, no.
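For anyone who wants to check that the two spellings select the same CPUs, here is a minimal Python sketch. The helper name `expand_cpu_list` is made up for illustration, and it only handles plain numbers and simple `N-M` ranges, not every form the kernel's CPU list syntax accepts.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as "12-15,28-31" into a sorted list of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            # "12-15" means every CPU from 12 through 15 inclusive
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


hyphenated = expand_cpu_list("12-15,28-31")
spelled_out = expand_cpu_list("12,13,14,15,28,29,30,31")
print(hyphenated)
print(hyphenated == spelled_out)
```

Both strings expand to the same eight CPUs, so the comparison prints True.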

IamSpartacus · April 17, 2019

5 minutes ago, jbartlett said:

Other than I didn't know about being able to hyphenate the range, no.

I didn't do it manually. By isolating the CPUs in the settings it appended the syslinux file automatically.

jbartlett · April 17, 2019

53 minutes ago, IamSpartacus said:

I didn't do it manually. By isolating the CPUs in the settings it appended the syslinux file automatically.

Oh neat, I never noticed the CPU Isolation part on the CPU Pinning page (never scrolled down far enough)

tjb_altf4 · April 18, 2019

On 4/17/2019 at 1:49 AM, phbigred said:

Are we just seeing a theme with Ryzen? Might be an unidentified bug.

If it is, it's probably tied to NUMA mode as I don't see these problems on my 1950x in UMA mode.

Given the difference between htop and the dash was the inclusion of iowait time, I wonder if processes are idling waiting for resources they can't physically access due to enforced cpu node separation and isolation.

Just a theory anyway.

phbigred · May 3, 2019

Anyone in this thread having issues with RC8? My issue still hasn't recurred; no phantom CPU spikes in the GUI so far.

presence06 · May 3, 2019

If you're using Windows 10 and it's on the 1903 version/Insider Edition, and you are remoting into it, once you disconnect it will peg 1 or 2 threads/CPUs.... I found this out and had to remove the Insider Edition from my VM and went back to the previous version of Win10... As of today it's still doing it even after the latest Insider update.

If you remote in, DC and then remote back in, it will settle..but once you DC again it'll spike... something to do with the new background blur maybe in Win 10? not sure.

Edited May 3, 2019 by presence06

phbigred · May 3, 2019

44 minutes ago, presence06 said:

If you're using Windows 10 and it's on the 1903 version/Insider Edition, and you are remoting into it, once you disconnect it will peg 1 or 2 threads/CPUs.... I found this out and had to remove the Insider Edition from my VM and went back to the previous version of Win10... As of today it's still doing it even after the latest Insider update.

If you remote in, DC and then remote back in, it will settle..but once you DC again it'll spike... something to do with the new background blur maybe in Win 10? not sure.

This isn't VM related; this is a general UI bug. I pin my CPUs and it happens to ones assigned to Unraid, outside of VM isolation.

limetech · May 3, 2019

Quote

I noticed this back on RC3 but since RC5 came out, I updated and waited to see if it happened again.

Next time you see this happen, please type this command:

date +%s ; cat /proc/stat

Save that output. Then type the command again, and save that output. Finally post both sets of output.

The code that generates the graphs is based on a daemon that polls /proc/stat every second to monitor CPU load.
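A minimal Python sketch of how a per-CPU load figure can be derived from two such samples. This is not the actual Unraid daemon code: the function names and the sample numbers are made up for illustration, and counting iowait as busy time is an assumption based on the observation earlier in the thread that the Dashboard includes iowait while htop does not.

```python
def parse_proc_stat(text):
    """Return {cpu_name: [counters...]} for the per-CPU lines of /proc/stat."""
    cpus = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu26 <user> <nice> <system> <idle> <iowait> ..."
        # The bare "cpu" line is the aggregate over all CPUs and is skipped here.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            cpus[fields[0]] = [int(value) for value in fields[1:]]
    return cpus


def load_percent(before, after, iowait_is_busy=True):
    """Percent of elapsed ticks that were not idle between two samples of one CPU."""
    delta = [new - old for old, new in zip(before, after)]
    total = sum(delta[:8])  # user, nice, system, idle, iowait, irq, softirq, steal
    if total <= 0:
        return 0.0
    idle = delta[3]
    if not iowait_is_busy:
        idle += delta[4]  # treat time spent waiting on I/O as idle instead
    return 100.0 * (total - idle) / total


# Hypothetical samples taken one second apart (not real output from this thread).
SAMPLE_1 = "cpu 200 0 100 850 850 0 0 0 0 0\ncpu0 100 0 50 800 50 0 0 0 0 0\ncpu1 100 0 50 50 800 0 0 0 0 0\n"
SAMPLE_2 = "cpu 210 0 105 935 950 0 0 0 0 0\ncpu0 110 0 55 885 50 0 0 0 0 0\ncpu1 100 0 50 50 900 0 0 0 0 0\n"

first, second = parse_proc_stat(SAMPLE_1), parse_proc_stat(SAMPLE_2)
for name in first:
    with_iowait = load_percent(first[name], second[name], iowait_is_busy=True)
    without_iowait = load_percent(first[name], second[name], iowait_is_busy=False)
    print(name, with_iowait, without_iowait)
```

In the made-up data, cpu1 spends the whole interval in iowait, so it reads 100% when iowait counts as busy and 0% when it counts as idle, which is the kind of gap between a dashboard and htop described in this thread.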

jbartlett · May 10, 2019

Here ya go. CPU 26 is pegging.

Debug1.txt Debug2.txt

Edited May 10, 2019 by jbartlett

bonienl · May 10, 2019

The calculation for CPU26 based on these two samples is correct and 100%.

One big difference between CPU26 and all the other CPUs is the IOwait time

      user    nice  system  idle      iowait    irq softirq steal guest guest_nice
cpu25 784637  25697 1747440 295493912 189817    0   9588    0     0     0
cpu26 1567648 96275 4562200 21299937  270014868 0   38855   0     0     0
cpu27 782504  26063 1728628 295529225 183739    0   9300    0     0     0

This implies CPU26 is waiting most of the time on disk I/O activity to finish
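The imbalance can be put in numbers using the three rows above. This is a small Python sketch with a hypothetical helper name; because these counters are cumulative since boot, it shows the lifetime share of iowait per CPU, not the one-second delta the Dashboard plots.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")

# Counters copied from the sample above (the trailing zero columns are omitted).
ROWS = {
    "cpu25": (784637, 25697, 1747440, 295493912, 189817, 0, 9588),
    "cpu26": (1567648, 96275, 4562200, 21299937, 270014868, 0, 38855),
    "cpu27": (782504, 26063, 1728628, 295529225, 183739, 0, 9300),
}


def iowait_share(counters):
    """Percent of all recorded ticks that this CPU spent in iowait."""
    values = dict(zip(FIELDS, counters))
    return 100.0 * values["iowait"] / sum(counters)


for name, counters in ROWS.items():
    print(f"{name}: {iowait_share(counters):.2f}% of ticks in iowait")
```

cpu26 comes out at roughly 90% iowait, while cpu25 and cpu27 are both well under 1%.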

Edited May 10, 2019 by bonienl

jbartlett · May 14, 2019

I tried to identify what might be causing that. I stopped my VMs and the array, but the currently pegged CPU stayed pegged. I found that a WD NVMe drive hadn't unmounted and wouldn't unmount from "Unassigned Devices", nor could I pull up an ls of the share; it just hung. It didn't look like I was actually using it, so I rebooted and unmounted it. If the pegging is not related to that, it'll show up again in a few days.

jbartlett · May 21, 2019

Looks like the cause for the IOWAIT is a fstrim being executed on a WD Black 256GB nvme drive, executed by the Trim plugin. The process gets stuck in an uninterruptible sleep.

Not a bug in the Dashboard.
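For anyone chasing a similar symptom, processes in uninterruptible sleep show state `D` in `/proc/<pid>/stat`. Below is a minimal Python sketch that lists them; the function names are made up for illustration, the sample line is invented, and it assumes a Linux `/proc` filesystem.

```python
import os


def parse_stat_line(line):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    command = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, command, state


def uninterruptible_processes(proc_root="/proc"):
    """Return [(pid, command), ...] for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, command))
    return found


# Invented example line, shortened to the first few fields.
print(parse_stat_line("4242 (fstrim) D 1 4242 4242 0 -1"))
```

On a live system, calling `uninterruptible_processes()` repeatedly and seeing the same PID every time suggests that process is stuck, not just briefly waiting on I/O.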

limetech · May 21, 2019

8 hours ago, jbartlett said:

Looks like the cause for the IOWAIT is a fstrim being executed on a WD Black 256GB nvme drive, executed by the Trim plugin. The process gets stuck in an uninterruptible sleep.

Not a bug in the Dashboard.

Thank you for the update, guess now there's an issue with fstrim?

limetech · May 21, 2019

Changed Status to Closed

jbartlett · May 21, 2019

I'm guessing bad device. I rebooted to free it up and now the NVMe drive doesn't register.
