jbartlett

May 21, 2019

I'm guessing bad device. I rebooted to free it up and now the NVMe drive doesn't register.

May 21, 2019

Looks like the cause of the IOWAIT is an fstrim being executed on a WD Black 256GB NVMe drive by the Trim plugin. The process gets stuck in an uninterruptible sleep.
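For anyone wanting to check for the same symptom, here is a minimal sketch that lists processes in uninterruptible sleep (state `D`) by reading `/proc/<pid>/stat`. The function names are made up for illustration, and it assumes a Linux `/proc` layout; it is not what the Trim plugin or the Dashboard does.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the entry was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling `uninterruptible_processes()` on the server while the CPU bar is pegged should show whether something like fstrim is sitting in state `D`.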

Not a bug in the Dashboard.

May 14, 2019

I tried to identify what might be causing that - stopped my VMs and the array, but the currently pegged CPU stayed pegged. Found that a WD NVMe drive hadn't unmounted and wouldn't unmount from "Unassigned Devices", nor could I pull up an ls of the share; it just hung. It didn't look like I was actually using it, so I rebooted and unmounted it. If the pegging is not related to that, it'll show up again in a few days.

May 10, 2019

Here ya go. CPU 26 is pegging.

Debug1.txt Debug2.txt

April 17, 2019

53 minutes ago, IamSpartacus said:

I didn't do it manually. By isolating the CPUs in the settings it appended the syslinux file automatically.

Oh neat, I never noticed the CPU Isolation part on the CPU Pinning page (never scrolled down far enough)

April 17, 2019

Other than I didn't know about being able to hyphenate the range, no.

April 16, 2019

I had that same thought. I just built a 2950 system but it hasn't been on long enough at once to see if it shows up. It has not shown up on my Intel build.

April 16, 2019

On your HTOP, the matching CPU from the Dashboard graph is also maxed which isn't the scenario I reported here.

I'd recommend excluding the VM CPU's in the sys config to keep the OS away from them.

IE: append isolcpus=12,13,14,15,28,29,30,31 initrd=/bzroot
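Since the CPU list can also be written with hyphenated ranges (as mentioned elsewhere in these comments), here is a small sketch that collapses a list of CPU numbers into that form. The helper name `compress_cpu_list` is hypothetical; only the `append isolcpus=... initrd=/bzroot` line itself comes from the post above.

```python
def compress_cpu_list(cpus):
    """Collapse CPU numbers into comma-separated, hyphenated ranges."""
    ordered = sorted(set(cpus))
    if not ordered:
        return ""
    parts = []
    start = prev = ordered[0]
    for cpu in ordered[1:]:
        if cpu == prev + 1:
            prev = cpu  # still inside the current consecutive run
            continue
        # The run ended: emit it as "N" or "N-M", then start a new run.
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = cpu
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


isolated = compress_cpu_list([12, 13, 14, 15, 28, 29, 30, 31])
print(f"append isolcpus={isolated} initrd=/bzroot")
```

For the CPUs in the example above this prints `append isolcpus=12-15,28-31 initrd=/bzroot`.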

April 11, 2019

Happening in RC7 after about 5 days uptime.

April 9, 2019

Using htop will give you an apples to apples comparison.

April 6, 2019

Addendum: It seems the parity job duration displayed only includes the time it took since it was last resumed from a paused state. My array takes roughly 12 hours to complete start to finish but it's saying it took only 5 hours.
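To illustrate the difference, here is a rough sketch of summing every run segment of a paused and resumed check instead of reporting only the last one. The timestamps are made up to match the roughly 12 hour total and 5 hour report described above, and the function name is hypothetical; this is not Unraid's actual code.

```python
from datetime import datetime, timedelta


def total_run_time(segments):
    """Sum the running time across every (start, stop) segment of one check."""
    return sum((stop - start for start, stop in segments), timedelta())


# Hypothetical example: the check runs 7 hours, is paused, then runs 5 more.
segments = [
    (datetime(2019, 4, 5, 0, 0), datetime(2019, 4, 5, 7, 0)),
    (datetime(2019, 4, 5, 12, 0), datetime(2019, 4, 5, 17, 0)),
]

last_start, last_stop = segments[-1]
print("last segment only:", last_stop - last_start)
print("all segments:     ", total_run_time(segments))
```

The first figure corresponds to what is being displayed now; the second is the start-to-finish duration I would expect to see.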

April 5, 2019

When pausing a Parity Check, the Dashboard reports "Error code: aborted"

April 5, 2019

My Intel box updated from RC6 to RC7 with no issues. My Threadripper box did not upgrade cleanly from RC5 to RC7.

After it took 3x as long as normal to reboot, the web admin was still offline. I pinged, got a response, telnetted in and pulled up a tail of the syslog. It looked odd, so I copied the syslog to the flash drive, looked at it, and realized the system had not rebooted yet, as it still had events from days ago. It looked like it stalled while checking PIDs left on /dev/md* when the only ones left were for the array drives. I executed the "reboot" command and five minutes later I still had no ping response.

After connecting a monitor and forcing a reboot, the system locked up at the UEFI boot menu with only CTRL-ALT-DEL working. I plugged the flash drive into my Win10 machine; it detected errors and prompted me to scan it, which I did. No errors found. Plugged the USB drive back into the NAS and it rebooted without issue.

Attached are the diagnostics from after the reboot. Also attached are the syslog entries from when I clicked the "Check for Update" button through to the last entry. I could not find any new diagnostic file referenced in the 2nd to last line on the root directory of the flash drive.

nas-diagnostics-20190405-1659.zip

partial_syslog.txt

April 4, 2019

I don't think it has anything to do with actual CPU usage. My guess is that there's a bug in the data gathering/display formatting process.
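One way to cross-check the Dashboard independently is to compute per-core usage directly from two samples of `/proc/stat`. This is only a minimal sketch: the function names and the `iowait_is_busy` switch are my own, and whether the Dashboard or htop treats iowait as busy time is an assumption, not something confirmed here. It also assumes each `cpuN` line carries at least the user, nice, system, idle and iowait fields.

```python
def parse_cpu_lines(stat_text, iowait_is_busy=False):
    """Map 'cpuN' -> (busy_jiffies, total_jiffies) from /proc/stat text."""
    samples = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep per-core lines ("cpu0", "cpu1", ...) and skip the aggregate "cpu".
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:9]]  # user .. steal
        idle = values[3] if iowait_is_busy else values[3] + values[4]
        total = sum(values)
        samples[fields[0]] = (total - idle, total)
    return samples


def usage_percent(before, after):
    """Per-core busy percentage between two parse_cpu_lines() samples."""
    result = {}
    for name, (busy1, total1) in after.items():
        busy0, total0 = before.get(name, (0, 0))
        delta_total = total1 - total0
        result[name] = 100.0 * (busy1 - busy0) / delta_total if delta_total > 0 else 0.0
    return result
```

Reading `/proc/stat` twice a second apart and passing both samples through these functions gives a number to hold against the Dashboard bar. A core that spends the whole interval in iowait comes out at 0% with the default and 100% with `iowait_is_busy=True`, which is the kind of discrepancy to look for.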

March 29, 2019

Try "Inspect Element" on the CPU bar that's pegging 100% to see if it's looping over a fractional number between 99 and 100. If it is, it's related to my issue linked in the first comment.

Animation of the change I see: https://gyazo.com/88ebee4954d9d3b7e9655c6f7e9f2a80

March 10, 2019

Now it's CPU 27 & 28 doing it. New diagnostic file attached. You can ignore the USB errors at the end of the syslog, was testing a failing USB security dongle.

nas-diagnostics-20190310-0902.zip

March 9, 2019

Rebooting makes the "stuck bar" go away. I've been really busy over the past few days and haven't been monitoring it closely but now CPU 27 is pegging in the Dashboard which is not represented in htop. Only CPU's 2-15 are in use, the rest are available for unraid's use. The average load includes the pegged VM. This issue has not appeared on my Intel backup system with a hex core CPU.

Side by side of the Dashboard & htop video

https://gyazo.com/ff4b45e4173b48ed88667d637be33ad4

March 2, 2019

I did an "Inspect Element" on the bar and it's doing something odd. The other CPU graphs are updating every second and only integers but this one is going from 99% to 100% in about half of that time, going through several decimal stops along the way.

Animation of the change: https://gyazo.com/88ebee4954d9d3b7e9655c6f7e9f2a80

November 16, 2018

Just for my understanding, so correct me if I'm wrong: if the first line doesn't report that the microcode was updated, then unraid left things as they were and the following lines in the syslog are just reported for informational purposes.

Nov 10 22:04:44 NAS kernel: microcode: CPU0: patch_level=0x08001137 (repeated for each hyperthreaded core)

Nov 10 22:04:44 NAS kernel: microcode: Microcode Update Driver: v2.2.

November 16, 2018

I wasn't able to duplicate it with my VMs, nor with a "Windows 10 New" VM.

November 15, 2018

One thing to note about Windows 10 VM's, you can install an instance without a Windows Key - you can bypass entering it during the installation. Create a local User ID if you do this vs using a Microsoft account. You can then simply copy the hard drive of this instance to create additional VM's. This is a typical practice for testing quick installations but if you want to persist them, then you can enter a key for each instance to make it all good.

November 15, 2018

I have multiple Windows VMs running on one of my unraid servers with one conveniently created & named "Windows 10". I'll back up the VM's files tonight and will try to delete "Windows 10 Handbrake 1".

Any other test condition to make sure of prior?

November 15, 2018

I've had this happen a couple of times; the cursor on the console will even stop blinking. It seemed to happen whenever I made changes to the hardware, such as adding or removing a PCI-E card to test hardware passthrough to a VM. It required a hard reset but never happened again afterwards in my case (until the next time I mucked about with the PCI-E cards).

Ryzen Threadripper, current BIOS.

This is not going to be an easy thing to track down with no means to capture logs or to have the syslog written to the flash drive instead of RAM.

September 28, 2018

I've got two unraid systems, both with two M.2 NVMe drives as well, with VMs running off them and no issues so far.

It wasn't quite clear, but are the two M.2 drives the same model? If so, check out my DiskSpeed docker app and run a benchmark on both systems (assuming you can access both currently) to see if they have similar speed graphs. Just because no flags are being raised doesn't mean there aren't any problems.

September 27, 2018

Do you think that the m.2 drive failed and it's killing your system? A more descriptive title would be helpful.


Report Comments posted by jbartlett

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc7] extreme high cpu usage in dashboard but not top/htop

[6.7.0 RC7] When pausing a parity check, the Dashboard reports "Error code: aborted"

Unraid OS version 6.7.0-rc7 available

Unraid OS version 6.7.0-rc7 available

[6.7.0-rc7] extreme high cpu usage in dashboard but not top/htop

[6.7.0-rc7] extreme high cpu usage in dashboard but not top/htop

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

[6.7.0-rc5] Dashboard CPU erroneously stuck at 100%

unRAID 6.6.5 Total system lockup unresponsive from console, ssh, network, parity check stops, no disk activity at all, no VM working, nothing functioning what so ever.

Removing Similar Named VM Removed Both

Removing Similar Named VM Removed Both

Removing Similar Named VM Removed Both

unRAID 6.6.5 Total system lockup unresponsive from console, ssh, network, parity check stops, no disk activity at all, no VM working, nothing functioning what so ever.

6.6.0 Bug M.2 NVME Issues

6.6.0 Bug M.2 NVME Issues