(SOLVED) 100% CPU - Read/Write between 0 - 3 MB

Hello,

 

Recently I have had the problem that any sort of transfer immediately sets my Unraid to 100% CPU load.

 

I run my Unraid on top of an ESXi server, which has worked fine for many months. Suddenly, though, as soon as any sort of transfer is started, Unraid signals 100% CPU and becomes slow to respond. ESXi itself shows CPU usage of up to 30%, but sometimes just 8%, in such a situation while the Unraid system itself shows 100%. I do not get any read or write errors in the array overview.

 

I have tried many things but I can't seem to get it to work like it used to. I have also tested other machines on my ESXi host and they can use their CPU to its full potential without problems. I have reset the config for "Tips and Tweaks" back to default to make sure nothing there is skewing my results.

 

I can also provide diagnostics, but wanted to know what state you want the logs to be in (system running for a while, fresh boot, fresh boot with one file transfer attempt, etc.).

 

Some Screenshots to showcase:

 

1.png

2.png

 

3.png

 

After starting to read a 9 GB file, where read speeds dropped below 1 MB/s:

4.png

5.png 
6.png

The CPU load of the host as shown in ESXi. As you can see, from ESXi's point of view it does not really use more than 30% of its CPU:
7.png

Some more Info on how the Unraid VM was configured:
8.png

 

Any help is highly appreciated, as I seem to have run out of ideas about what it could be. Next I would try a fresh install of Unraid, but I really want to avoid that if at all possible.

 

Many thanks in advance

 

Anon
 

 

Edited by Anon
added extra Information

10 minutes ago, Anon said:

I run my Unraid on top of an ESXi server

Moved to the appropriate subforum.


I don't know if that's the case here, but Unraid's CPU usage is shown with IO wait included. So if the CPU is just sitting and waiting for IO, it still shows as busy.
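
To check whether a "100%" reading is mostly IO wait, the aggregate `cpu` line of `/proc/stat` can be sampled twice and the difference split into busy, iowait and idle shares. A minimal Python sketch, assuming the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for this example and the two sample lines are invented numbers, not readings from this server:

```python
# Field order of the aggregate "cpu" line in /proc/stat (Linux proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Pad with zeros in case the kernel reports fewer columns.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Return the share of busy, iowait and idle time between two samples."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    busy = total - delta["idle"] - delta["iowait"]
    return {
        "busy_pct": 100.0 * busy / total,
        "iowait_pct": 100.0 * delta["iowait"] / total,
        "idle_pct": 100.0 * delta["idle"] / total,
    }


def read_cpu_sample(path="/proc/stat"):
    """Read one live sample (Linux only); the first line is the aggregate."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


# Invented sample lines for illustration: almost all new ticks are iowait.
SAMPLE_BEFORE = "cpu 1000 0 500 8000 500 0 0 0 0 0"
SAMPLE_AFTER = "cpu 1050 0 550 8000 1400 0 0 0 0 0"

if __name__ == "__main__":
    print(cpu_breakdown(parse_cpu_line(SAMPLE_BEFORE), parse_cpu_line(SAMPLE_AFTER)))
```

On a live system, calling `read_cpu_sample()` twice a few seconds apart during a transfer and passing both results to `cpu_breakdown` gives the same split. A high `iowait_pct` with a low `busy_pct` would point at storage rather than at actual computation.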

25 minutes ago, uldise said:

I don't know if that's the case here, but Unraid's CPU usage is shown with IO wait included. So if the CPU is just sitting and waiting for IO, it still shows as busy.

Thanks a lot for that info. I will focus on finding out whether IO problems are causing this. I have all my disks connected via a PCIe SATA Hub, so if something went wrong there, it could explain all these problems.

7 minutes ago, Anon said:

PCIe SATA Hub

what particular device are you using?

2 hours ago, uldise said:

what particular device are you using?

Sata Card: https://www.amazon.de/gp/product/B07QKJDY98/ref=ppx_yo_dt_b_asin_title_o08_s00?ie=UTF8&psc=1

I restarted my server and re-seated the card and all the connections, but the speed is still the same (maybe slightly faster, sometimes 2-3 MB/s). Not all my disks are connected there, though; only 2 currently. The other drives are attached to USB 3.0 ports directly on the motherboard.

 

I then also removed the card itself and plugged the disks directly into the mainboard. It is still the same situation, though: I get fairly fast speeds (50 MB/s) for the first few seconds, up to around 1-2 GB, and then it drops back down to 1-3 MB/s. (This is just reading a file from the array and writing it to a local SSD.)
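
One way to make the "fast at first, then 1-3 MB/s" pattern visible is to read a file sequentially and record the throughput per fixed-size window. A rough Python sketch, assuming Python 3 is available on the machine doing the read; the function name, window size and chunk size are arbitrary choices for this example:

```python
import time


def windowed_read_speeds(path, window_bytes=256 * 1024 * 1024, chunk_bytes=1024 * 1024):
    """Read `path` sequentially and return (bytes_read_so_far, MB/s) per window."""
    results = []
    total = 0
    in_window = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
            in_window += len(chunk)
            if in_window >= window_bytes:
                elapsed = time.perf_counter() - start
                # Guard against a zero interval on very fast (cached) reads.
                results.append((total, in_window / 1e6 / max(elapsed, 1e-9)))
                in_window = 0
                start = time.perf_counter()
    if in_window:
        # Report the final, partially filled window as well.
        elapsed = time.perf_counter() - start
        results.append((total, in_window / 1e6 / max(elapsed, 1e-9)))
    return results
```

Calling it on a large file from the array and printing the returned pairs shows at which offset the speed collapses. Files that were read recently may be served from the page cache, so a file that has not been touched since boot gives more honest numbers.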

 

Below is a diagnostics file from the freshly booted server after one test transfer, with all disks connected either via USB 3.0 or directly via SATA to the mainboard.

 

 

tower-diagnostics-20201118-2115.zip


Some more testing:

 

I ran hdparm 12 times for each of my disks. Almost all of them show abysmal speeds (2-15 MB/s with occasional higher spikes). Disk 2 had one test run with a consistent 50 MB/s for all 12 readings, but when I tried again later it was down to almost nothing (0.2-3 MB/s).

I am really confused as to what could have suddenly caused these disks to become so slow to access.
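
For comparing repeated runs like these, the result lines of `hdparm -t` can be collected per disk and reduced to min/median/max. A small Python sketch, assuming the usual `Timing buffered disk reads: ... = NN.NN MB/sec` result line; the kB/sec handling (divide by 1024) is an assumption, and the sample text is invented for illustration, not output from this server:

```python
import re
import statistics

DEVICE_RE = re.compile(r"^(/dev/\S+):$")
SPEED_RE = re.compile(r"Timing buffered disk reads:.*=\s*([\d.]+)\s*(MB|kB)/sec")


def parse_hdparm_runs(text):
    """Collect MB/s readings per device from concatenated `hdparm -t` output."""
    readings = {}
    device = None
    for line in text.splitlines():
        dev_match = DEVICE_RE.match(line.strip())
        if dev_match:
            device = dev_match.group(1)
            continue
        speed_match = SPEED_RE.search(line)
        if speed_match and device is not None:
            value = float(speed_match.group(1))
            if speed_match.group(2) == "kB":
                value /= 1024.0  # assumption: kB/sec lines use 1024 kB per MB
            readings.setdefault(device, []).append(value)
    return readings


def summarize(readings):
    """Reduce each device's readings to run count, min, median and max."""
    return {
        dev: {
            "runs": len(vals),
            "min": min(vals),
            "median": statistics.median(vals),
            "max": max(vals),
        }
        for dev, vals in readings.items()
    }


# Invented sample output for illustration only.
SAMPLE_OUTPUT = """
/dev/sdb:
 Timing buffered disk reads: 150 MB in  3.02 seconds =  49.67 MB/sec

/dev/sdb:
 Timing buffered disk reads:   6 MB in  3.40 seconds =   1.76 MB/sec

/dev/sdc:
 Timing buffered disk reads:  30 MB in  3.10 seconds =   9.68 MB/sec
"""

if __name__ == "__main__":
    for dev, stats in summarize(parse_hdparm_runs(SAMPLE_OUTPUT)).items():
        print(dev, stats)
```

A large gap between min and max on the same disk, as in the invented `/dev/sdb` sample, would suggest that the disk itself can still deliver and that something in between is throttling it.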

 

Edit: Interestingly, after a boot the first 2 GB seem to go fast (70 MB/s) and then the speed drops. But if the disks were so slow, why would those first 2 GB still go so fast? (This only happens when Unraid is shut down and then booted again. A reboot of Unraid causes the slowdown right away; in that case ESXi also does not start the machine at 0 MB RAM as usual but continues at the level it had when the reboot was initiated.)

Edited by Anon


I have now created a new "hull" in ESXi and just connected all the already existing drives to it. I then booted Unraid from that machine and all speeds are back to normal. ESXi is now actually using all of the CPU. I have no idea what specifically caused this, but I have now switched to the new hull and speeds are back to what I would expect and have experienced in the past (100 MB/s read, 30 MB/s write).

On 11/18/2020 at 10:21 PM, Anon said:

Usually these cards have two ports and then use port multipliers, which is not recommended. Maybe @JorgeB has a comment about this card?

 

23 hours ago, Anon said:

I have now created a new "hull" in ESXi

What do you mean by "hull"?

Just now, uldise said:

Usually these cards have two ports and then use port multipliers, which is not recommended.

 

Thanks for the feedback. I had read up on those recommendations before and seen the same thing in other forum posts. Due to the low access rates on my server I accepted the performance downside and "risked" it. So far it has worked fine, but I have only ever connected one drive per multiplier (i.e. only used every second port). I had initially bought it because I had 5 other old drives lying around and they would not all have fit on a smaller card. My plan was to pass the whole card through to my VM. That sadly didn't work, as those 5 other drives had too many errors to use in Unraid, so they went to cold backup instead. Passing the card through to the VM caused me some problems, so now I just use this method:

 

Quote

What do you mean by "hull"?

I meant creating a new VM in ESXi. At work we refer to this as the hull of a machine. I then just attached all the disks that I had already created to that new "hull". So like this:

Capture.jpg

 

After doing that and booting it (at first with only one core and 2 GB RAM) it worked great, and ESXi actually showed it using its resources fully when needed. Then I raised it to 4 cores and added a bit more RAM, and it's been running fine since. Before, ESXi would show at most 25% CPU usage while Unraid itself showed 100% (or 400% in htop). So somehow the communication there did not work: ESXi said Unraid was barely doing anything while Unraid thought it was at capacity CPU-wise.

 

I still have no clue why that suddenly happened without me changing anything in either ESXi or Unraid; they were just running normally. The only change I had made was adding a Plex Docker container a week or so before the problems started. I then removed the Plex container again, but it didn't change anything. All in all I am just happy that it's working smoothly again.

4 minutes ago, Anon said:

My Plan was to pass the whole card through to my VM

This is a good plan, I use my add-on cards that way. What motherboard/CPU do you use? And is VT-d enabled in the BIOS?

1 hour ago, uldise said:

Maybe @JorgeB has a comment about this card?

According to one photo it's 4 x 2-port ASMedia controllers; if that's the case it should be fine. But the description is all over the place: one place mentions a 6-port controller and x2, while the picture is clearly of an x4 controller.

3 hours ago, uldise said:

This is a good plan, I use my add-on cards that way. What motherboard/CPU do you use? And is VT-d enabled in the BIOS?

My whole server is just made from parts that I still had. The mainboard is an "M5A97_EVO_R20". I actually do not know if it has that enabled in its BIOS settings. It's been a while since I have been in there, mainly because I do not even have a GPU attached to the server, so I would have to put one in to make any changes in the BIOS.

 

2 hours ago, JorgeB said:

According to one photo it's 4 x 2-port ASMedia controllers; if that's the case it should be fine. But the description is all over the place: one place mentions a 6-port controller and x2, while the picture is clearly of an x4 controller.

Thanks for the research. Much appreciated. I mainly chose it as it was a very cheap option with lots of ports. I did also buy a different card that was recommended in some forum post, but I didn't use it in the end, so I gifted it to a coworker as the other card seemed to run just fine.

10 minutes ago, Anon said:

My whole server is just made from parts that I still had. The mainboard is an "M5A97_EVO_R20". I actually do not know if it has that enabled in its BIOS settings. It's been a while since I have been in there, mainly because I do not even have a GPU attached to the server, so I would have to put one in to make any changes in the BIOS.

You are on an AMD system. Sorry, I have no experience with AMD at all, so I can't help you more with card passthrough.

6 minutes ago, uldise said:

You are on an AMD system. Sorry, I have no experience with AMD at all, so I can't help you more with card passthrough.

Oh, that's totally fine. Sorry for not making it clear that I was not looking for help on how to pass the card through. With the solution I have found I still get pretty speedy results without any problems (if you don't count the problem I had, which does seem to be CPU-bound rather than disk-bound, though).

So I am happy enough leaving things as they are. But thanks anyway.

 

Just in case you wanted to know what my problem back then with passthrough was:

 

- It seemed like the card didn't show up as its own SATA controller and I was only able to pass through the main SATA controller.

- Naive me thought: let's try that, not much can go wrong.

- I applied that setting. ESXi saved it into its configuration on its boot disk. I rebooted. ESXi loaded up and passed all drives to the VM. Now ESXi no longer had access to its own boot disk, which meant I could not change the setting back, as it could not save any changes to the disk. (I ended up reinstalling ESXi.)

4 minutes ago, Anon said:

- I applied that setting. ESXi saved it into its configuration on its boot disk. I rebooted. ESXi loaded up and passed all drives to the VM. Now ESXi no longer had access to its own boot disk, which meant I could not change the setting back, as it could not save any changes to the disk. (I ended up reinstalling ESXi.)

Yes, passing the motherboard's main SATA controller to a VM is a very common mistake, and then the solution is an ESXi reinstall.
