Any Current Solutions for Critical CPU_IOWAIT Errors on an UNRAID System?


ButchR


I have the following CPU_IOWAIT errors on my UNRAID server:

 

image.thumb.png.2019a4bb906e81a472905f0901059029.png

 

I have researched this error a little, and there is some older information available that points to possible issues with older versions of UNRAID. Some discussions relate to Mover and/or to Samsung drives whose partition starts at a different location, causing delays in data transfer (this was not very clear to me). Is there a definitive answer, or maybe a guide, to solve the problem?

Link to comment

It really all depends upon what's going on at the time. IO wait isn't an "error" per se. It just means that the system is hung up waiting for data to arrive from an I/O device (most likely the hard drive(s)).
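
For anyone who wants to put a number on that: here is a minimal Python sketch that measures the share of CPU time spent in iowait between two readings of `/proc/stat`. It assumes Python 3 is available on the server (it may need to be installed separately on Unraid), and the function names are my own, not part of any Unraid tool.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples.

    Field order in /proc/stat:
    user nice system idle iowait irq softirq steal guest guest_nice
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas[:8])  # guest time is already counted inside user/nice
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return iowait %."""
    with open(path) as f:
        first = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.read())
    return iowait_percent(first, second)
```

Calling `sample_iowait()` while Plex is struggling, and again while the server is idle, gives two figures to compare. A high value only says the CPU was waiting on a device, not which device or why.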

 

But, if you started your journey with Unraid prior to (I believe) 6.8.x, then at some point you should be reformatting your cache drive to take advantage of increased speeds IF the partition doesn't start on sector 64.
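
If you want to see where your partitions currently start before deciding anything, this is a rough Python sketch that reads the `start` file Linux exposes under `/sys/class/block` for each partition. The function names are made up for illustration. It only reports the start sector and whether that offset is a multiple of 1 MiB; it does not decide which layout is the right one for a given Unraid version.

```python
import os

# sysfs reports 'start' in 512-byte units, whatever the drive's own sector size is
SECTOR_BYTES = 512


def partition_starts(sys_block="/sys/class/block"):
    """Map partition name -> start sector for entries that expose a 'start' file.

    Whole disks (e.g. sdd) have no 'start' file, so only partitions are returned.
    """
    starts = {}
    for name in sorted(os.listdir(sys_block)):
        start_file = os.path.join(sys_block, name, "start")
        if os.path.isfile(start_file):
            with open(start_file) as f:
                starts[name] = int(f.read().strip())
    return starts


def is_mib_aligned(start_sector):
    """True if the partition's byte offset is a multiple of 1 MiB."""
    return (start_sector * SECTOR_BYTES) % (1024 * 1024) == 0


def report(sys_block="/sys/class/block"):
    """Print one line per partition with its start sector and alignment."""
    for name, start in partition_starts(sys_block).items():
        aligned = "yes" if is_mib_aligned(start) else "no"
        print(f"{name}: starts at sector {start}, 1 MiB aligned: {aligned}")
```

Running `report()` on the server lists every partition, so the cache devices can be compared against the array disks. This is read-only and does not touch the drives.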

Link to comment

Well, I started my journey with UNRAID at 6.8.x, and I am at version 6.9.2 right now. What steps do I need to take to have my Samsung drives (SSD and Samsung 970_EVO) start at sector 64? Any help would be appreciated. My Plex is not stable. I have been reading about formatting drives, which is understandably a needed step, but most of the comments assumed the reader was familiar with the process and/or UNRAID. I just need some help to achieve the end result of starting at sector 64, which will hopefully resolve this issue.

 

I hope I can get some help from the community. Thanks in advance.

Link to comment

I have been unsuccessful so far in finding the issue that is causing my UNRAID/Plex server to be so unstable. I had a few months of the Plex server running great. Now I am thinking I may have to start over from the beginning. I am reading, trying a few things, and wondering how it could have gone so wrong. If only I could decipher the critical warnings better. I am making a list of things to try and crossing them off as I go. Wish me luck; I will need it for sure.

 

image.thumb.png.53621209d034399fc6eee345aec32241.png

Link to comment

I have exhausted the things I discovered with the help of the forum. My Unraid server had been running well for three months. At present it is not stable enough to run Plex without crashing or not functioning at all. What are some basic steps I can take to start over? I am thinking, of course, of using my flash drive and slowly adding hardware back one step at a time. Hopefully, I will be able to get my Unraid server back up and running.

 

Are there any steps or guides on how to start over from scratch with an unstable Unraid server? Any help would be appreciated.

Link to comment

Here is a list of the hardware:

Supermicro Motherboard MBD-X11SCA-F-O 

Intel Core i9-9900K 3.6 GHz Eight-Core

Corsair Vengeance LPX 8GB (1 x 8GB) DDR4 DRAM 2400MHz (PC4-19200) 

Corsair Vengeance LPX 8GB (1 x 8GB) DDR4 DRAM 2400MHz (PC4-19200) 

Dell 6Gbps H310 SAS HBA w/ LSI 9211-8i P20 IT Mode

HP NVIDIA Quadro P620 Graphics Card - 2GB GDDR5

Parity      ST6000VN001-2BB1_ZR10HA6A - 6 TB  (sde)

Disk 1     ST6000VN001-2BB1_ZR10J15M - 6 TB  (sdf)

Disk 2     WDC_WD60EFAX-68S_WD-WX11D192V428 - 6 TB   (sdc)

Disk 3     ST4000DM004-2CV1_ZTT15GZD - 4 TB (sdl)

Disk 4    WDC_WD40EZRZ-00G_WD-WCC7K7PA29TT - 4 TB (sdm)

Disk 5   WDC_WD40EZRZ-00G_WD-WCC7K7PA267U - 4 TB (sdb)

Disk 6    ST4000DM004-2CV1_ZTT17LGL - 4 TB (sdg)

Cache_nvme  Samsung_SSD_970_EVO_Plus_2TB_S59CNM0R419774J - 2 TB   (nvme1n1)

Pool of two devices

Cache_protected  SHGS31-500GS-2_ED0CN90861140CP4E - 500 GB (sdh)

Cache_pr...ted       SHGS31-500GS-2_EN08N831310208U2A - 500 GB (sdj)

 

Cache_ssd         CT1000BX500SSD1_2110E583F484 - 1 TB (sdd)

Pool of two devices

Ssdpool_test     Samsung_SSD_850_EVO_250GB_S3PZNF0JC10412H - 250 GB (sdk)

Ssdpool_test 2  Samsung_SSD_850_EVO_250GB_S3PZNF0JC08916B - 250 GB (sdi)

 

Vms_nvme          WDBRPG0010BNC-WRSN_205348801911 - 1 TB (nvme0n1)

 

 

 

 

 

 

Link to comment
On 12/13/2021 at 10:15 PM, ButchR said:

Still researching information on cpu_iowait and troubleshooting my unraid server issues. Attached are a diagnostic file and a screen capture from Glances.

 

IO wait simply means that the CPU is waiting for information from a device. It all depends upon what's going on at the time this is happening. Now, if you're seeing that spike on the hour, every hour, it's because you're trimming the drives far more often than necessary. Once a week during off times is more than sufficient. Everything gets hung up while that's going on.
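
One way to check the "on the hour, every hour" pattern is to look at logged iowait samples and see whether the spikes keep landing on the same minute. Here is a small Python sketch of that idea. It assumes you already have `(timestamp, iowait_percent)` pairs from somewhere (Netdata, Glances, or your own logging); the threshold of 20% and the minimum of 3 hours are arbitrary values I picked for illustration.

```python
from collections import defaultdict
from datetime import datetime


def spike_minutes(samples, threshold=20.0):
    """For each minute-of-hour, count the distinct hours with a sample >= threshold.

    `samples` is an iterable of (datetime, iowait_percent) pairs.
    """
    hours_by_minute = defaultdict(set)
    for timestamp, percent in samples:
        if percent >= threshold:
            hours_by_minute[timestamp.minute].add((timestamp.date(), timestamp.hour))
    return {minute: len(hours) for minute, hours in hours_by_minute.items()}


def recurring_minutes(samples, threshold=20.0, min_hours=3):
    """Minutes-of-hour where a spike showed up in at least `min_hours` different hours."""
    counts = spike_minutes(samples, threshold)
    return sorted(minute for minute, hours in counts.items() if hours >= min_hours)
```

If `recurring_minutes(...)` comes back with something like `[0]`, the spikes line up with an hourly schedule, and the trim schedule (or any other hourly job) is worth a look. An empty list suggests the spikes are tied to something else.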

Link to comment

Okay, how do I make adjustments to drive trimming? Is there any information you could provide? Also, I read that Samsung drives need to be reformatted with a different starting sector. I have used hard drive utilities in the past for partitioning and so on. Do I need to use software like that to prep the Samsung drives? Do you know the steps I need to take?

 

Link to comment

This is completely a bandwidth issue, with multiple concurrent I/Os happening and the system having to wait for them to complete. E.g., your appdata share exists on two of the cache pools and on the array. This may be by design, but having it on the array is going to be a huge penalty.
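
To see this for yourself, here is a minimal Python sketch that lists which shares exist on more than one disk or pool. It relies on Unraid mounting each array disk and each pool under `/mnt` with the shares as top-level folders, and it skips the merged `user` and `user0` views. The function names are my own, and other folders under `/mnt` (for example from plugins) may need adding to `skip`.

```python
import os


def share_locations(mnt_root="/mnt", skip=("user", "user0")):
    """Map share name -> sorted list of mounts (disks/pools) that hold that folder."""
    locations = {}
    for mount in sorted(os.listdir(mnt_root)):
        mount_path = os.path.join(mnt_root, mount)
        if mount in skip or not os.path.isdir(mount_path):
            continue
        for share in os.listdir(mount_path):
            if os.path.isdir(os.path.join(mount_path, share)):
                locations.setdefault(share, []).append(mount)
    return locations


def spread_shares(locations):
    """Keep only the shares that live on more than one mount."""
    return {share: mounts for share, mounts in locations.items() if len(mounts) > 1}


def report(mnt_root="/mnt"):
    """Print every share that is spread over several disks or pools."""
    for share, mounts in sorted(spread_shares(share_locations(mnt_root)).items()):
        print(f"{share}: {', '.join(mounts)}")
```

Media shares are normally spread over several array disks, so they will show up here too. The ones to look at are appdata and system, which ideally sit on a single fast pool.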

 

The same goes for the system share (this won't affect anything unless, within the system share, docker.img is the file that exists on the array).

 

The worst case when this happens with, say, Plex is buffering, stuttering, etc. It won't cause any crashes. Crashes would be something else.

 

The graphs in Netdata showed that the I/O wait stopped. The key to diagnosing this is figuring out what is going on at the point in time when the I/O wait increases to the point that it's causing issues, and what changed when the I/O wait drops back down.
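
One hedged way to catch "what is going on" at that moment is to list the processes that are in uninterruptible sleep (state `D` in `/proc/<pid>/stat`), which on Linux usually means they are blocked on I/O. The sketch below is only an illustration with function names I made up. It shows who is waiting at that instant, not which disk they are waiting on.

```python
import os


def parse_stat(stat_text):
    """Return (pid, command name, state) from the text of /proc/<pid>/stat.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the first '(' and the last ')'.
    """
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, command) for every process currently in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or its stat file was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Running `blocked_processes()` a few times while the I/O wait is high, and again after it drops, shows whether the same names (Plex, mover, a container process) keep turning up.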

Link to comment

I see how I have set up bandwidth bottlenecks, though not by design. As a novice unraid user, I took a little from various guides and videos to create my dream unraid server. I overlapped or doubled share locations and put a share in an inefficient location. Now I have conflicting I/Os: appdata spread across two cache pools and the array. I would like to correct these issues and have a standard setup for I/O. Then I could learn how shares work and modify them in the future if needed.

 

What steps can I take to perform surgery on the unraid server's setup to make it more efficient?

Link to comment

Here is what my shares look like. I have added apps that look cool but have not fully implemented them, so I am open to removing any apps that look like an issue. Help with removing conflicts and setting up a more efficient structure would be much appreciated.

 

To correct:

    "multiple concurrent I/O's happening and the system having to wait to complete"

    "appdata share exists on two of the cache pools and on the array"

    an efficient "system share"

 

 

image.thumb.png.1094edda0227bda2d71120a9706e5540.png

 


 

 

 

 

 

 

Link to comment

Any suggestions on how I can start to make changes to my unraid shares? I am thinking of creating a basic share structure and building on it. I wish I were enough of an expert with unraid to know what moves to make next. It is just like learning chess for the first time: after some time playing, one knows what moves to make. I am a little frustrated that everything I have put into my unraid server so far is not working well. Hopefully one day I will get my unraid server to where I would like it to be.

Link to comment
