6.8.3 Disk writes causing high CPU


  • 2 weeks later...
On 9/22/2020 at 1:44 PM, jonp said:

Any insights / information you can provide would be much appreciated. We really need to narrow down reproducible steps.

Sent from my Pixel 3 XL using Tapatalk
 

I saw that beta 29 is out; does it address this in any way?

Also, happy to provide any data that might be helpful... if you can tell me how to collect it.

  • 1 month later...

I'm having the same issue of high disk writes on the btrfs cache with dockers. This seems to be very well known... 

 

My SSDs are 35 days old, my server didn't do much (the mover barely moves anything) and yet, 25 TBW on both SSDs. They're 500 GB SSDs, so that's 50x their size in a month with a server idling 99.999% of the time. Insane.
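For anyone who wants to check their own drives: lifetime writes can be read from SMART data. A minimal sketch, assuming smartmontools is available (it ships with Unraid); /dev/sdb and /dev/nvme0 are placeholders for your cache devices, and the attribute names and units vary by vendor:

    # SATA SSD: lifetime host writes, commonly attribute 241
    # (units differ by vendor, e.g. 512-byte LBAs vs 32 MiB blocks)
    smartctl -A /dev/sdb | grep -i -E 'lbas_written|host_writes|wear'

    # NVMe SSD: "Data Units Written", where 1 unit = 512,000 bytes
    smartctl -a /dev/nvme0 | grep -i 'data units written'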

 

 

7 hours ago, dnLL said:

I'm having the same issue of high disk writes on the btrfs cache with dockers. This seems to be very well known... 

 

My SSDs are 35 days old, my server didn't do much (the mover barely moves anything) and yet, 25 TBW on both SSDs. They're 500 GB SSDs, so that's 50x their size in a month with a server idling 99.999% of the time. Insane.

 

 

There are fixes in the 6.9.0 beta series to address this.
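For context, one candidate in that series is the SSD partition-alignment fix jonp mentions later in the thread. A quick way to see how a cache device is currently partitioned, as a sketch (/dev/sdb is a placeholder for your cache SSD):

    # Check the partition start sector: pre-6.9 Unraid reportedly started
    # pool partitions at sector 64, while 6.9 aligns new ones to 1 MiB
    # (start sector 2048)
    fdisk -l /dev/sdb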

  • 2 weeks later...

Still a problem. The screenshot below shows server temps, fan speeds, etc. This is from when I have it set to download and unpack files at 02:00.

 

[screenshot: server temps and fan speeds during the 02:00 download/unpack window]

 

Any chance this gets fixed soon?

Alternatively, @jonp / @itimpi, do you think a quicker fix for me would be to rebuild Unraid from scratch?

 

I mean, one of the other side effects is that SSD write speeds are way down, into the 80 MB/s range... which also sucks.

Edited by CowboyRedBeard
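One way to sanity-check sequential write speed from the console, as a rough sketch (assumes the pool is mounted at /mnt/cache; the test file name is arbitrary, and zeros compress well, so treat the number as an upper bound if compression is enabled):

    # Write 4 GiB with O_DIRECT so the result reflects the device, not RAM
    dd if=/dev/zero of=/mnt/cache/ddtest bs=1M count=4096 oflag=direct status=progress
    # Clean up the test file
    rm /mnt/cache/ddtest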

I guess I will add to this, as a friend and I have the same issue.
When writing to the array from off the server (like offloading GoPro footage), our CPU is absolutely pinned, causing issues with all the dockers and the like, essentially rendering the dockers useless while the transfer is active.

Anything useful we can provide to help resolve the issue?
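One useful starting point is capturing what is actually busy while a transfer runs. A sketch, assuming iostat and iotop are available (they can be added on Unraid, e.g. via the NerdTools plugin):

    top -b -n 1 -o %CPU | head -20   # snapshot of the top CPU consumers
    iostat -x 1 5                    # per-device utilisation and await times
    iotop -b -o -n 3                 # which processes are generating the I/O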

  • 2 weeks later...

I know it is you guys :-). We've had a hell of a time trying to recreate this issue, and then we thought we had it figured out with that SSD partitioning issue, but apparently that's not the silver bullet for everyone. I'm hoping to find some time over the holidays to really beat on the server in the lab again and see if I can reproduce it. That is the key at this point: we have to be able to reproduce this issue to have any chance of solving it. Alternatively, if anybody has a system without production data on it that is exhibiting this issue, I may be interested in requesting remote access to that box directly. I just don't want to do this to anyone's system that has data on it that they are concerned about.

Sent from my Pixel 3 XL using Tapatalk

Link to comment
17 minutes ago, jonp said:

I know it is you guys :-). We've had a hell of a time trying to recreate this issue, and then we thought we had it figured out with that SSD partitioning issue, but apparently that's not the silver bullet for everyone. I'm hoping to find some time over the holidays to really beat on the server in the lab again and see if I can reproduce it. That is the key at this point: we have to be able to reproduce this issue to have any chance of solving it. Alternatively, if anybody has a system without production data on it that is exhibiting this issue, I may be interested in requesting remote access to that box directly. I just don't want to do this to anyone's system that has data on it that they are concerned about.

Sent from my Pixel 3 XL using Tapatalk
 

Well, mine is in production with data... but I'd be happy to run some tests and provide metrics / data / output...

I reproduce it daily. 😪

  • 1 month later...

Hi, is there any progress on this? I'm a new Unraid user and it appears I'm facing a similar issue: CPU spikes while writing to the SSD cache. rtorrent is maxing out at 100% CPU although I only have a few torrents running. When the mover starts it gets even worse and the dockers are barely usable, so currently I'm only letting it run at night. Any help is appreciated; I can offer system diags if it helps.

On 2/19/2021 at 3:15 AM, CowboyRedBeard said:

I've not seen a fix yet; I've been experiencing it for nearly a year.

Does it make sense to disable the cache for media shares entirely and only use it for appdata? How do you work around this until it is addressed in a new Unraid version? I'm still on trial, so any additional information is very much appreciated!

  • 2 weeks later...

I've been able to limp it along by scheduling large write I/O jobs (see the sketch below), but that might not work for everyone.

I'm actually thinking about building another install just to see if it goes away.

Additionally... I've upgraded from beta 25 to 6.9.0 and the issue persists.

Edited by CowboyRedBeard
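One sketch of that kind of scheduling, using a cron entry and assuming the standard mover path (the time and priorities are illustrative; on Unraid this could be wired up through the User Scripts plugin):

    # Run the mover at 04:00 in the idle I/O class so it yields to
    # dockers and VMs whenever they touch the disks
    0 4 * * * ionice -c3 nice -n 19 /usr/local/sbin/mover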
  • 5 weeks later...

I just discovered the same issue. Copying 300 GB to my encrypted btrfs cache (RAID 5) makes the I/O wait time go through the roof, rendering the server useless for the duration of the copy. It completely normalises once the transfer is complete. 

 

I am running 6.9.1. Happy to run tests and provide information, as I can see this thread has been open for a long time. I sympathise with you, @CowboyRedBeard.
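A simple way to watch that I/O wait climb while the copy runs (vmstat ships with stock Linux, including Unraid):

    # 'wa' is the percentage of CPU time stalled on I/O; 'b' is the number
    # of processes blocked waiting for it
    vmstat 1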

  • 1 month later...
  • 3 months later...

Hi,

 

New Unraid user here; I've spent two weeks fighting with it trying to solve this issue. I would like to bump this thread since it really looks like the issue I am experiencing. Every time I need to write to or read from the array, the CPU goes crazy and the I/O wait skyrockets. I have tried different things I have collected from various threads:

 

1. Download torrents to the cache. That is OK, but my cache is 500 GB and I download that amount every day, so I have qBittorrent move the files to the array after downloading; that move is when the issue happens.

2. rclone uploads to the cloud: this reads a folder from the array and uploads it to Google. Same issue; CPU and I/O waits go to the moon. I have scheduled the upload to run during the night.

3. I have changed /config in Docker from /mnt/user/appdata to /mnt/cache/appdata as suggested in other threads. Same result.

4. I have disabled the Tunable (enable Direct IO) under Global Share Settings as suggested in another thread. Same result.

5. Moved all the appdata share to cache with mover. Same result.

6. Changed my rclone upload script to point the upload folder from /mnt/user/download to /mnt/disk1/download to try to bypass Unraid's SHFS (see also the snippet after this list for checking shfs directly). Same result; screenshot of the rclone upload below:

 

[screenshot: system load while running the rclone upload]

 

7. The Folder Caching plugin is not installed. 

8. The cache filesystem and array filesystem match (XFS).

9. My CPU governor is Performance.
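On point 6: whether SHFS is the bottleneck can be checked directly while a transfer runs. A minimal sketch (shfs is the FUSE process behind /mnt/user):

    # If shfs sits near the top while the transfer runs, the FUSE layer is
    # the hot spot; if not, look elsewhere (filesystem, device, etc.)
    top -b -n 1 -o %CPU | grep -E 'shfs|%CPU'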

 

To be honest, I am out of ideas and I don't know where else to look for information on how to solve this, but it is really driving me to despair, as I can't use the server as a media center to serve content to my TV; the content freezes all the time (the freezes on the TV match the CPU spikes from this issue).

Any advice?

EDIT: By the way, this is on the latest stable version, 6.9.2.

 

Edited by Funes

I've been waiting over a year to see a resolution to this problem... I've actually changed how I use Unraid to mitigate its effects on applications and users.


I've not had the time, but I've thought about doing a fresh install in hopes of fixing it, since this install of Unraid is pretty old and has even gone through two physical servers... but new users seeing it makes me think that won't help.

  • 1 month later...
6 hours ago, JorgeB said:

Sorry if this was already suggested, I don't remember, but there have been some reports where it helps for servers with lots of RAM: install the Tips and Tweaks plugin and set "vm.dirty_background_ratio" to 1 and "vm.dirty_ratio" to 2, then test to see if it makes it any better.

 

What does that do? I'm assuming it is something to do with VMs, but I don't understand how that would affect general OS disk I/O.

Mine are currently 10 and 20 on a server with 128 GB of RAM.
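For what it's worth, those two knobs are kernel writeback settings, not VM settings: vm.dirty_background_ratio is the percentage of RAM that may hold dirty (not-yet-written) pages before background flushing starts, and vm.dirty_ratio is the hard ceiling at which writers are blocked until data is flushed. At 10/20 with 128 GB of RAM, roughly 13 to 26 GB of writes can pile up in memory before a big synchronous flush stalls everything. A sketch of inspecting and lowering them by hand (presumably what the Tips and Tweaks plugin applies):

    # Show the current values
    sysctl vm.dirty_background_ratio vm.dirty_ratio
    # Start background flushing at 1% of RAM and block writers at 2%,
    # so large transfers stream out steadily instead of in huge bursts
    sysctl vm.dirty_background_ratio=1 vm.dirty_ratio=2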

