jonp Posted September 22, 2020
Quote: Can I assist with testing data?
Any insights / information you can provide would be much appreciated. We really need to narrow down reproducible steps.
CowboyRedBeard Posted September 22, 2020 (Author)
I've posted most of it to this thread, but when I went from 2 SSDs in a BTRFS pool to a single SSD on XFS it was less of a problem, though still a problem. I could run some tests or tools and show you the results if you can tell me which ones would be meaningful. Basically it's any time there's a large write (20 GB) to the SSD.
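One way to quantify "how much is actually hitting the SSD" between two points in time is to diff the sectors-written counter (field 10) in `/proc/diskstats`. This is just a sketch, not something from the thread; the device name `sdb` and the sample values below are made up for illustration:

```shell
# sectors_written: extract the sectors-written field (10th field per device line)
# from /proc/diskstats-formatted input. Sectors are 512 bytes by convention.
sectors_written() {
  # $1 = device name, stdin = diskstats content
  awk -v dev="$1" '$3 == dev { print $10 }'
}

# Two hypothetical samples, e.g. captured with `cat /proc/diskstats` before
# and after a large transfer:
before='   8  0 sdb 1000 0 0 0 500 0 2048000 0 0 0 0'
after='   8  0 sdb 1200 0 0 0 900 0 4096000 0 0 0 0'

s1=$(echo "$before" | sectors_written sdb)
s2=$(echo "$after"  | sectors_written sdb)
echo "$(( (s2 - s1) * 512 / 1024 / 1024 )) MiB written between samples"
```

Comparing this number against the size of the file you actually copied gives a rough feel for write amplification on the cache device.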
CowboyRedBeard Posted September 22, 2020 (Author)
I mean, at this point I'm happy to actually rebuild the server... fresh install... if that has a high likelihood of making this go away.
CowboyRedBeard Posted October 6, 2020 (Author)
On 9/22/2020 at 1:44 PM, jonp said: Any insights / information you can provide would be much appreciated. We really need to narrow down reproducible steps.
I saw that beta 29 is out; does it address this in any way? Also, I'm happy to provide any data that might be helpful, if you can tell me how to collect it.
dnLL Posted November 15, 2020
I'm having the same issue of high disk writes on the btrfs cache with dockers. This seems to be very well known... My SSDs are 35 days old, my server didn't do much (the mover barely moves anything) and yet, 25 TBW on both SSDs. They're 500 GB SSDs, so that's 50x their size in a month with a server idling 99.999% of the time. Insane.
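dnLL's numbers are easy to sanity-check. A rough sketch of the arithmetic in shell (decimal units, integer rounding):

```shell
# Numbers from the post: 25 TB written over 35 days on a 500 GB SSD.
tbw_tb=25
days=35
size_gb=500

gb_per_day=$(( tbw_tb * 1000 / days ))        # average GB written per day
drive_writes=$(( tbw_tb * 1000 / size_gb ))   # full-drive writes so far

echo "${gb_per_day} GB/day, ${drive_writes}x drive capacity"
```

Roughly 714 GB/day on an idle server, which is why this class of issue chews through consumer SSD endurance ratings so quickly.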
itimpi Posted November 15, 2020
7 hours ago, dnLL said: I'm having the same issue of high disk writes on the btrfs cache with dockers. This seems to be very well known... My SSDs are 35 days old, my server didn't do much (the mover barely moves anything) and yet, 25 TBW on both SSDs. They're 500 GB SSDs, so that's 50x their size in a month with a server idling 99.999% of the time. Insane.
There are fixes in the 6.9.0 beta series to address this.
dnLL Posted November 15, 2020
itimpi said: There are fixes in the 6.9.0 beta series to address this.
I'm happy to hear about this. I will wait for the stable version, hopefully soon.
CowboyRedBeard Posted November 18, 2020 (Author)
I tried the beta and still have the issue.
CowboyRedBeard Posted December 3, 2020 (Author)
Still a problem; this shows server temps, fan speeds, etc. when I have it set to download and unpack files at 02:00. Any chance this gets fixed soon? Alternatively, @jonp / @itimpi, do you think maybe a quicker fix for me would be to rebuild unraid from scratch? One of the other side effects is that SSD write speeds are way, way down, to the 80 MB/s range... which also sucks.
mctavish01 Posted December 9, 2020
I guess I will add to this, as a friend and I have the same issue. When writing to the array from off-server (like unloading GoPro footage) our CPU is absolutely pinned, causing issues with all dockers and the like, essentially rendering dockers useless while the transfer is active. Anything useful we can provide to help resolve the issue?
CowboyRedBeard Posted December 21, 2020 (Author)
My Christmas wish is for this to get fixed.... 😃
jonp Posted December 21, 2020
I know it is, you guys :-). We've had a hell of a time trying to recreate this issue. And then we thought we had it figured out with that SSD partitioning issue, but apparently that's not the silver bullet for everyone. I'm hoping to find some time over the holidays to really beat on the server in the lab again and see if I can reproduce. That is the key at this point: we have to be able to reproduce this issue to have any chance of solving it. Alternatively, if anybody has a system that does not have production data on it that is exhibiting this issue, I may be interested in requesting remote access to that box directly. I just don't want to do this to anyone's system that has data on it that they are concerned about.
CowboyRedBeard Posted December 21, 2020 (Author)
17 minutes ago, jonp said: We've had a hell of a time trying to recreate this issue. [...] We have to be able to reproduce this issue to have any chance of solving it.
Well, mine is in production with data... but I'd be happy to run some tests and provide metrics / data / output... I reproduce it daily. 😪
likesboc Posted February 18, 2021
Hi, is there any progress on this? I'm a new unraid user and it appears I'm facing a similar issue: CPU spikes while writing to the SSD cache. rtorrent is maxing out at 100% CPU although I only have a few torrents running. When the mover starts it gets even worse (dockers barely usable), so currently I'm only letting it run at night. Any help is appreciated; I can offer system diags if it helps.
CowboyRedBeard Posted February 19, 2021 (Author)
I've not seen a fix yet; I've been experiencing it for nearly a year.
likesboc Posted February 21, 2021
On 2/19/2021 at 3:15 AM, CowboyRedBeard said: I've not seen a fix yet; I've been experiencing it for nearly a year.
Does it make sense to disable the cache for media shares entirely and only use it for appdata? How do you circumvent this until it is addressed in a new unraid version? I'm still on trial, so any additional information is very much appreciated!
CowboyRedBeard Posted March 3, 2021 (Author)
I've been able to limp it along by scheduling large write I/O jobs, but that might not work for everyone. I'm actually thinking about building another install just to see if it goes away. Additionally, I've upgraded from beta 25 to 6.9.0 and the issue persists.
GMAsterAU Posted April 1, 2021
I just discovered the same issue. Copying 300 GB to my BTRFS encrypted cache (RAID 5) makes the IO wait time go through the roof, rendering the server useless for the duration of the copy. It completely normalises after the transfer is complete. I am running 6.9.1. Happy to run tests and provide information, as I can see this thread has been open for a long time. I sympathise with you @CowboyRedBeard
CowboyRedBeard Posted May 13, 2021 (Author)
Yeah, this is more than a year on at this point. And there's a number of folks with the issue. It'd be great to get a resolution, since I'd had unraid for YEARS without this problem.
Funes Posted September 9, 2021
Hi, new unraid user here, two weeks fighting with it trying to solve this issue. I would like to bump this thread since this topic really looks like the issue I am experiencing. Every time I need to write to or read from the array, the CPU goes crazy and the IO wait skyrockets. I have tried different things I have been collecting from different threads:
1. Download torrents to cache. That is OK, but my cache is 500 GB and I download that amount every day, so I have qBittorrent move the files to the array after downloading; this causes the issue when the move happens.
2. Rclone uploads to the cloud: this reads a folder from the array and uploads it to Google. Same issue, CPU and IO waits go to the moon. I have scheduled the upload during the night.
3. I have changed /config in docker from /mnt/user/appdata to /mnt/cache/appdata as suggested in other threads. Same result.
4. I have disabled the Tunable (enable Direct IO) setting under Global Share Settings as suggested in another thread. Same result.
5. Moved all of the appdata share to cache with the mover. Same result.
6. Changed my rclone upload script to point the upload folder from /mnt/user/download to /mnt/disk1/download to try to bypass Unraid's SHFS. Same result; posting a screenshot of the rclone upload run.
7. The folder caching plugin is not installed.
8. Cache filesystem and array filesystem match (XFS).
9. My CPU governor is Performance.
To be honest I am out of ideas and I don't know where else to look for information on how to solve this issue, but it really wears me down, as I can't use the server as a media center to serve content to my TV since the content freezes all the time (freezes on the TV match the CPU spikes from this issue). Any advice?
EDIT: By the way, this is on the latest stable version, 6.9.2.
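One way to put a number on the iowait spikes described above is to diff the iowait field between two `/proc/stat` "cpu" lines captured before and during a transfer. The helper and sample values below are a hypothetical sketch, not from the thread:

```shell
# iowait_pct: given two /proc/stat "cpu" lines on stdin, print the percentage
# of CPU time spent in iowait between the two samples.
# /proc/stat cpu fields after the label: user nice system idle iowait irq softirq steal
iowait_pct() {
  awk 'NR==1 { for (i = 2; i <= 9; i++) t1 += $i; w1 = $6 }
       NR==2 { for (i = 2; i <= 9; i++) t2 += $i; w2 = $6 }
       END   { printf "%.0f\n", 100 * (w2 - w1) / (t2 - t1) }'
}

# Hypothetical samples, e.g. `head -1 /proc/stat` taken a few seconds apart:
sample1='cpu 100 0 100 700 100 0 0 0'
sample2='cpu 150 0 150 800 400 0 0 0'
printf '%s\n%s\n' "$sample1" "$sample2" | iowait_pct
```

A sustained high percentage during cache or mover activity would line up with the "server useless while transferring" symptom reported throughout the thread.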
CowboyRedBeard Posted September 10, 2021 (Author)
I've been waiting over a year to see a resolution to this problem... I've actually changed how I use unraid to mitigate its effects on applications and users. I've not had the time, but I've thought about doing a fresh install in hopes of fixing it, since this install of unraid is pretty old and has even gone through 2 physical servers... But new users seeing it makes me think that won't help.
CowboyRedBeard Posted November 8, 2021 (Author)
Guys, is there any hope that there will be a resolution to this issue?
JorgeB Posted November 8, 2021
Sorry if this was already suggested, I don't remember, but there have been some reports that it helps on servers with lots of RAM: install the Tips and Tweaks plugin and set "vm.dirty_background_ratio" to 1 and "vm.dirty_ratio" to 2, then test to see if it makes it any better.
CowboyRedBeard Posted November 8, 2021 (Author)
6 hours ago, JorgeB said: Install the Tips and Tweaks plugin and set "vm.dirty_background_ratio" to 1 and "vm.dirty_ratio" to 2, then test to see if it makes it any better.
What does that do? I'm assuming this is something to do with VMs, and I don't understand how that would have an effect on general OS disk I/O. Mine are currently 10 and 20 in a server with 128 GB of RAM.
JorgeB Posted November 8, 2021
11 minutes ago, CowboyRedBeard said: I'm assuming this is something to do with VMs
It doesn't, it has to do with RAM cache for writes.
12 minutes ago, CowboyRedBeard said: Mine are currently 10 and 20 in a server with 128 GB of RAM
Those are the defaults.
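For anyone who wants to try JorgeB's suggestion from a root shell rather than through the Tips and Tweaks plugin, these are standard Linux `vm.dirty_*` sysctls (they control how much RAM may fill with dirty pages before background and then forced writeback kicks in). A minimal sketch; note `sysctl -w` changes take effect immediately but do not persist across a reboot:

```shell
# Show the current values (the defaults mentioned above are 10 and 20):
sysctl vm.dirty_background_ratio vm.dirty_ratio

# Apply the suggested lower values (run as root):
sysctl -w vm.dirty_background_ratio=1
sysctl -w vm.dirty_ratio=2
```

Lower ratios mean writeback starts sooner and less dirty data accumulates in RAM, which can smooth out the huge burst flushes that stall other I/O on servers with a lot of memory.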