6.8.3 Disk writes causing high CPU



5 minutes ago, CowboyRedBeard said:

Yeah, but I had Crucial drives in and had the same issue. So the brand of drive isn't what the problem is here.

 

The problem is the QVO line specifically; the EVO does a lot better. It's not a brand-name issue.

 

What model was that Crucial?

Also, what download speed do you usually get in SAB?

Edited by Benson

The Crucial drives were MX500s... and the part you might have missed is that this issue hasn't always happened. In fact, part of the reason I put the Samsung drives in (apart from regularly running out of space on the MX500s) was that I wondered whether they might have been part of the issue.

So, the MX500 drives didn't have the issue... then at some point it started. I'm pretty sure it started right when I went to 6.7, but these operations were mostly happening late at night.

7 minutes ago, Benson said:

The problem is probably the Samsung QVO SSD, due to its bad write performance (one of the worst SSDs).

 

[benchmark screenshot]

 

Well, if I were seeing those performance numbers I would be happy,
but while the mover is running I see 20-30 MByte/s and 30% I/O wait times... That's not normal.
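
For anyone who wants to reproduce those numbers from the shell while the mover runs, something like this works (iostat comes with the sysstat package; the device names are just examples):

iostat -xm 2 /dev/sdk /dev/sdl
# w_await is the average write latency in ms, %util shows how saturated each device is,
# and %iowait on the CPU line is the overall wait percentage.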

All SSDs are currently running against an LSI2008 (a PERC H310 flashed to IT mode), firmware P16, because TRIM doesn't work with P20.
The other disks are on an H710P, each as a single-disk RAID0 (I know that's not optimal for later; two more H310s are on their way to me so I can migrate ASAP).
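
To double-check that TRIM actually makes it through the HBA on P16, something along these lines should work (the device names and mount point are just examples):

lsblk --discard /dev/sdk /dev/sdl   # non-zero DISC-GRAN / DISC-MAX means discard is passed through
fstrim -v /mnt/cache                # manually trims the mounted cache filesystem and reports how much was trimmed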


Anyhow, HDD performance is OK so far; it's just the SSDs that go into total lockup once the mover is running. I still tend to think it's an issue with btrfs and the partition offset. The other box has a Samsung 850 SSD.
Performance is easily in the 500 MByte/s range on that single SSD, and it didn't suffocate with all those Docker containers.
That box used to be the only box, so all the I/O-intensive things were running on it. I/O was never an issue, just not enough room to mount disks and not enough RAM, hence the new box.

Alignment in the old box (850 SSD):
root@box:~# lsblk -o  NAME,ALIGNMENT,MIN-IO,OPT-IO,PHY-SEC,LOG-SEC  /dev/sdd
NAME   ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC
sdd            0    512      0     512     512
├─sdd1         0    512      0     512     512
└─sdd2         0    512      0     512     512

 

New box:

root@Tower:/var/log# lsblk -o  NAME,ALIGNMENT,MIN-IO,OPT-IO,PHY-SEC,LOG-SEC  /dev/sd[a-o]
NAME   ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC
sda            0    512      0     512     512
└─sda1         0    512      0     512     512
sdb            0    512      0     512     512
└─sdb1         0    512      0     512     512
sdc            0    512      0     512     512
└─sdc1         0    512      0     512     512
sdd            0    512      0     512     512
└─sdd1         0    512      0     512     512
sde            0    512      0     512     512
└─sde1         0    512      0     512     512
sdf            0    512      0     512     512
└─sdf1         0    512      0     512     512
sdg            0    512      0     512     512
└─sdg1         0    512      0     512     512
sdh            0    512      0     512     512
└─sdh1         0    512      0     512     512
sdi            0    512      0     512     512
└─sdi1         0    512      0     512     512
sdj            0    512      0     512     512
└─sdj1         0    512      0     512     512

 

SSD 860 QVO
sdk            0    512      0     512     512
└─sdk1         0    512      0     512     512

SSD 860 QVO
sdl            0    512      0     512     512
└─sdl1         0    512      0     512     512
sdm            0   4096      0    4096     512
└─sdm1         0   4096      0    4096     512
sdn            0   4096      0    4096     512
└─sdn1         0   4096      0    4096     512
sdo            0   4096      0    4096     512
└─sdo1         0   4096      0    4096     512
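
Since I keep coming back to the partition offset theory: lsblk only shows the sector sizes, not where the partitions actually start, so to check the start sectors something like this can be used (/dev/sdk is just an example):

parted /dev/sdk unit s print   # partition table with start/end in sectors
fdisk -l /dev/sdk              # same information; with 512-byte sectors, a start divisible by 2048 is 1MiB-aligned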

 

1 hour ago, CowboyRedBeard said:

and the part you might have missed is that this issue hasn't always happened

 

1 hour ago, CowboyRedBeard said:

but these operations were mostly happening late at night.

The MX500 should be a good SSD.

With the Samsung QVO, does the problem happen more frequently? ... I really don't have much of an idea whether there's a time factor involved.

 

I focus on "Backlog", those long response time already show overload in disk. Sometimes ago, I replace Crucial BX200 with a 7 years age Crucial  M4 for BT jobs, it perform quite good ( usually 16-35MB/s writing ) and much lot fast response time during load. Both SSD support TRIM ( but not support secure erase ), TRIM on BX200 won't help, read and write speed can't resume normal.

 

[screenshot]

Edited by Benson
1 hour ago, ephigenie said:

Anyhow, HDD performance is OK so far; it's just the SSDs that go into total lockup once the mover is running. I still tend to think it's an issue with btrfs and the partition offset.

I don't use the mover; some long threads have reported problems with the mover, and those should be fixed by now.

I use RAID0 spindle disks for cache / UD, and in fact see no unexpected performance issues. (All spindles and SSDs have been BTRFS since day 1.)

 

Would you try moving your SSDs from the LSI HBA to the onboard controller?

Edited by Benson

I appreciate the help, but it isn't the drives. Check out the netdata graphs I posted at the start of this thread.


As you can see from my first posts, I'm not getting anything near these speeds:

https://ssd.userbenchmark.com/SpeedTest/667965/Samsung-SSD-860-QVO-1TB

 

https://www.pcworld.com/article/3322947/samsung-860-qvo-ssd-review.html

 

The issue was the same with the Crucial drives. So... I'm pretty sure it's not the drives themselves. I'm not on the same level as most of you guys with this stuff, but given the discussion thus far it seems clear to me that this is something unique to unRAID and cache.

 

14 minutes ago, CowboyRedBeard said:

I appreciate the help, but it isn't the drives. [...] The issue was the same with the Crucial drives. So... I'm pretty sure it's not the drives themselves.
 

Maybe I am wrong; anyway, 75 MB/s could be the maximum in the worst case. (The BX200 is even worse, 60 MB/s.)

 

https://www.pcworld.com/article/3322947/samsung-860-qvo-ssd-review.html

 

[benchmark chart]

 

And some of the figures seem to match quite well.

 

[screenshot]

 

 

Edited by Benson

For a test, here's me copying a file (83G) from the array to the cache drive (via MC in shell):

 

[screenshot]

 

And where this drive drops off utilization, you can scroll down to the other cache drive and watch it pick up where this one left off.

 

Now, this is writing at only 85 MB/s (+/-), and it's pulling from a spinning disk... which should be able to feed it more than that. But there's the backlog you asked about.
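
For what it's worth, a more controlled version of this test that takes the page cache out of the picture would be something like the following (the source path is just an example, and the test file should be deleted afterwards):

dd if=/mnt/disk1/some_large_file of=/mnt/cache/ddtest bs=1M oflag=direct status=progress
# oflag=direct bypasses the page cache on the write side, so the reported rate
# reflects what the cache device itself sustains rather than what fits in RAM.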

 

And as a test, I have a PCIe Intel Optane drive in the box that I copied this file to FROM the cache:

[screenshot]

 

And this is the Optane, which typically seems to be able to take data as fast as you can give it:

[screenshot]

 

 

Hopefully this sheds some light on the issue?


I mean, I'm not seeing the speeds as limited as others... although I'm unable to obtain the speeds I did in the past. The larger issue for me is that when I DO write to the cache as fast as it can... it crushes the server and other services basically stop. As seen in the first post.

Edited by CowboyRedBeard
1 hour ago, Benson said:

Would you try moving your SSDs from the LSI HBA to the onboard controller?

I had a no-name 6-channel SATA 6Gb controller before (same issue), and I tried the onboard 3Gb SATA controller (felt worse). My
server is a Dell T620, so the onboard controller is really only meant for e.g. a DVD-ROM, not for disks; it's a PERC S110. I also tried running that controller in RAID as well as AHCI mode, but it didn't make any difference.

 

My conclusion so far: I don't think it's an issue with the (LSI) controller. During e.g. scrubbing I see performance > 500 MByte/s on both SSDs at the same time, so well above 1 GByte/s combined.
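
(In case anyone wants to reproduce that: assuming the pool is mounted at /mnt/cache, a scrub and its progress can be watched with something like this:)

btrfs scrub start /mnt/cache      # kick off a scrub of the cache pool
btrfs scrub status /mnt/cache     # progress, amount of data scrubbed and error counts
btrfs filesystem show /mnt/cache  # confirms which devices make up the pool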

 

Edited by ephigenie
1 hour ago, CowboyRedBeard said:

I mean, I'm not seeing the speeds as limited as others... although I'm unable to obtain the speeds I did in the past. The larger issue for me is that when I DO write to the cache as fast as it can... it crushes the server and other services basically stop. As seen in the first post.

Understood. During the read/write test with the Optane drive, does the same problem happen (the server gets crushed and other services basically stop)?

 

I agree there were big changes from 6.7 to 6.8, so some users' cases may not show the issue at all.

 

You mentioned you tried setting the "dirty ratio" and it got even worse; could you try LOWERING it? My system also has 128GB of memory, but I set the dirty ratio to a very high level to suit my needs. The behavior differs between Unraid versions.
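
For reference, the current values can be checked and changed temporarily from the console like this (a non-persistent test only; the right numbers depend on RAM size and workload):

sysctl vm.dirty_ratio vm.dirty_background_ratio           # show the current values (percent of RAM)
sysctl -w vm.dirty_ratio=5 vm.dirty_background_ratio=2    # lower them for a test; reverts on reboot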

 

I also reported a case about "direct I/O", and that issue seems to be gone since 6.9.0-beta1.

 

Edited by Benson

So this is my latest test: I shut the system down and installed a PCIe SATA controller card, moved both cache drives over to it (from the motherboard SATA3 ports), and here's an 8G file:
[screenshot]

 

More or less the same issue with I/O wait; however, the speed might be a little better... hard to say, as this file was half the size of the previous tests.

 

[screenshot]
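
Something like this would also show which processes are actually sitting in iowait while the copy runs (these tools may not be on a stock install):

iotop -o -d 2    # only shows processes currently doing I/O, with per-process read/write rates
pidstat -d 2     # per-process disk statistics (from the sysstat package)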

8 hours ago, Benson said:

Understood. During the read/write test with the Optane drive, does the same problem happen (the server gets crushed and other services basically stop)? [...]

 

No, it did not happen when copying to the Optane drive; however, that drive is formatted with XFS. Maybe that is part of the issue?

 

I wonder if there's an easy way for me to convert my cache pool to XFS and then try?

8 minutes ago, CowboyRedBeard said:

No, it did not happen when copying to the Optane drive; however, that drive is formatted with XFS. Maybe that is part of the issue?

 

I wonder if there's an easy way for me to convert my cache pool to XFS and then try?

 

I converted my cache (had to break the pool) to xfs (which does not support pooled drives), and the issue seems to be resolved. I can't say 100%, but I've not seen any slowdowns since the conversion. The conversion was not exactly simple, and I found the steps in the thread linked earlier in this thread.

 

1) Stop the array, then disable Docker service and VMs.

2) Change cache pool to single mode (if you're on 6.8.3) and allow the balance to complete. Then stop the array, unassign one of the cache disks, then restart the array.

3) The array will balance the cache again. I could tell it was done because writes stopped happening to the drive I removed from the cache. Also, the text "a btrfs operation is in progress" appeared at the bottom of the main tab by the stop array button.

4) When the text was gone, I formatted the spare cache disk through Unassigned Devices (you must have Unassigned Devices Plus installed).

5) Use the console to rsync the data from the main cache to the spare cache drive, e.g.

rsync -avrth --progress /mnt/cache/ /mnt/disks/second_ssd/

6) Once the copy is done (it's worth verifying it first; see the check after step 8), stop the array again and format the remaining cache drive as xfs. Note: you must change the number of available cache drives to 1 for xfs to appear as a file system option.

7) Once the format is complete, restart the array and copy the data back using the same rsync command with the paths flipped.

e.g. rsync -avrth --progress /mnt/disks/second_ssd/ /mnt/cache/
(note the trailing slash on the source path, so the files land in /mnt/cache itself rather than in a second_ssd subfolder)

8) Once the copy is done, you should be all set. Restart the Docker and VM services.
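
Before formatting the original cache drive in step 6, it's worth double-checking the copy; a quick sketch using the same example paths as above:

rsync -avn --delete /mnt/cache/ /mnt/disks/second_ssd/   # dry run; should report nothing left to transfer or delete
du -sh /mnt/cache /mnt/disks/second_ssd                  # the totals should roughly match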

 

If there is an easier way, I couldn't find it in the forums, but I'm glad to have this working now. I may have skipped a step above as I did this from memory, but that is the gist. I am not pleased that I no longer have 1TB of cache from two pooled 500GB SSDs, but I'd rather it function properly than not at all.

13 minutes ago, kernelpanic said:

 

I converted my cache (had to break the pool) to xfs and the issue seems to be resolved. [...] I'd rather it function properly than not at all.

Actually, as I typed this I queued up some huge downloads. They all finished around the same time and the server took a shit. So, it's not fixed, but it is definitely better than it was. FWIW, I have two SSDs; one is a Samsung and one is not. I think I'll move it all to the non-Samsung SSD tomorrow and try again.

 

Edit: Take that back, that was my machine kernel panicking again. I still stand by the xfs conversion being a good call.

Edited by kernelpanic
1 hour ago, CowboyRedBeard said:

You can't have redundant cache (pool) with XFS...

https://wiki.unraid.net/UnRAID_6/Storage_Management#Switching_the_cache_to_pool_mode

 

That's a problem... I suppose this is something I could try, but I don't see the lack of a pool as a viable option going forward.

Yes, exactly. There is no redundancy with xfs as far as I can tell. I back up appdata to the share every night, so if the SSD dies I'll have a backup to some degree. I would like to see the btrfs issue fixed at some point, though.
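
The nightly backup itself is nothing fancy; something along these lines from a scheduled script, although a backup plugin can do the same job (paths and schedule are just examples):

rsync -a --delete /mnt/cache/appdata/ /mnt/disk1/backups/appdata/
# the important part is that the destination lives on the array, not on the cache SSD itself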

4 hours ago, CowboyRedBeard said:

You can't have redundant cache (pool) with XFS...

https://wiki.unraid.net/UnRAID_6/Storage_Management#Switching_the_cache_to_pool_mode

 

That's a problem... I suppose this is something I could try, but I don't see the lack of a pool as a viable option going forward.

What data are you storing in cache?

 

Given that SSDs are more reliable (than HDDs), and even when they fail they tend to fail gracefully (i.e. giving you time to respond and replace), having a backup can be more useful than mirror redundancy. In fact, running RAID-1 doesn't mean you don't want a backup.

 

And separating write-heavy and read-heavy data onto two different SSDs will help improve the lifespan of both by reducing wear and write amplification.

 

Not saying that redundancy isn't useful, but it sounds to me like you might be over-valuing it and dismissing the alternatives.


I can see that angle... but SSDs are cheap these days too.

 

I'm running SABnzbd and a few other operations on it, plus a few VMs.

 

I guess I could migrate it to XFS to test and see if that fixes it... but honestly I'd prefer the BTRFS / cache issue fixed if that's what's going on here. What's the best way to do that? Copy everything off and then start in maintenance mode and then copy back?

 

Has anyone from Limetech looked into this situation at all, since it seems I'm not the only one?

Edited by CowboyRedBeard