6.8.3 Disk writes causing high CPU

  • Author
root@Tower:~# ls -lah /mnt/cache/system/docker
total 0
drwxrwxrwx 1 root   root   0 Mar 26 09:00 ./
drwxrwxrwx 1 nobody users 26 Feb  6  2019 ../
root@Tower:~# ls -lah /mnt/cache/system/libvirt
total 105M
drwxrwxrwx 1 root   root    22 Feb  6  2019 ./
drwxrwxrwx 1 nobody users   26 Feb  6  2019 ../
-rw-rw-rw- 1 nobody users 1.0G Mar 26 05:30 libvirt.img
root@Tower:~#

 

  • Community Expert

OK, that looks like the system folder on disk1 is obsolete.

 

From the command line again

cd /mnt/disk1
rm -r system
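
Before running the rm, it may be worth confirming that nothing under the disk1 copy is missing from the cache copy. The following is only a sketch: the helper name only_in_first is made up for illustration, and the paths in the usage comment are the ones shown earlier in this thread.

```shell
# Hypothetical helper: print files that exist under the first directory
# but have no counterpart (same relative path) under the second.
only_in_first() {
    # $1 = directory to check, $2 = directory to compare against
    ( cd "$1" && find . -type f ) | sort | while IFS= read -r f; do
        [ -e "$2/$f" ] || printf '%s\n' "$f"
    done
}

# Usage (adjust paths to your system):
# only_in_first /mnt/disk1/system /mnt/cache/system
```

If this prints nothing, every file on the disk1 side also exists on the cache side. It compares names only, not contents or timestamps.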

 

  • Community Expert

And post new diagnostics

  • Community Expert

Looks good. None of the system share is on the array.

 

Go to Settings - Docker and change your docker image size to 20G then enable dockers. You can add your dockers back just as they were using the Previous Apps feature on the Apps page.

 

But, just as they were may not be good enough, since you had already filled 20G before.

 

How many dockers do you normally have installed?

 

 

  • Author

I had a good dozen. Will they get all their config data back from /appdata?

Do I have to reconfigure all the parameters, like mappings and stuff? Is there somewhere that still has them that I can copy from?

  • Community Expert
1 minute ago, CowboyRedBeard said:

I had a good dozen. Will they get all their config data back from /appdata?

Do I have to reconfigure all the parameters, like mappings and stuff? Is there somewhere that still has them that I can copy from?

 

5 minutes ago, trurl said:

You can add your dockers back just as they were using the Previous Apps feature on the Apps page.

See this post I made yesterday for more explanation if you want to understand how this works:

 

https://forums.unraid.net/topic/90146-docker-fqdn/?do=findComment&comment=836919

 

  • Author

OK, putting it all back together and will report. Is there a way to tell which of the apps was last installed? I have a few of them that I've installed from different sources, one worked well and the other didn't. So whatever one was last installed was the one I had working.

  • Author

I did move all the sab operations to a dedicated share that is set to cache "only"... it's still giving me high IO wait.

 

image.thumb.png.b5f4191c725e00f812a8a53b69465b86.png

 

  • Author

Which, this is one of the cache drives...

 

image.thumb.png.20084b094f74dab951639296e96edb65.png

  • Community Expert
2 hours ago, CowboyRedBeard said:

Is there a way to tell which of the apps was last installed?

You might get some clue by checking the timestamps of the templates stored on flash in that folder I mentioned in the link.
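
A rough way to do that from the command line is sketched below. The default directory is an assumption on my part (the usual location of user templates on the flash drive); it is not stated in this thread, so pass your own path if it differs. The function name is made up for illustration.

```shell
# Print the most recently modified file in a template directory.
# ASSUMPTION: the default path below is not taken from this thread.
newest_template() {
    dir="${1:-/boot/config/plugins/dockerMan/templates-user}"
    # ls -t sorts by modification time, newest first.
    ls -t "$dir" | head -n 1
}

# For a full listing with timestamps, newest first:
# ls -lt /path/to/templates
```

Keep in mind that the modification time shows when a template was last saved, which is only a clue about which app was installed last.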

 

 

  • Author

OK, I'll take a look. Most of those are non-essential anyway.

 

Are there any further ideas on why I'm still getting the high IO? Latest diag attached.

6th-tower-diagnostics-20200326.zip
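
To put a number on the IO wait instead of reading it off a graph, one option is to sample the aggregate cpu line in /proc/stat twice and compare. This is a minimal sketch; the function names are made up for illustration, and the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the standard Linux layout.

```shell
# Percentage of CPU time spent in iowait between two samples of the
# aggregate "cpu" line from /proc/stat. Guest fields are ignored because
# they are already counted in user/nice.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total[NR] = 0; for (i = 2; i <= 9; i++) total[NR] += $i; iow[NR] = $6 }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (iow[2] - iow[1]) / dt
        }'
}

# Take two samples N seconds apart (default 5) and print the percentage.
sample_iowait() {
    a=$(grep '^cpu ' /proc/stat)
    sleep "${1:-5}"
    b=$(grep '^cpu ' /proc/stat)
    iowait_pct "$a" "$b"
}
```

Running sample_iowait 10 once while sab is downloading and once while the box is idle gives two figures that can be compared directly.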

  • Author

I've gotten everything back to normal, but still have the high IO situation when downloading with sab

 

??

  • Community Expert

See what happens if you make sab download to a disk in the array.
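
A quicker, synthetic comparison is also possible with dd. Note that this is a different test from the one suggested above: it measures plain sequential writes, not sab's workload. It is a sketch only; the function name is made up, and /mnt/cache and /mnt/disk1 are the mount points that appear earlier in this thread.

```shell
# Write size_mb MiB of zeros into target_dir, flush to disk, print dd's
# summary line (bytes copied, elapsed time, throughput), then clean up.
write_test() {
    target_dir="$1"
    size_mb="${2:-1024}"
    f="$target_dir/ddtest.$$"
    # conv=fdatasync makes dd wait until the data is actually on disk.
    dd if=/dev/zero of="$f" bs=1M count="$size_mb" conv=fdatasync 2>&1 | tail -n 1
    rm -f "$f"
}

# Usage (adjust paths to your system):
# write_test /mnt/cache 2048
# write_test /mnt/disk1 2048
```

Writes to an array disk also update parity, so they are expected to be slower than writes to an SSD cache pool.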

  • Author

OK, so this is what I did. I created a share "sabtest" that was cache "no" and then copied the file structure of the cache-only share "sab" to it.

This was the performance during copying from cache to the array:
image.thumb.png.1cb74444aad42a37b511d91a5617b254.png
 

 

That was 20G of files, from unfinished downloads.

 

 

This was the performance while sab was running against this array-only share:

This is during download. Notably, it's now downloading at about 40% of the previous speed. And while the overall impairment of the system's performance and the responsiveness of other services wasn't as bad, it's still sluggish.
image.thumb.png.a15243cb8e8d43a6ceb5bc4551cc3c1a.png

 

Interestingly enough, after the download has "finished" sab seems to hang up for a few minutes, before unpacking
image.thumb.png.fd92c886f4901044ba7c031edba40038.png

 

And then this is what it looks like during unpacking:

image.thumb.png.a049da62108ba94dad500b8ed09471a0.png

 

  • Author

And this is copying from the array only sab test share to my media share, which is cache "yes"

 

So it's definitely something to do with the cache drives, which are new... and the previous ones did the same. This is not likely to be hardware I'm guessing, so I'm not sure where to look next. Thoughts?

 

image.thumb.png.579b93ecd17e57b9826fdf86badf1914.png

I have almost exactly the same server (128 GB RAM, dual Xeon 2690, 2 x Samsung 860 QVO 1 TB) and the same issue.
Sounds to me as if it is related to this topic:
 

 

  • Author
9 hours ago, ephigenie said:

I have almost exactly the same server (128 GB RAM, dual Xeon 2690, 2 x Samsung 860 QVO 1 TB) and the same issue.
Sounds to me as if it is related to this topic:
 

 

I have only skimmed through all that... but the issue started before I used the Samsung drives. I had Crucial drives before switching to the Samsung ones, and the problem occurred then too.

 

This box was running fine for over a year, it started I think after an upgrade. I've only noticed it recently as I had most of these sort of operations happening in the dead of night. Is this some sort of formatting error?

Edited by CowboyRedBeard

  • Author

So reading through this some more, it does sound like what I'm experiencing... although I was experiencing it with the Crucial drives also.

 

This stuff is out of my realm of understanding, should I try a different file system than btrfs?

And again: I reformatted the cache disks, put them into a raid0, ran balance, ran fstrim -av, etc.

 

Performance is an absolute disaster when the mover is active. Docker containers die, VMs become unusable, etc.

This is a serious BUG!

 

By the way, the write/read speed during those times is around 15 MB/s per SSD, and 50 MB/s read + write for the full array (13 disks).

The mover has been running for 10+ hours.

 

1220161152_Screenshot2020-03-3113_00_23.png.95a2fa7504e898b25e8bfd64ba286945.png

  • Community Expert
5 hours ago, ephigenie said:

The mover has been running for 10+ hours.

I agree you should get better performance, but you might want to reconsider how you are using cache. Mover works best during idle time, and you don't have to cache everything.

 

My cache is for dockers and VMs, and for DVR since there is some performance advantage when playing and recording at the same time. Most of the writes to my server are scheduled backups and queued downloads, so I don't care if they are a little slower since I am not waiting on them. They all go directly to the array where they don't need to be moved, and where they are already protected.

 

Others will have different use cases of course, but think about it. Some people just cache everything all the time without thinking about it.

  • Author

I use cache for downloads because sometimes I want to put media onto Plex and watch right after. MUCH less wait time.

 

I've also found a benefit when copying lots of files to a share: using cache makes that take less time.

 

Like you said, everyone's use case is different. But I'd like my cache to work like it used to for sure.

 

It seems like ephigenie and others are seeing the same thing I am. Which for over a year I did not see. And, with two different sets of cache drives from different manufacturers it doesn't appear to be a hardware issue on the surface of it.

 

I used to get write speeds to cache of 500 MB/s+.

 

What are some good next steps for troubleshooting the issue?

Edited by CowboyRedBeard

23 hours ago, trurl said:

I agree you should get better performance, but you might want to reconsider how you are using cache. Mover works best during idle time, and you don't have to cache everything.

 

My cache is for dockers and VMs, and for DVR since there is some performance advantage when playing and recording at the same time. Most of the writes to my server are scheduled backups and queued downloads, so I don't care if they are a little slower since I am not waiting on them. They all go directly to the array where they don't need to be moved, and where they are already protected.

 

Others will have different use cases of course, but think about it. Some people just cache everything all the time without thinking about it.

 

Well, I have a separate machine with local SSD storage and capacity drives as a download client.

This is in order to separate the IO a bit and keep the main machine free for other tasks.

 

However, this means that sometimes a few hundred GB are copied over to the cache. As long as the mover is not running, access speeds are fine, the UI reacts properly, etc. I am also hosting a bunch of containers (from cache only) on both sides (on the download and the main machine).


However, while the single SSD on the download machine (which has much lower specs) can easily cope with parallel IO (running ext4), the big Unraid box is struggling badly.


Now I want to find out why and remove this problem. I know my way around strace etc., and next time I will do a bit more investigation to see what is really going on. However, the hint so far from this forum is the partition start for Samsung SSDs, which should not be at sector 64 but at 2048.

 

 

 

  • Author

Please post what you find, I use mine as a "one box to do it all" so this is killing me. And it's not normal operation to have this much IO wait, as it hadn't been the case in the past.

 

I assume the following article is what we're looking at:

https://linux-blog.anracom.com/2018/12/03/linux-ssd-partition-alignment-problems-with-external-usb-to-sata-controllers-i/

 

This is mine:
 

root@Tower:~# lsblk -o  NAME,ALIGNMENT,MIN-IO,OPT-IO,PHY-SEC,LOG-SEC  /dev/sdc
NAME   ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC
sdc            0    512      0     512     512
└─sdc1         0    512      0     512     512
root@Tower:~# lsblk -o  NAME,ALIGNMENT,MIN-IO,OPT-IO,PHY-SEC,LOG-SEC  /dev/sdb
NAME   ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC
sdb            0    512      0     512     512
└─sdb1         0    512      0     512     512
root@Tower:~# 
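
Note that the ALIGNMENT column above is the alignment offset reported by the kernel; it does not show the partition's start sector. The start sector can be read from sysfs, where it is given in 512-byte units, and checked against a boundary. A minimal sketch follows; the function names are made up for illustration, and the 1 MiB default boundary corresponds to the sector 2048 start mentioned above.

```shell
# Succeed if a partition starting at sector $1 begins on a multiple of
# $3 bytes. $2 is the sector size in bytes (default 512, the unit used
# by sysfs); $3 defaults to 1 MiB (sector 2048 with 512-byte sectors).
is_aligned() {
    start="$1"
    sector="${2:-512}"
    boundary="${3:-1048576}"
    [ $(( start * sector % boundary )) -eq 0 ]
}

# Read a partition's start sector from sysfs, e.g. part_start sdc sdc1
part_start() {
    cat "/sys/block/$1/$2/start"
}

# Usage (device names from the output above):
# is_aligned "$(part_start sdc sdc1)" && echo aligned || echo "not 1 MiB aligned"
```

A start at sector 64 is 32 KiB into the disk, so it passes a 4 KiB check (is_aligned 64 512 4096) but fails the 1 MiB one. Whether that difference matters for a given SSD is a separate question.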

 

This is more than I'm comfortable goofing around with on my own. Hoping to get some guidance here.

 

Thanks

Edited by CowboyRedBeard

The problem is probably the Samsung QVO SSDs, which have poor write performance (among the worst SSDs).

 

inTcFx7iNPDHYY38ZMbGpJ.png
