High Loop2 Reads


Andiroo2


I’m having a new issue where my Loop2 process is reading insane amounts of data (350-400 MB/s) from my cache (2x 500GB WD Blue SSDs in BTRFS RAID1) when I’m watching 4K videos in the Plex Docker.  The bursts last 5-10 minutes or so.  I know it’s Loop2 because it shows up in iotop as the top process when it’s happening, and the read rate matches the Unraid interface.  In this case, the Plex movies are on the cache, and my docker and all system/domains files are on the cache too.  I have an E5-2699v4 CPU (42 threads) with 50% dedicated to Plex.  All Plex metadata is on the cache as well.

 

Now, I’m direct playing the videos in Plex Docker, so I’m not transcoding there.  I don’t know that it’s related to the 4K movies playing, but that’s when I notice it (it eventually chokes the docker service and Plex stops for everyone watching).

 

So, to troubleshoot this from the Unraid perspective, how can I see into the Loop2 process to understand what those reads really are?
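
Not an Unraid-specific answer, but a few stock Linux/Docker commands can help answer that question. This is a minimal sketch assuming a root shell on the server; the backing file that losetup reports is normally docker.img on Unraid, but confirm on your own box:

```shell
# Diagnostic helpers for a busy /dev/loop2 (run as root).
inspect_loop2() {
  # Which file backs /dev/loop2? On Unraid this is usually docker.img.
  losetup /dev/loop2

  # Accumulate per-process I/O in batch mode for ~15s;
  # the heavy reader rises to the top of the list.
  iotop -aoqq -b -n 3 -d 5

  # Per-container block I/O totals since each container started.
  docker stats --no-stream --format 'table {{.Name}}\t{{.BlockIO}}'
}
```

Once a single container's BlockIO read counter is visibly climbing in step with the loop2 reads, you have your culprit.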

 

EDIT:  This doesn’t happen constantly, only in these bursts, which is why I’m not yet sure that it’s related to the Plex activity. 

 

Thanks!

Edited by Andiroo2
  • 4 weeks later...
  • 3 weeks later...

I'm having the same issue as well.  I was researching it and posted to the topic linked below.  I'm going to remove that docker and try a different Plex Docker image.

I added the repository, which I think is directly from Plex (https://github.com/plexinc/pms-docker).

 

I originally thought it was a btrfs issue with the cache, but it doesn't seem to be the problem... ?

 

Any solutions on your end?

https://forums.unraid.net/topic/98114-new-cache-pool-ssd-default-format-to-btrfs-i-want-xfs/?tab=comments#comment-929089


I'm pretty certain Plex isn't the cause in my issue linked above. I re-created my docker.img btrfs with no effect.

 

/dev/loop2 is the docker image that lives on the cache drive (usually). So it's not just used for Plex - but for any/all dockers you may have running.

 

Note that the issue we are seeing is lots of *reads* (this thread and mine linked above), not writes on the /dev/loop2 device.

  • 2 months later...

I'm seeing the same behaviour: the docker loop device is showing a very high amount of reads. The system load climbs to a ridiculous level, CPU usage shows as 100%, and the server becomes sluggish and unresponsive; things start crashing and misbehaving.

I've moved 'docker.img' to a separate SSD from 'app_data', but that hasn't helped.

Does anyone have a resolution?


My original issue was related to Plex RAM transcoding set up incorrectly. I was transcoding to /tmp, which allowed the Plex docker to eventually use all the available RAM in the system without properly freeing it up when it ran out. 
 

I changed to a fixed RAM disk instead, set to 12GB, and the system has worked perfectly ever since. When it approaches 12GB used for transcoding, it properly frees up RAM on the RAM disk and transcoding keeps on moving without impacting the rest of the server. 
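
For anyone wanting to replicate this, here is a sketch of what a fixed-size RAM disk looks like. The 12g size matches the post above, but the mount point and the idea of mapping it into the Plex template are my assumptions, not the poster's exact setup:

```shell
# Sketch: fixed-size tmpfs for Plex transcoding (path is an example).
# Map this host path to the container's transcode directory in the
# Plex docker template; tmpfs caps itself at 12g instead of letting
# the transcoder eat all available RAM via /tmp.
mkdir -p /mnt/ram-transcode
mount -t tmpfs -o size=12g tmpfs /mnt/ram-transcode
```

On Unraid you would want this to run at array start (e.g. from the go file or a user script) since the mount does not survive a reboot.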

15 hours ago, soabwahott said:

6.8.3

I am still running 6.8.3 and have this issue, so I don't think this is related to 6.9.x.

 

I do run Plex but it's not a transcoding issue. 

 

Are you running Crashplan? I don't believe Crashplan is specifically the problem, but perhaps somehow causes it more often. At the moment my system is doing it within a few hours of starting the Crashplan docker.

  • 1 month later...

I think I'm having the same issue.  I posted previously about it (link to thread with multiple diagnostic files below).  Here's what I've found...

  • iowait causes all 4 CPUs to peg at 100%, and the system becomes mostly unresponsive
  • iotop -a shows a large amount of accumulating READS from the cache disk (at >300 MB/s), specifically loop2
  • restarting a docker container via the command line fixes the problem (for example, docker restart plex, or docker restart netdata)

I cannot figure out a pattern to when this happens.  Neither Mover nor TRIM is running, and no one is watching a Plex movie.

 

I'm on 6.8.3, all drives formatted to XFS


As per others in this thread my issue turned out to be related to filling up the RAM on my server.

 

I use Borg for backups and was running Ubuntu Server previously. Borg by default creates its cache files in the home directory. On Ubuntu Server this was fine because they sat on my main SSD, whereas on Unraid the home directory is created on the ramdisk. Once I told Borg to put the cache files on my array, the problem went away and I've had no issues since.
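
For reference, Borg honors the BORG_CACHE_DIR environment variable, so the fix can be as small as the following (the array path below is an example, not the poster's actual location):

```shell
# Relocate Borg's cache off the root ramdisk (example path on the array).
export BORG_CACHE_DIR=/mnt/user/appdata/borg-cache
mkdir -p "$BORG_CACHE_DIR" 2>/dev/null || true
```

If Borg runs inside a docker container, the equivalent is setting BORG_CACHE_DIR as a container variable, or simply mapping the container's home directory to a persistent host path.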

 

If anyone else sees similar behaviour with high reads on loop2, please check your RAM usage. If it is very high when the issue occurs, it is likely that something is filling it up, e.g. Plex transcoding, or something else writing to /tmp/ or running outside of docker/VMs.
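
A quick way to do that check when the reads spike (standard tools, nothing Unraid-specific; the /tmp sweep is just an example of where stray writes often land):

```shell
# Is RAM actually tight? MemAvailable is the number that matters.
grep -E 'MemTotal|MemAvailable' /proc/meminfo

# Unraid's rootfs is a ramdisk, so a large "Used" on / means something
# is writing where it shouldn't (e.g. /tmp or a stray home directory).
df -h / /tmp

# Largest items under /tmp, if any.
du -xsh /tmp/* 2>/dev/null | sort -h | tail
```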


I am fairly certain it wasn't RAM related for me. I have 10GB, and whilst I run quite a few dockers, I don't think I'm hitting any memory issues.

 

 

Looking at my screenshot of top here, it shows ~2GB cached, so I'm not running out of memory. If that were an issue, you should start seeing OOM logs.

 

In the general process of things I have replaced my cache drive (so a new filesystem on it, and then re-created the docker.img which is loop2, so a new filesystem there too). I also moved it from the SATA II-only internal port to a spare SATA III port I had on an add-in card. It does appear to be happening less, but I had another instance the other day, so that certainly hasn't resolved it.

 

My CPUs (4C/8T Xeon E3) weren't getting pegged, but there was a lot of iowaiting happening.


Another thing I noticed during my last incident: when I ran 'docker stats', I could see that the netdata container was marked 'unhealthy'.

 


 

I never had netdata running previously when this happened, I just turned it on recently to try and figure this issue out. So I don't think this specific docker is the cause.

 

Also, from the 5 diagnostics top files I have accumulated while this occurs:

  1. MiB Mem :   7667.6 total,    117.7 free,   6593.1 used,    956.7 buff/cache
  2. MiB Mem :   7667.6 total,    131.4 free,   6260.4 used,   1275.8 buff/cache
  3. MiB Mem :   7667.6 total,    121.9 free,   6270.2 used,   1275.5 buff/cache
  4. MiB Mem :   7667.6 total,    121.2 free,   6304.1 used,   1242.3 buff/cache
  5. MiB Mem :   7667.6 total,    117.0 free,   6156.9 used,   1393.7 buff/cache

So, similar to @Shonky in terms of being RAM related: it doesn't seem to point to that.

 

Curious how to figure out what is causing the loop2 read or if there is a workaround to restart a docker if it's marked as unhealthy?
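
On the unhealthy-container question: Docker can filter on health status, so a small helper like the one below could run from cron. This is my sketch under that assumption, not a tested Unraid solution:

```shell
# Sketch: restart any container Docker has marked unhealthy.
# Relies only on the documented `docker ps` health filter.
restart_unhealthy() {
  docker ps --filter health=unhealthy --format '{{.Names}}' |
  while read -r name; do
    docker restart "$name"
  done
}
```

Calling restart_unhealthy every few minutes would cover the "restart it before everything locks up" case, though once the system is already wedged the docker commands themselves can hang.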

2 minutes ago, DingHo said:

Curious how to figure out what is causing the loop2 read or if there is a workaround to restart a docker if it's marked as unhealthy?

I don't think it's your unhealthy container, but I have nothing to back that up.

 

If I catch mine in the early stages, things like docker ps and docker stop/start etc. work OK. Usually by the time I catch it, though, it's because something like the Plex or Pi-hole dockers are being blocked from doing anything. It's too late then, and docker commands just get blocked until it comes back, so I can't properly shut down a docker and have to kill some of the processes that docker is running.

 

Just as a workaround, I've toyed with something that monitors system load: any time it gets over, say, 20 or 30 on the 1-minute average, restart a docker, perhaps after capturing some logs. Whilst CrashPlanPRO seems to be involved, I have had occasions where restarting other very lightweight dockers was enough.
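
That idea can be sketched in a few lines; the threshold, container name, and log path here are all placeholders, not values from this thread:

```shell
#!/bin/bash
# Sketch of a load watchdog: restart one container when the 1-minute
# load average crosses a threshold. All names/values are placeholders.
THRESHOLD=20
CONTAINER=CrashPlanPRO
LOGDIR=/boot/logs

load_exceeds() {
  # $1 = load average like "25.31", $2 = integer threshold
  [ "${1%%.*}" -ge "$2" ]
}

load=$(cut -d' ' -f1 /proc/loadavg)
if load_exceeds "$load" "$THRESHOLD"; then
  mkdir -p "$LOGDIR"
  # Capture some evidence before restarting.
  docker logs --tail 200 "$CONTAINER" > "$LOGDIR/$CONTAINER-$(date +%s).log" 2>&1
  docker restart "$CONTAINER"
fi
```

Run it from cron every minute; the caveat from earlier in the thread applies, since docker commands can block once things are already wedged.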

 

Unfortunately for debugging, mine does it even less often than it used to now.


I also run 'netdata' and saw similar behaviour, either my 'netdata' docker would stop collecting data or crash completely.

 

The RAM stats posted above are similar to what my own were. I thought I didn't have a problem because I had a fair chunk of memory shown as buff/cache. However, I think because Unraid runs from a ramdisk, this isn't a true reflection of what's going on compared to a conventional Linux OS.

Posted (edited)

buff/cache is RAM the OS has allocated to buffers and caches. It's reclaimable, so it's effectively still available if something tries to allocate RAM. I doubt there is any significant difference in the buff/cache handling in unRAID; it's still pretty close to a regular Linux kernel.

 

If unRAID were filling its RAM disk (and you didn't have enough), you'd start seeing OOM errors. My system shows / as 4.8GB total with about 900MB used.

 

I don't have a netdata docker and haven't tried it either.

Edited by Shonky