Posts posted by cferrero

  1. On 5/16/2018 at 7:56 AM, CHBMB said:
    On 5/15/2018 at 6:08 PM, cferrero said:
    From what I could find, the issue with bandwidth limits is caused by a buggy libtorrent library (v1.1.5). Linuxserver.io rebased the image onto Alpine Edge, which pulls that version, and it doesn't work with Deluge 1.3.15.
     

    Got some links to that please?

     

    Sadly no. I was searching about the bandwidth issue (limits being ignored) and what I found were several comments saying it was a libtorrent library issue; then I found that comment about the rebase (I can't remember where, it was just that one line). I checked the update logs and tested the previous build plus another image based on Arch Linux, and both obey the global limit. BUT then I found that something is still off with the upload limit: say my line's upload is 20+ Mb/s and I set the global limit to 5, with 4 torrents seeding. The individual upload of each torrent jumps around every few seconds, ranging from 0 to a few hundred KB/s, and the total (0-2 Mb/s) stays pretty far from the limit (5 Mb).

    If I remove the limit, the upload quickly goes to 15+ Mb/s. If I set a per-torrent limit (1 Mb/s) instead, I can see the upload of each one close to that limit, 800-900 Kb/s. Right now I'm running the latest version (linuxserver) with per-torrent limits.
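    For anyone who wants to compare, this is roughly how I'd read back what the daemon actually has for those limits. Just a sketch: the container name "deluge" and working in-container auth are assumptions, and deluge-console values are in KiB/s (-1 = unlimited).

    # read the limits the daemon is actually using (KiB/s)
    docker exec deluge deluge-console "config max_upload_speed"
    docker exec deluge deluge-console "config max_upload_speed_per_torrent"
    # set the global limit explicitly, e.g. ~5 MB/s
    docker exec deluge deluge-console "config -s max_upload_speed 5120"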

  2. 56 minutes ago, limetech said:

    Please test out 6.5.0-rc1.  We didn't find any obvious memory leak in the code.  But here are some observations.

     

    Any time you reference files in the user share file system, FUSE allocates a structure called a node and can also allocate some heap memory using malloc().  The node contains info that describes the file and serves as a kind of interface between the FUSE kernel layer and the user space layer.  These nodes contain a reference count which is incremented, e.g., when a file is opened.  Normally these nodes are deallocated (along with any heap memory free()'d) after returning request information to the client and/or after the file is closed.  There is an internal FUSE "clean" thread which wakes up every 10 seconds and does the actual deallocation.  HOWEVER, the internal FUSE "inode number" of these nodes is what's used as the NFS file handle when accessing a user share via NFS.  For this reason we cannot permit these nodes to expire so quickly, or else NFS clients will get "stale file handle".  This is the purpose of the "fuse_remember" tunable on the NFS Settings page.  The value there of 330 (when NFS is enabled) tells FUSE to keep these things in memory for a minimum of 5 1/2 minutes.  This was chosen because the typical NFS client-side handle cache is 5 minutes.

     

    So... If you have an application which is constantly referencing a huge number of files/directories within /mnt/user, the 'shfs' memory footprint is going to grow, seemingly unbounded, especially if you have NFS enabled and fuse_remember is set to 330.

     

    In our testing we have noticed that we can do something like:

     

    find /mnt/user/  >/dev/null

     

    And watch shfs memory usage grow via htop.  Finally when the 'find' command exits it appears that 'shfs' memory is never deallocated.  But this is not actually the case.  If you type this:

     

    echo 3 >/proc/sys/vm/drop_caches

     

    It tells Linux to forcibly mark "available" pages as "free" pages.  Normally Linux would do this on-demand.

     

    Bottom line: we continue to work on this issue...

     

    I will try to do some tests in the next few days / over the weekend and get the debugging log (I couldn't this past week). A few tidbits:

     

    - I found out about the issue because the server couldn't free available RAM (even though it was marked as available) on demand: I got an error allocating 2GB for a virtual machine on an 8GB server with just 3 containers started (emby, letsencrypt and transmission). Also, if the VM was started (after a reboot) and the leak then kicked in, I was getting bizarre behavior in the VM (maybe the VM's RAM being overwritten, maybe a series of strange coincidences).

     

    - In my test, just 4-10 files seeding was enough to see the growth (I didn't try with fewer), with only transmission working on its share (no other plugin/docker). So, not a huge number of files/directories.

     

    - I'm guessing NFS was just a reference example, but just in case: I have NFS disabled.

     

    - The command I was using to check is "ps -e -orss=,args= | sort -b -k1,1n"
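    In case it helps anyone reproduce, the same check narrowed to just shfs and refreshed every 30 seconds (a sketch; assumes the usual procps ps and watch are available on the host):

    # only the shfs instances, RSS in KiB, refreshed every 30s
    watch -n 30 "ps -C shfs -o rss=,args="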

  3. 2 minutes ago, mudsloth said:

     

    I haven't had a good chance to do troubleshooting yet. I'm also experiencing this issue, but it seems to be caused by the binhex-SABnzbd docker. I don't have Transmission or Deluge installed at this time, so it seems like more of a general symptom of heavy writing done by any docker.

     

    Well, not just any heavy-writing docker: I have deluge up on two servers, 5-6 days of uptime, downloading new files each day, with no issue. But two different containers triggering it, that's something interesting.

    6 minutes ago, ffiarpg said:

    I've been in the process of switching from transmission to qbittorrent over the last few months (the transmission docker doesn't fully utilize gigabit for some reason) and finally turned off transmission completely. The memory still increases but much slower (instead of several % of my 16GB per day it is 0.1% every few days). I don't have hundreds of torrents in qbittorrent yet, but it pushes out my required restarts from twice a week to likely every few months at my current rate. Definitely some issue with shfs, and the more dockers we find that trigger it, the easier it should be for devs to isolate the problem. Would be nice to get a developer response at some point. It would be one thing if this was open source, but this huge issue being in a paid product with no response for weeks is pretty terrible IMO.

     

    The assigned RAM can and will go up; the normal behavior is to drop it when it's no longer needed. No developer will look at just "something is wrong". Luckily I only needed a few days to isolate the issue on my systems, and now mudsloth has added a new piece of information: it's not just transmission. Now there is something that can be repeated and some clues that should let people look and probably find something.

  4. 15 hours ago, BRiT said:

    @cferrero make sure to edit your post in this thread to either change or remove the rpc-password field, just in case someone can get to your server address. I was unaware it would have included that field.

     

    I didn't check either; it was a clean test install with auth disabled and there is no outside access, but I edited the post and removed it just in case.

     

    15 hours ago, BRiT said:

     

    Now as to what preallocation is configured as ...

    0 - None - No preallocation, just let the file grow whenever a new packet comes in

    1 - Sparse - Preallocate by writing just the final block in the file

    2 - Full - Preallocate by writing zeroes to the entire file

     

    A method of Sparse should be fine, however I have mine set to "2". I would try setting it to "2" and do a restart to set everything to a clean slate and see where it goes from there.

     

    I will test it after a reboot just to be sure, but it also happened while just seeding files.
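    For reference, this is roughly how I'd change it (a sketch; the appdata path and container name are assumptions, adjust to your own template):

    docker stop transmission
    # change "preallocation": 1  ->  "preallocation": 2  (full preallocation)
    nano /mnt/user/appdata/transmission/settings.json
    docker start transmission
    # note: edit settings.json only while the daemon is stopped, transmission rewrites it on exit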

     

    15 hours ago, BRiT said:

    For reference from a 6.3.5 system uptime of 78 days:

     

    
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    root     10342  0.0  0.0 153296   596 ?        Ssl   2017   0:00 /usr/local/sbin/shfs /mnt/user0 -disks 14 -o noatime,big_writes,allow_other
    root     10352  0.1  0.0 1514560 19240 ?       Ssl   2017 157:46 /usr/local/sbin/shfs /mnt/user -disks 15 2048000000 -o noatime,big_writes,allow_other -o remember=0

     

    I think 6.3.5 is free of this, I didn't notice any problems. But for reference, in my test, an example would be:

    around 500 KB in standby, before starting transmission

    around 300 MB after 4 hours with 5-8 torrents (test)

    around 5 GB after 4 days (observed)

     

    The main server was using 10 GB after 9 days of uptime...
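    To put numbers on that over days, something like this run from cron (e.g. every 10 minutes) should be enough to capture the growth curve for the debugging log (a sketch; the log path is just an example and it assumes procps ps):

    # append a timestamped shfs memory sample (RSS in KiB per instance)
    echo "$(date '+%F %T') $(ps -C shfs -o rss=,args= | tr '\n' ';')" >> /tmp/shfs_rss.log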

     

     

     

    I don't think so; the issue persists after transmission is killed, and it also happens while only seeding files. I think it was causing the bizarre behavior on my openhab VM, which now looks like it's working fine.

     

    the settings are:

    {
        "alt-speed-down": 50,
        "alt-speed-enabled": false,
        "alt-speed-time-begin": 540,
        "alt-speed-time-day": 127,
        "alt-speed-time-enabled": false,
        "alt-speed-time-end": 1020,
        "alt-speed-up": 50,
        "bind-address-ipv4": "0.0.0.0",
        "bind-address-ipv6": "::",
        "blocklist-enabled": false,
        "blocklist-url": "http://www.example.com/blocklist",
        "cache-size-mb": 4,
        "dht-enabled": true,
        "download-dir": "/downloads/complete",
        "download-queue-enabled": true,
        "download-queue-size": 5,
        "encryption": 1,
        "idle-seeding-limit": 30,
        "idle-seeding-limit-enabled": false,
        "incomplete-dir": "/downloads/incomplete",
        "incomplete-dir-enabled": true,
        "lpd-enabled": false,
        "message-level": 2,
        "peer-congestion-algorithm": "",
        "peer-id-ttl-hours": 6,
        "peer-limit-global": 200,
        "peer-limit-per-torrent": 50,
        "peer-port": 51413,
        "peer-port-random-high": 65535,
        "peer-port-random-low": 49152,
        "peer-port-random-on-start": false,
        "peer-socket-tos": "default",
        "pex-enabled": true,
        "port-forwarding-enabled": true,
        "preallocation": 1,
        "prefetch-enabled": true,
        "queue-stalled-enabled": true,
        "queue-stalled-minutes": 30,
        "ratio-limit": 3,
        "ratio-limit-enabled": true,
        "rename-partial-files": true,
        "rpc-authentication-required": false,
        "rpc-bind-address": "0.0.0.0",
        "rpc-enabled": true,
        "rpc-host-whitelist": "",
        "rpc-host-whitelist-enabled": true,
        "rpc-password": "{1ddd3f1f6a71d655cde7767242a23a575b44c909n5YuRT.f",
        "rpc-port": 9091,
        "rpc-url": "/transmission/",
        "rpc-username": "",
        "rpc-whitelist": "127.0.0.1",
        "rpc-whitelist-enabled": false,
        "scrape-paused-torrents-enabled": true,
        "script-torrent-done-enabled": false,
        "script-torrent-done-filename": "",
        "seed-queue-enabled": false,
        "seed-queue-size": 10,
        "speed-limit-down": 100,
        "speed-limit-down-enabled": false,
        "speed-limit-up": 100,
        "speed-limit-up-enabled": false,
        "start-added-torrents": true,
        "trash-original-torrent-files": false,
        "umask": 2,
        "upload-slots-per-torrent": 14,
        "utp-enabled": true,
        "watch-dir": "/watch",
        "watch-dir-enabled": true
    }
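
    If anyone wants to compare against their own config, the write-related settings can be pulled out with a quick grep (a sketch; the appdata path is an assumption):

    grep -E '"(preallocation|cache-size-mb|incomplete-dir|umask|download-dir)"' /mnt/user/appdata/transmission/settings.json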
    

     

  6. Hi.

     

    I'm having some issues with this container, not sure if someone else has noticed, but the transmission container looks like it is triggering a memory leak in the shfs process. I'm not having this issue with other containers (I will keep testing and monitoring). Quoting myself:

     

     
    Quote

     

    I have been able to reproduce this with just the transmission container, the VM engine disabled and no plugins; the docker image is loaded from an SSD outside of the array, manually mounted before array start.

     

    The test was:

    All plugins removed, VM engine disabled, server rebooted to clear shfs from the other tests, SSD manually mounted, array started, transmission started, a few torrents seeding and/or downloading. In just a few minutes it's clear that shfs memory is growing fast, but to be sure I waited 2+ hours and checked again to see the RAM over 200 MB and not getting lower even after stopping transmission.

     

    The exact same test with deluge instead of transmission never went over 30 MB after 15 hours and loads of torrents.

     

    So, to recap: the (leak?) looks to be triggered by the transmission container (linuxserver.io version) and needs a reboot to clear it (if the container is stopped and deluge is started, the RAM usage continues to grow).

     

    What the exact problem is, I have no idea at the moment; as Jeronyson noted, it's an issue not present on unRAID 6.3.5.

     

     

     

     

    I have been able to reproduce this with just the transmission container, the VM engine disabled and no plugins; the docker image is loaded from an SSD outside of the array, manually mounted before array start.

     

    The test was:

    All plugins removed, VM engine disabled, server rebooted to clear shfs from the other tests, SSD manually mounted, array started, transmission started, a few torrents seeding and/or downloading. In just a few minutes it's clear that shfs memory is growing fast, but to be sure I waited 2+ hours and checked again to see the RAM over 200 MB and not getting lower even after stopping transmission.

     

    The exact same test with deluge instead of transmission never went over 30 MB after 15 hours and loads of torrents.

     

    So, to recap: the (leak?) looks to be triggered by the transmission container (linuxserver.io version) and needs a reboot to clear it (if the container is stopped and deluge is started, the RAM usage continues to grow).

     

    What the exact problem is, I have no idea at the moment; as Jeronyson noted, it's an issue not present on unRAID 6.3.5.

     

    Now that I have isolated the issue, I will validate it on the main server, switching transmission to deluge while I think about what more tests to do.
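    If anyone wants to check the "not released after stopping the container" part on their own box, a minimal sketch (assumes procps ps; adjust the container name):

    ps -C shfs -o rss=,args= > /tmp/shfs_before.txt    # RSS in KiB per shfs instance
    docker stop transmission
    sleep 120                                          # give it a couple of minutes to settle
    ps -C shfs -o rss=,args= > /tmp/shfs_after.txt
    diff /tmp/shfs_before.txt /tmp/shfs_after.txt      # in my case the numbers barely move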

I will post this here too, but I think it's better to continue in the other thread.

     

    A few more tests after …

     

    I have been able to reproduce this with just the transmission container, the VM engine disabled and no plugins; the docker image is loaded from an SSD outside of the array, manually mounted before array start.

     

    The test was:

    All plugins removed, VM engine disabled, server rebooted to clear shfs from the other tests, SSD manually mounted, array started, transmission started, a few torrents seeding and/or downloading. In just a few minutes it's clear that shfs memory is growing fast, but to be sure I waited 2+ hours and checked again to see the RAM over 200 MB and not getting lower even after stopping transmission.

     

    The exact same test with deluge instead of transmission never went over 30 MB after 15 hours and loads of torrents.

     

    So, to recap: the (leak?) looks to be triggered by the transmission container (linuxserver.io version) and needs a reboot to clear it (if the container is stopped and deluge is started, the RAM usage continues to grow).

     

    What the exact problem is, I have no idea at the moment; as Jeronyson noted, it's an issue not present on unRAID 6.3.5.

     

    Now that I have isolated the issue, I will validate it on the main server, switching transmission to deluge while I think about what more tests to do.

  9. I will add some information.

     

    On the secondary system, the one I'm using to test, I had cache_dirs installed but disabled. I removed 4 or 5 plugins and left it all night with the deluge container and 3 or 4 torrents active; the usage went up at more or less the same rate. So, adding this to the transmission behavior (transmission not guilty :)), it looks like the issue is with heavy file access: it just goes up while a torrent is active and stops when it's finished and not sharing. I didn't reboot the server (big mistake); I will do it now and test again. I will also test something that accesses files over the server with all plugins and dockers removed, and probably the same transmission test avoiding user shares (sketched below).
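    For the "avoiding user shares" test, the idea is just to point the container at a disk or cache path instead of the FUSE mount, so shfs never sees the traffic. A sketch (paths, names and ports are assumptions, adjust to your setup):

    # instead of the user-share mapping that goes through shfs/FUSE:
    #   -v /mnt/user/downloads:/downloads
    # map a disk or cache path directly, which bypasses shfs:
    docker run -d --name=transmission-test \
      -p 9091:9091 \
      -v /mnt/disk1/downloads:/downloads \
      -v /mnt/cache/appdata/transmission:/config \
      linuxserver/transmission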

     

    I did find out about this issue when I was trying to power up a VM and the server was out of memory to do it; in fact the bizarre behaviour I was having with that VM was the only reason I was looking, since in normal conditions that VM is on autostart. Then, just to compare, I checked the main server and there it was, 11 GB used at that moment. But I hadn't noticed problems on that server (dockers and VM on autostart, no reason to check). So it's possible this is flying under the radar for a lot of people; add to that, without heavy usage on a user share (e.g. torrents), probably no one will notice for a long time without reboots.

     

     

     

  10. Hi,

     

    I'm having the same issue on two different servers (shfs memory usage increases and increases).

     

    After a little initial research, the main suspect is the transmission container (linuxserver.io version): when it's started, the shfs memory usage increases slowly and steadily; if I stop it, the usage only moves a little. The servers' hardware and configuration are different, and only a few plugins and the transmission container are common to both. I'm planning on checking another bt client and ruling out the common plugins.

     

    At least that's my case at the moment.