
Dockers freezing, array stuck in "stopping services", unable to reboot unless forced


Recommended Posts

Hello,

 

Been trying to download a rather large amount of data via sabnzbd. The dockers freeze up, and then I'm unable to even stop the array or reboot the machine without hard powering it off and back on. Tried rebuilding the docker.img and that didn't seem to help.

 

At this point I'm at a loss whether it is sabnzbd or something else horribly wrong here. I will say sabnzbd fills up the cache (to within about 50GB) then refuses to continue writing to the array even though the share is set to "Yes: cache".

 

Any help will be appreciated.

 

TIA

 

 

argos-diagnostics-20210407-1501.zip

Link to comment

So, quick update: I was able to get it to reboot gracefully (it took about 15 minutes to shut down). Since then I restarted it with only the Plex docker; it ran a show for about 10 minutes before Plex froze.

 

I then tried to stop Plex and it is "spinning" and never actually stops.

 

I see this in the log right now:

 

Apr 7 15:37:47 Argos nginx: 2021/04/07 15:37:47 [error] 10243#10243: *3831 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.1.147, server: , request: "POST /plugins/dynamix.docker.manager/include/Events.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "argos.local", referrer: "https://argos.local/Docker"

but I think that's just my local PC timing out on the web GUI.

 

35 minutes ago, acbaldwi said:

fills up the cache (within say 50gb) then refuses to continue to write to the array even though the share is set to "yes: cache"

 

Each User Share has a Minimum Free setting.

 

Unraid has no way to know how large a file will become when it chooses a disk for it. If a disk has more than Minimum, the disk can be chosen. If the disk is chosen and the file doesn't fit, the write fails. You must set Minimum Free to larger than the largest file you expect to write to the share.

 

Before the multiple pools feature in 6.9, cache also had a Minimum Free setting in Global Share Settings. Looks like you had that set to only 2GB, but that's not relevant now.

 

With multiple pools, that setting is per pool. You set it on the pool's page by clicking on the first disk in the pool. There have been reports that the setting doesn't stick, but it is working for me.

 

Not really the cause of your problems though. See my next post.

 

Apr  7 12:17:33 Argos emhttpd: shcmd (398): /usr/local/sbin/mount_image '/mnt/user/system/docker.img' /var/lib/docker 64
Apr  7 12:17:33 Argos kernel: BTRFS: device fsid 828ceb49-3940-4bd8-b2a0-6a61c1eabda6 devid 1 transid 2974 /dev/loop2 scanned by udevd (7798)
Apr  7 12:17:33 Argos kernel: BTRFS info (device loop2): using free space tree
Apr  7 12:17:33 Argos kernel: BTRFS info (device loop2): has skinny extents
Apr  7 12:17:33 Argos kernel: BTRFS info (device loop2): enabling ssd optimizations
Apr  7 12:17:33 Argos root: Resize '/var/lib/docker' of 'max'
Apr  7 12:17:33 Argos emhttpd: shcmd (400): /etc/rc.d/rc.docker start
...
Apr  7 14:29:19 Argos kernel: blk_update_request: critical space allocation error, dev loop2, sector 8217280 op 0x1:(WRITE) flags 0x100000 phys_seg 3 prio class 0
Apr  7 14:29:19 Argos kernel: blk_update_request: critical space allocation error, dev loop2, sector 8226024 op 0x1:(WRITE) flags 0x100000 phys_seg 46 prio class 0
Apr  7 14:29:19 Argos kernel: blk_update_request: critical space allocation error, dev loop2, sector 9003232 op 0x1:(WRITE) flags 0x100000 phys_seg 3 prio class 0
Apr  7 14:29:19 Argos kernel: blk_update_request: critical space allocation error, dev loop2, sector 23519128 op 0x1:(WRITE) flags 0x100000 phys_seg 14 prio class 0
Apr  7 14:29:19 Argos kernel: blk_update_request: critical space allocation error, dev loop2, sector 23519256 op 0x1:(WRITE) flags 0x104000 phys_seg 128 prio class 0
...

 

And my next post:


You have filled and corrupted docker.img even though you have given it 64G.

 

20G is usually more than enough, and making it larger won't fix filling it; it will only make it take longer to fill.

 

The usual cause of filling docker.img is an application writing to a path that isn't mapped. Linux is case-sensitive, so any application path must match a mapped container path, including upper/lower case.
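To see why a wrongly cased path ends up inside docker.img rather than on the mapped volume, here is a quick demo you can run anywhere (the /tmp paths are generic examples, not from this system):

```shell
# Demo: Linux paths are case-sensitive, so an app configured to write to
# /Data will NOT land in a container mapping of /data -- inside a container,
# that write goes to the unmapped path, i.e. into docker.img.
tmp=$(mktemp -d)
mkdir -p "$tmp/data"          # the mapped path
mkdir -p "$tmp/Data"          # what a misconfigured app might use
echo test > "$tmp/Data/file"  # the write succeeds...
ls "$tmp/data"                # ...but nothing arrived at the mapped path
rm -rf "$tmp"
```

The same applies to any path the application writes that has no `-v` mapping at all, not just case mismatches.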

 

 


Also, be sure to let your unclean shutdown parity check complete when you get things working better.

 

You were getting parity sync errors, so you will have to run a correcting parity check to fix those, then follow it with a non-correcting check to verify they were all fixed. Exactly zero sync errors is the only acceptable result and until you get there you still have work to do.

16 minutes ago, trurl said:

You have filled and corrupted docker.img even though you have given it 64G

Thanks, I've rebuilt the docker.img again and only enabled Plex for now. Is there any way to tell which docker it was that corrupted it? That would let me narrow down the culprit, though I suspect it's sab.
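One way to narrow it down (a sketch; the container names are from this thread but the sizes below are made up for illustration) is Docker's own per-container size report, which shows each container's writable layer, i.e. data written outside any mapped volume and therefore inside docker.img:

```shell
# Report each container's writable-layer size:
#
#   docker ps -as --format 'table {{.Names}}\t{{.Size}}'
#
# Hypothetical output, sorted largest-first so the culprit stands out:
printf 'binhex-sabnzbd1\t42.7G\nplex21\t120M\nbinhex-radarr1\t85M\n' |
    sort -t "$(printf '\t')" -k2 -rh
```

A container whose writable layer is tens of gigabytes is almost certainly the one filling the image.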

12 hours ago, trurl said:

Post docker run as explained at the very first link in the Docker FAQ

Thanks again,

 

Here is radarr 

root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker create --name='binhex-radarr1' --net='bridge' -e TZ="America/Denver" -e HOST_OS="Unraid" -e 'UMASK'='000' -e 'PUID'='99' -e 'PGID'='100' -p '9878:7878/tcp' -v '/mnt/user/data/':'/data':'rw' -v '/mnt/user/appdata/binhex-radarr':'/config':'rw' 'binhex/arch-radarr'
e20cc14ece897e91a340f8510d4b4eed921de5252b1e065b1480f23d22df8e08

The command finished successfully!

 

Here is sabnzbd

 

root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker create --name='binhex-sabnzbd1' --net='bridge' -e TZ="America/Denver" -e HOST_OS="Unraid" -e 'UMASK'='000' -e 'PUID'='99' -e 'PGID'='100' -p '8080:8080/tcp' -p '8090:8090/tcp' -v '/mnt/user/data/usenet/':'/data/usenet':'rw' -v '/mnt/user/appdata/binhex-sabnzbd':'/config':'rw' 'binhex/arch-sabnzbd'
89c5f02a57a878eed0396ff5faae5c38643f914c2d6858b74e298e3df7ba0106

 

Here is Sonarr

 

root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker create --name='binhex-sonarr1' --net='bridge' -e TZ="America/Denver" -e HOST_OS="Unraid" -e 'UMASK'='000' -e 'PUID'='99' -e 'PGID'='100' -p '9989:8989/tcp' -p '9899:9897/tcp' -v '/mnt/user/data/':'/data':'rw' -v '/mnt/user/appdata/binhex-sonarr':'/config':'rw' 'binhex/arch-sonarr'
3375b940fc90f617b50ff41e9729d06dbbca3c89e14b591ae521c527b06e8788

 

Here is plex

 

root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker create --name='plex21' --net='host' -e TZ="America/Denver" -e HOST_OS="Unraid" -e 'VERSION'='docker' -e 'NVIDIA_VISIBLE_DEVICES'='' -e 'TCP_PORT_32400'='32400' -e 'TCP_PORT_3005'='3005' -e 'TCP_PORT_8324'='8324' -e 'TCP_PORT_32469'='32469' -e 'UDP_PORT_1900'='1900' -e 'UDP_PORT_32410'='32410' -e 'UDP_PORT_32412'='32412' -e 'UDP_PORT_32413'='32413' -e 'UDP_PORT_32414'='32414' -e 'PUID'='99' -e 'PGID'='100' -v '/mnt/user/Movies/':'/movies':'rw' -v '/mnt/user/Tv/':'/tv':'rw' -v '/mnt/user/':'/music':'rw' -v '/mnt/user/Transcode/plextmp/':'/plextranscode':'rw' -v '/mnt/user/Movies_archive/':'/archivemovies':'rw' -v '/mnt/user/Tv_Archive/':'/archivetv':'rw' -v '/mnt/user/Home_Movies/':'/homemovies':'rw' -v '/mnt/user/TV_RECORDINGS/':'/recordedtv':'rw' -v '':'/transcode':'rw' -v '/mnt/user/data/media/':'/data':'rw' -v '/mnt/user/appdata/plex':'/config':'rw' 'linuxserver/plex'
d6079193cd6ce613cd3e2773a255852cd23a17dd84127fcb49a9108f1ca680c7

 

15 minutes ago, trurl said:

This time you filled up cache which corrupted docker.img

I have a funny feeling that may be the root of all my current evils

Though I have the data share set to use cache and then write to the array when it fills, it appears it is not doing so, and thus fills and kills. I guess for now I'll make it download direct to the array and see if that helps. Seems like a waste of 2x 1TB NVMe cache drives lol


Each User Share has a Minimum Free setting. Unraid has no way to know how large a file will become when it chooses a disk for it. If a disk has more than Minimum, the disk can be chosen and if the file is too large it will fill the disk and fail.

 

You must set Minimum Free to larger than the largest file you expect to write to the share.

 

For cache to overflow to the array, it must decide cache doesn't have enough free space when it chooses a disk for the file. Previous versions with only cache pool had a Minimum for cache in Global Share Settings.

 

Now with multiple pools, that setting is for each pool by clicking on the first disk of the pool to get to its settings page.

 

You must set Minimum Free for the pool to larger than the largest file you expect to write to the pool, and then cache-yes and cache-prefer shares will overflow to the array if the pool has less than Minimum Free.
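The overflow rule above can be sketched as a simple free-space comparison (a rough sketch only; /mnt/cache is the standard Unraid pool mount, and the 50G threshold is an example value chosen to exceed the largest expected download):

```shell
# If pool free space is below Minimum Free when Unraid picks a disk,
# cache-yes shares spill to the array instead of failing the write.
min_free_kb=$((50 * 1024 * 1024))   # example: 50GB threshold, in KB
free_kb=$(df --output=avail /mnt/cache | tail -n1 | tr -d ' ')
if [ "$free_kb" -lt "$min_free_kb" ]; then
    echo "below Minimum Free: new files should go to the array"
else
    echo "pool still has headroom: new files land on cache"
fi
```

With Minimum Free left at a tiny value (like the old 2GB global setting), a 50GB+ download can start on a nearly full cache and fail mid-write instead of overflowing.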

1 minute ago, trurl said:

Each User Share has a Minimum Free setting. Unraid has no way to know how large a file will become when it chooses a disk for it. If a disk has more than Minimum, the disk can be chosen and if the file is too large it will fill the disk and fail.

 

You must set Minimum Free to larger than the largest file you expect to write to the share.

 

For cache to overflow to the array, it must decide cache doesn't have enough free space. Previous versions with only cache pool had a Minimum for cache in Global Share Settings.

 

Now with multiple pools, that setting is for each pool by clicking on the first disk of the pool to get to its settings page.

 

You must set Minimum Free for the pool to larger than the largest file you expect to write to the pool, and then cache-yes and cache-prefer shares will overflow to the array if the pool has less than Minimum Free.

 

 

I see those settings, but they are greyed out and can't be changed. Any idea why?
