  1. Jinxed it. Caught it happening again tonight, so none of the above things fixed anything. I only touched two dockers to bring it back around: Gitea and Photoprism. The latter did seem to be scanning; I'm only playing with it so I killed it pretty quickly. Previously CrashPlanPRO did seem to be a cause, but not always. Dockers that generate a lot of IO (Plex scanning, Photoprism scanning, CrashPlanPRO indexing/scanning, etc.) tend to trigger it. Stop one or two of them, or sometimes even others, and it comes back around. Almost like it's disk thrashing on a mechanical disk.
  2. @DingHo Now that you say that, I do also run Plex and had some issues with the scheduled scan failing/timing out on a couple of random files. I think they were video files and I ended up removing them. If it was those files specifically, it wasn't 100% repeatable, because it would have happened every night. I don't remember if I was looking at the Plex issue because of the loop2 reads or independently, so that's not really helpful, sorry. If you turn on debug logging in Plex it's pretty easy to find the specific files causing the problem, and you can trigger the full scan manually. From memory, the scan is killed after an hour if it's taking too long.
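For reference, once debug logging is on, something along these lines will surface the offending items. The appdata path below is a guess for a typical docker install mapping (`/config` inside the container); adjust it for your setup:

```shell
# Hypothetical Plex log location for a docker install on Unraid;
# your appdata path may differ. Scan the latest log for scanner
# errors and timeouts to identify the problem files.
LOG="/mnt/user/appdata/plex/Library/Application Support/Plex Media Server/Logs/Plex Media Server.log"
[ -f "$LOG" ] && grep -iE "error|timeout" "$LOG" | tail -n 20 || true
```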
  3. I've not made any definite change that fixed it, but I haven't had a re-occurrence for a while now. Running 6.9.2, but I'm pretty certain I was still getting it with 6.9.2 as well; not that 6.9.2 had any changes expected to fix this. I think the most significant change was moving my cache drive SSD's SATA connection from the HP Microserver (Gen 8) motherboard to the PCIe SATA card I had installed. The main reason I did that was that the onboard port was limited to SATA II speeds. I can't say 100%, but at least since then I've not had an issue. The other thing I had done was replace the SSD itself, which meant recreating the whole docker.img filesystem. No improvement from that change.
  4. buff/cache is RAM used by the OS for buffers and page cache. It is not permanently allocated; the kernel reclaims it if something else tries to allocate RAM. I doubt there is any significant difference in the buff/cache handling in unRAID; it's still pretty close to a regular Linux kernel. If unRAID were filling its RAM disk (and you didn't have enough), you'd start seeing OOM errors. My system shows / as 4.8GB total with about 900MB used. I don't have a netdata docker and haven't tried it either.
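You don't need netdata to check this either; /proc/meminfo spells it out, since MemAvailable already counts the reclaimable page cache:

```shell
# A big buff/cache figure in top/free does not mean the RAM is gone:
# MemAvailable already includes reclaimable page cache.
awk '/^MemTotal:|^MemAvailable:|^Cached:/ {printf "%s %d MB\n", $1, $2/1024}' /proc/meminfo
```

If MemAvailable stays healthy while buff/cache is large, memory pressure isn't the problem.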
  5. I don't think it's your unhealthy container, but I have nothing to back that up. If I catch mine in the early stages, things like docker ps, docker stop/start etc. work OK. Usually by the time I catch it, though, it's because something like the Plex or Pi-hole dockers is being blocked from doing anything. It's too late then, and docker commands just get blocked until it comes back, so I can't properly shut down a docker and have to kill some of the processes that docker is running. As a workaround I've toyed with something monitoring system load: any time the 1-minute average gets over, say, 20 or 30, restart a docker, perhaps after capturing some logs. Whilst CrashPlanPRO seems to be involved, I have had occasions where restarting other very lightweight dockers was enough. Unfortunately mine does it even less than it used to now.
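Something like this is what I had in mind, just as a sketch to run from cron; the threshold, container name (CrashPlanPRO here) and log path are placeholders to tune for your own setup:

```shell
#!/bin/sh
# Hypothetical load watchdog: the threshold, container name and log
# path are assumptions, not settled values from this thread.
THRESHOLD=20
CONTAINER="CrashPlanPRO"

load1=$(cut -d' ' -f1 /proc/loadavg)   # 1-minute load average
over=$(awk -v l="$load1" -v t="$THRESHOLD" 'BEGIN{print ((l > t) ? 1 : 0)}')

if [ "$over" -eq 1 ] && command -v docker >/dev/null 2>&1; then
    # Capture a little evidence first, then kick the usual suspect.
    docker stats --no-stream > "/tmp/load-spike-$(date +%s).log" 2>/dev/null
    docker restart "$CONTAINER"
fi
```

The catch, as above, is that once it's fully wedged the docker commands themselves block, so this only helps if it fires early.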
  6. I am fairly certain it wasn't RAM related for me. I have 10GB, and whilst I run quite a few dockers I don't think I'm hitting any memory issues. My screenshot of top here shows ~2GB cached, so it's not running out of memory; if that were an issue you should start seeing OOM logs. In the general process of things I have replaced my cache drive (so a new filesystem on it, and then re-created the docker.img, which is loop2, so a new filesystem there too). I also moved it from the SATA II-only internal port to a spare SATA III port I had on an add-in card. It does appear to be happening less, but I had another instance the other day, so that certainly hasn't resolved it. My CPUs (4C/8T Xeon E3) weren't getting pegged, but there was a lot of iowaiting happening.
  7. Upgraded to 6.9.1 from 6.8.3. Some messages now seem to report the disks as "devX" rather than "sdX". Why was this changed?

Old warning:
Event: unRAID device sdi SMART health [187]
Subject: Warning [MARS] - reported uncorrect is 117
Description: ST2000DL003-9VT166_5YD2SZRE (sdi)
Importance: warning

New warning:
Event: Unraid device dev1 SMART health [187]
Subject: Warning [MARS] - reported uncorrect is 117
Description: ST2000DL003-9VT166_5YD2SZRE (dev1)
Importance: warning

Yes, I can tell from the serial number, but what are dev1, dev2, etc.? Some messages still use sdX:

Event: Unraid Disk 5 message
Subject: Notice [MARS] - Disk 5 returned to normal utilization level
Description: WDC_WD40EZRX-00SPEB0_WD-WCC4E0262246 (sdg)
Importance: normal

Ah, I see "Dev 1" and "Dev 2" as the first column of unassigned devices on the Main page. Then I guess the question becomes: why doesn't the space message use "Disk 1" etc.? Not consistent.
  8. I am still running 6.8.3 and have this issue, so I don't think this is related to 6.9.x. I do run Plex but it's not a transcoding issue. Are you running Crashplan? I don't believe Crashplan is specifically the problem, but perhaps somehow causes it more often. At the moment my system is doing it within a few hours of starting the Crashplan docker.
  9. I'm getting a weird doubled-up CrashPlan docker. It names it "0" too. I definitely don't have two dockers and there are no old containers. Also, strangely, it's showing the gotify icon for the image; I do have the gotify docker running too. Any ideas?
  10. I'm pretty certain Plex isn't the cause in my issue linked above. I re-created my docker.img btrfs with no effect. /dev/loop2 is the docker image that lives on the cache drive (usually). So it's not just used for Plex - but for any/all dockers you may have running. Note that the issue we are seeing is lots of *reads* (this thread and mine linked above), not writes on the /dev/loop2 device.
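If anyone wants to confirm the loop2 mapping on their own box, losetup shows which file backs each loop device (the docker.img path varies by setup; /mnt/cache/... is just typical):

```shell
# List loop devices and their backing files; on a stock Unraid box
# /dev/loop2 would normally map to docker.img on the cache drive,
# e.g. /mnt/cache/system/docker/docker.img (path varies).
losetup --list
```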
  11. Yes, but that's the symptom, not the cause; it just shows that lots of reads block docker from working properly. The issue is why something all of a sudden starts hammering the docker image file. In my case, moving the docker image file to another disk probably won't help: lots of IO on the drive containing the docker image in turn affects docker, so if the high reads occur on the docker image in the new location, docker will still be affected.
  12. It just happened again. top says 2000MB cached, and top in the above screenshot shows 1800MB cached, so I don't think RAM is running out. I also remembered I can simulate the symptoms by doing a force recheck in qBittorrent on a huge torrent (a 46GB file). That doesn't of course help much, other than showing that high read IO on my cache drive (the qBittorrent download location) brings docker down.
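For anyone wanting a smaller repro than a 46GB recheck, something along these lines generates sustained uncached reads (file size scaled way down here, and you'd point the scratch file at your cache drive rather than /tmp to match my case):

```shell
# Make a scratch file, drop it from the page cache (needs root; the
# failure is ignored otherwise), then read it back so the reads hit
# the disk rather than RAM.
dd if=/dev/zero of=/tmp/iotest.bin bs=1M count=64 status=none
sh -c 'echo 3 > /proc/sys/vm/drop_caches' 2>/dev/null || true
dd if=/tmp/iotest.bin of=/dev/null bs=1M status=none
rm -f /tmp/iotest.bin
```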
  13. Ah, I hadn't found much on this last time I searched. I only just got around to posting my issue, but I've been seeing it for quite a while now (>6 months). Sounds exactly like your issue:
  14. Been struggling with this one for a while. I have quite a few dockers running but nothing extremely intensive for long periods. Current running list: Gitea, nginx, sabnzbd, unifi-controller, zigbee2mqtt, jellyfin, sonarr, PlexMediaServer, qbittorrent, homeassistant, pihole, resilio-sync, CrashPlanPRO, gotify, NodeRed-OfficialDocker, docker-bubbleupnpserver.

After some period of time (sometimes minutes, sometimes it runs fine for weeks) the CPU load average rapidly increases, getting up around 100. When running normally it usually sits around 1-1.5. This appears to be the result of something reading the docker image file, amongst other things. iotop shows a high read rate on loop2 (/dev/loop2 is the docker image), and a number of the apps also seem to start reading for some unknown reason. e.g. from a while ago, so the dockers running are slightly different. There is no particular reason all these apps would be reading simultaneously. And top (albeit at a different time):

The unRAID server itself remains quite responsive, but all the dockers basically stop operating properly. e.g. pihole stops serving DNS queries, webservers for other dockers stop responding, and "docker ps" sometimes hangs (well, gets blocked more or less indefinitely). The "fix" is generally to shut down a docker, and immediately everything returns to normal. Usually I shut down CrashPlanPRO. Sometimes I can't shut it down and have to kill some of the CrashPlan processes. Sometimes I can catch it early and a simple restart works fine. It's definitely not just CrashPlan, but that's the one I usually restart. Once restarted it runs fine again until the next time it happens.

The docker image is 30GB with 9.3GB used. I have tried deleting and recreating the image from scratch with no improvement. I found some references to *writing* the docker image, but my problem is reading.

There was this post on reddit: Edit: I see this one here recently posted too:

Unraid 6.8.3. HP Gen 8 Microserver with a Xeon E3-1265L 4C/8T CPU and 10GB RAM. Cache drive is a Samsung 830 256GB on a 3 Gb/s SATA port due to the Microserver's limited SATA ports. docker.img is on the cache drive. Array is 5 x 4TB + parity; 2 drives are in an external enclosure via a Marvell 88SE9230.

Anyone have any ideas? It basically kills all my docker instances and happens semi-regularly but randomly.
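If anyone else wants to watch for this without iotop, the loop-device read counters in /proc/diskstats are enough to spot the storm (field 6 is cumulative sectors read since boot):

```shell
# Print cumulative MB read per loop device. Sample this a few
# seconds apart while the load climbs; loop2 should stand out.
awk '$3 ~ /^loop[0-9]+$/ {printf "%s: %d MB read\n", $3, $6 * 512 / 1048576}' /proc/diskstats
```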
  15. Yeah, I think I'd found that, but /tmp would have been fine. The log (and the workaround) shows that it's actually creating the temp file in the wrong place. That would appear to be a CrashPlan bug: it shouldn't be creating temporary files where its binaries etc. are stored. That's precisely what /tmp is for. Code42 probably won't care about a docker installation, but I would expect this to potentially cause problems on other systems too, since unprivileged users may also not be able to write there. Perhaps there's an unset configuration or environment variable it looks for that's missing in a docker environment?