Ryonez

Everything posted by Ryonez

  1. Issue is present in version 6.12.0 as well. It is getting very tedious to baby docker along... I can't even run a backup because it'll trigger this. Here's the latest diagnostics. atlantis-diagnostics-20230616-0553.zip
  2. So, for some behavioural updates. Noticed my RAM usage was at 68%, not the 60% it normally rests at. The server isn't really doing anything, so I figured it was from the two new containers, diskover and elasticsearch, mostly from elasticsearch. So I shut that stack down. That caused docker to spike the CPU and RAM a bit. The containers turned off in about 3ish seconds, and RAM usage dropped to 60%. Then, docker started doing its thing: RAM usage started spiking up to 67% in this image, and CPU usage is spiking as well. After docker settles with whatever it's doing: This is from just two containers being turned off. This gets worse the more docker operations you perform.
  3. Diskover just happened to be what I was trying out when the symptoms occurred yesterday. It has not been present during the earlier times the symptoms occurred. After rebuilding docker yesterday I got diskover working and had it index the array, with minimal memory usage (I don't think I saw diskover go above 50MB). I'm not really sure how this is meant to help with finding folders using memory. It's actually suggesting moving log folders to RAM, which would increase RAM usage. If the logs are causing a blowout of data usage, that usage would go to the cache/docker location, not memory. It might be good to note I use docker in the directory configuration, not the .img, so I am able to access those folders if needed.
  4. Just a heads up on my progress from this morning. My containers are back up, and RAM usage didn't pass 50% during the process. So I'm currently up and running until docker craps itself again.
  5. I don't believe this to be an issue with the docker containers' memory usage. For example, the three images I was working on today:
     • scr.io/linuxserver/diskover
     • docker.elastic.co/elasticsearch/elasticsearch:7.10.2
     • alpine (elasticsearch-helper)
     diskover wasn't able to connect to elasticsearch, elasticsearch was crashing due to perm issues I was looking at, and elasticsearch-helper runs a line of code and shuts down. I don't see any of these using 10+GB of RAM in a non-functional state. The system when running steady uses 60% of 25GB of RAM. And this wasn't an issue with the set of containers I was using until the start of this month. I believe this to be an issue in docker itself currently.
  6. When working with docker, I've been having issues for the past month, though not without finding what the trigger appears to be.
     The first time this happened was when I updated all containers, around the 1st of May for me. Forced an unclean reboot after oom-killer was triggered, and the containers loaded up all right after the OS came back up, albeit with high CPU usage, though I can't remember if the memory usage spiked at the time.
     The second was a few days later, May the 4th for me. Again triggered during an update; forced another unclean reboot after oom-killer was triggered. This time the containers did not come back up, and docker choked the system on both CPU and memory.
     Symptoms:
     • Changes to the docker image seem to elicit high CPU usage from docker.
     • It also causes high spikes in memory usage.
     • The spike in memory usage can cause memory to be completely used, triggering the oom-killer.
     • The more containers in use, the more likely it seems to cause this every time. The amount required to trigger the oom-killer is unknown currently.
     Things tried to resolve this:
     • Checking the docker image filesystem (Result: no problems).
     • Deleting and rebuilding the docker image, several times.
     • Did a cache filesystem check (BTRFS, Result: errors found). Replaced the cache drives completely.
     • Re-flashing the boot USB and only copying configs back over.
     • Started shifting almost all my docker container configs to portainer, as docker templates duplicate fields and restore unwanted fields from template updates. "Currently" (I'm having to rebuild the docker image again atm), only 3-4 containers are unRaid managed.
     Currently it seems if I only load containers up in small batches, I can get every container up, but only from a clean docker image, or when editing a small number of containers at a time (this I'm feeling iffy on).
     For the first diagnostic log attached (atlantis-diagnostics-20230507-1710): this has logs from the 4th of May mentioned above. It'll have the logs from when docker was no longer able to be started. I ended up leaving docker off when it triggered the oom-killer. The log on the 7th shows that zombie processes from the docker containers were preventing unmounting the cache, and an unclean reboot was done.
     For the second diagnostic log attached (atlantis-diagnostics-20230521-1029): I was working on getting a small container stack of 3 images going. I was configuring them, meaning they were reloaded several times in a row, seemingly creating a situation where the CPU and memory spikes stacked up enough for docker to trigger the oom-killer.
     Marking this as urgent, as docker killing itself via oom-killer is leaving a big risk of data corruption, along with leaving docker pretty much useless currently. atlantis-diagnostics-20230507-1710.zip atlantis-diagnostics-20230521-1029.zip
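     In case anyone wants to watch this happen themselves, something like this in a second console should show the spike while bringing a batch of containers up (just a rough illustration, the number of containers shown is arbitrary):
         # crude watcher: overall memory use plus the first few containers listed, every 5 seconds
         while true; do
           date
           free -m | awk 'NR==2 {print "used MB: "$3" / "$2}'
           docker stats --no-stream --format '{{.Name}}\t{{.MemUsage}}\t{{.CPUPerc}}' | head -n 5
           sleep 5
         done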
  7. I've been experiencing what appears to be this issue. I've been shifting almost all of my docker management over to Portainer. My best guess so far is that template updates added fields back into my templates, which started screwing with the docker image and broke things somehow. I've been trying pretty much everything: deleting the docker img, even replacing the cache drives and reformatting the OS drive. The only things kept that entire time were the config files. There's still some dockers to shift over, but most are on Portainer, and things seem to be steady. This issue first appeared when updating some images. Otherwise, I'd been running what I had for over a year.
  8. Changing the port was not done to secure it. It was done so a docker container could use it. Only specific ports are exposed; wireguard is used to connect for management when out of network, as management ports are not meant to be exposed. The issue is that unRaid starts SSH with default configs, creating a window of attack until it restarts services with the correct configs. This is not expected behaviour. Diagnostics have been added to the main post.
  9. Hi there. I had to reboot my server recently, and I received a `Possible Hack Attempt` notification from a plugin called `Fix Common Problems`. Looking into this, I think I've found a bug/security issue.
     The server's SSH port should be configured to port `1122`. However, while the system is booting up, SSH is started before the set configuration is read, leading to port 22 being exposed. Now, I have port 22 forwarded to the server for a service running `Forgejo`, a code repository service. The reason port 22 is being used is to lessen client configuration and to make it simpler for users. The bug here is that unRaid is loading SSH with default values for a time, and later restarting it with the correct config. This creates a window for attackers to attempt to access the system through something that isn't meant to be exposed.
     Steps to reproduce:
     1. Set a different port for SSH through unRaid's web UI.
     2. Reboot the server.
     3. There is now a window of opportunity to attack the server via the default SSH port as it starts. Plugins extend this window.
     Some of the logs:
     SSH being started with port 22:
     SSH login attempts being made during this window of opportunity (there's more than just those):
     SSH being restarted with the correct configuration:
     I'll also make a quick note that it looks like smbd is getting the same treatment: it's loaded with defaults, then restarted with the proper configs later. So it appears there might be several system services being started like this when they shouldn't be.
     Marking this as urgent, though I'll admit I'm not 100% sure if SSH will accept the correct password during that window. This server runs a chat service, so uptime is important for me, and testing would interrupt that. Please adjust the urgency as needed if it's not as bad as I think it might be. atlantis-diagnostics-20230227-0254.zip
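     For anyone wanting to confirm this on their own box, a quick sanity check right after boot (just a sketch; only the port number is specific to my setup):
         # see which port sshd is actually listening on at this moment
         ss -tlnp | grep sshd
         # compare against the configured port once unRaid has applied the config
         grep -i '^Port' /etc/ssh/sshd_config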
  10. Been working with a VaultHunters3 server using your image, and I've noticed the minecraft servers aren't being cleanly shut down when the container is shut down. Is there a way to fix this? Clean shutdowns are very much preferred.
  11. Second week in a row:
      Apr 25 02:10:37 Atlantis CA Backup/Restore: docker stop -t 60 atlantis-yoko
      Apr 25 02:10:37 Atlantis CA Backup/Restore: Backing up USB Flash drive config folder to
      Apr 25 02:10:37 Atlantis CA Backup/Restore: Using command: /usr/bin/rsync -avXHq --delete --log-file="/var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log" /boot/ "/mnt/user/Backups/machine_specific_backups/atlantis_flash/" > /dev/null 2>&1
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Changing permissions on backup
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Backing up libvirt.img to /mnt/user/Backups/machine_specific_backups/atlantis_libvirt/
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Using Command: /usr/bin/rsync -avXHq --delete --log-file="/var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log" "/mnt/user/system/libvirt/libvirt.img" "/mnt/user/Backups/machine_specific_backups/atlantis_libvirt/" > /dev/null 2>&1
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Changing permissions on backup
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Backing Up appData from /mnt/user/appdata/ to /mnt/user/Backups/server_backups/atlantis_docker_appdata/[email protected]
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Using command: cd '/mnt/user/appdata/' && /usr/bin/tar -cvaf '/mnt/user/Backups/server_backups/atlantis_docker_appdata/[email protected]/CA_backup.tar' --exclude "alteria-plex/Library/Application Support/Plex Media Server/Metadata" --exclude "0-pihole" --exclude "alteria-postgres" --exclude "alteria-hydrus-server" * >> /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log 2>&1 & echo $! > /tmp/ca.backup2/tempFiles/backupInProgress
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Backup Complete
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Verifying backup
      Apr 25 02:10:42 Atlantis CA Backup/Restore: Using command: cd '/mnt/user/appdata/' && /usr/bin/tar --diff -C '/mnt/user/appdata/' -af '/mnt/user/Backups/server_backups/atlantis_docker_appdata/[email protected]/CA_backup.tar' > /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log & echo $! > /tmp/ca.backup2/tempFiles/verifyInProgress
      Apr 25 02:10:42 Atlantis kernel: br-55ef392db062: port 1(veth391173c) entered blocking state
      Apr 25 02:11:43 Atlantis kernel: br-55ef392db062: port 70(vetha9a6916) entered forwarding state
      Apr 25 02:11:44 Atlantis CA Backup/Restore: #######################
      Apr 25 02:11:44 Atlantis CA Backup/Restore: appData Backup complete
      Apr 25 02:11:44 Atlantis CA Backup/Restore: #######################
      Apr 25 02:11:44 Atlantis CA Backup/Restore: Deleting /mnt/user/Backups/server_backups/atlantis_docker_appdata/[email protected]
      Apr 25 02:11:45 Atlantis CA Backup/Restore: Backup / Restore Completed
      That's two weeks of missed backups, this is a big issue.
  12. Just had the plugin fail to backup. Looking at the logs, everything worked fine, verification was good, but the backup file just does not exist.
      Apr 18 02:10:41 Atlantis CA Backup/Restore: docker stop -t 60 atlantis-yoko
      Apr 18 02:10:41 Atlantis CA Backup/Restore: Backing up USB Flash drive config folder to
      Apr 18 02:10:41 Atlantis CA Backup/Restore: Using command: /usr/bin/rsync -avXHq --delete --log-file="/var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log" /boot/ "/mnt/user/Backups/machine_specific_backups/atlantis_flash/" > /dev/null 2>&1
      Apr 18 02:10:44 Atlantis CA Backup/Restore: Changing permissions on backup
      Apr 18 02:10:44 Atlantis CA Backup/Restore: Backing up libvirt.img to /mnt/user/Backups/machine_specific_backups/atlantis_libvirt/
      Apr 18 02:10:44 Atlantis CA Backup/Restore: Using Command: /usr/bin/rsync -avXHq --delete --log-file="/var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log" "/mnt/user/system/libvirt/libvirt.img" "/mnt/user/Backups/machine_specific_backups/atlantis_libvirt/" > /dev/null 2>&1
      Apr 18 02:10:44 Atlantis CA Backup/Restore: Changing permissions on backup
      Apr 18 02:10:44 Atlantis CA Backup/Restore: Backing Up appData from /mnt/user/appdata/ to /mnt/user/Backups/server_backups/atlantis_docker_appdata/[email protected]
      Apr 18 02:10:45 Atlantis CA Backup/Restore: Using command: cd '/mnt/user/appdata/' && /usr/bin/tar -cvaf '/mnt/user/Backups/server_backups/atlantis_docker_appdata/[email protected]/CA_backup.tar' --exclude "alteria-plex/Library/Application Support/Plex Media Server/Metadata" --exclude "0-pihole" --exclude "alteria-postgres" --exclude "alteria-hydrus-server" * >> /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log 2>&1 & echo $! > /tmp/ca.backup2/tempFiles/backupInProgress
      Apr 18 02:10:45 Atlantis CA Backup/Restore: Backup Complete
      Apr 18 02:10:45 Atlantis CA Backup/Restore: Verifying backup
      Apr 18 02:10:45 Atlantis CA Backup/Restore: Using command: cd '/mnt/user/appdata/' && /usr/bin/tar --diff -C '/mnt/user/appdata/' -af '/mnt/user/Backups/server_backups/atlantis_docker_appdata/[email protected]/CA_backup.tar' > /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log & echo $! > /tmp/ca.backup2/tempFiles/verifyInProgress
      Apr 18 02:10:45 Atlantis kernel: br-55ef392db062: port 1(veth2d7642e) entered blocking state
      Apr 18 02:11:46 Atlantis kernel: br-55ef392db062: port 70(veth7449b16) entered forwarding state
      Apr 18 02:11:46 Atlantis CA Backup/Restore: #######################
      Apr 18 02:11:46 Atlantis CA Backup/Restore: appData Backup complete
      Apr 18 02:11:46 Atlantis CA Backup/Restore: #######################
      Apr 18 02:11:46 Atlantis CA Backup/Restore: Backup / Restore Completed
      Looking at the folder the backup should be in, it's empty. Any idea what happened?
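      Something like this is how I'd spot-check the destination after a run like that (paths are from the log above; the dated folder name will differ):
          # list what's actually in the destination after a "successful" run
          ls -lh /mnt/user/Backups/server_backups/atlantis_docker_appdata/
          # if a CA_backup.tar is present, confirm it's a readable archive
          find /mnt/user/Backups/server_backups/atlantis_docker_appdata/ -name 'CA_backup.tar' -exec tar -tf {} \; | head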
  13. I'm unable to fill a variable via a command in scripts currently. Going back, I can see it stopped working correctly on the 14/12/21 (NZT). The command is this:
      #!/bin/bash -x
      DBLIST="$(docker exec -t alteria-postgres psql -U postgres -d postgres -q -t -A -c 'SELECT datname from pg_database')"
      echo "${DBLIST}"
      I'm at a loss atm. I can't think what would stop this, and the command can be run manually in the terminal, returning the list of databases correctly. The failure was very wonky as well: it saved all the databases fully, then I got zero-byte files, then some had data but not all, then the script started running multiple times before outright just stopping.
      Overview of the base folder:
      Last Good Backup:
      When it started messing up:
      Would anyone have any ideas what on earth happened and how to get this working?
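      One thing I'd want to rule out, and this is just a guess: `docker exec -t` allocates a pseudo-TTY, and output run through a TTY can pick up carriage returns that an interactive terminal hides but a script will happily store. A variant without `-t` to compare against:
          #!/bin/bash -x
          # same command, but without docker's -t so psql's output isn't passed through a pseudo-TTY
          DBLIST="$(docker exec alteria-postgres psql -U postgres -d postgres -q -t -A -c 'SELECT datname from pg_database')"
          echo "${DBLIST}"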
  14. To be fair, this container can run without mysql. It's just that with the template enforcing yanking the `DB_HOST` field back in, it'll keep breaking the container. Honestly, that's something that needs to be exposed to users: a way to tell CA to not change/update the template(s). As it stands the container is working fine with sqlite, showing mysql is not needed. While it has been an annoying experience for the user, seeing a requirement for something that's not actually needed would likely turn them away completely. If the `requires` tag is to be used, it should be used only for things that are actually required, otherwise it loses its purpose.
  15. The detection doesn't have anything to do with the template. The template is basically a form someone has stuck together, telling unRaid where to pull the container, what env vars the user is expected to provide, and some other information, like support links. I found out the `DB_HOST` variable was causing it to look for a mysql server by looking at the files used to make the container image, finding the relevant check here. Here is where you can find information on how templates are made and used: https://wiki.unraid.net/DockerTemplateSchema
      Also, templates gotten from CA will replace removed template values, so `DB_HOST` will return at some point. I have this issue with templates exposing ports that just should not be exposed, and it drives me freaking crazy. This is something managed by the CA plugin. The way to "fix" this is to remove the `TemplateURL` value from the template. You won't get updates for the template anymore, but it will stop the template from changing to what can be undesired or even unsafe values for your environment. You'll have to manually edit the template file on the USB (via the unRaid console or ssh):
      1. Open the unRaid console.
      2. Navigate to `/boot/config/plugins/dockerMan/templates-user`.
      3. Look for the xml file with the name matching the name of the template you're wanting to edit (this is the "Name" field at the top of the update container page for the container). `nano <name>.xml` should work. Ctrl+X and confirm save to close once the changes are made.
      4. Find the template url field; it'll look like `<TemplateURL>https://raw.githubusercontent.com/selfhosters/unRAID-CA-templates/master/templates/openldap.xml</TemplateURL>`.
      5. Change it to be just `</TemplateURL>`.
      6. Save the file.
      It might help to think of the CA Templates as just shortcuts. They are great for quickly setting things up, but they are bad for forcing values you might not want back onto you. And don't worry too much about my time, I have plenty to spare >.<
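      A rough console sketch of those steps, if it helps (the `my-openldap.xml` filename is just an example, use whatever your template is called):
          # from the unRaid console or an ssh session
          cd /boot/config/plugins/dockerMan/templates-user
          ls                                          # find the template you want to detach from CA updates
          cp my-openldap.xml my-openldap.xml.bak      # example name only; keep a copy just in case
          nano my-openldap.xml                        # blank out the URL so the line reads just </TemplateURL>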
  16. Right, hammered at this a little and got it kinda working.
      1. Delete the `config.json` file. It's set up incorrectly anyway (example config is here), but we'll have the container remake it.
      2. Remove `DB_HOST` from the template. This is what is triggering the search for a mysql database.
      3. Add a new variable to the template. Call it something like "DB_URL". The key: "CMD_DB_URL". The value: "sqlite:///config/hedgedoc.sqlite". Don't include the quotes.
      4. Start the container and navigate to the exposed port.
      It might look kinda broken; that's because the config isn't fully set up. Google from this point should be able to help you fill out the config as needed, like why the page looks broken. I won't be too much help here: my set-up uses a database and a domain with a reverse proxy, and it's been a while since I've done it. Hope this helps.
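      For reference, the same variable in plain docker run form would look roughly like this (the container name, host port, and appdata path are assumptions for illustration, adjust them to your setup):
          # minimal sketch; the sqlite file ends up in the mapped /config folder
          docker run -d --name hedgedoc \
            -e CMD_DB_URL="sqlite:///config/hedgedoc.sqlite" \
            -v /mnt/user/appdata/hedgedoc:/config \
            -p 3000:3000 \
            lscr.io/linuxserver/hedgedoc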
  17. I should clarify, I am using SSD cache pools; I used "hdd" incorrectly in this instance. However, I'd still prefer to not have several database systems having a go at them; it feels much better having one system managing the read/writes and queuing.
  18. I'm not sure what you mean by docker layering, sorry. But there's no issue using the template from CA (Community Apps), it's what I started with.
      Personally, I feel that's more for the people who make the application to tell you. Linux Server typically makes containers for applications that might not have one already, or to make a container that's better than the official one that might be provided. Linux Server have provided a link to the application configuration details in the Application Setup section of the doc, and state afterwards that the example they provide uses mysql.
      In HedgeDoc's configuration doc is a basics section which has a link to the different databases you can use here. One of the options is an SQLite option, which is a single-file database. Looking at the manual setup here, you could drop the need for a database container by popping this into your config file for the db:
          "db": {
              "dialect": "sqlite",
              "storage": "/config/db.hedgedoc.sqlite"
          },
      That should work for the CA template. You can probably ignore the "DB_" sections of the template; with the config using sqlite it should ignore those values.
  19. I'm not too worried about the storage, more with having several database systems potentially thrash the HDDs. Might not be a big issue at the start, but it never hurts to learn and understand the systems you use and try to use them effectively. Plus, if you need to work on them, you can get at them from one point of entry. To be fair here, I'm 100% sure I don't do things effectively. There's so many different components, getting it all right sucks.
  20. It's considered bad practice to lump major containers from outside of your project into your container, as far as I know. That's not to say there aren't cases where it's a good idea, for example building upon the base images, or maybe using the nginx image if your thing needs a web service to function.
      Databases are one of those things where it's better to host one database container, then have other containers talk to that and use their own database in there. For example, one database container can host the databases for hedgedoc, keycloak, kanboard, etc. You have one container managing them all.
      Next to the install button (for hedgedoc in the Community Apps page) is a support button. Clicking that shows a Read Me First link at the top. This links to what's used to generate the docs you have linked above, it seems. HedgeDoc does require a db. I'm using postgres as my database container, which my hedgedoc talks to.
      I suggest spending some time learning about databases. Don't just make a root user and password and give that to hedgedoc. You'll want to use this for other things down the line, so make a user and database for each service to use (a rough example of what that looks like is at the end of this post). I'd recommend postgres for the database, and pgadmin4 as a web tool you can use to work on the databases.
      If you are serving things to others (I consider it a thing to just do even if you're the only one using things), I really recommend securing things behind a reverse proxy with valid https and a domain. I have my containers on a user-defined bridge network, and only really expose ports where needed, like for the reverse proxy. You don't need to expose postgres's ports (like 5432). Exposing lets things on the LAN access that port; container images don't need that. If you do do that, be aware anything with a TemplateURL will add them back on later, which is frankly frustrating as hell.
      I do take a more "I want to secure things as much as I can" approach. It's a lot of work, not everyone will need or be bothered by it. The docs and use of LinuxServer's Swag container would be a good place to start for the last little bit I dumped on ya: https://docs.linuxserver.io/general/swag
      Server administration is a handful, good luck.
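      Here's that minimal sketch of a user and database per service, assuming a postgres container named `alteria-postgres` like mine (the service name and password are placeholders):
          # run once per service; replace hedgedoc and the password with your own values
          docker exec -it alteria-postgres psql -U postgres -c "CREATE USER hedgedoc WITH PASSWORD 'change-me';"
          docker exec -it alteria-postgres psql -U postgres -c "CREATE DATABASE hedgedoc OWNER hedgedoc;"
          # the service then connects with that user/database instead of the postgres superuser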
  21. This can be fixed by changing the container's config data path to data. This leads to another error about ssl for me. https://github.com/getferdi/server/issues/82
  22. @Frank Ruhl Please calm down on the caps, it's considered shouting and thus rude. Your reply here seems to be about two things.
      One is the timeframe for getting a replacement key manually. There is a system in place where you can replace a key yourself, but it has a one-year cooldown between uses. You've made no mention of whether or not you've even attempted this. If you have, there are suggestions above for using a trial key, which did work for me until my key was replaced by the UnRAID team.
      The second thing seems to be some sort of issue? I recommend making a post in the general support forum. That is where you'd get support for problems; this is a feature request thread and isn't related to the problem.
      Good luck, I hope you are able to get help sorting it all out.