97WaterPolo

Members
  • Posts

    27

Everything posted by 97WaterPolo

  1. Hi everyone, I set up a little test scenario that illustrates the symptoms I am experiencing. The setup is:

     • Virtiofs mode for "/mnt/user/Backup/Logs/" => "logs"
     • fstab entry of "logs /mnt/logs virtiofs ro,relatime,sync 0 0"
     • Full rwx permissions on the test file

     I executed the commands back and forth, from left to right (root is on Unraid OS, alexander is on the virtual machine):

     1. root: "ls -l" to display the current directory on Unraid OS
     2. alexander: "ls -l" to display the current directory on the virtual machine
     3. root: "cat testfile" to display the content of the file on Unraid
     4. alexander: "cat testfile" to display the content of the file on the virtual machine
     5. root: "sudo nano testfile" to append a string in nano
     6. root: "cat testfile" to display the new content after the nano edit
     7. alexander: "cat testfile" still shows the old content of the file
     8. alexander: "ls -l", which relists the directory and refreshes some cache
     9. alexander: "cat testfile" now shows the new content from the Unraid drive

     I have tried this routine numerous times, and the virtual machine always seems to have the old file until I run "ls -l" or some background process that I am not aware of refreshes the file. So far the only thing that reliably refreshes the file is "ls -l": I could "cat testfile" 10+ times in a row and it wouldn't change until "ls -l" was run, and then it changed instantly. One thing I thought might be the issue was the share being on a cache pool, but upon checking, cache is disabled for my "Backup" share. Any help would be greatly appreciated!! Thank you!
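The workaround described in this post (re-listing the directory before reading the file) can be sketched as a small helper. This only illustrates the behaviour observed above; it is not a fix for the underlying virtiofs caching, and whether a directory scan is what actually refreshes the guest's view is an assumption taken from the post. The function name `read_after_listing` is hypothetical.

```python
import os


def read_after_listing(path: str) -> str:
    """Stat every entry in the parent directory, then read the file.

    Assumption (from the post): on the guest, an "ls -l" of the directory
    is what makes the updated file content visible.
    """
    parent = os.path.dirname(os.path.abspath(path))
    with os.scandir(parent) as entries:
        for entry in entries:
            # Like "ls -l", ask the filesystem for each entry's attributes.
            entry.stat(follow_symlinks=False)
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()
```

On the virtual machine this would be called with the mounted path, for example `read_after_listing("/mnt/logs/testfile")`, in place of a plain "cat testfile".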
  2. Had another MCE error today. I checked the logs and got an unhelpful message. Is there a way to check what went wrong without running mcelog, since it doesn't support my CPU?

     May 7 04:30:08 AINCRAD root: Fix Common Problems: Error: Machine Check Events detected on your server
     May 7 04:30:08 AINCRAD root: mcelog: ERROR: AMD Processor family 25: mcelog does not support this processor. Please use the edac_mce_amd module instead.
     May 7 04:30:08 AINCRAD root: CPU is unsupported
     May 7 04:30:12 AINCRAD root: Fix Common Problems: Warning: Docker Update Patch not installed

     aincrad-diagnostics-20230507-2218.zip
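The lines quoted above only report that an event was detected, so the underlying record has to be looked for elsewhere in the syslog. A minimal sketch of pulling machine-check related lines out of a saved syslog follows; the keyword list is an illustrative assumption and not an exhaustive set of what the kernel may print, and `find_mce_lines` is a hypothetical helper name.

```python
import re

# Keywords that commonly appear in machine-check related syslog entries.
# Assumption: this list is illustrative, not exhaustive.
MCE_PATTERN = re.compile(r"machine check|mcelog|\bmce\b|edac", re.IGNORECASE)


def find_mce_lines(lines):
    """Return only the syslog lines that mention machine-check events."""
    return [line.rstrip("\n") for line in lines if MCE_PATTERN.search(line)]
```

It accepts any iterable of lines, so an open file handle for the syslog taken from the diagnostics zip can be passed in directly.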
  3. Hi, I logged on today and saw this error in my Fix Common Problems tab! I've seen it once before, about a month ago, and I cleared it because I thought it was a fluke since I had recently done a restart. Now that the server has been running for a while and I got this error again, I'm hoping someone could point me in the right direction. I tried running mcelog, but I got:

     mcelog: ERROR: AMD Processor family 25: mcelog does not support this processor. Please use the edac_mce_amd module instead.
     CPU is unsupported

     I have also attached my diagnostics in the hope someone could help! I haven't noticed any issues or failures since I got the message. Thank you!!

     aincrad-diagnostics-20230430-2239.zip
  4. Hi everyone, I'm a little nervous given the forum topics I've found from searching, especially the one starting with "TL;DR If you're seeing constant logs from avahi-daemon, beware, you probably got hacked."

     What has happened recently:

     • For the past few months I've had random crashes where UnraidOS would freeze up. I thought it was because of my build as well as docker containers, etc., but nothing I've tried has stopped it. I've disabled C-States, pinned docker containers to specific cores, and altered the power idling, and I still have random crashes ranging from 1 day to 3 months apart (increasing more recently).
     • Starting today I've had logs spamming my syslog from avahi-daemon, and that led me on a search to the attached forum posts.
     • I checked my ifconfig and found a bunch of network interfaces I've never seen before (I'm used to eth0, lo, wg0, and br0; I had a bond0 and multiple br-XXXXXXX).
     • I've had issues connecting to my server via WireGuard and Tailscale. I was able to a few weeks ago, but I tried this morning and got no connection.
     • I was unable to ping anything, whether hostnames (google.com) or Google's numerical address (142.250.68.110).
     • Docker Community Apps throws an error saying it can't retrieve a feed.
     • Fix Common Problems reports that it can't connect to github.com.
     • I recently got alerts from my "Deco" app, which notifies me whenever a new device joins the network, and I've been getting a few "UNKNOWN DEVICE HAS JOINED THE NETWORK". This pops up from time to time, so I never thought about it till now.

     Exposure of UnraidOS server (192.168.68.114):

     • Port forwardings
       • 192.168.68.114 (UnraidOS Server) Internal: 6881 External: 6881 (nothing running on that port)
       • 192.168.68.114 (UnraidOS Server) Internal: 51820 External: 51820 (WireGuard VPN Service)
       • 192.168.68.48 (Nginx) Internal: 8080 External: 80 (Nginx Proxy Manager)
       • 192.168.68.48 (Nginx) Internal: 4443 External: 443 (Nginx Proxy Manager)
     • Nginx Proxy Manager: all of my docker containers are on br0 and all have static IPs assigned to them, and I route the services I want to expose outside of my network (like Jellyfin, Gitea, etc.). All have SSL certs.
     • I have numerous shares on my network and they all require a username and password to access.
     • I only recently started working on my UnraidOS instance again, in the last couple of days. For the most part I leave it alone, but one of the things I wanted to do was create VLANs at the router level, Unraid level, or software level, so I enabled Virtual Machines and installed the stock CentOS ISO to play around with. I've made some VLANs via Unraid and also tried to do some via my router, but none of it worked, so I've more or less reversed whatever I did.

     Please, any advice would be appreciated. My server crashed at 3AM this morning, so I had to do an unclean shutdown and my parity is currently rebuilding.
  5. @kodyorris Did you ever figure out what happened with your UnraidOS box? I just started getting those logs in the console today and I have no clue what's happening.
  6. Hi @kris_wk Do you have an example of the Avahi logs that you were talking about? I recently had some Avahi logs start spamming my syslog and I'm not exactly sure what it is? Google takes me to this thread and it's kinda shocking to read. My errors are the following:

     Apr 5 20:18:16 AINCRAD avahi-daemon[7137]: Joining mDNS multicast group on interface vethde33b3b.IPv6 with address fe80::3c9e:1dff:fe6b:2d30.
     Apr 5 20:18:16 AINCRAD avahi-daemon[7137]: New relevant interface vethde33b3b.IPv6 for mDNS.
     Apr 5 20:18:16 AINCRAD avahi-daemon[7137]: Registering new address record for fe80::3c9e:1dff:fe6b:2d30 on vethde33b3b.*.
     Apr 5 20:18:19 AINCRAD kernel: br-870a6a64b157: port 6(vethde2abeb) entered disabled state
     Apr 5 20:18:19 AINCRAD kernel: vethcf927ee: renamed from eth0
     Apr 5 20:18:19 AINCRAD avahi-daemon[7137]: Interface vethde2abeb.IPv6 no longer relevant for mDNS.
     Apr 5 20:18:19 AINCRAD avahi-daemon[7137]: Leaving mDNS multicast group on interface vethde2abeb.IPv6 with address fe80::c7:e4ff:fed0:ed63.
     Apr 5 20:18:19 AINCRAD kernel: br-870a6a64b157: port 6(vethde2abeb) entered disabled state
     Apr 5 20:18:19 AINCRAD kernel: device vethde2abeb left promiscuous mode
     Apr 5 20:18:19 AINCRAD kernel: br-870a6a64b157: port 6(vethde2abeb) entered disabled state
     Apr 5 20:18:19 AINCRAD avahi-daemon[7137]: Withdrawing address record for fe80::c7:e4ff:fed0:ed63 on vethde2abeb.
     Apr 5 20:18:21 AINCRAD kernel: veth6b779e3: renamed from eth0
     Apr 5 20:18:21 AINCRAD kernel: br-870a6a64b157: port 7(vethde33b3b) entered disabled state
     Apr 5 20:18:21 AINCRAD avahi-daemon[7137]: Interface vethde33b3b.IPv6 no longer relevant for mDNS.
     Apr 5 20:18:21 AINCRAD avahi-daemon[7137]: Leaving mDNS multicast group on interface vethde33b3b.IPv6 with address fe80::3c9e:1dff:fe6b:2d30.
     Apr 5 20:18:21 AINCRAD kernel: br-870a6a64b157: port 7(vethde33b3b) entered disabled state
     Apr 5 20:18:21 AINCRAD kernel: device vethde33b3b left promiscuous mode
     Apr 5 20:18:21 AINCRAD kernel: br-870a6a64b157: port 7(vethde33b3b) entered disabled state
     Apr 5 20:18:21 AINCRAD avahi-daemon[7137]: Withdrawing address record for fe80::3c9e:1dff:fe6b:2d30 on vethde33b3b.
     Apr 5 20:18:21 AINCRAD kernel: br-870a6a64b157: port 6(veth06c5b41) entered blocking state
     Apr 5 20:18:21 AINCRAD kernel: br-870a6a64b157: port 6(veth06c5b41) entered disabled state
     Apr 5 20:18:21 AINCRAD kernel: device veth06c5b41 entered promiscuous mode
     Apr 5 20:18:21 AINCRAD kernel: br-870a6a64b157: port 6(veth06c5b41) entered blocking state
     Apr 5 20:18:21 AINCRAD kernel: br-870a6a64b157: port 6(veth06c5b41) entered forwarding state
     Apr 5 20:18:22 AINCRAD kernel: eth0: renamed from veth3087c0e
     Apr 5 20:18:22 AINCRAD kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth06c5b41: link becomes ready
     Apr 5 20:18:23 AINCRAD avahi-daemon[7137]: Joining mDNS multicast group on interface veth06c5b
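To see which interfaces are behind this kind of log spam, the avahi-daemon join and leave events can be counted per interface. The sketch below assumes the message wording shown in the log above; `summarize_mdns_events` is a hypothetical helper name. The veth names are the kind Docker gives to the host side of a container's virtual ethernet pair, so a summary like this may help match the churn to containers starting and stopping, although the post itself does not confirm the cause.

```python
import re
from collections import Counter

# Captures the action and interface name from lines such as
# "... avahi-daemon[7137]: Joining mDNS multicast group on interface vethde33b3b.IPv6 ..."
EVENT = re.compile(
    r"avahi-daemon\[\d+\]: (Joining|Leaving) mDNS multicast group "
    r"on interface (\S+?)\.IPv[46]"
)


def summarize_mdns_events(lines):
    """Count join/leave events per interface from syslog lines."""
    counts = Counter()
    for line in lines:
        match = EVENT.search(line)
        if match:
            action, interface = match.groups()
            counts[(interface, action)] += 1
    return counts
```

Kernel lines and truncated entries are ignored because they do not match the pattern.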
  7. What does switching from Macvlan to ipvlan do? Will I still be able to bind my docker containers to specific IPs on my network?
  8. Hi everyone, I've been having a lot of issues with my UnraidOS server randomly freezing, with nothing triggering it as far as I know. The freezes occur at random intervals (from 1 day to 3 months apart) and under random load (from the middle of the day when no one is using the server to the middle of the night when others are connecting to it). I have searched through the forums countless times trying to figure out what is wrong with my build and I can't find anything that points me in the right direction. I would love some input on what else to try, because every time I attempted one of the fixes below it would work for a while and then randomly crash, which is quite heartbreaking because I keep thinking I fixed it.

     About 4 months ago I updated to 6.11.5 and updated some hardware, and the crashes have been happening for the last 2 months or so. I used to leave my server alone in the corner of the room and only touch it whenever I wanted to add something new, but the number of crashes and the uncertainty recently have really been bothering me and I'd love to get some help!

     Hardware specs:

     • AMD Ryzen 9 5950X 16-Core @ 3400 MHz
     • X570S AERO G
     • 4 x 32GB @ 2133 MHz
     • 2 x 2TB Samsung SSD 970 EVO (cache pool)
     • 2 x 18TB HDDs for parity
     • 3 x 4TB HDDs for data
     • 4 x 8TB HDDs for data
     • NVIDIA GeForce GTX 1060 3GB (for Tdarr encoding, used rarely)

     Things I have attempted:

     • Disabled XMP profile
     • MemTest86, ran with no errors whatsoever
     • Disabled C-States globally
     • Pinned my docker containers to specific CPU cores

     This is the first time I've had a crash with the Unraid server hooked up to a monitor with a syslog tail (in the past I've used the syslog server and that never captured any useful information), which is why I have this screenshot.

     Following up on some forum research, I saw a post here reference something to do with docker and switching to ipvlan after 6.10+, but the first URL is broken. Is there any information regarding this?

     I have a good chunk of docker containers, but the majority are on br0 with custom IPs. I have Virtual Machines enabled but I don't have any running. Any help would be greatly appreciated, thanks in advance!

     aincrad-diagnostics-20230328-1832.zip
  9. Implemented a monthly scrub after running a scrub once more with no errors. Thank you for the input on the schedule and utilization! Much appreciated
  10. Got it, thank you for the input! After I restarted my system and the cache got mounted as read-only, I stopped the array and mounted just the cache. I was then able to delete the syslog file, and then I ran the btrfs scrub, which fixed a bunch of errors (thankfully no uncorrectable errors). After that I did a "btrfs device stats -z /mnt/cache" to zero out the numbers. Since then it has been running smoothly with no issues (last few days).

      By re-do the cache, do you mean change all my shares to the array and then re-format both of the cache drives in my pool?

      Thank you for clarifying how it allocates chunks. I didn't realize that it dynamically adds and removes them as needed. Since the total was rather close to the size of my old drives (around 256GB), I thought it was something related to that rather than file system corruption. I followed your advice and set up alerts so that I will know if there are ever errors and can do a scrub. Do you think it is still worth it to move the cache to the array and then reformat my pool?

      EDIT: Upon checking my pool I see that balance and scrub are disabled. Should I enable them on some sort of schedule?
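The alerting idea mentioned in this post can be sketched as a small parser for the text printed by "btrfs device stats /mnt/cache", which lists one "[device].counter value" pair per line. This is a minimal sketch and not the alert script the poster actually set up; `nonzero_error_counters` is a hypothetical name, and the device names used in any sample input are placeholders.

```python
def nonzero_error_counters(stats_output: str) -> dict:
    """Return {"[device].counter": value} for every counter above zero."""
    problems = {}
    for line in stats_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, value = parts
        if value.isdigit() and int(value) > 0:
            problems[name] = int(value)
    return problems
```

An empty result means every counter is zero; anything else is a reason to run a scrub and look at the syslog.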
  11. Thank you! I've attached my fdisk -l result. The partition does span the full disk, so how do I increase the currently allocated btrfs size in UnraidOS so that it can use the whole drive?
  12. Hi guys, this morning my Unraid system suddenly started throwing a bunch of errors stating "BTRFS error writing primary super block to device". It completely filled up my syslog and alerted me to a problem. After running btrfs dev stats /mnt/cache it looked like there were a bunch of errors, so I ended up remotely shutting down my Unraid system. When I got home I restarted the system and it mounted my /mnt/cache drive in read-only mode. It is currently using 275GB with 1.72TB free according to the Unraid dashboard.

      After a lot of searching on the forums I think I finally figured out the issue. I did a migration from an old Unraid build with the following steps for the cache pool:

      1. Nvme_1_256GB and Nvme_2_256GB SSDs were in the system
      2. Removed Nvme_2_256GB and inserted Nvme_1_2TB
      3. Rebuilt cache pool (now Nvme_1_256GB and Nvme_1_2TB)
      4. Removed Nvme_1_256GB and inserted Nvme_2_2TB
      5. Rebuilt cache pool (now Nvme_1_2TB and Nvme_2_2TB)

      I think what happened is that when the pool had different NVMe drives, it copied over the old partition or RAID layout, and now that I've passed the size of the original 256GB NVMe, it is failing to write. If I open up my cache pool I can see the btrfs filesystem df output, which indicates that the total size is not the full 2TB but rather ~262GB, and my usage ratio is 96.8%.

      Now that I have my 2TB NVMe drives, how do I fix it so that it uses the full 2TB? The drive has been mounted as read-only, so I can't run the mover to move files off this drive and do a normal swap. I am planning to copy the files from /mnt/cache onto a disk on my array and then reformat the two cache drives. Is this the correct way to ensure that all 2TB are available for the cache pool and to fix the issue? Is there another way to fix this issue and bring my cache back online?

      No matter what, I'm copying everything off the read-only /mnt/cache to /dev/disk7/restore using Midnight Commander so that I have a backup of everything. I also attached my diagnostics!

      EDIT: I was able to successfully unmount and mount my cache pool, and I deleted a 2GB syslog file, which brought it back within operating size. I then did a btrfs scrub and fixed a bunch of errors. Everything is back to working, but I feel like I'm playing with fire being so close to the max size. So the new main question is: how do I alter the size of the RAID1 partition on my cache pool, since it is still set from the old 256GB drive? Thanks for the help!

      aincrad-diagnostics-20230308-1910.zip
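One approach often suggested for a btrfs pool whose devices were swapped for larger ones is to grow each device to its maximum with "btrfs filesystem resize <devid>:max <mount point>". The sketch below only builds those command lines and runs nothing. It assumes the filesystem simply was not grown after the replacement, which the post does not confirm, and the device IDs 1 and 2 in the usage note are hypothetical; the real ones would come from "btrfs filesystem show /mnt/cache".

```python
def resize_to_max_commands(mount_point: str, device_ids):
    """Build one 'btrfs filesystem resize <devid>:max' command per pool device."""
    return [
        ["btrfs", "filesystem", "resize", f"{devid}:max", mount_point]
        for devid in device_ids
    ]
```

For example, `resize_to_max_commands("/mnt/cache", [1, 2])` yields two argument lists that could be reviewed and then passed to `subprocess.run` once a backup exists.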
  13. Having the same issue. I stupidly updated this container thinking it would be fine and it just went to hell. I managed to get the php artisan issue fixed by adding DB_PORT and DB_NAME as variables for the container; however, it now throws a 500 whenever I try to log in. Nothing seems to fix it.

      EDIT: I went through the latest docker versions, downgrading until it worked again. I made it to "akaunting/akaunting:3.0.0", at which point everything started working again. Moral of the story: don't update your main instance.
  14. +1 again. My server had a hard crash, and upon restart I did the usual steps: I ensured the shim-br0 network is up (via the commands from OP) and toggled the Docker setting "Host access to custom networks". It is still failing.

      EDIT: I ended up following this blog post, "UnraidOS host access to custom networks [Fix]" https://blog.siglerdev.us/unraidos/, which pointed me in the right direction. I forgot that my network is 192.168.54.X, not 192.168.1.X, so all I had to do was add the custom scripts plugin and add a script that runs the ip link code with the correct subnet. I.e., for me it had to be 54 when doing the ip link route add.
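The mistake described in this post (a route hardcoded for 192.168.1.X on a 192.168.54.X network) can be avoided by deriving the subnet from the host's own address. The sketch below builds only an "ip route add" command for the shim interface; the exact commands from the OP and the blog are not reproduced in this post, so the command shape, the /24 prefix, and the example host address are assumptions for illustration.

```python
import ipaddress


def shim_route_command(host_ip: str, prefix_len: int = 24, shim: str = "shim-br0"):
    """Build the 'ip route add' command for the LAN subnet the host is actually on."""
    network = ipaddress.ip_interface(f"{host_ip}/{prefix_len}").network
    return ["ip", "route", "add", str(network), "dev", shim]
```

With a hypothetical host address of 192.168.54.10, the function produces a route for 192.168.54.0/24 instead of a copied 192.168.1.0/24.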
  15. +1 to this issue. I am on 6.11.5. I used to set "Host access to custom networks" to disabled, start docker, stop docker, and then set "Host access to custom networks" to enabled and start docker again. This used to fix it pre-6.11.5. Now, on the latest release, I can't get connectivity between my host and the docker containers on the custom br0 network. Any help would be appreciated. I've tried a full restart, resetting the router, and multiple iterations of turning docker on/off.
  16. +1 Running into issues with Host access to custom networks on 6.11.5
  17. Hi, since this is an old thread that never got answered: I did manage to figure it out. I have a build server using Drone CI, and in one of the stages, after it is done compiling, building, and publishing to my container repository, it runs a command-line "curl" against a "Watchtower" container. Once set up, Watchtower can be pointed at a docker container on Unraid and will pull the latest version and redeploy it via a webhook.
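The "curl" step described in this post can be sketched as follows, assuming Watchtower is running with its HTTP API update mode enabled and an API token configured. The base URL and token in the usage note are hypothetical placeholders; the poster's actual Drone CI step is not shown. The function only builds the request, so nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import urllib.request


def build_watchtower_update_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the request that asks Watchtower to check for updates."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/update",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

In a pipeline stage this would look like `urllib.request.urlopen(build_watchtower_update_request("http://192.168.1.50:8080", "mytoken"))`, which mirrors a curl call with an Authorization header.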
  18. Hi, I currently have 2 x 256GB Samsung NVMe drives, and during Black Friday I bought 2 x 2TB Samsung NVMe drives. I am looking into upgrading the drives in my cache pool. As of now, I have the following share settings:

      Prefer: Cache
      • appdata (~42GB)
      • domains (empty afaik)
      • system (empty afaik)

      Yes: Cache
      • isos (empty afaik)
      • Development (~10GB of source code)

      I also have my docker.img size in the settings set to 150GB, for a rough total of 200GB of space used on my 256GB cache pool. I found this from the Unraid 6 FAQ and this, and it seems straightforward enough, but it only talks about replacing one of the drives. The issue is that I only have 2 NVMe slots on my board, so I have to do a swap. Are there any issues with the following procedure?

      Conditions
      • Slot #1 - Samsung 256GB
      • Slot #2 - Samsung 256GB
      • Out of build - Samsung 2TB
      • Out of build - Samsung 2TB

      Procedure
      1. Stop the array
      2. Unassign Slot #1's Samsung 256GB (do NOT change cache pool size)
      3. Start the array (after checking the "I'm sure" box next to the start array button)
      4. Wait until the "Stop Array" button is available
      5. Shut down the computer, move a Samsung 2TB into Slot #1, and take the Slot #1 Samsung 256GB out of the build
      6. Turn on UnraidOS and assign Slot #1's Samsung 2TB drive to the cache pool
      7. Start the array (this will then copy the data from the other cache drive to the new 2TB drive?)
      8. Wait until the "Stop Array" button is available
      9. Repeat the above steps, but with Slot #2 instead of Slot #1?

      Questions
      • Will the above procedure work for the upgrade?
      • When I unassign one of the drives and then insert the new 2TB drive, will whatever data is on the currently active cache drive be copied over onto the new 2TB drive?
      • Are the cache pool drives mirrored entirely, or is the data striped?
      • I store data on the cache drive that I want to persist (appdata and Development) but that should be fast to read and write since it changes a lot. Is this the correct way to do it?
  19. Anyone have any ideas on how I would do this? Is there any sort of API I can call from within a docker container to update another docker container on the host system?
  20. Hi, I have set up Gitea and Drone CI docker containers which build and deploy "MyDockerProject" (a container run on my Unraid machine). I have all the synchronization between them done, with the end result being my container published to Docker Hub. Right now I still have to manually go in and force-update "MyDockerProject" for it to pull down the latest version of my published container. Is it possible to have Unraid pull and deploy a new image from Docker Hub when triggered from within a docker container? I.e., how can I have my Drone CI docker container tell UnraidOS that a new update for "MyDockerProject" has been published, so that Unraid goes ahead and forces the update? Best Regards, 97WaterPolo
  21. Hi, I currently have a Deco M5 router and WireGuard set up. I was running a few of my docker containers for a while in bridge mode on different ports of my Unraid server, but while restructuring everything I set up custom networks, and each of my docker containers has its own IP now. Once I moved them over to their own IPs, I was no longer able to access them from any other device when connected to my WireGuard VPN. After a bunch of digging I found the remark in the settings that says "Remark: docker containers on custom networks need static routing 10.253.0.0/24 to 192.168.68.114". My issue, however, is that my router, the Deco M5, doesn't support static routing. It is a feature that has been requested for over a year, but nothing has happened. Is there any other way around this so I can access my LAN when connected remotely through WireGuard?

      From https://forums.unraid.net/topic/84226-wireguard-quickstart/

      With "Use NAT" = No and "Host access to custom networks" = enabled and static route
      • server and dockers on bridge/host - accessible!
      • VMs and other systems on LAN - accessible!
      • dockers with custom IP - accessible!
      • (woohoo! the recommended setup for complex networks)

      This seems to be the only configuration that allows for docker with custom IPs, but if I can't set a static route, what should I do? Thank you!
  22. Hi @andrew207, thanks for the quick reply! Yes, I am able to complete 3: I can successfully send a HEC message via Postman and search it. I followed the documentation at https://docs.docker.com/config/containers/logging/splunk/, according to which I should be able to append those arguments to the end and have the container write to the HEC. Could you go into some detail on how you set up your containers with the volumes? One of the reasons I really liked the HEC concept was that all I had to do was generate a token for each of my 8 containers and change that argument, and the logs would be searchable by the token name. I'm not opposed to the volume approach, but I would just like to get all my docker container logs in a single place and sort through them.

      EDIT: Never mind, I got it to work using the HEC via the command-line arguments! Thanks!
  23. Hi, I am having some issues getting Splunk set up for the HTTP Event Collector (HEC). I installed the template, changed "DataPersist" to one of my data shares, and set the container to my custom network type (not bridge). I was able to post to one of the tokens using Postman. I set up two tokens for HEC and attached these extra arguments to one of my containers. The docker container starts up fine (it failed with ECONNREFUSED before I had the HEC tokens enabled), but nothing is being written to Splunk. What else should I do to fix this? My understanding is that if I set those flags above, the docker container should use the Splunk log driver and automatically write all the logs that would appear in the Docker popup log window into Splunk?
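The extra arguments referred to in this post are not shown, so the following is only a sketch of what arguments for Docker's Splunk logging driver generally look like, using the option names from the Docker logging documentation. The HEC URL and token are hypothetical placeholders, and `splunk_log_args` is an illustrative helper, not part of any template.

```python
def splunk_log_args(hec_url: str, token: str, insecure_skip_verify: bool = False):
    """Build extra 'docker run' arguments that enable the Splunk logging driver."""
    args = [
        "--log-driver=splunk",
        "--log-opt", f"splunk-url={hec_url}",
        "--log-opt", f"splunk-token={token}",
    ]
    if insecure_skip_verify:
        # Only for self-signed HEC certificates; this disables TLS verification.
        args += ["--log-opt", "splunk-insecureskipverify=true"]
    return args
```

Joining the result with spaces, for example `" ".join(splunk_log_args("https://192.168.1.60:8088", "my-hec-token"))`, gives a single string of the kind that could be pasted into a container's extra parameters field.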
  24. In regards to the above question, navigating to http://[public ip] gets me a page that says "Wrong site from NginxProxyManager". I'm not sure what happened, as I haven't touched this container in a few days, but now all redirects are working as specified. I can no longer get to the NPM login page from any HTTP request.

      • request.mydomain.com
      • http://request.mydomain.com
      • request.mydomain.com:80
      • http://request.mydomain.com:80

      Thanks for the help trying to fix, much appreciated!!!