OddMagnet

  1. Again, the interface spam happened at 8am on Sunday. I've tried getting logs from docker via this script:

         #!/bin/bash
         cd /mnt/user/appdata
         timeout 2h docker compose logs -f -t --since=1s > /mnt/user/data/docker-compose.log

     However, this didn't record properly for the full duration:

         Script Starting Apr 28, 2024 07:00.01
         Full logs for this script are available at /tmp/user.scripts/tmpScripts/Get Docker Compose Logs/log.txt
         error from daemon in stream: Error grabbing logs: unexpected EOF
         Script Finished Apr 28, 2024 07:01.21
         Full logs for this script are available at /tmp/user.scripts/tmpScripts/Get Docker Compose Logs/log.txt

     I don't think it's a problem with the script, since it recorded for over a minute. The last line from the logs doesn't seem to be the problem either:

         autobrr | 2024-04-28T05:01:21.043895040Z {"level":"debug","module":"filter","method":"CheckFilter","time":"2024-04-28T07:01:21+02:00","message":"(Cross-Seed) external filter check not matching what filter wanted"}
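One possible workaround for the early "unexpected EOF" is to restart the log stream whenever it ends, until the capture window is over. The following is only a sketch under assumptions: `run_until` is a hypothetical helper name (not a Docker feature), and restarting with `--since=1s` may leave small gaps or duplicate lines around each restart.

```shell
#!/bin/bash
# Hypothetical helper (not part of Docker): keep re-running a streaming
# command until a deadline, so an early "unexpected EOF" does not end the capture.
run_until() {
  local deadline=$(( $(date +%s) + $1 ))
  shift
  local remaining
  while :; do
    remaining=$(( deadline - $(date +%s) ))
    [ "$remaining" -gt 0 ] || break
    # timeout caps each attempt at the time left; a failed attempt is ignored on purpose
    timeout "$remaining" "$@" || true
    sleep 1
  done
}

# Usage with the paths from the script in the post (note >> so restarts append):
# cd /mnt/user/appdata
# run_until 7200 docker compose logs -f -t --since=1s >> /mnt/user/data/docker-compose.log
```

The output is appended rather than overwritten, so each restart adds to the same file instead of truncating it.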
  2. So I've set up a user script to start logging docker events on Sundays, one hour before the interface spam happens. Looking at the log, I can see a few things:

       • container exec_create → exec_start → exec_die happens a lot. After a quick Google search I learned that those events also happen for healthchecks, which is the case in my logs
       • a lot of container kill / die / stop events happen after the interface spam has started
       • there are also a lot of network disconnect events for my docker bridge network

     I've edited the log to remove all the healthchecks and attached it here. I don't think there are any pointers in there, but if anyone is bored enough to take a look at it, I'd appreciate it.

     Additionally, I've checked the docker compose logs (with --since and --until), but there's not much to see before the time of the problem. (It doesn't help that not all log lines contain times and the containers don't all use the same time format.)

     For now I'll set up another script to capture the docker compose logs, and hopefully that'll be more helpful next week.

     docker-events-edited.log
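The healthcheck cleanup described above could be done with a filter along these lines. This is a rough sketch: `strip_exec_events` is a hypothetical helper name, it assumes the default text output of `docker events` (where the object type and action appear as `container exec_create`, `container exec_start` and `container exec_die`), and it drops all exec events, not only those caused by healthchecks.

```shell
#!/bin/bash
# Hypothetical helper: read a docker events log on stdin and drop the
# exec_create / exec_start / exec_die lines that healthchecks generate.
# Caveat: manual "docker exec" calls are removed as well.
strip_exec_events() {
  grep -Ev ' container exec_(create|start|die)'
}

# Usage (file names are placeholders, not the attached file):
# strip_exec_events < docker-events.log > docker-events-filtered.log
```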
  3. Changed the cable and router port, hopefully it's not gonna happen again. Everything (router, cable, server) is brand-new, but I'd much rather have the cable be the culprit, lol. What's weird, though, is that eth0 always comes up again when I connect a keyboard and monitor, log in and start the diagnostics command. (And obviously, losing the physical link when nothing is moving is extremely weird in the first place.)
  4. That'd suck a lot. I guess that'd explain the drops for eth0 as well? Though it's weird that there are only drops on receive:

         Errors info   Receive counters   Transmit counters
         eth0          Errors: 0          Errors: 0
                       Drops: 19739       Drops: 0
                       Overruns: 0        Overruns: 0

     Any suggestions on how I'd be able to verify which parts of the chain are good and where the problem actually is?
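One small check that might help narrow this down is whether the receive drop counter is still growing, for example before and after swapping a cable. This is only a sketch: `rx_dropped_delta` is a hypothetical helper, and it assumes the standard Linux sysfs counters under `/sys/class/net/<interface>/statistics`.

```shell
#!/bin/bash
# Hypothetical helper: print how much the rx_dropped counter grew over an interval.
# $1 = statistics directory (assumed: /sys/class/net/eth0/statistics)
# $2 = interval in seconds
rx_dropped_delta() {
  local stats_dir=$1 interval=$2 before after
  before=$(cat "$stats_dir/rx_dropped")
  sleep "$interval"
  after=$(cat "$stats_dir/rx_dropped")
  echo $(( after - before ))
}

# Usage: drops received on eth0 during one minute
# rx_dropped_delta /sys/class/net/eth0/statistics 60
```

A result of 0 over a longer interval would suggest the drops are historical rather than ongoing; it does not by itself identify which part of the chain is at fault.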
  5. Had a look at the syslog, and this time it seems different: there's no interface spam before eth0 goes down. (I don't think it previously even had an entry about eth0 going down.) @JorgeB can you take another look at the new diagnostics? oddnas-diagnostics-20240403-0729.zip
  6. So it happened again, so I guess it's not a Sunday-only thing then. Again, no container restarts, and nothing in the container or docker logs that would indicate any problems.
  7. So it happened again, or at least the "soft version" where all my containers restart after the interface spam. This seems to mostly happen on Sunday mornings. I've looked over all my container logs, but not one of them had anything in it that would point to a problem. Additionally, I've checked all my container settings, but none of them are configured to do anything at around that time. It doesn't look like a docker problem to me. I'll try disabling some of my containers over the coming weeks and months, but it's gonna be a very tedious process of narrowing it down (if they're the problem). In the meantime, is there anything else I could check? I'm still confused why there is any mention of IPv6 in my logs when I don't even have it enabled.
  8. Still looking for help with this problem. I'd really like to solve the interface spam in my logs, which I believe is the root cause of my problem
  9. Still looks like a problem to me though. To me it seems that WebGUI and SSH broke because the interfaces were down (aside from the br-... and veth... interfaces). It's hard to believe that the spam I mentioned in my second post is completely unrelated to the interfaces going down. I don't understand why there are so many veth... interfaces in the first place, why the ports of the br-... interfaces are spamming my log so much, or why there are entries about IPv6 when that's not even enabled in my settings.
  10. Containers all had 100% uptime (other than intentional restarts - e.g. when I changed things in the compose file)
  11. Looking back, there has also been a lot of this, though I'm not sure if it's related:

         Mar 18 09:58:46 OddNas kernel: veth5e5dac8: renamed from eth0
         Mar 18 09:58:46 OddNas kernel: br-db82cd71027f: port 20(veth3e692a8) entered disabled state
         Mar 18 09:58:46 OddNas kernel: br-db82cd71027f: port 20(veth3e692a8) entered disabled state
         Mar 18 09:58:46 OddNas kernel: device veth3e692a8 left promiscuous mode
         Mar 18 09:58:46 OddNas kernel: br-db82cd71027f: port 20(veth3e692a8) entered disabled state
         Mar 18 09:58:46 OddNas kernel: br-db82cd71027f: port 20(veth7260a7f) entered blocking state
         Mar 18 09:58:46 OddNas kernel: br-db82cd71027f: port 20(veth7260a7f) entered disabled state
         Mar 18 09:58:46 OddNas kernel: device veth7260a7f entered promiscuous mode
         Mar 18 09:58:46 OddNas kernel: br-db82cd71027f: port 20(veth7260a7f) entered blocking state
         Mar 18 09:58:46 OddNas kernel: br-db82cd71027f: port 20(veth7260a7f) entered forwarding state
         Mar 18 09:58:47 OddNas kernel: br-db82cd71027f: port 20(veth7260a7f) entered disabled state
         Mar 18 09:58:48 OddNas kernel: eth0: renamed from veth190d797
         Mar 18 09:58:48 OddNas kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth7260a7f: link becomes ready
         Mar 18 09:58:48 OddNas kernel: br-db82cd71027f: port 20(veth7260a7f) entered blocking state
         Mar 18 09:58:48 OddNas kernel: br-db82cd71027f: port 20(veth7260a7f) entered forwarding state

     Some more stuff that might be noteworthy:

       • Looking at ifconfig.txt, there are a lot of vethsomething interfaces, no idea why
       • In the same file, aside from the above-mentioned interfaces, only some br-... interface was up
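To get a feel for how bursty this bridge port spam is, the state changes can be counted per timestamp. A minimal sketch, assuming the syslog line format shown above; `bridge_events_per_second` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read syslog lines on stdin and count, per timestamp,
# how many "br-...: port N(veth...) entered ... state" messages were logged.
# Output format: "<count> <month> <day> <time>", sorted by timestamp text.
bridge_events_per_second() {
  awk '/br-[0-9a-f]+: port [0-9]+\(veth[0-9a-f]+\) entered/ { n[$1 " " $2 " " $3]++ }
       END { for (t in n) print n[t], t }' | sort -k2
}

# Usage (log path is an assumption):
# bridge_events_per_second < /var/log/syslog
```

Seconds with unusually high counts mark the bursts, which can then be compared against the docker events log for the same timestamps.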
  12. I've had the WebGUI and SSH break one or two times before already and simply rebooted. This time I figured I might as well connect my keyboard and monitor and create a diagnostics file. Weirdly enough, right when I pressed enter to start the diagnostics tool, the WebGUI and SSH started working again. I checked the logs and this seems to be the timeline of what happened (unless I'm misinterpreting something):

       • 20:11 → WebGUI and SSH stopped working
       • 20:20 → connected keyboard
       • 20:23 → connected monitor
       • 20:31 → started diagnostics

         Mar 20 17:05:06 OddNas sshd[8859]: Starting session: shell on pts/0 for root from 192.168.178.80 port 50226 id 0
         Mar 20 20:11:15 OddNas kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down
         Mar 20 20:11:15 OddNas kernel: br0: port 1(eth0) entered disabled state
         Mar 20 20:11:18 OddNas ntpd[1127]: Deleting interface #1 br0, 192.168.178.200#123, interface stats: received=7185, sent=7186, dropped=0, active_time=1387370 secs
         Mar 20 20:11:18 OddNas ntpd[1127]: 216.239.35.12 local addr 192.168.178.200 ->
         Mar 20 20:11:18 OddNas ntpd[1127]: 216.239.35.8 local addr 192.168.178.200 ->
         Mar 20 20:11:18 OddNas ntpd[1127]: 216.239.35.4 local addr 192.168.178.200 ->
         Mar 20 20:11:18 OddNas ntpd[1127]: 216.239.35.0 local addr 192.168.178.200 ->
         Mar 20 20:20:51 OddNas kernel: usb 1-1: new full-speed USB device number 4 using xhci_hcd
         Mar 20 20:20:51 OddNas kernel: input: Corsair Corsair Gaming K55 RGB Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1:1.0/0003:1B1C:1B3D.0002/input/input5
         Mar 20 20:20:51 OddNas kernel: hid-generic 0003:1B1C:1B3D.0002: input,hidraw0: USB HID v1.11 Keyboard [Corsair Corsair Gaming K55 RGB Keyboard] on usb-0000:00:14.0-1/input0
         Mar 20 20:20:51 OddNas kernel: input: Corsair Corsair Gaming K55 RGB Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1:1.1/0003:1B1C:1B3D.0003/input/input6
         Mar 20 20:20:51 OddNas kernel: input: Corsair Corsair Gaming K55 RGB Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1:1.1/0003:1B1C:1B3D.0003/input/input7
         Mar 20 20:20:51 OddNas kernel: input: Corsair Corsair Gaming K55 RGB Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1:1.1/0003:1B1C:1B3D.0003/input/input8
         Mar 20 20:20:51 OddNas kernel: hid-generic 0003:1B1C:1B3D.0003: input,hiddev96,hidraw1: USB HID v1.11 Keyboard [Corsair Corsair Gaming K55 RGB Keyboard] on usb-0000:00:14.0-1/input1
         Mar 20 20:20:51 OddNas kernel: hid-generic 0003:1B1C:1B3D.0004: hiddev97,hidraw2: USB HID v1.11 Device [Corsair Corsair Gaming K55 RGB Keyboard] on usb-0000:00:14.0-1/input2
         Mar 20 20:23:11 OddNas kernel: fbcon: i915drmfb (fb0) is primary device
         Mar 20 20:23:11 OddNas kernel: Console: switching to colour frame buffer device 320x90
         Mar 20 20:23:11 OddNas kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
         Mar 20 20:23:33 OddNas login: pam_unix(login:auth): authentication failure; logname=LOGIN uid=0 euid=0 tty=/dev/tty1 ruser= rhost= user=root
         Mar 20 20:23:36 OddNas login: FAILED LOGIN 1 FROM tty1 FOR root, Authentication failure
         Mar 20 20:24:04 OddNas login: pam_unix(login:session): session opened for user root(uid=0) by LOGIN(uid=0)
         Mar 20 20:24:04 OddNas login: ROOT LOGIN ON tty1
         Mar 20 20:26:46 OddNas sshd[8859]: Read error from remote host 192.168.178.80 port 50226: No route to host
         Mar 20 20:26:46 OddNas sshd[8859]: pam_unix(sshd:session): session closed for user root
         Mar 20 20:27:17 OddNas root: ACPI action up is not defined
         Mar 20 20:27:18 OddNas root: ACPI action left is not defined
         Mar 20 20:27:18 OddNas root: ACPI action left is not defined
         Mar 20 20:27:18 OddNas root: ACPI action left is not defined
         Mar 20 20:27:18 OddNas root: ACPI action left is not defined
         Mar 20 20:27:18 OddNas root: ACPI action left is not defined
         Mar 20 20:27:19 OddNas root: ACPI action left is not defined
         Mar 20 20:27:22 OddNas root: ACPI action up is not defined
         Mar 20 20:27:22 OddNas root: ACPI action down is not defined
         Mar 20 20:31:39 OddNas kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
         Mar 20 20:31:39 OddNas kernel: br0: port 1(eth0) entered blocking state
         Mar 20 20:31:39 OddNas kernel: br0: port 1(eth0) entered forwarding state
         Mar 20 20:31:41 OddNas ntpd[1127]: Listen normally on 2 br0 192.168.178.200:123
         Mar 20 20:31:41 OddNas ntpd[1127]: new interface(s) found: waking up resolver
         Mar 20 20:31:45 OddNas emhttpd: read SMART /dev/sdb
         Mar 20 20:32:19 OddNas sshd[13939]: Connection from 192.168.178.80 port 54474 on 192.168.178.200 port 22 rdomain ""
         Mar 20 20:32:19 OddNas sshd[13939]: Failed publickey for root from 192.168.178.80 port 54474 ssh2: ED25519 SHA256:GInj1AiL72FiRCs+JM71XjgEElxEiNHUoR508RkQS3g
         Mar 20 20:32:19 OddNas sshd[13939]: Postponed keyboard-interactive for root from 192.168.178.80 port 54474 ssh2 [preauth]
         Mar 20 20:32:22 OddNas sshd[13939]: Postponed keyboard-interactive/pam for root from 192.168.178.80 port 54474 ssh2 [preauth]
         Mar 20 20:32:22 OddNas sshd[13939]: Accepted keyboard-interactive/pam for root from 192.168.178.80 port 54474 ssh2
         Mar 20 20:32:22 OddNas sshd[13939]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
         Mar 20 20:32:22 OddNas sshd[13939]: Starting session: shell on pts/0 for root from 192.168.178.80 port 54474 id 0

     Full diagnostics are attached.

     oddnas-diagnostics-20240320-2031.zip
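The link drops in a log like the one above can be pulled out without reading every line. A minimal sketch, assuming the e1000e message format shown in the excerpt and the 15-character syslog timestamp; `link_transitions` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read syslog lines on stdin and print only the
# eth0 link transitions as "<timestamp> Up" or "<timestamp> Down".
link_transitions() {
  sed -nE 's/^(.{15}) .*eth0: NIC Link is (Up|Down).*/\1 \2/p'
}

# Usage (log path is an assumption):
# link_transitions < /var/log/syslog
```

For the excerpt in the post this gives one Down entry at 20:11:15 and one Up entry at 20:31:39, which matches the timeline of the outage.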
  13. dhcpcd comes installed by default, afaik. I wasn't able to find anything saying otherwise. For now I've just disabled IPv6, but I would love to find a real solution to this.
  14. The Docker usage on my Dashboard is at 904 GiB.

     That happens to be pretty much exactly the amount of space actually used on my cache disk (904 GiB => 970 GB).

     A couple of days ago, when my cache was almost full from transferring files:

     My settings for Docker:

     Why is the usage showing like this? And is there any way to "fix" this? It's not affecting performance, but it'd be nice to see how much my Docker images are really using. (I checked, and the actual size of all images is 81 GB.)
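The 904 GiB => 970 GB figure is just the binary-to-decimal unit conversion (1 GiB = 1024^3 bytes, 1 GB = 1000^3 bytes). A quick sketch to check it; `gib_to_gb` is a hypothetical helper that truncates to whole GB.

```shell
#!/bin/bash
# Hypothetical helper: convert GiB to GB using integer arithmetic.
# 1 GiB = 1073741824 bytes, 1 GB = 1000000000 bytes; the result is truncated.
gib_to_gb() {
  echo $(( $1 * 1073741824 / 1000000000 ))
}

gib_to_gb 904   # prints 970
```

So the Dashboard value and the cache disk usage are the same number of bytes expressed in different units.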
  15. Still got the problem, though it looks like it's not only spamming deletions now:

         Feb 16 12:39:43 OddNas avahi-daemon[17573]: Registering new address record for ADDRESS on br0.*.
         Feb 16 12:39:47 OddNas dhcpcd[1016]: br0: deleting address ADDRESS/64
         Feb 16 12:39:47 OddNas avahi-daemon[17573]: Withdrawing address record for ADDRESS on br0.
         Feb 16 12:41:36 OddNas dhcpcd[1016]: br0: deleting address ADDRESS/64
         Feb 16 12:42:44 OddNas avahi-daemon[17573]: Registering new address record for ADDRESS on br0.*.