Mlatx

Members
  • Posts: 148

Everything posted by Mlatx

  1. I want to report back. Going from macvlan to ipvlan solved the issue. I'm not using any custom IPs. Using ipvlan with custom IPs caused intermittent connectivity problems in my containers. If I need custom IPs in the future, I'll set up a VLAN for docker, roughly as sketched below.
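     If I do go the VLAN route later, my rough plan is something like this (a sketch only; the subnet, gateway, and parent interface are placeholders for my setup, not verified values):

       # Create a dedicated ipvlan docker network on a tagged VLAN
       # (br0.10 and the 192.168.10.0/24 range are hypothetical)
       docker network create -d ipvlan \
         --subnet=192.168.10.0/24 --gateway=192.168.10.1 \
         -o parent=br0.10 vlan10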
  2. I've changed it to ipvlan. I am not having the same issues I had when I had custom IPs, so let's see if this addresses the problem. I'll post back if this solution works.
  3. How do I find out if there is some remnant of a custom IP address? I saw this error just pop up:

       php-fpm[5996]: [WARNING] [pool www] server reached max_children setting (50), consider raising it

     Is that a macvlan issue as well?
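     If I do end up raising that limit, I assume it would look something like this (the container name and config path are guesses; they vary by image):

       # Locate the current pm.max_children value inside the container
       docker exec -it my-php-container sh -c "grep -R 'max_children' /etc/php* 2>/dev/null"
       # Raising it means editing the pool config, e.g. pm.max_children = 100,
       # then restarting the container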
  4. I don't have any custom IP addresses. I did before, and that's when the macvlan issue showed up. I posted in another thread I started. ipvlan did not fix the issue; it caused other issues. Then I removed all custom IP addresses. I have the standard bridge network for internal dockers, and I created a network, proxynet, for external dockers. My nzbget no longer routes through delugevpn. I never had these issues until 6.10.
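     For the record, this is how I'd verify nothing is left over (proxynet is my own network name, not a built-in):

       # List all docker networks; anything unexpected is a leftover
       docker network ls
       # Show the driver, subnet, and attached containers for one network
       docker network inspect proxynet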
  5. I think I fixed the trim issue. I did not realize I had my cache drives on the SAS controller, which apparently doesn't pass trim through. Now that they are connected to the mobo controller, trim works. I'm not sure if that is what caused the lockup. There are some warnings and errors in the log today related to netfilter, kworker, and a call trace. server-diagnostics-20221020-0909.zip
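     For anyone hitting the same thing, this is roughly how to confirm whether the drive's current controller passes TRIM through (sdg is just my cache drive's letter):

       # Non-zero DISC-GRAN / DISC-MAX columns mean discard (TRIM) is supported
       lsblk --discard /dev/sdg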
  6. I recently had issues with macvlan crashing unraid. I tried to go with ipvlan, but issues with that caused me to revert back to macvlan without unique IPs. That's been great, with no macvlan issues in the logs. Today, my server crashed shortly after what looks like the disks spinning down; that was the last entry in the log before the restart. The PSU is new, so I doubt it is a power issue. For now, I have disabled disk spin-down. Also, I see ipv6 entries in the log, but unraid and pfsense are both set to not use ipv6. What would cause disk spin-down to freeze the system? Why are ipv6 entries shown?

     To add some more context, I am seeing this error throughout the log, not shown in the attached snippet:

       Oct 15 00:00:20 server kernel: critical target error, dev sdg, sector 1950353247 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
       Oct 15 00:00:20 server kernel: BTRFS warning (device sdg1): failed to trim 1 device(s), last error -121

     I tried to manually run trim with fstrim -v /mnt/cache/, and the same error showed up. I'll run an extended SMART test on my cache drive. Could this be an imminent cache failure, or is it something else? log.txt
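     For reference, the exact commands (sdg is my cache drive; adjust for yours):

       # Manually trim the cache mount; this reproduced the -121 error for me
       fstrim -v /mnt/cache/
       # Start an extended SMART self-test, then check the results once it finishes
       smartctl -t long /dev/sdg
       smartctl -a /dev/sdg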
  7. My docker image got corrupted and was in read-only mode. I deleted it, restarted the docker service, and re-installed my apps. Deleting the docker image wiped out proxynet, so I created it again and had to go into the applications that used it and re-enable it. Since then, I get the "your connection isn't private" error. I'm using pfsense, and the configuration is fine. It's my domain hosted on cloudflare. What could be the problem?

     Solved: I decided to delete the app and appdata folder and start from scratch. I copied the contents of appdata elsewhere. With the new installation up, I copied back my proxy.conf files and added my cloudflare account. Swag restarted and shut down unexpectedly, but on the restart everything was back up and running.
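     In case it helps anyone else, recreating the network itself was one line; re-attaching containers is normally done from each app's unraid template (swag here is just my example):

       # Recreate the custom docker network the image rebuild wiped out
       docker network create proxynet
       # Re-attach a container manually (the unraid template does this for you)
       docker network connect proxynet swag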
  8. Since that was the easiest option, that is what I did. Thanks for pointing it out.
  9. Interesting. It has never happened to me before, although I did make changes to my hardware within the last 3 months or so. Maybe that caused the issues. I'll try the suggestions.
  10. Nothing was scheduled last night. I don't recall if anything was scheduled at the last failure. The only change I made recently was to change the spin-down delay from never to 4 hours. On the last unresponsive restart, from the first attached log, I see this warning several times:

       Sep 27 17:27:25 server kernel: WARNING: CPU: 6 PID: 7014 at fs/btrfs/delayed-inode.c:1304 btrfs_assert_delayed_root_empty+0x16/0x19
       Sep 27 17:27:25 server kernel: Modules linked in: xt_mark xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap macvlan nvidia_modeset(PO) nvidia_uvm(PO) xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod nvidia(PO) drm backlight nct6775 hwmon_vid nct6683 apex(O) gasket(O) efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables amd64_edac edac_mce_amd kvm_amd wmi_bmof kvm btusb btrtl btbcm crct10dif_pclmul crc32_pclmul btintel crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd mpt3sas bluetooth igb i2c_piix4 i2c_algo_bit rapl raid_class ccp k10temp igc scsi_transport_sas ecdh_generic i2c_core ahci ecc input_leds led_class libahci tpm_crb acpi_cpufreq tpm_tis tpm_tis_core tpm wmi button

     This happened during the day, not overnight.
  11. Here is today's log snippet. Looking at my emails, the system went unresponsive after 3am. It didn't come back up until after I rebooted. I don't see any errors in the log. syslog2.txt
  12. Dockers. Last night, the following were updated: Home-Assistant-Core, Prometheus, Postgresql14, and Swag.
  13. Hi All, this is the second time in the last week this has happened to me. An automatic update happens just after midnight, and I cannot connect to my server in the morning. I've had to do an unclean restart. I have the syslog server running, and here is a section of the log from the last time it did this. It seems like a network connection issue. Can anyone help with troubleshooting? If I need the whole log, let me know. syslog.txt
  14. This happened across 2 different motherboards, CPUs, and sets of memory sticks. I'm running ECC memory with no memtest errors. The disk errors go away after a reboot, and extended SMART tests always come back clean. The only constants are the drives and the PSU. I think I am pushing the limits of the PSU, so I'm going to replace it to see if it makes a difference.
  15. I had an issue earlier today that I'm hoping is resolved by a hard drive reformat. I'm using this controller for 8 drives. It replaced a prior, similar controller. I had read errors on the prior controller, and this one did the same, so I thought the cables were bad, only to wake up to a corrupted cache pool today. The LSI board is in IT mode with the latest firmware. What could be causing this? The same disks showed errors on both boards, but this is the first time the cache pool was corrupted. Could it be the power cables or power supply? A workaround may be to move back to the onboard controllers: 4 ports are controlled by an ASMedia 1061, and the others by the AMD B550 controller. Since AMD controllers have issues, I may pick up a 4-port ASMedia board. I would love to get some feedback.
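     To confirm the firmware the card is actually running, I believe the mpt3sas driver prints it at boot, so something like this should surface it (the exact output format is a guess):

       # The driver logs the controller model and firmware version during boot
       dmesg | grep -i mpt3sas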
  16. Do I copy it via krusader or some other method?
  17. Hi All, I woke up this morning to dockers not working. Checking the Swag log, I saw a lot of read-only errors, and checking Fix Common Problems, I see it is unable to write to cache. I have a 1TB cache pool in btrfs. Are the drives corrupted? In 5 or so years running unraid, this is the first time I've seen this error. I have ECC memory along with an LSI controller. What would cause this to happen? If it is corrupt, do I move the data to the array by changing the share's cache setting from prefer to yes, then format the pool and move it back? Should I stick with btrfs or go xfs? Thank you for the help. server-diagnostics-20220716-1216.zip
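     Before moving anything, I'd guess something like this would show whether btrfs has already logged device-level errors (assuming the pool is mounted at /mnt/cache):

       # Per-device error counters for the pool; non-zero means trouble
       btrfs dev stats /mnt/cache
       # Verify checksums across the whole pool (-B waits for completion)
       btrfs scrub start -B /mnt/cache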
  18. iOS Notifications

     Hi All, I finally have Frigate set up with Home Assistant Core. I am using this blueprint for notifications: https://community.home-assistant.io/t/frigate-mobile-app-notifications/311091. I trigger an alert, but I am not getting the notification in my Home Assistant app. Am I missing something in the configuration? Is there a better-suited blueprint or notification method?
  19. Is there any way to send email invitations through baikal? I entered my email address in the "Email invite sender address" field, but it doesn't work.
  20. CPU 100%

    It looks like it was CrashPlan. I've deleted and re-installed it. The quick fill-up seems to have gone away.
  21. CPU 100%

    Yes, but it was not running at the time. I'm thinking it could be the zoneminder docker: with the CPU at 100%, htop showed the zmc process consuming 66% of the RAM. It seems like a docker issue. The other potential docker is CrashPlan PRO, though I have it set to limit its memory usage. I'm going to monitor the RAM and kill off dockers one at a time to see how much gets freed.
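    My rough plan for that monitoring (the container name is just an example from my setup):

      # One-shot snapshot of per-container CPU and memory usage
      docker stats --no-stream
      # Stop a suspect container, then see how much RAM was freed
      docker stop zoneminder && free -h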
  22. CPU 100%

    Thanks @JorgeB, I've ordered a 9207-8i. I'm back to having the CPU pegged at 100% again. The log is not full this time; the memory is. I'm attaching the latest log file. server-diagnostics-20220522-1504.zip
  23. CPU 100%

    I'm using a SAS controller for my disks. I had a bunch of errors on my Ryzen board with the onboard controller, and going SAS eliminated them. I wanted to test the SAS cables, but switching them did nothing. I put the cache drives back on the motherboard's controller, and I no longer have the error in my log. Recently, disk 2 was giving me errors, yet an extended SMART test showed no errors. This happens every 3 months, and restarting fixes the issue. It looks like I'm going to have to buy a new SAS controller. First, I'll wait and see if the disk 2 error shows up again. Maybe having the cache drives and share drives on the same controller causes some kind of issue.
  24. CPU 100%

    The extended SMART test showed no errors. Could it be the SATA port? I'm watching a live log and still getting blk_update_request: critical target error, dev sdf.
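    This is roughly how I'm watching for it (sdf is the drive in question):

      # Follow the syslog for the failing device in real time
      tail -f /var/log/syslog | grep --line-buffered 'sdf'
      # The drive's SMART error log would show link or media errors
      smartctl -l error /dev/sdf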
  25. CPU 100%

    I'll try that. I had ntop running. I took some pics and noticed this: unraid /var/run/dockerid.pid --log-level=fatal --storagedriver=btrfs. Does that mean something docker related could be jamming the log?
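    To pin down what's actually filling it, I'll try something like this (a sketch):

      # How full is the log filesystem, and which files are the biggest?
      df -h /var/log
      du -sh /var/log/* | sort -h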