OwenT

  1. Two weeks with no crashes, I'm going to mark that as the solution. Thanks for the help!
  2. Thanks both. I've disabled C-states and I'll update here if it crashes again.
  3. This is my third time coming to the forums for this issue, fingers crossed I can find a solution this time.

     My specs are:
     AMD Ryzen 5 1400
     ASRock A320M-DGS
     16 GiB RAM

     After between 24 hours and 2 weeks of uptime, my server will fully hang. The web UI stops responding, all Docker containers stop, and there is no output if I plug in a monitor (this CPU has no onboard display output; I used to have a GPU in the system, but I removed it to eliminate it as a cause of the issues). The only way to recover from this crash is to hold the power button until the mobo turns off, then reboot.

     Previous investigation led to it being connected to ipvlan/macvlan, but I've tried both with no change.

     In the syslog right before the crash I'm seeing an error, but I'm not sure what it could be.

     Mar 12 03:07:07 xephyr smbd[25757]: #25 /lib64/libc.so.6(__libc_start_main+0x85) [0x1481d7602775]
     Mar 12 03:07:07 xephyr smbd[25757]: #26 /usr/sbin/smbd(_start+0x21) [0x5576b5babb31]
     Mar 12 03:07:07 xephyr smbd[25757]: [2024/03/12 03:07:07.456711, 0] ../../source3/lib/dumpcore.c:315(dump_core)
     Mar 12 03:07:07 xephyr smbd[25757]: dumping core in /var/log/samba/cores/smbd
     Mar 12 03:07:07 xephyr smbd[25757]:
     Mar 12 04:00:39 xephyr kernel: md: sync done. time=57557sec
     Mar 12 04:00:39 xephyr kernel: md: recovery thread: exit status: 0
     Mar 12 04:36:30 xephyr kernel: TCP: request_sock_TCP: Possible SYN flooding on port 39519. Sending cookies. Check SNMP counters.
     Mar 12 04:40:02 xephyr root: Fix Common Problems Version 2024.02.29
     Mar 12 04:40:03 xephyr root: Fix Common Problems: Warning: Unraid OS not up to date
     Mar 12 05:31:41 xephyr kernel: TCP: request_sock_TCP: Possible SYN flooding on port 20683. Sending cookies. Check SNMP counters.
     Mar 12 07:08:28 xephyr kernel: TCP: request_sock_TCP: Possible SYN flooding on port 57702. Sending cookies. Check SNMP counters.
     Mar 12 13:13:52 xephyr kernel: TCP: request_sock_TCP: Possible SYN flooding on port 56787. Sending cookies. Check SNMP counters.
     Mar 12 13:28:48 xephyr kernel: TCP: request_sock_TCP: Possible SYN flooding on port 52846. Sending cookies. Check SNMP counters.
     Mar 12 16:29:29 xephyr kernel: traps: smartctl[32392] general protection fault ip:154c136d28e4 sp:7ffc40f06998 error:0 in libc-2.37.so[154c13610000+169000]
     Mar 12 17:03:04 xephyr kernel: docker0: port 1(veth91a2379) entered disabled state
     Mar 12 17:03:04 xephyr kernel: veth56ef535: renamed from eth0
     Mar 12 17:03:05 xephyr kernel: docker0: port 1(veth91a2379) entered disabled state
     Mar 12 17:03:05 xephyr kernel: device veth91a2379 left promiscuous mode
     Mar 12 17:03:05 xephyr kernel: docker0: port 1(veth91a2379) entered disabled state
     Mar 12 17:03:06 xephyr kernel: docker0: port 1(veth573e246) entered blocking state
     Mar 12 17:03:06 xephyr kernel: docker0: port 1(veth573e246) entered disabled state
     Mar 12 17:03:06 xephyr kernel: device veth573e246 entered promiscuous mode
     Mar 12 17:03:10 xephyr kernel: eth0: renamed from vethd75fbd0
     Mar 12 17:03:10 xephyr kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth573e246: link becomes ready
     Mar 12 17:03:10 xephyr kernel: docker0: port 1(veth573e246) entered blocking state
     Mar 12 17:03:10 xephyr kernel: docker0: port 1(veth573e246) entered forwarding state
     Mar 12 22:13:26 xephyr root: Delaying execution of fix common problems scan for 10 minutes

     Specifically this line:

     Mar 12 16:29:29 xephyr kernel: traps: smartctl[32392] general protection fault ip:154c136d28e4 sp:7ffc40f06998 error:0 in libc-2.37.so[154c13610000+169000]

     My best guess would be a hardware failure, however this exact hardware was running fine as a gaming PC for several years with no issues. Every single component has been reseated, new thermal compound applied, etc. I have done multiple Unraid version upgrades and Docker updates, and memtests found zero problems.

     I've attached the diagnostics. I will be trying a new USB drive, but I just wanted to get this thread going while I wait for that to arrive. Any ideas before I just buy a new server?
xephyr-diagnostics-20240313-0026.zip
  4. Some odd behaviour has started happening: the server goes completely unresponsive after 1-3 days of uptime. It does not respond on the network, plugging in a monitor and keyboard does nothing, syslog stops logging, and the only thing I can do is hold the power button. I've had a look in the logs but I can't see what might be causing it. xephyr-diagnostics-20230826-1319.zip
  5. Fix Common Problems gave me this warning and told me to post my diagnostics here, so I'm doing that. I haven't noticed any odd behaviour or anything like that. xephyr-diagnostics-20230808-1222.zip
  6. That did the trick! Will this persist through OS updates or will I need to remember to re-add the flag?
  7. I've managed to grab the diagnostics with the two devices in the system. In lsscsi.txt I only see the four HDDs connected directly to the motherboard; the two connected to the ASM1064 are missing. The NVMe SSD is listed there. In lspci.txt I can see that the ASM1064 is being detected, though. xephyr-diagnostics-20230808-0929.zip
  8. Ah, thank you! I'm currently doing a parity-sync so I'll wait before I do that. I did just have a thought, could this be related to the 6 disk limit in a basic license? My aim is to use the SSD as a cache drive not an array drive so I thought this would be ok. Also would have expected an error message in this situation rather than the UI not loading.
  9. Hi all,

     I have an ASRock A320M-DGS motherboard, which has 4 SATA ports. I have 6 drives, so I grabbed an ASM1064 4 Port SATA Card, as the ASM1064 chip is listed as being compatible with Unraid. This all seems to work fine.

     However, I also have an Intel 670p NVMe SSD. When I put this in the M.2 slot, Unraid's web interface will not load, and if I access the IP of the server I get a "host unreachable". I can log in to the CLI directly on the server and that seems to work fine. lspci shows both the SSD and the ASM1064 as detected, but the two drives on that card don't show up under /dev.

     I initially assumed that the PCIe x1 slot and the M.2 slot are interlinked and that using one disables the other, as I've seen that on other motherboards before. The manual and website make zero mention of this, and both devices do show up in Linux, so I'm guessing the motherboard isn't turning one of the two off. Also, even if it is a motherboard issue, I'd expect it to either not boot at all, or for Unraid to run as normal and just not be able to see the disabled device. It feels very odd that the OS starts seemingly normally, but then fails to bring up the UI.

     I can't get diagnostics as the webGUI won't load under this issue. Is there any way to get these devices working together?
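
For anyone digging through a long syslog like the one in post 3, a minimal sketch of a filter that pulls out only the fault-style lines may save some scrolling. This is an illustration, not an Unraid tool: the function name `find_suspicious`, the `SAMPLE` list, and the pattern list are all assumptions chosen to match the excerpt above, and the patterns are not an exhaustive list of things that indicate a hardware problem.

```python
import re

# Assumed patterns, picked to match the excerpt in post 3.
# Extend this list for your own logs.
SUSPICIOUS = [
    re.compile(r"general protection fault"),
    re.compile(r"dumping core"),
    re.compile(r"segfault"),
]

# Classic syslog layout: "Mar 12 16:29:29 host process: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<proc>[^:]+):\s?(?P<msg>.*)$"
)


def find_suspicious(lines):
    """Return (timestamp, process, message) for lines matching any pattern."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not in the expected syslog layout, skip it
        if any(p.search(m.group("msg")) for p in SUSPICIOUS):
            hits.append((m.group("ts"), m.group("proc"), m.group("msg")))
    return hits


# Four lines copied from the syslog excerpt in post 3.
SAMPLE = [
    "Mar 12 03:07:07 xephyr smbd[25757]: dumping core in /var/log/samba/cores/smbd",
    "Mar 12 04:00:39 xephyr kernel: md: sync done. time=57557sec",
    "Mar 12 04:36:30 xephyr kernel: TCP: request_sock_TCP: Possible SYN flooding on port 39519. Sending cookies. Check SNMP counters.",
    "Mar 12 16:29:29 xephyr kernel: traps: smartctl[32392] general protection fault ip:154c136d28e4 sp:7ffc40f06998 error:0 in libc-2.37.so[154c13610000+169000]",
]

if __name__ == "__main__":
    for ts, proc, msg in find_suspicious(SAMPLE):
        print(ts, proc, msg)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `find_suspicious(open(path))` where `path` points at the syslog inside an extracted diagnostics zip. Crashes in unrelated programs (here smbd and smartctl) showing up in the same log are a hint worth mentioning when posting, but the script itself does not diagnose anything.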