SHALcL Posted October 4, 2023
Hello, This morning when I woke up, Unraid was unresponsive until I rebooted it, so I was not able to collect logs before rebooting. This is the second time this has happened. I checked the system logs but I can't find any clues about what could be causing this. Both times I was asleep, so I did not see exactly when it happened. Can someone point me in the right direction to troubleshoot this? Thanks!!
server1a-diagnostics-20231004-0946.zip
JorgeB Posted October 4, 2023
Enable the syslog server and post that after a crash.
SHALcL Posted October 4, 2023
Ohh, okay, that explains why the logs I had were useless. I will activate the syslog server and see. The last time it hung was about a month ago, so it may be a long time before I get those useful logs... Thanks!
SHALcL Posted October 9, 2023
It just happened again. This time I was not sleeping, and I was not doing anything special with Unraid when it happened. Attached is the diagnostics file, this time with the syslog server enabled.
server1a-diagnostics-20231009-1050_latest.zip
JorgeB Posted October 9, 2023
1 hour ago, SHALcL said: "this time with syslog server enabled."
You need to post the syslog separately; it does not come with the diags.
SHALcL Posted October 9, 2023
Damn, I don't have the syslogs. If you check my previous post history you will see that I have some weird network behaviour on my Unraid server, and this stopped it from capturing the logs since the last reboot... Now I'm capturing logs again, let's wait for another crash... Sorry.
itimpi Posted October 9, 2023
4 minutes ago, SHALcL said: "Damn, i dont have the syslogs. If you check my previous post history you will see that i have some weird network behaviour on my unraid, and this made it to not capture the logs since the last reboot... Now i'm capturing logs again, lets wait for another crash... Sorry."
If you have the Mirror to Flash option set for the syslog server, then it does not need the network to be working to capture a log to the 'logs' folder on the flash drive.
SHALcL Posted October 9, 2023
1 minute ago, itimpi said: "If you have the Mirror to Flash option set for the syslog server then it does not need the network working to capture a log to the flash drive in the 'logs' folder."
Yeah, but I did not enable it because I can't cause the crash (or I don't know how yet), and I have to leave it running for weeks to months for it to happen again, and I don't want to burn out my flash drive.
SHALcL Posted October 16, 2023
Happened again.
server1a-diagnostics-20231017-0032.zip
dlandon Posted October 16, 2023
I see this in the logs:
Oct 17 00:00:13 Server1A kernel: r8169 0000:0b:00.0 eth1: RTL8168h/8111h, 22:09:5c:07:20:4f, XID 541, IRQ 81
Oct 17 00:00:13 Server1A kernel: r8169 0000:0b:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]
You have a Realtek NIC (eth0). The Realtek NICs are troublesome on Linux because the drivers are not well maintained. You are also using Jumbo Frames. This is not a good combination. Jumbo frames are discouraged because it is hard to set up a network to properly handle them.
Do the following:
- Set the MTUs on all networking back to default.
- Reconfigure your network setup to either use eth1 as a backup to eth0 (bond with both NICs), or use eth1 only.
- Get your system stable, then work on network improvements a little at a time and watch for issues.
SHALcL Posted October 17, 2023
Thanks for the reply! I have the 2.5G NIC connected to the LAN, and the 10G NIC with jumbo packets connected directly to a PC. I will try disabling jumbo packets between the server and the PC and see if it stops crashing.
SHALcL Posted October 18, 2023
It happened again, and this time without jumbo packets enabled.
EDIT: I think it's OOM this time...
server1a-diagnostics-20231019-0012.zip
(Edited October 18, 2023 by SHALcL)
SHALcL Posted November 2, 2023
It happened again, this time without any apparent reason.
server1a-diagnostics-20231102-1150_latest.zip
dlandon Posted November 3, 2023
You are still using Jumbo frames:
Nov 2 11:48:57 Server1A kernel: r8169 0000:0b:00.0 eth0: RTL8168h/8111h, 22:09:5c:07:20:4f, XID 541, IRQ 47
Nov 2 11:48:57 Server1A kernel: r8169 0000:0b:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
Recommendations:
- Remove Jumbo frames. You have to be sure they are not enabled anywhere on your network. IMHO, Jumbo frames offer little improvement and are not worth the headaches.
- Update your gpustat plugin.
- Try setting up a bridge with both eth0 and eth1 in the bridge and use backup configuration. This will allow eth1 to take over if eth0 fails.
- Get an Intel NIC.
SHALcL Posted November 3, 2023
Both interfaces are using the default MTU of 1500. At the other end, on my Windows computer, I also have jumbo packets disabled (the 10G NIC is a direct connection between the server and the workstation).
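For anyone wanting to double-check the MTU values on the server itself rather than rely on the settings page, here is a minimal sketch in Python. It assumes a Linux host that exposes per-interface MTUs under /sys/class/net, and that Python is available on the machine (it is not part of this thread's setup). The function names `read_mtus` and `non_default_mtus` are made up for illustration.

```python
from pathlib import Path


def read_mtus(sysfs_root="/sys/class/net"):
    """Return {interface_name: mtu} for every interface found under sysfs_root."""
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted: nothing to report.
        return {}
    mtus = {}
    for entry in sorted(root.iterdir()):
        try:
            mtus[entry.name] = int((entry / "mtu").read_text().strip())
        except (OSError, ValueError):
            # Skip entries that are not interfaces (no readable numeric mtu file).
            continue
    return mtus


def non_default_mtus(mtus, default=1500):
    """Keep only the interfaces whose MTU differs from the given default."""
    return {name: mtu for name, mtu in mtus.items() if mtu != default}


if __name__ == "__main__":
    for name, mtu in non_default_mtus(read_mtus()).items():
        print(f"{name}: MTU {mtu}")
```

The loopback interface normally has a much larger MTU than 1500, so seeing `lo` in the output is expected. Anything else listed would be worth a second look.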
SHALcL Posted November 9, 2023
Hello, and it happened once again... I don't have any MTU above the default 1500, and I updated everything. This random freezing is starting to get old...
server1a-diagnostics-20231102-1150_latest1.zip
JorgeB Posted November 9, 2023
Did you enable the syslog server as mentioned?
SHALcL Posted November 9, 2023
Just now, JorgeB said: "Did you enable the syslog server as mentioned?"
Yes, I did.
JorgeB Posted November 9, 2023
Then please post it as well.
SHALcL Posted November 9, 2023
Just now, JorgeB said: "Then please post it as well."
Sorry, I assumed the diagnostics would already include them.
syslog-127.0.0.1.log
syslog-127.0.0.1_1.log
syslog-127.0.0.1_2.log
syslog-127.0.0.1_3.log
syslog-127.0.0.1_4.log
JorgeB Posted November 9, 2023 (Solution)
On 10/9/2023 at 10:53 AM, JorgeB said: "You need to post the syslog separately, it does not come with the diags."
Nov 7 23:45:46 Server1A kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Nov 7 23:45:46 Server1A kernel: ? _raw_spin_unlock+0x14/0x29
Nov 7 23:45:46 Server1A kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
Macvlan call traces will usually end up crashing the server. Switching to ipvlan should fix it: Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right).
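To check saved syslog files for the same kind of macvlan trace lines quoted above, something like the following Python sketch could be used. It is only an illustration: the regular expression is modelled on the three lines in this post and may miss other forms of the trace, and the names `find_macvlan_traces` and `scan_files` are made up for this example.

```python
import re
import sys

# Matches kernel log lines that reference a macvlan symbol in a call trace,
# e.g. "kernel: macvlan_broadcast+0x10a/0x150 [macvlan]".
MACVLAN_RE = re.compile(r"kernel: .*\bmacvlan_\w+\+0x[0-9a-f]+/0x[0-9a-f]+")


def find_macvlan_traces(lines):
    """Return the lines (without trailing newline) that mention a macvlan symbol."""
    return [line.rstrip("\n") for line in lines if MACVLAN_RE.search(line)]


def scan_files(paths):
    """Map each file path to its matching lines, omitting files with no matches."""
    hits = {}
    for path in paths:
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with open(path, errors="replace") as handle:
            found = find_macvlan_traces(handle)
        if found:
            hits[str(path)] = found
    return hits


if __name__ == "__main__":
    for path, found in scan_files(sys.argv[1:]).items():
        print(f"{path}: {len(found)} macvlan trace line(s)")
        for line in found:
            print("  " + line)
```

Saved under a name of your choice (for example a hypothetical `scan_macvlan.py`), it would be run with the syslog files as arguments, such as the `syslog-127.0.0.1*.log` files attached earlier in this thread.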
SHALcL Posted November 9, 2023
Changing this setting right away. Fingers crossed. Thanks for looking into it! If it does not crash in a couple of weeks I will mark this as the solution.
(Edited November 9, 2023 by SHALcL)
JorgeB Posted November 9, 2023
Make sure you reboot after changing the setting, in case there's already been a call trace.
SHALcL Posted November 9, 2023
41 minutes ago, JorgeB said: "Make sure you reboot after changing the setting, in case there's already been a call trace."
I will do it tonight. Many thanks.