Dro, posted December 31, 2023:

Hello,

I'm having issues with my server. It has been stable for quite a while (it's mostly new hardware from within the past year), then all of a sudden I started getting random daily reboots, usually about every 24 hours. Today it has been happening almost every 20 minutes, and it's driving me crazy.

Usually when the server runs stable for a while and then starts rebooting, I replace the USB drive. I've already done that and I'm still having the same issue. I'm also already set to IPVLAN, on v6.12.6.

Here are the things I've done:
Checked the file system
Installed a brand new USB thumb drive
Ran a RAM memtest
Double-checked the IPVLAN setting
Went through the logs myself as best I could

I do see some suspicious things that may be related to the NICs, but I'm not sure. This is all 10Gb SFP+, with multiple 10Gb NICs. Again, this configuration has run just fine until now, so I'm not sure what has suddenly started happening.

Attached are the diagnostic logs. Looking for any direction.

Thank you!

mdronet-unraid1-diagnostics-20231230-2055.zip

itimpi, posted December 31, 2023:

The syslog in the diagnostics is the RAM version that starts afresh every time the system is booted. You should enable the syslog server (probably with the option to Mirror to Flash set) to get a syslog that survives a reboot so we can see what leads up to a crash. If using the mirror option, the syslog file is stored in the 'logs' folder on the flash drive.

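Once a syslog that survives reboots exists, one way to see what leads up to each crash is to print the last few lines logged before every boot. The following is only a rough sketch: it assumes each boot starts with a kernel line containing "Linux version" (the usual first kernel message, but worth checking against your own log), and the function name `tail_before_boots` is made up for illustration.

```python
import sys
from typing import List

# Assumption: the first kernel message of each boot contains this text.
BOOT_MARKER = "kernel: Linux version"


def tail_before_boots(lines: List[str], context: int = 5) -> List[List[str]]:
    """Return the last `context` lines logged before each boot marker."""
    tails = []
    for i, line in enumerate(lines):
        # Skip a marker on the very first line: nothing was logged before it.
        if BOOT_MARKER in line and i > 0:
            tails.append(lines[max(0, i - context):i])
    return tails


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py /path/to/persistent/syslog
    with open(sys.argv[1], errors="replace") as f:
        log = f.read().splitlines()
    for n, tail in enumerate(tail_before_boots(log, context=20), 1):
        print(f"--- last lines before boot #{n} ---")
        print("\n".join(tail))
```

If the lines just before a boot marker show an orderly shutdown sequence, the reboot was requested by something; if the log simply stops mid-activity, the machine reset or lost power without warning.
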
Dro (author), posted December 31, 2023 (edited):

I do have syslog enabled, and it is copying the file to disk. I figured that with that set, running diagnostics would pull that file, but good to know it doesn't. How would you like me to provide the saved syslog file? Is there a specific method?

I do not have the Mirror to Flash option enabled, just a local syslog directory where the file is stored. Will that suffice?

JorgeB, posted December 31, 2023:

You just need to post the syslog from wherever it's being saved to.

Dro (author), posted December 31, 2023:

1 hour ago, JorgeB said: "You just need to post the syslog from wherever it's being saved to."

OK, I cleared the log and will wait for a reboot so we don't have a lot of old information. So far the server has been up for 5 hours. I have Zabbix monitoring everything, so I'll get alerted right away when the server restarts. I will post the time it happens and provide the syslog along with it. Appreciate the help.

Dro (author), posted January 1:

I got an alert around 1:12am that the server had an unclean shutdown. Looks like we are back to the "within 24hrs" reboot schedule, which is the usual pattern. When I made this post it was rebooting every 20 minutes or less, nonstop.

Please see the attached syslog that was saved to the syslog server. Thanks!

syslog-192.168.1.10.log

JorgeB, posted January 1:

There are multiple call traces related to the NIC. Also, if the server is rebooting by itself, rather than crashing or hanging, that usually suggests a hardware problem.

Dro (author), posted January 1:

1 hour ago, JorgeB said: "There are multiple call traces related to the NIC, also if the server is rebooting by itself, vs. crashing or hanging, that usually always suggests a hardware problem."

The server has two dual-port 10Gb NICs because at one point it was used as an iSCSI datastore for ESXi, backed by a ZFS pool, and it basically had its own isolated iSCSI network for that purpose. It ran like that without any issues and worked great.

I believe the server is just rebooting, not hanging or technically crashing. Can you tell which NIC it may be? Could a bad DAC cable also cause this?

JorgeB, posted January 1 (Solution):

17 minutes ago, Dro said: "I believe the server is just rebooting."

Enable the syslog server and post that after a reboot; it will show if there was a shutdown event.

17 minutes ago, Dro said: "Can you tell which NIC it may be?"

eth0

Dec 31 23:13:48 MDRONET-UNRAID1 kernel: ------------[ cut here ]------------
Dec 31 23:13:48 MDRONET-UNRAID1 kernel: netdevice: eth0: failed to initialise TXQ 42
Dec 31 23:13:48 MDRONET-UNRAID1 kernel: WARNING: CPU: 12 PID: 23349 at drivers/net/ethernet/sfc/ef10.c:2414 efx_ef10_tx_init+0xfc/0x1c0 [sfc]

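For anyone wanting to find this kind of entry in their own syslog, here is a minimal Python sketch that tallies which interface and which driver module the kernel complaints point at. The two regular expressions are assumptions modelled only on the three lines quoted above, and `summarize` is a made-up helper name; kernel messages in other formats (for example warnings without a trailing module name in brackets) are simply skipped.

```python
import re
from collections import Counter

# Assumption: patterns are derived from the log lines quoted in this thread.
NETDEV_RE = re.compile(r"kernel: netdevice: (\w+): (.+)")
WARNING_RE = re.compile(r"kernel: WARNING: .* at (\S+) .*\[(\w+)\]\s*$")


def summarize(lines):
    """Count netdevice errors per interface and kernel warnings per module."""
    ifaces = Counter()   # e.g. {"eth0": 3}
    modules = Counter()  # e.g. {("sfc", "drivers/.../ef10.c:2414"): 3}
    for line in lines:
        m = NETDEV_RE.search(line)
        if m:
            ifaces[m.group(1)] += 1
            continue
        m = WARNING_RE.search(line)
        if m:
            # Key on (module name, source location of the warning).
            modules[(m.group(2), m.group(1))] += 1
    return ifaces, modules


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        # Usage: python script.py /path/to/syslog
        with open(sys.argv[1], errors="replace") as f:
            ifaces, modules = summarize(f.read().splitlines())
        print("netdevice errors by interface:", dict(ifaces))
        print("kernel warnings by (module, location):", dict(modules))
```

Run against the excerpt above, this would attribute one netdevice error to eth0 and one warning to the sfc module at drivers/net/ethernet/sfc/ef10.c:2414, which is the same reading given in the post.
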
Dro (author), posted January 1:

Thanks. That's my bonded interface. I will troubleshoot from there.

Dro (author), posted January 2:

I'm curious, is there a known bug with LACP in 6.12.6? Since disabling it, my server seems to be stable again. I've never had issues with LACP before.