SP67 Posted September 8, 2023 Hello, I have had a server crash. It's the second since I updated to 6.12.4 (I had a couple of other crashes related to macvlan in 6.12.3, but those seem resolved with 6.12.4). I'm attaching the syslog and diagnostics since I'm at a loss. I have run memtest and all tests passed. Thanks in advance. Attachments: syslog-192.168.88.253.log, nas-diagnostics-20230908-1833.zip
JorgeB Posted September 8, 2023 Nothing relevant logged that I can see. Do you remember the time it last crashed?
SP67 Posted September 8, 2023 Last time it crashed I didn't have syslog activated. It was on Saturday, after I installed two additional hard drives. Last time my shares also disappeared and my array didn't start, so I opened this post: Help! No shares and array won't start - General Support - Unraid. Regards. Edit: last time it crashed with the Docker and VM services inactive.
JorgeB Posted September 8, 2023 10 minutes ago, SP67 said: "Last time it crashed I didn't have syslog activated." Post a new log after a crash then.
SP67 Posted September 8, 2023 Sorry, I meant that the log in the first post is from after today's crash. I don't have the log of Saturday's crash. I'll post again if/when it crashes, but I'm worried about my data integrity.
JorgeB Posted September 8, 2023 5 minutes ago, SP67 said: "Sorry, I meant that the log in the first post is from after today's crash." OK, in that case there's nothing relevant logged. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
SP67 Posted September 9, 2023 I tried changing the network to ipvlan and disabling bonding and bridging, since in 6.12.3 it was stable with this setup (though I used a second, USB NIC for Docker). It crashed again less than 24h later. I've changed back to macvlan. I have also reset the BIOS in case something was misconfigured. I'm attaching the logs, but I don't see anything relevant. Regards. Attachments: syslog-192.168.88.253 (1).log, nas-diagnostics-20230910-0012.zip
JorgeB Posted September 10, 2023 Still nothing relevant logged; this usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
SP67 Posted September 14, 2023 I'm trying different things to see if stability improves. I'm currently running with 2 NICs (one for Docker and one for Unraid), with bridging and bonding disabled, since with this config it was stable for a month on 6.12.3. However, I'm seeing some weird call traces in red in the logs. My Home Assistant VM just crashed and the GUI is not responding as fast as usual. Can you please take a look at the logs? It started at 18:02:21 with "Sep 14 18:02:21 NAS kernel: kernel BUG at drivers/md/unraid.c:356!" Edit: I can't run diagnostics because the process hangs at lsof -Pni 2</dev/null|todos >.... Attachment: syslog-192.168.88.253 (3).log
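For anyone triaging a similar syslog by hand, here is a minimal Python sketch that pulls out kernel lines that look like crash markers, such as the "kernel BUG at" line quoted above. It assumes the traditional syslog layout seen in that line (month, day, time, hostname, then "kernel:"). The marker list and the function name find_kernel_faults are illustrative choices, not anything Unraid provides.

```python
import re
from typing import Iterable, List, Tuple

# Matches traditional syslog kernel lines such as:
#   Sep 14 18:02:21 NAS kernel: kernel BUG at drivers/md/unraid.c:356!
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) kernel: (?P<msg>.*)$"
)

# Assumption: substrings that commonly mark the start of a kernel crash report.
MARKERS = ("kernel BUG at", "Call Trace:", "Oops:", "general protection fault")


def find_kernel_faults(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, message) for kernel lines containing a crash marker."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and any(marker in match.group("msg") for marker in MARKERS):
            hits.append((match.group("ts"), match.group("msg")))
    return hits


if __name__ == "__main__":
    # The first line is taken from the post above; the second is a made-up
    # non-kernel line included only to show that it is skipped.
    sample = [
        "Sep 14 18:02:21 NAS kernel: kernel BUG at drivers/md/unraid.c:356!",
        "Sep 14 18:02:22 NAS example-service: placeholder message",
    ]
    for timestamp, message in find_kernel_faults(sample):
        print(timestamp, message)
```

To run it against a real file, pass an open file object instead of the sample list, for example find_kernel_faults(open("syslog.log", errors="replace")), where the file name is a placeholder. The first hit gives the time the trouble started, which is usually the most useful detail to include in a support post.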
JorgeB Posted September 14, 2023 The md driver (the Unraid driver) is crashing; this also usually points to a hardware problem.
SP67 Posted September 14, 2023 I see. It does seem that in the end it's going to be hardware. I'll try with only 2 RAM sticks. Thanks
SP67 Posted September 14, 2023 Follow-up question: every time the server crashes and the parity check starts, it detects 12 sync errors. Are these errors being corrected and then appearing anew each time, or are they always the same ones? Thanks
itimpi Posted September 14, 2023 1 minute ago, SP67 said: "Follow-up question: every time the server crashes and the parity check starts, it detects 12 sync errors. Are these errors being corrected and then appearing anew each time, or are they always the same ones?" It depends on whether the check that is running is correcting or non-correcting. Since you have the Parity Check Tuning plugin installed, the entries in the Parity History should tell you what type of check was run.
SP67 Posted September 14, 2023 I see. So in this case I guess that the errors are corrected and are generated again when the server crashes.
Solution SP67 Posted October 4, 2023 So it does seem that, despite the RAM passing MemTest, if I remove 2 sticks and leave the system with 2x8GB instead of 4x8GB, everything runs smoothly. Thanks
bman Posted October 5, 2023 I don't have the answer for your issue, but I experienced very much the same kinds of trouble when I wanted to run 4 RAM sticks on my Gigabyte EP45-UD3P based computer. Two sticks were fine, but I had to tweak many BIOS settings to get four RAM sticks to be stable. I was able to find the correct tweaks via internet search. Maybe someone has ideas for your motherboard?
SP67 Posted October 6, 2023 It might be worth looking again then. For reference, the board is a Gigabyte Z87-UDH3. I'll write back if I find anything. Thanks! Wow, it does seem like it's a really common issue with Gigabyte motherboards and 4 RAM modules when you enable XMP. Threads I found: "Fix for gigabyte 4 dimm memory instability | Page 3 | tonymacx86.com"; "Gigabyte Z87 mobos freeze when using 4 healthy sticks of RAM | tonymacx86.com"; "Z87 When will Gigabyte fix the 4 DIMM memory issue? (giga-byte.co.uk)"
SP67 Posted October 6, 2023 So let's say I want to try some tweaks I've found online (updating the BIOS to a beta version that some people say improves stability, playing with the DRAM voltage and timings, etc.). What would you guys recommend so that I don't force my system through several parity checks if/when it crashes?