September 8, 20232 yr Hello, I have had a server crash. It's the second since I updated to 6.12.4 (I had another couple crashed related to macvlan in 6.12.3 but those seem resolved with 6.12.4). I'm attaching syslog and diagnostics since I'm at a loss. I have run memtest and all tests passed fine. Thanks in advance syslog-192.168.88.253.log nas-diagnostics-20230908-1833.zip
September 8, 20232 yr Community Expert Nothing relevant loges that I can see, do you remember the time it last crashed?
September 8, 20232 yr Author Last time it crashed I did't have syslog activated. It was on Saturday, after I installed two additional hard drives. Last time my shares also disappeared and my array didn't start so I opened this post: Help! No shares and array won't start - General Support - Unraid Regards Edit: last time it crashed with docker and vms modules inactive. Edited September 8, 20232 yr by SP67
September 8, 20232 yr Community Expert 10 minutes ago, SP67 said: Last time it crashed I did't have syslog activated. Post a new log after a crash then.
September 8, 20232 yr Author Sorry, I meant the log on the 1st post if after today's crash. I don't have the log of Saturday's crash. I'll post again if/when it crashes but I'm afraid of my data integrity.
September 8, 20232 yr Community Expert 5 minutes ago, SP67 said: Sorry, I meant the log on the 1st post if after today's crash. OK, in that case there's nothing relevant logged, one thing you can try is to boot the server in safe mode with all docker/VMs disabled, let it run as a basic NAS for a few days, if it still crashes it's likely a hardware problem, if it doesn't start turning on the other services one by one.
September 9, 20232 yr Author I tried changing network to ipvlan and disabling bonding and bridging as in 6.12.3 it was stable with this (though I used a second, USB NIC for Docker). It crashed again less than 24h later. I've changed back to macvlan. I have reset the BIOS in case there was something miss configured. I'm attaching the logs but I don't see anything relevant. Regards syslog-192.168.88.253 (1).log nas-diagnostics-20230910-0012.zip
September 10, 20232 yr Community Expert Still nothing relevant logged, this usually points to a hardware issue, one thing you can try is to boot the server in safe mode with all docker/VMs disabled, let it run as a basic NAS for a few days, if it still crashes it's likely a hardware problem, if it doesn't start turning on the other services one by one.
September 14, 20232 yr Author I'm trying different things to see if stability improves. Currently running with 2 NICs (one for Docker and one for unraid), with bridging and bonding disabled as with this config it was stable for a month with 6.12.3. However, I'm seeing some weird call traces on red on the logs. My Home Assistant VM just crashed and the GUI is not responding as fast as usual. Can you please take a look at the logs? It started at 18:02:21 with a "Sep 14 18:02:21 NAS kernel: kernel BUG at drivers/md/unraid.c:356!" Edit: I can't run diagnostics because the process hangs at lsof -Pni 2</dev/null|todos >.... syslog-192.168.88.253 (3).log Edited September 14, 20232 yr by SP67
September 14, 20232 yr Community Expert md driver (Unraid driver) is crashing, this also usually points to a hardware problem.
September 14, 20232 yr Author I see. It does seem that at the end it’s going to be hardware. I’ll try with only 2 ram sticks. Thanks
September 14, 20232 yr Author Follow up question. Every time I crash and the parity check starts it detects 12 sync errors. Are these errors getting corrected and new every time or are then always the same? thanks
September 14, 20232 yr Community Expert 1 minute ago, SP67 said: Follow up question. Every time I crash and the parity check starts it detects 12 sync errors. Are these errors getting corrected and new every time or are then always the same? It depends on whether the check that is running is correcting or non-correcting. Since you have the Parity Check Tuning plugin installed the entries in the Parity History should tell you what type of check was run.
September 14, 20232 yr Author I see. So in this case I guess that the errors are corrected and are generated again when the server crashes.
October 4, 20232 yr Author Solution So it does seem that despite the RAM passing MemTest correctly, if I remove 2 sticks and leave the system with 2x8GB instead than 4x8GB everything runs smoothly. Thanks
October 5, 20232 yr I don't have the answer for your issue, but I experienced very much the same kinds of trouble when I wanted to run 4 RAM sticks on my Gigabyte EP45-UD3P based computer. Two sticks were fine but I had to tweak many BIOS settings to get four RAM sticks to be stable. I was able to find the correct tweaks via internet search. Maybe someone has ideas for your motherboard?
October 6, 20232 yr Author It might be worth to look again then. For reference the board is a Gigabyte Z87-UDH3. I'll write back if I find anything. Thanks! Wow, it does seem like its a really common issue with Gigabyte motherboards and 4 RAM modules when you enable XMP: Fix for gigabyte 4 dimm memory instability | Page 3 | tonymacx86.com Gigabyte Z87 mobos freeze when using 4 healthy sticks of RAM | tonymacx86.com Z87 When will Gigabyte fix the 4 DIMM memory issue? (giga-byte.co.uk) Edited October 6, 20232 yr by SP67
October 6, 20232 yr Author So lets say I want to try some tweaks I've found online (updating the BIOS to a beta version that some people say improves stability, playing with the DRAM voltage and timmings, etc.). What would you guys recommend me so that I don't force my system to go through several parity checks if/when the system crashes?
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.