May 19, 2025

Hi all,

I've been running my unRAID server successfully for many years, but recently it has been crashing within a day or less. The symptom is that the GUI hangs and I can no longer connect to the server via SSH or SMB. I'm not sure what is going on: after a reboot everything seems to operate properly and works fine for a while, but at some random time it just freezes.

I downloaded the diagnostics after one such recent incident and attached them here, and could use your help analyzing them. I've run SMART tests on all my drives (2 parity, 3 disks, 1 SSD cache), and all report fine (Completed: No Error on SMART Long / Extended Tests). The SMART values all look good to me, and the Scrutiny docker also says the drives are fine.

Please help! TIA! 🙏

alexandria-syslog-20250516-0937.zip

Edited May 20, 2025 by rvijay007: Version change in Title
May 19, 2025

That's an old version, from before the current pricing, so you'd have updates included. It's possible that whatever issue you are having was fixed long ago in an update. Just curious why you are on such an old version of 6. IIRC 6.12.15 is the latest version of Unraid 6.

Since the OS is an older version, I'm going to assume the hardware is also older. Possibly a memory error? Just an anecdote, but I've seen a bad memory stick cause random problems like you describe, where the system seems fine but locks up after a while. I confirmed it by running memtest86 and letting it run.
May 20, 2025 (Community Expert)

Enable the syslog server and post the resulting log after a crash. Also post the complete diagnostics zip.
May 20, 2025 (Author)

12 hours ago, SnowCrash said: "That's an old version so it is prior to the new/current pricing so you'd have updates included. Possible that whatever issue you are having has long been fixed in an update. Just curious why you are on such an old version of 6. IIRC 6.12.15 is the latest version of unraid 6. Since the OS is an older version, going to assume the hardware is also older. Possibly a memory error? Just an anecdote but I've seen a bad memory stick have random problems like you describe where it seems fine but after a while will lock up. I confirmed it by running memtest86 and letting it run."

Sorry, that was a typo in the title. I'm on v6.12.15. The computer is barely 2 years old.

Edited May 20, 2025 by rvijay007
May 21, 2025 (Author)

11:02 PM PT - Rebooted computer after setting "Mirror to Flash" option in Syslog folder

23 hours ago, JorgeB said: "Enable the syslog server and post that after a crash, also post the complete diagnostics zip."

I have enabled the syslog server, writing to my system folder. The system just crashed around 10:50 PM PT (22:50) today, after being up for most of the day. I pulled the syslog and diagnostics files, attached below.

alexandria-diagnostics-20250520-2253.zip
syslog-10.0.1.9.log
May 21, 2025 (Community Expert)

Unfortunately, there's nothing relevant logged, so this can also be a hardware issue. Since you have multiple RAM sticks, try using the server with just one pair; if the result is the same, try with the other pair. That will basically rule out bad RAM. But since you have a 13900K, it could also be the Intel 13th/14th gen issue.
May 22, 2025 (Author)

16 hours ago, JorgeB said: "Unfortunately, there's nothing relevant logged, this can also be a hardware issue, since you have multiple RAM sticks, try using the server with just one pair, if the same try with the other one, that will basically rule out bad RAM, but since you have a 13900K, it could also be the Intel 13/14 gen issue."

Does memtest86 help diagnose these specific issues?

Also, I'm noticing on certain parity runs that the speeds are abysmally low, despite nothing aberrant coming up in the SMART checks or in the DiskSpeed plugin, which benchmarks all the HDDs at sufficient speeds. Parity checks ran perfectly fine at the beginning of April, as they always had. I'm not sure whether this is related or a separate problem...
May 22, 2025 (Community Expert)

5 hours ago, rvijay007 said: "Also, I'm noticing on certain parity runs that the speeds are abysmally low"

Post diags when that happens. Possibly the Unraid driver is crashing, and that is also typically a hardware issue.
May 22, 2025 (Author)

2 hours ago, JorgeB said: "Post diags when that happens, possibly the Unraid driver is crashing, that is also typically a hardware issue."

I attempted to hit cancel in the state shown in the screenshot below. I got the indeterminate progress indicator, but the check never cancelled, and the indicator didn't disappear until I clicked a different tab at the top and returned to Main. It still looks like the parity check hasn't been cancelled. I downloaded the syslog file and updated diagnostics at this point; both are attached below.

alexandria-diagnostics-20250522-0202.zip
syslog-10.0.1.9 (1).log
May 22, 2025 (Community Expert)

Yep, the Unraid driver is crashing. That is typically caused by bad RAM or a bad CPU, the latter mostly when it's an Intel 13th/14th gen.
May 22, 2025 (Author)

6 hours ago, JorgeB said: "Yep, Unraid driver crashing, typically bad RAM or CPU, the latter mostly when it's an Intel 13/14 gen."

I ran memtest86+ v6.20, and it seemed to pass, as in the picture below. I didn't realize there were Intel 13th/14th gen CPU issues. I also went ahead and updated the motherboard BIOS to the latest stable firmware release.

What data from those log files should I send Intel to get the chip RMA'd? I've never done anything like this before. TIA!

[inline image]

Edited May 22, 2025 by rvijay007
May 22, 2025 (Community Expert)

12 minutes ago, rvijay007 said: "What data from those log files should I send Intel to get the chip RMA'd as I've never done anything like this before? TIA!"

Just tell them the PC is constantly crashing; they are aware of the issues with those CPUs. You can Google "Intel 13/14 gen issue" for more info.
May 23, 2025 (Author)

I reached out to Intel, and they said I need to run the following support utility to see whether this is a CPU error. It does run on Linux, but I'm not sure how to get it to run on unRAID, or whether it would work at all in a VM. I know how to run it from a regular Linux shell, but unRAID isn't one of the supported distributions and the utility doesn't work when I try it there.

https://www.intel.com/content/www/us/en/download/18895/26735/download.html
May 23, 2025 (Community Expert)

It may be easier to just boot from a Windows live flash drive and run it from there. You can create one with Rufus.
June 2, 2025 (Author, Solution)

Intel support led me to upgrade the motherboard BIOS firmware to a release containing the latest Intel microcode (Update Intel uCode to 0x12F), and now my system stability is back to what is expected. Parity finished in an appropriate amount of time (<2 days for 20TB drives), and the system uptime is over a week now instead of crashing all the time.

I am still not sure how much permanent damage has been sustained due to the Intel 13th/14th generation bugs, and their support doesn't seem interested in RMAing the chip, but at least the system is stable again.

Edited June 2, 2025 by rvijay007
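
For anyone who wants to confirm that a BIOS update actually loaded the newer microcode, here is a minimal sketch in Python. It assumes an x86 Linux system where /proc/cpuinfo exposes a "microcode" line per logical CPU, and it assumes that a numerically higher revision on the same CPU means a newer microcode. The function names and the comparison helper are hypothetical, made up for illustration; only the 0x12F value comes from the post above. Whether Python is available on a given Unraid install is also an assumption.

```python
import re
from pathlib import Path

# Revision named in the solution post; everything else here is illustrative.
TARGET_REVISION = 0x12F


def microcode_revisions(cpuinfo_text: str) -> set[int]:
    """Return the distinct microcode revisions found in /proc/cpuinfo-style text."""
    pattern = re.compile(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)\s*$", re.MULTILINE)
    return {int(match.group(1), 16) for match in pattern.finditer(cpuinfo_text)}


def is_at_least(revisions: set[int], target: int = TARGET_REVISION) -> bool:
    """True only if at least one revision was found and all of them are >= target."""
    return bool(revisions) and min(revisions) >= target


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        found = microcode_revisions(cpuinfo.read_text())
        if not found:
            print("No microcode lines found in /proc/cpuinfo")
        else:
            listed = ", ".join(hex(rev) for rev in sorted(found))
            status = "OK" if is_at_least(found) else "older than target"
            print(f"Microcode revision(s): {listed} ({status})")
    else:
        print("/proc/cpuinfo not available on this system")
```

Run it from a shell on the server after rebooting into the updated BIOS. If the reported revision is lower than 0x12f, the new microcode was probably not applied.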