December 18, 2023 (2 yr)

Eleven days ago I experienced my first unexpected crash with unRAID. Today, there was a second crash. I'd appreciate some help pointing me in the right direction.

The syslog shows some warnings about `READ FPDMA QUEUED` and subsequent I/O and buffer errors, which are new to me. It's been a long time, though, since I last checked the syslog for warnings and errors, so these could have been there for a while.

The only meaningful change I can remember making was adding a VM running PiHole and setting it as the DNS on my router. That also came with enabling IPv6 for the server and Docker. After some issues with the unRAID app store, I whitelisted the server. I think that was AFTER the first crash, though.

One drive (sdb) has been slowly accumulating UDMA CRC errors. According to my Telegram agent, the count stood at 19 immediately after the first crash and had increased to 30 as of today; most of the errors were logged during the 30-hour parity check after the first crash.

The drive throwing the `READ FPDMA QUEUED` warnings is a known bad drive mounted via Unassigned Devices. I keep it and another drive for non-important data.

A few months back I attempted to upgrade to 6.12 twice. Both times I had issues, first with the Nvidia plugin, then with the i915 drivers, and ultimately I had to roll back to 6.11.5 by making a fresh USB and copying over my configs.

Edit: Funnily enough, this second crash coincides, to within 60 seconds, with the lights going out at a few buildings at my university, about 2 km away. I doubt this power ripple made it to my server, though, as it runs off a UPS.

stower20-diagnostics-20231218-1615.zip

Edited December 18, 2023 by DesertCookie
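For anyone who wants to see how often messages like these show up in a syslog, here is a minimal Python sketch. It only counts lines containing a few substrings. `READ FPDMA QUEUED` is quoted from the post above; the spellings `Buffer I/O error` and `I/O error` are assumptions about how the "IO and Buffer Errors" appear in the log, and the function names and the file path in the usage note are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence

# Substrings to search for. The first is quoted in the post; the other two are
# ASSUMED spellings of the I/O and buffer errors it mentions.
# "Buffer I/O error" is listed before "I/O error" so that a buffer-error line
# is counted once, under the more specific pattern.
PATTERNS = ("READ FPDMA QUEUED", "Buffer I/O error", "I/O error")


def count_matches(lines: Iterable[str], patterns: Sequence[str] = PATTERNS) -> Counter:
    """Count lines containing each pattern; the first matching pattern wins."""
    counts: Counter = Counter()
    for line in lines:
        for pattern in patterns:
            if pattern in line:
                counts[pattern] += 1
                break  # do not count the same line under a second pattern
    return counts


def scan_file(path: str) -> Counter:
    """Run count_matches over a log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return count_matches(handle)
```

Calling `scan_file("syslog.txt")` (a hypothetical path) returns a `Counter` keyed by pattern. A count that grows between two copies of the log suggests the messages are ongoing rather than old.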
December 18, 2023 (2 yr) Author

48 minutes ago, JorgeB said: Enable the syslog server and post that after a crash.

Shucks, I totally thought I had it enabled already and that it was in the diagnostics. I must have disabled it some time in the last two years. Will do as soon as one is available.
January 22, 2024 (2 yr) Author

I finally (heh) encountered another random crash, which I only got wind of because my Telegram agent informed me of an unclean shutdown. I've attached all the logs and diagnostics surrounding the crash. It took place on the 21st of Jan. at roughly 15:30 to 15:40.

stower20-diagnostics-20240121-1545.zip syslog-1704572223 syslog-1705483734
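To narrow a long syslog down to a crash window like the one given above, a small Python sketch along these lines may help. It assumes every log line starts with a classic syslog timestamp such as `Jan 21 15:35:00` (which has no year, so the year is passed in) and that month names are in English. The function name and the path in the usage note are hypothetical.

```python
from datetime import datetime
from typing import Iterable, List


def lines_in_window(lines: Iterable[str], start: datetime, end: datetime, year: int) -> List[str]:
    """Return the lines whose leading timestamp falls within [start, end].

    ASSUMPTION: the first 15 characters of each line are a timestamp in the
    form "Mon DD HH:MM:SS". Lines that do not parse are skipped.
    """
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line (e.g. a wrapped or blank line)
        if start <= stamp <= end:
            selected.append(line.rstrip("\n"))
    return selected
```

For the window mentioned in the post, one would open the log (for example `open("syslog.txt", errors="replace")`, a hypothetical path) and call `lines_in_window(handle, datetime(2024, 1, 21, 15, 30), datetime(2024, 1, 21, 15, 40), 2024)`. If the log simply stops before the window with no errors, that matches what is normally seen with a power or hardware fault.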
January 22, 2024 (2 yr) Community Expert

3 hours ago, DesertCookie said: I only got wind of by my Telegram agent informing me of an unclean shutdown.

Do you mean the server rebooted on its own? If yes, that usually points to a hardware or power issue. I don't see anything relevant in the syslog, which is normal for hardware or power issues.
January 23, 2024 (2 yr) Author

20 hours ago, JorgeB said: Do you mean the server rebooted on its own?

I cannot confirm that it rebooted itself this time around. The last two times it locked up and shut off, then automatically restarted, as I've configured it to do in the UEFI. I can only assume it was the same this time.

Sadly, I don't know where to start with hardware or power. This system has been running perfectly for the past year and a half. I had more issues with the Threadripper 1900 system I had before. I'd say both the UPS and the power supply, which still has seven years of warranty, are beyond doubt, so if anything it is probably some other hardware.

I might try upgrading to 6.12 once again and see if that fixes it; the last three times I tried, it got hung up on an i915 driver issue and would not boot any further. I had to hard reset and rebuild the thumb drive with 6.11.5.
January 23, 2024 (2 yr) Community Expert

31 minutes ago, DesertCookie said: The last two times it locked up and shut off, then automatically restarted

This is almost always a hardware or power issue.
January 24, 2024 (2 yr) Author

On 1/23/2024 at 10:28 AM, JorgeB said: This is almost always a hardware or power issue.

I found an issue with my UPS. I'll replace its battery and watch for any further crashes over the coming weeks. Thanks for looking into the logs for me.