eroc1990 Posted July 18, 2023

I'm going to lead by stating that my UnRAID setup is not normal. I don't do pool/cache. UnRAID is basically a container/VM manager for me, with direct /mnt/disk# mappings for mounted volumes in Docker.

Recently, I've been running into an issue where BTRFS will just crash on me out of the blue. My system has 32 GB of memory and usually runs at 60-80% usage. My Docker disk image is formatted BTRFS (it was XFS for a while; I changed it recently after a similar XFS crash prompted a rebuild of the image file). Disks 1 and 2 are formatted XFS, and disks 3, 4, and 5 are BTRFS.

There seems to be no rhyme or reason to the crashes I've encountered. The server wasn't under excessive load, and memory wasn't constrained beyond what I stated earlier. I'll just see alerts from my VPS telling me stuff is down, and then I refresh and the system logs are filled with BTRFS read/write errors.

Per Scrutiny, all disks in my server are healthy, so I don't think it's a drive issue. I haven't extensively tested my memory, but pre-6.12 I would have uptime in excess of 30-40 days between manual reboots (when needed), save for a random crash here or there due to me allocating more resources to something than I meant to.

I've attached two diagnostics: one generated automatically when the issue occurred, and another I generated manually post-reboot. Any insight on how I can improve my server's stability would be greatly appreciated.

ericserverpc-diagnostics-20230718-1621.zip
ericserverpc-diagnostics-20230718-1638.zip
JorgeB Posted July 19, 2023

Jul 18 16:12:23 ericServerPC kernel: ahci 0000:06:00.0: AHCI controller unavailable!

The controller is having issues and is dropping all connected devices. This is not that uncommon with Ryzen boards; I suggest using an add-on controller.
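For anyone who wants to check their own syslog for the same symptom, here is a minimal Python sketch that counts "AHCI controller unavailable" messages per PCI address. It assumes the log lines have the same shape as the one quoted above; the function name `count_ahci_drops` and the regular expression are illustrative, not part of UnRAID or any of its tools.

```python
import re
from collections import Counter

# Assumed line shape, taken from the message quoted in this thread:
# "Jul 18 16:12:23 ericServerPC kernel: ahci 0000:06:00.0: AHCI controller unavailable!"
AHCI_DROP = re.compile(
    r"kernel: ahci "
    r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"AHCI controller unavailable"
)


def count_ahci_drops(lines):
    """Return a Counter mapping PCI address -> number of 'unavailable' messages.

    `lines` is any iterable of syslog lines (a list, or an open text file).
    Lines that do not match the assumed shape are ignored.
    """
    drops = Counter()
    for line in lines:
        match = AHCI_DROP.search(line)
        if match:
            drops[match.group("pci")] += 1
    return drops
```

To use it, open the syslog text file extracted from a diagnostics zip and pass the file object to `count_ahci_drops`. A non-zero count for a controller suggests the drives behind it dropped at the controller level, which would be consistent with healthy SMART data on the disks themselves.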
eroc1990 Posted July 19, 2023 (Author)

3 hours ago, JorgeB said:
Jul 18 16:12:23 ericServerPC kernel: ahci 0000:06:00.0: AHCI controller unavailable! Controller is having issues and it's dropping all connected devices, this is not that uncommon with Ryzen boards, suggest using an add-on controller.

Isn't it odd, though, that it's just suddenly happening? This would never happen on previous versions of UnRAID.
JorgeB Posted July 19, 2023 (Solution)

A kernel change can change things, though usually it's the other way around: with newer kernels it's much less likely to happen, but it still does for some. A BIOS update may also help.
eroc1990 Posted July 19, 2023 (Author)

Yup, usually you'd expect better performance. Unfortunately for me, that has not been the case. I've been up-to-date on the BIOS for a while, including prior to installing 6.12.x. I wish I could fix it by updating that. Oh well. I might just wind up manually downgrading back to the latest point release of 6.11 as a result of this.
eroc1990 Posted July 19, 2023 (Author)

I just decided to downgrade to 6.11.5 for now. I'll leave it as is, evaluate performance from there, and may try upgrading again in steps at some point to see where the issues really begin to manifest. Thanks for the help though, @JorgeB.