SergeantCC4 Posted November 19, 2022

Recently, after updating to 6.11.1 (to correct the WireGuard display glitch), my server would intermittently become unresponsive and I would be forced to perform a hard reboot. This typically occurred while I was connected to my server via WireGuard (it runs Pi-hole, so I can get the benefits remotely). I had previously gotten a Docker image corruption error, but when I was unable to stop or delete the image I simply rebooted the server, and that seemed to fix the problem. I posted in another topic that I thought I was having the same issue, but I was mistaken.

I tried to run the memtest built into the server, but when I selected it the server just rebooted and came back to the Unraid boot screen. I read that I was supposed to enable CSM, but for some reason when I did that my server would no longer POST. I disconnected my PCIe JBOD cards and my boot flash, was able to POST, and am now running memtest86 v10. The first pass just finished without finding any errors, but I'm going to let it run through a few more cycles.

My question, however, is this: I have a dual NVMe protected cache array set up, and I'm unsure whether I have the right settings and/or whether I need to run a balance or scrub. It worked fine on 6.10, and I upgraded to 6.11 to take advantage of the iGPU in the 12500 I just upgraded to. I tried to find some documentation, but it seems to be a little fuzzy (or maybe I'm the fuzzy one) about what to do when, where, and why. Does anyone have any recommendations? Thanks in advance!

Attachments: syslog.txt, citadel-diagnostics-20221118-2026.zip

JorgeB Posted November 20, 2022

There's filesystem corruption on the pool; you should back up and reformat.

SergeantCC4 Posted November 20, 2022

With those being relatively new devices (less than 2 months old), should I be worried that something is wrong with them? Was there something I could have done to prevent this?

JorgeB Posted November 20, 2022

Start by running memtest.

SergeantCC4 Posted November 20, 2022

I saw your post elsewhere recommending that, so I ran 6 passes in total yesterday and it returned zero errors.

JorgeB Posted November 21, 2022

Then just reformat, and see here for better pool monitoring, so you are warned if there are more issues.

SergeantCC4 Posted November 22, 2022

Thanks @JorgeB. I was able to back up my files to the array, reformat the cache pool, and migrate my data back. Some strange networking bugs with WireGuard appeared, but they seemed to fix themselves when I upgraded to 6.11.5.

I read the post you linked and set up an hourly check using User Scripts. I also read a few other forum threads about the btrfs issues and set up a weekly balance, as it seemed there is really no harm in doing this?

I've had a busy day and didn't get a chance to look in depth, but is there anywhere I can start for a basic understanding of scrub and balance, and how often (if at all) they should be run? I'm not sure I understand the purpose of these features.
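
For anyone curious what an hourly pool check like the one mentioned above might look like, here is a minimal sketch in Python. It is not the script from the linked post (which is not reproduced in this thread). It assumes the pool is mounted at `/mnt/cache` and that `btrfs device stats` prints one `[device].counter value` pair per line; the function names are made up for illustration.

```python
import subprocess
from typing import Dict


def parse_device_stats(output: str) -> Dict[str, int]:
    """Turn `btrfs device stats` text into {"[device].counter": value}."""
    counters: Dict[str, int] = {}
    for line in output.splitlines():
        parts = line.split()
        # Expect exactly two fields: the counter name and an integer value.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def nonzero_counters(counters: Dict[str, int]) -> Dict[str, int]:
    """Keep only the counters that report at least one error."""
    return {name: value for name, value in counters.items() if value != 0}


def check_pool(mount_point: str = "/mnt/cache") -> Dict[str, int]:
    """Run `btrfs device stats` on the pool (assumed mount point) and
    return any non-zero error counters."""
    result = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True,
        text=True,
        check=True,
    )
    return nonzero_counters(parse_device_stats(result.stdout))
```

A scheduled job could call `check_pool()` and raise a notification whenever the returned dictionary is not empty; how the notification is sent is left out here because it depends on the setup.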

JorgeB Posted November 22, 2022

A monthly scrub should be enough. A regular balance might not be needed; it depends on how the pool is used. If you want to schedule one, though, a monthly balance using the default block usage should be good.
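
As a rough illustration of the two operations discussed above: a scrub reads the data on the pool and verifies it against the stored checksums, while a balance rewrites block groups, and a usage filter limits it to block groups at or below a given fill percentage. The sketch below only builds the command lines a monthly job might run. The mount point `/mnt/cache` and the value `75` are assumptions for the example, not the "default block usage" referred to in the reply, and the function names are hypothetical.

```python
from typing import List


def scrub_command(mount_point: str) -> List[str]:
    """Command line for a foreground scrub of the pool.

    -B keeps the scrub in the foreground, so a calling script can wait
    for it to finish and inspect the exit status.
    """
    return ["btrfs", "scrub", "start", "-B", mount_point]


def balance_command(mount_point: str, usage_percent: int) -> List[str]:
    """Command line for a data balance restricted by block usage.

    usage_percent is an assumed, user-chosen threshold: only data block
    groups filled to that percentage or less are rewritten.
    """
    if not 0 <= usage_percent <= 100:
        raise ValueError("usage_percent must be between 0 and 100")
    return ["btrfs", "balance", "start", f"-dusage={usage_percent}", mount_point]


# Example: the commands a monthly job might pass to subprocess.run().
monthly_jobs = [
    scrub_command("/mnt/cache"),
    balance_command("/mnt/cache", 75),
]
```

Keeping the commands as plain lists makes it easy to print them for a dry run before handing them to `subprocess.run(cmd, check=True)`.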