Hollandex Posted March 16, 2022 (edited)
Diagnostics attached. I woke up today to find my Docker containers not working. Whenever I tried to stop or restart a container, I got "Execution Error". I tried stopping Docker, deleting the vdisk, and starting Docker back up. Now the Docker page says Docker failed to start. No idea what's going on. The server was working great last night and I never turned it off. Now it's borked.
sanctuary-diagnostics-20220316-1131.zip
Hollandex (Author) Posted March 16, 2022
I also just noticed that I can't start a VM. I get this error: "unable to open /mnt/user/domains/EndeavourOS/vdisk1.img: Read-only file system". The cache drives are mounted correctly and not even close to full.
Squid Posted March 16, 2022
There's an issue with the BTRFS filesystem on the cache drive. Best to wait for one of the BTRFS guys who know this stuff inside and out (@JorgeB, possibly others).
JorgeB Posted March 16, 2022
Btrfs detected data corruption on both devices:
Mar 16 11:12:23 Sanctuary kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 65, gen 0
Mar 16 11:12:23 Sanctuary kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 46, gen 0
This is usually a RAM issue, so start by running memtest. Since the filesystem was also affected, if a problem is found, the best bet after fixing it is to back up and re-format the pool.
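For anyone who wants to pull these counters out of a syslog automatically, here is a minimal Python sketch that parses the "bdev ... errs:" lines quoted above. The function name parse_bdev_errors and the SAMPLE constant are made up for illustration; the regular expression only assumes the line layout shown in this post.

```python
import re

# Matches the per-device error summary in the layout quoted in this thread.
BDEV_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_bdev_errors(log_text):
    """Return {device: {counter: value}} for every 'bdev ... errs:' line.

    Hypothetical helper; if a device appears more than once, the last
    occurrence in the text wins.
    """
    result = {}
    for match in BDEV_RE.finditer(log_text):
        counters = {
            name: int(value)
            for name, value in match.groupdict().items()
            if name != "dev"
        }
        result[match.group("dev")] = counters
    return result


# The two kernel log lines from the diagnostics in this thread.
SAMPLE = (
    "Mar 16 11:12:23 Sanctuary kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 65, gen 0\n"
    "Mar 16 11:12:23 Sanctuary kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 46, gen 0\n"
)

if __name__ == "__main__":
    for dev, counters in parse_bdev_errors(SAMPLE).items():
        # Show only the counters that are not zero.
        nonzero = {name: value for name, value in counters.items() if value}
        print(dev, nonzero or "clean")
```

Here wr, rd, and flush are all zero while corrupt is non-zero on both devices, which is the pattern described in this post: the drives accepted the reads and writes, but the data failed checksum verification.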
Hollandex (Author) Posted March 16, 2022 (edited)
I rebooted to no avail, but I then shut the system down entirely so I could mess with the components inside the case. When I started it back up, everything was working fine again. Docker starts, VMs work, etc. So... does this still sound like a RAM problem? I'll probably take the server offline tonight and let memtest run while I sleep.
Edit: Spoke too soon. It started to work. Now it's all failing again, in the same way.
Hollandex (Author) Posted March 16, 2022
56 minutes ago, JorgeB said:
Btrfs detected data corruption on both devices:
Mar 16 11:12:23 Sanctuary kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 65, gen 0
Mar 16 11:12:23 Sanctuary kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 46, gen 0
This is usually a RAM issue, so start by running memtest. Since the filesystem was also affected, if a problem is found, the best bet after fixing it is to back up and re-format the pool.

Okay, running memtest now. If the problem persists, since the cache is acting as if it's read-only, will Mover be able to correctly move the contents off the cache and onto the array? Or should I just manually copy from /mnt/cache to /mnt/disk1 (for instance)? I think I'll format the cache pool either way, just to be safe, so I want to make sure I get the appdata/domains/system files properly backed up.
Hollandex (Author) Posted March 17, 2022
UPDATE
I ran 8 passes of MemTest over ~14 hours. Zero errors. I might try more passes later but, for now, I'm satisfied there aren't any issues with the RAM.
I ran an extended self-test on both NVMe drives. No issues found.
So, at this point, I have no idea why or how my btrfs pool got corrupted. Which kind of sucks. I'd love to pinpoint a reason so I can feel assured it won't happen again.
I formatted each drive as XFS, then put them back in a btrfs pool (this was the only way I could get Unraid to let me format the drives from btrfs to btrfs). I nuked my docker vdisk, just in case it was the culprit. And now everything is back to running like normal.
Thanks to both of you, Squid and JorgeB, for the help!
JorgeB Posted March 18, 2022
See the link below for how to monitor the pool so you're notified if new corruption is found:
https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=700582
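As a rough illustration of that kind of monitoring (not necessarily the method the linked FAQ describes), the sketch below reads the per-device error counters reported by `btrfs device stats` and returns only the ones that are not zero. The names parse_device_stats and check_pool are hypothetical, and /mnt/cache is assumed as the pool mount point because that is the path used earlier in this thread.

```python
import re
import subprocess

# One counter per line, e.g. "[/dev/nvme0n1p1].corruption_errs   65".
STAT_RE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")


def parse_device_stats(text):
    """Return {device: {counter: value}} holding only non-zero counters.

    Hypothetical helper that expects the plain-text output of
    `btrfs device stats`.
    """
    nonzero = {}
    for line in text.splitlines():
        match = STAT_RE.match(line.strip())
        if match and int(match.group("value")) > 0:
            device = nonzero.setdefault(match.group("dev"), {})
            device[match.group("name")] = int(match.group("value"))
    return nonzero


def check_pool(mount_point="/mnt/cache"):
    """Run `btrfs device stats` on the pool and return non-zero counters.

    The default mount point is an assumption; adjust it to the pool in use.
    Raises CalledProcessError if the command fails.
    """
    completed = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_device_stats(completed.stdout)
```

An empty dictionary from check_pool() means every counter is zero. Anything else is worth a notification, which a scheduled script could send however it likes.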