Rattus Posted April 19, 2023

Hi All,

I've run into a bit of an issue that I think stems from some corruption errors in my cache. The initial symptom was that the cache wasn't being emptied when the mover was running. I followed this guide to get everything off of the cache: Backing up the Pool to the Array

During the move, I thought I noticed that the pool still wasn't emptying. In fact it was emptying, it just wasn't giving me back any free space. Please see the 2 attached screenshots, taken a few hours apart. You'll note that I have 2 x 1 TB drives in a BTRFS RAID 1. Initially 418 GB was used. Then 284 GB was used, but the free space only went up by about 10 GB. Both drives are identical in size, so no funny calculations on the BTRFS side.

After some sleuthing I found this thread, because I also got the same error message mentioned there: BTRFS Pool Too Many Profiles. I've also now run the dev stats command and got this:

Quote:
root@Radon:~# btrfs dev stats /mnt/cache
[/dev/nvme0n1p1].write_io_errs 0
[/dev/nvme0n1p1].read_io_errs 0
[/dev/nvme0n1p1].flush_io_errs 0
[/dev/nvme0n1p1].corruption_errs 862164
[/dev/nvme0n1p1].generation_errs 0
[/dev/sdc1].write_io_errs 0
[/dev/sdc1].read_io_errs 0
[/dev/sdc1].flush_io_errs 0
[/dev/sdc1].corruption_errs 0
[/dev/sdc1].generation_errs 0

I had some problems a while ago with bad RAM corrupting my cache; the sdc drive was added after that corruption was solved (or so I thought). So my question is: if I were to remove the NVMe drive, zero it, and then re-add it, would that solve my issue? Or is there more at play here that I haven't yet found? Do I need to do anything to be able to safely remove the NVMe drive?

Results of the pool balance are below for reference (I'm not 100% sure how to tell if it is balanced and whether I can remove a drive):

Quote:
Data, single: total=867.49GiB, used=445.59GiB
Data, RAID1: total=60.97GiB, used=40.22GiB
System, single: total=4.00MiB, used=80.00KiB
System, RAID1: total=32.00MiB, used=64.00KiB
Metadata, single: total=2.01GiB, used=428.72MiB
Metadata, RAID1: total=1.00GiB, used=152.41MiB
GlobalReserve, single: total=512.00MiB, used=0.00B
No balance found on '/mnt/cache'

radon-diagnostics-20230419-1456.zip
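From reading the Too Many Profiles thread, I gather the usual fix for the mixed single/RAID1 profiles is a convert balance, something along the lines of the below (assuming /mnt/cache is the right mount point), but I haven't run it yet given all the corruption errors:

# convert all data and metadata chunks to the RAID1 profile
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache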
JorgeB Posted April 19, 2023 (Solution)

With all that corruption, the best way forward would be to re-format the pool. Is all the data you want (or can recover) already backed up?
Rattus Posted April 19, 2023

No, unfortunately not. My Docker image is still there, along with some Nextcloud and swag data. How do I force it to move off? I have a spare SATA SSD and SATA port that I can use if that helps?
JorgeB Posted April 19, 2023

The Docker image can easily be recreated. You can use btrfs restore to copy the data off while ignoring the corruption; of course the affected files will still be corrupt, so some might not work correctly afterwards.
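The general form is something like this, run against the unmounted pool device, with the destination pointed at a folder on the array (adjust the device and path for your system):

# -v = verbose, -i = ignore errors (keep going past corrupt files)
btrfs restore -v -i /dev/<pool_device> /mnt/disk1/restore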
Rattus Posted April 19, 2023

Hey JorgeB,

Thanks heaps for the super quick replies yesterday. I've been trying to do the restore like you said, but I've run into some problems. I've tried multiple commands; these are the errors I'm getting:

Quote:
root@Radon:/mnt# btrfs restore -v /dev/sdc1 /mnt/disk1/restore
ERROR: /dev/sdc1 is currently mounted, cannot continue
root@Radon:/mnt# btrfs restore -v /dev/cache /mnt/disk1/restore
ERROR: mount check: cannot open /dev/cache: No such file or directory
ERROR: could not check mount status: No such file or directory
root@Radon:/mnt# btrfs restore -vi /dev/sdc1 /mnt/disk1/restore
ERROR: /dev/sdc1 is currently mounted, cannot continue
root@Radon:/mnt# btrfs restore -vi /dev/nvme0n11 /mnt/disk1/restore
ERROR: mount check: cannot open /dev/nvme0n11: No such file or directory
ERROR: could not check mount status: No such file or directory
root@Radon:/mnt# btrfs restore -vi /dev/nvme0n1 /mnt/disk1/restore
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super

So it won't work if the array is started, it won't work if the array is stopped, and it also won't work if the array is in maintenance mode. I am super confused as to what I need to do here ...
JorgeB Posted April 20, 2023

9 hours ago, Rattus said:
root@Radon:/mnt# btrfs restore -v /dev/sdc1 /mnt/disk1/restore
ERROR: /dev/sdc1 is currently mounted, cannot continue
root@Radon:/mnt# btrfs restore -v /dev/cache /mnt/disk1/restore
ERROR: mount check: cannot open /dev/cache: No such file or directory
ERROR: could not check mount status: No such file or directory
root@Radon:/mnt# btrfs restore -vi /dev/sdc1 /mnt/disk1/restore
ERROR: /dev/sdc1 is currently mounted, cannot continue

It must be unmounted.

9 hours ago, Rattus said:
root@Radon:/mnt# btrfs restore -vi /dev/nvme0n11 /mnt/disk1/restore
ERROR: mount check: cannot open /dev/nvme0n11: No such file or directory
ERROR: could not check mount status: No such file or directory
root@Radon:/mnt# btrfs restore -vi /dev/nvme0n1 /mnt/disk1/restore
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super

Wrong device, should be /dev/nvme0n1p1.
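In other words, with the pool unmounted but /mnt/disk1 available, something along these lines should work (the whole-disk device /dev/nvme0n1 has no filesystem on it, the btrfs pool lives on the first partition):

# restore from the partition (p1), not the raw NVMe device
btrfs restore -vi /dev/nvme0n1p1 /mnt/disk1/restore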
Rattus Posted April 21, 2023

Hello again JorgeB and all,

So it's been a process, but:
- btrfs restore moved all of the leftover files from the cache to disk 1, under /mnt/disk1/restore
- I've now got the pool back up; it's formatted as BTRFS and showing all available storage (see screenshot attached)

I followed this method from Squid, with some tweaks, to reformat the pool. I had to remove the SATA drive, reduce the pool size to one, format as per Squid, then re-add the drive. Now the dev stats command shows no errors:

Quote:
[/dev/nvme0n1p1].write_io_errs 0
[/dev/nvme0n1p1].read_io_errs 0
[/dev/nvme0n1p1].flush_io_errs 0
[/dev/nvme0n1p1].corruption_errs 0
[/dev/nvme0n1p1].generation_errs 0
[/dev/sdc1].write_io_errs 0
[/dev/sdc1].read_io_errs 0
[/dev/sdc1].flush_io_errs 0
[/dev/sdc1].corruption_errs 0
[/dev/sdc1].generation_errs 0

My question now is: what is the safest way to move the files from the restore folder I created back to the pool, so that they are recognised as never actually having left the cache? Is that even what I'm supposed to do here? Is it just the btrfs restore in reverse?
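I'm also assuming I can double-check that the rebuilt pool is a clean RAID1 (with no leftover single-profile chunks like before) with something like:

# all Data / Metadata / System lines should now show RAID1 only
btrfs filesystem df /mnt/cache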
JorgeB Posted April 21, 2023

You can just copy the files back to their original locations, but as mentioned, some files will still be corrupt, and that can cause issues down the line.
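With the Docker service and any containers or VMs that use that data stopped first, a plain copy is fine, e.g. something along these lines (paths are examples only, adjust them to where the data originally lived):

# copy the contents of the restore folder back onto the pool,
# preserving ownership and permissions; the trailing slashes matter
rsync -avh /mnt/disk1/restore/ /mnt/cache/

Then spot-check the important files (the Nextcloud and swag data) before deleting the restore folder.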