Geck0 Posted July 11, 2022

Hi, I woke up this morning to a Fix Common Problems error. It appears identical to this one: https://forums.unraid.net/topic/94659-solved-btrfs-issues/

My cache pool is shown in the attached screenshot.

I downloaded diagnostics, but (stupidly) from inside a VM, so I've attached the reports from the syslog folder. I was in a hurry to stop the server in case of corruption.

Before that, I ran `btrfs dev stats /mnt/cache_nvme` and it was full of errors. I tried a scrub, and since rebooting I've run it again; you'll note there are still errors, but not nearly as many. I'm guessing the cache drives dropped out.

```
root@Nexus:~# btrfs dev stats /mnt/cache_nvme
ERROR: cannot check /mnt/cache_nvme: No such file or directory
ERROR: '/mnt/cache_nvme' is not a mounted btrfs device
root@Nexus:~# btrfs dev stats /mnt/cache_nvme
[/dev/nvme1n1p1].write_io_errs       0
[/dev/nvme1n1p1].read_io_errs        0
[/dev/nvme1n1p1].flush_io_errs       0
[/dev/nvme1n1p1].corruption_errs     0
[/dev/nvme1n1p1].generation_errs     4
[/dev/nvme2n1p1].write_io_errs       104169165
[/dev/nvme2n1p1].read_io_errs        31314052
[/dev/nvme2n1p1].flush_io_errs       4340
[/dev/nvme2n1p1].corruption_errs     0
[/dev/nvme2n1p1].generation_errs     0
root@Nexus:~#
```

I've attached Diagnostics.zip, which was taken before I stopped the array. After stopping the array, the system couldn't find nvme2n1. I rebooted to clear the logs, since the system log was full, and I've attached full diagnostics taken after rebooting into maintenance mode. I cannot restart the Docker or VM services, as they were on these drives. Shutdown got stuck trying to unmount shares, and I had to kill a process to get it to shut down.

Could somebody please help? Fortunately, I did a lengthy sync of the important stuff last night to an unassigned drive; I'm wondering if that contributed to the issue. The last parity check, on 02/07/2022, completed without errors.

I remain unsure what the next steps are. Should I dd an image of the cache, which is 500 GB?
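For reference, the counters above come from `btrfs dev stats`, which reports per-device write, read, flush, corruption, and generation errors; once the underlying problem is fixed, the counters can be zeroed with `-z` so any new errors stand out. A minimal sketch, using the mount point from this thread (the commands are echoed rather than executed here, since they need a live pool):

```shell
# Mount point of the btrfs cache pool from this thread; adjust to your system.
POOL=/mnt/cache_nvme

# Show per-device error counters (writes, reads, flushes, corruption, generation).
SHOW_CMD="btrfs dev stats $POOL"

# After resolving the dropped-device issue, zero the counters so new errors stand out.
RESET_CMD="btrfs dev stats -z $POOL"

echo "$SHOW_CMD"
echo "$RESET_CMD"
```

Note that non-zero counters are cumulative since the last reset, so old errors persist across reboots until explicitly cleared.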
I remain concerned about bitrot, corruption, etc. I've left the server shut down for now.

Diagnostics.zip
nexus-diagnostics-20220711-1034.zip
Geck0 (Author) Posted July 11, 2022

I forgot to include a copy of the syslog file; I downloaded it when the shares wouldn't unmount.

nexus-syslog-20220711-0053.zip
JorgeB Posted July 11, 2022

One of the devices has been dropping offline. Run a scrub to sync the pool, and see here for more info and better pool monitoring.
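A scrub, as suggested above, reads every block and repairs checksum mismatches from the good copy when the pool has redundancy. A minimal sketch of the commands, assuming the mount point from this thread (echoed rather than executed, since they need a live pool):

```shell
POOL=/mnt/cache_nvme   # pool mount point from this thread; adjust as needed

# Start a scrub in the background, then poll its progress and error counts.
SCRUB_START="btrfs scrub start $POOL"
SCRUB_STATUS="btrfs scrub status $POOL"

echo "$SCRUB_START"
echo "$SCRUB_STATUS"
```

In a raid1 pool the scrub can rewrite corrupted blocks from the mirror; on a single device it can only detect them.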
Geck0 (Author) Posted July 11, 2022

Hi Jorge, thank you for taking the time to reply. I've mounted the second drive and am running an rclone copy at the moment. I was going to dd image the disks; would you say that would be prudent?
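For the dd imaging mentioned above, one common shape of the command is sketched below. The device and destination paths are examples only, not taken from this system's diagnostics; verify the names with `lsblk` first, and make sure the destination has 500 GB free. The command is echoed rather than executed here:

```shell
# Example names only; confirm the source device and pick a destination with enough space.
SRC=/dev/nvme1n1
DST=/mnt/disk1/cache_nvme1.img

# conv=sync,noerror keeps going past unreadable sectors; bs=1M improves throughput.
DD_CMD="dd if=$SRC of=$DST bs=1M status=progress conv=sync,noerror"
echo "$DD_CMD"
```

Imaging before any repair attempt preserves a fallback copy if a later wipe or balance goes wrong.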
JorgeB Posted July 11, 2022 (Solution)

27 minutes ago, Geck0 said: "I've mounted the second drive"

```
Jul 11 10:11:35 Nexus kernel: BTRFS: device fsid 653150f2-f281-4987-bb45-f3e553c14b9c devid 2 transid 534361 /dev/nvme2n1p1 scanned by udevd (1391)
Jul 11 10:11:35 Nexus kernel: BTRFS: device fsid 653150f2-f281-4987-bb45-f3e553c14b9c devid 1 transid 542551 /dev/nvme1n1p1 scanned by udevd (1402)
```

I assume you mean devid 2 (nvme2n1)? The other device will be out of sync. If yes, it's always good to make sure backups are up to date; if you mounted devid 2 by itself read/write, the best way forward is to wipe the other one and then re-add it to the pool.
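The wipe-and-re-add step above is normally done through the Unraid GUI (stop the array, unassign the device, start the array, wipe, re-assign). For illustration, the manual btrfs equivalent looks roughly like the sketch below; the device name is a placeholder for whichever device was NOT kept mounted read/write, and the commands are echoed rather than executed:

```shell
# Placeholder names: STALE is the out-of-sync device to be wiped and re-added.
STALE=/dev/nvme2n1p1
POOL=/mnt/cache_nvme

WIPE_CMD="wipefs -a $STALE"                # clear the stale btrfs signature
ADD_CMD="btrfs device add -f $STALE $POOL" # re-add the device to the mounted pool
BAL_CMD="btrfs balance start -dconvert=raid1 -mconvert=raid1 $POOL"  # restore redundancy

echo "$WIPE_CMD"
echo "$ADD_CMD"
echo "$BAL_CMD"
```

The balance is what actually re-mirrors data onto the re-added device; until it completes, the pool is not redundant again.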
Geck0 (Author) Posted July 11, 2022

Jorge, thanks for your help. I think everything is up and running.