ridewithjoe, posted March 24, 2022 (edited):
I have a rather sizeable array and a cache pool of 4 SSDs. I've been chasing some issues with my array drive dropping out and needing to rebuild. Right now the array is synced OK, but my cache pool is showing Unmountable File System. I'm not sure how to resolve it properly, or even whether I can do it without losing any chance of recovering the data. I've attached a diagnostic dump of the current state.
Attachment: nasvm-diagnostics-20220324-1733.zip
Edited March 24, 2022 by ridewithjoe: Post updated diagnostics after reboot
JorgeB, posted March 24, 2022:
The log is completely spammed with SSH-related text; reboot and post new diags after array start.
ridewithjoe (Author), posted March 24, 2022:
2 hours ago, JorgeB said: "Log is completely spammed with SSH related text, reboot and post new diags after array start."
Rebooted and uploaded updated diags.
JorgeB, posted March 25, 2022:
Mar 24 17:33:32 nasvm kernel: BTRFS info (device sdi1): bdev /dev/sdi1 errs: wr 18755249, rd 19783429, flush 618, corrupt 70194, gen 25760
This cache device has been dropping offline; see here for better pool monitoring. As for the current issue, the filesystem is corrupt; there are some recovery options here.
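The kernel line quoted above packs the per-device error counters (write, read, flush, corruption, generation) into one comma-separated list. As a minimal sketch, assuming the log format is exactly as quoted, the counters can be split out with standard tools. The function name `btrfs_log_errs` is hypothetical and not part of Unraid or btrfs-progs.

```shell
#!/bin/sh
# Hypothetical helper: print the error counters from a
# BTRFS "bdev ... errs:" kernel log line as name=value pairs.
btrfs_log_errs() {
    # Keep only the text after "errs: " (lines without it print nothing),
    # split the comma-separated pairs onto separate lines,
    # then strip leading spaces and turn "name value" into "name=value".
    printf '%s\n' "$1" \
        | sed -n 's/.*errs: //p' \
        | tr ',' '\n' \
        | sed 's/^ *//; s/ /=/'
}

# Log line copied from the post above.
line='Mar 24 17:33:32 nasvm kernel: BTRFS info (device sdi1): bdev /dev/sdi1 errs: wr 18755249, rd 19783429, flush 618, corrupt 70194, gen 25760'
btrfs_log_errs "$line"
```

Any non-zero counter means the device has had I/O or integrity problems at some point. On a pool that is still mounted, `btrfs device stats <mountpoint>` should report the same counters directly.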
ridewithjoe (Author), posted March 25, 2022:
Thank you for the assistance. My cache pool drives are: (sdi) (sdl) (sdk) (sdj)
Following the instructions in the recovery post, I created a temp directory, /cacherpr, and initiated the mount:
    mount -o usebackuproot,ro /dev/sdi1 /cacherpr
I get:
    mount: /cacherpr: can't read superblock on /dev/sdi1
I get the same result if I try to mount any of the 4 drives. I fear I have something far worse going on than I anticipated.
JorgeB, posted March 25, 2022:
Try btrfs restore, or upgrade to v6.10, since there is a new rescue=all option, as mentioned in the link, that might work for a read-only mount. You only need to do it for one of the devices: it's all the same pool, so if one doesn't work, the others won't either.
ridewithjoe (Author), posted March 25, 2022:
I updated to 6.10.0-rc4; I figured that was my best bet at this point. That went well. I was able to mount using:
    mount -o rescue=all,ro /dev/sdi1 /mnt/disk6/cachetmp
Mount shows:
    /dev/sdi1 on /mnt/disk6/cachetmp type btrfs (ro,relatime,ssd,rescue=nologreplay:ignorebadroots:ignoredatacsums,space_cache,subvolid=5,subvol=/)
So that all seems pretty good. However, I'm not seeing any files or data in the mountpoint, which doesn't seem good at all:
    root@nasvm:/mnt/disk6/cachetmp# ls -l
    total 0
JorgeB, posted March 25, 2022 (Solution):
Try the 2nd option, btrfs restore.
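The thread does not show the exact restore command that was run, so the following is only a sketch of one cautious way to approach it. It prints a dry run followed by a real run instead of executing anything. `/dev/sdi1` is the pool member from the thread; the destination `/mnt/disk6/restore` and the function name `print_restore_plan` are assumptions for illustration. As far as I know, `btrfs restore` expects the device to be unmounted, so the rescue mount would need to be released first.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a cautious btrfs restore sequence.
# $1 = one pool member device
# $2 = destination directory on a different, healthy filesystem
print_restore_plan() {
    dev=$1
    dest=$2
    # -D is a dry run: list what would be restored without writing anything.
    printf 'btrfs restore -D -v %s %s\n' "$dev" "$dest"
    # -i ignores errors and keeps going past damaged files.
    printf 'btrfs restore -v -i %s %s\n' "$dev" "$dest"
}

# /dev/sdi1 is from the thread; /mnt/disk6/restore is an assumed destination.
print_restore_plan /dev/sdi1 /mnt/disk6/restore
```

Reviewing the dry-run listing before the real run gives an early idea of how much is recoverable. The destination needs enough free space for everything being copied out, and it must not be on the damaged pool.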
ridewithjoe (Author), posted March 26, 2022:
OK, this is great. That let me recover most files, and I think most of the files that were not recovered can be recreated. Thanks so much for the guidance. Now I need to investigate what has destabilized the pool and add some recovery backups for the cache directories. Thanks again.