February 20, 2023

Hello all! I recently had a cache drive failure and replaced it (maybe 3 weeks ago, tops) with a brand new SSD. I woke up this morning and noticed my Docker containers were down, so I went to investigate. I tried restarting the container in question and noticed some errors while it was trying to start. I rebooted the server, and once it came up I saw that the cache drive was "unmountable".

Since I'm on Unraid 6.x, I stopped the array, started it in maintenance mode, and ran a check (`-nv`), followed by another check (`-v`). It's weird: the process seems to have been successful, as no errors were reported (some files seem to have been moved to lost+found), but I'm still not able to mount the drive. Here's the tail end of the log from the check I ran.

    Phase 7 - verify and correct link counts...
    resetting inode 816629359 nlinks from 1 to 2
    resetting inode 2032652 nlinks from 2 to 3
    resetting inode 3164097 nlinks from 2 to 1
    Maximum metadata LSN (48:72667) is ahead of log (47:440994).
    Format log to cycle 51.

    XFS_REPAIR Summary    Mon Feb 20 11:02:02 2023

    Phase       Start            End              Duration
    Phase 1:    02/20 11:01:59   02/20 11:01:59
    Phase 2:    02/20 11:01:59   02/20 11:01:59
    Phase 3:    02/20 11:01:59   02/20 11:02:00   1 second
    Phase 4:    02/20 11:02:00   02/20 11:02:00
    Phase 5:    02/20 11:02:00   02/20 11:02:00
    Phase 6:    02/20 11:02:00   02/20 11:02:01   1 second
    Phase 7:    02/20 11:02:01   02/20 11:02:01

The full logs are really long and I'm not sure how useful they are for anyone offering help, so let me know if I should include them. The wiki page doesn't really offer a next step besides a full redo. Is that what's required here? https://wiki.unraid.net/Check_Disk_Filesystems#Redoing_a_drive_formatted_with_XFS

Kind of a bummer since I just went through the process of restoring my appdata, but I want to make sure I don't have any more options before I move forward. Thanks!!
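For anyone puzzling over the LSN line in that log: an XFS log sequence number is written as `cycle:block`, and the message says the newest LSN found in the metadata is later than the current position of the log. The following is a minimal Python sketch, assuming only the message wording shown in the post above (the helper name `parse_lsn_warning` is made up for illustration), that pulls out the two values and compares them.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Lsn(NamedTuple):
    """An XFS log sequence number, written as cycle:block in the log."""
    cycle: int
    block: int


# Matches the wording shown in the post; other xfsprogs versions
# may phrase this message differently (assumption).
_LSN_RE = re.compile(
    r"Maximum metadata LSN \((\d+):(\d+)\) is ahead of log \((\d+):(\d+)\)"
)


def parse_lsn_warning(line: str) -> Optional[Tuple[Lsn, Lsn]]:
    """Return (metadata_lsn, log_lsn), or None if the line does not match."""
    match = _LSN_RE.search(line)
    if match is None:
        return None
    meta_cycle, meta_block, log_cycle, log_block = (int(g) for g in match.groups())
    return Lsn(meta_cycle, meta_block), Lsn(log_cycle, log_block)


line = "Maximum metadata LSN (48:72667) is ahead of log (47:440994)."
result = parse_lsn_warning(line)
assert result is not None
metadata_lsn, log_lsn = result

# Tuples compare element by element, so the cycle is compared first,
# then the block within that cycle.
print(metadata_lsn > log_lsn)  # True
```

Here the metadata cycle (48) is one ahead of the log cycle (47), which matches the next line of the output, where the log is reformatted to a later cycle (51).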
February 20, 2023 Author

11 minutes ago, JorgeB said: "Please post the diags after array start."

Appreciate the reply – just stopped the array (was in maintenance mode), started it, and pulled the diagnostics. See attached.

tayshserve-diagnostics-20230220-1133.zip
February 20, 2023 Community Expert

1 hour ago, tayshserve said: "full logs are really long"

Zip and post.
February 20, 2023 Author

I was being dramatic and, quite frankly, lazy. I attached the full output from the check in a txt file; it's only 300 KB. Thanks for the nudge @trurl

check_logs.txt
February 20, 2023 Community Expert

Did you do the check from the webUI or the command line? Easy to get the command wrong.
February 20, 2023 Author

15 minutes ago, trurl said: "Did you do the check from the webUI or the command line? Easy to get the command wrong."

I ran it straight from the GUI – first I ran it with `-nv`, then I ran it with just `-v` after that.
February 20, 2023 Community Expert

Why are you still on 6.9.2? Newer versions of Unraid sometimes ship newer versions of xfsprogs, and checking the release notes, there have been updates since. https://wiki.unraid.net/Manual/Release_Notes
February 20, 2023 Author

Honestly, no real reason. Before my cache drive decided to die I had like 300+ days of uptime – things were pretty stable, so I just didn't want to mess with it. Before I go about formatting this cache drive and all of that, should I try updating and run the check again and see what happens?
February 20, 2023 Community Expert

1 minute ago, tayshserve said: "try updating and run the check again and see what happens?"

Worth a try.
February 20, 2023 Author

14 minutes ago, trurl said: "Worth a try"

Still comes out as unmountable – I'll attach the logs from the check again. Looking like I'll just have to format the drive and deal with any potential fallout? (Almost always Plex DB corruption, lol.)

check_logs_new.txt
February 20, 2023 Community Expert

Might be worth seeing the diagnostics in case there is something going on at the disk I/O level interfering with this.
February 20, 2023 Author

Here are the diagnostics one more time! Is there any way to get files off of this drive before I format it (assuming I'm gonna have to go that route)? Wondering if I can attempt to save a couple of databases – I feel like whenever this happens, the Plex and Home Assistant databases end up corrupted and I have to reconfigure both from scratch.

tayshserve-diagnostics-20230220-1315.zip
February 20, 2023 Community Expert

2 minutes ago, tayshserve said: "whenever this happens"

You have the CA Backup plugin installed. Have you tried restoring appdata?
February 20, 2023 Author

Funny story – this is the 3rd time I've had some type of cache drive related issue. The first time I had no backup – learned my lesson and backed up my entire appdata directory to Backblaze B2 via Duplicati. The second time, I restored from the Backblaze backup, and it took FOREVER. Like 20 hours or so.

Anyway, I decided to use CA Backup and noticed I could ignore a folder, so I ignored all the heavy media folders within Plex. Turns out you can't ignore just a child folder? Or, if you can, I didn't do it properly. So I now specifically have no backup for Plex! Hopefully whatever is on the array for Plex isn't corrupt!
February 20, 2023 Community Expert

Currently, your default shares (https://wiki.unraid.net/Manual/Shares#Default_Shares) are on the array. If Docker and VM Manager are enabled and there is no cache, these shares have to be recreated from scratch on the array.

Normally, you want the default shares set to cache:prefer so Docker/VM performance isn't impacted by parity, and so array disks can spin down, since these files are always open.

Maybe you set these to cache:yes because you were having cache problems, but appdata is actually set to cache:only currently. I would expect any attempt to write to appdata to fail, since there is no cache for it to write to. The appdata on the array must have been there from before.

What folders are currently in appdata? You should install the Dynamix File Manager plugin; it will make it a lot easier to work with your user shares and disks directly on the server.
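As a quick reference for the "Use cache" settings discussed in this post, here is a rough Python lookup table summarizing how each setting is generally understood to behave on Unraid 6.x: where new files land, and which way the mover moves them. The names `CacheMode`, `USE_CACHE`, and `describe` are purely illustrative and are not anything Unraid itself exposes.

```python
from typing import Dict, NamedTuple, Optional


class CacheMode(NamedTuple):
    new_files_go_to: str             # where a newly written file lands
    mover_direction: Optional[str]   # None means the mover ignores the share


# Summary of the share "Use cache" settings on Unraid 6.x (assumption:
# this reflects the 6.x semantics; later releases renamed these options).
# For "yes" and "prefer", writes overflow to the array if the pool is full.
USE_CACHE: Dict[str, CacheMode] = {
    "no": CacheMode("array", None),
    "yes": CacheMode("cache", "cache -> array"),
    "only": CacheMode("cache", None),
    "prefer": CacheMode("cache", "array -> cache"),
}


def describe(setting: str) -> str:
    """Return a one-line description of a cache setting."""
    mode = USE_CACHE[setting]
    mover = mode.mover_direction or "not moved"
    return f"cache:{setting}: new files on {mode.new_files_go_to}, mover: {mover}"


for name in USE_CACHE:
    print(describe(name))
```

This is why cache:prefer suits appdata: the files live on the pool, and anything that ended up on the array is moved back to the pool by the mover once the pool is available again.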
February 20, 2023 Author

3 minutes ago, trurl said: "What folders are currently in appdata? You should install Dynamix File Manager plugin, it will make it a lot easier to work with your user shares and disks directly on the server."

Going to install that now. I currently have all the folders I would expect on the array – it just seems weird that the last modified dates are super old for everything. Screenshot of the folders attached.

So yeah, currently my appdata share is set to "Only : Cache", which I'm guessing is incorrect and means that my appdata folder never gets copied back to the array. So the only way to remedy my Plex situation (all others are backed up) would be to potentially get the files off the cache before formatting? Is that even possible?
February 20, 2023 Community Expert

9 minutes ago, tayshserve said: "the last modified dates are super old for everything"

If you didn't have the default shares as cache:prefer, then they probably wouldn't have been recreated. If these shares were cache:no or cache:yes, they were probably already on the array.

18 minutes ago, trurl said: "Normally, you want the default shares cache:prefer so Docker/VM performance isn't impacted by parity, and so array disks can spin down since these files are always open."
February 20, 2023 Community Expert

12 minutes ago, tayshserve said: "get the files off the cache before formatting?"

There are other file recovery applications; UFS Explorer is often mentioned since it supports Unraid filesystems. Whether it can succeed where xfsprogs couldn't is another question.
February 20, 2023 Author

1 hour ago, trurl said: "There are other file recovery applications, UFS Explorer is often mentioned since it supports Unraid filesystems. Whether it can succeed when xfsprogs couldn't?"

Yeah, this is probably good advice so I don't waste my time, especially since the Plex database can be rebuilt. It's just annoying telling all my family they need to re-pin all of their content. I'm just gonna format and get myself back on track; Plex should be easy enough to get going again.

I want to say thanks for sticking with this thread @trurl, and shout out to @JorgeB, who has helped me directly before, and countless other times just by being the solution to other issues on the boards.
February 21, 2023 Author

@trurl hmm, more interesting developments – now it appears I'm unable to format the cache drive. It just gets stuck in the "Formatting" state. I feel like this typically takes 5 seconds or so before the UI updates to show the process is done.
February 21, 2023 Author

I'm not sure what's going on. I've tried a few different things: unassigning the device, starting the array back up, stopping the array, and reassigning the device. No luck. I also tried deleting the pool and starting from scratch; nothing. Seems like something is going on with this disk? Diags attached, thanks!

tayshserve-diagnostics-20230220-1641.zip
February 21, 2023 Community Expert

I don't know; going by the syslog, it seems like it should have worked. Run an extended self-test on it while you wait some hours to see if @JorgeB has any ideas.
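For reference, an extended SMART self-test can also be started from a terminal with smartmontools (`smartctl -t long <device>`), and the result read back later with `smartctl -a <device>`. Below is a small Python sketch that only builds and prints those commands; the device path `/dev/sdX` is a placeholder and the helper names are hypothetical.

```python
import shlex
from typing import List


def smart_long_test_cmd(device: str) -> List[str]:
    """Command that starts an extended (long) SMART self-test."""
    return ["smartctl", "-t", "long", device]


def smart_report_cmd(device: str) -> List[str]:
    """Command that prints all SMART info, including the self-test log."""
    return ["smartctl", "-a", device]


# Placeholder path (assumption): substitute the real device node of the SSD.
device = "/dev/sdX"

# Print the commands rather than running them, so nothing touches a disk here.
print(shlex.join(smart_long_test_cmd(device)))
print(shlex.join(smart_report_cmd(device)))
```

To actually run one of these, the list can be passed to `subprocess.run(cmd, check=True)` as root on the server. The long test runs in the background on the drive itself, so the report command is what shows the outcome once it finishes.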
February 21, 2023 Author

More lessons learned from past failures – when I bought my replacement SSD I actually bought 2 so I'd have 1 as a backup. I popped that into my server and was able to remove the corrupt drive, format the new backup drive, and add that as the cache.

I now have the corrupt drive as an unassigned device; however, I'm still unable to format it. I suppose it could be the cable the original drive is using?

I'm back up and running but would still like to understand what happened here so I can prevent it in the future. Thanks again, really appreciate your help, as our whole house runs on this server haha.

Edited February 21, 2023 by tayshserve