MichiganMatt
Posted December 14, 2023

Hello Unraid Family,

I need some help with my server. I've been struggling with system crashes for a while, so I upgraded my motherboard/CPU/memory but kept my Supermicro case, RAID controller, and thumb drive. The migration went pretty well, and after I had moved onto the new hardware I also deleted and rebuilt my cache and Docker containers, as I had quite a few BTRFS errors on the cache that I could never get rid of.

Post migration, cache rebuild, and Docker file rebuild, I continued to have system crashes, and I noticed yesterday that a bunch of my shares were no longer accessible after the array was started. Disk 2 shows as unmountable with the error message "Unmountable: Unsupported or no file system".

I looked through the logs, and I think what's going on is that Disk 2 in the array is corrupt, but I don't want to run xfs_repair because I'm really not familiar with the command. The SMART data reported for that disk looks OK, so maybe this is an issue caused by the repeated unclean shutdowns.

Dec 14 09:40:51 UnRaidServer kernel: XFS (md2p1): Corruption detected. Unmount and run xfs_repair
Dec 14 09:40:51 UnRaidServer kernel: XFS (md2p1): Failed to recover intents
Dec 14 09:40:51 UnRaidServer kernel: XFS (md2p1): Filesystem has been shut down due to log error (0x2).
Dec 14 09:40:51 UnRaidServer kernel: XFS (md2p1): Please unmount the filesystem and rectify the problem(s).
Dec 14 09:40:51 UnRaidServer kernel: XFS (md2p1): Ending recovery (logdev: internal)
Dec 14 09:40:51 UnRaidServer kernel: XFS (md2p1): log mount finish failed

I also see in shareDisks.txt that several of my shares report 'Share does not exist'. The system still lists them, but it worries me that the data could be gone.

appdata shareUseCache="only" # Share exists on cache
B-----s shareUseCache="no" # Share does not exist
C-------------------p shareUseCache="no" # Share exists on disk3, disk4
domains shareUseCache="only" # Share exists on cache
F-------------s shareUseCache="no" # Share does not exist
F------------s shareUseCache="no" # Share does not exist
F----------s shareUseCache="no" # Share does not exist
isos shareUseCache="no" # Share does not exist
L----------E shareUseCache="no" # Share does not exist
N--------S shareUseCache="yes" # Share exists on disk4
P-----------e shareUseCache="prefer" # Share exists on cache
S----b shareUseCache="yes" # Share exists on cache
S-------------------d shareUseCache="only" # Share exists on inbound-temp
S----V shareUseCache="yes" # Share exists on disk1, disk3
S-----s shareUseCache="only" # Share exists on cache
system shareUseCache="only" # Share exists on cache
t--------p shareUseCache="only" # Share exists on cache

I know from SpaceInvader One's video that when there is a major issue like this, it's best to stop, breathe, and be very careful with the steps taken to fix it, as real data loss can occur if I run commands I'm not familiar with. With that said, I would very much appreciate others looking at my logs and advising on next steps.

unraidserver-diagnostics-20231214-0953.zip
itimpi
Posted December 14, 2023

1 minute ago, MichiganMatt said:
corrupt, but I don't want to run xfs_repair because I'm really not familiar with the command.

You can run it from the GUI as described in this section of the online documentation, accessible via the 'Manual' link at the bottom of the GUI or the DOCS link at the top of each forum page. The Unraid OS -> Manual section in particular covers most features of the current Unraid release.
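If you prefer the terminal, the read-only check is roughly the following. This is only a sketch: it assumes the array has been started in Maintenance mode and that md2p1 (taken from the syslog above) is the correct device name on your release, so confirm both before running anything. The GUI check drives the same xfs_repair tool for you.

# read-only check first; -n reports problems without changing anything
xfs_repair -n /dev/md2p1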
MichiganMatt
Posted December 14, 2023 (Author)

Hello itimpi,

Thank you for the quick response and for pointing me toward that check. I had no idea it could be run per disk; I had always done everything in Unraid at the pool/array level.

Here are the results with the -n option and without it. I see two .jpg entries (cell phone pictures, most likely) in the results below that would be lost, and my backup strategy covers them, but most of the data on the drive is not backed up and that's what I don't want to lose. When I run without -n, the output notes that there could be data loss and wants me to run with the -L option. If I run with -L, can I expect the data loss to be just the image files identified in the -n step, or is there likely to be more?

Result with -n:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being ignored because the -n option was used.  Expect spurious inconsistencies which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 244981168, counted 247420741
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
inode 15034073100 - bad extent starting block number 4503567550846660, offset 0
correcting nextents for inode 15034073100
bad data fork in inode 15034073100
would have cleared inode 15034073100
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 8
        - agno = 1
        - agno = 3
        - agno = 2
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 9
        - agno = 4
inode 15034073100 - bad extent starting block number 4503567550846660, offset 0
correcting nextents for inode 15034073100
bad data fork in inode 15034073100
would have cleared inode 15034073100
entry "20210327_175807.jpg" at block 20 offset 1408 in directory inode 15037234757 references free inode 15034073100
        would clear inode number in entry at offset 1408...
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
entry "20210327_175807.jpg" in directory inode 15037234757 points to free inode 15034073100, would junk entry
would rebuild directory inode 15037234757
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

Result without -n:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed.  Mount the filesystem to replay the log, and unmount it before re-running xfs_repair.  If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.
itimpi
Posted December 14, 2023 (Solution)

You need to run without -n and add -L (Unraid has already failed to mount the drive) to get the repair done. Afterwards, start the array in normal mode and see if you have a lost+found folder, which is where the repair process puts anything it cannot figure out.
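As a rough sketch of that sequence from the command line, assuming the same md2p1 device shown in your syslog and the standard /mnt/disk2 mount point for disk 2 (confirm both on your own system first):

# with the array started in Maintenance mode, repair the filesystem,
# zeroing the log that could not be replayed
xfs_repair -L /dev/md2p1

# then stop Maintenance mode, start the array normally, and look for
# anything the repair could not place
ls -l /mnt/disk2/lost+found

If that folder does not exist after the repair, nothing was orphaned.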
MichiganMatt
Posted December 14, 2023 (Author)

Thanks itimpi 😍! The array is running and back online, and I don't see any files lost or missing from the array.
itimpi
Posted December 15, 2023

13 hours ago, MichiganMatt said:
Thanks itimpi 😍! The array is running and back online, and I don't see any files lost or missing from the array.

If you do not have a lost+found folder on the drive, then that normally means all data has been recovered with no data loss.