paululibro Posted November 27, 2021

So one of my drives became unmounted and was showing "Unmountable: not mounted". I followed this guide and was able to mount the drive again. But now some of its data is missing and my server shows about 600GB more free space than before. Is it possible to rebuild that data from parity, or was it overwritten by xfs_repair?
trurl Posted November 28, 2021

Attach diagnostics to your NEXT post in this thread.
paululibro Posted November 28, 2021

Attachments: orion2-diagnostics-20211128-1028.zip
paululibro Posted November 28, 2021

So should I try to rebuild from parity? Is it even possible?
trurl Posted November 28, 2021

Haven't had a chance to look at the diagnostics yet, but thought I should respond before you do what you are thinking about doing.

18 minutes ago, paululibro said:
"So should I try to rebuild from parity? Is it even possible?"

No.
trurl Posted November 28, 2021

Disk3 seems to be disabled and unassigned, but I can't tell whether or not the emulated disk is unmountable since the array isn't started. Also, I can't tell what happened, since you rebooted before getting diagnostics and so the syslog was reset. Post a screenshot of Main - Array Devices.
trurl Posted November 28, 2021

2 minutes ago, trurl said:
"Disk3 seems to be disabled and unassigned"

Did you unassign disk3 yourself?
trurl Posted November 28, 2021

OK, after looking more closely at your syslog, it seems you had already started to rebuild disk3. It looks like emulated disk3 was mounting, though. Start the array with disk3 unassigned and post new diagnostics.
paululibro Posted November 29, 2021

Here is what happened:
1. My Disk 3 was showing up as "Unmountable: not mounted".
2. I ran xfs_repair on it and it was mounted again, but was missing over 600GB of data.
3. I unplugged it and replaced it with a brand new disk.
4. After boot, rebuilding from parity started automatically.
5. After it finished, the new Disk 3 is mounted but with zero data on it.

Now:
- I plugged the old Disk 3 back in, with the new Disk 3 still connected, and mounted the old Disk 3 with the "Unassigned Devices" plugin.
- When the old Disk 3 is mounted, it shows zero data on it. BUT when the old Disk 3 is mounted, the new Disk 3 shows up as "Unmountable: not mounted". When I unplugged the old Disk 3 and rebooted, the new Disk 3 is mounted again, but with zero data.

I'm confused, because the old Disk 3 should still have the remaining data. Should I start the array with the old or the new Disk 3 unassigned?
trurl Posted November 29, 2021

1 hour ago, paululibro said:
"new Disk 3 is mounted but with zero data on it"

Are you sure you didn't format?

1 hour ago, paululibro said:
"Should I start array with old or new Disk 3 unassigned?"

No disk assigned as disk3.
paululibro Posted November 29, 2021

6 hours ago, trurl said:
"Are you sure you didn't format?"

It was a brand new empty disk. When you said it was not possible to rebuild the data lost to xfs_repair, I was at least hoping it would have the rest of the data from the old Disk 3.

Here are diagnostics with the old Disk 3 (the one that should have data but shows empty) plugged in but unassigned, and the new Disk 3 not connected at all.

orion2-diagnostics-20211129-1104.zip
paululibro Posted November 29, 2021

Also, when I click on "File System Check" next to the unassigned old Disk 3, I'm getting these logs:

FS: xfs
Executing file system check: /sbin/xfs_repair -n /dev/sdg1 2>&1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x37fffffd0/0x1000
btree block 7/1 is suspect, error -74
bad magic # 0xa202020 in btbno block 7/1
Metadata corruption detected at 0x43cc88, xfs_cntbt block 0x37fffffd8/0x1000
btree block 7/2 is suspect, error -117
bad magic # 0x49414233 in btcnt block 7/2
agf_freeblks 74458438, counted 0 in ag 7
agf_longest 74458438, counted 0 in ag 7
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x37fffffe0/0x1000
btree block 7/3 is suspect, error -74
bad magic # 0x58444233 in inobt block 7/3
Metadata corruption detected at 0x4536d0, xfs_bnobt block 0x80000000/0x1000
btree block 1/1 is suspect, error -117
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0xfffffff8/0x1000
btree block 2/1 is suspect, error -74
bad magic # 0x64383a61 in btbno block 2/1
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x27fffffe0/0x1000
Metadata corruption detected at 0x4536d0, xfs_bnobt block 0x1ffffffe8/0x1000
btree block 4/1 is suspect, error -117
btree block 5/1 is suspect, error -74
bad magic # 0x205f5f5f in btbno block 5/1
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x2ffffffd8/0x1000
btree block 6/1 is suspect, error -74
bad magic # 0xa202020 in btbno block 6/1
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x17ffffff0/0x1000
btree block 3/1 is suspect, error -74
bad magic # 0x51380c72 in btbno block 3/1
Metadata corruption detected at 0x4536d0, xfs_cntbt block 0x80000008/0x1000
btree block 1/2 is suspect, error -117
Metadata CRC error detected at 0x43cfad, xfs_cntbt block 0x100000000/0x1000
btree block 2/2 is suspect, error -74
bad magic # 0xe12ea68c in btcnt block 2/2
Metadata corruption detected at 0x4536d0, xfs_cntbt block 0x1fffffff0/0x1000
btree block 4/2 is suspect, error -117
Metadata corruption detected at 0x4536d0, xfs_cntbt block 0x27fffffe8/0x1000
btree block 5/2 is suspect, error -117
Metadata CRC error detected at 0x43cfad, xfs_cntbt block 0x2ffffffe0/0x1000
btree block 6/2 is suspect, error -74
bad magic # 0x54686973 in btcnt block 6/2
Metadata CRC error detected at 0x43cfad, xfs_cntbt block 0x17ffffff8/0x1000
btree block 3/2 is suspect, error -74
bad magic # 0x17b63fbe in btcnt block 3/2
agf_freeblks 268435445, counted 750 in ag 1
agf_freeblks 268435445, counted 0 in ag 2
agf_longest 268435445, counted 0 in ag 2
agf_longest 268435445, counted 5 in ag 1
agf_freeblks 268435445, counted 567 in ag 5
agf_longest 268435445, counted 4 in ag 5
agf_freeblks 267913717, counted 1320 in ag 4
agf_longest 267913717, counted 9 in ag 4
agf_freeblks 268435445, counted 0 in ag 6
agf_longest 268435445, counted 0 in ag 6
agf_freeblks 268435445, counted 0 in ag 3
agf_longest 268435445, counted 0 in ag 3
Metadata corruption detected at 0x4536d0, xfs_inobt block 0x80000010/0x1000
btree block 1/3 is suspect, error -117
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x100000008/0x1000
Metadata corruption detected at 0x4536d0, xfs_inobt block 0x1fffffff8/0x1000
btree block 2/3 is suspect, error -74
bad magic # 0x5829dbd5 in inobt block 2/3
btree block 4/3 is suspect, error -117
Metadata corruption detected at 0x4536d0, xfs_inobt block 0x27ffffff0/0x1000
btree block 5/3 is suspect, error -117
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x180000000/0x1000
btree block 3/3 is suspect, error -74
bad magic # 0xd3a6ff15 in inobt block 3/3
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x2ffffffe8/0x1000
btree block 6/3 is suspect, error -74
bad magic # 0x546f7272 in inobt block 6/3
agi_count 0, counted 1728 in ag 1
agi_freecount 0, counted 191 in ag 1
agi_count 0, counted 2016 in ag 4
agi_freecount 0, counted 134 in ag 4
agi_count 0, counted 2336 in ag 5
agi_freecount 0, counted 102 in ag 5
sb_icount 64, counted 6144
sb_ifree 61, counted 488
sb_fdblocks 1952984849, counted 268438106
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
found inodes not in the inode allocation tree
found inodes not in the inode allocation tree
found inodes not in the inode allocation tree
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
free space (1,2952230-2952262) only seen by one free space btree
free space (1,2952264-2952332) only seen by one free space btree
free space (1,2952334-2952418) only seen by one free space btree
[[[ IT GOES FOR THE NEXT ~1400 lines but with different values ]]]
- check for inodes claiming duplicate blocks...
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 0
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
Maximum metadata LSN (1610016023:-369629521) is ahead of log (1:58).
Would format log to cycle 1610016026.
No modify flag set, skipping filesystem flush and exiting.
paululibro Posted November 29, 2021

I've read multiple posts related to the "Unmountable: not mounted" error and I can't find anything similar to my issue. For example:
- What happened to the data on the original Disk 3? Is it still there but inaccessible due to filesystem errors?
- Why wasn't the content of the original Disk 3 rebuilt onto the new Disk 3, even though parity was valid and the rebuild finished with 0 errors?
Squid Posted November 29, 2021

4 minutes ago, paululibro said:
"What happened to the data on the original Disk 3? Is it still there but inaccessible due to filesystem errors?"

Yes.

6 minutes ago, paululibro said:
"Why wasn't the content of the original Disk 3 rebuilt onto the new Disk 3, even though parity was valid and the rebuild finished with 0 errors?"

Based upon the last set of diagnostics you posted, it really looks like at some point during all of this, when you started the array and the disk came up as unmountable, you hit the checkbox for format and acknowledged the pop-up stating that formatting is never part of a rebuild operation.

/dev/md3 7.3T 52G 7.3T 1% /mnt/disk3

So, in a nutshell, the system did what was asked and formatted the drive (or the emulated version of it), and subsequently rebuilt a blank filesystem. Your option right now is to fix the errors on the old disk 3 and then copy the data back into the array.
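[For readers following this thread: once the old disk's filesystem is repaired and it mounts with data visible, the "copy back into the array" step above can be sketched as below. This is a hedged sketch, not an official Unraid procedure: the source path is the Unassigned Devices mount point that appears later in this thread, and the RUN=echo guard turns it into a dry run that only prints the command. Clear RUN to actually copy.]

```shell
# Sketch of the copy-back step. Assumptions: the old disk 3 is mounted by
# Unassigned Devices at SRC (path taken from this thread's df output), and
# the rebuilt, empty disk 3 is mounted at /mnt/disk3.
SRC=/mnt/disks/WDC_WD80EZAZ-11TDBA0_2SG425ZW/   # old disk 3 (UD mount)
DST=/mnt/disk3/                                 # rebuilt array disk 3

RUN=echo   # dry run: prefixing with echo prints the command instead of running it

# -a preserves permissions/ownership/times, -v lists files, -X keeps xattrs.
# The trailing slash on SRC copies the *contents* of the mount, not the dir itself.
COPY_CMD=$($RUN rsync -avX "$SRC" "$DST")
echo "$COPY_CMD"
```

With RUN=echo this only prints `rsync -avX /mnt/disks/WDC_WD80EZAZ-11TDBA0_2SG425ZW/ /mnt/disk3/`; remove the guard once you have verified both paths.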
trurl Posted November 29, 2021

12 hours ago, trurl said:
"Are you sure you didn't format?"

6 hours ago, paululibro said:
"It was a brand new empty disk."

You didn't answer my question.
paululibro Posted November 29, 2021

Quote: "Are you sure you didn't format?"

I'm not 100% sure, but I don't think so. I connected the new disk, assigned it to Disk 3, and started the array. I got a message that a replacement disk was found, that disk 3 was not ready, and that a Parity-Sync/Data-Rebuild was in progress.
paululibro Posted November 29, 2021

1 hour ago, Squid said:
"Your option right now is to fix the errors on the old disk 3 and then copy the data back into the array."

So how do I proceed? Currently the new Disk 3 is mounted and empty, and the original Disk 3 is unassigned and mounted at /dev/sdg. Checking the file system gives the same logs as posted before. Do I just use xfs_repair but point it at /dev/sdg1?
trurl Posted November 29, 2021

3 minutes ago, paululibro said:
"Checking the file system gives the same logs as posted before. Do I just use xfs_repair but point it at /dev/sdg1?"

The check results you posted before were already using /dev/sdg1.

6 hours ago, paululibro said:
"No modify flag set"

You just need to remove the -n (no modify) flag.
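[For readers following this thread: the check-then-repair sequence described above can be sketched as below. The device path is the one from this thread's diagnostics. Since xfs_repair writes to the device once -n is dropped, the RUN=echo guard keeps this as a dry run that only prints the commands; clear RUN to execute them for real (with the filesystem unmounted).]

```shell
DEV=/dev/sdg1   # old disk 3's partition, per the diagnostics in this thread
RUN=echo        # dry run: print the commands instead of executing them

# Read-only check: -n reports problems but modifies nothing on disk.
CHECK_CMD=$($RUN xfs_repair -n "$DEV")

# Actual repair: the same command without -n. Files whose directory
# entries were lost end up in lost+found at the root of the filesystem.
REPAIR_CMD=$($RUN xfs_repair "$DEV")

echo "$CHECK_CMD"
echo "$REPAIR_CMD"
```

If the real run refuses because of a dirty log, xfs_repair suggests -L, which zeroes the log at the cost of any in-flight metadata updates.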
paululibro Posted November 30, 2021

Quoting Squid: "Your option right now is to fix the errors on the old disk 3 and then copy the data back into the array."

I ran xfs_repair:

root@Orion2:~# xfs_repair /dev/sdg1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
Metadata corruption detected at 0x4536d0, xfs_bnobt block 0x80000000/0x1000
btree block 1/1 is suspect, error -117
Metadata corruption detected at 0x4536d0, xfs_cntbt block 0x80000008/0x1000
btree block 1/2 is suspect, error -117
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0xfffffff8/0x1000
btree block 2/1 is suspect, error -74
bad magic # 0x64383a61 in btbno block 2/1
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x17ffffff0/0x1000
btree block 3/1 is suspect, error -74
bad magic # 0x51380c72 in btbno block 3/1
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x37fffffd0/0x1000
btree block 7/1 is suspect, error -74
bad magic # 0xa202020 in btbno block 7/1
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x27fffffe0/0x1000
Metadata CRC error detected at 0x43cfad, xfs_bnobt block 0x2ffffffd8/0x1000
btree block 6/1 is suspect, error -74
bad magic # 0xa202020 in btbno block 6/1
btree block 5/1 is suspect, error -74
Metadata corruption detected at 0x4536d0, xfs_bnobt block 0x1ffffffe8/0x1000
bad magic # 0x205f5f5f in btbno block 5/1
btree block 4/1 is suspect, error -117
agf_freeblks 268435445, counted 750 in ag 1
agf_longest 268435445, counted 5 in ag 1
Metadata CRC error detected at 0x43cfad, xfs_cntbt block 0x100000000/0x1000
btree block 2/2 is suspect, error -74
bad magic # 0xe12ea68c in btcnt block 2/2
Metadata corruption detected at 0x4536d0, xfs_cntbt block 0x1fffffff0/0x1000
btree block 4/2 is suspect, error -117
Metadata CRC error detected at 0x43cfad, xfs_cntbt block 0x17ffffff8/0x1000
btree block 3/2 is suspect, error -74
bad magic # 0x17b63fbe in btcnt block 3/2
Metadata CRC error detected at 0x43cfad, xfs_cntbt block 0x2ffffffe0/0x1000
btree block 6/2 is suspect, error -74
bad magic # 0x54686973 in btcnt block 6/2
Metadata corruption detected at 0x4536d0, xfs_cntbt block 0x27fffffe8/0x1000
btree block 5/2 is suspect, error -117
Metadata corruption detected at 0x43cc88, xfs_cntbt block 0x37fffffd8/0x1000
btree block 7/2 is suspect, error -117
bad magic # 0x49414233 in btcnt block 7/2
Metadata corruption detected at 0x4536d0, xfs_inobt block 0x80000010/0x1000
btree block 1/3 is suspect, error -117
agf_freeblks 268435445, counted 0 in ag 2
agf_longest 268435445, counted 0 in ag 2
agf_freeblks 268435445, counted 0 in ag 3
agf_longest 268435445, counted 0 in ag 3
agf_freeblks 267913717, counted 1320 in ag 4
agf_longest 267913717, counted 9 in ag 4
agf_freeblks 268435445, counted 567 in ag 5
agf_longest 268435445, counted 4 in ag 5
agf_freeblks 268435445, counted 0 in ag 6
agf_longest 268435445, counted 0 in ag 6
agf_freeblks 74458438, counted 0 in ag 7
agf_longest 74458438, counted 0 in ag 7
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x100000008/0x1000
btree block 2/3 is suspect, error -74
bad magic # 0x5829dbd5 in inobt block 2/3
Metadata corruption detected at 0x4536d0, xfs_inobt block 0x27ffffff0/0x1000
Metadata corruption detected at 0x4536d0, xfs_inobt block 0x1fffffff8/0x1000
btree block 4/3 is suspect, error -117
btree block 5/3 is suspect, error -117
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x180000000/0x1000
btree block 3/3 is suspect, error -74
bad magic # 0xd3a6ff15 in inobt block 3/3
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x37fffffe0/0x1000
Metadata CRC error detected at 0x46b78d, xfs_inobt block 0x2ffffffe8/0x1000
btree block 7/3 is suspect, error -74
btree block 6/3 is suspect, error -74
bad magic # 0x58444233 in inobt block 7/3
bad magic # 0x546f7272 in inobt block 6/3
agi_count 0, counted 1728 in ag 1
agi_freecount 0, counted 191 in ag 1
agi_count 0, counted 2016 in ag 4
agi_freecount 0, counted 134 in ag 4
agi_count 0, counted 2336 in ag 5
agi_freecount 0, counted 102 in ag 5
sb_icount 64, counted 6144
sb_ifree 61, counted 488
sb_fdblocks 1952984849, counted 268438106
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
found inodes not in the inode allocation tree
found inodes not in the inode allocation tree
found inodes not in the inode allocation tree
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 2
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1610016023:-369629521) is ahead of log (1:64).
Format log to cycle 1610016026.
done

But the original drive still shows up as empty.

Also, when the old disk is mounted, the new disk is unmountable.

And now there is also a "Format" button next to the old drive. If it's the one we talked about earlier, then I'm 100% sure I didn't format.

I started the array in maintenance mode and checked both drives.

Original drive:

FS: xfs
Executing file system check: /sbin/xfs_repair -n /dev/sdg1 2>&1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 6
- agno = 7
- agno = 4
- agno = 5
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

New drive:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

So what do I do now?

orion2-diagnostics-20211130-1115.zip
trurl Posted November 30, 2021 Share Posted November 30, 2021 Nov 29 17:54:22 Orion2 kernel: XFS (sdg1): Filesystem has duplicate UUID e2cc6cf0-db52-44ea-8bef-9974b89a834f - can't mount Nov 30 10:29:35 Orion2 kernel: XFS (md3): Filesystem has duplicate UUID e2cc6cf0-db52-44ea-8bef-9974b89a834f - can't mount The unassigned disk and disk3 have the same uuid. You will have to change the uuid of the unassigned disk. Click on the Settings icon for the unassigned disk. Quote Link to comment
paululibro Posted November 30, 2021

I'm trying to generate a new UUID, but I'm getting a timeout error in the syslog:

Nov 30 22:24:29 Orion2 unassigned.devices: Error: shell_exec(/usr/sbin/xfs_admin -U generate '/dev/sdg1') took longer than 20s!
Nov 30 22:24:29 Orion2 unassigned.devices: Changed partition UUID on '/dev/sdg1' with result: command timed out

After clicking "Change UUID" again, I'm getting this:

Nov 30 22:24:47 Orion2 unassigned.devices: Changed partition UUID on '/dev/sdg1' with result: ERROR: cannot find log head/tail, run xfs_repair

So I ran xfs_repair with the required -L flag and tried to generate again, but it's back to the timeout error and the head/tail error. I'm doing this in maintenance mode.

orion2-diagnostics-20211130-2231.zip
paululibro Posted November 30, 2021

I tried to generate it manually:

root@Orion2:~# xfs_admin -U generate /dev/sdg1
Clearing log and setting UUID
writing all SBs
new UUID = ab1f65dd-e188-4c67-afc6-9f89fc139e93

And now both disks are mounted, but the original disk still shows up as empty:

root@Orion2:/mnt/disks/WDC_WD80EZAZ-11TDBA0_2SG425ZW# df
Filesystem     1K-blocks       Used  Available Use% Mounted on
/dev/sdg1     7811939620   54499088 7757440532   1% /mnt/disks/WDC_WD80EZAZ-11TDBA0_2SG425ZW
/dev/md1      7811939620 7807708212    4231408 100% /mnt/disk1
/dev/md2      7811939620 7802174388    9765232 100% /mnt/disk2
/dev/md3      7811939620   54499088 7757440532   1% /mnt/disk3
/dev/md4      7811939620 7799822864   12116756 100% /mnt/disk4
/dev/md5      7811939620 7791383328   20556292 100% /mnt/disk5
/dev/sdc1      500107576  126153784  372420104  26% /mnt/cache
trurl Posted December 1, 2021

The first diagnostics you posted were after a reboot, so I can't see what you might have done before that, but it really seems like you must have formatted the original disk before replacing it.
paululibro Posted December 2, 2021

Well, that's unfortunate. Thanks for all your help anyway.