HenkaN Posted May 18, 2019
I need to rebuild disk6 from parity, but for some reason disk5 is now "forgotten". Unraid no longer knows which disk to expect there, so it says it will emulate the contents of disk5 and that there are "Too many wrong and/or missing disks!" to start the array and rebuild disk6 from parity. The error is understandable considering I run a single parity disk, so I can't expect it to rebuild both. But is there a way to add the ID of disk5 somewhere so it is "remembered" again and I can rebuild disk6? Ignore that I haven't selected a new drive in the screenshot; it doesn't matter whether I do or not.
trurl Posted May 18, 2019
Go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post.
HenkaN Posted May 18, 2019 Author
Here we go. henkraid-diagnostics-20190518-1854.zip
JorgeB Posted May 19, 2019
On 5/18/2019 at 8:50 AM, HenkaN said: disk5 is now "forgotten"
Disk5 isn't forgotten, it's disabled; we can't see why, because the diagnostics are from just after rebooting. You can force it enabled to rebuild disk6, but the rebuild might not be 100% successful depending on why and for how long disk5 has been disabled. Is disk6 dead?
HenkaN Posted May 19, 2019 Author
Yeah, we can't use the original disk6; that has to be rebuilt. I have no idea about this kind of thing, so how do I enable disk5 again? All I've done with disk5 after this happened is mount it with Unassigned Devices and pull the data off, just in case. It should be perfectly intact, just as it was left.
JorgeB Posted May 19, 2019
1 hour ago, HenkaN said: mount it with Unassigned Devices and pull the data off, just in case. It should be perfectly intact, just as it was left.
If the disk was mounted read/write it won't be intact, as there are always some writes from filesystem housekeeping. Still, if that's the only thing that was done, and depending on the filesystem used, the rebuild should be mostly successful. To try it:
- Tools -> New Config -> Retain current configuration: All -> Apply
- Assign any missing disk(s), including old disk5 and new disk6
- Important - after checking the assignments, leave the browser on that page, the "Main" page
- Open an SSH session or use the console and type (don't copy/paste directly from the forum, as it can sometimes insert extra characters):
mdcmd set invalidslot 6 29
- Back in the GUI, without refreshing the page, just start the array. Do not check the "parity is already valid" box (the GUI will still show that data on the parity disk(s) will be overwritten; this is normal, as it doesn't account for the invalid slot command, but parity won't be overwritten as long as the procedure was done correctly). Disk6 will start rebuilding. The disk should mount immediately, but if it's unmountable don't format it; wait for the rebuild to finish and then run a filesystem check.
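For reference, a minimal sketch of the console portion of the procedure above, assuming SSH access to the server. The /proc/mdstat check at the end is an assumption (one way to eyeball the md driver state before starting the array), not something the procedure requires, and the reading of the two slot numbers is an interpretation rather than something stated explicitly above:

# run from an SSH session or the local console, typed by hand rather than pasted
# 6  = the array slot to be marked invalid and rebuilt (disk6)
# 29 = the second parity slot, unused on this single-parity array, so the
#      existing parity stays valid (interpretation, see lead-in above)
mdcmd set invalidslot 6 29

# optional sanity check before starting the array (assumption): the md driver
# state is exposed here; exact field names vary by Unraid version
cat /proc/mdstat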
HenkaN Posted May 19, 2019 Author
12 minutes ago, johnnie.black said: Disk6 will start rebuilding. The disk should mount immediately, but if it's unmountable don't format it; wait for the rebuild to finish and then run a filesystem check.
Seems to be working, and it does indeed say "Unmountable: No file system". You want me to start the array in maintenance mode after it's done and "Check Filesystem Status"?
JorgeB Posted May 19, 2019
3 minutes ago, HenkaN said: You want me to start the array in maintenance mode after it's done and "Check Filesystem Status"?
Yes.
HenkaN Posted May 19, 2019 Author
Alright, we'll see in about 6 hours whether it turns out to be a success. At least it looks better than I hoped for so far. Big thanks, man!
JorgeB Posted May 19, 2019
If you want, post current diags; they should give a better idea of the filesystem status.
HenkaN Posted May 19, 2019 Author
Alright. This is as of right now with the rebuild ongoing. henkraid-diagnostics-20190519-1103.zip
JorgeB Posted May 19, 2019
A valid XFS filesystem is being detected, and that's good news. There is metadata corruption, but xfs_repair should be able to fix it. When the rebuild is done, start the array in maintenance mode and run:
xfs_repair -v /dev/md6
It will likely tell you to use -L to zero the log; if so, run again with:
xfs_repair -vL /dev/md6
You can also use the GUI for the check, using the same options as needed.
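As a hedged addition (not part of the instructions above): if you want to preview what the repair would change before committing, xfs_repair also supports a read-only pass. From the same maintenance-mode console, with /dev/md6 being disk6 as above:

# read-only check: reports problems but modifies nothing
xfs_repair -n /dev/md6

# the actual repair, as instructed above; add -L only if xfs_repair asks for it
xfs_repair -v /dev/md6
# xfs_repair -vL /dev/md6    # last resort: zeroes the log before repairing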
trurl Posted May 19, 2019
Do you have Notifications set up to alert you immediately, by email or another agent, when Unraid detects a problem? If you don't deal with a single problem when it happens, you may end up with multiple problems that are more likely to cause data loss.
HenkaN Posted May 19, 2019 Author
Okay, here's the full picture. My server kind of froze: I could navigate the UI but I couldn't change anything. Reboot, shutdown, nothing worked, so I had to pull the power to reboot it. When I powered it back up I got this metadata error and tried to correct it with xfs_repair; it needed -L, as you said here. That didn't do anything, and every time I tried to start the array it would get stuck on mounting disk6 and throw this (and require me to cut the power, because nothing worked when that happened). This pissed me off for a while; I had people tell me it's probably cable issues, the drive failing, or I don't know what. So I decided to pull the drive and plug it into another PC to see whether it was fine there or not. Turns out it was fine. I was still pissed, though, so I formatted it and wrote a ton of data to it to see if anything weird happened. I honestly don't even care anymore if all or some of the data is lost or corrupted. However, it feels like that metadata issue will be the next thing to deal with and I'm back at square one, unless xfs_repair decides to actually fix it now. My hope was that the corruption hadn't been written to parity (like that would actually be the case :p), but it seems it has now? We'll see in about 5 hours whether that's still the case or not.
HenkaN Posted May 19, 2019 Author
7 minutes ago, trurl said: Do you have Notifications set up to alert you immediately, by email or another agent, when Unraid detects a problem? If you don't deal with a single problem when it happens, you may end up with multiple problems that are more likely to cause data loss.
I have notifications. Is there anything you see in the diags that I should be aware of?
JorgeB Posted May 19, 2019
28 minutes ago, HenkaN said: That didn't do anything.
If it doesn't work after the rebuild is done, post the complete xfs_repair output.
HenkaN Posted May 19, 2019 Author
5 minutes ago, johnnie.black said: If it doesn't work after the rebuild is done, post the complete xfs_repair output.
Does that log to a file? The last time I ran xfs_repair it barely returned any information at all.
JorgeB Posted May 19, 2019
1 minute ago, HenkaN said: Does that log to a file?
No, just copy/paste from the console.
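If a saved copy is wanted anyway, one option (the log path here is only an example; any location outside the array, such as the USB flash drive mounted at /boot, would do) is to tee the console output to a file while still seeing it live:

# show the output on screen and also write it to the flash drive (example path)
xfs_repair -v /dev/md6 2>&1 | tee /boot/xfs_repair_disk6.log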
trurl Posted May 19, 2019
59 minutes ago, HenkaN said: I was still pissed, though, so I formatted it and wrote a ton of data to it to see if anything weird happened.
Writing anything to an array disk while it is outside the array invalidates parity.
1 hour ago, HenkaN said: My hope was that the corruption hadn't been written to parity (like that would actually be the case :p), but it seems it has now?
Parity doesn't actually contain any data, and parity cannot fix filesystem corruption.
trurl Posted May 19, 2019
41 minutes ago, trurl said: Writing anything to an array disk while it is outside the array invalidates parity.
If this was the disabled disk, then it was no longer part of parity, though.
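To illustrate the point about parity: single parity is an XOR across the data disks, so it can reconstruct one missing disk bit-for-bit from the surviving disks, but it holds no files and knows nothing about filesystems. A minimal bash sketch with made-up byte values (purely illustrative, not how Unraid is implemented internally):

# three data "disks", one byte each (made-up values)
d1=0xA5; d2=0x3C; d3=0x0F
parity=$(( d1 ^ d2 ^ d3 ))             # the parity byte is not a copy of any data
# if d3 is lost, XOR the survivors with parity to rebuild it exactly
rebuilt=$(( d1 ^ d2 ^ parity ))
printf 'rebuilt=0x%02X original=0x%02X\n' "$rebuilt" "$d3"
# the rebuild is bit-for-bit, so if the lost disk's filesystem was already
# corrupt, the corruption comes back too; parity cannot repair that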
HenkaN Posted May 19, 2019 Author
1 hour ago, trurl said: Writing anything to an array disk while it is outside the array invalidates parity. Parity doesn't actually contain any data, and parity cannot fix filesystem corruption.
The disk I was testing in another PC hasn't been put back in the array; that's the disk I'm rebuilding right now. And what I meant about the metadata corruption not being written to parity: since it happened during a bad reboot, it might not have affected the parity disk.
HenkaN Posted May 19, 2019 Author
It doesn't look like disk6 is detected as an XFS filesystem. I restarted the system and it doesn't look like that changed anything. Not sure if that's a problem? I did the following, but didn't run the actual repair (the option to check it isn't in the GUI). It says "FS auto", "Unmountable: No file system", and at the bottom "Unmountable disk present: Disk 6 • WDC_WD40EFRX-68N32N0_WD-WCC7K2RK8VU8 (sdb)", and I have the option to format it.

root@Henkraid:~# xfs_repair -n /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent or not a log (last==0, first!=1)
empty log check failed
zero_log: cannot find log head/tail (xlog_find_tail=22)
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 3/35062848, freecount 0 nfree 5
inode rec for ino 6521313728 (3/78862784) overlaps existing rec (start 3/78862784)
agi_freecount 57, counted 14 in ag 3
sb_icount 2496, counted 2560
sb_ifree 397, counted 308
sb_fdblocks 496361854, counted 491894032
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
imap claims a free inode 6477511085 is in use, would correct imap and clear inode
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
entry "South Park - S01E13 - Cartman's Mom is a Dirty Slut Bluray-720p.mp4" at block 0 offset 2824 in directory inode 6477511060 references free inode 6477511085
would clear inode number in entry at offset 2824...
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
Maximum metadata LSN (1:26702) is ahead of log (0:0).
Would format log to cycle 4.
No modify flag set, skipping filesystem flush and exiting.

henkraid-diagnostics-20190519-2056.zip
JorgeB Posted May 20, 2019
10 hours ago, HenkaN said: No modify flag set
Run xfs_repair without -n
HenkaN Posted May 20, 2019 Author
8 hours ago, johnnie.black said: Run xfs_repair without -n
Alright, that seems to have fixed it now; it's detected as an XFS filesystem and it mounts without any issues, from what I can tell so far. I'll provide the xfs_repair output and my diagnostics just in case there's anything more to it. I guess now I just have to slowly go through the data and see what's lost due to all of my problems, lol. Thanks a lot, man!

root@Henkraid:~# xfs_repair /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent or not a log (last==0, first!=1)
empty log check failed
zero_log: cannot find log head/tail (xlog_find_tail=22)
ERROR: The log head and/or tail cannot be discovered. Attempt to mount the filesystem to replay the log or use the -L option to destroy the log and attempt a repair.

---------------------------

root@Henkraid:~# xfs_repair -L /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent or not a log (last==0, first!=1)
empty log check failed
zero_log: cannot find log head/tail (xlog_find_tail=22)
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 3/35062848, freecount 0 nfree 5
inode rec for ino 6521313728 (3/78862784) overlaps existing rec (start 3/78862784)
agi_freecount 57, counted 14 in ag 3
sb_icount 2496, counted 2560
sb_ifree 397, counted 308
sb_fdblocks 496361854, counted 491894032
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
imap claims a free inode 6477511085 is in use, correcting imap and clearing inode
cleared inode 6477511085
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
entry "South Park - S01E13 - Cartman's Mom is a Dirty Slut Bluray-720p.mp4" at block 0 offset 2824 in directory inode 6477511060 references free inode 6477511085
clearing inode number in entry at offset 2824...
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad hash table for directory inode 6477511060 (no data entry): rebuilding
rebuilding directory inode 6477511060
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (4:49915) is ahead of log (1:2).
Format log to cycle 7.
done

---------------------------

root@Henkraid:~# xfs_repair /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

henkraid-diagnostics-20190520-1541.zip
JorgeB Posted May 20, 2019
Data should be mostly OK, assuming parity was in sync.