unRAID forgot one disk assignment and I need to rebuild another one from parity.



I need to rebuild disk6 from parity; however, for some reason disk5 is now "forgotten". unRAID doesn't know which disk to expect there, so it tells me it will emulate the contents of disk5 and that I have "Too many wrong and/or missing disks!" to start the array and rebuild disk6 from parity. The error is understandable considering I run a single parity disk, so I can't expect it to rebuild both. But is there a way to add the ID of disk5 somewhere so that unRAID will "remember" it again and I can rebuild disk6?

 

Ignore that I haven't selected a new drive in the screenshot; it doesn't matter whether I do or not.

[Screenshot: array device assignments showing the missing/wrong disks error]

On 5/18/2019 at 8:50 AM, HenkaN said:

disk5 is now "forgotten"

Disk5 isn't forgotten, it's disabled. We can't see why, because the diags were taken just after rebooting. You can force it enabled in order to rebuild disk6, but the rebuild might not be 100% successful depending on why and for how long disk5 has been disabled. Is disk6 dead?


Yeah, we can't use the original disk6; that has to be rebuilt. I have no idea about these kinds of things, so how do I enable disk5 again? All I've done with disk5 after this happened is mounted it with Unassigned Devices and pulled the data just in case. It should be perfectly intact, just as it was left before.

1 hour ago, HenkaN said:

mounted it with Unassigned Devices and pulled the data just in case. It should be perfectly intact, just as it was left before.

If the disk was mounted read/write it won't be intact, as there are always some writes due to filesystem housekeeping. Still, if that's the only thing done, and depending on the filesystem used, the rebuild should be mostly successful.
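(As an aside, if you ever need to pull data off an array disk outside the array again, mounting it read-only avoids those housekeeping writes. A minimal sketch, assuming an XFS partition and a hypothetical device name sdX1; check the real name with lsblk first:)

# norecovery stops XFS from replaying its journal, so nothing at all gets written to the disk
mkdir -p /mnt/temp
mount -o ro,norecovery /dev/sdX1 /mnt/temp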

 

To try it:

-Tools -> New Config -> Retain current configuration: All -> Apply
-Assign any missing disk(s), including old disk5 and new disk6
-Important: after checking the assignments, leave the browser on that page, the "Main" page.

-Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):

mdcmd set invalidslot 6 29

-Back on the GUI, and without refreshing the page, just start the array. Do not check the "parity is already valid" box. The GUI will still show that data on the parity disk(s) will be overwritten; this is normal, since it doesn't account for the invalid slot command, and parity won't actually be overwritten as long as the procedure was done correctly. Disk6 will start rebuilding. The disk should mount immediately, but if it's unmountable, don't format it; wait for the rebuild to finish and then run a filesystem check.
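If you want to keep an eye on the rebuild from the console rather than the GUI, something like this should work (a minimal sketch; the mdcmd status output and its mdState/mdResync fields are assumed from the stock unRAID md driver):

# Prints the overall array state and the current resync/rebuild position
mdcmd status | egrep "mdState|mdResync"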

12 minutes ago, johnnie.black said:

Disk6 will start rebuilding. The disk should mount immediately, but if it's unmountable, don't format it; wait for the rebuild to finish and then run a filesystem check.

Seems to be working, and it does indeed say "Unmountable: No file system".

Do you want me to start the array in maintenance mode after it's done and run "Check Filesystem Status"?


A valid xfs filesystem is being detected, and that's good news. There's metadata corruption, but xfs_repair should be able to fix it. When the rebuild is done, start the array in maintenance mode and run:

 

xfs_repair -v /dev/md6

 

It will likely tell you to use -L to zero the log; if so, run again with:

 

xfs_repair -vL /dev/md6

You can also use the GUI for the check, using the same options as needed.
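If you'd rather see what would be changed before committing to anything, xfs_repair also has a no-modify mode that only reports problems; this is what produces the "would correct" / "would clear" style of output shown further down the thread:

# Read-only check: reports what a repair would do but changes nothing on disk
xfs_repair -n /dev/md6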


Okay, here's the full picture. My server kinda froze: I could navigate the UI but I couldn't change anything; reboot, shutdown, nothing worked. So I had to pull the power to reboot it. When I powered it back up I got this metadata error and tried to correct it with xfs_repair, which needed -L as you said here. That didn't do anything. And every time I tried to start up the array it would get stuck on mounting disk6 and throw the error below (and require me to cut the power again, because nothing worked when that happened).

 

[Photo of the error thrown when mounting disk6]

 

This pissed me off for a while; I had people tell me it's probably cable issues, the drive is failing, and I don't know what else. So I decided to pull the drive and plug it into another PC to see whether it was fine there or not. Turns out it was fine. I was still pissed though, so I formatted it and wrote a ton of data to it to see if anything weird happened.

I honestly don't even care anymore if all or some of the data is lost/corrupted. However, it feels like that metadata issue will be the next thing to deal with and I'm back at square one, unless xfs_repair decides to actually fix it now. My hope was that the corruption hadn't been written to parity (like that would actually be the case :p), but it seems like it has been now?

However, we'll see in about 5 hours whether that's still the case or not.

7 minutes ago, trurl said:

Do you have Notifications set up to alert you immediately by email or another agent when Unraid detects a problem? If you don't deal with a single problem when it happens, you may end up with multiple problems that are more likely to cause data loss.

I have notifications. Is there anything you see in the diags that I should be aware of?

59 minutes ago, HenkaN said:

I was still pissed though, so I formatted it and wrote a ton of data to it to see if anything weird happened.

Writing anything to an array disk while it is outside the array invalidates parity.

 

1 hour ago, HenkaN said:

My hope was that the corruption hadn't been written to parity (like that would actually be the case :p), but it seems like it has been now?

Parity doesn't actually contain any data, and parity cannot fix filesystem corruption.
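To make that concrete, here's a toy shell illustration (not unRAID code): single parity is essentially a bit-wise XOR across the data disks, so it can reconstruct a missing disk bit-for-bit, corruption and all, but it holds no files of its own and has no idea what the bits mean.

# Two "data disk" bytes and the parity byte that would cover that position
d1=0xA5; d2=0x3C
parity=$(( d1 ^ d2 ))
# "Rebuilding" d2 from d1 plus parity returns exactly what was there before, good or bad
rebuilt=$(( d1 ^ parity ))
printf 'parity=0x%02X  rebuilt d2=0x%02X\n' "$parity" "$rebuilt"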

1 hour ago, trurl said:

 Writing anything to an array disk while it is outside the array invalidates parity.

 

Parity doesn't actually contain any data, and parity cannot fix filesystem corruption.

The disk I was testing in another PC hasn't been put back in the array. That's the disk I'm rebuilding right now.

And what I meant about the metadata corruption not being written to parity was that, since it happened during a bad reboot, it might not have affected the parity disk.


It doesn't look like disk6 is detected as an xfs filesystem. I restarted the system, and it doesn't look like that changed anything.

Not sure if that's a problem? I did this (output below) but didn't run the actual repair (the option to check it isn't in the GUI).

It says "FS auto" and "Unmountable: No file system", and at the bottom "Unmountable disk present: Disk 6 • WDC_WD40EFRX-68N32N0_WD-WCC7K2RK8VU8 (sdb)". And I have the option to format it.

root@Henkraid:~# xfs_repair -n /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent or not a log (last==0, first!=1)
empty log check failed
zero_log: cannot find log head/tail (xlog_find_tail=22)
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 3/35062848, freecount 0 nfree 5
inode rec for ino 6521313728 (3/78862784) overlaps existing rec (start 3/78862784)
agi_freecount 57, counted 14 in ag 3
sb_icount 2496, counted 2560
sb_ifree 397, counted 308
sb_fdblocks 496361854, counted 491894032
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
imap claims a free inode 6477511085 is in use, would correct imap and clear inode
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
entry "South Park - S01E13 - Cartman's Mom is a Dirty Slut Bluray-720p.mp4" at block 0 offset 2824 in directory inode 6477511060 references free inode 6477511085
        would clear inode number in entry at offset 2824...
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
Maximum metadata LSN (1:26702) is ahead of log (0:0).
Would format log to cycle 4.
No modify flag set, skipping filesystem flush and exiting.

 

 

henkraid-diagnostics-20190519-2056.zip

8 hours ago, johnnie.black said:

Run xfs_repair without -n

Alright, that seems to have fixed it now; it's detected as an xfs filesystem and it mounts without any issues from what I can tell so far. I'll provide the xfs_repair output and my diagnostics just in case there's anything more to it. I guess now I just have to slowly go through the data and see what's lost due to all of my problems, lol. Thanks a lot, man!

 

root@Henkraid:~# xfs_repair /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent or not a log (last==0, first!=1)
empty log check failed
zero_log: cannot find log head/tail (xlog_find_tail=22)
ERROR: The log head and/or tail cannot be discovered. Attempt to mount the
filesystem to replay the log or use the -L option to destroy the log and
attempt a repair.

---------------------------

root@Henkraid:~# xfs_repair -L /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent or not a log (last==0, first!=1)
empty log check failed
zero_log: cannot find log head/tail (xlog_find_tail=22)
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 3/35062848, freecount 0 nfree 5
inode rec for ino 6521313728 (3/78862784) overlaps existing rec (start 3/78862784)
agi_freecount 57, counted 14 in ag 3
sb_icount 2496, counted 2560
sb_ifree 397, counted 308
sb_fdblocks 496361854, counted 491894032
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
imap claims a free inode 6477511085 is in use, correcting imap and clearing inode
cleared inode 6477511085
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
entry "South Park - S01E13 - Cartman's Mom is a Dirty Slut Bluray-720p.mp4" at block 0 offset 2824 in directory inode 6477511060 references free inode 6477511085
        clearing inode number in entry at offset 2824...
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad hash table for directory inode 6477511060 (no data entry): rebuilding
rebuilding directory inode 6477511060
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (4:49915) is ahead of log (1:2).
Format log to cycle 7.
done

---------------------------

root@Henkraid:~# xfs_repair /dev/md6
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

 

henkraid-diagnostics-20190520-1541.zip
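One thing worth checking after a repair like this: the "moving disconnected inodes to lost+found" phase in the output above means xfs_repair may have parked orphaned files there. A quick way to look, assuming the usual unRAID per-disk mount point for disk6:

# Lists anything xfs_repair had to detach; an empty or missing directory means nothing was orphaned
ls -la /mnt/disk6/lost+found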
