
Did I delete my data drive?


Solved by JorgeB


After an unclean shutdown, the two disks in my array were no longer being recognized, and I tried to recreate the array using New Config.

 

I thought I remembered which was my parity disk, so I stupidly tried each disk in the parity slot to see if the configuration worked. Now both disks are showing as "unmountable" in the interface.

 

Only later did I find the Unmountable Disks section of the support documentation, which says this:


If you know which drives were the parity drives then you can simply reassign the drives. However, if you do not know which were the parity drives then you have to be more careful as incorrectly assigning a drive to parity that is really a data drive will result in you losing its contents.

Since I've already tried running the array with both drives as parity, does this mean that I've created a learning opportunity for myself by deleting all the information on my data drive?

If not, how do I recover this?

If yes, how do I recover this? Do I just need to

  1. Reformat the data drive and
  2. Recreate parity?

 

Since I use my server as a home media server (Plex), does that mean I need to reacquire all the media that was on the data disk?

Thanks in advance to anyone willing to help out with my idiocy.

20 minutes ago, auntyant said:

two disks in my array

With a single parity drive and a single data drive, parity is effectively a mirror of the data drive. Is that what you had? If so, either disk should be mountable (or at least repairable) as long as you haven't done anything else to them.

 

Attach Diagnostics to your NEXT post in this thread. Don't do anything else without further advice; your proposed solution (reformatting the data drive and rebuilding parity) would definitely lose your data.
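If the webGUI is unreachable, the diagnostics zip can also be generated from the console. A minimal sketch, assuming a reasonably recent Unraid release (the output location may vary by version):

root@pumpkinpasty:~# diagnostics        # writes an anonymized zip of system logs to the flash drive
root@pumpkinpasty:~# ls /boot/logs/     # the zip normally lands here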


Thank you for the quick reply. I have four disks in total (one cache, two data, one parity), but only two that I thought were relevant; the other disks are easily identifiable. The two in question are the same make and size, so when I configure the array their only distinguishing information is the serial number, and I don't remember which of them was my parity.

 

I've already started the array with each of the 10TB disks in the parity slot to test them, which is why I think I've already lost the data.

 

Diagnostics attached: pumpkinpasty-diagnostics-20240531-1937.zip


🫡

 

root@pumpkinpasty:~# xfs_repair -v /dev/md2p1
Phase 1 - find and verify superblock...
        - block cache size set to 686464 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 53028 tail block 53024
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
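For context: xfs_repair refuses to touch a filesystem with a dirty journal. The safer first step is to attempt a mount, which replays the log, rather than reaching for -L, which zeroes the log and can discard recent metadata changes. A minimal sketch, assuming the array is started in maintenance mode and /mnt/test is a scratch mountpoint created just for this:

root@pumpkinpasty:~# mkdir -p /mnt/test
root@pumpkinpasty:~# mount /dev/md2p1 /mnt/test && umount /mnt/test   # a successful mount replays the log
root@pumpkinpasty:~# xfs_repair -v /dev/md2p1                         # re-run; the log should now be clean

Only if the mount itself fails is -L the remaining option, as used below.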


root@pumpkinpasty:~# xfs_repair -vL /dev/md2p1
Phase 1 - find and verify superblock...
        - block cache size set to 686464 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 53028 tail block 53024
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
finobt ir_freecount/free mismatch, inode chunk 0/137247744, freecount 26 nfree 28
agi_freecount 26, counted 28 in ag 0
sb_icount 3456, counted 3392
sb_ifree 289, counted 360
sb_fdblocks 1186214285, counted 1179649446
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
entry "                                                                                                                 ksos" in shortform directory 128 references invalid inode 0
entry #0 extends past end of dir in shortform dir 128, junking 1 entries
corrected entry count in directory 128, was 1, now 0
corrected i8 count in directory 128, was 1, now 0
corrected directory 128 size, was 35, now 6
corrected root directory 128 .. entry, was 67133673, now 128
bad inode type 0 inode 131
cleared inode 131
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 137150578, moving to lost+found
disconnected dir inode 137246311, moving to lost+found
disconnected dir inode 2147483776, moving to lost+found
disconnected dir inode 4294967424, moving to lost+found
disconnected dir inode 19420599226, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 128 nlinks from 4 to 3
resetting inode 131 nlinks from 2 to 6
Maximum metadata LSN (1:53067) is ahead of log (1:2).
Format log to cycle 4.

        XFS_REPAIR Summary    Fri May 31 20:00:46 2024

Phase           Start           End             Duration
Phase 1:        05/31 20:00:13  05/31 20:00:13
Phase 2:        05/31 20:00:13  05/31 20:00:24  11 seconds
Phase 3:        05/31 20:00:24  05/31 20:00:24
Phase 4:        05/31 20:00:24  05/31 20:00:24
Phase 5:        05/31 20:00:24  05/31 20:00:24
Phase 6:        05/31 20:00:24  05/31 20:00:24
Phase 7:        05/31 20:00:24  05/31 20:00:24

Total run time: 11 seconds
done
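Anything xfs_repair could not reconnect (the "disconnected inode ... moving to lost+found" lines above) ends up in a lost+found directory at the root of that filesystem. With the array started, each data disk is mounted under /mnt/diskN; assuming md2p1 corresponds to disk2, a quick look would be:

root@pumpkinpasty:~# ls -la /mnt/disk2/lost+found/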


When the array restarted, it seems to have successfully recognized both the parity disk and the 10TB data disk, but the second data disk was suddenly a stranger. Without really knowing how, I seem to have initiated a Data-Rebuild of the now-unknown second disk. I'm assuming that means data disk 2 will be rebuilt from what the now-recognized parity disk and data disk 1 know.

 

The upside is I didn't lose two disks, just one? Ideally I just wait for the rebuild to complete, then check the share filesystem to see whether the media content still exists?

Again, really appreciate your help on all of this.


Clearly I also don't follow my own logic. I'm not sure what the Data-Rebuild did, but the second data disk is still unmountable. I think it might have just been parity rebuilding itself.

 

Here's all I know:

  • There are two data disks: one 10TB and one 4TB
  • There is one parity disk: 10TB
  • Earlier, the 4TB data disk was mountable but the other two were not
  • After your repair process, the 10TB data disk was successfully recognized and mounted
  • I restarted the server
  • The 10TB data disk and the 10TB parity disk were recognized
  • The 4TB data disk was suddenly 'unmountable' and a Data-Rebuild was running (not sure how I initiated that)

 

Ignoring more of my own theories for the moment - diags attached.

pumpkinpasty-diagnostics-20240601-0729.zip


root@pumpkinpasty:~# xfs_repair -v /dev/md3p1
Phase 1 - find and verify superblock...
        - block cache size set to 731168 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 97 tail block 97
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x44228d, xfs_bnobt block 0x82e3be0/0x1000
btree block 0/17155964 is suspect, error -74
bad magic # 0x7d7d4b2f in btbno block 0/17155964
Metadata CRC error detected at 0x44228d, xfs_cntbt block 0x82e3bf8/0x1000
btree block 0/17155967 is suspect, error -74
bad magic # 0x72740b77 in btcnt block 0/17155967
Metadata corruption detected at 0x458f90, xfs_refcountbt block 0x82cc018/0x1000
btree block 0/17143811 is suspect, error -117
agf_freeblks 244188641, counted 0 in ag 0
agf_longest 244188635, counted 0 in ag 0
Metadata corruption detected at 0x458f90, xfs_inobt block 0x78/0x1000
btree block 0/15 is suspect, error -117
Metadata CRC error detected at 0x47191d, xfs_finobt block 0x82cb3f8/0x1000
btree block 0/17143423 is suspect, error -74
bad magic # 0x66696213 in finobt block 0/17143423
agi_count 64, counted 384 in ag 0
agi_freecount 61, counted 27 in ag 0
agi_freecount 61, counted 0 in ag 0 finobt
sb_icount 64, counted 448
sb_ifree 61, counted 90
sb_fdblocks 987030708, counted 732089022
root inode chunk not found
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
found inodes not in the inode allocation tree
        - process known inodes and perform inode discovery...
        - agno = 0
entry "                                                                                                         B" in shortform directory 128 references invalid inode 0
entry #0 extends past end of dir in shortform dir 128, junking 2 entries
corrected entry count in directory 128, was 2, now 0
corrected i8 count in directory 128, was 1, now 0
corrected directory 128 size, was 37, now 6
corrected root directory 128 .. entry, was 167796972, now 128
bad inode type 0 inode 131
cleared inode 131
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 2147483776, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 128 nlinks from 5 to 3
resetting inode 131 nlinks from 2 to 3
Maximum metadata LSN (1684631143:1025665395) is ahead of log (1:97).
Format log to cycle 1684631146.

        XFS_REPAIR Summary    Sat Jun  1 13:21:20 2024

Phase           Start           End             Duration
Phase 1:        06/01 13:21:01  06/01 13:21:01
Phase 2:        06/01 13:21:01  06/01 13:21:01
Phase 3:        06/01 13:21:01  06/01 13:21:01
Phase 4:        06/01 13:21:01  06/01 13:21:01
Phase 5:        06/01 13:21:01  06/01 13:21:01
Phase 6:        06/01 13:21:01  06/01 13:21:01
Phase 7:        06/01 13:21:01  06/01 13:21:01

Total run time:
done
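Given how many corrections this pass made, an optional sanity check is xfs_repair's no-modify mode, which re-scans and reports any remaining problems without changing anything:

root@pumpkinpasty:~# xfs_repair -n /dev/md3p1   # read-only; exits non-zero if corruption remains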


The 4TB should have some content on it, I believe, but it's possible it wouldn't; the 10TB was far from full. I'm ashamed to admit I don't know how to examine the contents of a drive in isolation. I can navigate to my share and examine its contents, which are missing (presumably due to the array configuration), but I'm not sure how to locate the drive itself on the command line.

1 hour ago, auntyant said:

The 4TB should have some content on it, I believe, but it's possible it wouldn't; the 10TB was far from full. I'm ashamed to admit I don't know how to examine the contents of a drive in isolation. I can navigate to my share and examine its contents, which are missing (presumably due to the array configuration), but I'm not sure how to locate the drive itself on the command line.

If you have the Dynamix File Manager plugin installed, you don't need to go to the command line to examine a drive's contents. It's worth getting used to this plugin, as it is apparently going to be a built-in feature of future Unraid releases.
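For the command-line route: with the array started, each array disk is mounted under /mnt/diskN and the pooled user shares under /mnt/user, so individual drives can be browsed directly. The slot numbers below are assumptions based on the md2p1/md3p1 devices used earlier:

root@pumpkinpasty:~# ls -la /mnt/disk2/   # 10TB data disk, per md2p1 above
root@pumpkinpasty:~# ls -la /mnt/disk3/   # 4TB data disk, per md3p1 above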


@itimpi thank you - installed and used.

 

  • The 4TB has nothing on it, not even a lost+found
  • The 10TB is missing the content from the main folder structure, but the files are all there under one of the lost+found folders

What's my recommended course of action here? Should I transfer all the lost+found content back to the main filesystem tree or is there an in-between step?

6 hours ago, auntyant said:

transfer all the lost+found content back to the main filesystem tree

If everything has the correct file names and subfolders, that's the best bet. Sometimes the filesystem check can't determine the correct file or folder name and assigns placeholder names, which require you to manually work out what each file is and where it belongs.
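As a sketch of the move itself (the disk number and "media" share name are placeholders; adjust to your layout): keeping both source and destination on the same /mnt/diskN path makes it an on-disk move and avoids mixing /mnt/diskN with /mnt/user paths, which is a known way to lose data on Unraid.

root@pumpkinpasty:~# rsync -av --remove-source-files /mnt/disk2/lost+found/ /mnt/disk2/media/
root@pumpkinpasty:~# find /mnt/disk2/lost+found -type d -empty -delete   # clear out leftover empty directories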

