
Need help to make sure I don't lose data


Recommended Posts

Hello everyone. Long-time user; I'll cut right to the point.

I needed to shrink my array due to wanting to remove a bad drive, but I don't have another drive to take its place. I followed the steps outlined here: https://wiki.unraid.net/Shrink_array#The_.22Remove_Drives_Then_Rebuild_Parity.22_Method

 

First HUGE mistake, and I'm still hating myself every minute for this. But I took a screenshot of the array, and then saved said screenshot on my array. Honestly, how idiotic of me. Anyhow, I got to the step of starting the array and it wouldn't start because for some reason it said my GUID was blacklisted. I debated for some time, but ultimately ended up deciding to reboot. Unraid came back up and now wants me to assign devices.

 

[Screenshot: device assignment screen after the reboot]

 

I know the Hitachi is my parity drive, 100% sure.

The Samsung is either my cache drive or my unassigned device. 80% sure it's my cache drive.

The ancient OCZ is an unassigned device for a VM, or my cache drive. 80% sure it's an unassigned device.

This leaves sdd and sdb.  I believe the drive that failed no longer appears here.  Disk 1 was the failed disk.  

 

My problem statement is that I am not sure if sdd was Disk 2 or 3. Thus, I don't know if sdb is Disk 3 or 2.

 

Is there a way to still retain my data?

 

Edit:  I was able to mount sdb and get my array image.  

[Screenshot: original array assignments]

 

However, when I put the disks in those assignments, I get a warning on my Parity drive: "All existing data on this device will be OVERWRITTEN when array is Started".

Is that because disk 1 is missing now?  Will unraid rebuild parity and take the data that is missing from Disk 1 and rewrite it to Disks 2 and 3?

 

singularity-diagnostics-20191226-1708.zip

Edited by Singularity42
Link to comment

1. sd? assignments are not to be used for slot identification. They can and will change. You must use drive serial numbers to assign the drives (see the quick example after this list).

2. At the start of the procedure you linked, it explicitly says that before you start the removal you must have copied all the data from that drive slot elsewhere. Did you do that?

3. Without command line directives, Unraid assumes you want to build parity based on the remaining drives that you assigned; the data previously on the removed drive is gone.

4. If you don't have a backup of the missing drive and need to rebuild it, the easiest way is to rebuild it to a new drive. Now that you've set a new config without it, things get very complicated.
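
As a quick example for point 1 (just a sketch; column support depends on your lsblk build, and the sd? letters themselves can change on the next boot):

# list whole drives with model and serial so they can be matched to the old screenshot
lsblk -d -o NAME,SIZE,MODEL,SERIAL

# the by-id links encode model+serial and stay stable across reboots
ls -l /dev/disk/by-id/ | grep -v part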

 

I'd wait until @johnnie.black can chime in, I can't remember if it's possible to use the set invalid slot command without having a replacement drive to rebuild to.

 

Any writes to ANY of the data or parity drives will corrupt the emulated missing drive. Hopefully the damage already done by mounting one or more of them is minimal.

Link to comment

1. Yes, I know drive serial numbers are how you set drives; I was only using the Linux naming to point out the drives rather than type out the entire serial numbers :)

2. No, I wasn't able to, since the drive was offline and the data was not accessible.

3. I thought single parity "protection" would save me in the event that a single disk has gone bad and won't come up?

4. Ah, so #3 is at least possible.  

 

If I could get the drive to actually show, I'd be able to set the config to exactly how it was. I mounted the drive via the CLI with read-only params. There shouldn't have been any writes.

Thanks for the reply. I'm not creating a new config or doing anything with disks/array/etc. until I hear a bit more. Thank you so much for your help!

Link to comment
  1. As mentioned, the sd designations can and will change. They will especially change when disks are added, replaced, or removed. They can even change from one boot to the next for no apparent reason. So they are not useful for understanding how disks were previously assigned.
  2. If the disk was missing or disabled, its contents should have been available anyway from the parity calculation. This is known as "emulation". It is even possible to write to an emulated disk; parity is updated as if the disk were written, and so the written data can be recovered when the disk is rebuilt.
  3. It will, but it requires parity plus all of the other disks to calculate the contents of the missing disk. The parity disk by itself cannot recover anything (see the small illustration after this list).
  4. There is a command line method to tell Unraid to rebuild a disk other than the parity disk during New Config. You need a disk to rebuild to.
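
To make point 3 concrete: single parity is just an XOR across the same position on every data disk, so reconstruction needs parity plus every surviving disk. A toy one-byte illustration (made-up values, not an Unraid command):

# parity stores the XOR of all data disks at each position
d1=$(( 0xA5 )); d2=$(( 0x3C )); d3=$(( 0x5A ))
parity=$(( d1 ^ d2 ^ d3 ))
# recovering the missing disk1 needs parity AND both remaining data disks
rebuilt_d1=$(( parity ^ d2 ^ d3 ))
printf 'rebuilt disk1 byte: 0x%02X (original 0x%02X)\n' "$rebuilt_d1" "$d1"
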
Link to comment

Just thought I would come back to OP for a moment:

3 hours ago, Singularity42 said:

Anyhow, I got to the step of starting the array and it wouldn't start because for some reason it said my GUID was blacklisted. I debated for some time, but ultimately ended up deciding to reboot. Unraid came back up and now wants me to assign devices.

The fact that it didn't remember your disk assignments suggests that flash was corrupt. In particular, it couldn't read config/super.dat on the flash drive which contains your disk assignments.
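
You can check from the console whether that file is even present and readable (path as above; a missing or zero-length file would point the same way):

# super.dat holds the disk assignments
ls -l /boot/config/super.dat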

 

I'm not entirely sure I would trust the contents of flash at this point. You might put it in your PC and let checkdisk run on it. While you're at it, make a backup of flash. You should always have a backup of flash, stored somewhere you can get at it, so you can recreate the flash drive in order to boot Unraid. If you have a good backup of flash you can always restore the config folder from that backup onto a new install and everything will be back just as it was.
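
As a rough sketch of pulling a copy onto another machine over the network ('tower' is a placeholder for your server's name or IP; the GUI backup mentioned below does the same job):

# run from your PC, not the server; copies the config folder off the flash drive
scp -r root@tower:/boot/config ./unraid-flash-backup-$(date +%F)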

 

For future reference, you can always download a zipped copy of flash from Main - Boot Device - Flash - Flash Backup.

 

3 hours ago, Singularity42 said:

Will unraid rebuild parity and take the data that is missing from Disk 1 and rewrite it to Disks 2 and 3?

If it was simply a case of disk1 missing and no New Config, then it would have been possible to manually copy the data from the emulated (missing) disk1 to disks 2 and 3. But since you had lost your drive assignments, you are working from a New Config.
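
For reference, that manual copy off the emulated disk would have looked something like this (illustrative; only valid while disk1 is still being emulated, i.e. before any New Config):

# the emulated disk1 shows up at its normal per-disk mount point
rsync -avP /mnt/disk1/ /mnt/disk2/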

 

17 minutes ago, Singularity42 said:

This would complete the entire old config.   Possible to resolve?

36 minutes ago, trurl said:

You need a disk to rebuild to.

If you can't assign the old disk1 to slot1 for some reason, possibly an actual issue with the drive itself, then you need another disk to assign to slot1. Then the invalidslot command could be used with New Config to make Unraid rebuild disk1 from parity plus all other disks using the parity calculation, instead of rebuilding parity from all the other disks, which is what New Config will do by default.

Link to comment
6 minutes ago, trurl said:

Then the invalidslot command could be used with New Config to make Unraid rebuild disk1 from parity plus all other disks using the parity calculation, instead of rebuilding parity from all the other disks, which is what New Config will do by default.

Thank you for your detailed and well described replies.  I really appreciate it.  

The disk is for sure dead. Even took it out and threw it in the freezer for a bit (it's worked before! lol).

So it sounds like if I have another disk to put into its spot (I have 5 more coming tomorrow), then I can use the invalidslot command with New Config, and have it rebuild the old Disk 1 contents, to the new Disk 1, from Parity+Disk2+Disk3 ?

 

Thanks again. I really shot myself in the foot on this one.  

Link to comment
8 minutes ago, Singularity42 said:

So it sounds like if I have another disk to put into its spot (I have 5 more coming tomorrow), then I can use the invalidslot command with New Config, and have it rebuild the old Disk 1 contents, to the new Disk 1, from Parity+Disk2+Disk3 ?

Should work, assuming you haven't left anything important out of your narrative. It's possible it won't work perfectly for some reason, but we can deal with that, possibly some corruption to be repaired, after the rebuild.

Link to comment
8 hours ago, Singularity42 said:

Edit:  I was able to mount sdb

This was hopefully mounted read-only; XFS can usually survive even if it wasn't, but it can cause some filesystem corruption on the rebuilt disk.
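
For anyone following along, a truly read-only mount of an XFS disk looks something like this (device and mount point are placeholders); without norecovery, even an ro mount can replay the XFS log and write to the device:

mkdir -p /mnt/check
mount -t xfs -o ro,norecovery /dev/sdb1 /mnt/check   # norecovery skips log replay, so nothing gets written
# ...inspect or copy files...
umount /mnt/check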

 

When you have a new disk you can use the invalid slot command.

 

-Tools -> New Config -> Retain current configuration: All -> Apply (you can skip this step if you already did a New Config)
-Assign any missing disk(s), including the new disk1
-Important - After checking the assignments leave the browser on that page, the "Main" page.

-Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):

mdcmd set invalidslot 1 29

-Back on the GUI, and without refreshing the page, just start the array. Do not check the "parity is already valid" box (the GUI will still show that data on the parity disk(s) will be overwritten; this is normal, as it doesn't account for the invalid slot command, but parity won't actually be overwritten as long as the procedure was done correctly). Disk1 will start rebuilding. The disk should mount immediately, but if it's unmountable don't format; wait for the rebuild to finish and then run a filesystem check.
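
If you want to keep an eye on the rebuild from the console as well as the GUI (Unraid's /proc/mdstat uses its own variable-style format rather than the stock Linux md layout):

# refresh the md status every 10 seconds; the resync counters show rebuild progress
watch -n 10 cat /proc/mdstat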

 

 

Link to comment

Edit:  Nevermind, I am dumb.  Just re-read your statement:  

Quote

The disk should mount immediately, but if it's unmountable don't format; wait for the rebuild to finish and then run a filesystem check

 

During filesystem/parity check, do I want to Write corrections to parity this time?

 

Ok, followed the steps above, thank you again.  Parity rebuilt and now I have:

[Screenshots: array status after the rebuild]

 

Do I want to format to create file system?

Edit (follow up):  Why does the parity drive not show usage stats?  Is it because they are all spun down?

Edited by Singularity42
Link to comment
11 minutes ago, Singularity42 said:

Do I want to format to create file system?

No. Please post current diags; there might be some filesystem corruption because of this:

On 12/27/2019 at 7:56 AM, johnnie.black said:

This was hopefully mounted read-only; XFS can usually survive even if it wasn't, but it can cause some filesystem corruption on the rebuilt disk.

Hopefully it's fixable.
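
If the GUI is awkward to get to, the diagnostics zip can also be generated from the console (as far as I recall it lands under /boot/logs; Tools -> Diagnostics in the GUI is equivalent):

# generates the anonymized diagnostics zip for posting
diagnostics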

 

12 minutes ago, Singularity42 said:

Why does the parity drive not show usage stats?  Is it because they are all spun down?

Because parity doesn't have a filesystem.
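
You can see that for yourself with blkid (replace X and Y with the real letters): a data disk's partition reports a filesystem signature, the parity partition reports none:

blkid /dev/sdX1   # data disk: prints TYPE="xfs" (or btrfs/reiserfs)
blkid /dev/sdY1   # parity disk: prints nothing, there's no filesystem to report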

 

Link to comment
root@Singularity:~# xfs_repair -v /dev/md1 
Phase 1 - find and verify superblock...
bad primary superblock - bad CRC in superblock !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
        - block cache size set to 1506984 entries
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap ino pointer to 129
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary ino pointer to 130
Phase 2 - using internal log
        - zero log...
zero_log: head block 2760970 tail block 2760970
        - scan filesystem freespace and inode maps...
sb_icount 0, counted 49280
sb_ifree 0, counted 3833
sb_fdblocks 732208915, counted 366708807
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 0
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Note - stripe unit (0) and width (0) were copied from a backup superblock.
Please reset with mount -o sunit=<value>,swidth=<value> if necessary

        XFS_REPAIR Summary    Tue Dec 31 10:39:49 2019

Phase           Start           End             Duration
Phase 1:        12/31 10:39:26  12/31 10:39:44  18 seconds
Phase 2:        12/31 10:39:44  12/31 10:39:44
Phase 3:        12/31 10:39:44  12/31 10:39:46  2 seconds
Phase 4:        12/31 10:39:46  12/31 10:39:46
Phase 5:        12/31 10:39:46  12/31 10:39:46
Phase 6:        12/31 10:39:46  12/31 10:39:48  2 seconds
Phase 7:        12/31 10:39:48  12/31 10:39:48

Total run time: 22 seconds
done

Stopped Array. 

Started Array.

 

AANNNDDDD.  We are back in business!  

Thank you (all) greatly for your help and your time.

 

Quality product, and now I see amazing support. I've already got 3-4 people to start using Unraid; I'll keep spreading the word. Thanks again.

Link to comment
