While rebuilding parity on a new parity drive, a data drive failed. What now?



26 minutes ago, itimpi said:

FYI: I am not sure that v6 will now even run without problems with only 1GB of RAM. Even with 2GB, certain functions (such as Unraid upgrades via the GUI) are prone to fail, so 4GB is probably the practical minimum you should aim for with v6.

Thanks, I understand. I just need to format the two new drives in XFS and copy the data from the old drives to the new ones using rsync - hopefully it won't be too much.

The other option would be to stick with v5 and format the new drives in ReiserFS but then I'd be stuck with this filesystem in the new rig. Unfortunately I don't have enough drive slots in the new rig to bring over all of the old drives and perform the data move there using v6.

On 11/5/2020 at 8:05 AM, JorgeB said:

Most likely the issue, you need about 1GB per filesystem TB.

Another thought on repairing disk2 with reiserfsck: could I install a Linux virtual machine on one of my PCs and run reiserfsck --rebuild-tree --scan-whole-partition /dev/md2 from there on the old disk2?
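
To put the rule of thumb quoted above into numbers, here is a minimal shell sketch. It assumes the "about 1GB per filesystem TB" figure scales linearly and only handles whole TB values. The function names and the 2TB example size are made up for illustration.

```shell
#!/bin/sh
# Rough estimate only: ~1 GB of RAM per TB of filesystem (rule of thumb quoted above).
# fsck_ram_needed_gb and fsck_ram_check are hypothetical helpers,
# not part of reiserfsprogs or Unraid.
fsck_ram_needed_gb() {
    fs_size_tb=$1
    gb_per_tb=1                      # assumption: linear scaling
    echo $(( fs_size_tb * gb_per_tb ))
}

# Prints "ok" if the installed RAM meets the estimate, "short" otherwise.
fsck_ram_check() {
    fs_size_tb=$1
    ram_gb=$2
    if [ "$ram_gb" -ge "$(fsck_ram_needed_gb "$fs_size_tb")" ]; then
        echo ok
    else
        echo short
    fi
}

fsck_ram_check 2 1   # hypothetical 2TB disk on a 1GB machine
fsck_ram_check 2 4   # same disk on a 4GB machine
```

By this estimate, a machine with 1GB of RAM is likely to struggle with anything larger than a 1TB filesystem.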


Quick update. I tried to run the old rig with v6 but no luck. v6 would run on the new rig, but the new rig only takes 4 drives while I have 6 data drives and 1 parity drive. v5 on the old rig doesn't let me format the new disks as XFS. Catch 22!

 

So the new plan is the following:

1. Build the new rig

2. Install 3 old data drives and 1 new data drive on the new rig

3. Install a clean v6 unRAID on the old flash drive, copy only the .key from v5, and boot up the new rig with this flash drive

4. Mount the old drives as disk1-3 and the new drive as disk4 (formatted as XFS)

5. Use rsync to copy data from disk1-3 to disk4

6. Remove all drives, install the other 2 old data drives and other 1 new data drive. (Remember, the sixth data drive, 'old disk2', lost its file system and 'rebuilt disk2' also doesn't mount because the parity disk, which was used to rebuild it, had been partially overwritten)

7. Repeat steps 4 and 5 on these drives

8. Leave only the 2 new data drives on new rig. Add the new parity drive and 'old disk2'

9. Mount new data drives as disk1 and disk2 and mount parity drive. Do not mount 'old disk2'

10. Parity rebuild

At this point I will have successfully moved all of the old data drives that are still working to the new rig.

 

11. Mount 'old disk2' as disk3

12. Run the repair process kindly suggested above by @JorgeB

13. If data is missing/corrupted, try to repair 'rebuilt disk2' too

14. Copy data found in 'old disk2' and/or 'rebuilt disk2' to the new data drives using rsync (I also have a recent offline backup of irreplaceable data, which I will use at this stage)

15. Remove 'old disk2'/'rebuilt disk2'

 

Wish me luck!

 


Hello @JorgeB - I'm back! Moving old disks 1 and 3-6 to the new disks with rsync went very smoothly on the new rig running v6.8.3. I'm now trying to recover data from old disk2 (both the original one that suddenly lost its file system and the rebuilt one, which was rebuilt from a partially corrupted parity disk).

 

When I run reiserfsck --check /dev/md1 on 'original disk2' I get the message:

Failed to open the device '/dev/md1': Unknown code er3k 127

Anything else I could try on this one? Raise Data Recovery found a lot of data in it; unfortunately a lot of what it recovered is corrupted and doesn't open properly.

 

In the meantime, I'm running reiserfsck --rebuild-tree --scan-whole-partition /dev/md2 on 'rebuilt disk2'.

 

Thank you!

 

49 minutes ago, riccume said:

When I run reiserfsck --check /dev/md1 on 'original disk2' I get the message:

Failed to open the device '/dev/md1': Unknown code er3k 127

/dev/md1 is disk1 in the parity array. How do you have that "original disk2" mounted? Assigned to slot1 (probably not) in the parity array, or as an Unassigned Device, or what?
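
To make the naming concrete, here is a small sketch. It assumes that array slot N shows up as /dev/mdN, which matches the devices used in this thread on v6.8.3. Other releases may name them differently, and the helper name is hypothetical.

```shell
#!/bin/sh
# Assumption: array slot N is exposed as /dev/mdN (as seen in this thread).
# md_device_for_slot is a hypothetical helper, not an Unraid command.
md_device_for_slot() {
    case $1 in
        ''|*[!0-9]*) echo "slot must be a number" >&2; return 1 ;;
    esac
    echo "/dev/md$1"
}

md_device_for_slot 1   # the device reiserfsck was pointed at
md_device_for_slot 2   # the device for a disk assigned to slot 2
```

So a disk has to be assigned to slot 1 of the array before /dev/md1 refers to it.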


An update. reiserfsck on 'rebuilt disk2' seems to have done a decent job, though the root directory structure has disappeared and folders are now mostly bunched together in lost+found - see below.

[Screenshot: folders bunched together in lost+found]

 

I'm thinking of copying the whole content of 'rebuilt disk2' to an external hard drive, restarting the array with only the new 12TB and 14TB drives + parity, then slowly sifting through the folders in lost+found on my PC and copying them one by one to the correct location on the server. Any better ideas? Thanks.

8 hours ago, JorgeB said:

This means there's an unknown issue with the filesystem (or a bug in reiserfsck for this situation). Either way it's not easy to fix; basically you'd need to ask a reiserfs maintainer for help.

Thanks @JorgeB. Do you think I should try running reiserfsck --rebuild-tree using the old unRAID setup, i.e. v5.0.6? I've read somewhere that there have been some issues with reiserfsck --rebuild-tree in recent versions of unRAID.

10 minutes ago, riccume said:

Thanks @JorgeB. Do you think I should try running reiserfsck --rebuild-tree using the old unRAID setup, i.e. v5.0.6? I've read somewhere that there have been some issues with reiserfsck --rebuild-tree in recent versions of unRAID.

It's worth a try. There have been several issues with recent reiserfsprogs releases, though AFAIK the latest one is working correctly.

34 minutes ago, trurl said:

Have you looked at the files in there? You may find the filenames have also disappeared.

Yes, they seem to be OK. The data I hadn't backed up is mostly DVDs saved in their standard structure (VIDEO_TS, AUDIO_TS folders etc.). I've played a handful of folders and they seem to work OK.


I am the bearer of great news; all data has been retrieved and moved to the new drives, and the new rig is working like a dream!

 

So, what happened? The "Unknown code er3k 127" message I received when running reiserfsck --check on "olddisk2" was due to my bleary-eyed mistake: instead of installing olddisk2, I had installed the old parity disk!

 

I installed olddisk2 and ran reiserfsck --check, which came back with zero errors. I tried mounting olddisk2; it mounted without any problem and all the data was there! This is the same disk that showed 'unformatted' in the old rig. Mystery! I know it wasn't a problem with the old motherboard because a new drive worked on the same port. Maybe reiserfsck --check 'reactivates' the file system once it finds no errors? Don't know - but so glad it worked!!
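
For anyone scripting the check, here is a small sketch that turns reiserfsck's exit status into a readable message. The codes are the ones I recall from the reiserfsck(8) man page, so verify them against your installed version. The helper name is hypothetical and /dev/md2 is only an example device.

```shell
#!/bin/sh
# Exit codes as listed in the reiserfsck(8) man page; check your version's man page.
# explain_reiserfsck_status is a hypothetical helper name.
explain_reiserfsck_status() {
    case $1 in
        0)  echo "no errors" ;;
        1)  echo "errors corrected" ;;
        2)  echo "reboot needed" ;;
        4)  echo "fixable errors left: run --fix-fixable" ;;
        6)  echo "fatal errors left: run --rebuild-tree" ;;
        8)  echo "operational error" ;;
        16) echo "usage or syntax error" ;;
        *)  echo "unknown status $1" ;;
    esac
}

# Typical use (the device name is an example only):
#   reiserfsck --check /dev/md2; explain_reiserfsck_status $?
explain_reiserfsck_status 0
```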

 

So, lessons learned during the last two weeks in case they might be helpful to others:

 

1. Always back up your flash drive before changing your setup, e.g. replacing a drive

 

2. Install HDDs properly! (I think the issue with olddisk2 might have been caused by the PCB on the back shorting on a screw on the case)

 

3. Personally, I'd avoid using the command invalidslot to change the status of that data drive to 'invalid'. The issue with it is that you cannot double-check that it has had the desired effect - and if it hasn't, the parity disk will be overwritten once you restart the array.

I prefer the second method suggested by @JorgeB (the steps below assume disk2 is the one that needs to be labelled as invalid):

- stop array and take a note of all the current assignments

- Utils -> New Config -> Yes I want to do this -> Apply

- Back on the main page, assign all the disks as they were as well as old parity and new disk2, double check all assignments are correct

- Check both "parity is already valid" and "maintenance mode" and start the array

- Stop the array

- Unassign disk2

- Start the array

- Stop the Array

- Re-assign disk2

- Start the array to begin rebuilding

 

4. If moving from an ancient rig to a completely new rig, the first step should be to move the old drives to the new rig and do all the data transfer to the new drives there. This eliminates the risk of the old motherboard, cables, etc. and the old version of unRAID acting up during the transfer.

 

5. If the new rig doesn't have enough drive slots to recreate the setup of the old rig, so that you can't use the standard methods to move data from old smaller drives to new larger drives, you can move the data using the command rsync.
In my case I had 6 data drives + parity on the old rig and 4 SATA slots on the new rig (and didn't want to start messing about with PCI SATA expansions, molex - SATA cables, etc). I followed these steps on the new rig:

- install old data drives as disk 1, 2, and 3

- install new data drive as disk 4

- start the array with these four disks, no parity

- copy data from disk 1 to disk 4 using the command rsync -aqX /mnt/disk1/ /mnt/disk4

- check that nothing has been left behind using the command rsync -avn /mnt/disk1/ /mnt/disk4

- repeat with disk 2 and 3

- reset the setup (Tools > New Config)

- replace disk 1, 2, and 3 with the other three old data drives and disk 4 with the other new data drive, and repeat the three steps above

- reset the setup (Tools > New Config)

- remove old data drives, install the two new data drives as disk 1 and 2, install a new parity drive, and build parity

- done!
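
The copy-and-check loop from the steps above can be sketched as a small script. This is only an illustration, not something I actually ran in this form. DRY_RUN, run and copy_disk are made-up names, and by default the script only prints the rsync commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the copy-then-verify loop described above.
# DRY_RUN, run and copy_disk are illustrative names; by default nothing is executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"              # only show the command that would be run
    else
        "$@"
    fi
}

copy_disk() {
    src=/mnt/disk$1/             # trailing slash: copy the contents, not the directory itself
    dst=/mnt/disk$2
    # -a archive mode, -q quiet, -X preserve extended attributes
    run rsync -aqX "$src" "$dst" || return 1
    # -n makes rsync a dry run: any file it lists still needs copying
    run rsync -avn "$src" "$dst"
}

# Old data drives in slots 1-3, new data drive in slot 4
for old in 1 2 3; do
    copy_disk "$old" 4 || break
done
```

Run it as is to review the commands, then set DRY_RUN=0 once they look right.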

 

6. Raise Data Recovery didn't do a good job of recovering data from olddisk2. Most of the files it recovered were corrupted, even though the data on olddisk2 turned out to be perfectly fine on the new rig after reiserfsck --check found zero errors.

 

I think that's it. Thanks so much for all the help, with special mention to @JorgeB!

On 11/14/2020 at 9:05 PM, riccume said:

3. Personally, I'd avoid using the command invalidslot to change the status of that data drive to 'invalid'. The issue with it is that you cannot double-check that it has had the desired effect - and if it hasn't, the parity disk will be overwritten once you restart the array.

Glad all is good. This part is strange: it never happened before with v6, especially without an apparent reason. Maybe it's v5 related, but I'll keep it in mind and use the other option when possible.
