HELP! Harvey bit me


MacDaddy


Once you remove the spare disk, and start the array, you should be able to see the content of the failed disk being emulated. If not, do not proceed with the rebuild. Something is wrong. Do no array writes! Fix the problem and start again. If it is emulated and there are critical files on that disk, I'd suggest copying the data off to a workstation while everything is working.

 

You could actually copy the entire disk, file by file, over the network or to an unassigned device in this manner and never need to rebuild. The advantage is that as each file is copied, it is safe. When doing a rebuild, the entire disk has to be rebuilt in order to recover all your files. If it fails halfway through, you have a jumbled mess and no idea what is valid and what is not. I actually like this method better, especially if the disks in the array are "fragile" (SMART errors, old, been through a flood :) , etc.)
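To make the "each file is safe once copied" idea concrete, here is a minimal Python sketch of a file-by-file copy that records failures and keeps going rather than aborting. The function name is made up for illustration, and it is only a stand-in for whatever copy tool you actually use (rsync, Midnight Commander, a network copy).

```python
import os
import shutil


def copy_tree_file_by_file(src_root, dst_root):
    """Copy every file under src_root to dst_root, one file at a time.

    Returns (copied, failed): lists of paths relative to src_root.
    An error on one file is recorded and the walk continues.
    """
    copied, failed = [], []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        os.makedirs(os.path.normpath(os.path.join(dst_root, rel_dir)), exist_ok=True)
        for name in filenames:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            src = os.path.join(src_root, rel_path)
            dst = os.path.join(dst_root, rel_path)
            try:
                shutil.copy2(src, dst)  # copies data plus timestamps/permissions
                copied.append(rel_path)
            except OSError as exc:  # includes I/O errors from a failing disk
                failed.append(rel_path)
                print(f"FAILED {rel_path}: {exc}")
    return copied, failed
```

Calling `copy_tree_file_by_file(source, target)` with your own source and target directories returns the list of files that made it and the list that did not, so the failures can be retried later without redoing the whole disk.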

 

You are doing everything right including asking the right questions here. Too often people try to recover in the heat of a stressful situation and make things worse. Congrats for keeping a cool head. You're not home free yet though. Good luck!!

Link to comment

Thanks.  That's good advice.  So far I've had 7 individual files that didn't copy cleanly.  I've been able to retry and get three to copy.  I've been able to reboot and get two more to copy.  So far only one has permanent I/O errors.

 

I like the idea of pulling the files off individually from an emulated drive.  I need to do this "locally" if possible with an unassigned device.  I'm assuming that there is a mount point in Linux that represents the emulated disk (is it the /mnt/user)?  Or does a /mnt/diskX show up for emulated disk?

Link to comment

Only disks in the parity array can be emulated. Emulation consists of calculating the missing disk's data from parity and all the other array disks. It is not possible to emulate an unassigned device.
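For anyone wondering what "calculating from parity" means in practice, here is a toy Python sketch. It assumes single parity is a plain bitwise XOR across the data disks, and it uses four-byte "disks" that are purely illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy "disks": each one is just four bytes here.
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
disk3 = bytes([0x01, 0x02, 0x03, 0x04])  # the disk that will go missing

# Parity is the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# Emulation: XOR parity with every surviving data disk to get disk3 back.
emulated_disk3 = xor_blocks([parity, disk1, disk2])
print(emulated_disk3 == disk3)  # True
```

The same arithmetic shows why a disk outside the array cannot be emulated: it never contributed to parity, so there is nothing to calculate it from.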

 

Read johnnie.black's post again and let us know if you want to pursue this and need further advice about it.

Link to comment

Apologies.  I was too quick to reply and probably didn't spell things out well enough to get good advice.  Here is what I intend to try:

1) I have five existing flood damaged disks.  The parity disk appears to be operational.  Three of the data disks appear to operate well enough for me to rsync contents to a fresh disk.  One data disk had I/O errors and is undergoing a second pass at the drying process.

2) Once I get an additional controller card installed, I will have sufficient ports to attach the four operational flood damaged disks.

3) I'll use johnnie.black's advice on bringing an array online with flood damaged disks in a new order (understanding the caution required).

4) Once the array of flood damaged disks (one parity, three data, one emulated) is active, I will attempt to copy the data to a new, precleared, unassigned disk.

5) I would like to perform the copy locally (as opposed to network copy).  I'm fuzzy about where the emulated drive will appear.

--let's say that disk3 is the emulated disk.  Let's also say that the fresh drive is not in the array, but mounts as disk6.  Can I bring up a terminal window on the local box and use my prior rsync command with /mnt/disk3/ as source and /mnt/disk6/ as target?

Link to comment

disk3 is by definition an array disk, and disk6 is also by definition an array disk. The only way to get a disk to mount as disk6 is to make it an array disk, and adding a new disk to the array while you are trying to emulate another disk isn't possible.

 

An unassigned device can be mounted using the Unassigned Devices plugin, and it would appear as /mnt/disks/whatever-you-mount-it-as. And the emulated disk3 would appear as /mnt/disk3. So you could do what you want, but the fresh drive would have an unassigned mount and path.
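As a sketch of what that local copy could look like, here is a small Python helper that builds an rsync command for those two paths. The mount name `recovery` is hypothetical (use whatever the Unassigned Devices plugin shows), and the `-av` flags are just a common choice, not necessarily the rsync command used earlier in the thread.

```python
import shlex


def build_rsync_command(source_dir, target_dir, dry_run=True):
    """Return an rsync argument list that copies the contents of source_dir."""
    args = ["rsync", "-av"]  # -a: archive mode, -v: verbose
    if dry_run:
        args.append("--dry-run")  # list what would be copied, copy nothing
    # A trailing slash on the source means "the contents of this directory".
    args += [source_dir.rstrip("/") + "/", target_dir.rstrip("/") + "/"]
    return args


# Emulated array disk as source, unassigned mount (hypothetical name) as target.
cmd = build_rsync_command("/mnt/disk3", "/mnt/disks/recovery")
print(shlex.join(cmd))
```

Starting with a dry run costs a little time but confirms both paths are what you think they are before anything is written; pass `dry_run=False` to get the real command.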

Link to comment

In order to emulate a disk, you need ALL BUT ONE of your original disks functional (including parity). If some have I/O errors, you MAY have some partial success. If you only have a subset of your disks and are trying to emulate one - you are screwed. It will not work. Single parity will allow recovery of a single disk. Maybe those looking at possible hurricane flooding should quickly add a second parity to up the chances of a successful recovery in case of flood. That or move your disks to high ground, or to a flood proof safe deposit box. (Dual parity would give the ability to recover if two disks fail.)

 

But if you have all but one disk available, the key to engaging the emulation is putting Humpty Dumpty (your original array configuration) back together again, since your USB key, and the all important super.dat file containing your current array config, are gone. Since one of your disks is drying out you can't use it. So you need a surrogate to fill in for it. Should be the same size as the failed disk. What you will do is define the array with all the original disks and the surrogate in place, tell it to trust that parity is valid, and then immediately stop the array and unassign the surrogate.

 

When you start the array with the surrogate in place, unRAID will register a disk of the correct size in that disk slot. That's all you're looking for. None of the data on the surrogate even gets touched (if in maintenance mode). It is not involved in any write operation and therefore not involved in any parity calculation. So no harm. After you stop the array, unassign the disk, and restart the array, unRAID will emulate the disk. And it will emulate the original disk, not the surrogate. Because parity was built with the original disk in place, and not the surrogate.

 

The emulation is done at the device level, so you'll get a device node for the emulated disk (on unRAID, array devices show up as /dev/mdX rather than /dev/sdX). Maintenance mode suppresses disk mounting and user shares. You'll need to use a non-maintenance mode (safe mode or regular mode) to make the method I laid out work - that or manually mount the disks in maintenance mode. Safe mode, if I remember right, disables all plugins, dockers, and VMs. If you are starting from scratch with a new USB key, you shouldn't have any of those things so safe mode and regular mode should be the same.

 

When you start the array in safe mode or regular mode, unRAID will attempt to mount the emulated disk. Remember, emulation means that unRAID, behind the scenes, is reading ALL the corresponding sectors on all the other disks (including parity), to calculate what would have been on the original disk. You should recognize the degree of precision that is required here. It must be perfect. And it will be perfect, if all but one of the disks are 100% functional. 
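To illustrate why it has to be perfect, here is a toy Python sketch, again assuming XOR-style single parity and using made-up four-byte "disks". One misread byte on a surviving disk produces a wrong byte at the same offset of the emulated disk.

```python
def xor_bytes(*blocks):
    """XOR any number of equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


disk1 = b"ABCD"
disk2 = b"WXYZ"
missing = b"data"  # the disk being emulated
parity = xor_bytes(disk1, disk2, missing)

# All surviving disks read correctly: emulation is exact.
good = xor_bytes(parity, disk1, disk2)

# disk2 returns one wrong byte (last position): emulation is wrong there too.
disk2_misread = b"WXYz"
bad = xor_bytes(parity, disk1, disk2_misread)
bad_offsets = [i for i in range(len(missing)) if bad[i] != missing[i]]

print(good == missing)  # True
print(bad_offsets)      # [3]
```

On a real disk that one wrong byte could land in a file or in filesystem metadata, which is why a flaky surviving disk can make the emulated disk look unformatted or full of garbage.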

 

But if the disk is showing unformatted in the Web GUI, or when you do a "ls -la /mnt/disk3", you see gobbledygook files, one or more of the disks is missing or not being read properly. You can try to fix it by double checking you have all the correct disks installed. If the disk is ReiserFS (RFS) formatted, there is a long shot possibility that you might be able to recover some data off of the disk despite some corruption. But I would not try that without more input from the forum. But if you can't fix it, emulation will not be possible and chances are recovery will be dependent on drying the real disk more thoroughly.

 

Hope this is clear.

Link to comment

Thanks guys - I truly appreciate all the advice and recipes.  It is helping me move forward.  

Here is where I am at the moment:

- I found a new self-service kiosk at my local HEB called DryBox.  The basic idea is to pull a vacuum so that heat can drive out moisture at a lower temperature.  It is targeted at cell phones, but a drive fits.  I ran the failed drive through this process.

- I was able to mount the failed drive and get some files off before it degraded.  I let it cool down, rebooted and got another set of files off.  Repeat. Got a lesser set of files. Repeat. Fatal I/O error.

- Followed your instructions to bring up the array to the point the failed disk was emulated.  Mounted the recovery drive and was able to rsync files from emulated drive to fill in the blanks of missing files.  I think I've got a good  backup of the failed drive.
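For anyone repeating the "fill in the blanks" step, here is a minimal Python sketch that lists which files exist on the emulated disk but not yet in the recovery copy. The function names are made up for illustration, and it compares by relative path only (it does not check sizes or contents).

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def missing_from_backup(emulated_root, backup_root):
    """Files present on the emulated disk that the backup does not have yet."""
    return sorted(relative_files(emulated_root) - relative_files(backup_root))
```

rsync can do much the same thing in one pass with its `--ignore-existing` option, which skips files already present on the target; the listing above is mainly useful for seeing what is still outstanding before copying.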

- I bought another 4TB drive and ran it through preclear.  I'm in progress of letting array rebuild this drive.

 

Thinking of second parity.

Thinking of helium drives.

Certainly moving the hell away from this area of Houston.

Link to comment

He doesn't have the USB though with the original drive mappings using the identifier.

 

Unfortunately, with single parity you can afford to be missing only one of the following:

1) USB drive for unraid

2) Parity Disk

3) Array Disk(s)

 

I believe you are going to have to try to rebuild the lost disk using the "swamp" drives and "swamp" parity, then that disk could be directly implemented in your new installation.

 

Better than starting from scratch at least!

Link to comment

Not entirely clear what you are trying to say. I think maybe he is already past the point you were trying to make, whatever it was.

 

The original unRAID flash which contained the disk assignments is, in fact, not critical. As long as you are not missing more disks than you have parity, you can get unRAID to rebuild. You just have to fiddle with it a bit. Those steps were already outlined by johnnie.black in this thread.

Link to comment

I would say if you had dual parity, and needed to rebuild 2 disks, the need for the flash would be considerably higher. You'd need to have each disk in its correct slot. If you have a printout from the WebGUI, that would be sufficient, but to reconstruct it with no documentation, unless the user has a very good memory, with more than 3 or 4 disks, would be near impossible. Hot swap bays would help a lot if you had to try a couple hundred combinations!

Link to comment
1 hour ago, trurl said:

Yes, dual parity would actually complicate things in a scenario where you didn't know which disk was parity and which was parity2, even if you only needed to rebuild one disk.

 

Yes, but if you knew which were the data disks and which were the parities, there would only be two things you'd need to try in order to rebuild one disk. But if you needed to rebuild two disks, and you weren't sure which data disk went in which slot, you'd have X!*2 combinations to try, where X is the number of data drives. So a 5 data disk array would require 5*4*3*2*1*2=240 combinations. 6 drives, 1440! How about 20?! Parity2 fans better back up that USB super.dat file, or at least have a printout of your drive configuration, to benefit from your protection in a challenging scenario.
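To put numbers on that, here is a quick Python sketch of the X!*2 formula from this post (X data drives in unknown slots, times the two possible orderings of the parity disks). The function name is made up for illustration.

```python
from math import factorial


def slot_combinations(data_disks):
    """Data-disk orderings (X!) times the two ways to assign parity/parity2."""
    return factorial(data_disks) * 2


for n in (5, 6, 20):
    print(n, slot_combinations(n))
# 5 240
# 6 1440
# 20 4865804016353280000
```

At 20 data drives the count is in the quintillions, so trial and error stops being an option long before that.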

Link to comment

First off, a big shout out to the entire unRaid community.  It's funny how much more stuff beyond my media library was stored away.  Being able to recover was a life saver.  I sincerely appreciate all those who took the time to give me some actionable suggestions and thoughtful approaches.

 

One thing I did when loading the drives was to put an Avery sticker on each with a "row column" indicator (i.e. A1 was in the uppermost left slot, B1 was the uppermost right slot).  I knew A1 was parity and had high (but not absolute) confidence A2 was disk1, A3 was disk2, etc.  I used the plugin to document the disk layout, but the copy I saved on my desktop machine was also destroyed by flood.  Using the instructions provided I was able to get the "swamp" drives to reconstruct the dead drive.

 

 

While it's fresh in my mind, here are a few random thoughts that may be useful to others pondering disaster recovery.

1) Make a backup of your machine.  Parity helps in fault tolerance, but it's not the same as a full backup.  I used my old unraid server to back up my production server.  I also had a mutual protection agreement with a buddy to swap hard drives with selected shares backed up.  Where I went wrong was in mounting my old unraid server in an unused closet.  While it was higher than my production server, it wasn't high enough to escape destruction.  I still think mutual protection is the right way to go, just choose a partner that is not in the same flood plain.

2) Label the disks as they go in.  Be deliberate about how you assign them so you can document the disk assignment on new drives.

3) Email yourself a copy of the output from the drive layout plugin.  It will lower the stress level.

4) When you do your monthly parity check, email a copy of the backed up USB drive to yourself.

5) If you have a friend in the medical equipment field, the CFC bath is a great way to go in cleanup.  Full disclosure - I'm not sure this is a "true" CFC bath like was used with circuit boards back in the day (EPA outlawed).  But I think it is a close approximation that does wonders for electronics.

6) The rice baggie approach to drying out may not be optimal, but rice is ubiquitous and will buy you time while you deal with the 1001 simultaneous crises that happen in a disaster.

7) In Texas there are DryBox kiosks that provide self serve drying for electronics.  While it is targeted to phones, a hard drive will fit.  I wish I had tried the DryBox on the failing drive at the very beginning.  It helped get it working well enough to get many files off, but I imagine I contributed to its demise by trying to spin it up directly out of the rice bag.

8) If you can plan your disasters and select for flood then helium drives would be a good preventative measure :-).  However with my luck the next disaster will be an earthquake.
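For items 3 and 4, here is a minimal Python sketch of building an email with the flash backup attached, using only the standard library. The addresses, file name, and the idea of zipping the flash drive first are all assumptions for illustration; actually sending it needs your own mail server details.

```python
from email.message import EmailMessage


def build_backup_email(sender, recipient, attachment_name, attachment_bytes):
    """Build (but do not send) an email carrying a flash-drive backup."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "unRAID flash backup and disk layout"
    msg.set_content("Attached: backup of the unRAID USB flash drive.")
    msg.add_attachment(
        attachment_bytes,
        maintype="application",
        subtype="zip",
        filename=attachment_name,
    )
    return msg


# Hypothetical addresses and file name, with placeholder bytes for the archive.
message = build_backup_email(
    "me@example.com", "me@example.com", "flash-backup.zip", b"placeholder"
)
# To send it, something like the following with your own SMTP host and login:
#   import smtplib
#   with smtplib.SMTP("smtp.example.com", 587) as server:
#       server.starttls()
#       server.login("user", "password")
#       server.send_message(message)
```

The point is simply that the copy ends up somewhere that is not in the same building as the server.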

 

It's been said before but bears repeating - don't panic.  Take a deep breath and reach out to the fantastic people on this forum.  Take your time and be deliberate.  

Link to comment
