
Single File restore



I see. I thought unRAID and SnapRAID handled parity the same way, since in SnapRAID you can restore a single file from the parity disk.

No - that is not how unRAID parity works. I believe SnapRAID uses a snapshot method (hence its name), whereas unRAID parity is real-time. unRAID parity is filesystem-agnostic and has no idea what any particular sector on a disk is being used for. You should probably read up on the use of parity in the unRAID online documentation.
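
To make the distinction concrete, here is a minimal sketch of block-level single-parity XOR in Python (illustrative only - not unRAID's actual code):

```python
from functools import reduce

# Three data disks modeled as raw byte blocks; parity knows nothing about files.
disks = [bytes([1, 2, 3]), bytes([4, 5, 6]), bytes([7, 8, 9])]

# Single parity = byte-wise XOR across all data disks, maintained in real time.
parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*disks))

# If disk 1 fails, every byte can be recomputed from parity plus the survivors,
# but only as anonymous bits: there is no way to ask parity for "one file".
rebuilt = bytes(p ^ a ^ c for p, a, c in zip(parity, disks[0], disks[2]))
assert rebuilt == disks[1]
```

Because the calculation runs below the filesystem, recovery is per-disk, not per-file - unlike SnapRAID's file-aware snapshot approach.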

To further elaborate, unRAID parity is not a file backup. No storage solution is a backup unless it is used to store an extra copy of your files. Extra copies are the only thing that counts as a backup.

 

Make a backup plan NOW.

 

You don't have to back up everything, but you should give some thought to what you decide not to back up. Set some priorities.

 

What is important and irreplaceable?

 

What is difficult or expensive to replace?

 

What would be inconvenient to replace or difficult to live without until it could be replaced?

 

What else would you like to back up that you have enough storage for?

 

Etc.

 

  • 6 years later...

Sure, one should always have backups, since (un)RAID is not a backup. But say your onsite backups failed, and the remaining offsite backup that didn't fail is annoying to access: on a tape drive, in a different geographic location, or in the cloud with expensive restore costs.

 

Requirements for this scenario:

  • know a particular drive had a failure in only one file
  • know that parity had the correct bits
  • know the rest of that drive is okay. 

Maybe this could happen if you had a drive that was externally connected (e.g. via SAS) and was disrupted while a modification was being made to one particular file, with no other activity on that drive. A very rare scenario whose use case shouldn't officially be supported, but an interesting thought experiment:

 

In theory, wouldn't it be possible to emulate the drive that is known to have a file error and restore just that file? There is a method to rebuild a drive onto itself, but as far as I know that will rewrite every bit on that drive and thus can take quite a while.

One potentially dangerous method for attempting that result today: stop the array, mark the disk with the known error as "no drive", start the array in maintenance mode, mount the emulated drive read-only, and restore that file outside of the parity protected array, e.g. cache or an unassigned device.
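
Sketching what I mean in Python (all paths hypothetical, and this assumes the emulated disk is actually exposed as a mountable block device in maintenance mode - which, as discussed below, is questionable):

```python
import shutil
import subprocess

# HYPOTHETICAL paths: the emulated disk's device node, a scratch mountpoint,
# and a rescue area outside the parity-protected array (cache/unassigned device).
EMULATED_DEV = "/dev/md1"            # assumption, not a verified unRAID path
MOUNTPOINT = "/mnt/emulated_ro"
RESCUE_DIR = "/mnt/cache/rescue"

# Mount strictly read-only so, ideally, nothing is written back to the array.
subprocess.run(["mount", "-o", "ro", EMULATED_DEV, MOUNTPOINT], check=True)
try:
    # Copy only the one damaged file out of the emulated disk.
    shutil.copy2(f"{MOUNTPOINT}/path/to/damaged_file", RESCUE_DIR)
finally:
    subprocess.run(["umount", MOUNTPOINT], check=True)
```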

 

If starting the array in maintenance mode and mounting an emulated drive prevents any bits from being written to any drive, then you could start the array after re-assigning the disk. Though since it'll likely mark your disk as a new drive, you'd have to make a new config and say parity is valid. To be clear, this is probably a poor assumption, as I don't know all that much about filesystem details. Some bookkeeping information could very well be written somewhere before the read-only partitions are even mounted, and mounting as read-only may still write to a log somewhere, depending on the filesystem.

6 minutes ago, trurl said:

If a write to the drive failed, it is disabled. Parity is updated so the write to the emulated drive succeeds. The drive can be rebuilt from parity.

 

Yes of course, I mentioned that in my comment with its potential disadvantage:

 

Quote

There is a method to rebuild a drive onto itself, but as far as I know that will rewrite every bit on that drive and thus can take quite a while.

 

So with drives of 20+ TB it can take quite a while, yielding degraded performance during the rebuild and stressing the other drives. Though it's certainly better than resorting to an annoying-to-access backup.
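
Back-of-the-envelope (assuming ~150 MB/s sustained rebuild throughput, which varies by drive and controller):

```python
size_bytes = 20e12      # 20 TB drive
throughput = 150e6      # assumed sustained rebuild speed in bytes/s
hours = size_bytes / throughput / 3600
print(f"~{hours:.0f} hours")  # ~37 hours of degraded performance
```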

 

58 minutes ago, robobub said:

One potentially dangerous method for attempting that result today: stop the array, mark the disk with the known error as "no drive", start the array in maintenance mode, mount the emulated drive read-only, and restore that file outside of the parity protected array, e.g. cache or an unassigned device.

Nothing is mounted in maintenance mode. The rest of this assumes we are emulating files which is not the case.

 

If you are concerned about rebuilding to the same drive, you can rebuild to a spare.

11 minutes ago, trurl said:

Nothing is mounted in maintenance mode. The rest of this assumes we are emulating files which is not the case.

 

If you are concerned about rebuilding to the same drive, you can rebuild to a spare.

Nothing is mounted by default when starting in maintenance mode, but the drives and emulated drives are created and mapped to /dev/mapper/mdX. You can then mount them (read-only or even read-write, which does update parity) and get access to emulated files. Though as I mentioned, I'm not sure if some bits get written in this process that invalidate the drive that is disconnected and being emulated.

 

In this scenario, the concerns about rebuilding to the same drive are in terms of time and stress, not messing with the original disk or data.

6 hours ago, robobub said:

Requirements for this scenario:

  • know a particular drive had a failure in only one file
  • know that parity had the correct bits
  • know the rest of that drive is okay. 

Knowing all this requires additional metadata which needs to be guaranteed correct at all times, including during events like those that cause the failure to write one file. It also needs to be stored somewhere it's immune to the failures it's trying to overcome. Just taking care of the first point takes you into the territory of writing your own filesystem, which unRAID isn't.

The recovery part isn't that hard. It's knowing that the assumptions leading to that recovery hold which is not so easy.

7 hours ago, robobub said:

the drives and emulated drives are created and mapped to /dev/mapper/mdX

/dev/mapper only happens with encrypted drives.

 

7 hours ago, robobub said:

not sure if some bits get written in this process that invalidate the drive that is disconnected and being emulated.

Of course the disconnected drive is invalid. That is why it is disabled.

 

7 hours ago, robobub said:

concerns about rebuilding to the same drive are in terms of time and stress, not messing with the original disk or data.

Don't understand this at all. How can you rebuild to the same drive without messing with the same drive?

 

7 hours ago, trurl said:

If you are concerned about rebuilding to the same drive, you can rebuild to a spare.

 

5 hours ago, apandey said:

Knowing all this requires additional metadata which needs to be guaranteed correct at all times, including during events like those that cause the failure to write one file. It also needs to be stored somewhere it's immune to the failures it's trying to overcome. Just taking care of the first point takes you into the territory of writing your own filesystem, which unRAID isn't.

The recovery part isn't that hard. It's knowing that the assumptions leading to that recovery hold which is not so easy.

Well, at all times leading up to such an event, items 2 and 3 are generally true, and a single manually instantiated event can result in item 1.

 

I did say it was a rare scenario and my reply was really just a thought experiment about what is possible with unraid.

 

3 hours ago, trurl said:

/dev/mapper only happens with encrypted drives.

 

Interesting. Well, I'm quite doubtful that the parity operation and emulation happen at a different stage depending on whether the devices are encrypted, so perhaps the block devices themselves are hooked into it without mapping them. I suppose that makes sense: if you start in unencrypted maintenance mode and run a repair operation on a device, any bit changes need to be propagated to parity.

 

3 hours ago, trurl said:

Of course the disconnected drive is invalid. That is why it is disabled.

In this scenario I have manually selected the drive as "no device" and disabled it to get the emulation to run. The question is whether any writing of bits happens even though the partitions are mounted as read-only. When recovering a drive in general (outside of unRAID), it's common practice to mount read-only to avoid writing any data at all to the disk. Do you have any details on exactly what information is updated in this case, in and outside of unRAID?

 

3 hours ago, trurl said:

Don't understand this at all. How can you rebuild to the same drive without messing with the same drive?

 

To clarify "restore that file outside of the parity protected array, e.g. cache or an unassigned device": the proposal is rebuilding the one corrupted file somewhere else, then connecting back up the otherwise good drive and replacing the file alone. This avoids rebuilding the full 20 TB; only the size of the corrupted file needs to be written.

3 hours ago, robobub said:

manually selected the drive as "no device" and disabled it to get the emulation to run. The question is whether any writing of bits happens even though the partitions are mounted as read-only.

The physical disk isn't accessed when it is emulated.

 

The emulated contents can be read from the parity calculation by reading all disks, and the emulated contents can be written by updating parity as if the disk had been written. So, when the emulated disk is written by updating parity, reading the emulated disk will produce the same results as if the disk had been written. But the physical disk isn't accessed at all.
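
A toy model of that behavior, in the same single-parity spirit as above (an illustrative sketch, not unRAID's actual implementation):

```python
# Disk 1 is missing and being emulated; d0 and d2 survive.
d0 = bytearray([1, 2, 3])
d2 = bytearray([7, 8, 9])
parity = bytearray([1 ^ 4 ^ 7, 2 ^ 5 ^ 8, 3 ^ 6 ^ 9])  # disk 1 held 4, 5, 6

def read_emulated(i: int) -> int:
    # Reading the emulated disk = XOR of parity with all surviving disks.
    return parity[i] ^ d0[i] ^ d2[i]

def write_emulated(i: int, new_byte: int) -> None:
    # Writing the emulated disk touches ONLY parity: fold out the old
    # (computed) byte, fold in the new one. The physical disk is untouched.
    parity[i] ^= read_emulated(i) ^ new_byte

assert read_emulated(0) == 4      # original contents of the missing disk
write_emulated(0, 42)
assert read_emulated(0) == 42     # later reads see the "written" value
```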

 

3 hours ago, robobub said:

connecting back up the otherwise good drive and replacing the file alone

The physical disk is no longer in sync with the parity array, so it has to be rebuilt.

 

There aren't any files as far as parity is concerned. It is all just bits. Maybe you haven't seen this:

 

https://wiki.unraid.net/Manual/Overview#Parity-Protected_Array

 

2 hours ago, trurl said:

The physical disk isn't accessed when it is emulated.

 

The emulated contents can be read from the parity calculation by reading all disks, and the emulated contents can be written by updating parity as if the disk had been written. So, when the emulated disk is written by updating parity, reading the emulated disk will produce the same results as if the disk had been written. But the physical disk isn't accessed at all.

 

The physical disk is no longer in sync with the parity array, so it has to be rebuilt.

 

There aren't any files as far as parity is concerned. It is all just bits. Maybe you haven't seen this:

 

https://wiki.unraid.net/Manual/Overview#Parity-Protected_Array

 

 

You implied an answer to my question, but it wasn't explicit. I think you may have misunderstood my question, given that you added an explanation of parity/emulation, so I'll restate it. My question is whether you can prevent writes to all disks by starting in maintenance mode and mounting read-only, or whether some amount of data is written to the emulated disk/parity through any of these steps.


Your explanation hasn't changed my understanding of parity/emulation from my original comment, so if there is a specific detail I'm getting wrong, please point it out explicitly. In my original comment, I acknowledged that anything that gets written while emulating would be dangerous and cause the physical disk and the parity/emulated disk to go out of sync.


Now, if it is currently impossible to prevent writes when using maintenance mode and read-only mounting, my follow-up question is whether it would theoretically be possible (with OS-level changes to unRAID) to prevent those writes. I think it'd be advantageous to be able to inspect/modify the array without any writes to anything, as is standard practice when dealing with issues on individual disks, even ignoring this specific scenario.
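
For instance, Linux can force a device read-only at the block layer with blockdev; whether unRAID's md/parity layer would tolerate that is exactly the open question (device path hypothetical):

```python
import subprocess

# Mark a device read-only in the kernel's block layer (HYPOTHETICAL path).
# Any write attempt - including filesystem journal replay - should then fail
# rather than silently modify the disk.
subprocess.run(["blockdev", "--setro", "/dev/sdX"], check=True)

# Verify: prints 1 once the device is read-only.
subprocess.run(["blockdev", "--getro", "/dev/sdX"], check=True)
```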

10 hours ago, robobub said:

Well, at all times leading up to such an event, items 2 and 3 are generally true, and a single manually instantiated event can result in item 1.

 

As I said, what is true in our minds has to be recorded and guaranteed to be true for a system to be able to act on it. "Generally true" is a dangerous assumption to base any data-mutation decisions on. Whatever credibility a storage system has can quickly evaporate if it tries to do something which can compromise data integrity. Is there a specific implementation proposal here which can create these guarantees before the selective rebuild proceeds? Even if initiated manually, what data would such an action be decided on, so that the user doesn't end up with data-loss surprises?

 

I agree with trurl here; this is a dark rabbit hole, so we can only engage if there is a specific implementation proposal that actually sounds viable. And that too is probably a topic for a different forum - perhaps Feature Requests.


Well, I guess thank you for indulging me as much as you two have so far. I'll just re-iterate from my original comment:
 

Quote

A very rare scenario whose use case shouldn't officially be supported, but an interesting thought experiment

 

The discussion I attempted to provoke was for an advanced use case only. I thought it was clear from all the bold warnings in my original comment that it is not something I would suggest implementing without knowing more details of how and when information gets written, which apparently none of us here knows.

 

3 hours ago, trurl said:

I don't think I want to go any further down this rabbit hole with you. Whatever you have in mind is going to end up being complicated to implement, complicated to use, difficult to explain, difficult to decide when to use, and much more error prone than simply rebuilding the disk.

 

2 hours ago, apandey said:

I agree with trurl here; this is a dark rabbit hole, so we can only engage if there is a specific implementation proposal that actually sounds viable. And that too is probably a topic for a different forum - perhaps Feature Requests.

 

31 minutes ago, robobub said:

not something I would suggest implementing without knowing more details of how and when information gets written, which apparently none of us here knows

We do know how the data is written: the parity calculation is blind to data, a simple, predictable calculation on the bits being written. But there is nothing more to it which makes it filesystem-aware. That knowledge is in fact the basis of my comments so far.

 

I understand the thought-experiment part; I just don't think it's viable without fundamentally rewriting what unRAID does and going into the territory of coding a filesystem itself. I'm not challenging the thought experiment, just saying I feel I'm at a dead end in taking it any further.


  

12 hours ago, apandey said:

We do know how the data is written: the parity calculation is blind to data, a simple, predictable calculation on the bits being written. But there is nothing more to it which makes it filesystem-aware. That knowledge is in fact the basis of my comments so far.

 

5 hours ago, trurl said:

I think we understand these details well enough. Parity is pretty simple. And it all happens below the filesystem level, so nothing about files can be considered.

 

OK, then can you answer my original question? What data, if any, is written just by starting the array in maintenance mode, by itself? This is before any filesystems are interacted with.

I did find an answer to the second half, dealing with the filesystem (specifically btrfs): if it is mounted with the options "ro,norecovery", then indeed absolutely no data is written to the disk, and this step would not create a divergence between the physical disk and the parity/emulated disk. So the remaining question is about the preceding steps.
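
In other words, something along these lines (device and mountpoint are hypothetical placeholders; per the note above, "ro,norecovery" is what keeps the mount itself from writing anything):

```python
import subprocess

# Mount btrfs read-only with log replay disabled, so the mount operation
# itself writes nothing to the disk (paths are assumptions for illustration).
subprocess.run(
    ["mount", "-t", "btrfs", "-o", "ro,norecovery", "/dev/md1", "/mnt/check"],
    check=True,
)
```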

