
I didn't see that coming....deleting (extraneous) folder on cache deletes the configured share on array


aglyons


So I was re-working my cache drive situation. I bought two SSD drives to add a secondary cache in RAID0 for larger data transfers. Previously, I had 2 x 250GB ssd drives and set those up as a RAID1 array for smaller, more important cache content.

 

All of these shares were set to cache: prefer.

 

During the alignment process of the RAID1 array, I checked the files on the drive. I noticed that one of the shares had a folder sitting on the RAID1 cache that I had already re-assigned to the larger RAID0 cache earlier. So I didn't give a second thought when I deleted that folder on the RAID1 cache. 

 

Well, I found out the hard way that deleting that folder on the RAID1 cache told the system to ALSO delete the entire share on the main array. The entire share was blown away!

 

I searched for a solution to restore deleted files/folders. The first step was to determine which drive had held the original share. That was impossible because the share no longer existed, so checking each of the drives showed nothing.

 

While I would like to restore that data, it wasn't the end of the world. I can chalk it up to a learning experience. But if there is a solution I am all ears.

 

While this might be a very unique situation, does this sound like it could be unintended behavior? I mean, the cache is a working copy of the original source content, especially so when set to 'prefer'. AFAIK the only time that content moves is when the array is stopped. The mover then moves the cache copies back to the array. In this situation, deleting the folder on the cache IMMEDIATELY deleted the share on the array.

 

@limetech Any suggestions/comments?

Edited by aglyons
Link to comment
6 minutes ago, aglyons said:

While this might be a very unique situation, does this sound like it could be unintended behavior? I mean, the cache is a working copy of the original source content, especially so when set to 'prefer'. AFAIK the only time that content moves is when the array is stopped. The mover then moves the cache copies back to the array. In this situation, deleting the folder on the cache IMMEDIATELY deleted the share on the array.

Sounds as if you have misunderstood how Unraid handles User Shares. The cache is NOT a copy of files on the array. Files in Unraid exist either on a pool OR on the main array - not in both locations. User Shares are then a unified view of files covering both the array and any pools.
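To make the "unified view" idea concrete, here is a minimal sketch in Python. It is not Unraid's actual implementation; the function name `user_share_view`, the share name `media`, and the temporary directories standing in for a pool and an array disk are all made up for illustration. It shows that each file lives on exactly one device, and that removing it from that device removes it from the merged view, because there is no second copy.

```python
import os
import tempfile


def user_share_view(branches, share):
    """Map each file in `share` to the single branch (pool or disk) holding it.

    `branches` is an ordered list of mount points. This is a hypothetical
    model of a merged view, not Unraid's real code.
    """
    view = {}
    for branch in branches:
        root = os.path.join(branch, share)
        # os.walk silently yields nothing if this branch has no such folder.
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                view.setdefault(rel, branch)  # first branch listed wins
    return view


def demo():
    with tempfile.TemporaryDirectory() as base:
        cache = os.path.join(base, "cache")  # stand-in for a pool
        disk1 = os.path.join(base, "disk1")  # stand-in for an array disk
        for branch, name in ((cache, "new.txt"), (disk1, "old.txt")):
            os.makedirs(os.path.join(branch, "media"))
            with open(os.path.join(branch, "media", name), "w") as fh:
                fh.write(name)

        before = user_share_view([cache, disk1], "media")
        # Delete the file on the device that actually holds it.
        os.remove(os.path.join(disk1, "media", "old.txt"))
        after = user_share_view([cache, disk1], "media")
        return sorted(before), sorted(after)


if __name__ == "__main__":
    print(demo())
```

In this model the share shows both files before the delete and only the pool's file afterwards; nothing is "restored" from anywhere else.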

Link to comment
32 minutes ago, itimpi said:

Sounds as if you have misunderstood how Unraid handles User Shares.

 

I guess I have misunderstood them.

 

That being said, I would appreciate it if those who run the show would consider this suggestion.

 

Consider retaining the array data in its place. When the array is shut down, copy from the cache back to the array. At the very least, that would help to avert complete data loss in the event of a hardware problem. If the cache is blown up (for some reason), the cache 'copy' would be toast, but at least there would be a copy of the data on the array, albeit maybe not the most recent. Yes, that could slow down the array shutdown process, since updated files in the cache would have to be compiled and moved back to the array.

 

To me, this is a win-win trade off. There is speed in the cache and security in the array.

 

fingers crossed.

Link to comment

Your observation is still making an assumption that the cache has a separate copy of data. That is not how cache is intended to operate in the first place. 

Cache is not a backup. If you need to protect against data loss due to fat finger operations, you need a backup.

Also, array stop isn't specifically a data management operation. I rarely stop my array, so what you suggest would rarely happen for me.

To top it all, I would not expect a prefer share to move out of cache to the array anyway, as long as the cache has free space. That is what prefer means.

Link to comment
34 minutes ago, apandey said:

Your observation is still making an assumption

On the contrary, my original observation was an assumption and my error. My suggestion is to fundamentally change the cache function from a move operation to a clone operation.

 

Quote

Cache is not a backup

 

I'm not suggesting it is or should be a backup. But what harm could come from leaving the data on the array and using the cache data as the working version?

 

I know the general consensus from IT professionals is, as you put it, "if you need to protect against data loss due to fat finger operations, you need a backup." That's not really how the world works. Some enterprising IT professionals accept this and provide some solutions. I can't count how many times the recycle bin on my Synology has saved my butt. It's a simple thing, but effective.

 

The Unraid platform has always been referred to by others as non-professional and for hobbyist use. So maybe providing non-professional users with some level of 'oops control' would be a nice addition.

 

I understand that this is most likely a significant change and not a small feat. But I do hope that this is approached with an open mind and some consideration in the future. 

Edited by aglyons
Link to comment
1 hour ago, aglyons said:

When the array is shut down, copy from the cache back to the array

No drives are mounted when the array is stopped so it isn't possible to access any files at all.

45 minutes ago, aglyons said:

this is most likely a significant change

Your proposal goes against the way everybody else is currently using the system.

 

If you want a backup function there are plugins and dockers for that, or you can schedule an rsync script with the User Scripts plugin.
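As a rough idea of what such an rsync job boils down to, here is a small Python sketch that builds and runs the command. The helper names and both paths are hypothetical placeholders, not recommendations; only the rsync options `-a` (archive mode) and `--dry-run` are standard. It defaults to a dry run so nothing is copied until you have checked what it would do, and how the script gets scheduled is left out.

```python
import subprocess


def build_rsync_command(source, destination, dry_run=True):
    """Build an rsync command that copies the contents of `source` into `destination`."""
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps times, permissions)
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied, change nothing
    # A trailing slash on the source means "the contents of this folder".
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def run_backup(source, destination, dry_run=True):
    """Run the command; raises CalledProcessError if rsync reports a failure."""
    return subprocess.run(build_rsync_command(source, destination, dry_run), check=True)


if __name__ == "__main__":
    # Hypothetical paths: a folder on a pool and a backup folder on the array.
    print(build_rsync_command("/mnt/cache/appdata", "/mnt/user/backups/appdata"))
```

Note that a plain copy like this is a mirror of the latest state, so a file deleted or damaged before the next run may still be in the backup, but older versions are not kept.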

Link to comment
1 hour ago, aglyons said:

My suggestion is to fundamentally change the cache function from a move operation to a clone operation

Well fine, but then that is a breaking change and not something I would want, given I don't expect to have copies of data that can be managed independently by an unsuspecting user.

 

1 hour ago, aglyons said:

recycle bin on my Synology

That could be a solution, and fairly independent of reinventing the cache. Seems there is a recycle bin plugin for shares, though I have no experience using it

 

1 hour ago, aglyons said:

not suggesting it is or should be a backup. But what harm could come from leaving the data on the array and use the cache data as the working version?

It would then have to deal with data synchronization issues: all the issues that a backup system has to deal with, while avoiding putting in place a backup system that would have solved the underlying issue properly. What happens when someone unsuspectingly updates the array copy of data because they worked at drive level rather than share level? Will their changes be overwritten at some indeterminate point in time later? Fat fingers come in many forms.

 

I use cache only share with a backup to array when I explicitly need that setup (like appdata backup plugin provides). That works without making cache pools any more complex or ambiguous. Would that achieve what you want?

 

1 hour ago, aglyons said:

most likely a significant change

Indeed, going from a mergerfs-type implementation to a clone-backup model is fundamentally a different system, so I am just making sure we are not dealing with an XY problem.

 

BTW, it seems the plan on Unraid is to drop the confusing cache terminology and just work with pools as first-class objects that can be set up with mover relationships against shares. Maybe that will avoid some of the confusion that it may be a cached copy.

Edited by apandey
Quoted recycle bin plugin / pool plans
Link to comment
7 minutes ago, apandey said:

I use cache only share with a backup to array when I explicitly need that setup (like appdata backup plugin provides). That works without making cache pools any more complex or ambiguous. Would that achieve what you want?

 

In the meantime, yes. I think this would work. Thanks for the suggestion. At the very least, the current data in the cache pool would be somewhat protected from my fat fingers. I haven't found anything like that other than the CA Autobackup for appdata.

 

I should point out though, using your analogy of 'fat fingers'. Mother nature graced me with fat fingers. Developers are the ones that put the buttons very close together. Should we blame mother nature or could the developers space the buttons out a bit? 

Edited by aglyons
Link to comment
8 minutes ago, aglyons said:

should point out though, using your analogy of 'fat fingers'. Mother nature graced me with fat fingers. Developers are the ones that put the buttons very close together. Should we blame mother nature or could the developers space the buttons out a bit? 

If you go into the history of the term, you will discover that even if buttons are spaced out, the human phenomenon it describes will still exist and we would find a new term to describe it, perhaps bumpy elbow or sleepy head. Mother nature gave us the ability to make the most unexpected mistakes, and we just have to spend enormous time teaching computers how to work around that.

 

I am now more convinced that this is indeed an XY problem. While a recycle bin can be a proper solution to the issue at hand that will reliably work, relying on the cache would work by luck at best. What happens when you delete a file that hasn't been copied over to the array yet? Oops again?

Edited by apandey
Link to comment
49 minutes ago, apandey said:

I am now more convinced that this is indeed an xy problem

 

Don't be so fast to judge that. If volatility of the data on the cache was not a concern then we wouldn't have the CA Autobackup plugin for the appdata folder. Clearly it was looked at as a potential problem of losing data. In my searching I have seen other posts that ring similar to what I am suggesting.

 

The last post in that thread read as follows:

 

Quote

 

"If one chooses prefer cache then that share is never backed up by native unraid unless you choose to have another SSD as a pool backup.

However most people will run a script that backs the cache only folders to the array.

 

Really this should be built into unraid."

 

 

So others have created a script to achieve this result. That tells me that this is a feature that should be built in. And before anyone jumps on his choice of words, yes Unraid is not backup.

 

I posed the question of what harm would come from leaving the data on the array rather than outright moving it. So far I haven't seen a strong negative effect.

 

Quote

What happens when someone unsuspectingly updates the array copy of data because they worked at drive level rather than share level

 

That's the spirit! Now we're talking about scenarios and how to handle them. I would think that it would be possible to bury the array data so that it is almost impossible not to know what you are touching. Or, even simpler, show a warning message telling the user that they are touching non-cache data of a cached share. Another option is that clicking in the file browser brings you not to the array contents but to the cache contents. That is actually how it works right now. That would effectively make it impossible to touch the array files. None of these solutions would be impossible to implement.

 

Can we at least agree that having data sit on the cache drives at all times that never hit the array ever is asking for data loss at some point, and that this should be given some thought and attention?

 

Everything I have written are ideas that I have just quickly put out for discussion. I am not the creator, I'm not a developer. I have no power to say that this is how it is going to work going forward. I'm a customer that has run into a gotcha that had me pay a price. My intention is to help other new users of Unraid not experience the same thing.

 

And with that, I rest my case.

Edited by aglyons
Link to comment
3 minutes ago, aglyons said:

what harm would come from leaving the data on the array rather than outright moving it?

Surprises to users that know how things are supposed to work, and surprises to users that don't know.

 

If files exist in multiple places, user shares only expose one of the copies. Cache has precedence, then disk1, disk2, ...

 

If you overwrite a file, only one copy would be overwritten, the others would be out-of-date. If cache became corrupt and unmountable, an out-of-date file would suddenly appear in user shares from one of the other disks.
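The precedence and stale-copy scenario described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Unraid code: the function `resolve`, the dictionaries standing in for devices, and the file name `doc.txt` are all invented for the example.

```python
def resolve(path, branches):
    """Return (branch_name, content) from the first branch that holds `path`.

    `branches` is an ordered list such as [("cache", {...}), ("disk1", {...})],
    mirroring the "cache first, then disk1, disk2, ..." precedence.
    """
    for name, files in branches:
        if path in files:
            return name, files[path]
    return None


def demo():
    # Hypothetical situation: the same file exists on both cache and disk1.
    cache = {"doc.txt": "v1"}
    disk1 = {"doc.txt": "v1"}
    branches = [("cache", cache), ("disk1", disk1)]

    # An overwrite through the merged view only touches the copy that wins.
    cache["doc.txt"] = "v2"
    visible_now = resolve("doc.txt", branches)

    # The cache becomes unmountable: the out-of-date copy surfaces instead.
    visible_after_loss = resolve("doc.txt", branches[1:])
    return visible_now, visible_after_loss


if __name__ == "__main__":
    print(demo())
```

In this model the share shows the edited `v2` from the cache until the cache disappears, at which point the older `v1` on disk1 quietly takes its place.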

 

I won't argue about whether or not some backup feature might be useful, but I will argue against changing how cache and user shares work. Too many users already depend on things working the way they do, whether they are fully aware of it or not.

Link to comment

All of the examples you are giving are basically backup of some form. So you are arguing for an inbuilt backup system in Unraid? Perhaps that can be a feature request. I don't see why a backup solution necessarily has to come from changing the meaning of cache pools.

 

I gave you the negative effect of multiple copies of data, which is that it leads to data synchronization issues. I see it as an XY problem because you are trying to solve backup via a mechanism that isn't built with that focus.

 

Cache pools are not inherently unprotected. They are just not protected by unraid parity. They can still be protected by whatever filesystem redundancy they are built on. I currently run btrfs raid1 pools, which are protected against drive failure. They can still have filesystem corruption, but parity doesn't protect from that either. A backup does. I might move my pools to zfs after 6.12

 

Here is how pools are planned to be evolving for now: 

 

All the examples of not touching array directly also go against what you can do with unraid today. We have a mix of new and power users

 

33 minutes ago, aglyons said:

Can we at least agree that having data sit on the cache drives at all times that never hit the array ever is asking for data loss at some point

Not any more than data sitting in the array itself and not having a cache at all. What happens if you delete anything in that case? Do we suggest users protect themselves against that by setting up a backup / recycle bin mechanism, or by setting up a cache (which, as should be clear by now, is not meant to back up data)? We do have occasional users who incorrectly assume the Unraid array has a copy of their data on the parity drive that they can recover in such cases, and that is not too far away from this example.

 

I can also argue that deleting data without correctly understanding whether you have another copy somewhere is asking for data loss at some point. Manual user operations can only be protected to an extent

Edited by apandey
Link to comment
8 hours ago, aglyons said:

Can we at least agree that having data sit on the cache drives at all times that never hit the array ever is asking for data loss at some point.

No, or at least not any more than data on the array.

 

ALL the data on your server is in a single place unless you do backups, and it is at risk of accidental deletion just the same.

The array has parity protection against the outright failure of one drive, and only that: there is no protection against fat fingering, ransomware, filesystem corruption, accidental modification/deletion etc. Pools can have protection against the outright failure of one drive when they use RAID1'd drives, and they likewise have no protection against all those other things.
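A small Python sketch may help show why parity covers a dead drive but not a deletion. It assumes single parity computed as a bytewise XOR across the data drives, which is a simplification for illustration; the function names, the four-byte "drives", and modelling a delete as zeroing the data are all made up here.

```python
from functools import reduce


def xor_parity(blocks):
    """Bytewise XOR of equally sized blocks (one block per drive)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing block from the survivors plus parity."""
    return xor_parity(surviving_blocks + [parity])


def demo():
    disks = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_parity(disks)

    # Drive failure: disk 1 is gone, but parity still describes its content.
    rebuilt = rebuild_missing([disks[0], disks[2]], parity)

    # Deletion (modelled as zeroing): parity is updated along with the write,
    # so it now describes the emptied drive, not the old data.
    disks[1] = bytes(4)
    parity = xor_parity(disks)
    rebuilt_after_delete = rebuild_missing([disks[0], disks[2]], parity)
    return rebuilt, rebuilt_after_delete


if __name__ == "__main__":
    print(demo())
```

In the first case the rebuild returns the original `BBBB`; in the second it faithfully returns the zeroed block, because parity tracks the current state of the drives rather than their history.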

Link to comment
