Use Mover or Cron to duplicate disk structures


I like redundancy when it comes to files. Given how much disk space I have available, I'm considering creating a duplicate structure within my server. Would it make sense to use Mover to automatically replicate files? Cron? Another way? The use case is simply to ensure that when a file is written, it is automatically copied in the background to a duplicate tree. Sort of a brute-force RAID 1 via duplicate file structures...

 

Thoughts?

Link to comment

Are you talking about duplicates of files in user shares or duplicates of files on disks?

 

If you have 2 disks and they are both part of the same user share, you should avoid having identical paths for a file on both disks. For example, if you have /mnt/disk1/myshare/file1 and /mnt/disk2/myshare/file1, then /mnt/user/myshare/file1 has 2 possible results: the file at /mnt/disk1/myshare/file1 and the file at /mnt/disk2/myshare/file1. These files may not even be identical, but the user share path can't distinguish them, and the result is indeterminate.

 

There are other ways to arrange things so you don't have identical paths, but then it isn't exactly what you might call a duplicate structure.

 

Link to comment


Not positive, but based on things I've seen before (a while ago), I think if you were to try /mnt/user/myshare/file1 you would get the file1 on /mnt/disk1/ 100% of the time, and the file1 on /mnt/disk2/ 0% of the time. I think conflicts are resolved by returning the file from the lowest-numbered disk.

 

If this is the case, then you can do what jeffreywhunter wishes easily. That said, I wouldn't move forward with this without really understanding how conflicts in /mnt/user/ are resolved.

 

Also, as to method, I don't know how you could automagically do this when a file is written, but you could easily create a script that copies the data using rsync and run it at regular intervals using cron.
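
 

Something like this, as a rough sketch of that approach (the paths, the log file location, and the cron schedule are all just examples, not anything from a stock unRAID install):

#!/bin/bash
# Rough sketch of the rsync-plus-cron idea. SRC and DST are examples;
# point them at your own disk or share paths.
SRC=/mnt/disk1/myshare/
DST=/mnt/disk2/myshare-backup/

# -a preserves permissions, ownership, and timestamps; -v lists what gets
# copied. Without --delete, files removed from SRC still survive in DST,
# which may or may not be what you want in a "mirror".
rsync -av "$SRC" "$DST" >> /var/log/myshare-mirror.log 2>&1

Then a cron entry to run it hourly, e.g.:

0 * * * * /boot/custom/mirror.sh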

Link to comment


So it might actually be deterministic, though not necessarily what is expected or wanted.

 

Still not necessary to bother with this, though. Just exclude disk2 from the user share. Actually, thinking about this some more, I'm not sure that would really work. The exclude would probably only apply when writing to the share, and probably not when reading.

 

You could certainly just avoid using user shares altogether, but that would mean the files would have different paths. So you might as well just make the copy have a different path.

 

There are of course other ways to accomplish redundancy, backing up to a different system being one obvious way. It doesn't quite accomplish what RAID 1 does, though.

Link to comment


These are Tom's comments on duplicates from the 5.0rc11 thread. Not sure, but probably nothing has changed since then:

 

What are duplicates?  There are two kinds of duplicates:

 

A) The same named file in the same directory path on multiple disks. All file operations will operate on the copy of the file on the lowest-numbered array disk.

 

B) A situation where there is a file in the same directory path as a directory.  For example:

 

disk1/Movies/Alien    <-- a directory

disk2/Movies/Alien    <-- a file

 

In this case, since the object on the lowest numbered disk is a directory, the object will be treated as a directory for all operations.
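
 

If you want to check whether you already have any kind-A duplicates, something along these lines should work from the console (a quick sketch, assuming bash and just two data disks; add more find/sort pairs for more disks):

# List the relative path of every file on each disk, then print only the
# paths that appear on both -- those are kind-A duplicates.
comm -12 \
  <(cd /mnt/disk1 && find . -type f | sort) \
  <(cd /mnt/disk2 && find . -type f | sort)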

 

Link to comment

I was working on writing up a script that could do this, but I realized there would be weird problems.

 

Any script to mirror disk1 on disk2 will fail to duplicate files that unRAID puts on disk2 that don't exist on disk1 (the risk that trurl pointed out), so you would need to rsync from disk2 to disk1 as well... but that creates a weird scenario where deleting files is a pain, since you have to delete the copies from both disks before the script runs and recreates a copy on the disk you just deleted it from...

 

I'm just not sure that this is a good idea.

 

Now if you were to mount these disks outside your array, then it would be much easier and simpler to pull off.

Link to comment

While backing up on the same system isn't as good as having a separate backup system, it's certainly better than not having backups at all.

 

The key, of course, is to be sure the duplicates are on a different disk, so it's less likely you'll lose data if you have a dual disk failure [Note that you could still lose data if the 2 failed disks had both the primary and backup copies of some files => an issue you wouldn't have if you had a separate backup server].

 

As for automating the process => you could simply use an rsync task that replicates your shares to a set of backup shares on DIFFERENT disks.

 

Link to comment

This is about redundancy, not duplication (although duplication creates redundancy). For critical directories, I just want to maintain a redundant copy: on-server backup, separate disks, separate shares. Here's a sample structure.

 

Each of these shares is defined:

> Cache (mover moves files from here to Movies, Pictures and Documents)

> Movies

-- Disk1, Disk2

> Movie Backup

-- Disk3, Disk4

> Pictures

-- Disk5

> Pictures Backup

-- Disk6

> Documents

-- Disk7, Disk8

> Documents Backups

-- Disk9, Disk10

 

Hourly, a cron job would kick off that synchronizes files between the original and backup directories (which would never be accessed except in an emergency).
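
 

For what it's worth, the cron half of that could look something like this, as a sketch only (the script name and location are made up; the share names are taken from the layout above):

#!/bin/bash
# sync-backups.sh -- one rsync per share pair. Trailing slashes make
# rsync copy the contents of each source share into its backup share
# rather than nesting a new directory inside it.
rsync -a /mnt/user/Movies/    "/mnt/user/Movie Backup/"
rsync -a /mnt/user/Pictures/  "/mnt/user/Pictures Backup/"
rsync -a /mnt/user/Documents/ "/mnt/user/Documents Backups/"

With an hourly crontab entry:

0 * * * * /boot/custom/sync-backups.sh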

 

Does that make more sense? Can I use Mover to also sync files? Or do I need to purchase an app like GoodSync?

Link to comment

Certainly seems like a simple rsync script run hourly by a cron job would do the trick. It works fine between directories -- they don't have to be on different computers.

 

Note, however, that as I indicated earlier, this does NOT provide a fail-safe backup in case of a data-loss scenario on UnRAID (i.e. 2 failed disks). Using your example, if both disk5 and disk6 failed, you'd lose all of your pictures.

 

 

Link to comment

Sure, but the odds of losing two disks, those two specific disks, would be remote. Plus, in keeping with my redundant lifestyle, I have the same file structures on my RAID 0 on my PC... To be impacted (by more than a cycle's worth of data), I'd have to lose three drives: two on Unraid and one on my RAID 0. Seems pretty remote...?

Link to comment
