Cache "copy with verify" possible?

I'm about to build my first unRaid server as a replacement for my WHS box. Once complete I will need to transfer over 5TB of data to the unRaid server over Gigabit LAN. I built a small unRaid test server and achieved writes to it of around 20MB/s. Most of my files are large BluRay rips stored as individual ISO files, and as I'm concerned about data integrity I always copy with verify (using Teracopy, Fastcopy, xxcopy, etc.). Using verify of course approximately doubles the copying time, so at 20MB/s the copying is starting to take an unacceptable amount of time. Copying 5TB at that speed with verify is going to take quite a long time to say the least  :-[
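
For a rough sense of scale, the figures in this post can be turned into a quick estimate. This is only a sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), a constant transfer rate, and a verify pass that costs about as much time as the copy itself. The function name is hypothetical.

```python
def transfer_hours(total_bytes, rate_bytes_per_s, verify=False):
    """Estimate wall-clock hours for a copy at a constant rate.

    verify=True assumes the verify pass re-reads everything at the
    same rate, so the total time doubles (an approximation).
    """
    seconds = total_bytes / rate_bytes_per_s
    if verify:
        seconds *= 2
    return seconds / 3600


TB = 10**12  # decimal terabyte (assumption)
MB = 10**6   # decimal megabyte (assumption)

# 20 MB/s is the measured rate; 50 and 70 MB/s are the hoped-for cache rates.
for rate in (20, 50, 70):
    plain = transfer_hours(5 * TB, rate * MB)
    checked = transfer_hours(5 * TB, rate * MB, verify=True)
    print(f"{rate} MB/s: {plain:.0f} h plain, {checked:.0f} h with verify")
```

Under those assumptions, 5TB at 20MB/s comes to roughly 69 hours for a plain copy and roughly 139 hours (close to six days) with verify.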

 

So, investigating alternatives to speed up the copy process, I thought about adding a cache drive to the unRaid server. I haven't verified it yet but I would expect to get maybe 50 to 70MB/s transfers based on reports in this forum. However, thinking a little more deeply, using a cache disk defeats the purpose of verified copying. The copy on the cache will be perfect, but I'm assuming the copy from the cache to the storage drives will be just a plain vanilla copy with no verify. This problem could be easily solved if there was some option within unRaid (a switch in the cron copy job, perhaps??) to do copies from the cache with verify. Does anybody know of such an option, or if not, is there some other solution available for verified cache copies?

 

 

 

 

hi

 

the copy from cache to the array is done with Rsync

and as far as i know that uses "verify" technology with md hash verification

 

see this wiki

http://en.wikipedia.org/wiki/Rsync

 

The process to move data from cache onto the array is called mover, which uses rsync to verify files as they are moved.

 

This is the command in the script, "rsync -i -dIWRpEAXogt --numeric-ids --inplace --remove-source-files {} /mnt/user0/ \; \) -o "

 

I'm not going to go look up all the options, but I'm fairly certain that this verifies the files as they are moved from the cache disk to the array.
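
For reference, the option cluster in that command can be unpacked one letter at a time. The sketch below is a lookup table written from my reading of the rsync 3.x man page, not taken from the mover script itself. The helper name is hypothetical, and the meanings are worth checking against `man rsync` on the server.

```python
# Meanings of the short options, as described in the rsync 3.x man page
# (transcribed by hand here, so treat this table as an assumption to verify).
SHORT_OPTIONS = {
    "i": "--itemize-changes: print a change summary for each item",
    "d": "--dirs: transfer directories without recursing",
    "I": "--ignore-times: do not skip files that match in size and mtime",
    "W": "--whole-file: copy whole files, no delta-transfer algorithm",
    "R": "--relative: use relative path names",
    "p": "--perms: preserve permissions",
    "E": "--executability: preserve executability",
    "A": "--acls: preserve ACLs",
    "X": "--xattrs: preserve extended attributes",
    "o": "--owner: preserve owner",
    "g": "--group: preserve group",
    "t": "--times: preserve modification times",
}


def explain(cluster):
    """Return one description per letter in a cluster such as '-dIWRpEAXogt'."""
    letters = cluster.lstrip("-")
    return [SHORT_OPTIONS.get(c, f"-{c}: not in this table") for c in letters]


for flags in ("-i", "-dIWRpEAXogt"):
    for line in explain(flags):
        print(line)
```

Read this way, the cluster does not appear to include `-c` (`--checksum`), and `-W` asks rsync to send each file whole rather than block by block, which may matter for the verify question.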

 

Someone can correct me if I am wrong.

Using a cache drive in your situation will slow things down. What happens when the cache drive is full? For maximum speed make sure you are writing to disk shares and not user shares.

  • Author

Well that's service with a smile; three replies in less than 30 minutes. Great forum.

 

Ok, I'll look into the rsync options.

 

<<Using a cache drive in your situation will slow things down. What happens when the cache drive is full? >>

 

Yes, I considered that problem. All my 8 drives will be new WD 2TB EARS so the transfer will have to be in <2TB steps per day. Or perhaps I can modify the cron job to run more frequently??

 

<<For maximum speed make sure you are writing to disk shares and not user shares.>>

 

I thought as much, but when I tested on my test unRaid system I found no speed improvement at all. When copying a 4.7GB DVD ISO from my Windows box to unRaid, the average transfer time was 158s to a disk share and 159s to a user share. This was repeatable. Seems I may have a bottleneck elsewhere in the system. When I get my new system up and running I'll redo the test.

 

Thanks for the replies.

It sounds like you are presently just copying off the WHS box. Unassign the parity drive and copy to the data drives without it. Then, assign it and let it build once you are done copying.

 

Peter

 

  • Author

<<Unassign the parity drive and copy to the data drives without it. Then, assign it and let it build once you are done copying.>>

 

Thank you Peter. That did the trick. My test copy went from 158 seconds to only 109s and averaged 41MB/s (from about 28MB/s). That's certainly a worthwhile improvement.
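
As a quick sanity check on those numbers, the timings can be converted back to throughput. This assumes the test file is a single-layer DVD ISO of about 4.7 × 10^9 bytes and that the copy tool reports MB/s in binary megabytes (2^20 bytes); both are assumptions, not details given in the post.

```python
ISO_BYTES = 4.7e9  # assumed size of the test ISO
MIB = 2**20        # assumed unit behind the reported "MB/s"


def throughput_mib_s(size_bytes, seconds):
    """Average throughput in MiB/s for a transfer of size_bytes."""
    return size_bytes / MIB / seconds


with_parity = throughput_mib_s(ISO_BYTES, 158)
without_parity = throughput_mib_s(ISO_BYTES, 109)
print(f"with parity:    {with_parity:.1f} MiB/s")
print(f"without parity: {without_parity:.1f} MiB/s")
print(f"speed-up:       {158 / 109:.2f}x")
```

Under those assumptions the two rates come out a little above 28 and 41, in line with the reported figures.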

 

Cheers,

 

Ross

  • Author

Just an update on rsync verify:

 

<<This is the command in the script, "rsync -i -dIWRpEAXogt --numeric-ids --inplace --remove-source-files {} /mnt/user0/ \; \) -o ">>

 

<<the copy from cache to the array is done with Rsync and as far as i know that uses "verify" technology with md hash verification >>

 

I took a look at the wiki and also http://everythinglinux.org/rsync/ . As far as I can determine there is no verification that the newly written file was written without error, i.e., there is no post verification. It seems to use md hash to find differences between files of the same name that exist on both the cache and the drive store, and then transfers only blocks of data that contain the differences. I did not read anything to the effect that those writes were then verified, but I'm not really familiar with the Unix OS so I may be wrong (and I hope that I am wrong because I need verified writes from cache).

 

Ross

having a read of the wiki, it would appear that rsync constantly verifies as it transfers

 

I'd consider that better than a single final md5 / sfv check personally :)

 

 

 

but if you're still concerned, you can do a final md5 check between the original copy (on the whs) and the final resting place of the file (post mover script) once you've copied everything, prior to wiping your whs original copies
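
A final check of that kind could look something like the sketch below. It is a minimal example using Python's standard `hashlib`; the function names are hypothetical, and it assumes both the original and the final copy are readable from the machine running it.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source_path, dest_path):
    """True if both files hash to the same MD5 digest."""
    return md5_of_file(source_path) == md5_of_file(dest_path)
```

Calling `same_content(original, final)` with the path on the WHS box and the path on the array (both placeholders here) returns `False` if the two copies differ. Reading the destination over the network costs about as much time as another copy, so hashing each side locally and comparing the digests is likely faster.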

  • Author

Hello kal,

 

Thanks for the reply.

 

<having a read of the wiki, it would appear that rsync constantly verifies as it transfers>

 

Can you paste the section that suggests this? I've re-read the wiki and I still can't see any text that suggests writes are subsequently verified in some way, either on-the-fly or later.

 

<<if you're still concerned, you can do a final md5 check between the original copy (on the whs) and the final resting place of the file >>

 

Sure, that is possible, but I think that would defeat the purpose of writing to cache (faster transfers). It would be better just to write directly to the user or disk share with verify.

 

Anyway, happy to be proven wrong on this  ;)

 

Ross

well it's a file synchronisation tool, and it synchronises on a block (chunk) level by comparing md5 checksums as it goes. It keeps resubmitting differing blocks until the whole destination file contains the same (md5 verified) blocks. From the wiki:

"The recipient splits its copy of the file into fixed-size non-overlapping chunks and computes two checksums for each chunk: the MD4 hash, and a weaker 'rolling checksum'. It sends these checksums to the sender. Version 30 of the protocol (released with rsync version 3.0.0) now uses MD5 hashes rather than MD4."
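
The chunk matching described in that passage can be illustrated with a simplified sketch. This is not rsync's actual code: it uses a 16-bit weak checksum in the style of the rsync technical report plus MD5 as the strong hash, and every function name is hypothetical. It only shows how matching blocks are found; it says nothing about whether data is read back after it has been written.

```python
import hashlib

M = 1 << 16  # modulus for the weak checksum


def weak_checksum(block):
    """Weak checksum of a block, returned as the pair (a, b)."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b


def block_signatures(data, block_len):
    """(weak, strong) checksums for each fixed-size, non-overlapping chunk."""
    chunks = (data[i:i + block_len] for i in range(0, len(data), block_len))
    return [(weak_checksum(c), hashlib.md5(c).hexdigest()) for c in chunks]


def find_matches(new_data, signatures, block_len):
    """Offsets in new_data where a window matches a known chunk."""
    known = dict(signatures)  # weak -> strong (later duplicates overwrite)
    matches = []
    if len(new_data) < block_len:
        return matches
    a, b = weak_checksum(new_data[:block_len])
    for start in range(len(new_data) - block_len + 1):
        strong = known.get((a, b))
        # Only compute the expensive MD5 when the cheap checksum matches.
        if strong is not None:
            window = new_data[start:start + block_len]
            if hashlib.md5(window).hexdigest() == strong:
                matches.append(start)
        if start + block_len < len(new_data):
            a, b = roll(a, b, new_data[start], new_data[start + block_len], block_len)
    return matches


old = b"the quick brown fox jumps over the lazy dog"
new = b"XX" + old  # same content, shifted by two bytes
print(find_matches(new, block_signatures(old, 8), 8))
```

The rolling update is what makes it cheap to test a window at every byte offset, so chunks that have merely shifted position are still recognised.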

 

2nd point,

 

Don't forget, the cache drive only speeds up the writes to the array (by essentially postponing the parity work till the mover script runs), so your TeraCopy verify (or the like) wouldn't be any faster with or without a cache drive

 

For these large, initial data copies, I'd be considering running without a cache drive, and without a parity drive,

blitzing over all your data onto the unraid disks

Verifying it's all there ok (md5 checks etc as necessary),

and then adding in the parity drive.

 

Down the line then, if your subsequent protected unraid writes are thought to be slow, consider using a cache drive then

  • Author

Here is my understanding of the context of that paragraph:

 

The wiki doc talks about the comparison of two existing similar (but not identical) files. The differences between these two files are found using MD hash, etc. This is all done with only reads; there are not yet any writes done to the recipient file at this stage. Once the differences are found, only then are chunks containing those differences written by the sender to the recipient file. There is no mention in the doc that, after the sender writes this data, a subsequent verify takes place.

 

The wiki does not mention how a new file (not existent on the recipient drive) is handled, but I assume it is just a standard copy with no verify.

 

<For these large, initial data copies, I'd be considering running without a cache drive, and without a parity drive ...>

 

Yes that is great advice, and it makes our discussion about rsync redundant. As an unRaid newbie I was under the impression that writing to a cache drive would be faster than writing directly to a drive share, but if the parity disk is unassigned I can't see how it can be faster (should be the same speed as writing to a drive share assuming the same disk specs).

 

Thanks kal.

 

Ross

I would do the copy without verify, and then do md5 calculations after the fact. You can do the md5 in parallel with copy operations involving other physical disks with no or minimal impact on copy speeds (obviously run the md5 program on the same machine where the data lives).
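
One way to do the after-the-fact check is to build an MD5 manifest on each machine and compare the two. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, each manifest is built locally on the machine that holds the data, and the two directory trees share the same relative layout.

```python
import hashlib
import os


def md5_manifest(root, chunk_size=1024 * 1024):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            digest = hashlib.md5()
            with open(full_path, "rb") as handle:
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    digest.update(chunk)
            # Normalise separators so Windows and Linux manifests line up.
            rel = os.path.relpath(full_path, root).replace(os.sep, "/")
            manifest[rel] = digest.hexdigest()
    return manifest


def compare_manifests(source, dest):
    """Return (missing, mismatched) relative paths, each sorted."""
    missing = sorted(p for p in source if p not in dest)
    mismatched = sorted(p for p in source if p in dest and source[p] != dest[p])
    return missing, mismatched
```

Each manifest is a plain dictionary, so it can be saved (for example with `json.dump`) on one machine and compared on the other. Running one manifest job per physical disk keeps the hashing from competing with copies to other disks.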

 

If you are setting up a new array, I would consider doing at least part of the copying with parity in place. Running it under real world conditions is a good test.  BTW, most users get 30 MB/sec +/- copying to the protected array (without verify).

 

And after you compare a bunch of md5 results and find no differences, you may actually start to trust your setup to copy data accurately ;)

Archived

This topic is now archived and is closed to further replies.
