Checksum Suite


Squid


What are the key differences (with respect to reliability, outcomes, speed and overheads) between this Plugin and the Dynamix File Integrity plugin created by bonienl?

 

I can see that this plugin clearly allows for recovery via Par2, but I'd like to know if there are any more big differences. If it is only the ability to recover files, I can already do that from my Backup server and vice versa. I find it unlikely that both my Main and Backup servers would experience rot on the same file at the same time.

 

Horses for courses; I am just looking at the benefits of one over the other.

Speed: Should be exactly the same.

Reliability: FIP is under active development, whereas this is not.  While I'm going to continue to use this plugin, I would actually suggest everyone switch over.  TBH, if there are any bugs, they probably won't get fixed unless they affect my usage of it.  And bonienl's interface is probably going to kick this one to the ground.

 

But they are different approaches to the same problem.  I prefer separate hash files for ease of checking from a separate computer, and the separate hashes also let you easily verify copies made to, say, a flash drive or over the network to another computer, which bonienl's plugin doesn't let you do quite as easily.

 

My opinion would be that anyone starting from scratch would be better off using FIP.



 

Thanks for the quick reply, and I appreciate the candor in your response  :)


An idea...

I have added a chmod($md5Filename,0667) to the file-writer section of checksum.php on my system. If smb-extra.conf is set to map hidden files via "map hidden = yes" (which is apparently not on by default anyway), this will hide the hash file from the usual SMB clients. Not that big a thing for media libraries, which are mostly read by client applications, but nice for data shares of any sort... Since I'm no Linux native, I can't say whether this chmod setting has additional (unwanted) effects, but it might be something to consider (optionally)...
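As a concrete sketch of the idea (the file path below is just a placeholder): Samba's "map hidden = yes" maps the DOS hidden attribute onto the world-execute bit, which is exactly what a mode of 0667 sets.

```shell
# Sketch only. With "map hidden = yes" in smb-extra.conf, Samba maps the
# DOS "hidden" attribute to the world-execute bit, so mode 0667
# (rw-rw-rwx, world execute set) hides the file from SMB clients.
touch /tmp/example-folder.md5          # placeholder hash file
chmod 0667 /tmp/example-folder.md5
stat -c '%a' /tmp/example-folder.md5   # prints 667
```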


So I decided to uninstall this since it's not going to be supported moving forward.

 

Any help with a command to recursively go through a drive and delete the .md5 files that I created?

 

I would mess around with rm, but that seems dangerous if I mess up.

 

Edit: Actually found an answer via Lord Google.

 

Adding the code in case anyone wants to know how to do this via the command line. (Warning: make sure you do this right, because you can mess things up pretty badly if you don't...)

 

Step 1: Make sure you are only selecting the files you want to delete.

find /mnt/disk1/ -type f -name "*.md5"

Note: this will find all files that end with .md5 (the extension I picked for my hash files) on disk1. (If you have .md5 files you want to keep, you can narrow the search by using a more specific starting path.)

Step 2: Once you've verified that Step 1 found the files you want to delete, and only the files you want to delete, add -delete to the end. Warning: this can't be undone, and if you put -delete before your file-name test it will delete EVERYTHING.

find /mnt/disk1/ -type f -name "*.md5" -delete

Repeat Steps 1 and 2 for each disk.
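The repeat-per-disk step can be sketched as a single loop (preview only; add -delete yourself after checking the output, and adjust the /mnt/disk* glob to your own disks):

```shell
# Preview pass over every data disk; this only LISTS the matching files.
# Append -delete to the find command only after verifying the list.
for d in /mnt/disk*/; do
    echo "== $d =="
    find "$d" -type f -name "*.md5"
done
```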

 

I think you could do this using /mnt/user/ as well, but I decided against it because I didn't want to catch anything that should be there in my app folder.


A lot of things can be done very easily in Windows -- with very little (if any) penalty in performance.

 

In fact, although I "played" a bit with this plugin on my test server, I left my main server alone and still just use the Corz Checksum utility in Windows.  It's trivial to create new checksums when needed [right-click, select "Create Checksums"]; and for periodic validations I simply right-click on a share (or disk, depending on what I want to check) and select "Verify Checksums".  A complete validation of my entire DVD collection (~25TB) takes a couple of days to actually run; but it takes about 2 seconds of "my time"  :)

 

The File Integrity Plugin indeed looks interesting; but I like the dedicated checksum files, so when I copy folders to my backups, or to another PC, the checksums are still there.

 

 

 



I agree so completely I could have written Gary's post!  I too use Corz and would like to see a fully supported tool that creates and maintains the separate hash files.  Maybe bonienl will reconsider, and add that alternative some day?

 

It's a little disappointing; I thought we had a good hash tool and PAR2 on the way, but I can understand if Squid is unwilling to enslave himself to us!  He knows better than most of us the work, the commitment, and the responsibility involved in maintaining an important plugin.  It's always better sooner than later to admit the drive and interest are not there, before more users are relying on it.



I had every intention of carrying on with this, but between real life, the aggravation, and the fact that I just wasn't having fun with this one...  Sorry guys.



+1

I am still using this.

  • 2 weeks later...

I am just now using this as a way to compare Share/Dir/SubDir/*.* between Tower1 and Tower2, where Tower2 is the master.  I badly need a list of files that may be incomplete/corrupted/missing on Tower1.

 

I see that this creates a SubDir.hash for each of

Tower1/Share/Dir/SubDir

Tower2/Share/Dir/SubDir

 

On the Windows machine I have installed Corz Checksum.  How do I set up a checksum compare of Tower1/Share/ vs Tower2/Share/?  I have over a thousand SubDir folders... so comparing each one separately is not feasible.

 

Thank you!

 


On the Windows machine I have installed Corz Checksum.  How do I set up a checksum compare of Tower1/Share/ vs Tower2/Share/?  I have over a thousand SubDir folders... so comparing each one separately is not feasible.

 


AFAIK this won't work that way, since Corz has no comparison mode [for the hashes only]. If Tower1 should be a 1:1 replica of Tower2, you might copy the hash files from Tower2 to Tower1 (e.g. using PowerShell to copy all *.hash files plus the full folder structure), and then run Corz against Tower1 to find missing, different, and extra files.

 

There are probably other tools out there to synchronize/compare folders, but they won't make any use of the hash files Corz/checksum tools have created, and thus will need to re-hash everything on the run.

 

Another way would be to use Syncthing (available as a Docker container) to selectively synchronize folders; it allows you to set a "master" (read-only) side and uses SHA-256 to compare contents. Even though it's definitely not meant to handle n terabytes and millions of files, I'm using it to create a mirror of my main server (a sort of hot-standby box)...


Nice Solution :)

 

Does this look about right?

ROBOCOPY \\Tower2\ShareA C:\Compare\Tower2\ShareA *.hash /E

ROBOCOPY \\Tower1\ShareA C:\Compare\Tower1\ShareA *.hash /E

Well, no ;)

I can't say for sure about robocopy (I'm not using it), but what I see lacks the correct principle in the first place.

 

OK, Tower2 is your master, and has all the proper *.hash files (the reference);

 

you had also created them on Tower1, which should be your mirror, but unfortunately they are useless there, unless you were to write a script that can compare the contents of both towers directly from the hashes (I'm not aware of such a tool).

 

So, in order to use what you have... you would first delete all the hash files from Tower1 (the mirror), and disable automatic hashing in the checksum tool's settings (if you had set that up).

 

Now you can use robocopy to copy the hashes from Tower2 to Tower1, so the master's file inventory and folder structure are available 1:1 on the mirror, but just as a sort of metadata.
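For anyone doing this step from the Linux side instead of with robocopy, the hash-only copy could be sketched with find and GNU cp; the tower paths below are placeholders for wherever the shares are mounted:

```shell
# Replicate only the *.hash files from the master to the mirror,
# recreating the folder structure. GNU cp's --parents preserves the
# relative directory path of each copied file.
cd /mnt/tower2/ShareA
find . -type f -name "*.hash" -exec cp --parents {} /mnt/tower1/ShareA/ \;
```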

 

Now you can either run Corz over Tower1, and it will find missing, different, and extra files, as it is using the Tower2 hash files as its reference... but it will of course need to read all the files and re-hash them during verification.

 

Another way would be to start a manual verification using the checksum tools in unRAID. AFAIK (I haven't done that) it will also report missing, different, and extra*) files, since it is also using the hash files from Tower2 to verify Tower1's contents; but of course this also needs to read everything and hash it. The main advantage is that you don't need to have your client connected all the time, and most probably verification will run a lot faster when accessing the shares directly on the box.

 

So, in short, there is no easy way to avoid on-the-fly hashing of at least one tower, since there are no tools out there (to my knowledge) that will take (Corz) hash files in folder structures to compare directories from two sources...
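That said, the script imagined above (comparing both towers directly from their existing hash files, with no re-hashing) can be sketched. It assumes both towers still have their hash files, and that each *.hash file holds lines like "<md5> *<filename>" for the files in its own directory (an assumption about the format); the tower paths are placeholders:

```shell
# Hedged sketch: diff two trees using only their existing *.hash files.
# Each hash line is tagged with the (relative) path of its hash file, so
# the same file in the same SubDir lines up on both sides.
collect() {
    local root=$1
    find "$root" -type f -name "*.hash" | sort | while read -r f; do
        rel=${f#"$root"/}            # hash-file path relative to the root
        sed "s|^|${rel}: |" "$f"     # prefix each hash line with it
    done
}
diff <(collect /mnt/tower2/ShareA) <(collect /mnt/tower1/ShareA)
```

This needs bash (for the process substitution), and only detects what the hashes can show: it cannot catch a file that rotted after its hash was written on that same tower.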

 

*) Note: I'm sure the plugin will report different and missing files during verification; I just don't know if it also reports files that are not hashed. Maybe Squid can tell...


I wouldn't simply compare the hashes ... I'd compare the actual binary contents of the files.

 

There are a variety of good comparison tools.    I use FolderMatch (a Windows utility) 

http://www.foldermatch.com/

 

... but there are many free tools that will also work -- both Windows and Linux based.

 

If, however, you've actually computed the hash files independently on each server, then comparing the hash files will indeed be all you need to do.

 


I'd just use "Contents" => that's a binary comparison of the actual file contents.    Either CRC or SHA-1 has to read the contents to do the calculation, so there's no advantage to those methods when you have access to the files.  I'm actually not sure why FolderMatch even offers those methods, although I suppose it may be useful if you're comparing HUGE files that won't fit into memory for the binary comparison (but even then I'd think it would be done in "chunks").

 

In any event, I always use "Contents" for the comparison method.
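On the command line, the same sort of "Contents" check can be sketched with the standard cmp tool (the file names here are placeholders):

```shell
# Byte-for-byte comparison of two files; cmp -s exits 0 only if they
# are identical, so the message printed reflects the comparison result.
cmp -s fileA fileB && echo "identical" || echo "differ"
```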

 

 


... I'm actually not sure why FolderMatch even offers those methods ...

 

Just for grins, I sent a support request to SaltyBrine and asked why they had the CRC and SHA-1 hashing options.    Their response:  "... To be honest, the hashing algorithm methods were added more for marketing reasons than for any actual need."

 

In other words, if you can compare the actual file contents, there's no reason to compute the hashes  :)

 

 


Great tool!  Finished 375 GB in about 3 hours using compare-contents.  The only differences found were expected and explainable, so I'm gaining faith that the conversion from rfs to xfs went very well.  It holds my D525 at 87% sustained, so I should be able to compare all 10 TB.  I cannot thank you enough.  If only they offered an Ubuntu version with scheduling, they would have the PERFECT app!  Pulled the trigger on lifetime updates ;).


Great tool!  ... Pulled the trigger on lifetime updates ;).

 

Agree it's a very handy tool.  I've used it for many years.  It does a lot more than what I use it for ... I just use it as a great comparison tool; but it can also synchronize, copy, move, and check for duplicates.    Not sure why I don't use the other features ... guess I'm just a creature of habit and have my favorite tool for each of those functions [SyncBack for synchronization; TeraCopy for copying/moving; etc.].

 

