JorgeB Posted December 30, 2015

I will probably go back to checksums only once dual parity is available.

Dual parity won't help in the case you described => if the files are corrupted, both parity disks will have been updated to reflect the corruption. You'd still need some way of correcting the corrupted data.

The corruption happened because a 2nd disk had read errors during the rebuild of another failed disk. With dual parity I could disable the 2nd disk and rebuild both from parity, or am I wrong?
garycase Posted December 30, 2015

In that case it would indeed be okay => since the source of the corruption was a 2nd bad disk. You wouldn't even have to disable it ... the data should automatically be corrected from the parity information when a read error is detected. But the concept is as I noted -- if data is corrupted for some other reason, both parity drives would reflect the corruption.
JorgeB Posted December 30, 2015

But the concept is as I noted -- if data is corrupted for some other reason, both parity drives would reflect the corruption.

Yep, and PARs could still help in case a 3rd disk failed with dual parity; although unlikely, it's possible, so I may keep using them.
SlrG Posted January 3, 2016

@Squid: Is it correct that if a verification pass is missed, it won't be executed until the next execution time is reached? My server is sleeping or powered down most of the time. I would really like it if the plugin could check whether verification passes have been missed and execute them in the scheduled order as soon as the server is running again. A user option to toggle such behaviour would be much appreciated.
Squid Posted January 3, 2016 (Author)

@Squid: Is it correct that if a verification pass is missed, it won't be executed until the next execution time is reached? My server is sleeping or powered down most of the time. I would really like it if the plugin could check whether verification passes have been missed and execute them in the scheduled order as soon as the server is running again. A user option to toggle such behaviour would be much appreciated.

That's correct. The scheduled verifications (% of drive / share) run through a cron job. There is currently no built-in method to have it reschedule a missed time.

TBH, after I got this plugin to the point where it had the features I was interested in and did what I wanted it to do, I more or less lost interest in further updates. (It just wasn't as much fun to me as CA is.) I am very grateful that bonienl is working on a GUI for bunker, which should do everything and more and be far more polished than this plugin is. (I don't particularly agree with his approach, even though it is sound.) Once bonienl's plugin is out of the WIP stage, I was actually going to deprecate this plugin in favour of his (even though for my own purposes I would actually continue to use it).
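Since cron alone won't catch up on missed runs, the behaviour SlrG asked for could be approximated with a small boot/wake script. This is only a sketch: STAMP, INTERVAL_DAYS and run_verification are invented names, not anything the plugin actually provides.

```shell
#!/bin/sh
# Hypothetical catch-up sketch (not a feature of the plugin): at boot or on
# wake, run the verification pass if a stamp file is missing or older than
# the schedule. STAMP, INTERVAL_DAYS and run_verification are invented names.
STAMP="./last_verify_stamp"    # would be touched after each completed pass
INTERVAL_DAYS=7                # how often the cron schedule would fire

run_verification() {
    echo "verification pass started"    # stand-in for the real pass
    touch "$STAMP"
}

# find -mtime +N matches files modified more than N days ago
if [ ! -e "$STAMP" ] || [ -n "$(find "$STAMP" -mtime +"$INTERVAL_DAYS" 2>/dev/null)" ]; then
    run_verification
fi
```

Something along these lines, called from the go file or a post-wake hook, would approximate the requested behaviour; the real trigger and verification command would have to come from the plugin itself.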
jumperalex Posted January 3, 2016

uggg, you need to convince him to add par2 then ;-) At least we've got him exporting to the corz standard now, iirc.
unevent Posted January 3, 2016

I'm getting this in my logs by the dozen or so daily. Anyone know how to correct it? I have six shares set up the same way, yet only this one is giving this message in the log.

crond[1476]: failed parsing crontab for user root: /usr/local/emhttp/plugins/checksum/scripts/checksumShare.php "/mnt/user/Pictures" &>/dev/null 2>&1
Squid Posted January 3, 2016 (Author)

I would try removing the schedule for the Pictures share and adding it again.
unevent Posted January 4, 2016

I would try removing the schedule for the Pictures share and adding it again.

Removed it, stopped/started the monitor, still get the message every hour. Guess I will have to reboot to clear out cron?
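A reboot shouldn't be necessary just to drop a stale entry: root's crontab can be filtered and reinstalled in place. The dump below is fabricated for the demo (this doesn't explain why crond rejects the line, it only removes it); on the live server you'd use the commented pipeline at the end.

```shell
#!/bin/sh
# Self-contained demo of clearing the stale entry without a reboot: filter
# the offending line out of a crontab dump and reinstall the result. The
# dump below is fabricated; on the live server you would use the commented
# crontab pipeline at the end instead.
cat > cron.current <<'EOF'
0 3 * * * /usr/local/sbin/mover
0 * * * * /usr/local/emhttp/plugins/checksum/scripts/checksumShare.php "/mnt/user/Pictures" &>/dev/null 2>&1
EOF
grep -v 'checksumShare.php' cron.current > cron.fixed
cat cron.fixed
# On the live system (review cron.fixed first!):
# crontab -l | grep -v 'checksumShare.php' | crontab -
```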
gundamguy Posted January 4, 2016

(I don't particularly agree with his approach, even though it is sound.)

I think this is moving in the right direction and am grateful that both you and bonienl are working on this. I'm assuming what you don't agree with is the approach of putting the checksum info into extended attributes instead of creating a .hash (or whatever...) file?

Also, if you're looking for the next cool new project to start... maybe consider an rsync GUI which makes unRAID to unRAID (or other rsync clients) backups easier... lots of discussion about that right now in the General Support forum... people really don't like that they have to create their own scripts and dig into the command line to create backups....
Squid Posted January 4, 2016 (Author)

Also, if you're looking for the next cool new project to start... maybe consider an rsync GUI which makes unRAID to unRAID (or other rsync clients) backups easier... lots of discussion about that right now in the General Support forum... people really don't like that they have to create their own scripts and dig into the command line to create backups....

Sparklyballs' WebSync docker might fit the bill.
gundamguy Posted January 4, 2016

Good point. I'll have to look into that.

Edit: Looking into it... first question: why is a GUI for a tool built into unRAID a docker...?

Edit2: Kept looking and figured it out; it's based on a WebGui designed by someone else... that makes more sense.

Edit3: Looks like development has stalled for a few months.
Squid Posted January 4, 2016 (Author)

Edit3: Looks like development has stalled for a few months.

It's a beta for a possible ls.io version. Just pester sparklyballs about it.
clevoir Posted January 4, 2016

I have a problem with duplicate hash values: one will show corruption for a file, whereas another hash value will show that the same file is intact.

Prior to your plugin, I manually created MD5 hash values for each file using the corz utility. On installing and running your plugin, setting it again to create an MD5 hash value per file, I assumed it would monitor the existing hash values instead of creating new ones. Largely this is the case, but I have situations where the original hash value is shown and another hash value has been created by your plugin. In some cases verifying both hash values will report that the file is intact, but in other cases the original hash value reports that the file is corrupt while the new hash value reports that the file is intact.

For example, I have an mp3 file called Test; the original hash value is Test.md5 and the new hash value is Test.mp3.md5. In all cases where I have two hash values for a file, the original hash value is *.md5 and the newly created hash value is *.file_extension.md5.

I am concerned that on running the Verification Tool, I don't know whether it looks at the old hash value, the new value, or both. As such it could be reporting that a file is corrupt when it is not. I guess I could just delete all MD5 files and start again, especially in light of the Bunker GUI plugin?
trurl Posted January 4, 2016

I have a problem with duplicate hash values: one will show corruption for a file, whereas another hash value will show that the same file is intact. In some cases verifying both hash values will report that the file is intact, but in other cases the original hash value reports that the file is corrupt while the new hash value reports that the file is intact.

If the original hash no longer matches the file, then the file has changed since that hash was made, whether through corruption or something else. If the file still plays OK, I suspect something has updated the tags in the mp3.

I'll let Squid answer the question of which hash his verification would use, but I suspect it would be the one that follows his naming convention; i.e., *.file_extension.md5
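trurl's tag-update theory is easy to illustrate: rewriting even a few "tag" bytes changes the md5 while a player may still read the audio fine. The file name and tag text below are fabricated stand-ins for a real mp3.

```shell
#!/bin/sh
# Tiny illustration: rewriting a few "tag" bytes in a file changes its md5,
# although a player may still read the audio data fine. song.bin and the
# tag text are fabricated stand-ins for a real mp3.
printf 'AUDIO-DATA tag=old\n' > song.bin
md5sum song.bin | awk '{print $1}' > hash_before.txt
printf 'AUDIO-DATA tag=new\n' > song.bin     # "retag" the file
md5sum song.bin | awk '{print $1}' > hash_after.txt
cmp -s hash_before.txt hash_after.txt || echo "hash changed after retagging"
```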
clevoir Posted January 4, 2016

Having done some more detective work, it appears that all files have had MD5 hash values created in the format *.file_extension.md5 by the plugin, regardless of whether they already had hash values in the format *.md5 or not. The plugin was installed in December, so files after this date are OK, but files before this date have two MD5 hash values.

Moving forward, it appears that hash values in the format *.file_extension.md5 are going to be used both in Squid's plugin and in the competing Bunker plugin.

How am I going to delete the hash values in the old *.md5 format while leaving the newer hash values, automatically too, as I have 28TB of files!!! Or am I best deleting all md5 hash values and then letting the plugin recalculate them again?
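One possible automated route, sketched below on a throwaway fixture. The trick: for a new-style hash ("The Circus.wma.md5") the name minus ".md5" is a real media file, while for an old-style hash ("The Circus.md5") it is not, so old-style hashes can be removed only when a new-style twin exists. This is an assumption-laden sketch, not a vetted tool; corz's combined "foo.md5" files can hold hashes for several sibling files, so try it on a copy first.

```shell
#!/bin/sh
# Hypothetical cleanup sketch -- try it on a copy first. Old-style hashes
# ("name.md5") are removed only when a new-style twin ("name.<ext>.md5")
# exists, so lone checksums survive. The demo fixture stands in for a share.
mkdir -p demo
printf 'x'    > "demo/The Circus.wma"
printf 'old'  > "demo/The Circus.md5"       # old-style, has a twin: delete
printf 'new'  > "demo/The Circus.wma.md5"   # new-style: keep
printf 'only' > "demo/Lone Track.md5"       # old-style, no twin: keep
find demo -type f -name '*.md5' | while IFS= read -r sum; do
    base="${sum%.md5}"
    if [ -e "$base" ]; then continue; fi     # new-style: base media file exists
    set -- "$base".*.md5                     # glob for new-style twins
    if [ -e "$1" ]; then rm -- "$sum"; fi    # old-style with a twin: remove
done
```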
bonienl Posted January 4, 2016

It's not clear to me what the old format looks like; perhaps you can show an example. The sed utility can be your friend for selectively deleting the old-format lines while preserving the new-format lines.
bonienl Posted January 4, 2016

I am very grateful that bonienl is working on a GUI for bunker, which should do everything and more and be far more polished than this plugin is. (I don't particularly agree with his approach, even though it is sound.) Once bonienl's plugin is out of the WIP stage, I was actually going to deprecate this plugin in favour of his (even though for my own purposes I would actually continue to use it).

I don't see the plugins as replacements for each other. We both have our own approach and functionality. It would be a shame if you abandoned your hard work completely; there will be folks in favour of either approach, and yours can also do par2 for error restoration. I guess what I want to say is that even though you want to wind development down, your plugin can still be alive and kicking!
Squid Posted January 4, 2016 (Author)

I don't see the plugins as replacements for each other. We both have our own approach and functionality. It would be a shame if you abandoned your hard work completely; there will be folks in favour of either approach, and yours can also do par2 for error restoration. I guess what I want to say is that even though you want to wind development down, your plugin can still be alive and kicking!

It was the par2 that really took the fun out of it (especially since I couldn't justify it for my own needs). Beyond that, the md5 / sha / b2s side is still operational and will remain so, if only because I personally prefer my approach with the separate checksum files that stay portable as you copy the files from one medium to another, etc. But I think it's a fair assumption that your interface will be far more polished than mine.

But, as requested, here is the format of the hash files if you wish to import them.
Basically a standard checksum format, with comments to allow corz to interpret it:

#Squid's Checksum
#
#md5#30-01-10_1239.jpg#[email protected]:26
a75d06dceea03c4a4a98690db387a176 30-01-10_1239.jpg
#md5#30-01-10_1240.jpg#[email protected]:26
a3de82ba450a3ceae1c19f223954b768 30-01-10_1240.jpg
#md5#30-01-10_1305.jpg#[email protected]:26
ac98037b5e2473a66d00eaf1bffcbeeb 30-01-10_1305.jpg
#md5#30-01-10_1340.jpg#[email protected]:26
bff394ccda3d3dd663cb2b53f2b0d485 30-01-10_1340.jpg
#md5#30-01-10_2007.jpg#[email protected]:26
155027556f2179913f3b20a9baca76ee 30-01-10_2007.jpg
#md5#30-01-10_2331.jpg#[email protected]:26
a6d9ffb589bbe70f1e16c2869af4b0de 30-01-10_2331.jpg
#md5#31-01-10_0945.jpg#[email protected]:26
ca669c4f2e543da2465b3239f8c02be6 31-01-10_0945.jpg

Each comment line is the algorithm (md5 | sha | blake2), followed by the filename, followed by the mtime of the file itself ** (in corz' obscure format).

sha as an identifier is actually sha256; blake2 as an identifier is actually blake2s.

There is no "*" on the filenames to indicate a binary file. (Can't remember right now, but one of the checkers that I tested had issues with the "*", and it's pretty much deprecated nowadays anyway; everything assumes binary.)

No paths are included anywhere on the filenames; everything is relative, for portability.

** My mtime matches the file exactly. But if you do allow imports of pre-existing hashes, be aware that corz has 2 bugs with the mtime: it can be out by +/- 1 second, and it can also be out by exactly +/- 1 hour (a bug in corz due to daylight savings).
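A hash file in this shape can also be checked with stock tools: drop the "#" comment lines and feed the rest to md5sum -c. The sample file and its checksum file below are fabricated for the demo, and the mtime field is left as a placeholder since its exact format is corz-specific.

```shell
#!/bin/sh
# Self-contained demo: build a file plus a checksum file in the comment-plus-
# hash-line shape above, then verify it by stripping the comment lines and
# piping the rest to md5sum -c. sample.txt and the mtime field are made up.
printf 'hello world\n' > sample.txt
hash=$(md5sum sample.txt | awk '{print $1}')
{
    printf '#Squid'\''s Checksum\n#\n'
    printf '#md5#sample.txt#mtime-goes-here\n'
    printf '%s  sample.txt\n' "$hash"
} > sample.txt.md5
grep -v '^#' sample.txt.md5 | md5sum -c -    # prints "sample.txt: OK"
```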
clevoir Posted January 4, 2016

Thanks for your reply. The file format for the md5 hash values was file_name.md5, using solely corz on a manual basis. Listed below are the settings chosen in the corz settings ini file:

; Individual hashes.. [bool] [default: individual_hashes=false]
;
; command-line switch: i
;
; Instruct checksum to *always* create individual hash files, even when working
; with folders. This is the same as passing "i" in your switches. Most useful
; when combined with file mask groups, e.g. crim(movies). Most people will want
; to leave this set to false.
;
individual_hashes=true

; Algorithm.. [string: md5/sha1] [default: algorithm=md5]
;
; command-line switch: s (use sha1)
; command-line switch: 2 (use BLAKE2)
;
; Which algorithm to use when creating hashes?
; Choices are currently "md5", "sha1" or "blake2" (no quotes).
;
algorithm=md5

The following setting looks like I could use it to unify all hash values to the same format, i.e. *.file_extension.md5:

; Add file extensions? [bool] [default: file_extensions=false]
;
; command-line switch: e
;
; This is for creation of individual per-file checksums only. For example,
; if you create a checksum of the file "foo.avi", the checksum will be named
; either "foo.hash" or "foo.avi.hash", the former being the default (false).
;
; Setting this to false, as well as being more elegant, enables checksum to
; store hashes for "foo.txt" and "foo.nfo", (if such files exist next to
; "foo.avi") all inside "foo.hash", which is neat.
;
file_extensions=false
;
; NOTE: if checksum encounters existing checksum files *not* using your
; current file_extensions setting, it will rename them to match, so in our
; example, if "foo.avi.md5" existed, it would be renamed to "foo.md5", and
; any other foo.files added to that single checksum file.
I appreciate that Checksum Suite and the Bunker GUI are two different approaches; at the moment I am concerned just to get all my md5 hash files into the same format.
clevoir Posted January 4, 2016

I have now found some files which have only the original file_name.md5 hash values, just to complicate matters.

I have found that if I set the switch in the option below to true and then manually choose Create Hash Values using the corz utility, it will successfully rename file_name.md5 hashes to file_name.file_extension.md5 hashes. The following setting looks like I could use it to unify all hash values to the same format, i.e. file_name.file_extension.md5:

; Add file extensions? [bool] [default: file_extensions=false]
;
; command-line switch: e
;
; This is for creation of individual per-file checksums only. For example,
; if you create a checksum of the file "foo.avi", the checksum will be named
; either "foo.hash" or "foo.avi.hash", the former being the default (false).
;
; Setting this to false, as well as being more elegant, enables checksum to
; store hashes for "foo.txt" and "foo.nfo", (if such files exist next to
; "foo.avi") all inside "foo.hash", which is neat.
;
file_extensions=false
;
; NOTE: if checksum encounters existing checksum files *not* using your
; current file_extensions setting, it will rename them to match, so in our
; example, if "foo.avi.md5" existed, it would be renamed to "foo.md5", and
; any other foo.files added to that single checksum file.

I am trying it out on one share.

This could be a solution, but say I have a file with two hashes, one in file_name.md5 and one in file_name.file_extension.md5; the above method would result in 2 file_name.file_extension.md5 hashes named the same.
clevoir Posted January 4, 2016

Just copied a file into a folder where I have 2 checksums.

Folder contains:
The Circus.wma
The Circus.md5 (how the checksum was created originally)
The Circus.wma.md5 (how the checksum was subsequently created by the plugin)

If I run corz manually with the file-extension switch set to true, it sees The Circus.wma and The Circus.wma.md5, but it will not delete the The Circus.md5 file.

If the folder contains just:
The Circus.wma
The Circus.md5

then on running create checksum manually, the corz program will rename The Circus.md5 to The Circus.wma.md5.

So I seem to have some files with file_name.md5 checksums, some with file_name.file_extension.md5 checksums, and some with both. Where both exist, in some cases both report the file as intact, and in other cases one checksum will report the file is intact and the other that the file is corrupted. Where files are shown as corrupted, they are still playable.

I seem to have a right mess here, and with the benefit of hindsight perhaps I should have deleted all *.md5 files before installing the plugin. What would be the best way forward: delete all *.md5 files and start again?
Squid Posted January 4, 2016 (Author)

What I would do is go to your share in Windows, search for *.md5, and then delete the ones that you don't want.
garycase Posted January 4, 2016

I'd just delete *.md5 and then redo your checksums. Not the quickest way in terms of how long it will take, but certainly the quickest in terms of "human time" => just a one-line command followed by initiating the checksum computations. MANY hours later it will be done ... but you don't have to do anything while that's happening.
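That one-line command could look like the commented find below (the share path is just an example; list first, delete second). The scratch fixture only demonstrates that the media files survive while every *.md5 goes.

```shell
#!/bin/sh
# Nuke-and-redo sketch on a scratch fixture. On the real share you would run
# the commented commands (the path is an example):
# find /mnt/user/Pictures -type f -name '*.md5'           # dry run: list them
# find /mnt/user/Pictures -type f -name '*.md5' -delete   # the one-liner
mkdir -p scratch
touch scratch/a.mp3 scratch/a.md5 scratch/a.mp3.md5
find scratch -type f -name '*.md5' -delete
ls scratch    # only a.mp3 remains
```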
danioj Posted January 6, 2016

What are the key differences (with respect to reliability, outcomes, speed and overheads) between this plugin and the Dynamix File Integrity plugin created by bonienl? I can see that this plugin clearly allows for recovery via par2, but I'd like to know if there are any more big differences. If it is only the ability for file recovery, I can already do that from my backup, and vice versa. I find it unlikely that both my main and backup servers would experience rot on the same file at the same time. Horses for courses; I am just looking at the benefits of one over the other.