sincero Posted July 8, 2015

"Thanks a lot! That's a shame for some of my more frequently changing files. I'll have to think about it; most of the media should be fine, though."

"You can use the -u command in combination with the -D <time> option; this allows you to update only files which have been modified in the last <time> period. E.g. bunker -u -D 1h /mnt/user/files will update those files modified in the last hour. Note: files must have initially been added using the -a command."

My worry was more along these lines: say I changed 50 documents in the last week. Don't I now need to manually verify that each one is OK before changing its checksum? I can't just let it auto-update, because a file might have gone bad since I edited it, in which case it'll get the "bad" checksum and I might not know until it's too late.

bonienl Posted July 8, 2015 Author

Well, you have to work out a protection strategy for yourself, and this really depends on how you are using the system. A possible scenario in your case (and you may want to automate this in cron):

1. At regular intervals, e.g. once a day, run the -a command to add any new files
2. Run the -u -D <time> command to update files after your own modifications
3. Run the -v command, say every week, to detect any corruption

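The three steps above can be written down as command lines before they go into a scheduler. The following is only a sketch in Python: it builds the argument lists and runs nothing. The helper name `bunker_command` is made up for illustration, and it assumes that -a and -v take a path in the same way as the `bunker -u -D 1h /mnt/user/files` example given earlier in the thread.

```python
def bunker_command(action, path, window=None):
    """Build the argument list for one bunker run (nothing is executed)."""
    # Flags as named in the thread: -a (add), -u (update), -v (verify).
    flags = {"add": "-a", "update": "-u", "verify": "-v"}
    args = ["bunker", flags[action]]
    if window is not None:
        # -D <time> restricts the run to files modified within the window.
        args += ["-D", window]
    return args + [path]


# Example share path, taken from the earlier example in the thread.
SHARE = "/mnt/user/files"

daily = [bunker_command("add", SHARE)]                        # step 1
after_editing = [bunker_command("update", SHARE, window="1h")]  # step 2
weekly = [bunker_command("verify", SHARE)]                    # step 3
```

Each list could be handed to `subprocess.run` when the schedule fires. Given the concern raised earlier about a bad file silently receiving a fresh checksum, it may be safer to schedule only steps 1 and 3 and run step 2 by hand right after editing.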
archedraft Posted July 8, 2015

Have you had any thoughts of turning this into a plugin that would integrate with the v6 web GUI and give you visual options to schedule bunker to update files and add new files? It could also let you set up periodic verification checks via the GUI.

bonienl Posted July 8, 2015 Author

"Have you had any thoughts of turning this into a plugin that would integrate with the v6 web GUI and give you visual options to schedule bunker to update files and add new files? It could also let you set up periodic verification checks via the GUI."

Bunker really started as a side project where I needed some protection 'scheme' for myself. I did not intend to build a GUI front-end. It looks like only a select few people are using it.

archedraft Posted July 8, 2015

"Bunker really started as a side project where I needed some protection 'scheme' for myself. I did not intend to build a GUI front-end. It looks like only a select few people are using it."

No worries, I thought I would inquire. I've still been using Corz, but my end goal is to have a periodic scheduled check that is installed on unRAID. Bunker seems like it will meet those requirements: adding new files, updating modified files, and checking for corruption.

I wonder if only a select few are using it because the rest have no idea they should have a way to verify data...

archedraft Posted July 8, 2015

"Well, you have to work out a protection strategy for yourself, and this really depends on how you are using the system. A possible scenario in your case (and you may want to automate this in cron): 1. At regular intervals, e.g. once a day, run the -a command to add any new files. 2. Run the -u -D <time> command to update files after your own modifications. 3. Run the -v command, say every week, to detect any corruption."

I have a question on step #2. Does it only update the checksum if the file date has been modified, or does it update if the checksum has changed? If it's just updating based on the modified date then, hypothetically, if corruption occurred the modified file date should still be the same, and we would find out once we ran the -v command. If it's updating based on a changed checksum, then the only way to tell if the file is corrupt would be to open that file?

bonienl Posted July 8, 2015 Author

"I have a question on step #2. Does it only update the checksum if the file date has been modified, or does it update if the checksum has changed? If it's just updating based on the modified date then, hypothetically, if corruption occurred the modified file date should still be the same, and we would find out once we ran the -v command. If it's updating based on a changed checksum, then the only way to tell if the file is corrupt would be to open that file?"

-D <time> acts as a filter, which means that only files modified within the period specified by <time> will be processed. When combined with the -u command, these files will get their checksum updated if a mismatch is found (e.g. because the file content has changed).

archedraft Posted July 8, 2015

"-D <time> acts as a filter, which means that only files modified within the period specified by <time> will be processed. When combined with the -u command, these files will get their checksum updated if a mismatch is found (e.g. because the file content has changed)."

OK, I think I understand, but to clarify: if I use -u, it will update the stored checksum of any file whose checksum has changed, regardless of whether the file is corrupt, correct?

trurl Posted July 8, 2015

"-D <time> acts as a filter, which means that only files modified within the period specified by <time> will be processed. When combined with the -u command, these files will get their checksum updated if a mismatch is found (e.g. because the file content has changed)."

"OK, I think I understand, but to clarify: if I use -u, it will update the stored checksum of any file whose checksum has changed, regardless of whether the file is corrupt, correct?"

I think he means that it uses the file timestamp to decide whether a file has been modified since the time specified in -D. If it were based instead on whether the checksum had changed, then it would have to actually compute the checksum of every file to know whether it had changed or not.

bonienl Posted July 8, 2015 Author

"OK, I think I understand, but to clarify: if I use -u, it will update the stored checksum of any file whose checksum has changed, regardless of whether the file is corrupt, correct?"

Yes, any mismatch will get corrected. Bunker cannot know whether the mismatch is due to an intended change or to corruption.

bonienl Posted July 8, 2015 Author

"I think he means that it uses the file timestamp to decide whether a file has been modified since the time specified in -D. If it were based instead on whether the checksum had changed, then it would have to actually compute the checksum of every file to know whether it had changed or not."

Not quite. The -u (update) command recalculates the checksum and updates any file which has a different checksum stored. Using the -D option limits the scope of files which are going to be processed.

Say for the past 2 hours you have been working on several files and you KNOW their content has changed, so a new checksum needs to be calculated for those files. Issuing the command "bunker -u -D 2h /path/to/files" then ensures that only those files get updated.

Alternatively, you can run "bunker -u /specific/file/name" and update each file individually.

archedraft Posted July 8, 2015

"Yes, any mismatch will get corrected. Bunker cannot know whether the mismatch is due to an intended change or to corruption."

Understood. Next question: is it possible to add the date the checksum was created to the extended attributes of the file, and then have a bunker flag that compares the date the checksum was created against the file's modified date? I believe this is how Corz keeps track of whether a file has been changed by the user or corruption has occurred. Corz keeps that information in a separate file, though; maybe it is not possible to store that information in the extended attributes?

trurl Posted July 8, 2015

"Not quite. The -u (update) command recalculates the checksum and updates any file which has a different checksum stored. Using the -D option limits the scope of files which are going to be processed. Say for the past 2 hours you have been working on several files and you KNOW their content has changed, so a new checksum needs to be calculated for those files. Issuing the command 'bunker -u -D 2h /path/to/files' then ensures that only those files get updated."

Does it use the timestamp to limit the scope or not? I think we are talking past each other.

bonienl Posted July 8, 2015 Author

"Does it use the timestamp to limit the scope or not? I think we are talking past each other."

Sorry, I misunderstood you. Indeed, it uses the timestamp (the file's modified date stamp) to include or exclude files.

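A timestamp filter of the kind described for -D can be sketched in a few lines of Python. This is not bunker's actual code: the function name `modified_within` and the choice to express the window in seconds are assumptions made for illustration.

```python
import os
import time


def modified_within(paths, window_seconds, now=None):
    """Return the paths whose modification time falls inside the window."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    # Only the modification timestamp is consulted; no checksum is computed
    # here, which is why such a filter is cheap compared with hashing.
    return [p for p in paths if os.stat(p).st_mtime >= cutoff]
```

Only the files returned by such a filter would then have their checksums recalculated, so a file that was corrupted without its modified date changing would fall outside the window and keep its old stored checksum.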
archedraft Posted July 8, 2015

"Does it use the timestamp to limit the scope or not? I think we are talking past each other."

I believe using -u -D 4h will find all the files that have been modified in the last 4 hours and recalculate the checksums of all those files. I'm assuming that it will recalculate the checksums of all files modified in the last 4 hours, regardless of whether the checksum is different or not?

bonienl Posted July 8, 2015 Author

"Understood. Next question: is it possible to add the date the checksum was created to the extended attributes of the file, and then have a bunker flag that compares the date the checksum was created against the file's modified date? I believe this is how Corz keeps track of whether a file has been changed by the user or corruption has occurred. Corz keeps that information in a separate file, though; maybe it is not possible to store that information in the extended attributes?"

The scan date is stored together with the checksum. However, I didn't write a function to compare the stored scan date against the file modified date. An interesting idea though!

archedraft Posted July 8, 2015

"The scan date is stored together with the checksum. However, I didn't write a function to compare the stored scan date against the file modified date. An interesting idea though!"

Awesome! That would certainly be a useful flag to include! That is really the only thing stopping me from jumping ship from Corz. Any chance you may look into that in the future?

bonienl Posted July 8, 2015 Author

"I believe using -u -D 4h will find all the files that have been modified in the last 4 hours and recalculate the checksums of all those files. I'm assuming that it will recalculate the checksums of all files modified in the last 4 hours, regardless of whether the checksum is different or not?"

Correct.

trurl Posted July 8, 2015

"I believe using -u -D 4h will find all the files that have been modified in the last 4 hours and recalculate the checksums of all those files. I'm assuming that it will recalculate the checksums of all files modified in the last 4 hours, regardless of whether the checksum is different or not?"

Maybe I don't understand the question, but how could it know if the checksum is different without recalculating the checksum?

archedraft Posted July 8, 2015

"Maybe I don't understand the question, but how could it know if the checksum is different without recalculating the checksum?"

Hmm, that is interesting. If you look at the OP:

-u update mismatched hash keys with correct hash key attribute (may use -f)

So if I am understanding that flag, -u does check whether the checksum is mismatched and recalculates it if it has changed? The flaw with this method is that it doesn't distinguish between user changes and corruption.

EDIT: unless "updating mismatched keys" really means it just recalculates everything?

trurl Posted July 8, 2015

"Hmm, that is interesting. If you look at the OP: -u update mismatched hash keys with correct hash key attribute (may use -f). So if I am understanding that flag, -u does check whether the checksum is mismatched and recalculates it if it has changed? The flaw with this method is that it doesn't distinguish between user changes and corruption. EDIT: unless 'updating mismatched keys' really means it just recalculates everything?"

When comparing for a mismatch, what is it going to compare against if not the calculation?

bonienl Posted July 8, 2015 Author

"Hmm, that is interesting. If you look at the OP: -u update mismatched hash keys with correct hash key attribute (may use -f). So if I am understanding that flag, -u does check whether the checksum is mismatched and recalculates it if it has changed? The flaw with this method is that it doesn't distinguish between user changes and corruption. EDIT: unless 'updating mismatched keys' really means it just recalculates everything?"

There is a difference between -v (verify) and -u (update).

If the intention is to find files with possible corruption, then -v must be used. This lists all files which have a mismatch between the checksum stored in the extended attributes and the recalculated checksum. Next, YOU have to decide which files have an expected mismatch, because they were changed, and which ones are unexpected (possible corruption).

The -u command implicitly expects that files with a different checksum (again, this is compared between the stored checksum and the recalculated checksum) need to get updated. This means the recalculated checksum will be written to the extended attributes, overwriting the previous value.

Several options exist to shorten the search list, which translates into faster execution (fewer files need to be recalculated), and these options may be combined with the above commands.

Hope this makes sense.

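The difference between verify and update described above can be modelled in a short Python sketch. It is an illustration under stated assumptions, not bunker's implementation: a plain dict stands in for the extended attributes, file contents are passed as bytes, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hash the file content (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def verify(files, stored):
    """Report files whose recalculated checksum differs from the stored one.

    The stored checksums are left untouched; deciding whether a mismatch
    is an expected change or corruption is up to the user.
    """
    return [name for name, data in files.items()
            if name in stored and stored[name] != checksum(data)]


def update(files, stored):
    """Overwrite the stored checksum of every mismatching file."""
    changed = verify(files, stored)
    for name in changed:
        # The previous value is lost, whatever the cause of the mismatch.
        stored[name] = checksum(files[name])
    return changed
```

Both functions recalculate the checksum of every file they are given; the only difference is that `update` writes the new value back. Running a verify after an update therefore reports nothing, even if one of the mismatches was caused by corruption.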
RobJ Posted July 8, 2015

I think what some are hoping here, though, is that if you are storing a copy of the file modification date, then there is an opportunity to determine with some accuracy whether a checksum mismatch is a user modification or a corruption. If you detect a timestamp mismatch, then you can report "File was modified, updating checksum". If the timestamp matches but the checksum does not, then you can report "Probable file corruption".

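The decision rule proposed above can be sketched in Python as follows. This is a hypothetical illustration of the idea, not a bunker feature: the `Record` type, the `classify` function and the use of SHA-256 are all assumptions, and the record is assumed to hold the modification time that the file had when its checksum was stored.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Record:
    checksum: str  # checksum stored at the last scan
    mtime: float   # file modification time seen at the last scan


def classify(data: bytes, mtime: float, record: Record) -> str:
    """Classify a file by comparing it with its stored record."""
    current = hashlib.sha256(data).hexdigest()
    if current == record.checksum:
        return "ok"
    if mtime != record.mtime:
        # Content and timestamp both changed: most likely a user edit.
        return "modified"
    # Content changed while the timestamp stayed the same.
    return "probable corruption"
```

A tool built on this rule could update the checksum automatically in the "modified" case and only ask for attention in the "probable corruption" case. The rule is a heuristic: anything that rewrites a file and also resets its timestamp would still be classified as a user modification.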
TheDragon Posted July 8, 2015

"I think what some are hoping here, though, is that if you are storing a copy of the file modification date, then there is an opportunity to determine with some accuracy whether a checksum mismatch is a user modification or a corruption. If you detect a timestamp mismatch, then you can report 'File was modified, updating checksum'. If the timestamp matches but the checksum does not, then you can report 'Probable file corruption'."

Hit the nail on the head, I think! +1

bonienl Posted July 8, 2015 Author

I'll see if I can find that nail.