Dynamix File Integrity plugin


bonienl


I'm not sure this is working right for me.

 

I have 6 disks and built and exported all of them. A few days later I noticed disk 5 or 6 (I forget which) was not up to date on the build and export. I recreated them, fine. Then last night I had a scheduled check at 3am. It started disks 3 and 4 only. Why not 1, 2, 5 and 6? And then I also noticed that 5 and 6 are not up to date again. What gives? I thought this was supposed to keep itself up to date as files are copied to the array.

 

Can anyone help me get this set up right? Or what can I check to see why it's not working?

 

Also, when I do an export of a disk where the build is up to date but the export is not, I get a lot of error messages:

 

/mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**

 

 

One more thing. Now that I updated the plugin, the find command keeps spiking my CPU usage for 2-3 minutes every 2-3 minutes. What is this doing? After removing the plugin this stopped; installing it again creates the same behaviour. Any way to go back to a version without the find feature?

Link to comment

Hi bonienl,

 

Thank you very much for the great plugin! I converted all my disks to XFS and I'm very happy with the interface, usability and extended data security.

 

I know your plugin uses the extended attributes to save the hash information, and I really like that. But as I often check single directories for correctness using Corz from my Windows machine, I would like a user option to create Corz-compatible hash files per directory, too. I know this is redundant information, and I know there is another plugin doing that, but as its author discontinued development it might break over time, leaving me stranded. Please consider implementing that option in your plugin.

 

Also, another question. I haven't tested long enough to know, but what happens if a time slot for a check is missed? I guess you are using cron to schedule the tasks, which would skip such tasks and wait until the next execution time. As my server sleeps rather often, this prevents task execution most of the time with the other plugin, which is a big disadvantage for me. IMHO it would be better to start missed tasks as soon as the server wakes, which would keep it awake until completion and only then let it sleep again, if nothing else is going on. But maybe your plugin already does this?

 

If not, there was a post about installing anacron on unRAID recently, so it should be possible to get another scheduler running on unRAID. I also remember there being PHP alternatives, which could be a possibility, too.
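
For illustration, an anacron job is a single line in /etc/anacrontab: a period in days, a delay in minutes, a job identifier, and the command. A hypothetical daily entry (the command path is a placeholder, not a real plugin call) could look like:

# period(days)  delay(min)  identifier       command
1               15          integrity-check  /path/to/verification-command

Unlike cron, anacron runs a job at the next opportunity if the machine was asleep when the slot passed.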

 

cheers,

 

SlrG

 

 

Link to comment

I'm not sure this is working right for me.

 

I have 6 disks and built and exported all of them. A few days later I noticed disk 5 or 6 (I forget which) was not up to date on the build and export. I recreated them, fine. Then last night I had a scheduled check at 3am. It started disks 3 and 4 only. Why not 1, 2, 5 and 6? And then I also noticed that 5 and 6 are not up to date again. What gives? I thought this was supposed to keep itself up to date as files are copied to the array.

 

Can anyone help me get this set up right? Or what can I check to see why it's not working?

 

Also, when I do an export of a disk where the build is up to date but the export is not, I get a lot of error messages:

 

/mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**

 

 

One more thing. Now that I updated the plugin, the find command keeps spiking my CPU usage for 2-3 minutes every 2-3 minutes. What is this doing? After removing the plugin this stopped; installing it again creates the same behaviour. Any way to go back to a version without the find feature?

 

The hashes are stored in the extended attributes of a file, and should stay up to date automatically when protection is enabled.
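
If you want to see this for yourself, the stored attributes can be listed from the CLI with getfattr (part of the standard attr tools):

getfattr -d /mnt/disk1/Movies/example.mkv

This dumps the user-namespace attributes and their values for the file. The path above is a placeholder, and the exact attribute name you will see is an assumption on my part; it follows the hashing method selected in the settings.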

 

Export to a file is optional and a manual action; it does not affect the working of the plugin. The status shows when an existing export file becomes outdated, but the user decides when to update it.

 

The disk verification tasks table defines how and when verification of disks takes place. See the online help for more information.

 

I suppose the name of the folder/file isn't really

**PRIVATE**

(asterisks in a folder/file name are not allowed). The message "no export of file" happens when no hash key value is stored in the extended attributes of the file. The usual approach is to rebuild.

 

Are you sure you are using XFS as the file system on disk 6, or have you changed the hashing method from one to another midway?

 

Link to comment

So I installed this plugin but have yet to use it. As I see it, this is some sort of warning system that 'bitrot' has occurred. Question: how likely is this to happen? Is it more prevalent on HDDs than on SSDs? If a file has 'bitrot', would it be completely unreadable?

 

Just wondering as I don't have the means to have everything from my Unraid server backed up...

Thanks.

Link to comment

So I installed this plugin but have yet to use it. As I see it, this is some sort of warning system that 'bitrot' has occurred. Question: how likely is this to happen? Is it more prevalent on HDDs than on SSDs? If a file has 'bitrot', would it be completely unreadable?

 

Just wondering as I don't have the means to have everything from my Unraid server backed up...

Thanks.

Bitrot (i.e. corruption not detected by the hardware) seems to be exceedingly rare, and it is only in the last few years that systems capable of detecting it have started to become mainstream. Much more likely are hardware errors such as a failure reading a sector, and unRAID does protect against those errors.

 

Bitrot does not mean a file cannot be read - merely that the contents are not what they should be, and you are not automatically warned that this is the case when you open the file. For media-type files (which I think are one of the main uses for unRAID) it is quite likely that the corruption would not even be noticeable. For other file types (e.g. documents) the corruption would almost certainly be noticed, as garbage would be displayed.

Link to comment

I'm not sure this is working right for me.

 

I have 6 disks and built and exported all of them. A few days later I noticed disk 5 or 6 (I forget which) was not up to date on the build and export. I recreated them, fine. Then last night I had a scheduled check at 3am. It started disks 3 and 4 only. Why not 1, 2, 5 and 6? And then I also noticed that 5 and 6 are not up to date again. What gives? I thought this was supposed to keep itself up to date as files are copied to the array.

 

Can anyone help me get this set up right? Or what can I check to see why it's not working?

 

Also, when I do an export of a disk where the build is up to date but the export is not, I get a lot of error messages:

 

/mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**
Mar 30 16:16:28 FILESERVER bunker: error: no export of file: /mnt/disk6/TV/**PRIVATE**

 

 

One more thing. Now that I updated the plugin, the find command keeps spiking my CPU usage for 2-3 minutes every 2-3 minutes. What is this doing? After removing the plugin this stopped; installing it again creates the same behaviour. Any way to go back to a version without the find feature?

 

The hashes are stored in the extended attributes of a file, and should stay up to date automatically when protection is enabled.

 

Export to a file is optional and a manual action; it does not affect the working of the plugin. The status shows when an existing export file becomes outdated, but the user decides when to update it.

 

The disk verification tasks table defines how and when verification of disks takes place. See the online help for more information.

 

I suppose the name of the folder/file isn't really

**PRIVATE**

(asterisks in a folder/file name are not allowed). The message "no export of file" happens when no hash key value is stored in the extended attributes of the file. The usual approach is to rebuild.

 

Are you sure you are using XFS as the file system on disk 6, or have you changed the hashing method from one to another midway?

 

All my disks are XFS.  I haven't changed the hashing method.  I can rebuild the hashes on this disk and see if that fixes it.

 

My main concern now is the FIND feature. It is using a significant portion of my CPU when I don't want to do anything with finding duplicates.

Link to comment

My main concern now is the FIND feature.  This is using a significant portion of my CPU when I am not wanting to do anything with finding duplicates.

 

When you open the integrity control page, it starts a background job every 3 minutes which updates the build and export status of each disk. This is unrelated to the find command.

 

You can check if the background job has issues by running it in the CLI:

/etc/cron.daily/exportrotate -q

 

 

Link to comment

Can I ask about the integrity plugin and whether anyone has this AND Corz running against their files?

 

I checksum all of my data with Corz before moving it to its final resting place on unRAID; it is then verified and deleted on the PC.

 

So ... all of my data has a corresponding hash file in each folder, and I was wondering whether this built-in integrity check would (or should - maybe it's better to have both?) replace my Corz activity, which itself takes considerable time.

 

The only downside of not Corz'ing the data before moving it across the network, as I see it, is the potential for corruption at that point.

 

Am I being too picky? Any guidance on what's best to do here ... drop Corz or not?

Link to comment

Can I ask about the integrity plugin and whether anyone has this AND Corz running against their files?

 

I checksum all of my data with Corz before moving it to its final resting place on unRAID; it is then verified and deleted on the PC.

 

So ... all of my data has a corresponding hash file in each folder, and I was wondering whether this built-in integrity check would (or should, maybe it's better to have both?) replace my Corz activity, which itself takes considerable time.

 

The only downside of not Corz'ing the data before moving it across the network, as I see it, is the potential for corruption at that point.

 

Am I being too picky? Any guidance on what's best to do here ... drop Corz or not?

Not really answering your question, but your talk of deleting the original after transferring to unRAID makes me wonder whether you have backups. unRAID is not a backup unless it is used to store additional copies of files you have elsewhere.
Link to comment

You've commented on that in an earlier post of mine and I understand your point. Yes ... I have another backup server, thanks! The deletions are from the Windows platform; all Corz activity is done there first, which is intensive. I'd like to know if I should move to this plugin as a replacement?

 

Link to comment

I know your plugin uses the extended attributes to save the hash information and I really like that. But as I often check single directories for correctness using corz from my windows machine I would like an user option to create corz compatible hash files per directory, too. I know this is redundant information and I know there is another plugin doing that, but as the author discontinued development it might break over time leaving me stranded. Please consider implementing that option into your plugin.

 

I have added a new function which allows the exported hash files to be converted to a Corz-compatible format.

 

I am not using Corz myself, but from reading about their product I understand that hash files can be generated either per individual file or as a group for a folder. I have opted for the latter.

 

For each disk a separate folder is created which holds all folders as collapsed file names with the extension .hash; inside each file are the hash values of the files of that folder. Let me know if this is useful.
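
As a purely hypothetical illustration of the collapsed naming (the separator shown is an assumption, not necessarily what the plugin uses), a folder /mnt/disk1/TV/Show/Season 1 would be represented by a single export file such as:

disk1/TV_Show_Season 1.hash

with one hash line per file of that folder inside it.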

 

Link to comment

First let me say a big, big thanks for working on this!!! Sadly, I think it is not yet working in a way that would make it useful for Corz users.

 

Corz on Windows works as an Explorer plugin. It recognizes the .hash file in a folder, and I can verify the files by right-clicking on the folder and choosing verify. It will traverse the folder structure, if there are subfolders, and check each of them.

 

So a file which has the folder structure in the filename is not what Corz expects, I think. For the user it would be a lot of work, or would need a script, to rename and move every file to its correct location.

 

If it really has to be created on the flash, it would be better to recreate the folder structure and place in each folder a file named after the folder with a .hash extension. So if the folder is "(500) Days of Summer (2009)", the filename is "(500) Days of Summer (2009).hash". In this file are the hashes of the files in that directory.

 

The user could then move the contents of this folder structure to the corresponding folders on the disks. Alternatively, the .hash files could be placed directly in the correct folders without the need for the user to move them.
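
A minimal sketch of the kind of helper script meant above, assuming the export lives under SRC and that collapsed names use '_' as the path separator (both are assumptions; adjust to what the plugin actually produces):

#!/bin/bash
# Recreate per-folder .hash files from collapsed export names (rough sketch).
SRC=/boot/config/plugins/dynamix.file.integrity/export/disk1   # assumed location
SEP='_'                                                        # assumed separator
DEST=/mnt/disk1

for f in "$SRC"/*.hash; do
  name=$(basename "$f" .hash)        # e.g. "TV_Show_Season 1"
  dir=${name//$SEP//}                # naive: turn separators into slashes
  mkdir -p "$DEST/$dir"
  cp "$f" "$DEST/$dir/$(basename "$dir").hash"   # folder-name.hash, as proposed
done

The naive replacement would mangle legitimate underscores in folder names, so treat it as a starting point only.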

 

I don't know if split levels could create problems with this approach. Maybe a share-based one would be better?

 

Also, I don't know if the conversion is correct yet. Squid's plugin's blake2 hashes, which work with Corz, look like this:

#Squid's Checksum
#
#blake2#(500) Days of Summer.avi#[email protected]:13
be0f6749bdf0256adc86e8fbae852aa9fe3d97ff41cae423c1a25c248ee68e2c  (500) Days of Summer.avi
#blake2#(500) Days of Summer-fanart.jpg#[email protected]:19
86123573b1d5936faa6172b1373533ac0f8753ce7b1b852a969db89df262fe48  (500) Days of Summer-fanart.jpg
#blake2#(500) Days of Summer.nfo#[email protected]:28
a998f9d6293588170f5dc6ede76106f4d55a979a0cb00c7685294f459e61a904  (500) Days of Summer.nfo
#blake2#(500) Days of Summer-poster.jpg#[email protected]:21
a0ceec0663c10ff30eee208ebbc7ad9e0317ef15d42f0664f6faaf8d7210d92e  (500) Days of Summer-poster.jpg
#blake2#(500) Days of Summer.srt#[email protected]:55
726aacb53f77a5c993e4dcebc9d568ae6fb8e62aa29d06a77268fc19d212d2e2  (500) Days of Summer.srt
#blake2#(500) Days of Summer-trailer.flv#[email protected]:15
6f1ca22eb8a5c9a9912c1718a7780cdda12d8aa3fc276b18011461d0683d1d1c  (500) Days of Summer-trailer.flv

 

Yours look like this:

cc0a9ab4e46c0becb545ea087025ff6df7d1fcdfe258a2688bc2fa32f33031f70a0feff76b0ab49f03437bb44a7516e0c722bf469399f168903f6afcb81fec85 *(500) Days of Summer-fanart.jpg
5c412b48daf2dd3370a89e8002d93db2f9638d13a931c46f8ed1e3fe8092781f1b3ed628d219b50f1ccad181e080c809e15efeff32c9a4781a98d120f096595c *(500) Days of Summer-poster.jpg
15b39a2affa6c8d43cd0db79f28a86c7f66cae5e69ac724e2bbde94bba3a2b66bb60baacfef9a4c627d20d6cb5d7cf56a2a840db9d5fabed3bfd7bf5f3207995 *(500) Days of Summer-trailer.flv
8149422db3a51fff06cac9d5fc9f2b674ac2819add48e31e6f81fd7c3309ea01e8b2937dbfc3c0fab5c5daeae871cc94ac75b3726c6a24a78ebdcb5d9701c5a7 *(500) Days of Summer.avi
996cc72972b8fc3248a619398e7b49c081dfd81c3b9b22b1947863fb6794d7cf5a214b88ed1bf4fab2a706be044fc50514cb7e0cc568a6d7b7fc963d998befbf *(500) Days of Summer.nfo
54ae9bf21b417dd6884f07100dffca2bf274d47d14637ef9089205912c857ba2ba20a7e07602f6c7dea8ab30e5f6b191ce096b745946a565f93179f6c0e8f2f5 *(500) Days of Summer.srt

 

I hope I'm not creating too much trouble for you. :(

Link to comment

First let me say a big, big thanks for working on this!!! Sadly, I think it is not yet working in a way that would make it useful for Corz users.

 

Corz on Windows works as an Explorer plugin. It recognizes the .hash file in a folder, and I can verify the files by right-clicking on the folder and choosing verify. It will traverse the folder structure, if there are subfolders, and check each of them.

 

So a file which has the folder structure in the filename is not what Corz expects, I think. For the user it would be a lot of work, or would need a script, to rename and move every file to its correct location.

 

If it really has to be created on the flash, it would be better to recreate the folder structure and place in each folder a file named after the folder with a .hash extension. So if the folder is "(500) Days of Summer (2009)", the filename is "(500) Days of Summer (2009).hash". In this file are the hashes of the files in that directory.

 

The user could then move the contents of this folder structure to the corresponding folders on the disks. Alternatively, the .hash files could be placed directly in the correct folders without the need for the user to move them.

 

I don't know if split levels could create problems with this approach. Maybe a share-based one would be better?

 

Also, I don't know if the conversion is correct yet. Squid's plugin's blake2 hashes, which work with Corz, look like this:

#Squid's Checksum
#
#blake2#(500) Days of Summer.avi#[email protected]:13
be0f6749bdf0256adc86e8fbae852aa9fe3d97ff41cae423c1a25c248ee68e2c  (500) Days of Summer.avi
#blake2#(500) Days of Summer-fanart.jpg#[email protected]:19
86123573b1d5936faa6172b1373533ac0f8753ce7b1b852a969db89df262fe48  (500) Days of Summer-fanart.jpg
#blake2#(500) Days of Summer.nfo#[email protected]:28
a998f9d6293588170f5dc6ede76106f4d55a979a0cb00c7685294f459e61a904  (500) Days of Summer.nfo
#blake2#(500) Days of Summer-poster.jpg#[email protected]:21
a0ceec0663c10ff30eee208ebbc7ad9e0317ef15d42f0664f6faaf8d7210d92e  (500) Days of Summer-poster.jpg
#blake2#(500) Days of Summer.srt#[email protected]:55
726aacb53f77a5c993e4dcebc9d568ae6fb8e62aa29d06a77268fc19d212d2e2  (500) Days of Summer.srt
#blake2#(500) Days of Summer-trailer.flv#[email protected]:15
6f1ca22eb8a5c9a9912c1718a7780cdda12d8aa3fc276b18011461d0683d1d1c  (500) Days of Summer-trailer.flv

 

Yours look like this:

cc0a9ab4e46c0becb545ea087025ff6df7d1fcdfe258a2688bc2fa32f33031f70a0feff76b0ab49f03437bb44a7516e0c722bf469399f168903f6afcb81fec85 *(500) Days of Summer-fanart.jpg
5c412b48daf2dd3370a89e8002d93db2f9638d13a931c46f8ed1e3fe8092781f1b3ed628d219b50f1ccad181e080c809e15efeff32c9a4781a98d120f096595c *(500) Days of Summer-poster.jpg
15b39a2affa6c8d43cd0db79f28a86c7f66cae5e69ac724e2bbde94bba3a2b66bb60baacfef9a4c627d20d6cb5d7cf56a2a840db9d5fabed3bfd7bf5f3207995 *(500) Days of Summer-trailer.flv
8149422db3a51fff06cac9d5fc9f2b674ac2819add48e31e6f81fd7c3309ea01e8b2937dbfc3c0fab5c5daeae871cc94ac75b3726c6a24a78ebdcb5d9701c5a7 *(500) Days of Summer.avi
996cc72972b8fc3248a619398e7b49c081dfd81c3b9b22b1947863fb6794d7cf5a214b88ed1bf4fab2a706be044fc50514cb7e0cc568a6d7b7fc963d998befbf *(500) Days of Summer.nfo
54ae9bf21b417dd6884f07100dffca2bf274d47d14637ef9089205912c857ba2ba20a7e07602f6c7dea8ab30e5f6b191ce096b745946a565f93179f6c0e8f2f5 *(500) Days of Summer.srt

 

I hope I'm not creating too much trouble for you. :(

There are different blake2 algorithms. The File Integrity plugin uses blake2 (the full version); Corz uses blake2s, which is slightly faster to generate and also much shorter in length. Unfortunately, they are incompatible, and IIRC when Corz sees the full blake2 hash it thinks it's SHA256.

 

Also, without the extra information within the hash files (namely the lines starting with #), you're probably best off using MD5, as I'm not 100% sure Corz will recognize SHA without the comment lines (but it will recognize MD5).
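
Incidentally, the plain "hash *filename" layout shown above is the same one the standard coreutils checkers read, so if MD5 is the chosen method the exported file can also be verified server-side, e.g. (paths here are placeholders):

cd "/mnt/disk1/Movies/(500) Days of Summer (2009)"
md5sum -c "(500) Days of Summer (2009).hash"    # checks every file listed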

Link to comment

Bonienl, further to my recent post about the plugin occasionally showing the build as out of date for certain disks, despite me carrying out the initial build, please see the attached screenshot.

 

I have updated the plugin to the latest version, and on checking have found that disk 4 shows the build as out of date.

 

[Screenshot: Build_Problem.png - disk 4 showing build out of date]

Link to comment

My main concern now is the FIND feature. It is using a significant portion of my CPU when I don't want to do anything with finding duplicates.

 

When you open the integrity control page, it starts a background job every 3 minutes which updates the build and export status of each disk. This is unrelated to the find command.

 

You can check if the background job has issues by running it in the CLI:

/etc/cron.daily/exportrotate -q

 

After some testing, this was exactly it. I had the page open, so it was automatically refreshing every 3 minutes. Great to know.

 

Any suggestions why 2 of my disks need to be rebuilt every day? Possibly I have too many files for inotify? I have a few million files (due to Plex appdata daily backups).
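
For what it's worth, the kernel's inotify watch budget is easy to check and raise from the CLI (generic Linux settings, nothing plugin-specific):

cat /proc/sys/fs/inotify/max_user_watches        # current limit
sysctl fs.inotify.max_user_watches=1048576       # raise it for this boot

With a few million files the default limit can plausibly be exhausted, in which case new files would no longer be noticed automatically.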

Link to comment

Thanks bonienl for a fantastic plugin. I know you've made export a manual task now, but is there any reason why we can't have an option for it to run once a day or so automatically? I like having an up-to-date listing of all the files in my array, but I don't want to have to do it manually every night. Unless there's a way to add the task to a cron job?
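
For reference, the generic crontab syntax for a nightly job looks like this (the command is a placeholder, not a real plugin call; which script actually performs the export would need confirming):

0 3 * * * /path/to/export-command    # every day at 03:00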

Link to comment

I have set up the following disk verification tasks:

 

Task 1 - Disk 1 & 2

Task 2 - Disk 3 & 4

Task 3 - Disk 5 & 6

Task 4 - Disk 7 & 8

Task 5 - Disk 9 & 10

Task 6 - Disk 11 & 12

Task 7 - Disk 13 & 14

 

I have set the tasks to run at 00:00 on the 3rd day of every month. However, on checking this morning I only have emails stating that Task 5 started and completed. I did not have a parity check running at the same time, which could have paused the tasks.

 

I have checked the file integrity control and file integrity settings tabs and cannot see anything obvious to indicate why only one task ran, or whether any tasks are running at the moment.

 

On checking the syslog, only instructions for disks 9 & 10 were issued:

 

Apr  3 00:00:07 Tower bunker: Verify task for disk10 started, total files size 747 GB

Apr  3 00:00:09 Tower bunker: Verify task for disk9 started, total files size 1.67 TB

 

Link to comment

I have set up the following disk verification tasks:

 

Task 1 - Disk 1 & 2

Task 2 - Disk 3 & 4

Task 3 - Disk 5 & 6

Task 4 - Disk 7 & 8

Task 5 - Disk 9 & 10

Task 6 - Disk 11 & 12

Task 7 - Disk 13 & 14

 

I have set the tasks to run at 00:00 on the 3rd day of every month. However, on checking this morning I only have emails stating that Task 5 started and completed. I did not have a parity check running at the same time, which could have paused the tasks.

 

I have checked the file integrity control and file integrity settings tabs and cannot see anything obvious to indicate why only one task ran, or whether any tasks are running at the moment.

 

The purpose of different tasks is to start a different verification selection each time a task is executed. Using different tasks limits the load on the processor: the more disks are put under a single task, the more load this puts on the processor.

 

In your example it will verify 2 disks at the same time on each 3rd day of the month. To complete the verification cycle of all your disks will take 7 months in total, since each month the next pair of disks is verified: task 1 in month 1, task 2 in month 2, and so on through task 7.

 

You can shorten the verification cycle at the expense of higher processor load (by assigning more disks to a task), but you need to be sure that your processor can handle this load; hash calculation is very CPU intensive.

 

Regarding your observation of the build status: yes, it can happen that the status changes from out-of-date to up-to-date. This is because the GUI cannot follow all file changes in real time, but it will eventually catch up.

 

Link to comment

My main concern now is the FIND feature. It is using a significant portion of my CPU when I don't want to do anything with finding duplicates.

 

When you open the integrity control page, it starts a background job every 3 minutes which updates the build and export status of each disk. This is unrelated to the find command.

 

You can check if the background job has issues by running it in the CLI:

/etc/cron.daily/exportrotate -q

 

Any suggestions why 2 of my disks need to be rebuilt every day? Possibly I have too many files for inotify? I have a few million files (due to Plex appdata daily backups).

 

I am pretty sure I figured this out with a lot of testing. I'm running an rsnapshot sync each night to back up my appdata folder to my array. This is not getting picked up by the plugin. Why, though, I am not sure. It is run directly on unRAID via cron; it's not within a plugin. Maybe that's it?

 

I'm going to try excluding this folder tonight and see if that fixes the unbuilt disk.

Link to comment
