Dynamix File Integrity plugin


bonienl

Recommended Posts

I am after some advice. I previously used the other integrity plugin, which was configured to create separate MD5 files and amend them on file changes.

 

This worked automatically, but I was concerned that it was no longer supported, so I changed to this plugin instead.

 

I have built and exported all disks, and as far as I can see the plugin is working OK.

 

However, I have noticed that occasionally some disks indicate that their build status, export status, or both are not up to date. In these cases I have run the build/export tasks manually.

 

I have set the plugin to monitor files, and thought that the build and export functions (after the initial build/export) would be carried out automatically?

Link to comment

I am after some advice. I previously used the other integrity plugin, which was configured to create separate MD5 files and amend them on file changes.

 

This worked automatically, but I was concerned that it was no longer supported, so I changed to this plugin instead.

 

I have built and exported all disks, and as far as I can see the plugin is working OK.

 

However, I have noticed that occasionally some disks indicate that their build status, export status, or both are not up to date. In these cases I have run the build/export tasks manually.

 

I have set the plugin to monitor files, and thought that the build and export functions (after the initial build/export) would be carried out automatically?

 

The plugin works with the extended attributes of the files to attach and verify hash values; it doesn't rely on external files. Export can be used to keep a copy of the calculated hash values, but this is completely optional.

 

An initial build is required, and normally files get automatically updated afterwards (when protection is enabled). A notification is given when files are found which didn't get a hash value attached; these should be exceptions, though.
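
To illustrate the mechanism, here is a minimal Python sketch of hash-in-xattr protection; the attribute name is a placeholder, not the plugin's confirmed attribute:

import hashlib
import os

ATTR = "user.checksum"  # hypothetical attribute name

def blake2_of(path):
    # Stream the file so large files don't need to fit in memory
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest().encode()

def build(path):
    # Build: attach the hash to the file's extended attributes
    os.setxattr(path, ATTR, blake2_of(path))

def verify(path):
    # Verify: recompute and compare; a mismatch means the file was
    # modified or silently corrupted
    return os.getxattr(path, ATTR) == blake2_of(path)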

 

Link to comment

Hey bonienl,

 

I was looking at these scripts:

and I realized both could effectively be replaced by this plugin if it did some parsing of the export files:

  • It could look for files with the same name (not including the path).  This would identify issues with a file existing at both /mnt/disk1/share/file and /mnt/disk2/share/file, for instance. The hashes wouldn't necessarily be the same.
  • It could look for files with identical hashes, regardless of file name, to identify true duplicates.

I think this would be a nice way to get extra value out of these export files, if you are up for it :)
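
A rough Python sketch of both checks, assuming each export line reads "<hash> <path>" with paths like /mnt/disk1/share/file (the real export format may differ):

from collections import defaultdict
from pathlib import PurePosixPath

def find_duplicates(export_lines):
    by_path = defaultdict(set)  # relative path -> disks it appears on
    by_hash = defaultdict(set)  # hash -> full paths with that content
    for line in export_lines:
        digest, path = line.rstrip("\n").split(maxsplit=1)
        parts = PurePosixPath(path).parts  # ('/', 'mnt', 'disk1', 'share', 'file')
        disk, rel = parts[2], "/".join(parts[3:])
        by_path[rel].add(disk)
        by_hash[digest].add(path)
    same_name = {p: d for p, d in by_path.items() if len(d) > 1}  # first case
    same_data = {h: p for h, p in by_hash.items() if len(p) > 1}  # second case
    return same_name, same_data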

 

Interesting... I'll have a look at that.

 

Link to comment

Request: Option for selecting which page displays by default when clicking the plugin.

 

I find that, except for the initial config, 99% of the time I want the "Control" page, so it would be nice to have the option.

I agree with you. Perhaps the simpler solution would be to make the Control page the default, rather than having to set an option? I wonder if any others have a view on this?
Link to comment

That would be better: 2 clicks instead of 3. I was just thinking that I would really like to have it in the main bar. I know the space is limited, but for my NAS-only servers that don't use Dockers/VMs there's still a lot of space, and it would just need a single click. Naturally this should be optional, as many users don't have the space or would not want it there.

Screenshot_2016-03-28_12_18_11.png

Link to comment

I like the idea of making it fewer clicks to verify data. Another idea, based on johnnie's post: it would be pretty awesome to add a row for each disk on the dashboard with a green check (or whatever symbol) showing that all data on that disk is good, and a red X if corrupt data is found.

Link to comment

I like the idea of making it fewer clicks to verify data. Another idea, based on johnnie's post: it would be pretty awesome to add a row for each disk on the dashboard with a green check (or whatever symbol) showing that all data on that disk is good, and a red X if corrupt data is found.

I like this as on my VM server I am at the dashboard most of the time anyway.
Link to comment

Hey bonienl,

 

I was looking at these scripts:

and I realized both could effectively be replaced by this plugin if it did some parsing of the export files:

  • It could look for files with the same name (not including the path).  This would identify issues with a file existing at both /mnt/disk1/share/file and /mnt/disk2/share/file, for instance. The hashes wouldn't necessarily be the same.
  • It could look for files with identical hashes, regardless of file name, to identify true duplicates.

I think this would be a nice way to get extra value out of these export files, if you are up for it :)

 

Interesting... I'll have a look at that.

Yes, you have the keys to solving a lot of data issues using file integrity data. Statistics of many different kinds would be possible.

 

1. Duplicate file analysis

2. Show me different views of all my stuff

3. Who knows...

 

Now you may not want to create all these things, but creating an API to the file integrity extended attributes might enable all kinds of good things.
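
For illustration, such an API would mostly be a thin layer over reads of those attributes, which a few lines of Python can already do (which attribute the plugin actually uses is assumed here, not confirmed):

import os

path = "/mnt/disk1/share/file"  # example path
# List and print every extended attribute on the file; the plugin's
# stored hash would be one of the user.* attributes shown
for attr in os.listxattr(path):
    print(attr, os.getxattr(path, attr))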

 

Link to comment

I have come across some behaviour with this plugin, and I'd like to see if others are experiencing the same thing as me.

 

I started with a full scan and export, checked the option to monitor new files, and set exclusions for .nfo files and Apple metadata. My settings are as indicated in the screenshot below:

 

Screen_Shot_2016_03_29_at_10_51_41_AM.png

 

So I am expecting that, once I have scanned and exported, my interaction with this plugin will be limited to any issues it finds on a monthly basis. Is this a realistic expectation?

 

When the scan executed a couple of days ago (the 27th), I received the following email:

 

Event: unRAID file corruption
Subject: Notice [MAIN] - bunker verify command
Description: Found 5 files with BLAKE2 hash key mismatch
Importance: warning

BLAKE2 hash key mismatch (updated), /mnt/disk4/nas/Documents/.DS_Store was modified
BLAKE2 hash key mismatch (updated), /mnt/disk4/nas/Documents/Daniel/.DS_Store was modified
BLAKE2 hash key mismatch (updated), /mnt/disk4/nas/Documents/Daniel/development/scripts/.DS_Store was modified
BLAKE2 hash key mismatch (updated), /mnt/disk4/nas/Documents/Daniel/development/.DS_Store was modified
BLAKE2 hash key mismatch (updated), /mnt/disk4/nas/.DS_Store was modified

 

I was expecting that .DS_Store files would be ignored, as I have selected to ignore Apple metadata. Is there something I am missing?

 

The exclusion of .nfo files seems to be working just fine.

 

In addition I keep getting emails saying the following:

 

Event: Dynamix file integrity daily update
Subject: Notice [MAIN] - Disk 2, Disk 4, Disk 6 needs Export updating
Description: 29 non-existing files in export file
Importance: normal

 

Why, if I have things set up to monitor new files, should I have to keep going in and doing a manual export? Am I expecting more from the plugin than it is designed to do?

 

Last but not least, when I DO go in there to do the manual export, I find (each time) that not only is an export required, but a scan also needs to be done on one or more disks. See the screenshot below:

 

Screen_Shot_2016_03_29_at_10_58_26_AM.png

 

Now, I can do a Build and Export again, BUT I feel like I am having to manage this plugin and keep monitoring its status to ensure that everything remains built and exported. Is there anything I need to be doing differently, or do my expectations of the plugin need to change?

 

For now, the Build and Export status is back up to date:

 

Screen_Shot_2016_03_29_at_11_06_49_AM.png

 

EDIT: And now, almost 3 hours later, the Build is showing as not up to date for Disk 4. Some files have been copied to Disk 4, BUT I thought that new files were being monitored and things updated accordingly :-\

 

Screen_Shot_2016_03_29_at_12_46_18_PM.png

Link to comment

To set some expectations right...

 

1. If you experience issues with .DS_Store files, it is likely a leftover from the past. Delete those files (the command can be found in an earlier post in this thread); a cleanup sketch follows this list.

2. The operation of this plugin does NOT require external files. After an initial build and enabling of the protection and verification feature, everything should stay up to date automatically. This includes warnings when silent corruption is detected.

3. If files are found afterwards which didn't get a hash checksum in their extended attributes, then a manual Build is required. This is checked once a day, and should be the exception rather than the rule. If this happens frequently, ask yourself how you are creating files and let me know if there is a true omission.

4. Export files are optional and NOT required for the operation of the plugin. These files can be used for recovery or other purposes. They require manual action to stay up to date, i.e. the user decides when to update the export files. This is a change from the earlier versions of this plugin.
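
For item 1, a minimal Python sketch of that cleanup (the command in the earlier post may differ; point the root at the affected disks before running):

import os

root = "/mnt/disk4"  # example root
# Walk the disk and remove Apple metadata files so they no longer
# trigger hash mismatch notifications
for dirpath, dirnames, filenames in os.walk(root):
    for name in filenames:
        if name == ".DS_Store":
            os.remove(os.path.join(dirpath, name))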

 

And taking this opportunity to announce a new version, 2016.03.29.

 

This new version adds the ability to search for duplicate file names and file hashes; these are read from the export files. At the moment it is pretty basic and not a replacement for existing tools. Try it out and see if this is a step in the right direction.

 

 

 

Link to comment

To set some expectations right...

 

1. If you experience issues with .DS_Store files, it is likely a leftover from the past. Delete those files (the command can be found in an earlier post in this thread).

2. The operation of this plugin does NOT require external files. After an initial build and enabling of the protection and verification feature, everything should stay up to date automatically. This includes warnings when silent corruption is detected.

3. If files are found afterwards which didn't get a hash checksum in their extended attributes, then a manual Build is required. This is checked once a day, and should be the exception rather than the rule. If this happens frequently, ask yourself how you are creating files and let me know if there is a true omission.

4. Export files are optional and NOT required for the operation of the plugin. These files can be used for recovery or other purposes. They require manual action to stay up to date, i.e. the user decides when to update the export files. This is a change from the earlier versions of this plugin.

 

And taking this opportunity to announce a new version, 2016.03.29.

 

This new version adds the ability to search for duplicate file names and file hashes; these are read from the export files. At the moment it is pretty basic and not a replacement for existing tools. Try it out and see if this is a step in the right direction.

 

Right, that makes more sense.

 

I shall get rid of ALL .DS_Store files and see if I have any issues going forward.

 

Now that I understand what the export function is and how external files are used, I don't think I need it. I have a full daily backup to restore from, which is also being checked regularly, AND I doubt I would ever get rot on two copies of a file at the same time. Even if there were rot, I would always have one copy of the file to restore from, so I shall not worry about it.

 

Thank you for taking the time to explain that to me  :)

Link to comment

And taking this opportunity to announce a new version, 2016.03.29.

 

This new version adds the ability to search for duplicate file names and file hashes; these are read from the export files. At the moment it is pretty basic and not a replacement for existing tools. Try it out and see if this is a step in the right direction.

 

Wow that was fast, thanks bonienl!

 

I pressed "find" and got:

 

Reading and sorting hash files
Including... disk1.export.hash
Including... disk1.export.20160214.bad.hash
Including... disk2.export.hash
Including... disk3.export.hash
Including... disks.export.20160109.bad.hash
Including... disks.export.20160111.bad.hash
Including... disks.export.20160115.bad.hash
Including... disks.export.20160118.bad.hash
Including... disks.export.20160129.bad.hash
Including... disks.export.20160130.bad.hash

Finding duplicate file names
Duplicate file names found!
See log file: duplicate_file_names.txt.

 

Even though it says "Duplicate file names found", my duplicate_file_names.txt file just contains:

[other content] 

 

So I purposefully created two files:

  /mnt/disk1/Movies/todo/test.txt

  /mnt/disk2/Movies/todo/test.txt

and ran it again.  Now duplicate_file_names.txt contains:

  disk3,disk2 [other content] Movies/todo/test.txt

 

So I'm glad that file shows up, but it seems like the first run should have picked up some files too, especially given the data from the next run...

 

 

I ran "find" a third time with the "include duplicate file hashes" option.  It ran for about 15 minutes and created a 243k line file!  This is amazing data.  I see I need to exclude a few more directories (such as anything SVN related), but it did identify duplicate photos, videos, and other files that I need to clean up.  I am definitely surprised by what it dug up.

 

I need to figure out how to parse the file for groups of files (so I can work on all photos at once, etc.), but overall this is very cool.
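
A starting point for that grouping might be something like this Python sketch; it assumes the path is the final whitespace-separated field on each line and contains no spaces, and the duplicates file name is a guess:

from collections import defaultdict
import os

groups = defaultdict(list)  # extension -> duplicate paths
with open("duplicate_file_hashes.txt") as f:  # hypothetical file name
    for line in f:
        path = line.split()[-1]  # assumes no spaces in the path
        groups[os.path.splitext(path)[1].lower()].append(path)

# Handle one category at a time, e.g. photos first
for ext in (".jpg", ".jpeg", ".png"):
    print(ext, len(groups[ext]), "entries")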

Link to comment

I shall get rid of ALL .DS_Store files and see if I have any issues going forward.

 

Actually, you can just go to the tools page of this plugin and press the Clear button.  This will delete any attributes that were added before the files were excluded, and once the attributes are gone the files will no longer be processed.
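
In effect, that is the per-file equivalent of stripping the stored attribute; a hypothetical Python sketch (attribute name assumed, not confirmed):

import os

def clear(path, attr="user.checksum"):  # attribute name is an assumption
    try:
        # Drop the stored hash so the file is no longer treated as protected
        os.removexattr(path, attr)
    except OSError:
        pass  # attribute was never set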

Link to comment

I shall get rid of ALL .DS_Store files and see if I have any issues going forward.

 

Actually, you can just go to the tools page of this plugin and press the Clear button.  This will delete any attributes that were added before the files were excluded, and once the attributes are gone the files will no longer be processed.

 

No matter, deleted them now! Thanks though!

Link to comment

I pressed "find" and got:

 

Reading and sorting hash files
Including... disk1.export.hash
Including... disk1.export.20160214.bad.hash
Including... disk2.export.hash
Including... disk3.export.hash
Including... disks.export.20160109.bad.hash
Including... disks.export.20160111.bad.hash
Including... disks.export.20160115.bad.hash
Including... disks.export.20160118.bad.hash
Including... disks.export.20160129.bad.hash
Including... disks.export.20160130.bad.hash

Finding duplicate file names
Duplicate file names found!
See log file: duplicate_file_names.txt.

 

I am afraid you need to do some manual housekeeping.

 

The *.bad.hash files need to be moved to the logs folder. Actually, their names have changed too, and now end in .log (but it is not necessary to rename the old files, just move them). The export folder should only contain the exported files from the disks.
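
If there are many of them, a Python sketch of that move (folder locations are assumptions; adjust them to wherever your export and logs folders actually live):

import shutil
from pathlib import Path

export_dir = Path("/boot/config/plugins/dynamix.file.integrity/export")  # assumed
logs_dir = Path("/boot/config/plugins/dynamix.file.integrity/logs")      # assumed
logs_dir.mkdir(parents=True, exist_ok=True)

for f in export_dir.glob("*.bad.hash"):
    shutil.move(str(f), str(logs_dir / f.name))  # renaming to .log is optional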

Link to comment

That would be better: 2 clicks instead of 3. I was just thinking that I would really like to have it in the main bar. I know the space is limited, but for my NAS-only servers that don't use Dockers/VMs there's still a lot of space, and it would just need a single click. Naturally this should be optional, as many users don't have the space or would not want it there.

 

The new version allows you to have the feature just one click away (that is, it puts the file integrity control page in the header menu). See the file integrity settings to change this.

 

Link to comment

I am afraid you need to do some manual housekeeping.

 

The *.bad.hash files need to be moved to the logs folder. Actually, their names have changed too, and now end in .log (but it is not necessary to rename the old files, just move them). The export folder should only contain the exported files from the disks.

 

Thanks, that's much cleaner :) I moved the files and updated to the 2016.03.30 version.

 

Can you clarify what the Find button is looking for?  My duplicate_file_names.txt file still just shows this:

  disk3,disk2 [other content] Movies/todo/test.txt

So if it is looking for files at the same path on different disks, then it is working correctly.  But if it is looking for duplicate filenames (without considering the path) then it is missing a lot.

Link to comment

Can you clarify what the Find button is looking for?  My duplicate_file_names.txt file still just shows this:

  disk3,disk2 [other content] Movies/todo/test.txt

So if it is looking for files at the same path on different disks, then it is working correctly.  But if it is looking for duplicate filenames (without considering the path) then it is missing a lot.

 

For duplicate file names, it looks for files with the same path and name on different disks. This is the same way unRAID itself sees duplicate names.

 

To find duplicate files regardless of their location, look at the duplicate hashes file.

 

Ps. "other content" in your example above denotes two files with the same path/name but different content (their hashes aren't equal).

 

Link to comment

Can you clarify what the Find button is looking for?  My duplicate_file_names.txt file still just shows this:

  disk3,disk2 [other content] Movies/todo/test.txt

So if it is looking for files at the same path on different disks, then it is working correctly.  But if it is looking for duplicate filenames (without considering the path) then it is missing a lot.

 

For duplicate file names, it looks for files with the same path and name on different disks. This is the same way unRAID itself sees duplicate names.

 

To find duplicate files regardless of their location, look at the duplicate hashes file.

 

Ps. "other content" in your example above denotes two files with the same path/name but different content (their hashes aren't equal).

 

In that case, it is working perfectly.  Thanks for this!

Link to comment
