Dynamix File Integrity plugin


bonienl


The purpose of this plugin is to check file integrity on the unRAID system itself and to alert when a mismatch (warning) or corruption (error) is found. It is set up in such a way that it works transparently for the user. Once hash protection and verification are set up, it is basically a set-and-forget operation: just let it do its job.

 

In the latest version, 2015.12.31, I have added a daily cron script that checks for files that are new or were modified in the past 24 hours and updates the export files accordingly. It also does some housekeeping for any daily generated files. This keeps the export files for all disks up to date automatically, though the user still has the option to generate export files at will.
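
For anyone curious what a "modified in the past 24 hours" check can look like, here is a minimal sketch using find. It is not the plugin's actual cron script; the temporary directory and file names are made up for the demo.

```shell
# Demo directory standing in for a data disk (hypothetical, not a real unRAID path)
demo=$(mktemp -d)
echo "new" > "$demo/new.txt"
echo "old" > "$demo/old.txt"

# Backdate one file so it falls outside the 24-hour window (GNU touch syntax)
touch -d '3 days ago' "$demo/old.txt"

# -mtime -1 matches files whose content was modified less than 24 hours ago
recent=$(find "$demo" -type f -mtime -1)
echo "$recent"
```

Only new.txt should be listed; the backdated file is skipped.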

 

Working with an (off-site) backup system may require a different approach. If it is another unRAID system, then this plugin can be used there as well; otherwise it needs to be something that runs on your particular setup.

 

Link to comment

Whenever the control page is revisited it will update itself to the latest state.

While running a build the statement above works for the same browser only. If you load the control page on a different browser or computer it will not show the latest state.

 

This is expected behavior. You need to use the same browser to see the latest status.

Link to comment

Tell us more about the export files. I'm sure it was discussed in the bunker thread but it would be helpful to have it here as well.

 

Where are they stored? One hash file per file hashed, one hash file per folder, one hash file per share or drive?

 

Are they compatible with the popular corz checksum application for Windows / Linux?

Link to comment

The online help of the plugin already gives some answers, but let me summarize here :)

 

All export files are stored on the flash device in the folder /config/plugins/dynamix.file.integrity

 

Export files are generated per data disk, e.g. if you have 11 data disks in your array, you will get 11 export files with the name diskXX.export.hash

# ls -l
total 86448
-rwxrwxrwx 1 root root 79742592 Dec 31 13:01 disk1.export.hash
-rwxrwxrwx 1 root root   343327 Dec 30 22:38 disk10.export.hash
-rwxrwxrwx 1 root root   143739 Dec 30 22:37 disk11.export.hash
-rwxrwxrwx 1 root root   158578 Dec 30 22:37 disk2.export.hash
-rwxrwxrwx 1 root root   336681 Dec 30 22:37 disk3.export.hash
-rwxrwxrwx 1 root root  2152038 Dec 30 22:38 disk4.export.hash
-rwxrwxrwx 1 root root  1170741 Dec 30 22:38 disk5.export.hash
-rwxrwxrwx 1 root root   688421 Dec 30 22:38 disk6.export.hash
-rwxrwxrwx 1 root root  1041706 Dec 30 22:38 disk7.export.hash
-rwxrwxrwx 1 root root   531778 Dec 30 22:37 disk8.export.hash
-rwxrwxrwx 1 root root  2046025 Dec 30 22:38 disk9.export.hash

Each line in the export file has the layout:

<hash-value>|<file-name>|<scan-date>|<file-modification-date>|<file-size>

(5 fields in total separated by a | character)
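
To make the layout concrete, here is a small sketch that splits one such line into its five fields. The sample line is invented; in particular the way the two dates are written is only a guess, and the split assumes the file name itself contains no | character.

```shell
# Made-up sample line in the five-field layout (all values are hypothetical)
line='d8e8fca2dc0f896fd7cb4cb0031ba249|/mnt/disk1/docs/test|1451563260|1451476800|5'

# Split on "|" into the five fields; IFS is changed for this read only
IFS='|' read -r hash name scanned modified size <<EOF
$line
EOF

echo "hash=$hash"
echo "file=$name"
echo "scanned=$scanned modified=$modified size=$size"
```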

 

The same information is also stored in the extended attributes, which allows bunker to detect whether a hash mismatch occurred (file was changed but hash not updated) or "silent" corruption occurred. In both cases a notification is sent and a syslog entry is made (default settings).

 

The exported files are not needed for bunker to do its detection; it does all its reporting and alerting based on the extended attributes. These files can be used for other purposes, though. For example:

 

- Exporting and importing all hash information quickly on to another machine

- After a (new) disk rebuild, do a validation of all files based on previous stored hashes

- Restore extended attributes when they went missing due to unforeseen circumstances

 

I am not familiar with corz, but since it is a Windows program, it won't know about the additional information kept in the extended attributes and hence is unlikely to be compatible.

 

Link to comment

I am not familiar with corz, but since it is a Windows program, it won't know about the additional information kept in the extended attributes and hence is unlikely to be compatible.

Of course not, but it creates/checks hash files which is what I was asking about.

But ... I have no clue what format corz is using, I don't think it will be compatible though.

Link to comment

At its heart, the exact same format as the output from md5sum.  Some extras added in as comment lines, but all that's necessary is the hash value and the filename
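
Since only the hash value and the file name are needed, the five-field export lines could in principle be trimmed down to md5sum-style lines. A rough sketch with awk, using an invented sample line and assuming file names contain no | character:

```shell
work=$(mktemp -d)

# Hypothetical old-style export line: hash|file|scan-date|mod-date|size
printf '%s\n' 'd8e8fca2dc0f896fd7cb4cb0031ba249|/mnt/disk1/docs/test|1451563260|1451476800|5' > "$work/old.hash"

# Keep only the first two fields and print "<hash> *<file>", the way md5sum -b does
awk -F'|' '{ print $1 " *" $2 }' "$work/old.hash" > "$work/md5sum.hash"
cat "$work/md5sum.hash"
```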

 

 

Link to comment

A couple of points / questions. 

 

I am using the other Checksum tools, but I like the style of this plugin.

 

I am 100% sure that Corz will not work with the files + extended attributes approach, but it might work with the exported hash files if those files are in the right format. I'm not an expert on what that should be, but it sounds like Squid can help a lot with that if you choose to go down that path.

 

Second point: if you want to maintain your checksums while copying from one unRAID device to another, I am pretty sure you can use rsync with the extended attributes flag. -X or --xattrs should preserve extended attributes, so the files would not need rehashing...

Link to comment

I know the format of MD5sum, SHA256sum and BLAKE2. Sounds like corz uses the same format.

 

The extra information put in the export files of this plugin is used to detect "silent" corruption on the fly, and keeps track of when scans have been performed.

 

I am not developing this to work with corz per se, but perhaps corz can be instructed to read the format used by this plugin?

 

Yes, rsync can be used to copy files and preserve the extended attributes, but not everybody does that (or gives the correct options). It is a better approach than re-importing the hash values, though.

 

Link to comment

Giving this a second thought ... I might be able to give the export files the same format as the other utilities. Need a bit of investigation here.

If you set up the exported files something like this, corz (or pretty much any checker on Windows / Linux / Mac) shouldn't have a problem with it:

 

# Any extra information output by bunker in a commented line
123456789012346  \\server\share\filename
# Any extra info for second file
1234567890123456 \\server\share\filename2

etc

 

The commented lines written by corz are very specific with regard to file modification times. If they are not there, corz will still tell you whether the file is good or corrupt, but not whether the file has been modified since the hash was created.

 

Note that this assumes MD5 as a hash.  For corz to work with SHA / BLAKE, there is a specific comment line that would have to be in there (I'll get it tonight)
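
On the Linux side, a hash file with comment lines can be tried out with md5sum itself. The sketch below assumes GNU coreutils md5sum, which skips lines starting with # in check mode (other implementations may behave differently); the file name and comment text are made up.

```shell
work=$(mktemp -d)
printf 'test\n' > "$work/filename"

# Build a hash file with a comment line above the entry (comment text is invented)
{
  echo '# extra information about filename could go here'
  (cd "$work" && md5sum -b filename)
} > "$work/demo.hash"

# Check mode re-hashes each listed file and reports OK or FAILED per file
result=$(cd "$work" && md5sum -c demo.hash)
echo "$result"
```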

Link to comment

I like new standards ;D

 

But in this case I changed the export format to the regular format as used by MD5, SHA256 and BLAKE2. That is:

<hash-key> *<file-name>

 

Did some reading up on corz, and it says the above format is the preferred one. So here you go!

 

A new version 2015.12.31b is available, you need to regenerate the export files to get them in the new format.

 

Link to comment

Thanks, MD5 hashes can now be easily checked with corz.

 

FYI, auto exports at the end of a disk build are still in the old format; manual exports are in the new one.

 

Also, hashes generated using SHA and BLAKE2 are different from corz's, so they are not compatible. Maybe different formats are used?

 

These are the hashes of the same file using all 3 options:

 

MD5

Bunker: 817f434b2bc289366c4628a1d05d5d7f
Corz:   817f434b2bc289366c4628a1d05d5d7f

 

SHA

Bunker: 08ba7c1a65d3d267c8f60dd995a32a0b07c14904236fb48f7d2d870206e8b4d1
Corz:   7c8f4bfd0a5d69db04638960c0be8acd7aeda7bf

 

BLAKE2

Bunker: 0c8d4002102d70d207436b5a58aae71f861bd93e95e07d14bd263273ae5489c816eb7ff208cdde0face4177aec5ac0e0b4b13da344f2a4165c9517ad3689e127
Corz:   0b0908f09b11d07b7eb245a2ca8194c6f70f5bb50f1d61706e589c9d526f04cb
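
One way to narrow down why the digests differ is to compare their lengths. The helper below (digest_bits is just a name picked for this sketch) counts hex characters and multiplies by 4. For reference, 128 bits is the MD5 length, 160 bits is the SHA-1 length, 256 bits fits SHA-256 or BLAKE2s, and 512 bits fits BLAKE2b or SHA-512. Length alone hints at a different variant but does not prove which algorithm was used.

```shell
# Digest width in bits: each hex character encodes 4 bits
digest_bits() {
  echo $(( ${#1} * 4 ))
}

echo "MD5 (both tools): $(digest_bits 817f434b2bc289366c4628a1d05d5d7f)"
echo "SHA, Bunker:      $(digest_bits 08ba7c1a65d3d267c8f60dd995a32a0b07c14904236fb48f7d2d870206e8b4d1)"
echo "SHA, Corz:        $(digest_bits 7c8f4bfd0a5d69db04638960c0be8acd7aeda7bf)"
echo "BLAKE2, Corz:     $(digest_bits 0b0908f09b11d07b7eb245a2ca8194c6f70f5bb50f1d61706e589c9d526f04cb)"
```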

 

Link to comment

Thanks for the feedback.

 

I've made an update to the bunker script and now everything is in the new format. Go to the plugin manager to do an update (version 2016.01.01).

 

I take the direct output from the utilities md5sum, sha256sum and b2sum. Note that md5sum and sha256sum are standard with the unRAID installation. b2sum (blake2) is installed by the plugin.

 

The following test file has the content "test"

# md5sum -b test
d8e8fca2dc0f896fd7cb4cb0031ba249 *test

# sha256sum -b test
f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2 *test

# b2sum test
34073762db7af5008c7213f93390e0e7b73051ecd42d49f3633c82c9af0caff3fff74f09c7a6ff72ef309a584c8dbeb0cbda750fb08bdaa88ceabfccda650c35 test

# b2sum -a blake2s test
975c6313d3cb4fae3ff12feccc34a68b8719e2793728377b3f32fafbe39b8a7f test

Optionally blake2 can be set to use a 256bit checksum instead of the standard 512bit.
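
To reproduce the md5sum and sha256sum lines on another machine, the sketch below recreates the test file in a temporary directory. It assumes the file holds the word "test" followed by a newline (as echo would write it). b2sum is left out because it may not be installed everywhere.

```shell
work=$(mktemp -d)

# "test" plus a trailing newline (assumed to match the test file above)
printf 'test\n' > "$work/test"

# -b marks binary mode, shown as the "*" in front of the file name
md5=$(cd "$work" && md5sum -b test)
sha=$(cd "$work" && sha256sum -b test)
echo "$md5"
echo "$sha"
```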

 

Not sure why corz is different.

 

Ps. Since MD5 is most widely used, it seems to be the best choice when compatibility is required.

 

Link to comment

Out of interest...

 

For those using/testing the plugin: how many disks do you have, and are there any disks with an extremely large number of files?

 

In my case I have 11 disks, most disks are used to store media (movies and music). The number of files on these disks is relatively low. I have reserved one disk to keep my 'data' files. This one has around 320K files.

 

For example, I can do an export of all my disks simultaneously and the 'media' disks are finished within a few seconds, while the 'data' disk takes about 2 minutes (using cache_dirs to speed up things).

 

How are your experiences?

 

Link to comment

How are your experiences?

 

I'll get back to you on that. I'm still hashing all my servers and it will take some time. My main interest is not bit rot per se, but being able to easily check a complete disk if anything happens to it that makes me doubt the integrity of some or all files.

 

Meanwhile I found an issue. It probably won't happen to many people, but if it does it can be a pain, because you can't stop the array:

 

Say I want to start a manual build for disks 1, 2 and 3, so I check all 3 and click build. Now for some reason I want to stop all 3. After clicking cancel on disk1, disk1 and 3 stop building. I can then cancel disk2, but disk3 will continue to build in the background. If I stop the array, it keeps retrying to unmount the disks until I use the console to kill all bunker processes.

 

Link to comment

 

Optionally blake2 can be set to use a 256bit checksum instead of the standard 512bit.

 

Not sure why corz is different.

 

Because corz uses blake2s. It's just another option with b2sum. But if anyone is looking at checking hashes with a different program, you should be using md5 for maximum compatibility.

Link to comment

 


Yeah, that’s what I decided to use.

Link to comment

Say I want to start a manual build for disks 1, 2 and 3, so I check all 3 and click build. Now for some reason I want to stop all 3. After clicking cancel on disk1, disk1 and 3 stop building. I can then cancel disk2, but disk3 will continue to build in the background. If I stop the array, it keeps retrying to unmount the disks until I use the console to kill all bunker processes.

 

When an operation is canceled, it immediately kills the bunker script associated with that particular disk, but not any checksum operations that bunker started in the background. These operations do stop eventually, but not as instantly as one might expect. Hence stopping the array may leave it in the unmounting state for quite a while.

 

I need to see how I can improve this, i.e. stop all processes at once...

 

Link to comment

I understand, but the issue I'm having is that in the example above, when I click cancel on the disk1 build, the disk3 build progress disappears from the GUI, so I can't cancel it, but it continues to build in the background.

 

The expected behavior is that canceling a single operation should leave the others running and visible in the GUI. Just did a quick test myself. I start a hash build for three disks, and after a while I cancel one. The other two stay visible and continue to run as expected.

 

The state of which disks need to be listed is kept in a cookie, so when the page is refreshed or revisited the GUI knows which disks it needs to display in the progress field.

 

Link to comment

Don't know what to tell you. I tried on different servers and just tried again using a different browser, as I have some issues with Chrome for some operations, but the behavior was the same:

 

screen 1 after starting build on 3 disks, screen 2 after clicking cancel on disk 1.

[Screenshot 1: 1.png]

[Screenshot 2: 2.png]

Link to comment

Assuming you are still in this 'bad' state, can you show the output of the following CLI commands:

ls -l /tmp
ls -l /var/tmp
ps -ef | grep bunker
ps -ef | grep watcher
ps -ef | grep inotifywait

P.S. I am using Chrome all the time and don't have issues. Adblocker maybe?

 

Link to comment
