itimpi

unRAIDFindDuplicates.sh

86 posts in this topic
I also get an error

The only way I can see that error being reported is if there were no files found (the current logic assumes this is not a possibility)! Can you try running with the -v option, as that might give more information?


Getting 2 errors. Any way to fix this?

 

./unRAIDFindDuplicates.sh: line 153: [!: command not found

 

and

 

ls: cannot access /mnt/disk*//mnt/user/.........

 

Please see attached.

 

Duppie

The first error is cosmetic and can be ignored if your parameters are valid (it is an error in parameter validation).
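
For anyone puzzled by that first message: bash prints "[!: command not found" when a test is written with no space between "[" and "!", because "[" is an ordinary command name and "[!" is not one. The sketch below shows the corrected form. The function name check_dir and its error text are invented for illustration and are not taken from the script, whose line 153 is not shown in this thread.

```shell
#!/bin/bash
# Hypothetical validation helper (name and message are assumptions).
check_dir() {
    local dir="$1"
    # Wrong:  if [! -d "$dir" ]; then    -> bash tries to run a command named "[!"
    # Right:  keep a space after "[" so that "!" is passed to it as an argument.
    if [ ! -d "$dir" ]; then
        echo "ERROR: '$dir' is not a directory" >&2
        return 1
    fi
    return 0
}
```

With the broken form the test never succeeds, so the validation is skipped silently, which fits the description of invalid parameters not being warned about.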

 

Not sure what might cause the second one. Can you give me an example of the command line used so I can see if that is relevant? My suspicion is that you gave an incorrect parameter and, because of the first error, did not get warned about it correctly, so the run aborted.

 

I am testing a new version with some additional parameter options (added as a result of feedback) so now is a good time to fix any errors in the core code, and also improve validation of any parameter options.



 

It's a huge share and it finishes immediately, quicker than even spinning up the disks, there are definitely duplicates, I will try it with -v

 



 

For the 2nd problem, use the command line option '-i Movies' rather than '-i /mnt/user/Movies'. The /mnt/user prefix is automatically prepended to the value supplied to the -i parameter.
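
A script can be made tolerant of both spellings by stripping the prefix before it is added back. This is only a sketch of that idea, not how unRAIDFindDuplicates.sh handles -i; the function name normalise_share is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: accept either "Movies" or "/mnt/user/Movies"
# and always return the bare share name.
normalise_share() {
    local share="$1"
    share="${share#/mnt/user/}"   # drop the prefix if the caller supplied it
    share="${share%/}"            # drop a single trailing slash, if any
    printf '%s\n' "$share"
}

# Example use (the quotes matter for names containing spaces):
#   share="$(normalise_share "$OPTARG")"
#   ls -d "/mnt/user/$share"
```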

 


here you go

That is strange - with the -v option active you should be shown, for each disk, any share found on it as it is checked. The output you show suggests that nothing was found. Can you perhaps try running the command

ls -ld  /mnt/disk*/*

to check that there are in fact folders on the disk corresponding to the share names?


here you go, there are over 10,000 movies (folders + >40,000 files)

Thanks - that shows the problem! 

 

At the moment the script does not correctly handle user shares with spaces in their names. As can be seen from the verbose output you posted earlier, it thinks there are two shares, "HD" and "Movies", whereas there should only be one share, "HD Movies". That would also explain why it completed so quickly, as it could not find any files belonging to the two shares it was looking for. Now that I know what the problem is, it should be easy to fix for the next release. Unexpected spaces (and other special characters like quotes) are notorious for causing problems in shell scripts.
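
The "HD" / "Movies" split is what unquoted word splitting typically produces. Below is a minimal sketch of one way to build a share list that keeps such names intact. It is not the code from the script: the function name list_shares is hypothetical, and the root directory is passed as a parameter only so the sketch can be tried away from an unRAID server (there it would be /mnt).

```shell
#!/bin/bash
# list_shares ROOT
# Print the unique top-level folder names found under ROOT/disk*/, one per line.
list_shares() {
    local root="$1" dir
    for dir in "$root"/disk*/*/; do    # trailing slash: match directories only
        [ -d "$dir" ] || continue      # the pattern matched nothing
        dir="${dir%/}"                 # strip the trailing slash
        printf '%s\n' "${dir##*/}"     # keep only the folder name
    done | sort -u
}

# Read the names line by line so that "HD Movies" stays a single item:
#   list_shares /mnt | while IFS= read -r share; do
#       echo "share: [$share]"
#   done
```

Looping with something like "for s in $(ls /mnt/disk1)" would instead split on every space, giving exactly the two bogus shares described above.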

 

It is possible that using -i "HD Movies" as a parameter option to explicitly specify the share should work as a temporary workaround (rather than letting the script derive the share list automatically), but I am not sure.


thanks, I tried that, alas no luck.  I will have to wait for the next version

Yep. I found the line that stopped the proposed workaround from working (some missing quotes). I had picked up all the cases where spaces occurred in file names, but not in share names. I have now added a share with spaces to my server so that case gets tested next time.


I have updated the first post with v1.3 of the script and updated the usage description accordingly. This should fix the issues that a number of users have encountered with spaces in share names. It also adds a few new features compared to the previously posted version:

  • Version 1.2  13 Sep 2014  Added the -D option to check extra disk
  • Version 1.3  01 Oct 2014  Added -f and -F options to list empty (duplicated) directories.
                              Added -z and -Z options to list zero length (duplicated) files.
                              Fix:  Allow for shares that have spaces in their names.

The options to check for empty folders and zero length files can be quite useful in identifying  issues that might have been created if any errors were made when copying files between disks (it found a few on my own system).
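
For a quick manual check of the same kind of leftovers, find can list empty folders and zero length files directly. This is a rough sketch that assumes GNU find as shipped on Linux; it does not reproduce what the -f/-F and -z/-Z options do (those report duplicated items across disks), and the function names are made up for illustration.

```shell
#!/bin/bash
# list_empty_dirs ROOT: print every empty directory below ROOT.
list_empty_dirs() {
    find "$1" -mindepth 1 -type d -empty
}

# list_zero_length_files ROOT: print every regular file of length 0 below ROOT.
list_zero_length_files() {
    find "$1" -type f -empty
}

# Example use on one array disk (path is an assumption about your layout):
#   list_empty_dirs /mnt/disk1
#   list_zero_length_files /mnt/disk1
```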

 

Please feel free to ask any questions; provide feedback on the current version; or make suggestions for improvement.


v1.3 works perfectly, I found 4 failed copies and was able to identify the right version. 

 

Excellent utility - Thank you!



Thanks for the confirmation that the fix worked for you.

 

I would be interested to know if you have tried the -f/F and -z/Z options and whether they proved of use?


No I didn't try any switches, I only read the manual when I've f??ked something up

I am often like that as well  :)

 

It might be worth trying the switches I mentioned as they show other ways you may have messed things up.  They did on my system.

 

 


Thank you so much! I spent hours trying to find this sneaky duplicate and had pretty much given up on it before I found this. It was found and deleted in 30 seconds with this script ;D


The output of this script is a bit wonky.

 

-rw-r--r-- 1 nobody users 61747200 Oct 25  2011 /mnt/disk1/files/Wow World of Wonder/BDMV/STREAM/00618.m2ts
-rw-r--r-- 1 nobody users 61747200 Oct 25  2011 /mnt/disk3/files/Wow World of Wonder/BDMV/STREAM/00618.m2ts
  WARNING: File sizes are different
-rw-r--r-- 1 nobody users 1183365120 Oct 25  2011 /mnt/disk1/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts
-rw-r--r-- 1 nobody users 1183365120 Oct 25  2011 /mnt/disk3/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts
  WARNING: File sizes are different
-rw-r--r-- 1 nobody users 157181952 Oct 25  2011 /mnt/disk1/files/Wow World of Wonder/BDMV/STREAM/00620.m2ts
-rw-r--r-- 1 nobody users 157181952 Oct 25  2011 /mnt/disk3/files/Wow World of Wonder/BDMV/STREAM/00620.m2ts
  WARNING: File sizes are different
-rw-r--r-- 1 nobody users 1176 Oct 25  2011 /mnt/disk1/files/Wow World of Wonder/BDMV/index.bdmv
-rw-r--r-- 1 nobody users 1176 Oct 25  2011 /mnt/disk3/files/Wow World of Wonder/BDMV/index.bdmv

 

 

Those files don't look like they are different sizes to me. Anyone else see this?



I agree the report looks strange! Have you directly checked the size on disk of the two files mentioned? It seems more likely that they are actually different and the details in the warning message are wrong, or this would be reported far more frequently. If it is the report that is wrong, can you let me know whether it is the first or second line, as that would help pin down the fault.



 

It did happen a lot... I just showed the last few entries.

 

root@Tower:/boot# cat duplicates.txt | grep WARNING | wc -l
1665

 

I've manually looked at a few of the files and the size matches what the script is printing. They are the same size.

 

root@Tower:/boot# tail duplicates.txt 
  WARNING: File sizes are different
-rw-r--r-- 1 nobody users 1183365120 Oct 25  2011 /mnt/disk1/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts
-rw-r--r-- 1 nobody users 1183365120 Oct 25  2011 /mnt/disk3/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts
  WARNING: File sizes are different

root@Tower:/boot# ls -l "/mnt/disk1/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts"
-rw-r--r-- 1 nobody users 1183365120 Oct 25  2011 /mnt/disk1/files/Wow\ World\ of\ Wonder/BDMV/STREAM/00619.m2ts
root@Tower:/boot# ls -l "/mnt/disk3/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts"
-rw-r--r-- 1 nobody users 1183365120 Oct 25  2011 /mnt/disk3/files/Wow\ World\ of\ Wonder/BDMV/STREAM/00619.m2ts

 

Spaces in names an issue? Any other thoughts?


One thing to note is that the files on disk1 and disk3 are on two different filesystems (reiserfs and xfs). Could that be a reason for the file size mismatch?



Not off-hand. The script should have all relevant file names quoted so that spaces get handled, and I have lots of spaces in my names (both folders and files) and have not noticed any issues. I notice that the version of the script on my home system reports itself as 1.4 while the one available for download is 1.3. Not sure if the difference is significant; the note against the 1.4 change suggests it is not. However, it is quite a while since I looked at the script, so trying to remember how it works is taxing my mind a bit :)  I know I had to jump through some hoops to get state handled correctly in shell script.

 

If you look at the script the point at which the message is output has the lines

        if [ $previous_file_size -ne $ff_size ] ; then
           to_both "  WARNING: File sizes are different"

Changing it to read

        if [ $previous_file_size -ne $ff_size ] ; then
           to_both "  WARNING: File sizes are different"  $previous_file_size $ff_size

will add the sizes that are mismatching to the message which might help track down the cause.

 

Interesting comment about the file systems being different on the disks. I would not have thought it relevant, as nothing is done that is file-system aware (at least that I know of). However, all my disks are XFS, so it is possible that the difference triggers something.

 


Is it possible the size being compared is the actual size on disk? If so, a sparse file will be incorrectly reported as failing the size comparison, even though the content of the file, if read out or checksummed, is identical.



Possible.  The size is being obtained using a command of the form

du -s filename

so you could see if they are reported the same on your system?  If that is the case then an alternative command could be used.
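
To make that comparison concrete: du -s reports the space allocated on disk (in 1 KiB units by default with GNU du), which may differ between filesystems such as ReiserFS and XFS even for byte-identical files, whereas ls -l shows the length in bytes. The sketch below puts the two measurements side by side. It assumes GNU coreutils; the function names are hypothetical, and using stat is only one possible alternative command, not necessarily the one the script would adopt.

```shell
#!/bin/bash
# size_on_disk_kb FILE: allocated space as reported by du (1 KiB units by default).
size_on_disk_kb() {
    du -s -- "$1" | cut -f1
}

# apparent_size_bytes FILE: file length in bytes, the figure shown by ls -l.
apparent_size_bytes() {
    stat -c %s -- "$1"
}

# Example comparison of the two copies from the report above:
#   for f in "/mnt/disk1/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts" \
#            "/mnt/disk3/files/Wow World of Wonder/BDMV/STREAM/00619.m2ts"; do
#       echo "$(size_on_disk_kb "$f") KiB on disk, $(apparent_size_bytes "$f") bytes: $f"
#   done
```

If the first figure differs between the two disks while the second matches, the warning is a measurement artefact rather than a real difference; GNU du also accepts --apparent-size with -b to report the byte length.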
