hhs99 Posted February 9, 2018
I get the following error when running with -v:

./unRAIDFindDuplicates.sh: line 373: verbose_to_bpth: command not found

Unraid 6.4.1
itimpi Posted February 10, 2018 (Author)
That looks like a spelling issue: it should be ‘both’ rather than ‘bpth’. It is basically harmless, as all that line is trying to do is output a blank line for cosmetic purposes.
hhs99 Posted February 10, 2018
Interesting: even with the test duplicate files I placed, it says no dupes.
remotevisitor Posted February 10, 2018
Did the test duplicate files that you created exist in the same place on different disks? If you have a duplicate file at /mnt/disk1/TV/series/file and /mnt/disk2/TV/series/file (only the name has to match, not the contents), then this script will report it as a duplicate. This often happens if you re-organise the data in a share by copying rather than moving the files between disks.

As the contents of the two disks in the above example would be merged together under /mnt/user/TV to create the TV share, only one version of the file will be visible at /mnt/user/TV/series/file, which is why it is useful to identify which files are duplicated and remove the duplicates.

I suspect that you were expecting it to report files which have the same name and/or content in arbitrary locations .... unfortunately this script does not address that problem.
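The same-path check described above can be illustrated in a few lines of shell. This is a rough sketch of the idea only, not itimpi's actual script; the disk layout and file names below are made up for the demo:

```shell
# Build a throwaway layout that mimics /mnt/disk1, /mnt/disk2, ...
root=$(mktemp -d)
mkdir -p "$root/disk1/TV/series" "$root/disk2/TV/series" "$root/disk3/Movies"
echo "copy A" > "$root/disk1/TV/series/file"   # same relative path on two disks
echo "copy B" > "$root/disk2/TV/series/file"   # contents differ; still a dupe
echo "unique" > "$root/disk3/Movies/other"     # present on one disk only

# A "duplicate" here is a relative path that appears under more than one disk:
# strip the diskN/ prefix, then report any path seen more than once.
dupes=$(cd "$root" && find disk*/ -type f | sed 's|^disk[0-9]*/||' | sort | uniq -d)
echo "$dupes"

rm -rf "$root"
```

On a real array you would run the `find` over /mnt/disk*/ directly. Note the match is on name and location only, exactly as described above, so files with the same content in different locations are not detected.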
hhs99 Posted February 10, 2018
I used the exact same file (copied the script into a couple of places with no changes). I just need the ability to find duplicate files in the array, somehow, someway.
JonathanM Posted February 10, 2018
2 hours ago, hhs99 said: I just need the ability to find duplicate files in the array somehow someway
EvilSpice Posted October 5, 2018
Thanks for this script. After a week-long effort to convert several drives to XFS, I found myself with multiple copies of many of my files. This tool made taking care of the issue simple and painless. Thanks!
itimpi Posted October 6, 2018 (Author)
6 hours ago, EvilSpice said: After a weeklong effort to convert several drives to xfs i found myself with multiple copies of many of my files.
That was exactly the reason I wrote it in the first place, as I ended up in the same situation.
Excessus Posted March 11, 2019 (edited)
Thanks for this. I ended up creating a real problem for myself with the unbalance plugin (cancelling it in progress causes it to leave copies of files in place) and I needed a real utility to deal with the mess. Unfortunately for me, while the output of this script is useful, I had far, far too many duplicates to deal with by hand. So I wrote this little Windows (sorry) utility to automate the process for me:

Loading file...
Buffy S01E10.mkv size mismatch!
Chef S01E04.mkv size mismatch!
Read 1515 / 2589 lines.
Finding size mismatches...
/mnt/disk11/media/series/Buffy the Vampire Slayer/Buffy S01E10.mkv : 99,975,168
/mnt/disk12/media/series/Buffy the Vampire Slayer/Buffy S01E10.mkv : 1,981,632,213
/mnt/disk6/media/series/Buffy the Vampire Slayer/Buffy S01E10.mkv : 1,981,632,213
/mnt/disk12/media/series/Chef/Chef S01E04.mkv : 487,030,784
/mnt/disk6/media/series/Chef/Chef S01E04.mkv : 1,045,681,530
Generating list of files that are safe to delete...
Script will delete 955 out of 1,515 files.
Total space saved: 1,317,440,808,558 bytes (1.2 TB)
File C:\zzz\delete.txt successfully created.

Essentially, it parses the script's log file, highlights the files that are safe to leave in place and those that are risky, and creates a script file that you can execute from the UnRAID console.
The output of the script looks like this:

echo "Keeping /mnt/disk11/media/series/Angel/Season 1/Angel S01E08.mkv"
rm "/mnt/disk12/media/series/Angel/Season 1/Angel S01E08.mkv"
rm "/mnt/disk6/media/series/Angel/Season 1/Angel S01E08.mkv"
echo "Keeping /mnt/disk11/media/series/Angel/Season 1/Angel S01E09.mkv"
rm "/mnt/disk12/media/series/Angel/Season 1/Angel S01E09.mkv"
rm "/mnt/disk6/media/series/Angel/Season 1/Angel S01E09.mkv"
echo "Keeping /mnt/disk11/media/series/Angel/Season 1/Angel S01E10.mkv"
rm "/mnt/disk12/media/series/Angel/Season 1/Angel S01E10.mkv"
rm "/mnt/disk6/media/series/Angel/Season 1/Angel S01E10.mkv"
echo "Keeping /mnt/disk11/media/series/Angel/Season 1/Angel S01E11.mkv"

In the hope that I can save others some time, I have included the app (binary and source) in this post. Be careful, folks! I sincerely hope that no one loses their data with this thing.
UnRAIDdeDupe.zip
Edited March 11, 2019 by Excessus
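The safety rule the utility applies (keep one copy, emit rm commands for the rest, but flag groups whose sizes disagree) can be sketched in awk. This is a hypothetical sketch of the same logic, not Excessus's actual code; the input format, sizes, and paths here are invented for illustration:

```shell
# Input: one "size path" line per copy; copies of the same file share the
# relative path left after stripping the /mnt/diskN prefix.
dupelist=$(mktemp)
cat > "$dupelist" <<'EOF'
100 /mnt/disk1/TV/a.mkv
100 /mnt/disk2/TV/a.mkv
200 /mnt/disk1/TV/b.mkv
300 /mnt/disk2/TV/b.mkv
EOF

plan=$(awk '{
  size = $1; $1 = ""; sub(/^ /, ""); path = $0
  rel = path; sub(/^\/mnt\/disk[0-9]+\//, "", rel)
  if (!(rel in first)) {            # first copy seen: keep it
    first[rel] = path; fsize[rel] = size; next
  }
  if (size == fsize[rel])           # sizes agree: safe to delete this copy
    print "rm \"" path "\""
  else                              # sizes differ: flag instead of deleting
    print "echo \"size mismatch: " path "\""
}' "$dupelist")
echo "$plan"
rm -f "$dupelist"
```

With the sample input above, a.mkv yields an rm for the disk2 copy, while b.mkv is flagged as a size mismatch because its two copies are not the same file.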
mihcox Posted June 10, 2019 (edited)
On 3/11/2019 at 2:57 PM, Excessus said: In the hope that I can save others some time, I have included the app (binary and source) in this post. UnRAIDdeDupe.zip
Can you provide a simple guide on how to use this? I have the same issue thanks to unbalance and have 500+ GB of duplicated Steam games. I have downloaded your zip file, but am unsure what to do next.
EDIT: I was able to successfully import the code into VS and get it to run. Thanks for this, just waiting on my first log file to finish. May have questions then!
Edited June 10, 2019 by mihcox
Wuast94 Posted October 19, 2019 (edited)
On 3/11/2019 at 8:57 PM, Excessus said: In the hope that I can save others some time, I have included the app (binary and source) in this post. UnRAIDdeDupe.zip
I can't get it to work. When I open "UnRAIDdeDupe\bin\Debug\UnRAIDdeDupe.exe" and import the file, I get an error message:

System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.

How exactly should I run the script to generate the right output file for your program?
Edited October 19, 2019 by Wuast94
Excessus Posted October 19, 2019
31 minutes ago, Wuast94 said: How exactly should I run the script to generate the right output file for your program?
Your source file should look like this:

Duplicate Files
---------------
-rw-rw-rw- 1 nobody users 2050156830 Feb 7 2017 /mnt/disk11/media/series/Angel/Season 1/Angel S01E08.mkv
-rw-rw-rw- 1 nobody users 2050156830 Feb 7 2017 /mnt/disk12/media/series/Angel/Season 1/Angel S01E08.mkv

and not like this:

Duplicate Files
---------------
media/series/Angel/Season 1/Angel S01E08.mkv
media/series/Angel/Season 1/Angel S01E09.mkv
media/series/Angel/Season 1/Angel S01E10.mkv

It's been a while since I've needed to use the script, so I forget what options I had to use to generate the former.
JustinChase Posted April 7, 2020
On 10/10/2017 at 3:41 AM, itimpi said: Have you given the script execute permission? If you downloaded it to the flash drive this should be automatic (because it is FAT32 format) but will not be the case if put elsewhere. Alternatively run it using the ‘sh’ command, which does not require the script to have ‘execute’ permission.
I'm having the same issue today. I have this script on my flash drive, but get a permission denied error as well:

root@media:/boot/scripts# unRAIDFindDuplicates.sh -v
-bash: ./unRAIDFindDuplicates.sh: Permission denied

What am I doing wrong, and how do I fix it?
itimpi Posted April 7, 2020 (Author)
1 hour ago, JustinChase said: I'm having the same issue today. I have this script on my flash drive, but get a permission denied error as well.
This is due to a change in security that came in with the Unraid 6.8.x series, where files on the flash drive are no longer allowed to have execute permission. You can get around this by preceding the script name with the ‘bash’ command, e.g.:

bash unRAIDFindDuplicates.sh -v
JustinChase Posted April 7, 2020
15 minutes ago, itimpi said: You can get around this by preceding the script name with the ‘bash’ command.
That makes sense, thanks for letting me know. Could you please add this to the first post? I'll forget in a year or two when I try to do this again, and I always look to the first post for instructions, so that will save me having to search the thread for this tidbit. Thanks again, the script works great (once I get it to run).
johner Posted May 3, 2021 (edited)
Hi, I stumbled across this, and it looks like what I need to detect duplicates across disks under the logical share, so thanks. When I run it without options, I get "no dupes" after about 30s. All good (I assume!).

I then went to run it with -c to double check:

bash unRAIDFindDuplicates.sh -c

It immediately responds with the help text.
Q1: Is this a defect, or am I doing something wrong?

I then tried:

bash unRAIDFindDuplicates.sh -z

It said no duplicates (again after about 30s), then sat there reporting nothing for a few minutes and eventually came back with lots of errors such as (this is a subset):

ls: cannot access '/mnt/disk*//appdata/ESPHome/hot_water_system/.piolibdeps/hot_water_system/ESPAsyncTCP-esphome/examples/SyncClient/.esp31b.skip': No such file or directory
ls: cannot access '/mnt/disk*//appdata/FileBot/log/nginx/error.log': No such file or directory
ls: cannot access '/mnt/disk*//appdata/FileBot/xdg/cache/openbox/openbox.log': No such file or directory
ls: cannot access '/mnt/disk*//appdata/FileBot/.licensed_version': No such file or directory
ls: cannot access '/mnt/disk*//appdata/FileBot/error.log': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/influxdb/wal/telegraf/autogen/250/_01402.wal': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/influxdb/wal/_internal/monitor/258/_00094.wal': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/influxdb/wal/home_assistant/autogen/255/_00003.wal': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/loki/index/index_2573': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/loki/index/index_2520': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/loki/index/index_2525': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/loki/index/index_2551': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/loki/index/index_2552': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/loki/index/index_2579': No such file or directory
ls: cannot access '/mnt/disk*//appdata/Grafana-Unraid-Stack/data/loki/index/index_2609': No such file or directory

Q2: What is it trying to do when checking for zero-length dupes that it isn't doing when running with no options?

I then ran it in verbose mode out of interest:

bash unRAIDFindDuplicates.sh -v

I noticed two things.
1 - This error half way through:

List duplicate files
unRAIDFindDuplicates.sh: line 373: verbose_to_bpth: command not found
checking /mnt/disk1

Q3: Will this error affect the actual dupe check? I assume not.
2 - It doesn't seem to take into consideration the additional cache pools that are an option in v6.9 (I have a second one called 'scratch').
Q4: Would you be willing to add something that can dynamically check for additional cache pool config and include it in the no-option execution?

I then tried the -D option to add the additional cache drive (/mnt/scratch) to be treated as an array drive, and it went a bit screwy!

bash unRAIDFindDuplicates.sh -v -D /mnt/scratch

Output (killed with ctrl-c in the end):

============= STARTING unRAIDFIndDuplicates.sh ===================
Included disks: /mnt/disk/mnt/scratch /mnt/disk1 ...
...
List duplicate files
unRAIDFindDuplicates.sh: line 373: verbose_to_bpth: command not found
unRAIDFindDuplicates.sh: line 404: cd: /mnt/disk/mnt/scratch: No such file or directory
checking /mnt/disk/mnt/scratch [SHARENAMEREDACTED]
...
Duplicate Files
---------------
**Looks like it's now listing every file below here (these may be genuine - TBC)**
...

I'm running 6.9.2 if that helps in any way. Thanks!
John
Edited May 3, 2021 by johner: further tests, clear questions call-out, extra update on 2nd cache drive
ssean Posted August 16, 2022
Is this script still the best solution for finding duplicate files? Please note that we are talking about files with the same name that are present in more than one location and are thus wasting space.
itimpi Posted August 16, 2022 (Author)
12 hours ago, ssean said: Is this script still the best solution for finding duplicate files?
I will be interested in the response you get. I wrote that script a long time ago, but if there is still interest in it and its capabilities are not superseded by something else, I may look at reworking it as a plugin, which should make it friendlier to use.
JonathanM Posted August 16, 2022
8 minutes ago, itimpi said: Be interested in the response you get
I use the dupeguru container. https://forums.unraid.net/topic/56392-support-djoss-dupeguru/
flyize Posted December 21, 2022
On 8/16/2022 at 11:49 AM, JonathanM said: I use the dupeguru container. https://forums.unraid.net/topic/56392-support-djoss-dupeguru/
Since that uses /mnt/user by default, how can it be set up to search each disk? One of the first replies in that thread seems to suggest it won't, but maybe I'm missing something?
JonathanM Posted December 21, 2022
6 hours ago, flyize said: Since that uses /mnt/user by default, how can it be set up to search each disk?
Change the /mnt/user mapping to /mnt, but be VERY careful not to go into the /mnt/user folder when finding dupes. It's either/or: if you open up /mnt, you are setting yourself up for a world of hurt if you allow it to search for dupes in /mnt/user AND /mnt/diskX or /mnt/poolname at the same time.
dboonthego Posted September 23, 2023
For anyone looking for the checksum binary compare script by Joe L. who discovered the link in this post is dead: I found it attached to this thread.
jaso Posted January 16
Just wanted to do two things:
1. A big thank you to itimpi for the unRAIDFindDuplicates script. I have had a few copy/move errors over the last decade, and itimpi's script just found nearly 400GB of dupes scattered over my 42TB unraid array.
2. I banged together a little script that looks at the output of itimpi's script and deletes the dupes. Note that you must do a bit of cleaning of the output file first: delete everything except the file paths. That is, remove the lines at the beginning of duplicates.txt that look like this (also delete the file-size warnings, and the lines for the files associated with the warnings):

COMMAND USED: ./unRAIDFindDuplicates.sh
Duplicate Files
---------------

Here is my script. I called it 'delete-dupes.sh'. Execute it like this:

bash ./delete-dupes.sh '/boot/duplicates.txt'

#!/bin/bash

# Check if the file exists
if [ ! -f "$1" ]; then
    echo "File not found!"
    exit 1
fi

# Read the file line by line
while IFS= read -r line; do
    # Check if the line is empty
    if [ -n "$line" ]; then
        # Prepend "/mnt/user/" to the line
        path="/mnt/user/$line"
        # Delete the file path
        rm -v "$path"
    fi
done < "$1"

Be careful: if you execute the delete-dupes script twice in a row, it will delete the remaining (now unique) files. I had thousands of files that were duplicated; without the script I would have been manually deleting duplicate files for weeks. Thanks again itimpi!
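One way to guard against the double-run hazard jaso warns about (a suggested tweak of my own, not part of the posted script) is to count how many disk-level copies of a path still exist and skip the rm when only one copy remains. The helper below demonstrates the idea against a temporary directory layout; on a real server the root would be /mnt, assuming your data lives under /mnt/diskN:

```shell
# copies ROOT REL: count how many diskN mounts under ROOT still hold REL.
copies() {
  _root=$1; _rel=$2; _n=0
  for d in "$_root"/disk*; do
    [ -f "$d/$_rel" ] && _n=$((_n + 1))
  done
  echo "$_n"
}

# Demo layout standing in for /mnt/disk1, /mnt/disk2.
root=$(mktemp -d)
mkdir -p "$root/disk1/TV" "$root/disk2/TV"
echo x > "$root/disk1/TV/f.mkv"
echo y > "$root/disk2/TV/f.mkv"

n=$(copies "$root" "TV/f.mkv")
echo "$n"    # more than one copy exists, so deleting one is safe
rm -rf "$root"
```

Inside the delete loop you would then replace the bare rm with something like [ "$(copies /mnt "$line")" -gt 1 ] && rm -v "/mnt/user/$line", so a second accidental run finds only one copy left and deletes nothing.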