airy52 Posted November 12, 2021 (edited)

Like many of you, I use my Unraid server for Plex and media downloading/management. Recently I discovered that hardlinks weren't working properly, and it turned out to be because I was downloading to a different folder mapping than the storage mapping. I set everything to the same /Media path and it's working now, but I have a LOT of old data that is now duplicated between the downloads folder (/Media/Sonarr/Downloads) and the place Sonarr organized it into afterwards (/Media/TV and /Media/Anime).

I've read about tools like Czkawka (https://github.com/qarmin/czkawka) and DupeGuru (https://github.com/arsenetar/dupeguru) that will help me find the duplicate files, hardlink (or symlink? softlink?) them, and remove the duplicates. I want to do this, but I only have enough Linux knowledge to do basics or follow instructions.

My main concerns: some files in the downloads folder might be duplicates but no longer sit on the same drive as the organized copy (I have 2 drives + 1 parity + cache), and I think that will be an issue? Also, I'm not familiar with the inner workings of Unraid and how it presents multiple drives as one folder in /mnt/user/, and I don't want to break it by running something not intended for this configuration.

So my question: can any of you help me figure out how to do this properly with any of these (or other) tools?

Edited November 19, 2021 by airy52 (solved)
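On the cross-drive concern: a hardlink can only exist within a single filesystem, so a pair of copies that ended up on different physical disks can't be collapsed into one link; those would have to be handled another way (delete one copy, or move both onto the same disk first). A quick way to check whether two paths are already hardlinked is to compare their inode numbers and link counts. A minimal sketch, assuming hypothetical file names under the share layout from the post above:

    # Print inode number, hard-link count, and name for both copies.
    # Matching inode numbers (on the same underlying disk) mean the two
    # paths are already hardlinks of each other; a link count above 1
    # also hints that a file has another name somewhere.
    stat -c '%i %h %n' \
      "/mnt/user/Media/Sonarr/Downloads/episode.mkv" \
      "/mnt/user/Media/TV/Show/Season 01/episode.mkv"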
airy52 Posted November 19, 2021 (Author)

For anyone who finds this in a search: I used jdupes, included in NerdPack/NerdTools. The command I used was:

    jdupes -QrL -X size+=:100000k /mnt/user/Media/

Drop the -Q (quick mode, which compares file hashes instead of doing a direct binary comparison) if you don't mind it taking longer, or if your data is nuclear launch codes or something.
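If you're nervous about -L rewriting things, one cautious approach (a sketch, not something from the posts above) is to run the same match criteria without -L first, so jdupes only prints the duplicate sets, and re-run with -L once the preview looks right:

    # Preview: list duplicate sets of ~100 MB and up, change nothing
    jdupes -r -X size+=:100000k /mnt/user/Media/ > /tmp/dupes-preview.txt

    # Review /tmp/dupes-preview.txt, then hard link for real
    jdupes -rL -X size+=:100000k /mnt/user/Media/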
SkilledAlpaca Posted April 4, 2024

Apologies for the necro post, but this is one of the few threads I've found with a solution for deduplication, and I wanted to expand on OP's command above, mainly spelling out its flags so you know what it's doing without blindly running it.

jdupes usage: https://codeberg.org/jbruchon/jdupes#usage

    jdupes -QrL -X size+=:100000k /mnt/user/<share>

-Q --quick           skip byte-by-byte duplicate verification. WARNING: this may delete non-duplicates! Read the manual first!
-r --recurse         for every directory, process its subdirectories too
-R --recurse:        for each directory given after this option, follow subdirectories encountered within (note the ':' at the end of the option; see the manpage for details)
-L --link-hard       hard link all duplicate files without prompting
-X --ext-filter=x:y  filter files based on the specified criteria; use '-X help' for detailed extfilter help

    size[+-=]:size[suffix]   only include files matching the size criteria.
                             Size specs: + larger, - smaller, = equal to.
                             Specs can be mixed, e.g. size+=:100k includes only files 100 KiB or larger.
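Once a run finishes, you can sanity-check the result with standard tools. A hardlinked duplicate no longer consumes extra space, and its link count rises above 1. A minimal sketch (not from this thread), run against an individual disk path such as /mnt/disk1 rather than the merged /mnt/user view, since the FUSE user share layer sits between you and the real inodes:

    # Count files that now carry more than one hard link
    find /mnt/disk1/Media -type f -links +1 | wc -l

    # du counts hardlinked copies only once, so disk usage should now
    # be well below the sum of the individual file sizes
    du -sh /mnt/disk1/Media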