daman12 Posted June 23, 2021
Deduper is a Python script bundled into a Docker container that automatically deletes any files with the same content, regardless of name. Duplicates are detected by comparing the SHA-512 hash of each file's contents.
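The idea can be sketched in a few lines of Python. This is an illustrative sketch, not the actual Deduper source; the function names are made up:

```python
import hashlib
from pathlib import Path

def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large files don't load into RAM."""
    h = hashlib.sha512()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash; any group with >1 entry is a set of dupes."""
    seen: dict[str, list[Path]] = {}
    for p in root.rglob("*"):
        if p.is_file():
            seen.setdefault(sha512_of(p), []).append(p)
    return {h: paths for h, paths in seen.items() if len(paths) > 1}
```

Because only the hash is compared, two files with identical bytes match no matter what they are named or where they live.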
daman12 Posted June 24, 2021 (edited)
Why does it keep saying Exited (0)? To fix this, please run a "Force Update".
Edited June 26, 2021 by daman12
howyoulikethat Posted June 25, 2021
This would be great if it had a GUI, gave a report of which dupes were found, and offered options to choose which ones to keep.
Phatty Posted July 6, 2021
This is a great tool, and I'm really happy to see a GUI. I assume this is a work in progress: after the hash process finishes, it shows a horizontal list of the files that will be deleted, which runs off the screen. Excited to use it once the UI gets an update.
Shandidy Posted August 15, 2021 (edited)
Is it safe to use? Is it going to cause any data loss? I used to dedupe my data on a Windows Server box: when I specify a drive, it dedupes that drive using a chunk store, and if you copy data off the drive, it undoes the dedup. Sorry if my question sounds stupid; I am a Windows guy and a total newbie to Linux and Unraid. I just built the box and migrated my data to it, saw this Docker container today, and decided to use it, but I wanted to make sure it will not cause any data loss.

I am currently targeting multiple folders by manually pointing the Docker container at them to scan, and I want to make sure it is configured correctly:

/Familyscan to /mnt/user/Family/
/appdata/mnt to /user/appdata/deduper
/software-2scan to /mnt/user/Software-2/
/MediaServerScan to /mnt/user/MediaServer/

Please advise. Thank you!
Edited August 15, 2021 by Shandidy
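For readers unfamiliar with Unraid templates: each of those entries maps a host path to a path inside the container, which is what the scanner actually sees. A sketch of the equivalent `docker run` flags, with the image name assumed (note that on Unraid, appdata normally lives under /mnt/user/appdata on the host side):

```shell
# Illustrative only: image name and container paths are assumptions, not the
# template's actual values. Docker's -v syntax is host_path:container_path.
docker run -d --name deduper \
  -v /mnt/user/Family/:/Familyscan \
  -v /mnt/user/appdata/deduper:/appdata \
  -v /mnt/user/Software-2/:/software-2scan \
  -v /mnt/user/MediaServer/:/MediaServerScan \
  deduper:latest
```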
Shandidy Posted August 16, 2021
I have been running it since yesterday and no dedup savings have appeared on the storage; the size remained the same. I am using encrypted BTRFS. Please advise.
localh0rst Posted November 24, 2022
On 8/16/2021 at 3:04 PM, Shandidy said:
I ran it since yesterday and there is no Dedup savings appeared on the storage, size remained the same, I am using BTRFS encrypted please advise
This is not a block-level "dedupe" app. It searches for duplicate files and deletes them.
mattw Posted December 12, 2022
I installed this and let it index, but stopped the process before it completed. I have not been able to find good docs on what it will actually delete, since it sounds like the user has no control. I will have some dupes; they will not have the same name, and some may end up in odd places. Those odd places contain the copies I want gone, not my photo/media/music folders that I want them removed from. Am I overlooking docs somewhere?
Matt
Ladrek Posted December 27, 2022
I keep getting the error message below, but nothing deletes itself.

Traceback (most recent call last):
  File "/main.py", line 9, in <module>
    with open("/appdata/hashes.json", "a+") as myfile:
FileNotFoundError: [Errno 2] No such file or directory: '/appdata/hashes.json'
[app] starting Deduper...
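For what it's worth, that traceback usually means the /appdata directory itself does not exist inside the container (i.e. no volume is mapped to it): `open()` in "a+" mode will create a missing file, but not a missing parent directory. A hedged sketch of a more defensive load, not taken from the actual Deduper source (`load_hashes` is a made-up name):

```python
import json
import os

# Path inside the container; /appdata is assumed to be a mapped volume.
HASHES_PATH = "/appdata/hashes.json"

def load_hashes(path: str = HASHES_PATH) -> dict:
    """Create the parent directory if missing, then read (or initialize) the hash store."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    try:
        with open(path, "r") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        # First run, or an empty/corrupt store: start fresh.
        return {}
```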
blasr Posted October 15, 2023
On 6/23/2021 at 6:38 PM, daman12 said:
Deduper is a Python script bundled into a Docker container that automatically deletes any files with the same content, regardless of name. This is done based off of the SHA512 hash.
How long will it take to finish a 4TB storage volume?
Portonalga Posted December 5, 2023
I can't recommend strongly enough that you use actual duplicate-finder software instead of a script. You have little to no control over what gets deleted with a script. I run DupeGuru from my computer via SMB, and that has always yielded exactly the results I want. Not trying to undermine the effort of the OP or anything, but we need to be conscious about how we handle our data.