Duplicate files across disks

January 15, 2023

I recently moved to a new array with new drives and used unbalance to copy the data onto them. I now notice duplicate files on different disks, as shown in the attachments. Is there a script or plugin to sort this out, or do I have to go in, find them, and delete them from the disks manually?

Thanks!

Solved by JorgeB

January 18, 2023

Author

*bump please

January 18, 2023

Community Expert

You could use, for example, the dupeGuru Docker container.

January 18, 2023

Author

No, that's not my issue. I used it, and it finds duplicates within the shares. Unraid has several copies of the same files spread across the actual disks, while the share shows only one file. For example, 'file1' is on Disk 1 and Disk 2 under Videos. Looking at the share, it shows only 'file1', yet it is taking up twice the disk space. Does that make sense?
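
To illustrate the situation described above, here is a minimal Python sketch. It assumes the array disks are mounted at /mnt/disk1, /mnt/disk2, and so on (the usual Unraid layout; adjust if yours differs). It only lists relative paths that exist on more than one disk and deletes nothing. The function name `find_cross_disk_duplicates` is made up for this example and is not part of Unraid or of any script linked in this thread.

```python
import glob
import os
from collections import defaultdict


def find_cross_disk_duplicates(disk_roots):
    """Return {relative_path: [disk roots]} for paths found on more than one disk."""
    seen = defaultdict(list)
    for root in disk_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                # Path relative to the disk root, e.g. "Videos/file1"
                rel_path = os.path.relpath(full_path, root)
                seen[rel_path].append(root)
    # Keep only paths that appear on two or more disks
    return {rel: roots for rel, roots in seen.items() if len(roots) > 1}


if __name__ == "__main__":
    # Assumption: array disks are mounted as /mnt/disk1, /mnt/disk2, ...
    disks = sorted(glob.glob("/mnt/disk[0-9]*"))
    for rel, roots in sorted(find_cross_disk_duplicates(disks).items()):
        print(rel, "->", ", ".join(roots))
```

A matching relative path only means the share would show one entry for it; it does not prove the copies are identical, so check before deleting anything.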

January 18, 2023

Community Expert
Solution

https://forums.unraid.net/topic/74760-how-do-i-find-duplicates-on-multiple-discs/?do=findComment&comment=688886

January 18, 2023

Author

Well, that's not gonna happen, lol. I'll work on my plan B: move the directories with unbalance to a new drive and overwrite. Thanks again, though.

January 18, 2023

That link contained info on dupeGuru, and there was also a second link (the one I was looking for to give you and couldn't find) to a script itimpi created, which you can find here:

https://forums.unraid.net/topic/33535-unraidfindduplicatessh/

January 19, 2023

Author

19 hours ago, klepel said:

That link contained info on dupeGuru, and there was also a second link (the one I was looking for to give you and couldn't find) to a script itimpi created, which you can find here:

https://forums.unraid.net/topic/33535-unraidfindduplicatessh/

Thanks! It helped me find them when I missed them. Appreciate it!

1 month later...

March 10, 2023

On 1/18/2023 at 11:23 AM, JorgeB said:

https://forums.unraid.net/topic/74760-how-do-i-find-duplicates-on-multiple-discs/?do=findComment&comment=688886

As an FYI, I find that Czkawka from Docker Hub works best.

11 months later...

February 18, 2024

Thanks for the clarification on share vs. disk usage. It helped confirm that this was the tool I wanted to try, rather than an external one that would have inherent limitations.
I didn't think I needed to do this and was just checking. My process is very clean, so it shouldn't have been necessary. Despite that, I still found a fair number of dupes. It was a good reminder of the old accountant's saying: 99% correct is 1% wrong. That doesn't mean your process doesn't work; it's a reminder of why there are checks and balances. I definitely think it's worth running occasionally, probably first on shares and then on disks.
Observations: matching on filenames alone takes a few gigs of memory with larger data sets, and it can probably consume even more with more complicated data sets or filters. It takes some time to process, so it's probably best to fire it off (before you run out of space) and come back later.
~~Delete~~ Dedupe with care... Good luck.

Edit: Most of these were simple name-collision dupes (e.g. "Book 1") 😂🤦‍♂️, but enough were real to be worth the effort.
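
Since a matching name does not mean matching content, one way to separate name collisions from real duplicates is to compare file sizes first and checksums second. A rough Python sketch follows; `file_digest` and `same_content` are hypothetical helper names, and SHA-256 is just one reasonable choice of hash, not necessarily what any of the tools mentioned in this thread use.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True only if both files have the same size and the same SHA-256 digest."""
    # Cheap check first: files of different sizes cannot be identical
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return file_digest(path_a) == file_digest(path_b)
```

Checking the size first avoids hashing most name-collision pairs at all, which matters when the files are large.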

Edited February 18, 2024 by ixit
