Helmonder (September 8, 2014):
Duplicate file logging is no longer available in V6. See thread: http://lime-technology.com/forum/index.php?topic=35037.msg325868#msg325868 Tested in b8.
limetech (September 8, 2014):
That's a feature: it prevents the system log from blowing up when there are numerous duplicates. Duplicates are shown for each file (if they exist) when viewed via the 'folder' icon on the Shares page (that is, using the 'indexer' feature). Granted, this is not a good way to find all the duplicates. What's needed is a utility to scan an entire share and produce a report.
Helmonder (September 8, 2014):
Agreed, and the option in the syslog was never really handy either. A little link to something on the Settings page (or possibly something that can be scheduled to run) would be a lot better. Any command-line gurus out there who can think something up?
Helmonder (September 8, 2014):
Can someone make sense of this: http://duff.dreda.org/ ? Duff is supposed to find duplicates and should run on Slackware. No idea how to get it working from the unRAID console, though.
JonathanM (September 8, 2014):
Unfortunately we are not really talking about duplicates in the normal sense; we're actually talking about a naming collision that happens because of the way multiple disks can participate in a user share. /mnt/disk1/foldera/filea.txt will collide with /mnt/disk2/foldera/filea.txt: the combined view of /mnt/user/foldera/filea.txt will be the one on disk 1, and the file on disk 2 is basically invisible to the user share. The actual contents of the files are irrelevant; they may or may not be duplicates, so a traditional duplicate file finder is not very helpful. It can get really messy if you don't keep track of which applications are writing directly to the disks and which are using the user shares. On the plus side, you can easily hide files if you wish by purposely naming them identically and keeping them on different disks. Anyone using the user share will only see the first copy.
Helmonder (September 8, 2014):
Mwaa, that may be true, but I think the majority of users are using user shares, and the majority of users will sometimes do large disk-to-disk transfers using mc and/or a command-line copy. While doing that you run the risk of ending up with the same folder on more than one drive; this is what unRAID calls a duplicate. A tool should be run against the /mnt/disk* shares (therefore excluding /mnt/user) and then show duplicate files, where "duplicate" means the same full path name: /mnt/disk1/movies/movie.mkv and /mnt/disk2/movies/movie.mkv would be a duplicate, while /mnt/disk1/movies/movie.mkv and /mnt/disk2/movies/alternate/movie.mkv would not. If the term "duplicate" is misleading, you could also call them orphaned files or something like that.
Either way they are important, since they are hogging space and you would never know it. The above would be the ideal way, but a simple, standard duplicate scanner would also help a lot. You would have to manually weed out some of the non-duplicates, but that would not be too much of a hassle.
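As a rough illustration of the tool being asked for, the check can be sketched as a small shell pipeline: list every file on every data disk, strip the disk prefix so only the relative path remains, and report paths that occur more than once. This is only a sketch under the assumption that data disks are mounted at /mnt/disk1, /mnt/disk2, and so on; the function name find_dupes and its optional base-directory argument are invented here for illustration, not an existing unRAID utility.

```shell
# Hypothetical sketch: print each relative path that exists on more
# than one data disk. Takes an optional base directory (defaults to
# /mnt) so the logic can be tried against a test layout first.
find_dupes() {
    base="${1:-/mnt}"
    # List every file on every /mnt/diskN, strip the diskN prefix,
    # then report relative paths occurring more than once.
    find "$base"/disk[0-9]* -type f 2>/dev/null \
        | sed "s|^$base/disk[0-9]*/||" \
        | sort \
        | uniq -d
}
```

Each line of output is a relative path that is shadowed in the user share; prefixing it with each /mnt/diskN in turn shows which physical copies exist.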
switchman (September 8, 2014):
While it will not find all the duplicates, I posted an Excel-based utility yesterday that helps: http://lime-technology.com/forum/index.php?topic=33689.msg326408#msg326408 I have updated my utility to V3 and am sharing it if anyone needs it. I added a rudimentary capability for detecting duplicate files. It only looks within the share being checked, and the files must have the same directory structure. No other checking is performed, i.e. I don't look at the size or do any comparison other than the directory structure/filename.

For example, these two files would be flagged as duplicates:
\disk3\Movies\Two Weeks Notice (2002)\Two Weeks Notice.mkv
\disk4\Movies\Two Weeks Notice (2002)\Two Weeks Notice.mkv

The following would not be flagged, as they have a different directory structure:
\disk3\Movies\2 Weeks Notice (2002)\Two Weeks Notice.mkv
\disk4\Movies\Two Weeks Notice (2002)\Two Weeks Notice.mkv

The duplicate listings are displayed on the tab "Share_File_Listing". You can filter on the error column to display only the duplicates.
gfjardim (September 8, 2014):
Maybe something like this:

<?php
// Find files that share the same relative path on more than one data disk.

function relativePath($file) {
    // Strip the leading /mnt/diskN/ to get the path relative to the disk.
    preg_match("%/disk\d+/([^:]*)%", $file, $matches);
    return $matches[1];
}

function searchDuplicates($file) {
    // Return every other disk that holds a file at the same relative path.
    global $disks;
    $out = array();
    $rel = relativePath($file);
    foreach ($disks as $disk) {
        $abs = "$disk/$rel";
        if (is_file($abs) && $abs != $file) {
            $out[] = $abs;
        }
    }
    return $out;
}

function listDir($dir) {
    global $duplicates;
    $dir = rtrim($dir, '\\/');
    if (!is_dir($dir)) return;
    $files = array_diff(scandir($dir), array('..', '.'));
    if (!$files) return;
    natcasesort($files);
    foreach ($files as $f) {
        $dirname = "$dir/$f";
        if (is_dir($dirname)) {
            listDir($dirname);  // recurse into subdirectories
        } else {
            $dups = searchDuplicates($dirname);
            if (count($dups) && array_search($dirname, $duplicates) === FALSE) {
                echo "\n\nDuplicate:\n  " . shell_exec("ls -lah \"$dirname\"");
                $duplicates[] = $dirname;
                foreach ($dups as $dup) {
                    if (array_search($dup, $duplicates) === FALSE)
                        echo "  " . shell_exec("ls -lah \"$dup\"");
                    $duplicates[] = $dup;
                }
            }
        }
    }
}

// Global list of files already reported, so each set is printed once.
$duplicates = array();

// Collect the /mnt/diskN mount points.
preg_match_all("%disk\d+%", implode(" ", scandir("/mnt")), $matches);
$disks = array();
foreach ($matches[0] as $disk) {
    $disks[] = "/mnt/$disk";
}
natsort($disks);

foreach ($disks as $disk) {
    echo "Scanning $disk:";
    listDir($disk);
    echo "\n\n";
}
?>

It seems to work. It's a port I just did from a Python script I wrote a long time ago.
Helmonder (September 9, 2014):
How would I run that, though?
gfjardim (September 9, 2014):
php <name of the file>
Helmonder (September 9, 2014):
Ehm, I see this got moved to UNSCHEDULED. Did I miss something in the release notes of the previous version pertaining to the drop of this feature? I honestly think it a bit weird that something that was a feature now becomes an unscheduled new feature. And it's not something trivial either, at least I don't think so.
itimpi (September 10, 2014):
I have put together a shell script for my own use that gives me the sort of results that others might be looking for from such a utility. I have started a thread in the User Customizations area for this script.