Comments on: Drowning in Files http://ask.metafilter.com/341901/Drowning-in-Files/
Comments on Ask MetaFilter post Drowning in Files
Wed, 12 Feb 2020 17:50:09 -0800
Wed, 12 Feb 2020 18:13:06 -0800
Question: Drowning in Files
http://ask.metafilter.com/341901/Drowning-in-Files
Moving to a new computer. This involves moving a library of files that have been churning for 15-some-odd years. Many are obsolete and a lot of them are duplicates.
The duplicates need to be weeded and others need to be sorted by age and evaluated.
I bought a copy of the latest 64-bit version of Xplorer2 that claims to find and remove duplicate files. It does not do the job.
What do you use for dealing with duplicate files?
posted by Raybun, Wed, 12 Feb 2020 17:50:09 -0800
By: Sunburnt http://ask.metafilter.com/341901/Drowning-in-Files#4905105
<a href="http://www.scootersoftware.com/index.php">Beyond Compare</a> is what I use, though I don't use it often. Works a treat.
posted by Sunburnt, Wed, 12 Feb 2020 18:13:06 -0800
By: jmfitch http://ask.metafilter.com/341901/Drowning-in-Files#4905107
CCleaner will do it.
posted by jmfitch, Wed, 12 Feb 2020 18:24:00 -0800
By: davcoo http://ask.metafilter.com/341901/Drowning-in-Files#4905110
Did you "flatten" (File &gt; Browse flat) your drive first in Xplorer2? <a href="https://www.zabkat.com/x2h_5.htm">Here's</a> the section of the help file dealing with duplicate files.
posted by davcoo, Wed, 12 Feb 2020 18:29:23 -0800
By: flabdablet http://ask.metafilter.com/341901/Drowning-in-Files#4905116
<em>What do you use for dealing with duplicate files?</em><br>
<br>
I rely on the fact that every new computer I acquire comes with exponentially more storage than the one it replaced, and on using an <a href="https://borgbackup.readthedocs.io/en/stable/">excellent de-duplicating archiver</a> to back them all up.<br>
<br>
Long and bitter experience has taught me that "cleaning up" my computers has always caused me far more grief than not doing that. But if you're determined to go ahead with some such effort anyway, I <em>strongly</em> recommend making sure you have at least one <em>complete</em> offline archive of <em>all</em> of the pre-existing mess on any computer you propose to subject to the tidying process.<br>
<br>
Many, many people come to grief by trying to tidy their machines up <em>preparatory</em> to making their backups, on the apparent basis that they don't want their backups to be messy. But backups just <em>are</em> messy, especially if they're done as regularly as they really ought to be, and tidying up is <em>peak</em> data loss risk.
posted by flabdablet, Wed, 12 Feb 2020 18:47:40 -0800
By: The Underpants Monster http://ask.metafilter.com/341901/Drowning-in-Files#4905124
I do a search for files of that type, sort by name, then delete the duplicates from the search window. Then again, I have a high tolerance for repetitive tasks.
posted by The Underpants Monster, Wed, 12 Feb 2020 19:03:43 -0800
By: sardonyx http://ask.metafilter.com/341901/Drowning-in-Files#4905135
I've used <a href="http://www.clonespy.com/">CloneSpy</a> before. I'm not saying it got all of the duplicates, but it helped me clean up a lot of them before I did an OS upgrade.
posted by sardonyx, Wed, 12 Feb 2020 19:23:55 -0800
By: lhauser http://ask.metafilter.com/341901/Drowning-in-Files#4905155
I use <a href="https://www.digitalvolcano.co.uk/dcpro.html">Duplicate Cleaner Pro</a>. It is fantastic with photos, but works on everything. Worth every penny of $29.95.
posted by lhauser, Wed, 12 Feb 2020 20:46:03 -0800
By: Aleyn http://ask.metafilter.com/341901/Drowning-in-Files#4905162
Seconding flabdablet's advice on making a complete "messy" backup before you start deleting anything. My opinion is that the best way to approach this is to dump all your old, unsorted stuff into an "archive" folder, then pull stuff that you need and want organized out of it as you need it. If you really need to free up space, external hard drives are pretty cheap.
posted by Aleyn, Wed, 12 Feb 2020 20:51:25 -0800
By: flabdablet http://ask.metafilter.com/341901/Drowning-in-Files#4905183
<em>the best way to approach this is to dump all your old, unsorted stuff into an "archive" folder, then pull stuff that you need and want organized out of it as you need it</em><br>
<br>
Yep. And if you always do that by <em>moving</em> entire existing folder trees into a date-named subfolder inside your Archive folder, like Archive\2020-02-13, and then <em>moving</em> individual files out of Archive subfolders and into their shiny new well-organized locations when you first need them, then the archiving and dearchiving steps can run conveniently fast and your Archive folder can be incorporated into the 15-year evolving mess in a reasonably sustainable fashion.<br>
<br>
If you're consistent about moving stuff in and out of Archive subfolders rather than trying to use Archive as a backup by making copies, then Archive will cost completely negligible amounts of disk space. Also, anything remaining inside an Archive/$date subfolder more than say five years old has clearly not been touched in at least five years and is therefore fairly unlikely to be needed again in a hurry. That lets you simply hive your oldest Archive subfolders off to external storage <em>without</em> needing to trigger a big tidy-up whenever you feel like reclaiming a bit of primary drive space. I like to think of this process as <a href="https://en.wikipedia.org/wiki/Subduction">subduction</a>.<br>
<br>
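The move-into-a-dated-subfolder step described above can be sketched in a few lines of Python. This is only a minimal sketch, not anyone's actual tooling: the function name `archive_tree` and its parameters are made up for illustration, and it assumes the Archive folder lives on the same drive as the folder being moved, so that the move is a rename rather than a copy.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def archive_tree(source: Path, archive_root: Path, today: Optional[date] = None) -> Path:
    """Move a whole folder tree into archive_root/YYYY-MM-DD/ and return its new path.

    `archive_tree` is a hypothetical helper name used only for this sketch.
    """
    stamp = (today or date.today()).isoformat()  # e.g. "2020-02-13"
    dest_dir = archive_root / stamp
    dest_dir.mkdir(parents=True, exist_ok=True)

    dest = dest_dir / source.name
    if dest.exists():
        # Never merge into or overwrite something archived earlier the same day.
        raise FileExistsError(f"{dest} already exists; refusing to overwrite")

    # When source and destination are on the same filesystem this is a rename,
    # so no file contents are copied. Across filesystems shutil.move falls back
    # to copy-then-delete, which is slower and temporarily needs extra space.
    shutil.move(str(source), str(dest))
    return dest
```

Calling something like `archive_tree(Path("Documents/Old Stuff"), Path("Archive"))` would leave the tree at `Archive/<today's date>/Old Stuff`. It moves rather than copies, so it is no substitute for a backup.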
It bears repeating that Archive is an organizing tool, not a backup strategy. <em>Make</em> backups, and make sure that the contents of Archive get backed up along with everything else.
posted by flabdablet, Wed, 12 Feb 2020 22:40:32 -0800
By: Homer42 http://ask.metafilter.com/341901/Drowning-in-Files#4905199
I use the <a href="https://www.voidtools.com/support/everything/">Everything</a> filename search tool. Finding dupes is easy. Enter this in the search bar - <br>
dupe:<br>
<br>
If the number of matches is too large to deal with, you can narrow it down using the <a href="https://www.voidtools.com/support/everything/searching/">search syntax</a>. Although the syntax is powerful and extensive, a subset will meet your needs. These are the ones I use the most.<br>
<br>
Find duplicates of a specified file type, e.g.<br>
dupe: doc:<br>
(or - audio: zip: exe: pic: video:)<br>
<br>
Find duplicates using a size range constant defined in the search syntax, e.g.<br>
dupe: size:large<br>
(or - size:tiny size:small size:medium size:huge size:gigantic)<br>
<br>
You can combine search syntax, e.g. <br>
dupe: video: size:gigantic<br>
<br>
Find duplicates by file suffix, e.g.<br>
dupe: endwith:pdf<br>
(or any other file suffix)
posted by Homer42, Thu, 13 Feb 2020 03:12:02 -0800
By: urbanwhaleshark http://ask.metafilter.com/341901/Drowning-in-Files#4905202
<a href="https://www.bigbangenterprises.de/en/doublekiller/">DoubleKiller</a>
posted by urbanwhaleshark, Thu, 13 Feb 2020 04:02:56 -0800
By: megatherium http://ask.metafilter.com/341901/Drowning-in-Files#4905203
Although your question focuses on duplicates, consider the following for the "obsolete" files: Use xplorer2 to find all files that are more than 3 years (or any other number of your choice) old and move them to a reliable external hard drive as an archive. That way the new computer looks less crowded but you haven't deleted anything that you might need to find next October.
posted by megatherium, Thu, 13 Feb 2020 04:13:11 -0800
By: Bangaioh http://ask.metafilter.com/341901/Drowning-in-Files#4905214
<em>find all files that are more than 3 years (or any other number of your choice) old and move them to a reliable external hard drive</em><br>
<br>
It depends on what files the OP uses regularly. For example, it's possible that they have a music library where most of the files are &gt;3 years old yet they could be needed at any moment.<br>
<br>
If you can afford the initial disk space penalty, I second <strong>flabdablet</strong> and <strong>Aleyn</strong>'s answers and emphasise you should place the Archive/$DATE folders within the <em>same physical storage device <strong>and</strong> filesystem</em>: whenever some old file is needed a simple <em>CTRL-X CTRL-V</em> will fix it instantly vs. having to plug in your USB drive and wait until things are copied back.<br>
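Whether a move will be the instant kind can be checked up front. The snippet below is a rough sketch under one assumption: that comparing the device identifier reported by `os.stat` is a good enough test for "same filesystem" on your system. The helper name `same_filesystem` is invented for this example.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both existing paths report the same device identifier.

    Hypothetical helper for illustration. If this returns True, moving a file
    from one location to the other is normally a quick rename rather than a
    full copy.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

Both paths must already exist, for example the folder you are about to archive and the Archive folder itself.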
The external storage is still welcome as a full backup for when you start pruning things from the Archive.
posted by Bangaioh, Thu, 13 Feb 2020 05:33:38 -0800
By: uberchet http://ask.metafilter.com/341901/Drowning-in-Files#4905276
<blockquote><i>find all files that are more than 3 years (or any other number of your choice) old and move them to a reliable external hard drive</i></blockquote><br>
<br>
This seems like false economy.<br>
<br>
Unless you're big into video processing or something similar, it's HIGHLY unlikely that your garden-variety files represent a material portion of your disk usage. The incremental cost of keeping them around (with or without dupes, even) is pretty low, and the possible upside is large. <br>
<br>
I've never done ANY such purge, and have just migrated my data from old Mac to new Mac for nearly 20 years. It's a non-issue.<br>
<br>
When I have disk space issues, it's because of enormous client-party databases, or virtual machines, or because I need to do some catalog maintenance in Lightroom (you really CAN run out of space with higher-end photography, but LR makes it easy to push prior years to a network volume).<br>
<br>
<b>Generally speaking, though, don't put anything you actually want to keep onto an external drive.</b> Stuff on outboard disks tends to get neglected for backups and other things, and the next thing you know you've lost data.
posted by uberchet, Thu, 13 Feb 2020 07:41:30 -0800
By: Bangaioh http://ask.metafilter.com/341901/Drowning-in-Files#4905424
Agreed, but the question doesn't seem to hinge on disk space usage but rather on organisation, which is why the "move (but not delete yet) everything out of the way and start anew" approach seems worthwhile.<br>
<br>
It's dangerous to make duplicate finder recommendations because the OP doesn't specify what exactly counts as a dupe for their purposes. Are 2 MP3 files with identical audio stream but different tags dupes? Or an MP3 and a FLAC of the same song? Or do only byte-for-byte identical files count?
posted by Bangaioh, Thu, 13 Feb 2020 11:11:46 -0800
By: soelo http://ask.metafilter.com/341901/Drowning-in-Files#4905529
I second the recommendations for Everything and CloneSpy. Everything is better when you want to view the dupes and decide for yourself which one (if either) to delete. CloneSpy has an option to go through one by one, but then you have to decide for each pair as it presents them to you rather than being able to scroll through as you please. CloneSpy is great for batch jobs where you trust the criteria.
posted by soelo, Thu, 13 Feb 2020 14:26:50 -0800
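For the strictest definition of a dupe raised earlier in the thread, byte-for-byte identical files, a small script can produce a list to review. The following is a sketch only, not how any of the tools named above work internally: it groups files by size first, then by SHA-256 hash, and treats equal hashes as identical content. The names `file_digest` and `find_identical_files` are invented for this example, and it only reports groups; it deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_identical_files(root: Path) -> List[List[Path]]:
    """Return groups of files under root whose contents hash identically."""
    # Pass 1: group by size, since files of different sizes cannot be identical.
    by_size: Dict[int, List[Path]] = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    # Pass 2: hash only the files that share a size with at least one other file.
    groups: List[List[Path]] = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        by_hash: Dict[str, List[Path]] = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Something like `for group in find_identical_files(Path("Archive")): print(group)` gives a list to go through by hand. It will not catch the looser cases mentioned above, such as two MP3s that differ only in their tags.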