find-duplicates
find-duplicates finds duplicate files quickly based on the xxHashes of their contents.
$ go install github.com/twpayne/find-duplicates@latest
$ find-duplicates
{
  "cdb8979062cbdf9c169563ccc54704f0": [
    ".git/refs/remotes/origin/main",
    ".git/refs/heads/main",
    ".git/ORIG_HEAD"
  ]
}
find-duplicates [options] [paths...]
paths are directories to walk recursively. If no paths are given then the current directory is walked.
The output is a JSON object with one property for each observed xxHash. Each property's value is an array of the names of the files whose contents have that xxHash.
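Because the output is a JSON object of hash to filename arrays, it can be read back by other programs. The following is a minimal sketch of doing so in Go. The function name parseDuplicates is hypothetical and is not part of find-duplicates; the sample input is the example output shown earlier.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// parseDuplicates is a hypothetical helper (not part of find-duplicates).
// It decodes the tool's JSON output into a map from hash to filenames.
func parseDuplicates(data []byte) (map[string][]string, error) {
	var groups map[string][]string
	if err := json.Unmarshal(data, &groups); err != nil {
		return nil, err
	}
	return groups, nil
}

func main() {
	// Sample input copied from the example output in this README.
	sample := []byte(`{
	  "cdb8979062cbdf9c169563ccc54704f0": [
	    ".git/refs/remotes/origin/main",
	    ".git/refs/heads/main",
	    ".git/ORIG_HEAD"
	  ]
	}`)

	groups, err := parseDuplicates(sample)
	if err != nil {
		panic(err)
	}

	// Sort the hashes so that the report order is stable.
	hashes := make([]string, 0, len(groups))
	for hash := range groups {
		hashes = append(hashes, hash)
	}
	sort.Strings(hashes)

	for _, hash := range hashes {
		fmt.Printf("%s: %d files\n", hash, len(groups[hash]))
	}
}
```

In practice the bytes would come from stdout or from the file given to --output rather than from a string literal.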
Options are:
--keep-going or -k keeps going after errors.
--output=<file> or -o <file> writes output to <file>. The default is stdout.
--threshold=<int> or -t <int> sets the minimum number of files with the same content to be considered duplicates. The default is 2.
--statistics or -s prints statistics to stderr.
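To illustrate what the threshold means, here is a small sketch in Go that applies the same rule to already-parsed output: a group is kept only if it contains at least the threshold number of files. The function filterByThreshold and the sample hashes and filenames are hypothetical and only demonstrate the rule; this is not the tool's own code.

```go
package main

import "fmt"

// filterByThreshold is a hypothetical helper (not part of find-duplicates).
// It keeps only the groups with at least threshold files, which is the
// rule that the threshold option describes.
func filterByThreshold(groups map[string][]string, threshold int) map[string][]string {
	kept := make(map[string][]string)
	for hash, files := range groups {
		if len(files) >= threshold {
			kept[hash] = files
		}
	}
	return kept
}

func main() {
	// Made-up groups for illustration only.
	groups := map[string][]string{
		"hash-a": {"a.txt", "b.txt"},
		"hash-b": {"c.txt", "d.txt", "e.txt"},
	}

	// With a threshold of 3 only the three-file group remains.
	fmt.Println(len(filterByThreshold(groups, 3))) // prints 1
}
```

This can be useful for applying a stricter threshold to output that was saved earlier, without walking the directories again.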
How does find-duplicates work?
find-duplicates aims to be as fast as possible by doing as little work as possible, using each CPU core efficiently, and using all the CPU cores on your machine.
It consists of multiple components:
All components run concurrently.
MIT