DUFF(1) | General Commands Manual | DUFF(1) |
duff - duplicate file finder
duff [-0HLPaeqprtz] [-d function] [-f format] [-l limit] [file ...]
duff [-h]
duff [-v]
The duff utility reports clusters of duplicates in the specified files and/or directories. In the default mode, duff prints a customizable header, followed by the names of all the files in the cluster. In excess mode, duff does not print a header, but instead for each cluster prints the names of all but the first of the files it includes.
If no files are specified as arguments, duff reads file names from stdin.
Note that as of version 0.4, duff ignores symbolic links to files, as that behavior was conceptually broken. Therefore, the -H, -L and -P options now apply only to directories.
The following options are available:
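-0
If reading file names from stdin, assume they are null-terminated, instead of separated by newlines. Also, when printing file names and cluster headers, terminate them with null characters instead of newlines.
This is useful for file names containing whitespace or other non-standard characters.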
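-H
Follow symbolic links listed on the command line. This overrides any previous -L or -P option. Note that this only applies to directories, as symbolic links to files are never followed.
-L
Follow all symbolic links. This overrides any previous -H or -P option. Note that this only applies to directories, as symbolic links to files are never followed.
-P
Don't follow any symbolic links. This overrides any previous -H or -L option. This is the default. Note that this only applies to directories, as symbolic links to files are never followed.
-a
Include hidden files and directories when searching recursively.
-d function
The message digest function to use.
-e
Excess mode. List all but the first file from each cluster of duplicates. Also suppresses output of the cluster header.
-f format
Set the format of the cluster header. The following escape sequences are available: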
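%n
The number of files in the cluster.
%c
A legacy synonym for %d, for compatibility reasons.
%d
The message digest of files in the cluster. This may not be combined with -t as no digest is calculated.
%i
The one-based index of the file cluster.
%s
The size, in bytes, of a file in the cluster.
%%
A '%' character.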
The default format string when using -t is:
%n files in cluster %i (%s bytes)
The default format string for other modes is:
%n files in cluster %i (%s bytes, digest %d)
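-h
Display a help message and exit.
-l limit
The minimum size of files to be sampled. If the size of files in a cluster is equal to or greater than the specified limit, duff will sample and compare a few bytes from the start of each file before calculating a full digest. This is strictly an optimization and does not affect which files are considered by duff. The default limit is zero bytes, i.e. to use sampling on all files.
-q
Quiet mode. Suppress warnings and error messages.
-p
Physical mode. Make duff consider physical files instead of hard links. If specified, multiple hard links to the same physical file will not be reported as duplicates.
-r
Recursively search into all specified directories.
-t
Thorough mode. Distrust digests as a guarantee for equality. In thorough mode, duff compares files byte by byte when their sizes match.
-v
Display version information and exit.
-z
Do not consider empty files to be equal.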
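As an illustration of the escape sequences accepted by -f, the following shell sketch expands a format string by hand. It is not taken from duff itself: the expand_header function and its argument order are hypothetical, and passing unknown escapes through unchanged is an assumption rather than documented duff behavior.

```shell
# Hypothetical helper, not part of duff: expand a cluster header format.
# Usage: expand_header FORMAT COUNT INDEX SIZE DIGEST
expand_header() {
    awk -v fmt="$1" -v n="$2" -v idx="$3" -v size="$4" -v digest="$5" 'BEGIN {
        out = ""
        # Consume the format one escape at a time, left to right.
        while ((p = index(fmt, "%")) > 0) {
            out = out substr(fmt, 1, p - 1)
            c = substr(fmt, p + 1, 1)
            if (c == "n") out = out n
            else if (c == "i") out = out idx
            else if (c == "s") out = out size
            else if (c == "d" || c == "c") out = out digest  # %c is the legacy synonym
            else if (c == "%") out = out "%"
            else out = out "%" c   # assumption: unknown escapes pass through
            fmt = substr(fmt, p + 2)
        }
        print out fmt
    }'
}

# The default header format used with -t, filled in with sample values.
expand_header '%n files in cluster %i (%s bytes)' 3 1 1024 ''
```

The final line prints "3 files in cluster 1 (1024 bytes)", which is the shape of header the default -t format produces for a cluster of three 1024-byte files.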
The command:
duff -r foo/
lists all duplicate files in the directory foo and its subdirectories.
The command:
duff -e0 * | xargs -0 rm
removes all but one file from each cluster of duplicates in the current directory. Note that you have no control over which files in each cluster are selected by -e (excess mode). Use with care.
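Because excess mode gives no control over which files are selected, a more cautious variant of the command above is to move the selected files aside instead of removing them. The sketch below assumes an xargs that supports -0 and -I; the quarantine function and the duff-excess directory name are hypothetical, not part of duff.

```shell
# Hypothetical helper: move every NUL-terminated file name read from stdin
# into the directory given as $1 instead of deleting it.
quarantine() {
    mkdir -p -- "$1" || return 1
    # One mv per name; "--" protects names that begin with a dash.
    xargs -0 -I{} mv -- {} "$1"/
}

# Intended use (requires duff):
#   duff -e0 * | quarantine ./duff-excess
```

The moved files can be inspected and deleted later. This works for names taken from a single directory, as with the * argument above; files with the same base name from different directories would overwrite each other in the target directory.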
The command:
find . -name '*.h' -type f | duff
lists all duplicate header files in the current directory and its subdirectories.
The command:
find . -name '*.h' -type f -print0 | duff -0 | xargs -0 -n1 echo
lists all duplicate header files in the current directory and its subdirectories, correctly handling file names containing whitespace. Note the use of xargs and echo to remove the null separators again before listing.
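As an alternative to xargs and echo in the command above, tr can convert the null terminators back to newlines in a single process. This is a minimal sketch; the nul_to_lines function name is hypothetical.

```shell
# Hypothetical helper: turn NUL terminators back into newlines for display.
nul_to_lines() {
    tr '\000' '\n'
}

# Intended use (requires duff):
#   find . -name '*.h' -type f -print0 | duff -0 | nul_to_lines
```

As with the xargs variant, the result is meant for reading only: once the terminators are newlines again, file names that themselves contain newlines can no longer be told apart.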
The duff utility exits 0 on success, and >0 if an error occurs.
Camilla Berglund ⟨elmindreda@elmindreda.org⟩
duff doesn't check whether the same file has been specified twice on the command line. This will lead it to report files listed multiple times as duplicates when not using -p (physical mode). Note that this problem only affects files, not directories.
duff no longer (as of version 0.4) reports symbolic links to files as duplicates, as they're by definition always duplicates. This may break scripts relying on the previous behavior.
If the underlying files are modified while duff is running, all bets are off. This is not really a bug, but it can still bite you.
January 18, 2012 | Debian |