shatag - tag files with their SHA-256 checksums
shatag [-fhlLqrstuv0] [-d DATABASE]
[-n NAME] [-R NAME]... [FILES]...
shatag is a tool for computing and caching SHA-256 file
checksums, and for efficiently searching for identical files across systems.
Checksums are stored using the POSIX Extended Attributes filesystem
facility, and are preserved when files are moved or renamed. Checksums can
be fetched from a remote host and stored in an SQLite database for fast
lookups.
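As an illustration of the caching scheme described above, here is a minimal Python sketch that computes a SHA-256 checksum and stores it in extended attributes. The attribute names user.shatag.sha256 and user.shatag.ts, and the choice to store the file's modification time next to the checksum, are assumptions made for this example and are not confirmed by this page. os.setxattr and os.getxattr are available on Linux only.

```python
import hashlib
import os

# Assumed attribute names; this page does not state which ones shatag uses.
ATTR_HASH = "user.shatag.sha256"
ATTR_TS = "user.shatag.ts"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tag_file(path):
    """Cache the checksum and the file's mtime in extended attributes.

    Returns the checksum. Raises OSError if the filesystem does not
    support user extended attributes.
    """
    mtime = os.stat(path).st_mtime
    checksum = sha256_of_file(path)
    os.setxattr(path, ATTR_HASH, checksum.encode("ascii"))
    os.setxattr(path, ATTR_TS, repr(mtime).encode("ascii"))
    return checksum


def cached_checksum(path):
    """Return the cached checksum, or None if it cannot be read."""
    try:
        return os.getxattr(path, ATTR_HASH).decode("ascii")
    except OSError:
        return None
```

Because the attributes belong to the file itself rather than to its path, a rename or a move within the same filesystem leaves them in place.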
When invoked with no options, shatag just displays the
cached, valid checksums. If no files are specified, it operates on all
non-hidden files in the current directory. The output format is identical to
that of the sha256sum command.
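The record layout can be sketched in Python as follows. The default text-mode sha256sum layout is the hex digest, two spaces, and the file name. The helper names are hypothetical, and ending every record with the separator character (including the last one) is an assumption about how null-separated output would look.

```python
def format_records(records, null_separated=False):
    """Format (checksum, filename) pairs in sha256sum's text layout.

    Each record is "<hex digest>  <filename>" followed by a newline,
    or by a null character when null_separated is true (assumed layout).
    """
    terminator = "\0" if null_separated else "\n"
    return "".join(
        f"{checksum}  {name}{terminator}" for checksum, name in records
    )


def parse_records(text, null_separated=False):
    """Split formatted output back into (checksum, filename) pairs."""
    terminator = "\0" if null_separated else "\n"
    pairs = []
    for record in text.split(terminator):
        if record:
            checksum, name = record.split("  ", 1)
            pairs.append((checksum, name))
    return pairs
```

Null-separated records remain unambiguous even when a file name contains a newline, which the line-based layout cannot guarantee.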
- -0, --null
- Instead of outputting one record per line (as sha256sum does),
separate records with null characters.
- -c, --canonical
- Show canonical (full path) file names.
- -d DATABASE,
--database DATABASE
- Set the path of the SQLite database to query when using -l,
-L or -p. The default path is $HOME/.shatagdb, which can be overridden
in the config file.
Instead of a file name, a PostgreSQL database can be specified
with a prefix of "pg:" followed by a psycopg2 DSN string,
like:
"pg:dbname=shatag user=myuser password=mypassword
host=192.168.1.3"
- -f, --force
- When running with -t or -u, recompute the checksum and overwrite the
old one, even if the timestamp indicates that the stored checksum is
still good.
- -h, --help
- Display the help message.
- -l, --lookup
- Instead of displaying the checksums, look them up against the local
database and indicate whether the file exists elsewhere. A yellow - mark
indicates that the file does not exist anywhere else, a green = that the
file exists at one or several remote locations, a red + that the
file has a duplicate on the local system, and a magenta * that the
file is empty.
- -L,
--lookup-verbose
- Instead of displaying the checksums, look them up against the local
database. Print all the known remote locations for identical files.
- -n NAME, --name
NAME
- Name of local storage (defaults to the canonical local host name). This needs
to be correct if the local database contains entries for this host.
- -p, --put
- Record found tags in the database, for duplicate detection.
- -q, --quiet
- Do not display the valid checksums when they are found.
- -r, --recursive
- Recurse through subdirectories.
- -R NAME, --remote
NAME
- When using -l or -L, restrict the set of remote names to consider.
If present, all other storages are ignored.
- -s, --scrub
- Recompute the checksum even if the timestamp indicates it would not be
needed, and report inconsistencies. Useful to detect silent corruption.
- -t, --tag
- Compute new checksums for files that don't have one, or when it is
outdated.
- -u, --update
- Recompute the outdated checksums only. Be aware that this can behave
counter-intuitively; outdated checksums will only exist for files that
have been appended to or partially modified. Many programs dealing with
small files (some well-known text editors, notably) will overwrite the
whole file when saving, and the new file will be lacking a checksum
entirely. For these cases, use -t instead.
- -v, --verbose
- Report encountered files that have an outdated or missing checksum.
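The interaction between -t, -u, -f and -s can be summarised as a small decision table. The Python sketch below is one reading of the option descriptions above, not shatag's actual code. It assumes that a checksum counts as outdated when the stored timestamp differs from the file's modification time, that -f never creates a checksum for an untagged file under -u, and that -s only applies to files that already carry a checksum. All names are hypothetical.

```python
from enum import Enum


class TagState(Enum):
    MISSING = "missing"    # no checksum stored on the file
    OUTDATED = "outdated"  # stored timestamp no longer matches the file
    VALID = "valid"        # stored timestamp matches the file


def tag_state(stored_ts, file_mtime):
    """Classify a file from its stored timestamp and its current mtime."""
    if stored_ts is None:
        return TagState.MISSING
    if stored_ts != file_mtime:
        return TagState.OUTDATED
    return TagState.VALID


def should_recompute(state, tag=False, update=False, force=False, scrub=False):
    """Decide whether the checksum would be recomputed for a file."""
    if state is TagState.MISSING:
        # Only -t creates a checksum for a file that has none.
        return tag
    if scrub:
        # -s recomputes regardless of what the timestamp says.
        return True
    if not (tag or update):
        # With neither -t nor -u, cached checksums are only displayed.
        return False
    # -t and -u refresh outdated checksums; -f also refreshes valid ones.
    return force or state is TagState.OUTDATED
```

Under this reading, a file rewritten from scratch by an editor ends up in the MISSING state, which is why -u skips it and -t is needed.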
Retag a whole directory and record everything to the database:
Check files in the current directory for remote duplicates:
Show alternate locations for duplicates of a single file:
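The command lines for these three tasks are not shown above. Going only by the option descriptions on this page, plausible invocations can be assembled as in the Python sketch below. The flag combinations are an assumption derived from those descriptions rather than the original examples, and the helper name is hypothetical.

```python
import shlex


def shatag_command(task, path=None):
    """Build an argument list for one of the three example tasks.

    The flag choices are inferred from the option descriptions and are
    not taken from shatag's own examples.
    """
    if task == "retag":
        # -t tag, -r recurse, -p record found tags in the database
        argv = ["shatag", "-t", "-r", "-p"]
    elif task == "check":
        # -l look up checksums and mark duplicates
        argv = ["shatag", "-l"]
    elif task == "locate":
        # -L print known remote locations for identical files
        if path is None:
            raise ValueError("the 'locate' task needs a file path")
        argv = ["shatag", "-L"]
    else:
        raise ValueError(f"unknown task: {task!r}")
    if path is not None:
        argv.append(path)
    return argv


if __name__ == "__main__":
    for task, path in [("retag", None), ("check", None), ("locate", "disk.iso")]:
        print(shlex.join(shatag_command(task, path)))
```

The returned list can be passed directly to subprocess.run on a system where shatag is installed.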
- ~/.shatagrc
- YAML configuration file. It currently supports only two configuration
keys: "database", which sets the database path (by default,
~/.shatagdb), and "name", which sets the volume name in the database
(defaults to the canonical host name).
Examples:
database: /var/lib/shatag.db # sqlite3 backend
database: "pg: dbname=shatag host=localhost user=shatag password=xxxsecretpasswordxxx" # postgres backend
database: http://service.com/shatag # http backend
database: insecure-https://service.com/shatag # http backend, skip ssl certificate verification
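The different forms of the "database" value can be told apart by their prefix. The following Python sketch shows one way to do so. The Backend type and the function name are hypothetical, the handling of a plain https:// URL is an assumption (only http:// and insecure-https:// appear above), and the code does not reflect how shatag itself parses the setting.

```python
from typing import NamedTuple


class Backend(NamedTuple):
    kind: str                 # "sqlite", "postgres" or "http"
    target: str               # file path, DSN string or URL
    verify_tls: bool = True   # False only for insecure-https:// URLs


def parse_database_setting(value):
    """Classify a "database" setting by its prefix."""
    value = value.strip()
    if value.startswith("pg:"):
        # Everything after "pg:" is taken as the DSN string.
        return Backend("postgres", value[len("pg:"):].strip())
    if value.startswith("insecure-https://"):
        # Same service over HTTPS, with certificate verification skipped.
        url = "https://" + value[len("insecure-https://"):]
        return Backend("http", url, verify_tls=False)
    if value.startswith(("http://", "https://")):
        return Backend("http", value)
    # Anything else is treated as the path of an SQLite file.
    return Backend("sqlite", value)
```

For example, parse_database_setting("/var/lib/shatag.db") yields an "sqlite" backend, while the "pg:" form yields a "postgres" backend whose target is the DSN with the prefix removed.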
Support for non-ASCII filenames across systems of different and/or
inconsistent encodings has not been fully tested.
Not all option combinations are sensible.
Report shatag bugs to the bugtracker at
http://bitbucket.org/maugier/shatag