duck - the Debian Url ChecKer
duck [ OPTION ]... [-f file] [-u file] [-c file] [URL]
duck extracts links, email address domains and VCS-*
entries from the following files:
- debian/control
- debian/upstream, debian/upstream-metadata.yaml and
debian/upstream/metadata
- debian/copyright
- DEP-3 patch files in every directory in which a series file is found
- systemd.unit files (*.socket, *.device, *.mount,
*.automount, *.swap, *.target, *.path,
*.timer, *.snapshot, *.slice, *.scope)
- Appstream files (*.appdata)
If a URL is supplied, duck uses dget to download
the specified URL and processes the downloaded source package (*.dsc file)
instead of working on the current directory.
It tries to access those VCS-* entries and URLs using the
appropriate tool to find out whether the given URLs or entries are broken or
working. If errors are detected, the filename, fieldname and URL/email of
the broken entry are displayed.
duck searches for the default files (see above) and silently skips
any that cannot be found. If specific filenames are given for options
-c, -f or -u and one of those files cannot
be found, duck exits with exit code 2.
Email address domains are checked for existing MX records,
A records, or AAAA records, in this order. If none of these 3
are found for a given domain, it is considered broken.
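The lookup order described above can be sketched as follows. This is a minimal illustration, not duck's own code: `has_record` is a hypothetical callback standing in for a real DNS resolver, and the names `first_record_type` and `email_domain` are invented for this example.

```python
from typing import Callable, Optional

# Record types in the order described above: MX first, then A, then AAAA.
RECORD_ORDER = ("MX", "A", "AAAA")


def email_domain(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[1].lower()


def first_record_type(domain: str,
                      has_record: Callable[[str, str], bool]) -> Optional[str]:
    """Return the first record type found for `domain`, or None if broken.

    `has_record(domain, rtype)` is a hypothetical resolver callback; a real
    DNS library or a wrapper around a command-line tool could be plugged in.
    """
    for rtype in RECORD_ORDER:
        if has_record(domain, rtype):
            return rtype
    return None


if __name__ == "__main__":
    # Fake zone data standing in for real DNS answers (assumption).
    fake_zone = {("example.org", "A"), ("mail.example.net", "MX")}

    def lookup(domain: str, rtype: str) -> bool:
        return (domain, rtype) in fake_zone

    for addr in ("alice@example.org", "bob@mail.example.net",
                 "eve@nowhere.invalid"):
        found = first_record_type(email_domain(addr), lookup)
        print(addr, "->", found or "broken")
```

Passing the resolver in as a callback keeps the ordering logic testable without network access.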
Check results are displayed with three different error levels:
- O:
- (OK) Indicates that the given check did not result in an error. Only shown
if -v is used.
- I:
- (Information) Indicates informational warnings, such as missing helper
tools, as well as failing checks based on searches in unstructured text
files, which sometimes lead to false positives.
- E:
- (Error) Indicates failing checks based on data from well-defined fields
(e.g. Homepage: entry in debian/control).
and three different certainty levels:
- certain
- Data taken from well-defined fields. As the format of such a field is
specified (e.g. by Debian Policy), it can be checked by the appropriate
tools. If this check then fails, the data in the field is certainly
erroneous.
- possible
- Data extracted using regular expressions (e.g. email addresses, URLs).
Such extraction can pick up malformed entries, so a failing check is
possibly a false positive.
- wild-guess
- Data extracted from websites, by using regular expressions. This is still
experimental and probably buggy, hence the "wild-guess".
- -n
- dry run. Don't run any checks, just show entries to be checked.
- -q
- quiet mode. Suppress all output.
- -v
- verbose mode. This shows all URLs found and the checks run.
- --modules-dir=DIRECTORY
- specify modules directory. Mostly useful for developing new checks. If
this parameter is specified, only modules defined in this directory are
used. You have to copy all *.pm files from
/usr/share/duck/lib/checks to the directory specified.
- --color=[WHEN]
- Specify when to emit color escape sequences to the output. Available options
are:
- auto Emit color escape codes on STDOUT/STDERR; no color if output
is piped to a file or the current terminal is not capable of displaying
colors.
- always Always emit color escape codes.
- never Never emit color escape codes.
- --no-https
- do not try to find matching https URLs to http URLs. See also the
DUCK_NOHTTPS environment variable.
- --no-check-certificate
- do not check the authenticity of SSL certificates. This is highly
discouraged!
- --missing-helpers
- display a list of missing external helper tools and exit.
- --version
- display copyright and version information
- -f
- specify path to control file. This overrides the default
debian/control.
- -F
- skip processing of the control file.
- -u
- specify path to upstream metadata file. This overrides the default files
debian/upstream, debian/upstream-metadata.yaml and
debian/upstream/metadata.
- -U
- skip processing of the upstream metadata file.
- -c
- specify path to copyright file. This overrides the default
debian/copyright.
- -C
- skip processing of copyright file.
- -P
- skip processing of patch files.
- -A
- skip processing of appstream metadata files.
- -S
- skip processing of systemd.unit files.
- -l filename
- Process URLs, email addresses, git:// and svn:// entries from the file
specified. Specify one entry per line. This also disables all other check
modules searching for entries in various files.
- --disable-urlfix=<fix1,...>
- disables the specified urlfix(es). A urlfix tries to remove
leading/trailing characters from extracted URLs, such as trailing dots or
parentheses. Using this parameter enables all urlfixes except the
specified ones.
- --enable-urlfix=<fix1,...>
- enables the specified urlfix(es). Using this parameter disables
all urlfixes except the specified ones.
The following urlfixes are available:
- TRAILING_COLON Removes a trailing colon ":" character.
- TRAILING_PAREN_DOT Removes the string ")." from the end of the URL.
- TRAILING_PUNCTUATION Removes trailing "." and "," characters.
- TRAILING_QUOTES Removes trailing single quote "'" characters. Note: Double
quotes (") are already handled correctly by the Perl regex matchers used.
- TRAILING_SLASH_DOT Removes the string "/." (without the quotes) from the
end of the URL.
- TRAILING_SLASH_PAREN Removes the string "/)" (without the quotes) from the
end of the URL.
- --tasks=[number]
- Specify the number of checks allowed to run in parallel. The default is
24. The value must be an integer greater than 0.
All urlfixes are enabled by default.
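The urlfix behaviour and the --enable-urlfix/--disable-urlfix semantics described above can be illustrated with a short Python sketch. The regular expressions are this example's reading of the descriptions, not duck's actual Perl patterns, and the order in which fixes are applied is an assumption; `active_fixes` and `apply_urlfixes` are hypothetical names.

```python
import re

# Fix names are taken from the list above. The patterns are assumptions
# derived from the descriptions, not duck's own implementation.
URLFIXES = {
    "TRAILING_PAREN_DOT": re.compile(r"\)\.$"),
    "TRAILING_SLASH_DOT": re.compile(r"/\.$"),
    "TRAILING_SLASH_PAREN": re.compile(r"/\)$"),
    "TRAILING_COLON": re.compile(r":$"),
    "TRAILING_PUNCTUATION": re.compile(r"[.,]+$"),
    "TRAILING_QUOTES": re.compile(r"'+$"),
}


def active_fixes(enable=None, disable=None):
    """Return the fix names in effect.

    With no arguments all fixes are active. `enable` keeps only the named
    fixes; `disable` keeps every fix except the named ones.
    """
    names = list(URLFIXES)
    if enable is not None:
        names = [n for n in names if n in enable]
    if disable is not None:
        names = [n for n in names if n not in disable]
    return names


def apply_urlfixes(url, fixes=None):
    """Apply each active fix once, in dictionary order (an assumption)."""
    for name in (active_fixes() if fixes is None else fixes):
        url = URLFIXES[name].sub("", url)
    return url


if __name__ == "__main__":
    print(apply_urlfixes("http://example.org/foo)."))
    print(apply_urlfixes("http://example.org/page.",
                         active_fixes(disable={"TRAILING_PUNCTUATION"})))
```

The multi-character fixes are listed first so that ")." is removed as a unit before the single-character punctuation fix runs.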
- DUCK_NOHTTPS
- If this variable is set, do not try to find matching https URLs to http
URLs.
- XDG_CONFIG_HOME
- if this variable is set, use the config file (if any)
$XDG_CONFIG_HOME/duck/duck.conf. If it is unset, the default location is
$HOME/.config/duck/duck.conf.
- XDG_CONFIG_DIRS
- defines the preference-ordered set of base directories to search for
configuration files in addition to the XDG_CONFIG_HOME base
directory. The directories in XDG_CONFIG_DIRS should be separated
with a colon ':'.
To run duck, change your working directory to an extracted Debian
source package and run: duck
- 0
- Success, no errors
- 1
- Error(s) detected
- 2
- User-specified file not found
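A wrapper script might interpret these exit codes as in the following sketch. The helper names `describe_exit` and `run_duck` are hypothetical, and `run_duck` assumes that a `duck` executable is available on the PATH.

```python
import subprocess

# Exit codes as documented above.
EXIT_MEANINGS = {
    0: "success, no errors",
    1: "error(s) detected",
    2: "user-specified file not found",
}


def describe_exit(code):
    """Translate a duck exit code into a short description."""
    return EXIT_MEANINGS.get(code, "unexpected exit code %d" % code)


def run_duck(source_dir, extra_args=()):
    """Run duck inside `source_dir`; return (exit code, description).

    Raises FileNotFoundError if duck is not installed.
    """
    result = subprocess.run(["duck", *extra_args], cwd=source_dir)
    return result.returncode, describe_exit(result.returncode)


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, describe_exit(code))
```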
- debian/duck-overrides
- Overrides file in the Debian package source tree. This file contains a
list of URL regexes which should not be reported as down/broken. This is
useful when URLs are extracted from old or outdated copyright files or
patches: such URLs will never work again and would otherwise lead to
false positives. See the example in
/usr/share/doc/duck/examples.
- duck.conf
- Config file which contains the regular expressions used to detect parked
domains and redirected websites. The default file is
/etc/duck/duck.conf. duck also honors the XDG Base Directory
Specification; see the section ENVIRONMENT VARIABLES for details.
Search order for duck.conf is:
$XDG_CONFIG_HOME/duck/duck.conf (default: $HOME/.config/duck/duck.conf)
/etc/duck/duck.conf
<dir>/duck/duck.conf for each directory <dir> in $XDG_CONFIG_DIRS (default: /etc/xdg/duck/duck.conf)
Please see the XDG Base Directory Specification
(https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html)
for more details.
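The search order for duck.conf can be made concrete with a small Python sketch. It follows the order listed above; whether duck stops at the first file found or reads several is not stated here, so "first existing file wins" is an assumption, and the function names are hypothetical.

```python
import os
import posixpath


def duck_conf_candidates(env=None):
    """Return candidate duck.conf paths in the documented search order."""
    env = os.environ if env is None else env
    home = env.get("HOME", "")
    # XDG defaults apply when the variables are unset or empty.
    config_home = env.get("XDG_CONFIG_HOME") or posixpath.join(home, ".config")
    config_dirs = env.get("XDG_CONFIG_DIRS") or "/etc/xdg"

    candidates = [
        posixpath.join(config_home, "duck", "duck.conf"),
        "/etc/duck/duck.conf",
    ]
    # XDG_CONFIG_DIRS is a colon-separated, preference-ordered list.
    for directory in config_dirs.split(":"):
        if directory:
            candidates.append(posixpath.join(directory, "duck", "duck.conf"))
    return candidates


def find_duck_conf(env=None):
    """Return the first existing candidate, or None (assumed behaviour)."""
    for path in duck_conf_candidates(env):
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    for path in duck_conf_candidates():
        print(path)
```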
Please see http://duck.debian.net/ for additional
information as well as an overview of duck checks run on all source packages
in Debian/unstable.