htmlproofer - validate rendered HTML files
htmlproofer directory [options]
htmlproofer is a set of tests to validate HTML output.
These tests check whether image references are legitimate, whether images have
alt attributes, whether internal links are working, and so on. HTMLProofer can run on a
file, a directory, an array of directories, or an array of links. Below is a
mostly comprehensive list of checks that it can perform.
Images (<img> elements)
- Whether all images have alt attributes
- Whether internal image references are not broken
- Whether external images are showing
- Whether images are HTTPS
Links (<a>, <link> elements)
- Whether internal links are working
- Whether internal hash references (#linkToMe) are working
- Whether external links are working
- Whether links are HTTPS
- Whether CORS/SRI is enabled
Scripts (<script> elements)
- Whether internal script references are working
- Whether external scripts are loading
- Whether CORS/SRI is enabled
Favicon
- Whether favicons are valid.
OpenGraph
- Whether the images and URLs in the OpenGraph metadata are valid.
HTML
- Whether your HTML markup is valid.
This is done via Nokogiri to ensure well-formed markup.
Listed below are the command line options for
htmlproofer:
- --allow-missing-href
- Don't flag tags missing an href attribute. This is the default for
HTML5.
- --allow-hash-href
- Ignores href="#".
- --as-links
- Assumes that PATH is a comma-separated array of links to
check.
- --alt-ignore
image1,[image2,...]
- A comma-separated list of Strings or RegExps containing images whose
missing alt attributes are safe to ignore.
- --assume-extension
- Automatically add extension (e.g. .html) to file paths, to allow
extensionless URLs (as supported by Jekyll 3 and GitHub Pages).
- --checks-to-ignore
check1,[check2,...]
- An array of Strings indicating which checks not to perform.
- --check-external-hash
- Checks whether external hashes exist (even if the webpage exists). This
slows the checker down.
- --check-favicon
- Enables the favicon checker.
- --check-html
- Enables HTML validation errors from Nokogiri.
- --check-img-http
- Checks that images use HTTPS.
- --check-opengraph
- Enables the Open Graph checker.
- --check-sri
- Checks that external resources loaded via <link> and <script>
use SRI.
- --directory-index-file
filename
- Sets the file to look for when a link refers to a directory. Defaults to
index.html.
- --disable-external
- Don't run the external link checker, which can take a lot of time.
- --empty-alt-ignore
- If true, ignores images with empty alt attributes.
- --error-sort
sort
- Defines the sort order for error output. Can be :path,
:desc, or :status. Defaults to :path.
- --enforce-https
- Fails if a link is not marked as HTTPS.
- --extension
ext
- The extension of HTML files including the dot. Defaults to
.html.
- --external_only
- Only check for problems with external references.
- --file-ignore
file1,[file2,...]
- A comma-separated list of Strings or RegExps containing file paths that
are safe to ignore.
- --help
- Print this usage information on the command line.
- --http-status-ignore
123,[xxx, ...]
- A comma-separated list of numbers representing status codes to
ignore.
- --internal-domains
domain1,[domain2,...]
- A comma-separated list of Strings containing domains that will be treated
as internal urls.
- --report-invalid-tags
- Report errors from --check-html associated with unknown
markup. These errors are ignored by default.
- --report-missing-names
- Report errors from --check-html associated with missing
entities. These errors are ignored by default.
- --report-script-embeds
- Report errors from --check-html associated with
<script>s. These errors are ignored by default.
- --log-level
level
- Sets the logging level, as determined by Yell. One of :debug,
:info, :warn, :error, or :fatal. Defaults to
:info.
- --only-4xx
- Only reports errors for links that fall within the 4xx status code
range.
- --storage-dir
directory
- Directory where to store the cache log. Defaults to
tmp/.htmlproofer.
- --timeframe
time
- A string representing the caching timeframe, for example 30d for
thirty days.
- --typhoeus-config
string
- JSON-formatted string of Typhoeus config. It will override the
html-proofer defaults.
- --url-ignore
link1,[link2,...]
- A comma-separated list of Strings or RegExps containing URLs that are safe
to ignore. It affects all HTML attributes. Non-HTTP(S) URIs are always
ignored.
- --url-swap
re:string,[re:string,...]
- A comma-separated list containing key-value pairs of RegExp =>
String. It transforms URLs that match RegExp into String
via gsub. The escape sequence \: should be used to produce a
literal :.
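As a rough sketch of how a value for --typhoeus-config might be passed from a shell: the JSON is wrapped in single quotes so that its double quotes reach htmlproofer intact. The keys connecttimeout and timeout are assumptions chosen for illustration, and the command is only printed, not run.

```shell
# Assumed Typhoeus settings; adjust the keys and values to your needs.
config='{"connecttimeout":10,"timeout":30}'

# Single quotes around the JSON keep the inner double quotes intact.
cmd="htmlproofer --typhoeus-config '$config' ./out"

# Print the command for inspection instead of executing it.
printf '%s\n' "$cmd"
```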
For options which require an array of input, values can be
surrounded by quotes. Do not put spaces between the values. For example, to
ignore several HTTP status codes:
htmlproofer --http-status-ignore 999,401,404 ./out
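If the status codes are kept in a space-separated shell variable, the comma-separated form without spaces can be produced with tr. This is a minimal sketch: the variable names codes and list are illustrative, and the command is printed rather than executed.

```shell
# Status codes to ignore, taken from the example above.
codes="999 401 404"

# Turn the separating spaces into commas, as the option expects.
list=$(printf '%s' "$codes" | tr ' ' ',')

# Print the resulting command line for inspection.
printf 'htmlproofer --http-status-ignore %s ./out\n' "$list"
```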
For options like --url-ignore, which require an array of
regular expressions, the following syntax works:
htmlproofer --url-ignore /www.github.com/,/foo.com/ ./out
The --url-swap switch is a bit special, since it takes one or more
RegEx:String pairs. The escape sequence \:
should be used to produce a literal :. htmlproofer will figure
out what you mean.
htmlproofer --url-swap wow:cow,mow:doh --extension .html.erb --url-ignore www.github.com ./out
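When either side of a --url-swap pair contains a colon, such as a URL scheme or a port, the colons need the \: escape described above. The following is a minimal sketch that builds such a pair with sed. The helper esc and both URLs are hypothetical, and it assumes that escaping the colons on both sides of the pair is acceptable. The command is printed, not run.

```shell
# Hypothetical URLs: rewrite links to the live site so they point at a local server.
from='https://example.com'
to='http://localhost:4000'

# Escape every literal colon as \: so it is not mistaken for the pair separator.
esc() { printf '%s' "$1" | sed 's/:/\\:/g'; }

# The single unescaped colon separates the RegEx from the String.
pair="$(esc "$from"):$(esc "$to")"

# Print the resulting command line for inspection.
printf 'htmlproofer --url-swap %s ./out\n' "$pair"
```

Quote the value when typing such a command directly, so that the shell does not strip the backslashes before htmlproofer sees them.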
The program author is Garen Torikian.
This manual page was written by Daniel Leidert
<daniel.leidert@wgdd.de> for the Debian distribution (but may be used
by others).