HTMLProofer(1) User commands HTMLProofer(1)

htmlproofer - validate rendered HTML files

htmlproofer directory [options]

htmlproofer is a set of tests to validate HTML output. These tests check whether image references are legitimate, whether images have alt attributes, whether internal links work, and so on. HTMLProofer can run on a file, a directory, an array of directories, or an array of links. Below is a mostly comprehensive list of the checks it can perform.

Images (<img> elements)

Whether all images have alt attributes
Whether internal image references are not broken
Whether external images are showing
Whether images are HTTPS

Links (<a>, <link> elements)

Whether internal links are working
Whether internal hash references (#linkToMe) are working
Whether external links are working
Whether links are HTTPS
Whether CORS/SRI is enabled

Scripts (<script> elements)

Whether internal script references are working
Whether external scripts are loading
Whether CORS/SRI is enabled

Favicon

Whether favicons are valid.

OpenGraph

Whether the images and URLs in the OpenGraph metadata are valid.

HTML

Whether your HTML markup is valid.

This is done via Nokogiri to ensure well-formed markup.

Listed below are the command line options for htmlproofer:

Don't flag tags missing an href attribute. This is the default for HTML5.
Ignores href="#".
Assumes that PATH is a comma-separated array of links to check.
A comma-separated list of Strings or RegExps containing images whose missing alt attributes are safe to ignore.
Automatically add extension (e.g. .html) to file paths, to allow extensionless URLs (as supported by Jekyll 3 and GitHub Pages).
An array of Strings indicating which checks not to perform.
Checks whether external hashes exist (even if the webpage exists). This slows the checker down.
Enables the favicon checker.
Enables HTML validation errors from Nokogiri.
Checks that images use HTTPS.
Enables the Open Graph checker.
Checks that external <link> and <script> resources use SRI.
Sets the file to look for when a link refers to a directory. Defaults to index.html.
Don't run the external link checker, which can take a lot of time.
If true, ignores images with empty alt attributes.
Defines the sort order for error output. Can be :path, :desc, or :status. Defaults to :path.
Fails if a link is not marked as HTTPS.
The extension of HTML files including the dot. Defaults to .html.
Only check for problems with external references.
A comma-separated list of Strings or RegExps containing file paths that are safe to ignore.
Print this usage information on the command line.
A comma-separated list of numbers representing status codes to ignore.
A comma-separated list of Strings containing domains that will be treated as internal urls.
Ignore errors from --check-html associated with unknown markup.
Ignore errors from --check-html associated with missing entities.
Ignore errors from --check-html associated with <script>s.
Sets the logging level, as determined by Yell. One of :debug, :info, :warn, :error, or :fatal. Defaults to :info.
Only reports errors for links that fall within the 4xx status code range.
Directory where to store the cache log. Defaults to tmp/.htmlproofer.
A string representing the caching timeframe.
JSON-formatted string of Typhoeus config. It will override the html-proofer defaults.
A comma-separated list of Strings or RegExps containing URLs that are safe to ignore. It affects all HTML attributes. Non-HTTP(S) URIs are always ignored.
A comma-separated list containing key-value pairs of RegExp => String. It transforms URLs that match RegExp into String via gsub. Use the escape sequence \: to produce a literal :.

For options that require an array of input, the values can be surrounded by quotes. Don't use any spaces between them. For example, to ignore a list of HTTP status codes:

htmlproofer --http-status-ignore 999,401,404 ./out
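When the values come from a script rather than being typed by hand, they can be joined with commas and no spaces in the shell. The following is a minimal sketch: the helper name join_by_comma is hypothetical and not part of htmlproofer, and the command is only printed so that the sketch runs without htmlproofer installed.

```shell
# Hypothetical helper: join all arguments with commas and no spaces.
# The body runs in a subshell, so the IFS change does not leak to the caller.
join_by_comma() (
  IFS=,
  printf '%s\n' "$*"
)

# Status codes taken from the example above.
status_list=$(join_by_comma 999 401 404)

# Print the resulting command line instead of running it.
printf '%s\n' "htmlproofer --http-status-ignore $status_list ./out"
```

Replacing the final printf with the command itself would run the check against ./out.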

For options like --url-ignore, which require an array of regular expressions, the following syntax works:

htmlproofer --url-ignore /www.github.com/,/foo.com/ ./out
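Note that an unescaped dot in a regular expression matches any character, so /foo.com/ also matches a string such as fooxcom. A stricter pattern escapes the dots, and the value then needs quoting so the shell passes the backslashes through. The following is a minimal sketch of building such a list; the helper name to_regex is hypothetical, and the command is only printed rather than run.

```shell
# Hypothetical helper: escape the dots in a host name and wrap it in
# slashes so that it reads as a regular expression.
to_regex() {
  printf '/%s/\n' "$(printf '%s\n' "$1" | sed 's/\./\\./g')"
}

# Host names taken from the example above.
ignore_list="$(to_regex www.github.com),$(to_regex foo.com)"

# Single quotes keep the shell from interpreting the backslashes.
printf '%s\n' "htmlproofer --url-ignore '$ignore_list' ./out"
```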

The --url-swap switch is a bit special, since it takes pairs of RegEx:String values. Use the escape sequence \: to produce a literal : inside either part. htmlproofer will figure out what you mean.

htmlproofer --url-swap wow:cow,mow:doh --extension .html.erb --url-ignore www.github.com ./out
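Full URLs contain colons, so they need the \: escape before they can be used in a --url-swap pair. The following is a minimal sketch of preparing such a pair; the helper name escape_colons and the example.com URLs are hypothetical, it assumes that both sides of the pair may carry escaped colons, and the command is only printed rather than run.

```shell
# Hypothetical helper: replace every ":" with "\:" so that it is not
# read as the separator between the pattern and the replacement.
escape_colons() {
  printf '%s\n' "$1" | sed 's/:/\\:/g'
}

# Placeholder URLs, not taken from the manual page.
from=$(escape_colons 'http://old.example.com')
to=$(escape_colons 'https://new.example.com')

# The single unescaped colon separates the pattern from the replacement.
swap="$from:$to"

# Single quotes keep the shell from interpreting the backslashes.
printf '%s\n' "htmlproofer --url-swap '$swap' ./out"
```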

The program author is Garen Torikian.

This manual page was written by Daniel Leidert <daniel.leidert@wgdd.de> for the Debian distribution (but may be used by others).

2019-03-24