hxtoc - insert a table of contents in an HTML file
hxtoc [ -x ] [ -l low ] [ -h high ] [ -t ] [ -d ] [ -c class ] [ -f ] [ file-or-URL ]
The hxtoc command reads an HTML file, inserts missing ID
attributes in all H1 to H6 elements between the levels -l and
-h (unless the option -d is in effect, see below) and also
inserts A elements with NAME attributes, so old browsers will recognize the
H1 to H6 headers as target anchors as well (unless the option -t is
in effect). The output is written to stdout.
If there is a comment of the form
<!--toc-->
or a pair of comments
<!--begin-toc-->
...
<!--end-toc-->
then the comment, or the pair with everything in between, will be
replaced by a table of contents, consisting of a list (UL) of links to all
headers in the document.
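The marker replacement described above can be sketched in Python. This is only an illustration, not hxtoc's actual code: the helper name replace_toc_marker is hypothetical, and the sketch assumes that the comments are spelled exactly as shown (no extra whitespace) and that only the first marker is replaced, neither of which this page states.

```python
import re

# Matches either a <!--begin-toc--> ... <!--end-toc--> pair, including
# everything in between, or a single <!--toc--> comment.
# Assumption: the comments contain no extra whitespace.
TOC_MARKER = re.compile(
    r"<!--begin-toc-->.*?<!--end-toc-->|<!--toc-->",
    re.DOTALL,
)


def replace_toc_marker(html, toc_html):
    """Replace the first TOC marker in html with toc_html.

    Hypothetical helper for illustration; text without a marker is
    returned unchanged.
    """
    # A function is used as the replacement so that backslashes in
    # toc_html are not interpreted as regex escapes.
    return TOC_MARKER.sub(lambda match: toc_html, html, count=1)
```

For example, calling replace_toc_marker(page, '<ul class="toc">...</ul>') on a page that contains a begin/end pair drops the old text between the two comments along with the comments themselves.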
The text of headers is copied to this table of contents, including
any inline markup, except that ID attributes, DFN tags and SPAN tags with a
CLASS of "index" are omitted (but the elements' content is
copied).
The copied text can optionally be "flattened" first, see
option -f.
If a header has a CLASS attribute whose value (or one of whose
values) is the keyword "no-toc", then that header will not appear in
the table of contents.
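The "one of its values" wording means that the CLASS attribute is treated as a space-separated list of keywords. A minimal Python sketch of such a test follows; the function name excluded_from_toc is hypothetical, and the case-sensitive comparison is an assumption, not something this page specifies.

```python
def excluded_from_toc(class_attr):
    """Return True if a CLASS value contains the keyword no-toc.

    class_attr is the raw attribute value, or None if the header has
    no CLASS attribute. Assumption: the comparison is case-sensitive.
    """
    if class_attr is None:
        return False
    # CLASS holds whitespace-separated keywords; match whole keywords only.
    return "no-toc" in class_attr.split()
```

A header with CLASS="appendix no-toc" is excluded by this test, while one with CLASS="no-toc-please" is not, because the keyword must match as a whole.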
The following options are supported:
- -x
- Use XML conventions: empty elements are written with a slash at the end:
<IMG />
- -l low
- Sets the lowest numbered header to appear in the table of contents. Default
is 1 (i.e., H1).
- -h high
- Sets the highest numbered header to appear in the table of contents.
Default is 6 (i.e., H6).
- -t
- Normally, hxtoc adds both ID attributes and empty A elements with a
NAME attribute and CLASS="bctarget", so that older browsers that
do not understand ID will still find the target. With this option, the A
elements will not be generated.
- -c class
- The generated UL elements in the table of contents will have a CLASS
attribute with the value class. The default is
"toc".
- -d
- Tries to use sectioning elements as targets in the table of contents
instead of H1 to H6. A sectioning element is a DIV, SECTION, ARTICLE,
ASIDE or NAV element whose first child is a heading element (H1 to H6) or
an HGROUP. The sectioning element will be given an ID if it doesn't have
one yet. With this option, the level of any H1 to H6 that is the first
child of a sectioning element (or of an HGROUP that is itself the first
child of a sectioning element) is not determined by its name, but by the
nesting depth of the sectioning elements. (Any H1 to H6 that are not the
first child of a sectioning element still have their level implied by
their name.)
- -f
- Flatten the text of the table of contents. Without -f, the contents
of header elements are copied to the table of contents almost unchanged,
i.e., including any child elements and their attributes (except for ID
attributes, DFN elements and certain SPAN elements, as explained above).
With -f, the contents are flattened instead: All child elements are
removed and only their contents are copied to the table of contents.
Additionally, elements with an ALT attribute, such as IMG, are replaced by
the contents of the ALT attribute. Exception: BDO tags are copied
unchanged and elements with a DIR attribute are replaced by a SPAN with
that DIR attribute. (BDO and DIR may occur in languages written
right-to-left.)
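The effect of -f can be approximated with Python's standard html.parser module. This is a rough sketch under stated assumptions, not hxtoc's implementation: the names Flattener and flatten are hypothetical, elements with an ALT attribute are assumed to be empty elements such as IMG, and the BDO and DIR exceptions described above are left out.

```python
from html import escape
from html.parser import HTMLParser


class Flattener(HTMLParser):
    """Collect text content only, dropping the tags around it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # An element with an ALT attribute contributes its ALT text.
        # Assumption: such elements are empty (IMG and the like).
        alt = dict(attrs).get("alt")
        if alt is not None:
            self.parts.append(escape(alt, quote=False))

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape
        # the text so the result is still valid HTML content.
        self.parts.append(escape(data, quote=False))


def flatten(header_html):
    """Return the content of a header with all child tags removed."""
    parser = Flattener()
    parser.feed(header_html)
    parser.close()
    return "".join(parser.parts)


print(flatten('The <em>quick</em> <img src="f.png" alt="fox"> jumps'))
# prints: The quick fox jumps
```

End tags need no handler here because nothing is emitted for them, which is what removes the child elements while keeping their contents.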
The following operand is supported:
- file-or-URL
- The name or URL of an HTML file. If absent, standard input is read
instead.
The following exit values are returned:
- 0
- Successful completion.
- > 0
- An error occurred in the parsing of the HTML file. hxtoc will try
to correct the error and produce output anyway.
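Because output is still written when the exit value is greater than 0, a calling script may want to capture both the status and the text. The following Python sketch shows one way to do that with the standard subprocess module; the helper name run_filter is hypothetical, and the hxtoc invocation in the trailing comment assumes that hxtoc is installed and on the PATH.

```python
import subprocess


def run_filter(argv, html):
    """Run a filter command with html on its standard input.

    Returns (exit status, standard output). Hypothetical helper; no
    exception is raised for a non-zero status, so the caller decides
    whether to keep output that came from error recovery.
    """
    proc = subprocess.run(argv, input=html, capture_output=True, text=True)
    return proc.returncode, proc.stdout


# Possible use, assuming hxtoc is installed:
#   status, out = run_filter(["hxtoc", "-l", "2", "-h", "3"], page)
#   if status > 0:
#       ...  # out is still available, but the input had parse errors
```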
The error recovery for incorrect HTML is primitive.