HXREMOVE(1) HTML-XML-utils HXREMOVE(1)

hxremove - remove elements from an XML file by means of a CSS selector

hxremove [ -i ] [ -l language ] selectors

hxremove reads a well-formed XML document from standard input and writes it to standard output, omitting every element that matches one of the CSS selectors given as arguments. For example,


hxremove ol li:first-child

removes the first li (list item in XHTML) from every ol (ordered list).
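The example above can be tried end to end with a small input. This is a minimal sketch: the sample document and the temporary file are assumptions made for illustration, and the selector is quoted only so that the shell passes it through unchanged. The guard skips the run when hxremove is not installed.

```shell
# Hypothetical sample document, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<doc><ol><li>one</li><li>two</li></ol><ol><li>three</li></ol></doc>
EOF

# hxremove reads standard input and writes the filtered document
# to standard output.
if command -v hxremove >/dev/null 2>&1; then
  hxremove 'ol li:first-child' < "$sample"
else
  echo "hxremove not found; skipping" >&2
fi
```

Quoting matters most for selectors that contain characters the shell treats specially, such as > (child combinator) or [ ] (attribute selectors).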

If there are multiple selectors, they must be separated by commas. For example,


hxremove p + ul, blockquote ol

removes all ul elements that follow a p element and also all ol elements that are descendants of a blockquote element.

hxremove assumes that class selectors (".foo") refer to an attribute called "class" and that ID selectors ("#foo") refer to an attribute called "id".

To handle HTML files, make them well-formed XML first, e.g., with hxnormalize -x.
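The HTML case can be sketched as a two-stage pipeline. The input file below is a hypothetical example with unclosed tags; -i is added on the assumption that element names may still be in upper case after normalization. Both commands are checked for before running.

```shell
# Hypothetical HTML input that is not well-formed XML.
html=$(mktemp)
cat > "$html" <<'EOF'
<HTML><BODY><P>Keep this paragraph<UL><LI>drop<LI>drop too</UL></BODY></HTML>
EOF

if command -v hxnormalize >/dev/null 2>&1 &&
   command -v hxremove >/dev/null 2>&1; then
  # Step 1: convert the HTML to well-formed XML.
  # Step 2: remove every ul element, matching case-insensitively.
  hxnormalize -x < "$html" | hxremove -i 'ul'
else
  echo "html-xml-utils not found; skipping" >&2
fi
```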

Compare with hxselect, which removes everything but the selected elements.
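The contrast between the two tools can be seen by running the same selector through both. This is a sketch under the assumption that hxselect accepts the selector as an argument in the same way; the input document is hypothetical.

```shell
# Hypothetical input shared by both commands.
doc=$(mktemp)
cat > "$doc" <<'EOF'
<doc><title>Notes</title><ol><li>one</li><li>two</li></ol></doc>
EOF

if command -v hxremove >/dev/null 2>&1 &&
   command -v hxselect >/dev/null 2>&1; then
  # Everything except the li elements.
  hxremove 'li' < "$doc"
  echo
  # Only the li elements.
  hxselect 'li' < "$doc"
  echo
else
  echo "html-xml-utils not found; skipping" >&2
fi
```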

The following options are supported:

-i  Match case-insensitively. Useful for HTML and some other SGML-based languages.
-l language  Sets the default language, in case the root element doesn't have an xml:lang attribute (default: none). Example: -l en

The following operand is supported:

selectors  One or more comma-separated selectors. Most selectors from CSS level 3 are supported.

asc2xml(1), xml2asc(1), hxnormalize(1), hxselect(1), UTF-8 (RFC 2279)

10 Jul 2011 7.x