htmlfix - Fixup HTML markup code
htmlfix [-o outputfile] [-F fixes] [-S fixes] [-v] [inputfile]
The htmlfix program reads inputfile, or "stdin" if no input file is
given, and performs the following fixups (the name of each fixup is given
in parentheses):
- (imgsize) : Adding WIDTH and HEIGHT attributes to IMG tags
- For all "IMG" tags which don't already have both "WIDTH" and "HEIGHT"
attributes (matched case-insensitively), the size of the image (taken
from the "SRC" attribute) is determined and the missing "width=X" and/or
"height=Y" is added to the list of attributes. The intention is to speed
up the layout of the final webpage.
Do not confuse this with a size checker: htmlfix only adds missing
width/height attributes and does not adjust existing ones that have wrong
dimensions. Otherwise the user would not be able to scale images (a
technique used a lot by web designers via 1pt dot-images).
There is a special case: when the "WIDTH" or "HEIGHT" attribute already
exists and has a value of "*", this asterisk is replaced by the physical
value instead of a new attribute being appended. Use this variant as a
placeholder when you want the attributes at a certain position.
HTMLfix supports one additional feature in conjunction with "WIDTH" and
"HEIGHT": "SCALE=factor" and "SCALE=percent%". These can be used to scale
the given or determined width and height values by multiplying them by
factor or by percent/100.
- (imgalt) : Adding ALT attribute to IMG tags
- For all "IMG" tags which don't already have an "ALT" attribute, an
``ALT=""'' attribute is added. The intention is both to make HTML
checkers like weblint(1) happy and to demystify the final webpage for
lynx(1) users.
- (summary) : Adding SUMMARY attribute to TABLE tags
- This attribute helps non-visual rendering of tables by adding a hint
about their contents, and it keeps tidy(1) quiet.
- (center) : Changing proprietary CENTER tag to standard DIV tag
- All proprietary (Netscape) "CENTER" tags are replaced by the HTML 3.2
conforming construct "<DIV ALIGN=CENTER>".
- (space) : Fix trailing spaces in tags
- Appendix C of the XHTML Specification recommends putting a space before
closing simple tags to help rendering by old browsers. This space is
automatically added when this fixup is used. On the other hand, all spaces
before a right-angle bracket are suppressed.
- (quotes) : Adding missing quotes for attributes
- All attributes of the form ``...=xyz'' are replaced by ``...="xyz"''.
Furthermore, all (color) attributes of the form ``...="XXYYZZ"'' (with
each of X, Y and Z a hexadecimal digit from the set {0,..,9,a,..,f}) are
fixed to ``...="#XXYYZZ"''.
- (indent) : Indenting paragraphs
- Paragraphs enclosed in "<indent [num=N] [size=S]>"..."</indent>"
containers are indented by N*S spaces. If N=0, the whitespace block in
front of the paragraph is removed. The default is a 4-space indentation
(N=1, S=4).
- (comment) : Commenting out tags
- Sometimes it is useful to temporarily comment out a tag instead of
removing it completely. This can be done by appending a sharp ("#")
character directly to the end of the tag name. As a result, the complete
tag is commented out. For container tags you have to comment out the
end tag explicitly, too. Example: ``<a# href="...">''.
- (tagcase) : Markup-code case-conversion
- Some people like their HTML markup code to be either all uppercase or
all lowercase. This tag case conversion is supported by the internal
"<tagconv case=...>"..."</tagconv>" container tag of HTMLfix. Use
"case=upper" to translate the HTML tags in its body to uppercase
(default) or "case=lower" to translate them to lowercase.
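Two of the fixups above are simple enough to illustrate in a few lines.
The following is only a minimal Python sketch of the (quotes) rule and of
the SCALE arithmetic from (imgsize); it is not htmlfix's own code. The
function names fix_quotes_in_tag and apply_scale are hypothetical, and the
rounding of scaled values and the acceptance of uppercase hex digits are
assumptions not stated in this manual.

```python
import re

# Hypothetical helpers for illustration only; not part of htmlfix.

# An attribute "name=" followed by a value that does not start with a quote.
# Limitation of this sketch: text inside already-quoted values is not skipped.
_UNQUOTED = re.compile(r'(\s[A-Za-z][\w:-]*=)([^\s"\'>]+)')

# A quoted value made of exactly six hex digits (assumed to be a color).
# Uppercase digits are accepted here as an assumption.
_BARE_COLOR = re.compile(r'="([0-9A-Fa-f]{6})"')


def fix_quotes_in_tag(tag):
    """Quote bare attribute values, then prefix bare hex colors with '#'."""
    tag = _UNQUOTED.sub(r'\1"\2"', tag)
    return _BARE_COLOR.sub(r'="#\1"', tag)


def apply_scale(width, height, scale):
    """Scale width/height by SCALE=factor or SCALE=percent%.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    if scale.endswith("%"):
        factor = float(scale[:-1]) / 100   # percent/100
    else:
        factor = float(scale)              # plain factor
    return round(width * factor), round(height * factor)


if __name__ == "__main__":
    print(fix_quotes_in_tag('<body bgcolor=ffffff text="000000">'))
    print(apply_scale(100, 40, "50%"))
```

Applying the color rule to every six-digit hex value follows the wording
of the (quotes) description; a stricter sketch could restrict it to known
color attribute names.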
- -o outputfile
- This redirects the output to outputfile. The output is sent to "stdout"
if this option is not specified or if outputfile is "-".
- -F fixes
- This option specifies which specific fixups are performed. Its argument
is a comma-separated list of fixup names. By default, all fixups are
performed.
- -S fixes
- This option does the inverse: it skips the specified fixups.
- -v
- This sets verbose mode, in which some processing information is printed
to the console.
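The interplay of -F and -S can be pictured as a small selection step. The
Python sketch below is an illustration under assumptions, not htmlfix's
own logic: the function select_fixups is hypothetical, the order of the
names simply follows the list in this manual, and the behaviour when both
options are given together (apply -F first, then remove the -S names) is
an assumption, since this manual does not specify it.

```python
# Fixup names as listed in this manual, in the order they are described.
ALL_FIXUPS = [
    "imgsize", "imgalt", "summary", "center", "space",
    "quotes", "indent", "comment", "tagcase",
]


def select_fixups(only=None, skip=None):
    """Return the fixups to run.

    only -- argument of -F (comma-separated names) or None
    skip -- argument of -S (comma-separated names) or None

    Hypothetical helper: unknown names are not validated here.
    """
    # Without -F, every fixup is performed by default.
    chosen = only.split(",") if only else list(ALL_FIXUPS)
    # -S removes the named fixups from the chosen set.
    skipped = set(skip.split(",")) if skip else set()
    return [name for name in chosen if name not in skipped]


if __name__ == "__main__":
    print(select_fixups(only="quotes,space"))
    print(select_fixups(skip="imgsize,center"))
```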
Ralf S. Engelschall
rse@engelschall.com
www.engelschall.com
Denis Barbier
barbier@engelschall.com