tkhtml(3tcl) | tkhtml(3tcl) |
tkhtml - Widget to render html documents.
html pathName ?options?
-height -width -xscrollcommand -xscrollincrement -yscrollcommand -yscrollincrement
See the options(n) manual entry for details on the standard options.
Command-Line Name: -defaultstyle
Database Name: defaultstyle
Database Class: Defaultstyle
This option is used to set the default style-sheet for the widget. The
option value should be the entire text of the default style-sheet.
The default stylesheet defines things that are "built-in" to the document - for example the behaviour of <p> or <img> tags in html. The idea behind making it flexible is to allow Tkhtml to display anything that looks roughly like an XML document. But this will not work at the moment because of other assumptions the implementation makes about the set of valid tags. Currently, only valid HTML tags are recognized.
The Tkhtml package adds the [::tkhtml::htmlstyle] command to the interpreter it is loaded into. Invoking this command returns a CSS document suitable for use with Tkhtml as a default stylesheet for HTML documents. If the -quirks option is passed to [::tkhtml::htmlstyle] then the returned document includes some extra rules used when rendering legacy documents.
If the value of the -defaultstyle option is changed, the new value does not take effect until after the next call to the widget [reset] method.
The default value of this option is the same as the string
returned by the [::tkhtml::htmlstyle] command.
Command-Line Name: -fontscale
Database Name: fontscale
Database Class: Fontscale
This option is set to a floating point number, default 1.0. After CSS
algorithms are used to determine a font size, it is multiplied by the value
of this option. Setting this to a value other than 1.0 breaks standards
compliance.
Command-Line Name: -fonttable
Database Name: fonttable
Database Class: Fonttable
This option must be set to a list of 7 integers. The first integer must
be greater than 0 and each subsequent integer must be greater than or equal
to its predecessor.
The seven integers define the sizes of the Tk fonts (in points) used when a CSS formatted document requests font-size "xx-small", "x-small", "small", "medium", "large", "x-large" or "xx-large", respectively.
The default value is {8 9 10 11 13 15 17}.
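The constraints on -fonttable can be checked before the value is handed to the widget. The following is a minimal sketch in plain Tcl; the proc name valid_fonttable is hypothetical and is not part of Tkhtml.

```tcl
# Hypothetical helper: returns 1 if $table satisfies the documented
# -fonttable constraints, 0 otherwise.
proc valid_fonttable {table} {
    # The option requires a well-formed list of exactly seven values.
    if {[catch {llength $table} n] || $n != 7} {
        return 0
    }
    # Starting prev at 1 enforces "first integer greater than 0".
    set prev 1
    foreach size $table {
        # Each entry must be an integer no smaller than its predecessor.
        if {![string is integer -strict $size] || $size < $prev} {
            return 0
        }
        set prev $size
    }
    return 1
}
```

A script might call [valid_fonttable] on a user-supplied list and fall back to the default value when it returns 0.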
Command-Line Name: -forcefontmetrics
Database Name: forcefontmetrics
Database Class: Forcefontmetrics
This is a boolean option. If true, the font-metrics returned by Tk are
overridden with calculated values based on the font requested. This improves
CSS compatibility, but on some systems may cause problems. The default is
true.
Command-Line Name: -forcewidth
Database Name: forcewidth
Database Class: Forcewidth
When determining the layout of a document, Tkhtml3 (like all other
HTML/CSS engines) requires as input the width of the containing block for
the whole document. For web browsers, this is usually the width of the
viewport in which the document will be displayed.
If this option is true or the widget window is not mapped, Tkhtml3 uses the value of the -width option as the initial containing block width. Otherwise, the width of the widget window is used.
The default value is false.
Command-Line Name: -imagecache
Database Name: imagecache
Database Class: Imagecache
This boolean option (default true) determines whether or not Tkhtml3
caches the images returned to it by the -imagecmd callback script. If true,
all images are cached until the next time the [reset] sub-command is
invoked. If false, images are discarded as soon as they are not in use.
For simple applications, or applications that retrieve images from
local sources, false is usually a better value for this option (since it may
save memory). However for web-browser applications where the background
images of elements may be modified by mouseover events and so on, true is a
better choice.
Command-Line Name: -imagecmd
Database Name: imagecmd
Database Class: Imagecmd
As well as for replacing entire document nodes (i.e. <img>), images
are used in several other contexts in CSS formatted documents, for example
as list markers or backgrounds. If the
-imagecmd option is not set to an empty string (the default), then each time
an image URI is encountered in the document, it is appended to the -imagecmd
script and the resulting list evaluated.
The command should return either an empty string, the name of a Tk image, or a list of exactly two elements, the name of a Tk image and a script. If the result is an empty string, then no image can be displayed. If the result is a Tk image name, then the image is displayed in the widget. When the image is no longer required, it is deleted. If the result of the command is a list containing a Tk image name and a script, then instead of deleting the image when it is no longer required, the script is evaluated.
If the size or content of the image are modified while it is in
use the widget display is updated automatically.
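The three possible results of an -imagecmd script can be illustrated with a small callback. This is only a sketch: the proc name load_local_image is hypothetical, and it assumes that every image URI is a path to a local file that Tk's photo image type can read.

```tcl
# Hypothetical -imagecmd callback. The widget appends the image URI
# as the final argument. Assumes the URI is a local file path.
proc load_local_image {uri} {
    if {![file readable $uri]} {
        # Empty result: no image can be displayed.
        return ""
    }
    if {[catch {image create photo -file $uri} img]} {
        # Unsupported or corrupt file: also display nothing.
        return ""
    }
    # Two-element result: the image name plus a script that the
    # widget evaluates instead of deleting the image itself.
    return [list $img [list image delete $img]]
}

# Usage, assuming a widget named .html exists:
#   .html configure -imagecmd load_local_image
```

Returning the two-element form gives the application a hook for its own bookkeeping when the image is released; returning only the image name leaves deletion to the widget.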
Command-Line Name: -mode
Database Name: mode
Database Class: Mode
This option may be set to "quirks", "standards" or
"almost standards", to set the rendering engine mode. The default
value is "standards".
TODO: List the differences between the three modes in Tkhtml.
Command-Line Name: -parsemode
Database Name: parsemode
Database Class: Parsemode
This option may be set to "html", "xhtml" or
"xml", to set the parser mode. The default value is
"html".
In "html" mode, the parser attempts to mimic the tag-soup approach inherited by modern web-browsers from the bad old days. Explicit XML style self-closing tags (i.e. closing a markup tag with "/>" instead of ">") are not handled specially. Unknown tags are ignored.
"xhtml" mode is the same as "html" mode except that explicit self-closing tags are recognized.
"xml" mode is the same as "xhtml" mode except
that unknown tag names and XML CDATA sections are recognized.
Command-Line Name: -shrink
Database Name: shrink
Database Class: Shrink
This boolean option governs the way the widget's requested width and
height are calculated. If it is set to false (the default), then the
requested width and height are set by the -width and -height options as per
usual.
If this option is set to true, then the widget's requested width
and height are determined by the current document. Each time the document
layout is calculated, the widget's requested height and width are set to the
size of the document layout. If the widget is unmapped when the layout is
calculated, then the value of the -width option is used to determine the
width of the initial containing block for the layout. Otherwise, the current
window width is used.
Command-Line Name: -zoom
Database Name: zoom
Database Class: Zoom
This option may be set to any floating point number. Before the document
layout is calculated, all lengths and sizes specified in the HTML document
or CSS style configuration, implicit or explicit, are multiplied by this
value.
The default value is 1.0.
Command-Line Name: -logcmd
Database Name: logcmd
Database Class: Logcmd
This option is used for debugging the widget. It is not part of the
official interface and may be modified or removed at any time. Don't
worry about it.
Command-Line Name: -timercmd
Database Name: timercmd
Database Class: Timercmd
This option is used for debugging the widget. It is not part of the
official interface and may be modified or removed at any time. Don't
worry about it.
Command-Line Name: -layoutcache
Database Name: layoutcache
Database Class: Layoutcache
This option is used for debugging the widget. It is not part of the
official interface and may be modified or removed at any time. Don't
worry about it.
If this boolean option is set to true, then Tkhtml caches layout information to improve performance when the layout of a document must be recomputed. This can happen in a variety of situations, for example when extra text is appended to the document, a new style is applied to the document, a dynamic CSS selector (i.e. :hover) is activated, the widget window is resized, or when the size of an embedded image or Tk window changes.
Layout caching consumes no extra memory or significant processing cycles, so in an ideal world there is no real reason to turn it off. But it can be a source of layout bugs, hence this option.
The default value is true.
The [html] command creates a new window (given by the pathName argument) and makes it into an html widget. The html command returns its pathName argument. At the time this command is invoked, there must not exist a window named pathName, but pathName's parent must exist.
The [html] command creates a new Tcl command whose name is pathName. This command may be used to invoke various operations on the widget as follows:
pathName bbox nodeHandle
pathName cget option
pathName configure ?option? ?value?
pathName fragment html-text
pathName handler node tag script
pathName handler attribute tag script
pathName handler script tag script
pathName handler parse tag script
For a "node" handler script, whenever a document element having the specified tag type (e.g. "p" or "link") is encountered during parsing, then the node handle for the node is appended to script and the resulting list evaluated as a Tcl command. See the section "NODE COMMAND" for details of how a node handle may be used to query and manipulate a document node. A node handler is called only after the subtree rooted at the node has been completely parsed.
If the handler script is a "script" handler, whenever a document node of type tag is parsed, two arguments are appended to the specified script before it is evaluated. The first argument is a key-value list (suitable for passing to the [array set] command) containing the HTML attributes that were part of the element declaration. The second argument is the literal text that appears between the start and end tags of the element.
Elements for which a "script" handler is evaluated are not included in the parsed form of the HTML document. Instead, the result of the script handler evaluation is substituted into the document and parsed. For example, to handle the following embedded javascript:
    <SCRIPT>
        document.write("<p>A paragraph</p>")
    </SCRIPT>
a script handler that returns the string "<p>A paragraph</p>" must be configured for nodes of type "SCRIPT".
Unlike node or script handlers, a "parse" handler may be associated with a specific opening tag, a closing tag or with text tags (by specifying an empty string as the tag type). Whenever such a tag is encountered the parse handler script is invoked with two arguments, the node handle for the created node and the character offset of the tag in the parsed document. For a closing tag (i.e. "/form") an empty string is passed instead of a node handle.
TODO: Describe "attribute" handlers.
TODO: The offset values passed to parse handler scripts currently have problems. See http://tkhtml.tcl.tk/cvstrac/tktview?tn=126
Handler callbacks are always made from within [pathName parse] commands. The callback for a given node is made as soon as the node is completely parsed. This can happen because an implicit or explicit closing tag is parsed, or because there is no more document data and the -final switch was passed to the [pathName parse] command.
TODO: Return values of handler scripts? If an exception occurs in a handler script?
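The argument convention for "script" handlers described above can be sketched as follows. The proc name script_handler and the global ::seen_scripts are hypothetical, and the sketch assumes that attribute names arrive in lower case.

```tcl
# Illustrative global used to record the elements that were handled.
set ::seen_scripts {}

# Hypothetical "script" handler. The widget appends two arguments:
# the attribute key-value list and the text between the tags.
proc script_handler {attributes text} {
    array set attr $attributes
    if {[info exists attr(src)]} {
        lappend ::seen_scripts "external: $attr(src)"
    } else {
        lappend ::seen_scripts "inline: [string length $text] characters"
    }
    # The result is substituted into the document and parsed, so an
    # empty string removes the element without adding any content.
    return ""
}

# Usage, assuming a widget named .html exists:
#   .html handler script script script_handler
```

A handler that wants to emulate document.write() would return the generated HTML text instead of an empty string.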
pathName image
The returned image should be deleted when the script has finished with it, for example:
    set img [.html image]
    # ... Use $img ...
    image delete $img
This command is included mainly for automated testing and should be used with care, as large documents can result in very large images that take a long time to create and use vast amounts of memory.
Currently this command is not available on Windows. On that platform an empty string is always returned.
pathName node ? ?-index? x y?
If the x and y arguments are present, then a list of node handles is returned. The list contains one handle for each node that generates content currently located at viewport coordinates (x, y). Usually this is only a single node, but floating boxes and other overlapped content can cause this command to return more than one node. If no content is located at the specified coordinates or the widget window is not mapped, then an empty string is returned.
If the -index option is specified along with the x and y coordinates, then instead of a list of node handles, a list of two elements is returned. The first element of the list is the node-handle associated with the generated text closest to the specified (x, y) coordinates. The second list value is a byte (not character) offset into the text obtainable by [nodeHandle text] for the character closest to coordinates (x, y). The index may be used with the [pathName tag] commands.
The document node can be queried and manipulated using the interface described in the "NODE COMMAND" section.
pathName parse ?-final? html-text
If the -final option is present, this indicates that the supplied text is the last of the document. Any subsequent call to [pathName parse] before a call to [pathName reset] will raise an error.
If the -final option is not passed to [pathName parse] along with the final part of the document text, node handler scripts for any elements closed implicitly by the end of the document will not be executed. It is not an error to specify an empty string for the html-text argument.
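Incremental parsing with a closing -final call might look like the sketch below. The proc name parse_in_chunks is hypothetical; the first argument is assumed to be the path name of an html widget (any command that accepts the same "parse" subcommand will do).

```tcl
# Hypothetical helper: feed $text to widget $w in pieces of $size
# characters, then mark the document as complete.
proc parse_in_chunks {w text size} {
    if {$size < 1} {
        error "size must be a positive integer"
    }
    set len [string length $text]
    for {set i 0} {$i < $len} {incr i $size} {
        $w parse [string range $text $i [expr {$i + $size - 1}]]
    }
    # An empty final chunk is permitted. Passing -final allows node
    # handlers for implicitly closed elements to run.
    $w parse -final ""
}

# Usage, assuming a widget named .html and a variable $document:
#   parse_in_chunks .html $document 4096
```

In a real application the chunks would more likely arrive from a socket or file event handler, with the -final call made once the channel reaches end of file.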
pathName preload uri
This command may be useful when implementing scripting environments that support "preloading" of images.
pathName reset
pathName search selector
pathName style ?options? stylesheet-text
The following options are supported:
    Option                Default Value
    --------------------------------------
    -id <stylesheet-id>   "author"
    -importcmd <script>   ""
    -urlcmd <script>      ""
The value of the -id option determines the priority taken by the style-sheet when assigning property values to document nodes (see chapter 6 of the CSS specification for more detail on this process). The first part of the style-sheet id must be one of the strings "agent", "user" or "author". Following this, a style-sheet id may contain any text.
When comparing two style-ids to determine which stylesheet takes priority, the widget uses the following approach: If the initial strings of the two style-id values are not identical, then "author" takes precedence over "user", and "user" takes precedence over "agent". Otherwise, the lexicographically largest style-id value takes precedence. For more detail on why this seemingly odd approach is taken, please refer to the "STYLESHEET LOADING" section below.
The -importcmd option is used to provide a handler script for @import directives encountered within the stylesheet text. Each time an @import directive is encountered, if the -importcmd option is set to other than an empty string, the URI to be imported is appended to the option value and the resulting list evaluated as a Tcl script. The return value of the script is ignored. If the script raises an error, then it is propagated up the call-chain to the [pathName style] caller.
The -urlcmd option is used to supply a script to translate "url(...)" CSS attribute values. If this option is not set to "", each time a url() value is encountered the URI is appended to the value of -urlcmd and the resulting script evaluated. The return value is stored as the URL in the parsed stylesheet.
pathName tag add tag-name node1 index1 node2 index2
pathName tag remove tag-name node1 index1 node2 index2
pathName tag configure tag-name option value ?option value...?
pathName tag delete tag-name
Each displayed document character is identified by a text node-handle (see below) and an index into the text returned by the [node text] command. The index is a byte (not character) offset. See also the documentation for the [pathName node -index] command. Both the [pathName tag add] and [pathName tag remove] commands use this convention.
Evaluating the [pathName tag add] command adds the specified tag to all displayed characters between the point in the document described by (node1, index1) and the point described by (node2, index2). If the specified tag does not exist, it is created with default option values. The order in which the two specified points occur in the document is not important.
The [pathName tag remove] command removes the specified tag from all displayed characters between the point in the document described by (node1, index1) and the point described by (node2, index2).
The [pathName tag configure] command is used to configure a tag's options, which determine how tagged characters are displayed. If the specified tag does not exist, it is created. The following options are supported:
    Option        Default Value
    --------------------------------------
    -background   black
    -foreground   white
A tag can be completely deleted (removed from all characters and have its option values set to the defaults) using the [pathName tag delete] command.
The [pathName tag] command replaces the [pathName select] command
that was present in early alpha versions of Tkhtml3. Users should note that
the options supported by [pathName tag configure] are likely to change
before beta release. See
http://tkhtml.tcl.tk/cvstrac/tktview?tn=73 (ticket #73).
pathName text bbox node1 index1 node2 index2
pathName text index offset ?offset...?
pathName text offset node index
pathName text text
The [pathName text text] command returns a string containing the raw, unformatted text of the displayed document. Each block box is separated from the next by a newline character. Each block of whitespace is collapsed to a single space, except within blocks with the CSS "white-space" property set to "pre".
The [pathName text index] command is used to transform from a character offset in the string returned by [pathName text text] to a node/index pair that can be used with the [pathName tag] commands. The return value is a list of two elements, the node-handle followed by the index.
Command [pathName text offset] is the reverse of [pathName text index]. Given a node-handle and index of the type similar to that used by the [pathName tag] commands, this command returns the corresponding character offset in the string returned by [pathName text text] command.
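The [pathName text text], [pathName text index] and [pathName tag add] commands can be combined to highlight a search result. The sketch below is an illustration only: the proc name highlight_first is hypothetical, it requires Tcl 8.5 or later for [lassign], and it assumes that the second point passed to [pathName tag add] marks the position just past the last highlighted character.

```tcl
# Hypothetical helper: tag the first occurrence of $needle in the
# text displayed by widget $w. Returns 1 if found, 0 otherwise.
proc highlight_first {w needle tag} {
    set haystack [$w text text]
    set start [string first $needle $haystack]
    if {$start < 0} {
        return 0
    }
    set end [expr {$start + [string length $needle]}]
    # Convert character offsets into node/index pairs.
    lassign [$w text index $start] node1 index1
    lassign [$w text index $end] node2 index2
    $w tag add $tag $node1 $index1 $node2 $index2
    return 1
}

# Usage, assuming a widget named .html exists:
#   highlight_first .html "Tkhtml" searchresult
```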
pathName write continue
pathName write text html-text
pathName write wait
pathName xview
pathName xview moveto fraction
pathName xview scroll number what
pathName yview
pathName yview moveto fraction
pathName yview scroll number what
pathName yview nodeHandle
As well as the standard interface copied from the canvas and text widgets, Tkhtml supports passing a single node-handle as the only argument to [pathName yview]. In this case the viewport is scrolled so that the content generated by the node nodeHandle is visible. This can be useful for implementing support for URI fragments.
There are several interfaces by which a script can obtain a "node handle". Each node handle is a Tcl command that may be used to access the document node that it represents. A node handle is valid from the time it is obtained until the next call to [pathName reset]. The node handle may be used to query and manipulate the document node via the following subcommands:
nodeHandle attribute ??-default default-value? ?attribute? ?new-value??
If the new-value argument is present, then set the named
attribute to the specified new-value.
If no attribute argument is present, return a key-value list of the defined attributes of the form that can be passed to [array set].
    # Html code for node
    <p class="normal" id="second" style="color : red">
    # Value returned by [nodeHandle attr]
    {class normal id second style {color : red}}
    # Value returned by [nodeHandle attr class]
    normal
nodeHandle children
nodeHandle destroy
nodeHandle dynamic set ?flag?
nodeHandle dynamic clear ?flag?
The supported values for the flag argument are "active", "hover", "focus", "link" and "visited". The status of each dynamic flag determines whether or not the corresponding CSS dynamic pseudo-classes are considered to match the node. For example, when the mouse moves over node $N, a script could invoke:
$N dynamic set hover
    for {set n $PN} {$n ne ""} {set n [$n parent]} {
        $n dynamic clear hover
    }
    for {set n $N} {$n ne ""} {set n [$n parent]} {
        $n dynamic set hover
    }
nodeHandle insert ?-before node? node-list
nodeHandle override ?value?
nodeHandle parent
nodeHandle property ?-before|-after? ?property-name?
nodeHandle remove ?node-list?
nodeHandle replace ? ?options? newValue?
A node's replacement object may be set to the name of a Tk window or an empty string. If it is an empty string (the default and usual case), then the node is rendered normally. If the node replacement object is set to the name of a Tk window, then the Tk window is mapped into the widget in place of any other content (for example to implement form elements or plugins).
The following options are supported:
    Option                   Default Value
    --------------------------------------
    -deletecmd <script>      ""
    -configurecmd <script>   ""
    -stylecmd <script>       ""
When a replacement object is no longer being used by the widget (e.g. because the node has been deleted or [pathName reset] is invoked), the value of the -deletecmd option is evaluated as Tcl script.
If it is not set to an empty string (the default), each time the node's CSS properties are recalculated, a serialized array is appended to the value of the -configurecmd option and the result evaluated as a Tcl command. The script should update the replacement object's appearance where appropriate to reflect the property values. The format of the appended argument is {p1 v1 p2 v2 ... pN vN} where the pX values are property names (e.g. "background-color") and the vX values are property values (e.g. "#CCCCCC"). The CSS properties that currently may be present in the array are listed below. More may be added in the future.
background-color color font selected
The value of the "font" property, if present in the serialized array, is not set to the value of the corresponding CSS property. Instead it is set to the name of a Tk font determined by combining the various font-related CSS properties. Unless they are set to "transparent", the two color values are guaranteed to parse as Tk colors. The "selected" property is either true or false, depending on whether or not the replaced object is part of the selection. Whether or not an object is part of the selection is governed by previous calls to the [pathName select] command.
The -configurecmd callback is always executed at least once between the [nodeHandle replace] command and when the replaced object is mapped into the widget display.
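A -configurecmd script typically unpacks the serialized array and forwards the values to the replacement window. The following is a minimal sketch; the proc name apply_css is hypothetical, and it assumes the replacement window accepts the standard -font, -foreground and -background options.

```tcl
# Hypothetical -configurecmd callback for the replacement window $win.
# The widget appends the serialized property array as the last argument.
proc apply_css {win props} {
    array set p $props
    if {[info exists p(font)]} {
        # The "font" value is already the name of a Tk font.
        $win configure -font $p(font)
    }
    # Colour values may be "transparent", which Tk cannot parse,
    # so those are skipped.
    if {[info exists p(color)] && $p(color) ne "transparent"} {
        $win configure -foreground $p(color)
    }
    if {[info exists p(background-color)]
            && $p(background-color) ne "transparent"} {
        $win configure -background $p(background-color)
    }
}

# Usage, assuming a node handle in $node and a button .html.b:
#   $node replace -configurecmd [list apply_css .html.b] .html.b
```

Building the script with [list] keeps the window name intact as a single word when the widget appends the property array.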
nodeHandle tag
nodeHandle text ?-tokens|-pre?
TODO: Document -tokens and -pre.
Apart from the default stylesheet that is always loaded (see the description of the -defaultstyle option above), a script may configure the widget with extra style information in the form of CSS stylesheet documents. Complete stylesheet documents are passed to the widget using the [pathName style] command (unlike HTML documents, stylesheets cannot be parsed incrementally).
As well as any stylesheets specified by the application, stylesheets may be included in HTML documents by document authors in several ways:
    # Implementations of application callbacks to load
    # stylesheets from the various sources enumerated above.
    # ".html" is the name of the application's tkhtml widget.
    # The variable $document contains an entire HTML document.
    # The pseudo-code <LOAD URI CONTENTS> is used to indicate
    # code to load and return the content located at $URI.
    # A script handler receives the attribute list and the tag contents.
    proc script_handler {attr tagcontents} {
        incr ::stylecount
        set id "author.[format %.4d $::stylecount]"
        set handler "import_handler $id"
        .html style -id $id.9999 -importcmd $handler $tagcontents
    }
    proc link_handler {node} {
        if {[$node attr rel] == "stylesheet"} {
            set URI [$node attr href]
            set stylesheet [<LOAD URI CONTENTS>]
            incr ::stylecount
            set id "author.[format %.4d $::stylecount]"
            set handler "import_handler $id"
            .html style -id $id.9999 -importcmd $handler $stylesheet
        }
    }
    proc import_handler {parentid URI} {
        set stylesheet [<LOAD URI CONTENTS>]
        incr ::stylecount
        set id "$parentid.[format %.4d $::stylecount]"
        set handler "import_handler $id"
        .html style -id $id.9999 -importcmd $handler $stylesheet
    }
    .html handler script style script_handler
    .html handler node link link_handler
    set ::stylecount 0
    .html parse -final $document
The complicated part of the example code above is the generation of stylesheet-ids, the values passed to the -id option of the [.html style] command. Stylesheet-ids are used to determine the precedence of each stylesheet passed to the widget, and the role it plays in the CSS cascade algorithm used to assign properties to document nodes. The first part of each stylesheet-id, which must be either "user", "author" or "agent", determines the role the stylesheet plays in the cascade algorithm. In general, author stylesheets take precedence over user stylesheets which take precedence over agent stylesheets. An author stylesheet is one supplied or linked by the author of the document. A user stylesheet is supplied by the user of the viewing application, possibly by configuring a preferences dialog or similar. An agent stylesheet is supplied by the viewing application, for example the default stylesheet configured using the -defaultstyle option.
The stylesheet id mechanism is designed so that the cascade can be correctly implemented even when the various stylesheets are passed to the widget asynchronously and out of order (as may be the case if they are being downloaded from a network server or servers).
    #
    # Contents of HTML document
    #
    <html><head>
        <link rel="stylesheet" href="A.css">
        <style>
            @import url("B.css");
            @import url("C.css");
            ... rules ...
        </style>
        <link rel="stylesheet" href="D.css">
        ... remainder of document ...
    #
    # Contents of B.css
    #
    @import "E.css";
    ... rules ...
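In the example above, the stylesheet documents A.css, B.css, C.css, D.css, E.css and the stylesheet embedded in the <style> tag are all author stylesheets. CSS states that the relative precedence of the stylesheets in this case is governed by the following rules: a stylesheet that is linked or embedded later in the document takes precedence over one that appears earlier, and a stylesheet included by an @import directive has lower precedence than the stylesheet that imports it.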
Applying the above two rules to the example documents indicates that the order of the stylesheets from least to most important is: A.css, E.css, B.css, C.css, embedded <style>, D.css. For the widget to implement the cascade correctly, the stylesheet-ids passed to the six [pathName style] commands must sort lexicographically in the same order as the stylesheet precedence determined by the above two rules. The example code above shows one approach to this. Using the example code, stylesheets would be associated with stylesheet-ids as follows:
    Stylesheet         Stylesheet-id
    ---------------------------------------------
    A.css              author.0001.9999
    <embedded style>   author.0002.9999
    B.css              author.0002.0003.9999
    E.css              author.0002.0003.0004.9999
    C.css              author.0002.0005.9999
    D.css              author.0006.9999
Entries are specified in the above table in the order in which the calls to [html style] would be made. Of course, the example code fails if 10000 or more individual stylesheet documents are loaded. More inventive solutions that avoid this kind of limitation are possible.
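That the ids in the table sort into the required precedence order can be checked in plain Tcl. This sketch assumes that the widget's lexicographic comparison behaves like the plain string comparison that [lsort] performs by default.

```tcl
# Stylesheet-ids from the table above, in the order the calls are made.
set ids {
    author.0001.9999
    author.0002.9999
    author.0002.0003.9999
    author.0002.0003.0004.9999
    author.0002.0005.9999
    author.0006.9999
}

# Sort by plain string comparison, least important stylesheet first.
set by_precedence [lsort $ids]
foreach id $by_precedence {
    puts $id
}
```

The sorted list corresponds to A.css, E.css, B.css, C.css, the embedded style and D.css. The trailing ".9999" component is what places each importing stylesheet after everything it imports, because "9" compares greater than the leading "0" of any imported component.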
Other factors, namely rule specificity and the !IMPORTANT directive are involved in determining the precedence of individual stylesheet rules. These are completely encapsulated by the widget, so are not described here. For complete details of the CSS cascade algorithm, refer to the CSS and CSS 2 specifications (www.w3.org).
Sat Feb 25 2006 |