"LaTeXML::Core::Document" - represents an XML document
under construction.
A "LaTeXML::Core::Document"
represents an XML document being constructed by LaTeXML, and also provides
the methods for constructing it. It extends LaTeXML::Common::Object.
LaTeXML will have digested the source material resulting in a
LaTeXML::Core::List (from a LaTeXML::Core::Stomach) of LaTeXML::Core::Box
objects, LaTeXML::Core::Whatsit objects and sublists. At this stage, a
document is created and it is responsible for `absorbing' the digested
material. Generally, the LaTeXML::Core::Box and LaTeXML::Core::List objects
create text nodes, whereas the LaTeXML::Core::Whatsit objects create "XML"
document fragments, elements and attributes according to the defining
LaTeXML::Core::Definition::Constructor.
Most document construction occurs at a current insertion
point where material will be added, and which moves along with the
inserted material. The LaTeXML::Common::Model, derived from various
declarations and document type, is consulted to determine whether an
insertion is allowed and when elements may need to be automatically opened
or closed in order to carry out a given insertion. For example, a
"subsection" element will typically be
closed automatically when an attempt is made to open a
"section" element.
In the methods described here, the term
$qname is used for XML qualified names. These are
tag names with a namespace prefix. The prefix should be one registered with
the current Model, for use within the code. This prefix is not necessarily
the same as the one used in any DTD, but should be mapped to a namespace
URI that was registered for the DTD.
The arguments named $node are an
XML::LibXML node.
The methods here are grouped into three sections covering basic
access to the document, insertion methods at the current insertion point,
and less commonly used, lower-level, document manipulation methods.
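To make the automatic closing described above more concrete, here is a minimal
toy sketch in plain Perl. It is not LaTeXML's implementation and does not use
LaTeXML::Common::Model; the tables %can_contain and %auto_close and the
function open_element are hypothetical stand-ins for the document model and
for the logic behind opening an element at the insertion point.

```perl
use strict;
use warnings;

# Toy content model (an assumption for illustration only):
# which children each element may contain directly.
my %can_contain = (
  document   => { section    => 1 },
  section    => { subsection => 1, para => 1 },
  subsection => { para       => 1 },
);
# Elements that the toy model allows to be closed automatically.
my %auto_close = (section => 1, subsection => 1, para => 1);

# Open $tag given the stack of currently open elements (innermost last),
# closing open elements automatically until one can contain $tag.
# Returns the names of the elements that were closed.
sub open_element {
  my ($stack, $tag) = @_;
  my @open = @$stack;
  my @closed;
  while (@open && !($can_contain{ $open[-1] } || {})->{$tag}) {
    my $top = $open[-1];
    die "Cannot insert <$tag>: <$top> may not be closed automatically\n"
      unless $auto_close{$top};
    push @closed, pop @open;
  }
  die "No open element can contain <$tag>\n" unless @open;
  @$stack = (@open, $tag);    # the new element becomes the insertion point
  return @closed;
}

my @stack = ('document');
open_element(\@stack, $_) for qw(section subsection para);
my @closed = open_element(\@stack, 'section');
print "closed: @closed\n";    # closed: para subsection section
print "open:   @stack\n";     # open:   document section
```

Opening the second "section" closes the open "para", "subsection" and
"section" in turn, because none of them may contain a "section" in this toy
model, and the new element then becomes the innermost open node.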
- "$doc = $document->getDocument;"
- Returns the "XML::LibXML::Document"
currently being constructed.
- "$doc = $document->getModel;"
- Returns the "LaTeXML::Common::Model"
that represents the document model used for this document.
- "$node = $document->getNode;"
- Returns the node at the current insertion point during
construction. This node is considered still to be `open'; any insertions
will go into it (if possible). The node will be an
"XML::LibXML::Element",
"XML::LibXML::Text" or, initially,
"XML::LibXML::Document".
- "$node = $document->getElement;"
- Returns the closest ancestor to the current insertion point that is an
Element.
- "@nodes = $document->getChildElements($node);"
- Returns a list of the child elements, if any, of the
$node.
- "$node = $document->getLastChildElement($node);"
- Returns the last child element of the $node, if it
has one, else undef.
- "$node = $document->getFirstChildElement($node);"
- Returns the first child element of the $node, if
it has one, else undef.
- "@nodes = $document->findnodes($xpath,$node);"
- Returns a list of nodes matching the given $xpath
expression. The context node for $xpath is
$node, if given, otherwise it is the document
element.
- "$node = $document->findnode($xpath,$node);"
- Returns the first node matching the given $xpath
expression. The context node for $xpath is
$node, if given, otherwise it is the document
element.
- "$qname = $document->getNodeQName($node);"
- Returns the qualified name (localname with namespace prefix) of the given
$node. The namespace prefix mapping is the code
mapping of the current document model.
- "$boolean = $document->canContain($tag,$child);"
- Returns whether an element $tag can contain a
child $child. $tag and
$child can be nodes, qualified names of nodes
(prefix:localname), or one of a set of special symbols
"#PCDATA",
"#Comment",
"#Document" or
"#ProcessingInstruction".
- "$boolean = $document->canContainIndirect($tag,$child);"
- Returns whether an element $tag can contain a
child $child either directly, or after
automatically opening one or more autoOpen-able elements.
- "$boolean = $document->canContainSomehow($tag,$child);"
- Returns whether an element $tag can contain a
child $child either directly, or after
automatically opening one or more autoOpen-able elements.
- "$boolean = $document->canHaveAttribute($tag,$attrib);"
- Returns whether an element $tag can have an
attribute named $attrib.
- "$boolean = $document->canAutoOpen($tag);"
- Returns whether an element $tag is able to be
automatically opened.
- "$boolean = $document->canAutoClose($node);"
- Returns whether the node $node can be
automatically closed.
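The model queries above can be combined to ask, before inserting anything, how
an element would fit at the current insertion point. The following is a small
sketch only: describe_insertion is a hypothetical helper, not part of LaTeXML,
and it assumes that $document is a LaTeXML::Core::Document in which some
element is already open, so that getElement returns a node.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): report how an element named
# $qname could be placed at the current insertion point of $document.
sub describe_insertion {
  my ($document, $qname) = @_;
  # Closest element ancestor of the insertion point; assumed to be defined.
  my $current = $document->getElement;
  # canContain accepts either nodes or qualified names for both arguments.
  return 'direct'   if $document->canContain($current, $qname);
  return 'indirect' if $document->canContainIndirect($current, $qname);
  return 'none';
}
```

A caller might use the result to decide whether to open the element directly
or to look for another place in the document first.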
These methods are the most common ones used for construction of
documents. They generally operate by creating new material at the current
insertion point. That point initially is just the document itself, but
it moves along to follow any new insertions. These methods also adapt to the
document model so as to automatically open or close elements, when it is
required for the pending insertion and allowed by the document model (see
Tag).
- "$xmldoc = $document->finalize;"
- This method finalizes the document by cleaning up various temporary
attributes, and returns the XML::LibXML::Document that was
constructed.
- "@nodes = $document->absorb($digested);"
- Absorb the $digested object into the document at
the current insertion point according to its type. Various of the
other methods are invoked as needed, and document nodes may be
automatically opened or closed according to the document model.
This method returns the nodes that were constructed. Note that
the nodes may include children of other nodes, and nodes that may
already have been removed from the document (see filterChildren and
filterDeletions). Also, text insertions are often merged with existing
text nodes; in such cases, the whole text node is included in the
result.
- "$document->insertElement($qname,$content,%attributes);"
- This is a shorthand for creating an element $qname
(with given attributes), absorbing $content from
within that new node, and then closing it. The
$content must be digested material, either a
single box, or an array of boxes, which will be absorbed into the element.
This method returns the newly created node, although it will no longer be
the current insertion point.
- "$document->insertMathToken($string,%attributes);"
- Insert a math token (XMTok) containing the string
$string with the given attributes. Useful
attributes would be name, role, font. Returns the newly inserted
node.
- "$document->insertComment($text);"
- Insert, and return, a comment with the given $text
into the current node.
- "$document->insertPI($op,%attributes);"
- Insert, and return, a ProcessingInstruction into the current node.
- "$document->openText($text,$font);"
- Open a text node in font $font, performing any
required automatic opening and closing of intermediate nodes (including
those needed for font changes) and inserting the string
$text into it.
- "$document->openElement($qname,%attributes);"
- Open an element, named $qname and with the given
attributes. This will be inserted into the current node while performing
any required automatic opening and closing of intermediate nodes. The new
element is returned, and also becomes the current insertion point. An
error (fatal if in "Strict" mode) is
signalled if there is no allowed way to insert such an element into the
current node.
- "$document->closeElement($qname);"
- Close the closest open element named $qname
including any intermediate nodes that may be automatically closed. If that
is not possible, signal an error. The closed node's parent becomes the
current node. This method returns the closed node.
- "$boolean = $document->isOpenable($qname);"
- Check whether it is possible to open a $qname
element at the current insertion point.
- "$node = $document->isCloseable($qname);"
- Check whether it is possible to close a $qname
element, returning the node that would be closed if possible, otherwise
undef.
- "$document->maybeCloseElement($qname);"
- Close a $qname element, if it is possible to do
so; returns the closed node if it was found, else undef.
- "$document->addAttribute($key=>$value);"
- Add the given attribute to the node nearest to the current insertion point
that is allowed to have it. This does not change the current insertion
point.
- "$document->closeToNode($node);"
- This method closes all children of $node until
$node becomes the insertion point. Note that it
closes any open nodes, not only autoCloseable ones.
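As an illustration of how the construction methods fit together, the sketch
below spells out the open, absorb, close sequence that insertElement is
described as a shorthand for. The helper name insert_wrapped is hypothetical,
and the real insertElement may take additional care that is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): open a $qname element at the
# current insertion point, absorb digested $content into it, and close it.
# $content is a single digested object or an array reference of them.
sub insert_wrapped {
  my ($document, $qname, $content, %attributes) = @_;
  # The new element becomes the current insertion point.
  my $node = $document->openElement($qname, %attributes);
  $document->absorb($_) for (ref $content eq 'ARRAY' ? @$content : $content);
  # Closing moves the insertion point back to the parent of the new element.
  $document->closeElement($qname);
  return $node;
}
```

It might be called as insert_wrapped($document, 'ltx:note', $box, role =>
'footnote'), where the qualified name and attribute are only examples and
must be allowed by the document model in use.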
Internal Insertion Methods
These are described as an aide to understanding the code; they
rarely, if ever, should be used outside this module.
- "$document->setNode($node);"
- Sets the current insertion point to be
$node. This should be rarely used, if at all; the
construction methods of document generally maintain the notion of
insertion point automatically. This may be useful to allow insertion into
a different part of the document, but you probably want to set the
insertion point back to the previous node, afterwards.
- "$string = $document->getInsertionContext($levels);"
- For debugging, return a string showing the context of the current
insertion point; that is, the string of the nodes leading up to it. If
$levels is defined, show only that many
nodes.
- "$node = $document->find_insertion_point($qname);"
- This internal method is used to find the appropriate point, relative to
the current insertion point, at which an element with the specified
$qname can be inserted. That position may require
automatic opening or closing of elements, according to what is allowed by
the document model.
- "@nodes = getInsertionCandidates($node);"
- Returns a list of elements where an arbitrary insertion might take place.
Roughly this is a list starting with $node,
followed by its parent and the parent's siblings (in reverse order),
followed by the grandparent and siblings (in reverse order).
- "$node = $document->floatToElement($qname);"
- Finds the nearest element at or preceding the current insertion point (see
"getInsertionCandidates"), that can
accept an element $qname; it moves the insertion
point to that point, and returns the previous insertion point. Generally,
after doing whatever you need at the new insertion point, you should call
"$document->setNode($node);" to
restore the insertion point. If no such point is found, the insertion
point is left unchanged, and undef is returned.
- "$node = $document->floatToAttribute($key);"
- This method works the same as
"floatToElement", but finds the nearest
element that can accept the attribute $key.
- "$node = $document->openText_internal($text);"
- This is an internal method, used by
"openText", that assumes the insertion
point has been appropriately adjusted.
- "$node = $document->openMathText_internal($text);"
- This internal method appends $text to the current
insertion point, which is assumed to be a math node. It checks for math
ligatures and carries out any combinations called for.
- "$node = $document->closeText_internal();"
- This internal method closes the current node, which should be a text node.
It carries out any text ligatures on the content.
- "$node = $document->closeNode_internal($node);"
- This internal method closes any open text or element nodes starting at the
current insertion point, up to and including
$node. Afterwards, the parent of
$node will be the current insertion point. It
condenses the tree to avoid redundant font switching elements.
- "$document->afterOpen($node);"
- Carries out any afterOpen operations that have been recorded (using
"Tag") for the element name of
$node.
- "$document->afterClose($node);"
- Carries out any afterClose operations that have been recorded (using
"Tag") for the element name of
$node.
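The save-and-restore pattern recommended for floatToElement and setNode can be
sketched as follows. The helper insert_floated is hypothetical and not part of
LaTeXML; it relies only on the behaviour described above, namely that
floatToElement returns the previous insertion point, or undef if no suitable
place was found.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): insert a $qname element at the
# nearest place that accepts it, then restore the insertion point.
sub insert_floated {
  my ($document, $qname, $content, %attributes) = @_;
  # Moves the insertion point and returns the previous one, or returns
  # undef (leaving the insertion point unchanged) if no place was found.
  my $saved = $document->floatToElement($qname);
  return unless defined $saved;
  my $node = $document->insertElement($qname, $content, %attributes);
  # Restore the insertion point so normal construction can continue.
  $document->setNode($saved);
  return $node;
}
```

Returning early when nothing was found avoids calling setNode with an
undefined node and lets the caller decide how to report the failure.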
The following methods are used to perform various sorts of
modification and rearrangements of the document, after the normal flow of
insertion has taken place. These may be needed after an environment (or
perhaps the whole document) has been completed and one needs to analyze what
it contains to decide on the appropriate representation.
- "$document->setAttribute($node,$key,$value);"
- Sets the attribute $key to
$value on $node. This
method is preferred over the direct LibXML one, since it takes care of
decoding namespaces (if $key is a qname), and also
manages recording of xml:id's.
- "$document->recordID($id,$node);"
- Records the association of the given $node with
the $id, which should be the
"xml:id" attribute of the
$node. Usually this association will be maintained
by the methods that create nodes or set attributes.
- "$document->unRecordID($id);"
- Removes the recorded association between the given $id and its node, if
any. This might be needed if a node is deleted.
- "$document->modifyID($id);"
- Adjusts $id, if needed, so that it is unique. It
does this by appending a letter and incrementing until it finds an id that
is not yet associated with a node.
- "$node = $document->lookupID($id);"
- Returns the node, if any, that is associated with the given
$id.
- "$document->setNodeBox($node,$box);"
- Records the $box (being a Box, Whatsit or List),
that was (presumably) responsible for the creation of the element
$node. This information is useful for determining
source locations, original TeX strings, and so forth.
- "$box = $document->getNodeBox($node);"
- Returns the $box that was responsible for creating
the element $node.
- "$document->setNodeFont($node,$font);"
- Records the font object that encodes the font that should be used to
display any text within the element $node.
- "$font = $document->getNodeFont($node);"
- Returns the font object associated with the element
$node.
- "$node =
$document->openElementAt($point,$qname,%attributes);"
- Opens a new child element in $point with the
qualified name $qname and with the given
attributes. This method is not affected by, nor does it affect, the
current insertion point. It does manage namespaces, xml:id's and
associating a box, font and locator with the new element, as well as
running any "afterOpen" operations.
- "$node = $document->closeElementAt($node);"
- Closes $node. This method is not affected by, nor
does it affect, the current insertion point. However, it does run any
"afterClose" operations, so any element
that was created using the lower-level
"openElementAt" should be closed using
this method.
- "$node = $document->appendClone($node,@newchildren);"
- Appends clones of @newchildren to
$node. This method modifies any ids found within
@newchildren (using
"modifyID"), and fixes up any references
to those ids within the clones so that they refer to the modified id.
- "$node = $document->wrapNodes($qname,@nodes);"
- This method wraps the @nodes in a new element with
qualified name $qname; that new node replaces the
first of @nodes. The remaining nodes in
@nodes must be following siblings of the first
one.
NOTE: Does this need multiple nodes? If so, perhaps some kind
of movenodes helper? Otherwise, what about attributes?
- "$node = $document->unwrapNodes($node);"
- Unwrap the children of $node, by replacing
$node by its children.
- "$node = $document->replaceNode($node,@nodes);"
- Replace $node by @nodes;
presumably they are some sort of descendant nodes.
- "$node = $document->renameNode($node,$newname);"
- Rename $node to the tagname
$newname; equivalently replace
$node by a new node with name
$newname and copy the attributes and contents. It
is assumed that $newname can contain those
attributes and contents.
- "@nodes = $document->filterDeletions(@nodes);"
- This function is useful with
"$doc->absorb($box)", when you want
to filter out any nodes that have been deleted and no longer appear in the
document.
- "@nodes = $document->filterChildren(@nodes);"
- This function is useful with
"$doc->absorb($box)", when you want
to filter out any nodes that are children of other nodes in
@nodes.
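Since absorb may return nodes that were later removed, as well as nodes nested
inside other returned nodes, the two filters are naturally chained when only
the surviving outermost nodes are wanted. The helper below, absorb_top_level,
is a hypothetical convenience wrapper and not part of LaTeXML; the order of
the two filters is a choice made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): absorb $box and return only the
# resulting nodes that are still in the document and are not children of
# other nodes in the result.
sub absorb_top_level {
  my ($document, $box) = @_;
  my @nodes = $document->absorb($box);
  @nodes = $document->filterDeletions(@nodes);    # drop removed nodes
  return $document->filterChildren(@nodes);       # drop nested nodes
}
```

The remaining nodes could then be passed to a method such as wrapNodes,
provided they satisfy its requirement of being following siblings.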
Bruce Miller <bruce.miller@nist.gov>
Public domain software, produced as part of work done by the
United States Government & not subject to copyright in the US.