summaryrefslogtreecommitdiff
path: root/doc/internals.html
diff options
context:
space:
mode:
Diffstat (limited to 'doc/internals.html')
-rw-r--r--doc/internals.html455
1 files changed, 228 insertions, 227 deletions
diff --git a/doc/internals.html b/doc/internals.html
index 525dbc7a..f96eef3c 100644
--- a/doc/internals.html
+++ b/doc/internals.html
@@ -18,152 +18,152 @@ A:link, A:visited, A:active { text-decoration: underline }
<li><a href="internals.html#processing">The processing itself</a></li>
<li><a href="internals.html#XPath">XPath expressions compilation</a></li>
<li><a href="internals.html#XPath1">XPath interpretation</a></li>
- <li><a href="internals.html#Descriptio">Description of XPathObjects</a></li>
+ <li><a href="internals.html#Descriptio">Description of XPath
+ Objects</a></li>
<li><a href="internals.html#XPath3">XPath functions</a></li>
<li><a href="internals.html#stack">The variables stack frame</a></li>
<li><a href="internals.html#Extension">Extension support</a></li>
<li><a href="internals.html#Futher">Further reading</a></li>
<li><a href="internals.html#TODOs">TODOs</a></li>
-</ul><h3><a name="Introducti2" id="Introducti2">Introduction</a></h3><p>This document describes the processing of <a href="http://xmlsoft.org/XSLT/">libxslt</a>, the <a href="http://www.w3.org/TR/xslt">XSLT</a>C library developed for the <a href="http://www.gnome.org/">Gnome</a>project.</p><p>Note: this documentation is by definition incomplete and I am not good
-atspelling, grammar, so patches and suggestions are <a href="mailto:veillard@redhat.com">really welcome</a>.</p><h3><a name="Basics1" id="Basics1">Basics</a></h3><p>XSLT is a transformation language. It takes an input document and
-astylesheet document and generates an output document:</p><p align="center"><img src="processing.gif" alt="the XSLT processing model" /></p><p>Libxslt is written in C. It relies on <a href="http://www.xmlsoft.org/">libxml</a>, the XML C library for Gnome,
-forthe following operations:</p><ul><li>parsing files</li>
- <li>building the in-memory DOM structure associated with the
- documentshandled</li>
+</ul><h3><a name="Introducti2" id="Introducti2">Introduction</a></h3><p>This document describes the processing of <a href="http://xmlsoft.org/XSLT/">libxslt</a>, the <a href="http://www.w3.org/TR/xslt">XSLT</a> C library developed for the <a href="http://www.gnome.org/">Gnome</a> project.</p><p>Note: this documentation is by definition incomplete and I am not good at
+spelling, grammar, so patches and suggestions are <a href="mailto:veillard@redhat.com">really welcome</a>.</p><h3><a name="Basics1" id="Basics1">Basics</a></h3><p>XSLT is a transformation language. It takes an input document and a
+stylesheet document and generates an output document:</p><p align="center"><img src="processing.gif" alt="the XSLT processing model" /></p><p>Libxslt is written in C. It relies on <a href="http://www.xmlsoft.org/">libxml</a>, the XML C library for Gnome, for
+the following operations:</p><ul><li>parsing files</li>
+ <li>building the in-memory DOM structure associated with the documents
+ handled</li>
<li>the XPath implementation</li>
- <li>serializing back the result document to XML and HTML. (Text is
- handleddirectly.)</li>
-</ul><h3><a name="Keep1" id="Keep1">Keep it simple stupid</a></h3><p>Libxslt is not very specialized. It is built under the assumption that
-allnodes from the source and output document can fit in the virtual memory
-ofthe system. There is a big trade-off there. It is fine for reasonably
-sizeddocuments but may not be suitable for large sets of data. The gain is
-that itcan be used in a relatively versatile way. The input or output may
-never beserialized, but the size of documents it can handle are limited by
-the sizeof the memory available.</p><p>More specialized memory handling approaches are possible, like buildingthe
-input tree from a serialization progressively as it is consumed,factoring
-repetitive patterns, or even on-the-fly generation of the output asthe input
-is parsed but it is possible only for a limited subset of thestylesheets. In
-general the implementation of libxslt follows the followingpattern:</p><ul><li>KISS (keep it simple stupid)</li>
- <li>when there is a clear bottleneck optimize on top of this
- simpleframework and refine only as much as is needed to reach the
- expectedresult</li>
-</ul><p>The result is not that bad, clearly one can do a better job but
-morespecialized too. Most optimization like building the tree on-demand
-wouldneed serious changes to the libxml XPath framework. An easy step would
-be toserialize the output directly (or call a set of SAX-like output handler
-tokeep this a flexible interface) and hence avoid the memory consumption of
-theresult.</p><h3><a name="libxml" id="libxml">The libxml nodes</a></h3><p>DOM-like trees, as used and generated by libxml and libxslt, arerelatively
-complex. Most node types follow the given structure except a fewvariations
-depending on the node type:</p><p align="center"><img src="node.gif" alt="description of a libxml node" /></p><p>Nodes carry a <strong>name</strong>and the node
-<strong>type</strong>indicates the kind of node it represents, the most
-common ones are:</p><ul><li>document nodes</li>
+ <li>serializing back the result document to XML and HTML. (Text is handled
+ directly.)</li>
+</ul><h3><a name="Keep1" id="Keep1">Keep it simple stupid</a></h3><p>Libxslt is not very specialized. It is built under the assumption that all
+nodes from the source and output document can fit in the virtual memory of
+the system. There is a big trade-off there. It is fine for reasonably sized
+documents but may not be suitable for large sets of data. The gain is that it
+can be used in a relatively versatile way. The input or output may never be
+serialized, but the size of documents it can handle are limited by the size
+of the memory available.</p><p>More specialized memory handling approaches are possible, like building
+the input tree from a serialization progressively as it is consumed,
+factoring repetitive patterns, or even on-the-fly generation of the output as
+the input is parsed but it is possible only for a limited subset of the
+stylesheets. In general the implementation of libxslt follows the following
+pattern:</p><ul><li>KISS (keep it simple stupid)</li>
+ <li>when there is a clear bottleneck optimize on top of this simple
+ framework and refine only as much as is needed to reach the expected
+ result</li>
+</ul><p>The result is not that bad, clearly one can do a better job but more
+specialized too. Most optimization like building the tree on-demand would
+need serious changes to the libxml XPath framework. An easy step would be to
+serialize the output directly (or call a set of SAX-like output handler to
+keep this a flexible interface) and hence avoid the memory consumption of the
+result.</p><h3><a name="libxml" id="libxml">The libxml nodes</a></h3><p>DOM-like trees, as used and generated by libxml and libxslt, are
+relatively complex. Most node types follow the given structure except a few
+variations depending on the node type:</p><p align="center"><img src="node.gif" alt="description of a libxml node" /></p><p>Nodes carry a <strong>name</strong> and the node <strong>type</strong>
+indicates the kind of node it represents, the most common ones are:</p><ul><li>document nodes</li>
<li>element nodes</li>
<li>text nodes</li>
-</ul><p>For the XSLT processing, entity nodes should not be generated (i.e.
-theyshould be replaced by their content). Most nodes also contains the
-following"navigation" informations:</p><ul><li>the containing <strong>doc</strong>ument</li>
- <li>the <strong>parent</strong>node</li>
- <li>the first <strong>children</strong>node</li>
- <li>the <strong>last</strong>children node</li>
+</ul><p>For the XSLT processing, entity nodes should not be generated (i.e. they
+should be replaced by their content). Most nodes also contains the following
+"navigation" informations:</p><ul><li>the containing <strong>doc</strong>ument</li>
+ <li>the <strong>parent</strong> node</li>
+ <li>the first <strong>children</strong> node</li>
+ <li>the <strong>last</strong> children node</li>
<li>the <strong>prev</strong>ious sibling</li>
<li>the following sibling (<strong>next</strong>)</li>
-</ul><p>Elements nodes carries the list of attributes in the properties,
-anattribute itself holds the navigation pointers and the children list
-(theattribute value is not represented as a simple string to allow usage
-ofentities references).</p><p>The <strong>ns</strong>points to the namespace declaration for
-thenamespace associated to the node, <strong>nsDef</strong>is the linked
-listof namespace declaration present on element nodes.</p><p>Most nodes also carry an <strong>_private</strong>pointer which can beused
-by the application to hold specific data on this node.</p><h3><a name="XSLT" id="XSLT">The XSLT processing steps</a></h3><p>There are a few steps which are clearly decoupled at the
-interfacelevel:</p><ol><li>parse the stylesheet and generate a DOM tree</li>
- <li>take the stylesheet tree and build a compiled version of it
- (thecompilation phase)</li>
+</ul><p>Elements nodes carries the list of attributes in the properties, an
+attribute itself holds the navigation pointers and the children list (the
+attribute value is not represented as a simple string to allow usage of
+entities references).</p><p>The <strong>ns</strong> points to the namespace declaration for the
+namespace associated to the node, <strong>nsDef</strong> is the linked list
+of namespace declaration present on element nodes.</p><p>Most nodes also carry an <strong>_private</strong> pointer which can be
+used by the application to hold specific data on this node.</p><h3><a name="XSLT" id="XSLT">The XSLT processing steps</a></h3><p>There are a few steps which are clearly decoupled at the interface
+level:</p><ol><li>parse the stylesheet and generate a DOM tree</li>
+ <li>take the stylesheet tree and build a compiled version of it (the
+ compilation phase)</li>
<li>take the input and generate a DOM tree</li>
- <li>process the stylesheet against the input tree and generate an
- outputtree</li>
+ <li>process the stylesheet against the input tree and generate an output
+ tree</li>
<li>serialize the output tree</li>
</ol><p>A few things should be noted here:</p><ul><li>the steps 1/ 3/ and 5/ are optional</li>
- <li>the stylesheet obtained at 2/ can be reused by multiple processing
- 4/(and this should also work in threaded programs)</li>
- <li>the tree provided in 2/ should never be freed using xmlFreeDoc, but
- byfreeing the stylesheet.</li>
- <li>the input tree 4/ is not modified except the _private field which maybe
- used for labelling keys if used by the stylesheet</li>
-</ul><h3><a name="XSLT1" id="XSLT1">The XSLT stylesheet compilation</a></h3><p>This is the second step described. It takes a stylesheet tree,
-and"compiles" it. This associates to each node a structure stored in
-the_private field and containing information computed in the stylesheet:</p><p align="center"><img src="stylesheet.gif" alt="a compiled XSLT stylesheet" /></p><p>One xsltStylesheet structure is generated per document parsed for
-thestylesheet. XSLT documents allow includes and imports of other
-documents,imports are stored in the <strong>imports</strong>list (hence
-keeping thetree hierarchy of includes which is very important for a proper
-XSLTprocessing model) and includes are stored in the
-<strong>doclist</strong>list. An imported stylesheet has a parent link to
-allow browsing of thetree.</p><p>The DOM tree associated to the document is stored in
-<strong>doc</strong>.It is preprocessed to remove ignorable empty nodes and
-all the nodes in theXSLT namespace are subject to precomputing. This usually
-consist ofextracting all the context information from the context tree
-(attributes,namespaces, XPath expressions), and storing them in an
-xsltStylePreCompstructure associated to the <strong>_private</strong>field of
-the node.</p><p>A couple of notable exceptions to this are XSLT template nodes (more
-onthis later) and attribute value templates. If they are actually
-templates,the value cannot be computed at compilation time. (Some
-preprocessing couldbe done like isolation and preparsing of the XPath
-subexpressions but it'snot done, yet.)</p><p>The xsltStylePreComp structure also allows storing of the precompiled
-formof an XPath expression that can be associated to an XSLT element (more
-onthis later).</p><h3><a name="XSLT2" id="XSLT2">The XSLT template compilation</a></h3><p>A proper handling of templates lookup is one of the keys of fast
-XSLTprocessing. (Given a node in the source document this is the process
-offinding which templates should be applied to this node.) Libxslt follows
-thehint suggested in the <a href="http://www.w3.org/TR/xslt#patterns">5.2Patterns</a>section of the XSLT
-Recommendation, i.e. it doesn't evaluate itas an XPath expression but
-tokenizes it and compiles it as a set of rules tobe evaluated on a candidate
-node. There usually is an indication of the nodename in the last step of this
-evaluation and this is used as a key check forthe match. As a result libxslt
-builds a relatively more complex set ofstructures for the templates:</p><p align="center"><img src="templates.gif" alt="The templates related structure" /></p><p>Let's describe a bit more closely what is built. First the
-xsltStylesheetstructure holds a pointer to the template hash table. All the
-XSLT patternscompiled in this stylesheet are indexed by the value of the the
-targetelement (or attribute, pi ...) name, so when a element or an attribute
-"foo"needs to be processed the lookup is done using the name as a key.</p><p>Each of the patterns is compiled into an xsltCompMatch structure. It
-holdsthe set of rules based on the tokenization of the pattern stored in
-reverseorder (matching is easier this way). It also holds some information
-about theprevious matches used to speed up the process when one iterates over
-a set ofsiblings. (This optimization may be defeated by trashing when
-runningthreaded computation, it's unclear that this is a big deal in
-practice.)Predicate expressions are not compiled at this stage, they may be
-at run-timeif needed, but in this case they are compiled as full XPath
-expressions (theuse of some fixed predicate can probably be optimized, they
-are not yet).</p><p>The xsltCompMatch are then stored in the hash table, the clash list
-isitself sorted by priority of the template to implement "naturally" the
-XSLTpriority rules.</p><p>Associated to the compiled pattern is the xsltTemplate itself
-containingthe information required for the processing of the pattern
-including, ofcourse, a pointer to the list of elements used for building the
-patternresult.</p><p>Last but not least a number of patterns do not fit in the hash
-tablebecause they are not associated to a name, this is the case for
-patternsapplying to the root, any element, any attributes, text nodes, pi
-nodes, keysetc. Those are stored independently in the stylesheet structure as
-separatelinked lists of xsltCompMatch.</p><h3><a name="processing" id="processing">The processing itself</a></h3><p>The processing is defined by the XSLT specification (the basis of
-thealgorithm is explained in <a href="http://www.w3.org/TR/xslt#section-Introduction">the
-Introduction</a>section). Basically it works by taking the root of the input
-document andapplying the following algorithm:</p><ol><li>Finding the template applying to it. This is a lookup in the
- templatehash table, walking the hash list until the node satisfies all
- the stepsof the pattern, then checking the appropriate(s) global
- templates to seeif there isn't a higher priority rule to apply</li>
- <li>If there is no template, apply the default rule (recurse on
- thechildren)</li>
+ <li>the stylesheet obtained at 2/ can be reused by multiple processing 4/
+ (and this should also work in threaded programs)</li>
+ <li>the tree provided in 2/ should never be freed using xmlFreeDoc, but by
+ freeing the stylesheet.</li>
+ <li>the input tree 4/ is not modified except the _private field which may
+ be used for labelling keys if used by the stylesheet</li>
+</ul><h3><a name="XSLT1" id="XSLT1">The XSLT stylesheet compilation</a></h3><p>This is the second step described. It takes a stylesheet tree, and
+"compiles" it. This associates to each node a structure stored in the
+_private field and containing information computed in the stylesheet:</p><p align="center"><img src="stylesheet.gif" alt="a compiled XSLT stylesheet" /></p><p>One xsltStylesheet structure is generated per document parsed for the
+stylesheet. XSLT documents allow includes and imports of other documents,
+imports are stored in the <strong>imports</strong> list (hence keeping the
+tree hierarchy of includes which is very important for a proper XSLT
+processing model) and includes are stored in the <strong>doclist</strong>
+list. An imported stylesheet has a parent link to allow browsing of the
+tree.</p><p>The DOM tree associated to the document is stored in <strong>doc</strong>.
+It is preprocessed to remove ignorable empty nodes and all the nodes in the
+XSLT namespace are subject to precomputing. This usually consist of
+extracting all the context information from the context tree (attributes,
+namespaces, XPath expressions), and storing them in an xsltStylePreComp
+structure associated to the <strong>_private</strong> field of the node.</p><p>A couple of notable exceptions to this are XSLT template nodes (more on
+this later) and attribute value templates. If they are actually templates,
+the value cannot be computed at compilation time. (Some preprocessing could
+be done like isolation and preparsing of the XPath subexpressions but it's
+not done, yet.)</p><p>The xsltStylePreComp structure also allows storing of the precompiled form
+of an XPath expression that can be associated to an XSLT element (more on
+this later).</p><h3><a name="XSLT2" id="XSLT2">The XSLT template compilation</a></h3><p>A proper handling of templates lookup is one of the keys of fast XSLT
+processing. (Given a node in the source document this is the process of
+finding which templates should be applied to this node.) Libxslt follows the
+hint suggested in the <a href="http://www.w3.org/TR/xslt#patterns">5.2
+Patterns</a> section of the XSLT Recommendation, i.e. it doesn't evaluate it
+as an XPath expression but tokenizes it and compiles it as a set of rules to
+be evaluated on a candidate node. There usually is an indication of the node
+name in the last step of this evaluation and this is used as a key check for
+the match. As a result libxslt builds a relatively more complex set of
+structures for the templates:</p><p align="center"><img src="templates.gif" alt="The templates related structure" /></p><p>Let's describe a bit more closely what is built. First the xsltStylesheet
+structure holds a pointer to the template hash table. All the XSLT patterns
+compiled in this stylesheet are indexed by the value of the the target
+element (or attribute, pi ...) name, so when a element or an attribute "foo"
+needs to be processed the lookup is done using the name as a key.</p><p>Each of the patterns is compiled into an xsltCompMatch structure. It holds
+the set of rules based on the tokenization of the pattern stored in reverse
+order (matching is easier this way). It also holds some information about the
+previous matches used to speed up the process when one iterates over a set of
+siblings. (This optimization may be defeated by trashing when running
+threaded computation, it's unclear that this is a big deal in practice.)
+Predicate expressions are not compiled at this stage, they may be at run-time
+if needed, but in this case they are compiled as full XPath expressions (the
+use of some fixed predicate can probably be optimized, they are not yet).</p><p>The xsltCompMatch are then stored in the hash table, the clash list is
+itself sorted by priority of the template to implement "naturally" the XSLT
+priority rules.</p><p>Associated to the compiled pattern is the xsltTemplate itself containing
+the information required for the processing of the pattern including, of
+course, a pointer to the list of elements used for building the pattern
+result.</p><p>Last but not least a number of patterns do not fit in the hash table
+because they are not associated to a name, this is the case for patterns
+applying to the root, any element, any attributes, text nodes, pi nodes, keys
+etc. Those are stored independently in the stylesheet structure as separate
+linked lists of xsltCompMatch.</p><h3><a name="processing" id="processing">The processing itself</a></h3><p>The processing is defined by the XSLT specification (the basis of the
+algorithm is explained in <a href="http://www.w3.org/TR/xslt#section-Introduction">the Introduction</a>
+section). Basically it works by taking the root of the input document and
+applying the following algorithm:</p><ol><li>Finding the template applying to it. This is a lookup in the template
+ hash table, walking the hash list until the node satisfies all the steps
+ of the pattern, then checking the appropriate(s) global templates to see
+ if there isn't a higher priority rule to apply</li>
+ <li>If there is no template, apply the default rule (recurse on the
+ children)</li>
<li>else walk the content list of the selected templates, for each of them:
- <ul><li>if the node is in the XSLT namespace then the node has a
- _privatefield pointing to the preprocessed values, jump to the
- specificcode</li>
- <li>if the node is in an extension namespace, look up the
- associatedbehavior</li>
+ <ul><li>if the node is in the XSLT namespace then the node has a _private
+ field pointing to the preprocessed values, jump to the specific
+ code</li>
+ <li>if the node is in an extension namespace, look up the associated
+ behavior</li>
<li>otherwise copy the node.</li>
- </ul><p>The closure is usually done through the
- XSLT<strong>apply-templates</strong>construct recursing by applying
- theadequate template on the input node children or on the result of
- anassociated XPath selection lookup.</p>
+ </ul><p>The closure is usually done through the XSLT
+ <strong>apply-templates</strong> construct recursing by applying the
+ adequate template on the input node children or on the result of an
+ associated XPath selection lookup.</p>
</li>
-</ol><p>Note that large parts of the input tree may not be processed by a
-givenstylesheet and that on the opposite some may be processed multiple
-times.(This often is the case when a Table of Contents is built).</p><p>The module <code>transform.c</code>is the one implementing most of
-thislogic. <strong>xsltApplyStylesheet()</strong>is the entry point,
-itallocates an xsltTransformContext containing the following:</p><ul><li>a pointer to the stylesheet being processed</li>
+</ol><p>Note that large parts of the input tree may not be processed by a given
+stylesheet and that on the opposite some may be processed multiple times.
+(This often is the case when a Table of Contents is built).</p><p>The module <code>transform.c</code> is the one implementing most of this
+logic. <strong>xsltApplyStylesheet()</strong> is the entry point, it
+allocates an xsltTransformContext containing the following:</p><ul><li>a pointer to the stylesheet being processed</li>
<li>a stack of templates</li>
<li>a stack of variables and parameters</li>
<li>an XPath context</li>
@@ -173,18 +173,18 @@ itallocates an xsltTransformContext containing the following:</p><ul><li>a point
<li>current selected node list</li>
<li>the current insertion points in the output document</li>
<li>a couple of hash tables for extension elements and functions</li>
-</ul><p>Then a new document gets allocated (HTML or XML depending on the type
-ofoutput), the user parameters and global variables and parameters
-areevaluated. Then <strong>xsltProcessOneNode()</strong>which implements
-the1-2-3 algorithm is called on the root element of the input. Step 1/
-isimplemented by calling <strong>xsltGetTemplate()</strong>, step 2/
-isimplemented by <strong>xsltDefaultProcessOneNode()</strong>and step 3/
-isimplemented by <strong>xsltApplyOneTemplate()</strong>.</p><h3><a name="XPath" id="XPath">XPath expression compilation</a></h3><p>The XPath support is actually implemented in the libxml module (where itis
-reused by the XPointer implementation). XPath is a relatively
-classicexpression language. The only uncommon feature is that it is working
-on XMLtrees and hence has specific syntax and types to handle them.</p><p>XPath expressions are compiled using <strong>xmlXPathCompile()</strong>.It
-will take an expression string in input and generate a structurecontaining
-the parsed expression tree, for example the expression:</p><pre>/doc/chapter[title='Introduction']</pre><p>will be compiled as</p><pre>Compiled Expression : 10 elements
+</ul><p>Then a new document gets allocated (HTML or XML depending on the type of
+output), the user parameters and global variables and parameters are
+evaluated. Then <strong>xsltProcessOneNode()</strong> which implements the
+1-2-3 algorithm is called on the root element of the input. Step 1/ is
+implemented by calling <strong>xsltGetTemplate()</strong>, step 2/ is
+implemented by <strong>xsltDefaultProcessOneNode()</strong> and step 3/ is
+implemented by <strong>xsltApplyOneTemplate()</strong>.</p><h3><a name="XPath" id="XPath">XPath expression compilation</a></h3><p>The XPath support is actually implemented in the libxml module (where it
+is reused by the XPointer implementation). XPath is a relatively classic
+expression language. The only uncommon feature is that it is working on XML
+trees and hence has specific syntax and types to handle them.</p><p>XPath expressions are compiled using <strong>xmlXPathCompile()</strong>.
+It will take an expression string in input and generate a structure
+containing the parsed expression tree, for example the expression:</p><pre>/doc/chapter[title='Introduction']</pre><p>will be compiled as</p><pre>Compiled Expression : 10 elements
SORT
COLLECT 'child' 'name' 'node' chapter
COLLECT 'child' 'name' 'node' doc
@@ -196,96 +196,97 @@ the parsed expression tree, for example the expression:</p><pre>/doc/chapter[tit
NODE
ELEM Object is a string : Introduction
COLLECT 'child' 'name' 'node' title
- NODE</pre><p>This can be tested using the <code>testXPath</code>command (in thelibxml
-codebase) using the <code>--tree</code>option.</p><p>Again, the KISS approach is used. No optimization is done. This could bean
-interesting thing to add. <a href="http://www-106.ibm.com/developerworks/library/x-xslt2/?dwzone=x?open&amp;l=132%2ct=gr%2c+p=saxon">MichaelKay
-describes</a>a lot of possible and interesting optimizations done inSaxon
-which would be possible at this level. I'm unsure they would providemuch gain
-since the expressions tends to be relatively simple in general andstylesheets
-are still hand generated. Optimizations at the interpretationsounds likely to
-be more efficient.</p><h3><a name="XPath1" id="XPath1">XPath interpretation</a></h3><p>The interpreter is implemented by
-<strong>xmlXPathCompiledEval()</strong>which is the front-end to
-<strong>xmlXPathCompOpEval()</strong>the functionimplementing the evaluation
-of the expression tree. This evaluation followsthe KISS approach again. It's
-recursive and calls<strong>xmlXPathNodeCollectAndTest()</strong>to collect
-nodes set whenevaluating a <code>COLLECT</code>node.</p><p>An evaluation is done within the framework of an XPath context stored inan
-<strong>xmlXPathContext</strong>structure, in the framework of
-atransformation the context is maintained within the XSLT context. Its
-contentfollows the requirements from the XPath specification:</p><ul><li>the current document</li>
+ NODE</pre><p>This can be tested using the <code>testXPath</code> command (in the
+libxml codebase) using the <code>--tree</code> option.</p><p>Again, the KISS approach is used. No optimization is done. This could be
+an interesting thing to add. <a href="http://www-106.ibm.com/developerworks/library/x-xslt2/?dwzone=x?open&amp;l=132%2ct=gr%2c+p=saxon">Michael
+Kay describes</a> a lot of possible and interesting optimizations done in
+Saxon which would be possible at this level. I'm unsure they would provide
+much gain since the expressions tends to be relatively simple in general and
+stylesheets are still hand generated. Optimizations at the interpretation
+sounds likely to be more efficient.</p><h3><a name="XPath1" id="XPath1">XPath interpretation</a></h3><p>The interpreter is implemented by <strong>xmlXPathCompiledEval()</strong>
+which is the front-end to <strong>xmlXPathCompOpEval()</strong> the function
+implementing the evaluation of the expression tree. This evaluation follows
+the KISS approach again. It's recursive and calls
+<strong>xmlXPathNodeCollectAndTest()</strong> to collect nodes set when
+evaluating a <code>COLLECT</code> node.</p><p>An evaluation is done within the framework of an XPath context stored in
+an <strong>xmlXPathContext</strong> structure, in the framework of a
+transformation the context is maintained within the XSLT context. Its content
+follows the requirements from the XPath specification:</p><ul><li>the current document</li>
<li>the current node</li>
<li>a hash table of defined variables (but not used by XSLT)</li>
<li>a hash table of defined functions</li>
- <li>the proximity position (the place of the node in the current
- nodelist)</li>
+ <li>the proximity position (the place of the node in the current node
+ list)</li>
<li>the context size (the size of the current node list)</li>
- <li>the array of namespace declarations in scope (there also is a
- namespacehash table but it is not used in the XSLT transformation).</li>
-</ul><p>For the purpose of XSLT an <strong>extra</strong>pointer has been
-addedallowing to retrieve the XSLT transformation context. When an
-XPathevaluation is about to be performed, an XPath parser context is
-allocatedcontaining and XPath object stack (this is actually an XPath
-evaluationcontext, this is a remain of the time where there was no separate
-parsing andevaluation phase in the XPath implementation). Here is an overview
-of the setof contexts associated to an XPath evaluation within an
-XSLTtransformation:</p><p align="center"><img src="contexts.gif" alt="The set of contexts associated " /></p><p>Clearly this is a bit too complex and confusing and should be refactoredat
-the next set of binary incompatible releases of libxml. For example
-thexmlXPathCtxt has a lot of unused parts and should probably be merged
-withxmlXPathParserCtxt.</p><h3><a name="Descriptio" id="Descriptio">Description of XPath Objects</a></h3><p>An XPath expression manipulates XPath objects. XPath defines the
-defaulttypes boolean, numbers, strings and node sets. XSLT adds the result
-treefragment type which is basically an unmodifiable node set.</p><p>Implementation-wise, libxml follows again a KISS approach,
-thexmlXPathObject is a structure containing a type description and the
-variouspossibilities. (Using an enum could have gained some bytes.) In the
-case ofnode sets (or result tree fragments), it points to a separate
-xmlNodeSetobject which contains the list of pointers to the document
-nodes:</p><p align="center"><img src="object.gif" alt="An Node set object pointing to " /></p><p>The <a href="http://xmlsoft.org/html/libxml-xpath.html">XPath
-API</a>(andits <a href="http://xmlsoft.org/html/libxml-xpathinternals.html">'internal'part</a>)
-includes a number of functions to create, copy, compare, convert orfree XPath
-objects.</p><h3><a name="XPath3" id="XPath3">XPath functions</a></h3><p>All the XPath functions available to the interpreter are registered in
-thefunction hash table linked from the XPath context. They all share the
-samesignature:</p><pre>void xmlXPathFunc (xmlXPathParserContextPtr ctxt, int nargs);</pre><p>The first argument is the XPath interpretation context, holding
-theinterpretation stack. The second argument defines the number of
-objectspassed on the stack for the function to consume (last argument is on
-top ofthe stack).</p><p>Basically an XPath function does the following:</p><ul><li>check <code>nargs</code>for proper handling of errors or functionswith
- variable numbers of parameters</li>
- <li>pop the parameters from the stack using <code>obj
- =valuePop(ctxt);</code></li>
+ <li>the array of namespace declarations in scope (there also is a namespace
+ hash table but it is not used in the XSLT transformation).</li>
+</ul><p>For the purpose of XSLT an <strong>extra</strong> pointer has been added
+allowing to retrieve the XSLT transformation context. When an XPath
+evaluation is about to be performed, an XPath parser context is allocated
+containing and XPath object stack (this is actually an XPath evaluation
+context, this is a remain of the time where there was no separate parsing and
+evaluation phase in the XPath implementation). Here is an overview of the set
+of contexts associated to an XPath evaluation within an XSLT
+transformation:</p><p align="center"><img src="contexts.gif" alt="The set of contexts associated " /></p><p>Clearly this is a bit too complex and confusing and should be refactored
+at the next set of binary incompatible releases of libxml. For example the
+xmlXPathCtxt has a lot of unused parts and should probably be merged with
+xmlXPathParserCtxt.</p><h3><a name="Descriptio" id="Descriptio">Description of XPath Objects</a></h3><p>An XPath expression manipulates XPath objects. XPath defines the default
+types boolean, numbers, strings and node sets. XSLT adds the result tree
+fragment type which is basically an unmodifiable node set.</p><p>Implementation-wise, libxml follows again a KISS approach, the
+xmlXPathObject is a structure containing a type description and the various
+possibilities. (Using an enum could have gained some bytes.) In the case of
+node sets (or result tree fragments), it points to a separate xmlNodeSet
+object which contains the list of pointers to the document nodes:</p><p align="center"><img src="object.gif" alt="An Node set object pointing to " /></p><p>The <a href="http://xmlsoft.org/html/libxml-xpath.html">XPath API</a> (and
+its <a href="http://xmlsoft.org/html/libxml-xpathinternals.html">'internal'
+part</a>) includes a number of functions to create, copy, compare, convert or
+free XPath objects.</p><h3><a name="XPath3" id="XPath3">XPath functions</a></h3><p>All the XPath functions available to the interpreter are registered in the
+function hash table linked from the XPath context. They all share the same
+signature:</p><pre>void xmlXPathFunc (xmlXPathParserContextPtr ctxt, int nargs);</pre><p>The first argument is the XPath interpretation context, holding the
+interpretation stack. The second argument defines the number of objects
+passed on the stack for the function to consume (last argument is on top of
+the stack).</p><p>Basically an XPath function does the following:</p><ul><li>check <code>nargs</code> for proper handling of errors or functions
+ with variable numbers of parameters</li>
+ <li>pop the parameters from the stack using <code>obj =
+ valuePop(ctxt);</code></li>
<li>do the function specific computation</li>
- <li>push the result parameter on the stack using
- <code>valuePush(ctxt,res);</code></li>
- <li>free up the input parameters
- with<code>xmlXPathFreeObject(obj);</code></li>
+ <li>push the result parameter on the stack using <code>valuePush(ctxt,
+ res);</code></li>
+ <li>free up the input parameters with
+ <code>xmlXPathFreeObject(obj);</code></li>
<li>return</li>
-</ul><p>Sometime the work can be done directly by modifying in-situ the top
-objecton the stack <code>ctxt-&gt;value</code>.</p><h3><a name="stack" id="stack">The XSLT variables stack frame</a></h3><p>Not to be confused with XPath object stack, this stack holds the
-XSLTvariables and parameters as they are defined through the recursive calls
-ofcall-template, apply-templates and default templates. This is used to
-definethe scope of variables being called.</p><p>This part seems to be the most urgent attention right now, first it isdone
-in a very inefficient way since the location of the variables andparameters
-within the stylesheet tree is still done at run time (it reallyshould be done
-statically at compile time), and I am still unsure that myunderstanding of
-the template variables and parameter scope is actuallyright.</p><p>This part of the documentation is still to be written once this part ofthe
-code will be stable. <span style="background-color: #FF0000">TODO</span></p><h3><a name="Extension" id="Extension">Extension support</a></h3><p>There is a separate document explaining <a href="extensions.html">how
-theextension support works</a>.</p><h3><a name="Futher" id="Futher">Further reading</a></h3><p>Michael Kay wrote <a href="http://www-106.ibm.com/developerworks/library/x-xslt2/?dwzone=x?open&amp;l=132%2ct=gr%2c+p=saxon">areally
-interesting article on Saxon internals</a>and the work he did onperformance
-issues. I wishes I had read it before starting libxslt design (Iwould
-probably have avoided a few mistakes and progressed faster). A lot ofthe
-ideas in his papers should be implemented or at least tried inlibxslt.</p><p>The <a href="http://xmlsoft.org/">libxml documentation</a>, especially <a href="http://xmlsoft.org/xmlio.html">the I/O interfaces</a>and the <a href="http://xmlsoft.org/xmlmem.html">memory management</a>.</p><h3><a name="TODOs" id="TODOs">TODOs</a></h3><p>redesign the XSLT stack frame handling. Far too much work is done
-atexecution time. Similarly for the attribute value templates handling,
-atleast the embedded subexpressions ought to be precompiled.</p><p>Allow output to be saved to a SAX like output (this notion of SAX like
-APIfor output should be added directly to libxml).</p><p>Implement and test some of the optimization explained by Michael
-Kayespecially:</p><ul><li>static slot allocation on the stack frame</li>
+</ul><p>Sometime the work can be done directly by modifying in-situ the top object
+on the stack <code>ctxt-&gt;value</code>.</p><h3><a name="stack" id="stack">The XSLT variables stack frame</a></h3><p>Not to be confused with XPath object stack, this stack holds the XSLT
+variables and parameters as they are defined through the recursive calls of
+call-template, apply-templates and default templates. This is used to define
+the scope of variables being called.</p><p>This part seems to be the most urgent attention right now, first it is
+done in a very inefficient way since the location of the variables and
+parameters within the stylesheet tree is still done at run time (it really
+should be done statically at compile time), and I am still unsure that my
+understanding of the template variables and parameter scope is actually
+right.</p><p>This part of the documentation is still to be written once this part of
+the code will be stable. <span style="background-color: #FF0000">TODO</span></p><h3><a name="Extension" id="Extension">Extension support</a></h3><p>There is a separate document explaining <a href="extensions.html">how the
+extension support works</a>.</p><h3><a name="Futher" id="Futher">Further reading</a></h3><p>Michael Kay wrote <a href="http://www-106.ibm.com/developerworks/library/x-xslt2/?dwzone=x?open&amp;l=132%2ct=gr%2c+p=saxon">a
+really interesting article on Saxon internals</a> and the work he did on
+performance issues. I wishes I had read it before starting libxslt design (I
+would probably have avoided a few mistakes and progressed faster). A lot of
+the ideas in his papers should be implemented or at least tried in
+libxslt.</p><p>The <a href="http://xmlsoft.org/">libxml documentation</a>, especially <a href="http://xmlsoft.org/xmlio.html">the I/O interfaces</a> and the <a href="http://xmlsoft.org/xmlmem.html">memory management</a>.</p><h3><a name="TODOs" id="TODOs">TODOs</a></h3><p>redesign the XSLT stack frame handling. Far too much work is done at
+execution time. Similarly for the attribute value templates handling, at
+least the embedded subexpressions ought to be precompiled.</p><p>Allow output to be saved to a SAX like output (this notion of SAX like API
+for output should be added directly to libxml).</p><p>Implement and test some of the optimization explained by Michael Kay
+especially:</p><ul><li>static slot allocation on the stack frame</li>
<li>specific boolean interpretation of an XPath expression</li>
<li>some of the sorting optimization</li>
- <li>Lazy evaluation of location path. (this may require more changes
- butsounds really interesting. XT does this too.)</li>
- <li>Optimization of an expression tree (This could be done as a
- completelyindependent module.)</li>
-</ul><p></p><p>Error reporting, there is a lot of case where the XSLT
-specificationspecify that a given construct is an error are not checked
-adequately bylibxslt. Basically one should do a complete pass on the XSLT
-spec again andadd all tests to the stylesheet compilation. Using the DTD
-provided in theappendix and making direct checks using the libxml validation
-API sounds agood idea too (though one should take care of not raising errors
-forelements/attributes in different namespaces).</p><p>Double check all the places where the stylesheet compiled form might
-bemodified at run time (extra removal of blanks nodes, hint on
-thexsltCompMatch).</p><p></p><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>
+ <li>Lazy evaluation of location path. (this may require more changes but
+ sounds really interesting. XT does this too.)</li>
+ <li>Optimization of an expression tree (This could be done as a completely
+ independent module.)</li>
+</ul><p></p><p>Error reporting, there is a lot of case where the XSLT specification
+specify that a given construct is an error are not checked adequately by
+libxslt. Basically one should do a complete pass on the XSLT spec again and
+add all tests to the stylesheet compilation. Using the DTD provided in the
+appendix and making direct checks using the libxml validation API sounds a
+good idea too (though one should take care of not raising errors for
+elements/attributes in different namespaces).</p><p>Double check all the places where the stylesheet compiled form might be
+modified at run time (extra removal of blanks nodes, hint on the
+xsltCompMatch).</p><p></p><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>