This document describes Roth, a streamability analyser for XSLT 3.0.
Roth reads fragments of XSLT source code and analyses them for
their streamability, following the rules for streamability
analysis given in section 19 of the draft XSLT 3.0
specification. It thus provides a way to tell whether the
XSLT in question is guaranteed streamable, as
that term is defined by the XSLT specification. Any
conforming processor that supports the streaming feature is
required (as a condition of conformance) to stream
any mode declared streamable and the contents of any xsl:stream
instruction, if they are guaranteed streamable. (For purposes of
this document, to stream a construct means to handle it in a
streaming manner, in particular with limited storage consumption.)
Note that a construct is not necessarily non-streamable just because it's not guaranteed streamable. The streamability analysis in the spec is a compromise among three competing goals:
The first two goals conflict with the third goal, and the result is that the analysis in the XSLT 3.0 specification is by nature a bit conservative. It is not very hard to find constructs that a technically astute human being can see could be processed in constant memory (and so are streamable in fact) but that the streamability analysis in the spec does not classify as guaranteed streamable. So the specification is explicit that conforming processors are allowed to use a more aggressive and sophisticated streamability analysis than the one defined in the specification, which may enable them to stream other constructs as well.
Roth is intended to help stylesheet authors ensure that their code will be streamable by any streaming processor that conforms to the specification. It does so by checking that the code is guaranteed streamable. Because the XSLT 3.0 specification requires every streaming processor to stream any construct that the rules in the spec classify as guaranteed streamable, any code that Roth classifies as guaranteed streamable should be handled in a streaming fashion by any conforming processor. Roth can thus be regarded as (among other things) a way of helping avoid dependency on any single XSLT processor.
Roth also serves the purpose of allowing the XSLT working group to apply the streamability rules to concrete examples as a way of checking the correctness and completeness of the current streamability rules. As this is written, the WG is the primary expected user of Roth; this is why the default report is so verbose.
In general, the procedure for using Roth is simple: paste the XSLT fragment you want to check into the text widget, then press the Analyse! button.

To use Roth to check the streamability of an xsl:stream instruction, you will need to paste the text of the xsl:stream instruction into the text widget.
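As an illustration, a fragment of the kind one might paste could look like the following sketch. The file name transactions.xml and the element names are hypothetical, and the shape follows the aggregate-function style of example used in the XSLT 3.0 drafts. Whether the xsl namespace declaration has to be included in the pasted text is an assumption. Since version 0.1 of Roth handles only a small subset of XPath, this particular expression may fall outside what the current version can analyse.

```xml
<!-- Hypothetical input: transactions.xml and its element names are
     invented for illustration only. -->
<xsl:stream href="transactions.xml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <count>
    <!-- A single downward selection feeding an aggregate function -->
    <xsl:value-of select="count(transactions/transaction)"/>
  </count>
</xsl:stream>
```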
To use Roth to check the streamability of a mode, you will need to run the analysis once on each template of the mode. The mode is guaranteed streamable if (and only if) all of its templates are guaranteed streamable. A future version of Roth may allow you to load an entire stylesheet and specify a mode to check for streamability. For now, though, that's not available.
To use Roth to check the streamability of a stylesheet
as a whole, you will need to decide what you mean by that.
The XSLT 3.0 spec defines streamability for certain constructs
(in particular for modes and for the xsl:stream
instruction), but not for stylesheets as a whole. Decide
what modes in the stylesheet should be streamable, and use
Roth to check them.
A future version of Roth may accept a whole stylesheet and
check every mode in the stylesheet for streamability.
Roth is written in XSLT 2.0. It defines templates for each
XSLT instruction and expression type, and applies them recursively
in the usual way to determine the streamability (or more precisely,
the sweep) of any XSLT construct. Since XSLT instructions
are XML elements, the application of templates to them takes place
in the obvious way. The analysis of select expressions, match
patterns, and other non-XML constructs in the
XSLT code is handled by parsing them into an XML representation and
then processing that representation using templates.
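The template-per-construct approach described above can be sketched roughly as follows. This is a toy illustration, not Roth's actual source. The ex namespace, the ex:sweep mode, and the ex:result element are made up for the example. So is the rule it applies: anything with a select attribute is reported as consuming, and everything else as motionless unless a child is consuming. That rule is far cruder than the rules in the specification. A real analyser would parse the select expression into XML and apply templates to it, rather than just testing for the attribute's presence.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ex="http://example.org/hypothetical-analyser">

  <xsl:output method="xml" indent="yes"/>

  <!-- Entry point: analyse the outermost element of the fragment -->
  <xsl:template match="/">
    <xsl:apply-templates select="*" mode="ex:sweep"/>
  </xsl:template>

  <!-- Toy rule (NOT the spec's): an XSLT instruction with a select
       attribute is simply reported as 'consuming'. -->
  <xsl:template match="xsl:*[@select]" mode="ex:sweep">
    <ex:result construct="{name()}" sweep="consuming"/>
  </xsl:template>

  <!-- Any other element: analyse the children recursively, then
       combine their results into a result for this element. -->
  <xsl:template match="*" mode="ex:sweep">
    <xsl:variable name="children" as="element(ex:result)*">
      <xsl:apply-templates select="*" mode="ex:sweep"/>
    </xsl:variable>
    <ex:result construct="{name()}"
        sweep="{if ($children/@sweep = 'consuming')
                then 'consuming' else 'motionless'}">
      <xsl:sequence select="$children"/>
    </ex:result>
  </xsl:template>

</xsl:stylesheet>
```

Run against a file containing an XSLT fragment, this stylesheet emits a nested tree of ex:result elements, one per element visited.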
The Web interface uses Saxon-CE (Saxon Client Edition), an open-source XSLT 2.0 implementation from Saxonica, which is compiled to JavaScript and runs in the user's browser. (So the user's XSLT fragments are never transmitted to the server, and Roth may be run on confidential code without risk of breaching confidentiality.) O'Neil Delpratt of Saxonica built the initial version of the Web interface.
The parser for XPath 3.0 expressions and patterns was generated automatically by Gunther Rademacher's REx parser generator. (Well, actually, Roth version 0.1 doesn't have a full parser; it just handles a small subset of XPath. Version 0.2 will have a parser, but it will be a parser for XPath 2.0 expressions; getting a parser for 3.0 expressions will require an XPath 3.0 grammar, which I haven't constructed yet.)
The source code of Roth can be obtained by downloading the following items from this Web site:
roth:analyse, which does the actual work)

The current version of Roth is version 0.1, of 3 August 2013. It is based on the WG-internal XSLT 3.0 draft specification of 25 July 2013.
The current version of Roth is incomplete; for the fullest available list of current gaps and coverage (more gaps than coverage, at the moment, sorry), see the separate document on Roth's current shortcomings.
The streamability analyser is distributed under the GNU General Public License. (Yes, I know; more details are needed here.)
The current version of the streamability analyser is named “Roth”, for Henry Roth, the American novelist whose magnum opus is a series of novels published under the umbrella title Mercy of a Rude Stream.