Roth

An XSLT streamability checker

3 August 2013

This document describes Roth, a streamability analyser for XSLT 3.0.

Roth reads fragments of XSLT source code and analyses them for their streamability, following the rules for streamability analysis given in section 19 of the draft XSLT 3.0 specification. It thus provides a way to tell whether the XSLT in question is guaranteed streamable, as that term is defined by the XSLT specification. Any conforming processor that supports the streaming feature is required (as a condition of conformance) to stream any mode declared streamable and the contents of any xsl:stream instruction, if they are guaranteed streamable. (For purposes of this document, to stream a construct means to handle it in a streaming manner, in particular with limited storage consumption.)

Note that a construct is not necessarily non-streamable just because it's not guaranteed streamable. The streamability analysis in the spec is a compromise among three competing goals:

  1. It should be correct, to ensure that everything the spec classifies as guaranteed streamable is in fact streamable by a suitable processor.
  2. It should be simple, both to make it easy to perform and understand the analysis and to make it possible for a reader to see that it is correct.
  3. It should be powerful, to cover as many constructs as possible.

The first two goals conflict with the third goal, and the result is that the analysis in the XSLT 3.0 specification is by nature a bit conservative. It is not very hard to find constructs that a technically astute human being can see could be processed in constant memory (and so are streamable in fact) but that the streamability analysis in the spec does not classify as guaranteed streamable. So the specification is explicit that conforming processors are allowed to use a more aggressive and sophisticated streamability analysis than the one defined in the specification, which may enable them to stream other constructs as well.

Roth is intended to help stylesheet authors ensure that their code will be streamable by any streaming processor that conforms to the specification, by checking that the code is guaranteed streamable. The XSLT 3.0 specification requires any streaming processor to be able to stream any construct classified as guaranteed streamable by the rules in the spec, so any code that Roth classifies as guaranteed streamable should be handled in a streaming fashion by any conforming processor. Roth can thus be regarded as (among other things) a way of helping avoid dependency on any single XSLT processor.

Roth also serves the purpose of allowing the XSLT working group to apply the streamability rules to concrete examples as a way of checking the correctness and completeness of the current streamability rules. As this is written, the WG is the primary expected user of Roth; this is why the default report is so verbose.

Using Roth

In general, the procedure for using Roth is simple:

  1. Navigate to the Roth interface page (R.I.P.)
  2. Copy an XSLT construct into the text widget.
  3. Click the Analyse! button.

To use Roth to check the streamability of an xsl:stream instruction, you will need to paste the text of the xsl:stream instruction into the text widget.

To use Roth to check the streamability of a mode, you will need to run the analysis once on each template of the mode. The mode is guaranteed streamable if (and only if) all of its templates are guaranteed streamable. A future version of Roth may allow you to load an entire stylesheet and specify a mode to check for streamability. For now, though, that's not available.
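
Until that is available, the per-template results have to be combined by hand. The following is a minimal Python sketch of that bookkeeping, not part of Roth itself; the function names and the labels used for the templates are hypothetical, and the True/False verdicts are assumed to have been read off Roth's report for each template.

```python
def mode_is_guaranteed_streamable(template_verdicts):
    """Combine per-template verdicts into a verdict for the whole mode.

    template_verdicts maps a label for each template (hypothetical, e.g. its
    match pattern) to True if Roth reported it guaranteed streamable.
    """
    return all(template_verdicts.values())


def failing_templates(template_verdicts):
    """List the templates that keep the mode from being guaranteed streamable."""
    return [label for label, ok in template_verdicts.items() if not ok]


# Hypothetical verdicts for the three templates of one mode.
verdicts = {"match='order'": True, "match='item'": True, "match='note'": False}
print(mode_is_guaranteed_streamable(verdicts))
print(failing_templates(verdicts))
```

A single failing template is enough to make the whole mode fail, so the second function is usually the more useful one: it shows which templates to rework.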

To use Roth to check the streamability of a stylesheet as a whole, you will need to decide what you mean by that. The XSLT 3.0 spec defines streamability for certain constructs (in particular for modes and for the xsl:stream instruction), but not for stylesheets as a whole. Decide what modes in the stylesheet should be streamable, and use Roth to check them. A future version of Roth may accept a whole stylesheet and check every mode in the stylesheet for streamability.

Internals

Roth is written in XSLT 2.0. It defines templates for each XSLT instruction and expression type, and applies them recursively in the usual way to determine the streamability (or more precisely, the sweep) of any XSLT construct. Since XSLT instructions are XML elements, the application of templates to them takes place in the obvious way. The analysis of select expressions, match patterns, and other non-XML constructs in the XSLT code is handled by parsing them into an XML representation and then processing that representation using templates.
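
The shape of that recursion can be sketched in a few lines of Python (Roth itself is XSLT, so this is an illustration of the idea, not Roth's code). The sweep names are those used in the XSLT 3.0 drafts; everything else is an assumption made for the sketch: the handler table, the function names, the stand-in for expression analysis, and above all the combination rule, which is a deliberately crude toy and not the rule the specification gives.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"
# Sweep names used by the XSLT 3.0 drafts, from least to most demanding.
SWEEPS = ["motionless", "consuming", "free-ranging"]


def combine(sweeps):
    """Toy combination rule (an assumption, NOT the spec's rule): two or more
    consuming parts make the whole free-ranging; otherwise the widest wins."""
    sweeps = list(sweeps)
    if sweeps.count("consuming") > 1:
        return "free-ranging"
    return max(sweeps, key=SWEEPS.index, default="motionless")


def select_sweep(expr):
    """Placeholder for expression analysis: literals are motionless and
    everything else is assumed to read the input once."""
    expr = expr.strip()
    is_literal = expr[:1] in ("'", '"') or expr.isdigit()
    return "motionless" if is_literal else "consuming"


def analyse_text(elem):
    """xsl:text only emits fixed characters, so it never reads the input."""
    return "motionless"


def analyse_default(elem):
    """Fallback handler: combine the select attribute (if any) with the
    sweeps of the child elements, found by recursion."""
    parts = [analyse(child) for child in elem]
    if elem.get("select") is not None:
        parts.append(select_sweep(elem.get("select")))
    return combine(parts)


# One handler per element name, mirroring one template rule per instruction.
HANDLERS = {XSL + "text": analyse_text}


def analyse(elem):
    """Dispatch on the element name, much as xsl:apply-templates would."""
    return HANDLERS.get(elem.tag, analyse_default)(elem)


fragment = """
<xsl:template match="order" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <total><xsl:value-of select="sum(item/@price)"/><xsl:text> EUR</xsl:text></total>
</xsl:template>
"""
print(analyse(ET.fromstring(fragment)))
```

The dispatch table plays the role of the template rules and analyse_default that of a fallback rule. In Roth the select expression would first be parsed into XML and analysed by further templates; here a single placeholder function stands in for that step.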

The Web interface uses Saxon-CE (Saxon Client Edition), an open-source XSLT 2.0 implementation from Saxonica, which is compiled to JavaScript and runs in the user's browser. (So the user's XSLT fragments are never transmitted to the server, and Roth may be run on confidential code without risk of breaching confidentiality.) O'Neil Delpratt of Saxonica built the initial version of the Web interface.

The parser for XPath 3.0 expressions and patterns was generated automatically by Gunther Rademacher's REx parser generator. (Well, actually, Roth version 0.1 doesn't have a full parser; it just handles a small subset of XPath. Version 0.2 will have a parser, but it will be a parser for XPath 2.0 expressions; getting a parser for 3.0 expressions will require an XPath 3.0 grammar, which I haven't constructed yet.)

The source code of Roth can be obtained by downloading the following items from this Web site:

Miscellaneous

Version information

The current version of Roth is version 0.1, of 3 August 2013. It is based on the WG-internal XSLT 3.0 draft specification of 25 July 2013.

Limitations

The current version of Roth is incomplete; for the fullest available list of current gaps and coverage (more gaps than coverage, at the moment, sorry), see the separate document on Roth's current shortcomings.

License

The streamability analyser is distributed under the GNU General Public License. (Yes, I know; more details are needed here.)

About the name

The current version of the streamability analyser is named “Roth”, for Henry Roth, the American novelist whose magnum opus is a series of novels published under the umbrella title Mercy of a Rude Stream.