Balisage 2012 – just a week to go

[31 July 2012]

Balisage 2012 is just a week away. Next Monday there is the pre-conference symposium on quality assurance and quality control in XML systems, and a week from today the conference proper starts.

I’m looking forward to pretty much all of the papers on the program, so it’s kind of hard to pick any out for particular mention. And yet, unless I want to just reproduce the program for the conference, I’m going to have to.

Several papers this year deal, one way or another, with the relation of XML and JSON. Some talk about JSON support in XML tools, some about simplifying XML so it has more appeal to the kind of person who finds JSON attractive. Hans-Jürgen Rennau has a different take: he proposes a modest generalization of the XDM data model (which underlies XPath, XSLT, and XQuery and is as close as anything is likely to come to being the consensus model for XML) that makes the existing XDM and JSON models each a specialization of the more general model. Since XPath, XQuery, and XSLT work on XDM instances, not on serialized data, they then apply without contortions both to XML and to JSON. (Of course, they need a few modest extensions to cover the new data model, too.)

Changing the underlying data model for a technology is hard, of course, but it’s not impossible (SQL has done so, at least in some ways, and that’s one reason for its longevity). I think Rennau’s proposal merits serious discussion. It’s certainly one of the most far-reaching papers at this year’s conference.
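
To make the general idea concrete, here is a toy sketch in Python. It is emphatically not Rennau's model and not XDM: the `Node` class, the `from_xml` and `from_json` converters, and the `path` function are all names invented for this illustration, and attributes, namespaces, and node identity are ignored entirely. It shows only the shape of the argument: if XML-like and JSON-like data are both instances of one slightly more general tree model, a single navigation function can serve both.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical generalized node (an assumption for illustration only).

    XML-like nodes carry a label and unkeyed children;
    JSON-like nodes carry no label, and object members are keyed children.
    """
    label: Optional[str] = None
    children: list = field(default_factory=list)  # list of (key, child) pairs


def from_xml(elem):
    """Map an ElementTree element to a Node (attributes are ignored)."""
    children = []
    if elem.text and elem.text.strip():
        children.append((None, elem.text.strip()))
    for child in elem:
        children.append((None, from_xml(child)))
        if child.tail and child.tail.strip():
            children.append((None, child.tail.strip()))
    return Node(label=elem.tag, children=children)


def from_json(value):
    """Map a parsed JSON value to a Node, or leave it as an atom."""
    if isinstance(value, dict):
        return Node(children=[(k, from_json(v)) for k, v in value.items()])
    if isinstance(value, list):
        return Node(children=[(None, from_json(v)) for v in value])
    return value


def step(items, name):
    """Select children whose key or whose own label equals `name`."""
    selected = []
    for item in items:
        if isinstance(item, Node):
            for key, child in item.children:
                child_label = child.label if isinstance(child, Node) else None
                if key == name or child_label == name:
                    selected.append(child)
    return selected


def path(root, *names):
    """Apply a sequence of name steps, starting from `root`."""
    items = [root]
    for name in names:
        items = step(items, name)
    return items


def string_value(item):
    """Concatenate the atoms below an item, loosely like an XPath string value."""
    if isinstance(item, Node):
        return "".join(string_value(child) for _, child in item.children)
    return "" if item is None else str(item)


xml_doc = from_xml(ET.fromstring("<book><title>XDM</title></book>"))
json_doc = from_json(json.loads('{"title": "XDM"}'))

print([string_value(x) for x in path(xml_doc, "title")])   # ['XDM']
print([string_value(x) for x in path(json_doc, "title")])  # ['XDM']
```

The same `path` call works on both inputs because the function only ever sees `Node` instances and atoms, never angle brackets or curly braces. That is the point about working on instances of the data model rather than on serialized data.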

Several talks address the relation of XML and non-XML notations for languages, and I’m looking forward to the discussions this thread of the conference will elicit. David Lee, now with MarkLogic, considers what life would be like if we marked up structure in programming languages the way we mark it up in documents. Norm Walsh continues the thread with a discussion of the general issue, with particular reference to possible designs for a ‘compact syntax’ for XProc. Mark D. Flood, Matthew McCormick, and Nathan Palmer approach this complex of problems from a different and enlightening angle, that of literate programming, in their case literate programming for the development of test cases for scientific function libraries. Mario Blažević offers the latest entry in the ongoing series of papers exploring how to do things with XML that were (in some form or other) part of SGML but were dropped when XML was designed. His paper shows how we might do SHORTREF in an XML context in a more general and more reliable way than was achieved when SHORTREF was bundled into SGML. And last to be mentioned here but first on the day, Sam Wilmott opens the entire series of talks with a case study and general reflections on literate programming. I look forward to Wednesday at Balisage!

As is customary at Balisage, a few papers approach resolutely theoretical topics, either with or without overt practical applications. I’ll mention just a few: Hervé Ruellan of Canon discusses a long series of careful measurements of entropy in various data structures for XML; his paper feels in some ways like the theoretical underpinnings I wish the Efficient XML Interchange working group had had at the beginning of its work. Abel Braaksma describes the use of higher-order functions as a way to simplify XSLT stylesheet development. And Claus Huitfeldt, Fabio Vitali, and Silvio Peroni have produced a response to the paper presented in 2010 by Allen Renear and Karen Wickett of the University of Illinois claiming that documents (as we conventionally try to formalize them) do not exist. Huitfeldt and his co-authors explore the possibility of viewing documents as ‘timed abstract objects’.

Theory, practice, practice, and theory. I look forward to seeing you at Balisage.