Multiple schema documents in one XML document
This document illustrates the invocation of the
XSD validator XSV to validate an XML instance document
governed by a schema defined by two schema documents
located in a single XML document.
The issue came up in connection with a question on Stack Overflow.
In the course of a discussion of the answers, an intelligent and
energetic Stack Overflow user named Petru Gardea, who has demonstrated
a deep knowledge of XSD and has answered a great many XSD questions on
the site, asked whether any XSD processors at all handle multiple XSD
schema documents in the same XML document, outside the context of WSDL.
(At least, that was what I understood him to be asking; it's not clear
that we are communicating very successfully.) I mentioned XSV as one,
and he reported some problems verifying the behavior.
This document provides pointers to XML documents
that can be used to illustrate the behavior of XSV and
describes how to use XSV's web interface to perform the demonstration.
It is not general-purpose documentation of XSV or the XSV web
interface, with neither of which I have any connection.
XML documents
Three XML documents are provided for the demonstration:
- An XML document containing two xsd:schema elements, one containing an
  import of the namespace defined by the other.
- An XML document which uses the schema defined by those two xsd:schema
  elements.
- A second XML document which uses the schema defined by those two
  xsd:schema elements, which differs from the first in using absolute
  URIs rather than relative URIs in its schema location hints.
All of these documents are located in the same directory as this
document.
The XSD schema defines several elements:
- {http://example.com/a}a, intended as the root element of an instance
  document; it accepts one {http://example.com/b}b element as a child.
- {http://example.com/b}b, which can contain one or more of the other
  elements in the same namespace, each of which is designed to make it
  easy to provide both valid and invalid instances, to help verify
  whether XSV is actually reading and processing the schema documents
  properly. (When I specified these, I forgot for a moment how spotty
  XSV's validation of simple types is: the only types it's actually
  validating at all correctly here are xsd:int and xsd:NCName.)
- {http://example.com/b}int, of type xsd:int
- {http://example.com/b}NCName, of type xsd:NCName
- {http://example.com/b}gYear, of type xsd:gYear
- {http://example.com/b}boolean, of type xsd:boolean
- {http://example.com/b}duration, of type xsd:duration
- {http://example.com/a}wrapper, used as a wrapper in the document
  containing the two schema documents
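As a rough illustration of what a document with two embedded schema documents looks like to a program, the following Python sketch builds a small stand-in for the schema-bearing document and lists the xsd:schema elements it contains. The stand-in is not a copy of the demo file: its content, the id values a and b, and the function names embedded_schemas and schema_by_fragment are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical stand-in for the schema-bearing document. The real demo file
# may differ in detail; the id values "a" and "b" are assumptions.
WRAPPER_DOC = """\
<demo:wrapper xmlns:demo="http://example.com/a"
              xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:schema id="a" targetNamespace="http://example.com/a">
    <xsd:import namespace="http://example.com/b"/>
  </xsd:schema>
  <xsd:schema id="b" targetNamespace="http://example.com/b"/>
</demo:wrapper>
"""


def embedded_schemas(xml_text):
    """Map the id of each xsd:schema element to its targetNamespace."""
    root = ET.fromstring(xml_text)
    return {
        schema.get("id"): schema.get("targetNamespace")
        for schema in root.iter(f"{{{XSD_NS}}}schema")
    }


def schema_by_fragment(xml_text, fragment):
    """Return the xsd:schema element whose id equals the fragment, or None."""
    root = ET.fromstring(xml_text)
    for schema in root.iter(f"{{{XSD_NS}}}schema"):
        if schema.get("id") == fragment:
            return schema
    return None


if __name__ == "__main__":
    for schema_id, namespace in embedded_schemas(WRAPPER_DOC).items():
        print(f"#{schema_id} defines {namespace}")
```

Neither xsd:schema element is the outermost element of the document, which is the situation the demonstration is meant to exercise. How XSV itself locates an embedded schema document is not something this sketch claims to reproduce.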
The XML instance documents each contain a root element named demo:a
(where demo is bound to the namespace http://example.com/a), which
contains a ns2:b (where ns2 is bound to the namespace
http://example.com/b). This in turn contains two instances each of the
elements declared with simple types, one instance valid and one
instance invalid.
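The two instance documents differ only in whether their schema location hints are relative or absolute URIs. The following Python sketch shows how a relative hint resolves against the URI of the instance document. The hint string and the function name parse_schema_location are hypothetical; the actual instance documents may spell their hints differently.

```python
from urllib.parse import urljoin


def parse_schema_location(value, base_uri):
    """Split an xsi:schemaLocation value into {namespace: absolute location}."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("xsi:schemaLocation must hold namespace/location pairs")
    return {
        namespace: urljoin(base_uri, location)
        for namespace, location in zip(tokens[0::2], tokens[1::2])
    }


# Assumed base URI of the instance document and assumed relative hints.
BASE = "http://blackmesatech.com/2012/09/xsv-demo/multischema.xml"
RELATIVE_HINTS = (
    "http://example.com/a multischema.xsd#a "
    "http://example.com/b multischema.xsd#b"
)

if __name__ == "__main__":
    for namespace, location in parse_schema_location(RELATIVE_HINTS, BASE).items():
        print(namespace, "->", location)
```

An absolute hint passes through urljoin unchanged, so the same function covers both instance documents.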
Invoking XSV with these documents
If you want to experiment with different input:
- Save the XML document at http://blackmesatech.com/2012/09/xsv-demo/test2.xml
and edit it as desired. (Note, however, that XSV does not claim to
check all XSD simple types, so experimenting with invalid instances
of most XSD simple types doesn't produce interesting results.)
- Navigate to the XSV web interface and locate the second form, with the
  paragraph reading “Use this form only if you are behind a firewall or
  have a schema to check which is not accessible via the Web.”
- Click the “Choose File” button and select your local copy of the test document.
- Select the options you want (the default should suffice for a simple demonstration).
- Click the button labeled “Upload and Get Results”.
Reading XSV's output
Not all users find XSV's output messages intuitively obvious at
first encounter, so some notes about the output and its meaning may
be in order.
In the initial bulleted list, the Target and docElt entries
serve mostly as a sanity check. When the first test instance provided here is used,
they will read something like this:
- Target: http://blackmesatech.com/2012/09/xsv-demo/multischema.xml
  (Real name: http://blackmesatech.com/2012/09/xsv-demo/multischema.xml
  Length: 604 bytes
  Last Modified: Sat, 22 Sep 2012 17:18:46 GMT
  Server: Apache)
- docElt: {http://example.com/a}a
- The third item in the initial list is crucial, because by
default XSV performs lax validation starting with the document
root. If XSV finds a declaration for the document element in the
schema, then it will switch to strict validation; if it reports here
that it performed lax validation, the most frequent cause is that it
did not find a declaration for the root element of the instance
document. If you were expecting the root element to be valid
against an appropriate declaration in the schema, then a report of
lax validation is a signal that something is wrong.
For the demonstration, XSV reports
Validation was strict, starting with type [Anonymous]
From this, we can infer that it found the schema document for
namespace http://example.com/a and that we don't have any unexpected
namespace binding issues (at least, not on the root element).
- The next two items in the list report on the collection of
  schema documents used to build the schema. In the case of the demo, they say:
  - schemaLocs: http://example.com/a -> http://blackmesatech.com/2012/09/xsv-demo/multischema.xsd#a; http://example.com/b -> http://blackmesatech.com/2012/09/xsv-demo/multischema.xsd#b
  - The schema(s) used for schema-validation had no errors
That is, XSV found and loaded a schema for namespace http://example.com/a
from the URI http://blackmesatech.com/2012/09/xsv-demo/multischema.xsd#a
(note the fragment identifier), and a schema for namespace
http://example.com/b from
http://blackmesatech.com/2012/09/xsv-demo/multischema.xsd#b
(same document, different XML element).
A fuller report of schema resource usage is given further down
in the XSV output under the heading “Schema resources involved”, where
XSV reports that it first tried successfully to load the schema for
http://example.com/a from the location already mentioned,
then tried unsuccessfully to load a schema for http://example.com/b
from http://example.com/b (the import in the first schema document
doesn't specify a schema location, so XSV tries the namespace itself).
Then it tried the location actually given in the instance for
the second namespace, and succeeded.
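The order of attempts reported for the imported namespace can be sketched in Python as follows. This mirrors only what the XSV output shows for this demo; it is not a description of XSV's actual lookup algorithm. The names locate_imported_schema and fake_fetch are hypothetical, and fake_fetch stands in for real retrieval so that the sketch runs without network access.

```python
def locate_imported_schema(namespace, import_location, instance_hints, fetch):
    """Try candidate locations in order; return (location_or_None, attempts)."""
    # An import with no schemaLocation leaves only the namespace name to try.
    candidates = [import_location or namespace]
    # The hint supplied by the instance document comes next, if there is one.
    if namespace in instance_hints:
        candidates.append(instance_hints[namespace])

    attempts = []
    for location in candidates:
        found = fetch(location)
        attempts.append((location, found))
        if found:
            return location, attempts
    return None, attempts


SCHEMA_DOC = "http://blackmesatech.com/2012/09/xsv-demo/multischema.xsd"
INSTANCE_HINTS = {
    "http://example.com/a": SCHEMA_DOC + "#a",
    "http://example.com/b": SCHEMA_DOC + "#b",
}
# Stand-in for retrieval: only these locations "exist" in this sketch.
RETRIEVABLE = set(INSTANCE_HINTS.values())


def fake_fetch(location):
    """Pretend to retrieve a schema document; True means it was found."""
    return location in RETRIEVABLE


if __name__ == "__main__":
    _, attempts = locate_imported_schema(
        "http://example.com/b", None, INSTANCE_HINTS, fake_fetch
    )
    for tried, found in attempts:
        print("tried", tried, "->", "loaded" if found else "failed")
```

For http://example.com/b the sketch records one failed attempt at the namespace name followed by a successful attempt at the hinted location, matching the sequence described above.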
The final section of XSV's output reports (under the heading
“Problems with the schema-validity of the target”) validity problems:
file:/usr/local/XSV/xsvlog/tmpkZB7HFuploaded:8:5: Invalid per cvc-complex-type.1.2.2: element content failed type check: forty-two is not a valid decimal literal
file:/usr/local/XSV/xsvlog/tmpkZB7HFuploaded:10:5: Invalid per cvc-complex-type.1.2.2: element content failed type check: 42 does not match pattern [_:A-Za-zÀ-ÖØ-öø-ÿĀ-...
As can be seen, it detected the invalid elements on lines 8
(<ns2:int>forty-two</ns2:int>) and 10 (<ns2:NCName>42</ns2:NCName>),
but not the others on lines 12, 14, or 16.
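When comparing XSV's findings against the lines of an edited test document, it can help to pull the line numbers out of the problem messages mechanically. The following Python sketch does this for messages shaped like the two quoted above. The regular expression is inferred from those two messages only, the second number in each location is assumed to be a column or character position, and the name parse_problems is hypothetical.

```python
import re

# Pattern inferred from the two messages quoted in this document; other XSV
# messages may be shaped differently.
PROBLEM = re.compile(
    r"^(?P<source>.+?):(?P<line>\d+):(?P<col>\d+): "
    r"Invalid per (?P<rule>[\w.-]+): (?P<detail>.*)$"
)

SAMPLE_REPORT = """\
file:/usr/local/XSV/xsvlog/tmpkZB7HFuploaded:8:5: Invalid per cvc-complex-type.1.2.2: element content failed type check: forty-two is not a valid decimal literal
file:/usr/local/XSV/xsvlog/tmpkZB7HFuploaded:10:5: Invalid per cvc-complex-type.1.2.2: element content failed type check: 42 does not match pattern [_:A-Za-zÀ-ÖØ-öø-ÿĀ-...
"""


def parse_problems(report_text):
    """Return one dict per problem message that matches PROBLEM."""
    problems = []
    for raw_line in report_text.splitlines():
        match = PROBLEM.match(raw_line.strip())
        if match:
            problems.append(
                {
                    "line": int(match["line"]),
                    "col": int(match["col"]),  # assumed to be a column position
                    "rule": match["rule"],
                    "detail": match["detail"],
                }
            )
    return problems


if __name__ == "__main__":
    for problem in parse_problems(SAMPLE_REPORT):
        print(problem["line"], problem["rule"])
```

Lines that XSV does not flag, such as 12, 14, and 16 here, simply do not appear in the result, so the absence of a line number is not evidence that the element on that line is valid.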
What this does and does not demonstrate
The demonstration shows that XSV can load schema documents
(i.e. xsd:schema elements) which are not
themselves the outermost element in their XML documents.
It thus illustrates that multiple spec-conformant schema
documents can be embedded in the same XML document.
(Since this appears to be an issue for some interested parties,
it should apparently also be noted that WSDL is not involved
here.)
It does not illustrate that embedding two schema documents
in a single file is a good way of solving the problem faced by
the original poster of the Stack Overflow question where the
behavior of XSV came up as a side issue. (First of all,
the original poster is unlikely to be using XSV. Second,
and more important, it's likely that his import problems
had some other cause; putting the schema documents into the
same file would not address that cause.)
It also does not illustrate that embedding multiple
schema documents in a single XML document is widely supported,
or often a good idea, or not widely supported, or often a
bad idea; those questions are orthogonal to the issue
addressed by the demonstration, which is whether XSV does
or does not support the loading of schema documents from
files that contain multiple xsd:schema elements.