<!DOCTYPE TEI.2 PUBLIC '-//C. M. Sperberg-McQueen//DTD
          TEI Lite 1.0 plus SWeb (XML)//EN'
          '../../../lib/swebxml.dtd' [

<!--* !!! check relative references in DTD and stylesheet !!! *-->

<!ENTITY date.last.touched '10 September 2012'>

<!ENTITY course-mmyy 'November 2012'>

<!ATTLIST list type CDATA 'bullets' >
<!ATTLIST seg  rend CDATA 'incremental' >
<!ATTLIST xref href CDATA '' >
<!ATTLIST div id ID #IMPLIED >
<!ATTLIST item id ID #IMPLIED >

<!ENTITY iquest "&#xBF;" ><!--=inverted question mark-->
<!ENTITY eacute  "&#233;" ><!-- small e, acute accent -->

]>
<?xml-stylesheet type="text/xsl" href="../../../lib/courses201206.bmt.xsl"?> 
<TEI.2>
<teiHeader>
<fileDesc>
<titleStmt>
<title>Sample data, XQuery and XForms courses, &course-mmyy;</title>
<author>C. M. Sperberg-McQueen</author>
</titleStmt>
<publicationStmt>
<p>Unpublished and confidential.</p>
</publicationStmt>
<sourceDesc>
<p>Created in electronic form.</p>
</sourceDesc>
</fileDesc>
</teiHeader>
<text>
<front>
<titlePage>
<docTitle>
<titlePart>Sample data</titlePart>
<titlePart>XQuery / XForms courses, &course-mmyy;</titlePart>
</docTitle>
<docDate>&date.last.touched;</docDate>
</titlePage>

<div id="navbar" type="navbar">
<divGen type="toc"/><!--* suppress if you need no toc *-->
<div>
<head>Nearby documents</head>
<list>
<item id="siteroot"><xref href="../../..">Home</xref></item>
</list>
</div>
</div>
</front>

<body>
<p>This document points to some public sample data for
hands-on work in XQuery and XForms courses organized by
Black Mesa Technologies.</p>
<p>The small samples are intended to fit on one screen or a few screens,
so that you can tell at a glance whether a given query has returned
all the hits you expected or not.</p>
<p>The larger samples are intended to be large enough that queries across them
may be a little more interesting.</p>


<div>
<head>WordHoard Shakespeare</head>
<p>The <xref href="http://www.monkproject.org/">MONK</xref> 
 (Metadata Offer New Knowledge) project developed <q>a digital environment designed to help 
humanities scholars discover and analyze patterns in the texts they study</q>.
It also encoded some texts for use with that environment, and/or
repackaged texts already encoded and refined the tagging.</p>
<p>
The Shakespeare distributed by MONK was originally created
as the <soCalled>WordHoard Shakespeare</soCalled> and carries
the following publication statement:
<q type="block">
<p><title>The WordHoard Shakespeare</title> is a joint project of the
Perseus Project at Tufts University, The Northwestern University
Library, and Northwestern University Academic Technologies. It is
derived from <title>The Globe Shakespeare</title>, the one-volume
version of the <title>Cambridge Shakespeare</title>, edited by
W. G. Clark, J. Glover, and W. A. Wright (1891-3). The <title>Internet
Shakespeare</title> editions of the quartos and folios have been
consulted to create a modern text that observes as closely as possible
the morphological and prosodic practices of the earliest
editions. Spellings, especially of contracted and hyphenated forms,
have been standardized across the corpus. The text has been fully
lemmatized and morphosyntactically tagged.
</p>
<p>© 2003. The copyright to <title>The WordHoard Shakespeare</title>
is owned jointly by Northwestern University and Tufts University.
<title>The WordHoard Shakespeare</title> is provided for free solely
for non-commercial use by students, scholars, and the public. Any
commercial use or publication of it, in whole or in part, without
prior written authorization of the copyright holders is strictly
prohibited.
</p>
</q>
</p>
<p>Please note and respect the conditions of use.</p>
<p>
<list>
<item><xref href="../../06/data/sonnet92.xml">Sonnet 92</xref> (basic form)</item>
<item><xref href="../../06/data/sonnet92.pos-tagged.xml">Sonnet 92</xref> with part-of-speech tagging (<soCalled>adorned</soCalled>, in MONK terms)</item>
<item><xref href="http://tei2010.blackmesatech.com/data/MONK/sha/unadorned/son.xml">Sonnets</xref> (unadorned)</item>
<item><xref href="http://tei2010.blackmesatech.com/data/MONK/sha/bibadorned/son.xml">Sonnets</xref> (adorned)</item>
<item><xref href="http://tei2010.blackmesatech.com/data/MONK/sha/unadorned/">Unadorned version</xref> (all of Shakespeare)</item>
<item><xref href="http://tei2010.blackmesatech.com/data/MONK/sha/bibadorned/">Bibliographically enriched, and word-tagged</xref> (all of Shakespeare)</item>
</list></p>
</div>

<div>
<head>Foreign Relations of the US</head>
<p>A sample volume from the <title>Foreign Relations of the United
States</title>, a publication of the U.S. Department of State, Office
of the Historian, providing <q>the official documentary historical
record of major U.S. foreign policy decisions and significant
diplomatic activity.</q> For further information, see <xref
href="http://history.state.gov/historicaldocuments">the series web
site</xref>.</p>
<p>A <xref href="http://uwdc.library.wisc.edu/collections/FRUS"><q>digital 
facsimile</q> (page-image) edition</xref> of the volumes from 1861 (the
start of the series) to 1960 is being prepared by 
the University of Wisconsin Digital Collections Center.
<!--* Some volumes are also available from the
<xref href="http://dosfan.lib.uic.edu/ERC/frus/index.html">University
of Illinois at Chicago</xref>.  But these appear all also to be on
the State Department's web site in more current form. *-->
</p>
<p>The volume used here as an example is <title>Foreign
Relations of the United States, 1969–1976, Volume XXIV, Middle East
Region and Arabian Peninsula, 1969–1972; Jordan, September
1970</title>, ed.
Linda W. Qaimmaqami and 
Adam M. Howard; 
general editor
Edward C. Keefer
(Washington:  United States Government Printing Office, 2008).
</p>
<list>
<item><xref href="../../06/data/FRUS/frus1969-76v24.xml">TEI-encoded XML document</xref> (unstyled), made available
through the kindness of Joe Wicentowski of the State Department Office of the Historian.</item>
<item><xref href="http://history.state.gov/historicaldocuments/frus1969-76v24">Web presentation of the volume</xref> (driven
from the TEI-encoded document listed above).</item>
</list>
</div>

<div>
<head>Full-text use-case data</head>
<p>This is a copy of the
sample data used for illustrative purposes in the
XQuery full-text use-cases document;
it is copied from <xref href="http://www.w3.org/TR/xpath-full-text-10-use-cases/#FT_UC_SampleData">section 1.5</xref>
of <xref href="http://www.w3.org/TR/xpath-full-text-10-use-cases/">XQuery and XPath Full Text 1.0 Use Cases</xref>,
ed. Pat Case and Sihem Amer-Yahia ([Cambridge]: W3C, 2011).</p>
<list>
<!--* 
<item><xref href="./xquery-use-cases/sampledata.xml">XQuery sample data</xref>.</item>
*-->
<item><xref href="../../06/data/full-text-use-cases/full-text.xml">Full-text sample data</xref>.</item>
</list>
</div>

<div>
<head>SGML '92 sample</head>
<p>This is a copy of the sample document used in an influential
evening workshop on query languages for SGML at the SGML '92 conference organized
by the Graphic Communications Association (now IDEAlliance) in Danvers, Massachusetts.
This copy is derived from that given in 
<xref href="http://www.w3.org/TR/xquery-use-cases/#sgml-data">section 1.5.3</xref>
of the document
<xref href="http://www.w3.org/TR/xquery-use-cases/">XML Query Use Cases</xref>,
ed. Don Chamberlin et al. ([Cambridge]: W3C, 2007).</p>
<list>
<!--* 
<item><xref href="../../06/data/xquery-use-cases/sampledata.xml">XQuery sample data</xref>.</item>
*-->
<item><xref href="../../06/data/sgml92/report.xml">XML version of report</xref> (unstyled).</item>
<item><xref href="../../06/data/sgml92/report.dtd">DTD</xref>.</item>
</list>
</div>

<div>
<head>StratML sample data</head>
<p>This is a collection of documents (mostly strategic plans and the like)
encoded using StratML (Strategy Markup Language).
</p>
<list>
<item><xref href="../../06/data/StratML/stratml-info.xml">Full description and list of documents</xref>.</item>
</list>
</div>

<div><head>Other samples</head>
<p>The data listed below may be interesting to use for
individual exercises and exploration.</p>

<div><head>Short poems by Oscar Wilde</head>
<p>These poems by Oscar Wilde are all from the CELT project (see
below).  In principle the XML links should go to the CELT project site,
but that leads to DTD issues, so they go instead to local copies on a
Black Mesa Technologies host, in which the DTDs are commented out.</p>
<list>
<item><xref href="http://tei2010.blackmesatech.com/data/CELT/Wilde/Poems/E850003-027.xml">Sonnet to Liberty</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-027">HTML</xref>).</item>
<item><xref href="http://tei2010.blackmesatech.com/data/CELT/Wilde/Poems/E850003-026.xml">H&eacute;las!</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-026">HTML</xref>).</item>
<item><xref href="http://tei2010.blackmesatech.com/data/CELT/Wilde/Poems/E850003-029.xml">To Milton</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-029">HTML</xref>).</item>
<item><xref href="http://tei2010.blackmesatech.com/data/CELT/Wilde/Poems/E850003-061.xml">The Grave of Keats</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-061">HTML</xref>).</item>
<item><xref href="http://tei2010.blackmesatech.com/data/CELT/Wilde/Poems/E850003-070.xml">The Grave of Shelley</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-070">HTML</xref>).</item>
</list>
<!--*
<list>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-027">Sonnet to Liberty</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-027">HTML</xref>).</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-026">H&eacute;las!</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-026">HTML</xref>).</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-029">To Milton</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-029">HTML</xref>).</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-061">The Grave of Keats</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-061">HTML</xref>).</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-070">The Grave of Shelley</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-070">HTML</xref>).</item>
</list>
*-->
</div>
<div>
<head>Gorbals census data</head>
<p>This is a group of XML documents containing small samples of 
data from the British censuses of 1851 and 1881 for one street
in <soCalled>the Gorbals</soCalled>, a
working-class district of Glasgow.  
</p>
<list>
<!--* no, same-source policy will bite us if we try to use data from xforms201106.blackmesatech.com
<item><xref href="http://xforms201106.blackmesatech.com/data/Gorbals/gorbals-description.xml">Full description and list of data files</xref></item>
*-->
<item><xref href="../../06/data/Gorbals/gorbals-description.xml">Full description and list of data files</xref></item>
</list>
<!--*
<list>
<item><xref href="Gorbals/gorbals-1851-flat.xml">Flat encoding of
1851 data</xref> (a mechanical translation of the original flat
single-table data into XML; information about the household is
repeated redundantly for each individual in the household).</item>
<item><xref href="Gorbals/gorbals-1881-flat.xml">Flat encoding of
1881 data</xref>.</item>
<item><xref href="Gorbals/gorbals-1851-grouped.xml">Hierarchical encoding of
1851 data</xref> (a slightly more plausible XML encoding of the
data, separating household information from information about
the individual members of the household).</item>
<item><xref href="Gorbals/gorbals-1881-grouped.xml">Hierarchical encoding of
1881 data</xref>.</item>
<item><xref href="Gorbals/gorbals-description.xml">Description</xref> of
the fields in the Gorbals data, taken from Daniel Greenstein's
account in <title>A Historian's Guide to Computing</title>
(Oxford: Oxford University Press, 1994).</item>
<item><xref href="Gorbals/gorbals-occ-codes.xml">Occupation codes</xref> 
for the Gorbals data (1851 only? or also 1881?)</item>
<item><xref href="Gorbals/gorbals-birthplace-codes.xml">Birthplace codes</xref> 
for the Gorbals data (1851 only? or also 1881?)</item>
<item><xref href="Gorbals/gorbals-statuso-codes.xml">Status codes</xref> 
for the Gorbals data (1851 only? or also 1881?)</item>

</list>
*-->
</div>

<div>
<head>Full texts from CELT Corpus of Electronic Texts</head>
<p>The <xref href="http://www.ucc.ie/celt/index.html">CELT</xref> 
project is creating a corpus of Irish texts, some in the original
languages (Irish, Latin, French, Spanish, English)
and others in English translation.  The CELT site has
a <xref href="http://www.ucc.ie/celt/publishd.html">list of published works</xref>
with links to HTML versions of the documents.  The XML and SGML links,
however, go to an FTP site that is no longer in service.
The XML documents are still accessible (for now, at least) on a
<xref href="http://publish.ucc.ie/celt/index">different server</xref> 
described by the project as <q>experimental</q>.
</p>
<p>The project's list of frequently asked questions describes
the copyright of CELT texts this way:
<q type="block">
<p><label>Q. 16</label> What about the copyright of CELT texts? </p>
<p><label>A.</label> All the texts can be searched, read on screen,
downloaded for later use, or printed out for *private* use and
research. If you use our texts in your research, please credit us in
your published results (e.g. conference paper, journal article, on the
web or in a monograph). Links can be made from other websites to CELT,
but please do not download our texts and make them available on your
server. We usually contact copyright holders of the texts we use and
ask them for permission. If we have inadvertently published a text
still in copyright, please contact us so that appropriate arrangements
can be made.
</p>
</q>
In accordance with the project's request, the documents listed below
are <emph>not</emph> available on the course server; the links below
go to CELT's server.</p>
<p>Some examples and exercises in the course may be drawn from the following
CELT texts.  Some are by Jonathan Swift:
<list>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E700001-022">A Modest Proposal</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E700001-022">HTML</xref>)</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E700001-013">A Tale of a Tub</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E700001-013">HTML</xref>)</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E700001-012">The Battle of the Books</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E700001-012?fragment=all&amp;refnav=View">HTML</xref>)</item>
</list>
Others are by Oscar Wilde:
<list>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-105">Lady Windermere's Fan</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-105">HTML</xref>)</item>
<!--* <item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-109">Salome</xref></item> *-->
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-106">A Woman of No Importance</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-106">HTML</xref>)</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-108">An Ideal Husband</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-108">HTML</xref>)</item>
<item><xref href="http://publish.ucc.ie/celt/docs/raw/E850003-002">The Importance of Being Earnest</xref>
(<xref href="http://publish.ucc.ie/celt/docs/E850003-002?fragment=all&amp;refnav=View">HTML</xref>)</item>
</list>
</p>
</div>
</div>


</body>
</text>
</TEI.2>
<!-- Keep this comment at the end of the file
Local variables:
mode: xml
sgml-default-dtd-file:"/Library/SGML/Public/Emacs/sweb.ced"
sgml-omittag:t
sgml-shorttag:t
End:
-->
