The View from Black Mesa

Notes on keeping scholarly, technical, and public information useful

Thutmose II, a MARC-to-TEI-Header translator

Posted on 12 January 2011 by cmsmcq

[12 January 2011]

Recently I’ve spent some time working on Thutmose II, a tool that translates MARC records into TEI headers. (Thutmose because it was one of the few names I could think of with TH (for TEI headers) and M (for MARC) in prominent positions, and II because this is the second version of the tool.)

Thutmose implements the mapping from MARC fields to parts of the TEI header specified in the document on best practices for the use of TEI in libraries, prepared by the TEI special interest group for libraries. Its preparation has been subsidized by a grant from the Text Encoding Initiative.
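
To give a rough idea of what such a mapping involves, here is a minimal sketch in Python. It is not Thutmose's code, and the two field choices (245 $a to a title, 100 $a to an author) are illustrative assumptions rather than a statement of what the best-practices document specifies. It also assumes the input is a MARCXML record; the sample record and the function names are invented for the example. The output is only a fragment: a real TEI header needs more (a publication statement and a source description, for a start).

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented sample record, not taken from the test collection.
SAMPLE_MARCXML = f"""
<record xmlns="{MARC_NS}">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Author, Example</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An invented title</subfield>
  </datafield>
</record>
"""


def tei(name):
    """Return a TEI element name in ElementTree's {namespace}local form."""
    return f"{{{TEI_NS}}}{name}"


def subfield(record, tag, code):
    """Return the text of the first matching MARC subfield, or None."""
    path = (
        f"{{{MARC_NS}}}datafield[@tag='{tag}']/"
        f"{{{MARC_NS}}}subfield[@code='{code}']"
    )
    node = record.find(path)
    if node is None or not node.text:
        return None
    return node.text.strip()


def marc_to_tei_header(marcxml):
    """Build a partial teiHeader (title statement only) from one record."""
    record = ET.fromstring(marcxml)
    header = ET.Element(tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))

    # Assumed mapping: 245 $a supplies the title.
    ET.SubElement(title_stmt, tei("title")).text = subfield(record, "245", "a") or ""

    # Assumed mapping: 100 $a supplies the author, when present.
    author = subfield(record, "100", "a")
    if author:
        ET.SubElement(title_stmt, tei("author")).text = author
    return header


if __name__ == "__main__":
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    print(ET.tostring(marc_to_tei_header(SAMPLE_MARCXML), encoding="unicode"))
```

A table-driven design (one entry per MARC field, naming the target element) would scale better than hard-coded calls like these, and a configuration file is one obvious place to keep such a table.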

Thutmose II is not quite complete: it lacks support for the text-classification information in MARC fields 050-099, and it’s got a couple of bugs I won’t mention here in case you don’t encounter them. It’s probably not ready for use in production systems, but it’s now complete enough for a sort of preview to make sense, so an online interface is now available. Using the online interface, you can run Thutmose II on a MARC record selected at random from a small collection of test records. (The test records are themselves selected randomly from a large collection made available by HathiTrust.)

In addition, you can:

  • Click a button to select another MARC record at random.
  • Edit the MARC record to see how Thutmose II reacts to different data. (Be careful to keep the MARC record well-formed; nothing much happens if you feed Thutmose ill-formed data, and that includes not much in the way of diagnostics.)
  • Edit the Thutmose II configuration file to change the way the output is generated.
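
Since the interface offers little in the way of diagnostics, it may help to check an edited record yourself before submitting it. The sketch below assumes the record is XML (the word "well-formed" suggests as much, but that is my reading, not something stated above) and uses only Python's standard library; the function name is invented for the example.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return None if text parses as XML, else a (line, column, message) tuple."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return (line, column, str(err))
    return None


if __name__ == "__main__":
    # A deliberately broken record: the subfield element is never closed.
    broken = "<record><datafield tag='245'><subfield code='a'>Title</datafield></record>"
    problem = check_well_formed(broken)
    if problem is None:
        print("well-formed")
    else:
        print("not well-formed at line %d, column %d: %s" % problem)
```

This checks well-formedness only. It says nothing about whether the record is valid MARC, or whether Thutmose II will do anything sensible with it.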

Comments and suggestions welcome.

Posted in Metadata, TEI, Thutmose, Tools
