[12 January 2011]
Recently I’ve spent some time working on Thutmose II, a tool to translate MARC records into TEI headers. (Thutmose because it was one of the few names I could think of with TH (for TEI headers) and M (for MARC) in prominent positions, and II because this is the second version of the tool.)
Thutmose implements the mapping from MARC fields to parts of the TEI header specified in the document on best practices for the use of TEI in libraries prepared by the TEI special interest group for libraries, and its preparation has been subsidized by a grant from the Text Encoding Initiative.
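To give a rough idea of what a field-to-header mapping of this kind looks like, here is a minimal sketch in Python. It is not Thutmose's code, and the tiny mapping table is my own illustration rather than the mapping in the best-practices document: it assumes the input is a MARCXML record, and it copies only the title proper (245 $a) and the main-entry personal name (100 $a) into a TEI `titleStmt`. The names `ILLUSTRATIVE_MAP`, `subfield_values`, `marc_to_tei_header`, and `SAMPLE_RECORD` are all hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Illustrative only: (MARC tag, subfield code, TEI element placed in titleStmt).
# This is an assumption for the sketch, not the best-practices mapping.
ILLUSTRATIVE_MAP = [
    ("245", "a", "title"),
    ("100", "a", "author"),
]

# A made-up MARCXML record used to exercise the sketch.
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>"""


def subfield_values(record, tag, code):
    """Return the text of every matching subfield in a MARCXML record element."""
    ns = {"m": MARC_NS}
    path = f"m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    return [sf.text.strip() for sf in record.findall(path, ns) if sf.text]


def marc_to_tei_header(marcxml):
    """Build a skeletal teiHeader element from a MARCXML record string."""
    record = ET.fromstring(marcxml)
    header = ET.Element(f"{{{TEI_NS}}}teiHeader")
    file_desc = ET.SubElement(header, f"{{{TEI_NS}}}fileDesc")
    title_stmt = ET.SubElement(file_desc, f"{{{TEI_NS}}}titleStmt")
    for tag, code, tei_name in ILLUSTRATIVE_MAP:
        for value in subfield_values(record, tag, code):
            ET.SubElement(title_stmt, f"{{{TEI_NS}}}{tei_name}").text = value
    return header


if __name__ == "__main__":
    ET.register_namespace("", TEI_NS)
    print(ET.tostring(marc_to_tei_header(SAMPLE_RECORD), encoding="unicode"))
```

A real header needs a good deal more than this (a publication statement and a source description, for a start), which is where most of the actual mapping work lies.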
Thutmose II is not quite complete (it lacks support for the text-classification information in MARC fields 050-099, and it’s got a couple of bugs I won’t mention here in case you don’t encounter them), so it’s probably not ready for use in production systems. But it’s now complete enough for a sort of preview to make sense, so an online interface is now available. Using it, you can run Thutmose II on a MARC record selected at random from a small collection of test records. (The test records are themselves selected randomly from a large collection made available by the HathiTrust.)
In addition, you can:
- Click a button to select another MARC record at random.
- Edit the MARC record to see how Thutmose II reacts to different data. (Be careful to keep the MARC record well-formed; nothing much happens if you try to feed Thutmose ill-formed data, and that includes not much in the way of diagnostics.)
- Edit the Thutmose II configuration file to change the way the output is generated.
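Since the online interface offers little in the way of diagnostics, it may help to check an edited record for well-formedness before submitting it. The following is a minimal sketch, assuming the record is XML (MARCXML, say); the function name `check_well_formed` is hypothetical and the check is not part of Thutmose.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column where parsing failed.
        return False, str(err)
    return True, None


if __name__ == "__main__":
    ok, message = check_well_formed("<record><leader>x</record>")
    print(ok, message)
```

Note that this checks only well-formedness, not whether the record is valid MARC.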
Comments and suggestions welcome.