Convert old TLNs to TEI anchors
Introduction
This document covers a necessary process in the conversion of modern-spelling IML
files over to TEI. This process would normally be run by a programmer, but it can
be run by any research assistant who is fully conversant with the way annotations and collations work in LEMDO. The system used to link annotations and collations into modern-spelling
texts in the ISE made use of the Through Line System, where every line in a text was given a canonical number, its TLN. To point to a line, you could simply use its TLN; to point to specific text within a line, you could supply the TLN followed by the text you wanted to point to, and a rendering algorithm was meant to identify the target text. (The fragility of this system will be apparent: if a change was made to the target text but not carried over to the link pointing at it, the link would then fail.)
In LEMDO, we use a more precise and flexible system based on TEI <anchor> elements in the text. This system enables us to point to a specific location in the
text using one anchor, or to a span of text using two anchors. In the process of remediating
old IML files, we need to look at each of the old TLN references, which are found
in the *_M_annotation.xml and *_M_collation.xml files, identify the locations to which they are supposed to be pointing in the modernized
text itself, then insert appropriate <anchor> elements in the modernized text and rewrite the pointers so that they point to those
anchors.
This procedure is done using an ant build file found in code/link_apparatus/build.xml. The process is only partially automated; normally, you will need to run it repeatedly
and fix errors that show up until all the links are converted, so it can be time-consuming.
You will need to have ant and ant-contrib installed, and be running Linux or macOS.
Run the process
First, identify the files that you need to convert. These will normally consist of
a modern-spelling text converted from IML to TEI, and its associated annotation and
collation files, also converted to TEI. For the purpose of this example, we’ll assume
that the work identifier for your text is WWWW.
In the Terminal, move to the code/link_apparatus folder, and then run:
ant -Dwork=WWWW -DpersId=HOLM1 -DmainDoc=/[the path to your lemdo directory]/data/texts/WWWW/main/emdWWWW_M.xml
So if your LEMDO project directory is in /home/mholmes, you would type:
ant -Dwork=WWWW -DpersId=HOLM1 -DmainDoc=/home/mholmes/lemdo/data/texts/WWWW/main/emdWWWW_M.xml
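The command above can also be assembled programmatically, which helps avoid typos in the long mainDoc path. The following is a minimal sketch, not part of the LEMDO codebase: the function name build_ant_command and its parameters are hypothetical, and it assumes only the path layout and the three -D properties shown in the example command.

```python
from pathlib import PurePosixPath


def build_ant_command(lemdo_root, work, pers_id):
    """Return the argument list for the ant call shown in the example.

    lemdo_root is the absolute path to your lemdo directory, work is the
    work identifier (e.g. "WWWW"), and pers_id is your LEMDO person id.
    The function name and parameters are illustrative assumptions.
    """
    # The modern-spelling text is assumed to live at
    # data/texts/<work>/main/emd<work>_M.xml under the lemdo directory.
    main_doc = (
        PurePosixPath(lemdo_root) / "data" / "texts" / work / "main" / f"emd{work}_M.xml"
    )
    return [
        "ant",
        f"-Dwork={work}",
        f"-DpersId={pers_id}",
        f"-DmainDoc={main_doc}",
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into the Terminal
    # from inside the code/link_apparatus folder.
    print(" ".join(build_ant_command("/home/mholmes/lemdo", "WWWW", "HOLM1")))
```

The list form can be passed directly to subprocess.run with cwd set to the code/link_apparatus folder if you prefer to launch ant from a script.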
This command assumes that the modern-spelling text is called emdWWWW_M.xml, and the
process will search for associated files in /data/texts/WWWW/app/emdWWWW_M_annotation.xml and /data/texts/WWWW/app/emdWWWW_M_collation.xml. If you don’t know what the full path to your directory is, you can type pwd at the command line to find the path to the folder you’re currently in, then deduce
the path to the main document based on that. For the person id, use your own LEMDO
id. This is used to create a <change> entry in <revisionDesc> explaining that the process has been run.
The process will locate links from annotations or collations into the text, and attempt
to find their target locations based on the TLN <lb> elements; where target text is
also available, it should find that text and place anchors around it, then link to
those anchors.
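To make the idea of placing anchors around target text concrete, here is a simplified sketch that works on a plain string for a single line. It is not the implementation used by the build file in code/link_apparatus; the function name wrap_with_anchors and the anchor id scheme (a stem plus _start and _end) are assumptions made for illustration only.

```python
def wrap_with_anchors(line_text, target, anchor_stem):
    """Place a start and an end anchor around the first exact match of target.

    Returns a tuple (new_text, start_id, end_id), or None when the target
    is empty or does not occur exactly in the line. The None case stands
    for a link that could not be resolved and needs manual attention.
    The id scheme is hypothetical, not LEMDO's actual convention.
    """
    if not target:
        return None
    start = line_text.find(target)
    if start == -1:
        return None
    end = start + len(target)
    start_id = f"{anchor_stem}_start"
    end_id = f"{anchor_stem}_end"
    new_text = (
        line_text[:start]
        + f'<anchor xml:id="{start_id}"/>'
        + target
        + f'<anchor xml:id="{end_id}"/>'
        + line_text[end:]
    )
    return new_text, start_id, end_id


if __name__ == "__main__":
    print(wrap_with_anchors("To be, or not to be", "or not", "a1"))
```

A pointer in the annotation or collation file would then refer to the two returned ids rather than to a TLN plus quoted text, so later edits to the wording between the anchors no longer break the link.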
The final results will be saved in code/link_apparatus/temp, and will include not only new versions of the original files, but also the interim
products used in the process, which can be used for debugging.
The output from the process will most likely include a series of error messages for
cases where the original links could not be resolved, either due to errors in the
original IML encoding, or some other unexpected problem. Each of these issues should
be addressed manually, as detailed in the next section.
Fix problems and run repeatedly
Running the process for the first time will most likely only result in partial success.
To address the errors:
Copy any errors from the Terminal into a text editor so you don’t lose them.
Validate the files in the temp folder against our project schema, to make sure no
invalidities have been created.
If they’re valid, copy them back over the originals in the data/texts directory.
Look at the list of errors, and go back to the original texts to identify what the
problem is, and try to fix it. The most common source of errors is that the text supplied
in the collation or annotation file does not exactly match the text in the modern-spelling
text; in this case, fix it so that it does match.
After fixing everything you can, run the process again.
Repeat until all errors are fixed, or until any remaining errors can only be dealt
with by manual intervention (by creating anchors in the text manually and creating
pointers to them).
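When working through the error list, it can help to check quickly why a quoted string fails to match its line. The sketch below is a hypothetical helper, not something provided by the LEMDO build: it assumes you have copied the line text and the quoted text into strings yourself, and the case and whitespace categories are guesses at likely causes; the documentation itself says only that the text must match exactly.

```python
import re


def normalize(text):
    """Collapse runs of whitespace and lower-case the text for loose comparison."""
    return re.sub(r"\s+", " ", text).strip().lower()


def diagnose(line_text, quoted):
    """Classify how a quoted string relates to the line it should point into."""
    if quoted in line_text:
        # Exact match: the link should be resolvable as it stands.
        return "exact"
    if normalize(quoted) in normalize(line_text):
        # Likely a trivial discrepancy that can be fixed in the apparatus file.
        return "differs only in case or whitespace"
    # The wording itself differs, or the TLN points at the wrong line.
    return "not found"


if __name__ == "__main__":
    line = "Now is the winter of our discontent"
    for quoted in ["winter of our", "Winter  of our", "summer"]:
        print(repr(quoted), "->", diagnose(line, quoted))
```

Anything reported as "not found" is a candidate for the manual route described above: creating anchors in the text by hand and pointing to them directly.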
Prosopography
Janelle Jenstad
Janelle Jenstad is a Professor of English at the University of Victoria, Director
of The Map of Early Modern London, and Director of Linked Early Modern Drama Online. With Jennifer Roberts-Smith and Mark Kaethler, she co-edited Shakespeare’s Language in Digital Media: Old Words, New Tools (Routledge). She has edited John Stow’s A Survey of London (1598 text) for MoEML and is currently editing The Merchant of Venice (with Stephen Wittek) and Heywood’s 2 If You Know Not Me You Know Nobody for DRE. Her articles have appeared in Digital Humanities Quarterly, Elizabethan Theatre, Early Modern Literary Studies, Shakespeare Bulletin, Renaissance and Reformation, and The Journal of Medieval and Early Modern Studies. She contributed chapters to Approaches to Teaching Othello (MLA); Teaching Early Modern Literature from the Archives (MLA); Institutional Culture in Early Modern England (Brill); Shakespeare, Language, and the Stage (Arden); Performing Maternity in Early Modern England (Ashgate); New Directions in the Geohumanities (Routledge); Early Modern Studies and the Digital Turn (Iter); Placing Names: Enriching and Integrating Gazetteers (Indiana); Making Things and Drawing Boundaries (Minnesota); Rethinking Shakespeare Source Study: Audiences, Authors, and Digital Technologies (Routledge); and Civic Performance: Pageantry and Entertainments in Early Modern London (Routledge). For more details, see janellejenstad.com.
Joey Takeda
Joey Takeda is LEMDO’s Consulting Programmer and Designer, a role he assumed in 2020
after three years as the Lead Developer on LEMDO.
Mahayla Galliford
Project manager, 2025-present; research assistant, 2021-present. Mahayla Galliford
(she/her) graduated with a BA (Hons with distinction) from the University of Victoria
in 2024. Mahayla’s undergraduate research explored early modern stage directions and
civic water pageantry. Mahayla continues her studies through UVic’s English MA program
and her SSHRC-funded thesis project focuses on editing and encoding girls’ manuscripts,
specifically Lady Rachel Fane’s dramatic entertainments, in collaboration with LEMDO.
Martin Holmes
Martin Holmes has worked as a developer in UVic’s Humanities Computing and Media
Centre for over two decades, and has been involved with dozens of Digital Humanities
projects. He has served on the TEI Technical Council and as Managing Editor of the
Journal of the TEI. He took over from Joey Takeda as lead developer on LEMDO in 2020.
He is a collaborator on the SSHRC Partnership Grant led by Janelle Jenstad.
Navarra Houldin
Training and Documentation Lead 2025–present. LEMDO project manager 2022–2025. Textual
remediator 2021–present. Navarra Houldin (they/them) completed their BA with a major
in history and minor in Spanish at the University of Victoria in 2022. Their primary
research was on gender and sexuality in early modern Europe and Latin America. They
are continuing their education through an MA program in Gender and Social Justice
Studies at the University of Alberta where they will specialize in Digital Humanities.
Rylyn Christensen
Rylyn Christensen is an English major at the University of Victoria.
Tracey El Hajj
Junior Programmer 2019–2020. Research Associate 2020–2021. Tracey received her PhD
from the Department of English at the University of Victoria in the field of Science
and Technology Studies. Her research focuses on the algorhythmics of networked communications. She was a 2019–2020 President’s Fellow in Research-Enriched
Teaching at UVic, where she taught an advanced course on Artificial Intelligence and Everyday Life. Tracey was also a member of the Map of Early Modern London team between 2018 and 2021. Between 2020 and 2021, she was a fellow in residence at the Praxis Studio for Comparative Media Studies, where she investigated the relationships between artificial intelligence, creativity, health, and justice. As of July 2021, Tracey has moved into the alt-ac world for a term position, while also teaching in the English Department at the University of Victoria.
Orgography
LEMDO Team (LEMD1)
The LEMDO Team is based at the University of Victoria and normally comprises the project
director, the lead developer, project manager, junior developer(s), remediators,
encoders, and remediating editors.
Metadata
| Authority title | Convert old TLNs to TEI anchors |
| Type of text | Documentation |
| Publisher | University of Victoria on the Linked Early Modern Drama Online Platform |
| Series | Linked Early Modern Drama Online |
| Source | TEI Customization created by Martin Holmes, Joey Takeda, and Janelle Jenstad; documentation written by members of the LEMDO Team |
| Editorial declaration | n/a |
| Edition | Released with Linked Early Modern Drama Online 1.0 |
| Encoding description | Encoded in TEI P5 according to the LEMDO Customization and Encoding Guidelines |
| Document status | prgGenerated |
| Funder(s) | Social Sciences and Humanities Research Council of Canada |
| License/availability | This file is licensed under a CC BY-NC-ND 4.0 license, which means that it is freely downloadable without permission under the following conditions: (1) credit must be given to the author and LEMDO in any subsequent use of the files and/or data; (2) the content cannot be adapted or repurposed (except in quotations for the purposes of academic review and citation); and (3) commercial uses are not permitted without the knowledge and consent of the editor and LEMDO. This license allows for pedagogical use of the documentation in the classroom. |