Encode Foreign Languages

Use Cases and General Principle

In TEI, foreign means “not the main language of the text”. For many LEMDO editions, the main language is English. The language of the text is assumed to be English; you mark any other languages in the text with the <foreign> element.
There are five general scenarios in which you will need to tag foreign languages in a LEMDO edition:
A speech written in English will have an interpolated foreign word or phrase.
A speech will be written entirely in another language.
A character speaks entirely in another language.
An entire scene is written in another language.
A play is written mostly or entirely in another language.
In every case, we use the @xml:lang attribute to identify the language. We put that attribute on the lowest element in the hierarchy that entirely captures the foreign language passage. If the passage or word is already entirely wrapped in another element, put the @xml:lang on that element. If not, wrap the word(s) or phrase in the <foreign> element and add the @xml:lang attribute to the <foreign> element.
For more information, see Foreign Words in Quotations.

Interpolated Foreign Words and Phrases

This scenario is the most common. Wrap the word or phrase in the <foreign> element, add the @xml:lang attribute, and give the appropriate value for the language. Standard IANA values for common languages are given in a table below.
In the first example we give, Holofernes’ speech contains two words in Latin. The words are not wholly contained in another element, so we wrap the <foreign> element around the Latin words:
<sp>
  <speaker>Holofernes</speaker>
  <p>
    <foreign xml:lang="la">Quis, quis</foreign> thou consonant?</p>
</sp>
<p>
<!-- ... -->
All the better; we shall be the more marketable.—<foreign xml:lang="fr">Bonjour</foreign>, Monsieur Le Beau. What’s the news? <!-- ... --></p>
<p>
<!-- ... -->

  <foreign xml:lang="fr">Sans</foreign> witch-craft could not <!-- ... --></p>

A Speech Entirely in Another Language

If an entire speech is in a foreign language, put the @xml:lang attribute on the <sp> element. In the following example, Nathaniel’s speech is entirely in Latin:
<sp xml:lang="la">
  <speaker>Nathaniel</speaker>
  <p>Videsne quis venit?</p>
</sp>

Character Speaking in a Foreign Language

In cases where one character speaks persistently in another language (e.g., Lady Mortimer speaking in Welsh in 1 Henry IV) but the other characters in the scene speak in English, you will have to add the @xml:lang attribute to each of the speeches in a foreign language. All of Lady Mortimer’s speeches will have to bear the @xml:lang attribute. This scenario is therefore encoded exactly the same way as the previous use case (A Speech Entirely in Another Language).
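A minimal sketch of this pattern follows. The speakers and lines are invented for illustration and are not taken from a LEMDO edition; the only point is where the @xml:lang attribute sits.

```xml
<!-- Hypothetical exchange: one character speaks only Latin, the other only English. -->
<sp xml:lang="la">
  <speaker>Scholar</speaker>
  <p>Salve, amice.</p>
</sp>
<sp>
  <speaker>Servant</speaker>
  <p>I do not understand you, sir.</p>
</sp>
<!-- Each further speech by the Latin speaker carries its own @xml:lang. -->
<sp xml:lang="la">
  <speaker>Scholar</speaker>
  <p>Vale.</p>
</sp>
```

The English speech needs no attribute because it inherits the main language of the text.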

Scene in a Non-English Language

There are few scenes that are entirely in a non-English language. You will have to decide if it makes more sense to think of the scene as English with interpolations in other languages, or as another language with interpolations in English.
A classic case is the language lesson scene in Henry V 3.4. The scene is almost entirely in French, with an English stage direction and some English terms for body parts. One choice would be to set the language as French at the level of the scene division and then mark the stage direction as English, thus:
<div type="scene" n="4" xml:lang="fr">
  <stage type="entrance" xml:lang="en">Enter Catherine and Alice, an old gentlewoman.</stage>
  <sp who="#emdH5_FM_Catherine">
    <speaker>Catherine</speaker>
    <p>Alice, tu as été en Angleterre, et tu bien parles le langage.</p>
  </sp>
</div>
In this particular instance, the editor would have to think carefully about how to tag Catherine’s and Alice’s attempts at English. Is fingres English? That is an editorial decision.
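If the scene is set to French, one possible treatment of an English term inside a French speech might look like the sketch below. The wording follows a modernized text and may differ from your copy-text, the @who attribute is omitted because the correct pointer value is not known here, and treating de hand as English is only one of the choices open to an editor.

```xml
<div type="scene" n="4" xml:lang="fr">
  <!-- ... -->
  <sp>
    <speaker>Alice</speaker>
    <!-- One possible editorial choice: mark "de hand" as English within a French speech. -->
    <p>La main? Elle est appelée <foreign xml:lang="en">de hand</foreign>.</p>
  </sp>
  <!-- ... -->
</div>
```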

Play in a Non-English Language

LEMDO does support editions of plays in non-English languages, as well as supporting materials or documents in other languages. For example, there are a number of English plays written partly or wholly in Latin. If your text or document is predominantly Latin, specify the main language at the highest level in the document hierarchy, the <text> element that contains the text in its entirety. Any non-Latin languages in the text are then tagged as foreign. You want your tagging to mark deviations from the main language, so think carefully about the main language of your text.
If you want to look at an example, see Kevin Chovanec’s semi-diplomatic transcription of the octavo Fortunatus play in German (lemdo/data/texts/OFG/main/emdOFG_O.xml in our repository).
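As a rough sketch of the same principle for a predominantly Latin text, the skeleton below sets the main language on the <text> element and marks an English phrase as foreign. The structure inside <text> is a hypothetical placeholder, not a copy of any LEMDO file, and the comments stand in for real content.

```xml
<text xml:lang="la">
  <body>
    <div type="scene">
      <sp>
        <speaker><!-- speaker name --></speaker>
        <!-- Latin needs no tagging here: it inherits xml:lang="la" from <text>. -->
        <p><!-- Latin text --> <foreign xml:lang="en"><!-- English word or phrase --></foreign> <!-- Latin text --></p>
      </sp>
    </div>
  </body>
</text>
```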

IANA Values for Specific Languages

Add the attribute @xml:lang and the appropriate IANA (Internet Assigned Numbers Authority) value. Languages commonly seen in early modern plays are listed here. If you need a value for another language, see the list at IANA. You will need to write to lemdotech@uvic.ca to have new language values added to our schema:
Language                          Value
Dutch                             nl
English                           en (see note 1)
French, Early Modern and Modern   fr
French, Old                       fro
German                            de
Greek, Ancient (to 1453)          grc
Greek, Modern (1453–present)      el
Irish                             ga
Italian                           it
Latin                             la
Old English                       ang
Portuguese                        pt
Romany                            rom
Spanish                           es
Welsh                             cy

Foreign Words in Apparatus and Paratexts

Similar principles apply to the encoding of non-English words in apparatus documents (collations and annotations) and critical paratexts. Put the @xml:lang on the lowest element in the hierarchy that captures the foreign language passage in its entirety:
on <p> if an entire paragraph is in a language other than the main language of your document
on <quote> if the entire quotation is mostly or entirely in a foreign language
on <l> if an entire line of verse is mostly or entirely in a foreign language
on a <foreign> element that you have added to the text if there is no other logical element on which to hang the @xml:lang attribute.

Examples

We will add additional examples to this section on an ongoing basis.
<p>
<!-- ... -->
Rosalind continues her spoofing of Le Beau by referring to a common opening legal phrase, <quote xml:lang="la">Noverint universi per praesentes</quote>, <quote>Let everyone know by the present document</quote> (Hattaway) <!-- ... --></p>
A note on AYL by David Bevington containing two quotations, one in a foreign language and one in English.
<p>
<!-- ... -->
Cf. modern French <foreign xml:lang="fr">mepriser</foreign>
  <!-- ... -->
</p>

Foreign Words in Quotations

See the longer encoding description about tagging non-English languages, where you will find a list of values for foreign languages that frequently appear in early modern texts.
Tag foreign words within English quotations with the <foreign> element.
If the entire quotation is in a foreign language, add the @xml:lang attribute to the <quote> element. You do not need to add the <foreign> element as well. See the allowed values for @xml:lang.
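The two cases can be sketched as follows. These quotations are chosen only for illustration and do not come from a LEMDO annotation.

```xml
<!-- English quotation with an interpolated Latin phrase: wrap only the Latin. -->
<quote><foreign xml:lang="la">Et tu, Brute?</foreign> Then fall, Caesar.</quote>

<!-- Quotation entirely in Latin: the attribute goes on <quote>, with no <foreign>. -->
<quote xml:lang="la">Veni, vidi, vici</quote>
```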

Notes

1. We have a few texts that are entirely in languages other than English. In such texts, the @xml:lang attribute goes on the <text> element and instances of English are marked as <foreign>.

Prosopography

Isabella Seales

Isabella Seales is a fourth-year undergraduate completing her Bachelor of Arts in English at the University of Victoria. She has a special interest in Renaissance and Metaphysical Literature. She is assisting Dr. Jenstad with the MoEML Mayoral Shows anthology as part of the Undergraduate Student Research Award program.

Janelle Jenstad

Janelle Jenstad is a Professor of English at the University of Victoria, Director of The Map of Early Modern London, and Director of Linked Early Modern Drama Online. With Jennifer Roberts-Smith and Mark Kaethler, she co-edited Shakespeare’s Language in Digital Media: Old Words, New Tools (Routledge). She has edited John Stow’s A Survey of London (1598 text) for MoEML and is currently editing The Merchant of Venice (with Stephen Wittek) and Heywood’s 2 If You Know Not Me You Know Nobody for DRE. Her articles have appeared in Digital Humanities Quarterly, Elizabethan Theatre, Early Modern Literary Studies, Shakespeare Bulletin, Renaissance and Reformation, and The Journal of Medieval and Early Modern Studies. She contributed chapters to Approaches to Teaching Othello (MLA); Teaching Early Modern Literature from the Archives (MLA); Institutional Culture in Early Modern England (Brill); Shakespeare, Language, and the Stage (Arden); Performing Maternity in Early Modern England (Ashgate); New Directions in the Geohumanities (Routledge); Early Modern Studies and the Digital Turn (Iter); Placing Names: Enriching and Integrating Gazetteers (Indiana); Making Things and Drawing Boundaries (Minnesota); Rethinking Shakespeare Source Study: Audiences, Authors, and Digital Technologies (Routledge); and Civic Performance: Pageantry and Entertainments in Early Modern London (Routledge). For more details, see janellejenstad.com.

Joey Takeda

Joey Takeda is LEMDO’s Consulting Programmer and Designer, a role he assumed in 2020 after three years as the Lead Developer on LEMDO.

Martin Holmes

Martin Holmes has worked as a developer in UVicʼs Humanities Computing and Media Centre for over two decades, and has been involved with dozens of Digital Humanities projects. He has served on the TEI Technical Council and as Managing Editor of the Journal of the TEI. He took over from Joey Takeda as lead developer on LEMDO in 2020. He is a collaborator on the SSHRC Partnership Grant led by Janelle Jenstad.

Navarra Houldin

Project manager 2022–present. Textual remediator 2021–present. Navarra Houldin (they/them) completed their BA in History and Spanish at the University of Victoria in 2022. During their degree, they worked as a teaching assistant with the University of Victoriaʼs Department of Hispanic and Italian Studies. Their primary research was on gender and sexuality in early modern Europe and Latin America.

Rylyn Christensen

Rylyn Christensen is an English major at the University of Victoria.

Tracey El Hajj

Junior Programmer 2019–2020. Research Associate 2020–2021. Tracey received her PhD from the Department of English at the University of Victoria in the field of Science and Technology Studies. Her research focuses on the algorhythmics of networked communications. She was a 2019–2020 President’s Fellow in Research-Enriched Teaching at UVic, where she taught an advanced course on Artificial Intelligence and Everyday Life. Tracey was also a member of the Map of Early Modern London team, between 2018 and 2021. Between 2020 and 2021, she was a fellow in residence at the Praxis Studio for Comparative Media Studies, where she investigated the relationships between artificial intelligence, creativity, health, and justice. As of July 2021, Tracey has moved into the alt-ac world for a term position, while also teaching in the English Department at the University of Victoria.

Orgography

LEMDO Team (LEMD1)

The LEMDO Team is based at the University of Victoria and normally comprises the project director, the lead developer, project manager, junior developer(s), remediators, encoders, and remediating editors.
