Chapter 10. Facsimiles

Introduction to Facsimiles

The documentation in this chapter is for editors and encoders working with facsimiles. It is relevant to those encoding or remediating semi-diplomatic transcriptions.

Rationale

Our semi-diplomatic transcriptions are usually accompanied by facsimile images, which are digital surrogates of an early modern witness. We host these collections on an external server and embed them into our semi-diplomatic transcriptions where needed.
Facsimiles are an important component of each LEMDO edition. By hosting them on our own server, we can easily navigate between facsimiles and we can write diagnostics that catch errors or inconsistencies.
Additionally, LEMDO follows the standards outlined by the Endings Project. To ensure the longevity of our project, we use a standardized model to capture detailed information about each facsimile collection, which is located in the metadata of its XML file.
Note that LEMDO uses the terms facsimile and digital surrogate interchangeably.

Learning Outcomes

This chapter is designed to support you through finding, downloading, and encoding facsimile collections for your semi-diplomatic transcriptions. By the time you have worked through this chapter, you will:
Understand how copyright applies to early modern witnesses.
Know how to find, download, name, and number facsimile images according to LEMDO standards.
Be able to fully encode facsimile files, including running basic XSLT.

Contents

Section: Description
Find a Digital Surrogate: Learn how to use online resources to find high-quality, open-access facsimile collections of your printed playbook
Name and Store Facsimiles: Learn how to download, name, and number facsimile images
Capture Facsimile Metadata: Learn how to encode the metadata of a facsimile file
Encode Images in Facsimile Files: Learn how to create links to individual facsimile images
Terminal Code for Naming Facsimiles: Learn how to write code that allows you to batch rename facsimiles in your terminal

Find a Digital Surrogate

Rationale

LEMDO hosts a variety of digital surrogates, but our collection is not exhaustive. When you begin working on a new semi-diplomatic transcription, you will likely need to find your own facsimiles. This documentation will guide you through the steps to locate usable digital surrogates online.

Practice

Acquire Facsimiles

To find images, check these websites first:
The English Short Title Catalogue, which will usually provide a list of libraries holding your play. Navigate to your holding library’s website and search for your play in their catalogue. Note that only some libraries make digital surrogates freely and openly available. Note also that many library catalogues now list the EEB microfilms and the EEBO surrogates of microfilms, which can be misleading when you are looking for a high-quality digital surrogate of the copy owned by the library. LEMDO does not use microfilms in place of high-quality facsimile images, nor does it link to EEBO.
The Internet Archive. Many libraries, including Boston Public Library and Harry Ransom Centre, share their digital surrogates via the Internet Archive.
Adam G. Hooks and Zachary Lesser’s Shakespeare Census, which provides direct links to usable digital surrogates. Note that Shakespeare Census only lists information about plays attributed to Shakespeare.
Rob Carson’s Marlowe Census, which adapts the Shakespeare Census code and principles for extant copies of Marlowe’s works.
When you look for facsimile collections online, search by STC number, call number, or the play’s title with date restrictions (e.g. 1550–1600). All of this information is available in the English Short Title Catalogue entry for your play.
Note that British Library digital collections remain unavailable as of January 2026.

Choosing Images for Semi-Diplomatic Transcriptions

When you select images to include in the LEMDO repository for your semi-diplomatic transcription, there are some considerations to keep in mind. The digital surrogate that you select should:
Be from the copy that the semi-diplomatic transcription follows.
Be open-access already or available for LEMDO to use for a reasonable one-time fee. (Contact the LEMDO Director to discuss payment options.)
Be one that we are legally able to download and store.
Be high-resolution (at least 2000px).
Include title page and blank pages.
Be in colour.
Ideally, be a complete copy (i.e., not missing leaves or gatherings), although we recognize that some early publications do not have any complete copies available.
Ideally, be single-page scans. We are able to split scans of spreads if needed. If your digital surrogate has images of page spreads, please contact lemdo@uvic.ca.
Ideally, have a high percentage of corrected sheets (if that information is known).
Not be microfilm.

Name and Store Facsimiles

Rationale

LEMDO uses a standardized model to name and store our facsimile collections. Standardization is important because it allows us to organize our digital surrogates easily and consistently. When we consistently include the same information in the same places, our XML is easier to use now and parse later.
Because digital surrogates are typically large files, we store them on an external server hosted by HCMC at https://lemdo.uvic.ca/facsimiles. We can link directly to individual images from our semi-diplomatic transcriptions using their standardized file names.

Download and Name Facsimiles

After you have found a usable set of facsimile images, you will need to download them to your computer. Download each image individually and name each one according to LEMDO’s image naming standard as described in this documentation page. Images must be downloaded as .jpg files only.
The folder to which you save your images must be named according to this convention: Work_Siglum_Library_Copy. The information in the folder name goes from the most general to the most specific: work, siglum, holding library, and copy number (if there is more than one copy at the library).
Work: Use the DRE standard abbreviation for the work, as listed in DRE Play IDs. Examples: Ham, 1IYK, AYL, Mucd, FEm, H5, FV
Siglum: Give the standard abbreviation or siglum for the publication. Examples: Q, Q1, MS, O, F, F1
Library: Use the LEMDO abbreviation for the holding library of the copy. Examples: BPL, BL, SLNSW
Copy: If a holding library has more than one copy, add the shelfmark, copy number, or call number to the folder name. Examples: 1, 2, Dyce
Examples:
Ham_Q1_BL means the folder containing the facsimiles of the British Library copy of Q1 Hamlet.
DevC_Q1_BPL means the folder containing the facsimiles of the Boston Public Library copy of the Q1 publication of The Devils Charter.
1IYK_Q7_Folger_2 means the folder containing the facsimiles of the Folger Shakespeare Library copy of the Q7 publication of If You Know Not Me, or the Troubles of Queen Elizabeth, numbered copy 2 in the Folger collection.
Individual images must be named according to this convention: FolderName_Number. All image numbers in a collection must have the same number of digits, and they must all start with “0”. To figure out how many digits you need, count the digits in the total number of images in your collection, then add one. For example, if you have 48 images in total (a two-digit total), each number will be three digits long, including the leading zero (e.g. 001, 002, 003 … 046, 047, 048). Images are numbered sequentially from the first image, whether or not that page is blank.
Examples:
MayM_MS_Kent_06 is the sixth sequential image of the May Masque manuscript held at the Kent History and Library Centre, which contains 6 images in total.
Ham_Q1_BL_008 is the eighth sequential image of the British Library copy of Q1 Hamlet, which contains 95 images in total.
Oth_Q1_BL_0008 is the eighth sequential image of the British Library copy of Othello, which contains 107 images in total.
F3_SLNSW_00008 is the eighth sequential image of the State Library of New South Wales copy of Shakespeare’s Third Folio, which contains 1029 images in total.
Sequential file numbers do not have to match page numbers. The XML file containing the metadata for the facsimile will match image file numbers with the through-page numbers (if any) and bibliographical signature numbers of the book’s pages.
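The numbering rule above can be sketched in shell. This is a minimal illustration rather than LEMDO tooling: the function names pad_width and image_name are hypothetical, and the totals used are the ones given in the examples above.

```shell
# Hypothetical helpers (not LEMDO tooling) that apply the naming rules above.

# pad_width TOTAL: the number of digits in TOTAL plus one, so that every
# image number in the collection carries at least one leading zero.
pad_width() {
  echo $(( ${#1} + 1 ))
}

# image_name FOLDER TOTAL INDEX: build FolderName_Number (no extension).
# Pass INDEX as a plain integer (8, not 008) so printf reads it as decimal.
image_name() {
  w=$(pad_width "$2")
  printf "%s_%0${w}d\n" "$1" "$3"
}

image_name Ham_Q1_BL 95 8     # prints Ham_Q1_BL_008
image_name F3_SLNSW 1029 8    # prints F3_SLNSW_00008
```

Computing the width from the total, rather than typing it by hand, keeps every name in a collection the same length.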

Adding Facsimiles to the Server

Once you have downloaded and named your images, you will need to ask a developer to add them to the external server. Please email lemdo@uvic.ca with a .zip file of your folder. If the folder is too large for email, please upload it to Google Drive, Proton Drive, SharePoint, Dropbox, WeTransfer, or another secure file-sharing service, and provide the link and access instructions in your email.

Library Codes

LEMDO uses recognizable abbreviations for libraries. For a searchable, open-access, linked list of the STC codes for libraries, see Meaghan Brown’s website. If you add facsimiles from libraries not listed in the STC, create a logical abbreviation and email lemdo@uvic.ca to let us know.

LEMDO Library Codes Already in Use

Library: Abbreviation/Code
Alnwick Castle, Duke of Northumberland’s Library: Aln
Beinecke Rare Books and Manuscripts Library, Yale University: Yale
Bibliothèque municipale Marceline Desbordes-Valmore (Douai): Douai
Boston Public Library: BPL
Brandeis: Bran
British Library: BL
Cardiff Public Library: CPL
Elham Parish Library, Canterbury Cathedral: EPL
Folger Shakespeare Library, Washington, DC: Folger (STC uses F)
Harry Elkins Widener Memorial Library, Harvard University: HD
Harry Ransom Centre, University of Texas Austin: HRC
Henry E. Huntington Library, San Marino, California: Hunt (STC uses HN)
Legislative Library of British Columbia: LLBC
Mary Couts Burnett Library, Texas Christian University: TCU
National Library of Scotland, Edinburgh: NLS (STC uses E)
Carl H. Pforzheimer Collection, New York Public Library, New York City: PFOR
Rosenbach Museum and Library, Philadelphia: RML
State Library of New South Wales: SLNSW
Trinity College Dublin: TCD (STC uses D)
University of California Los Angeles: UCLA (STC uses CAL)
University of Victoria Libraries (MacPherson Library): Mac
Victoria and Albert Museum: VA

Capture Facsimile Metadata

Rationale

Image files are not XML files and cannot have a <teiHeader> the way our typical XML files do. Instead, we must capture information about each facsimile collection in a separate metadata file. The information in these files is standardized and must be organized exactly as described in this documentation.

Introduction

Facsimile files are divided into two sections: metadata (data about the data) and facsimile content (the actual images you will be encoding). This documentation will guide you through how to encode the metadata section. To learn how to encode facsimile content, see Encode Images in Facsimile Files.
To supplement the learning in this chapter, we have created a standardized template for our facsimile files: facsPlaybook_template. In combination with this documentation, please follow the template when adding a new facsimile file to the repository. It contains the complete content model and generous comments to help you complete each element in the file. To learn how to open and use a template, see Use LEMDO’s Oxygen Templates.

Practice: Encode Playbook Titles

Facsimile files have two <title> elements in their <titleStmt> : one with a @type value of main and one with a value of full. The value main refers to the editorial title, while full refers to the play’s original title as given in the STC.
Format <title> elements as follows:
<title type="main">Henry VI, Part III, Quarto 2</title>
<title type="full">The true tragedie of Richarde Duke of Yorke, and the death of good King Henrie the sixt: with the whole contention betweene the two houses, Lancaster and Yorke</title>
In the full title, regularize double vs to w (e.g. vvith would be changed to with). Standardize long ſ as a short s. Do not capture ligatures (e.g., fl) but do capture digraphs (e.g., æ). Do not include end punctuation.

Practice: Encode Authority

The <authority> element tells us which holding library owns the copy of the playbook. Include the name of your holding library exactly as it appears in our shared orgography ORGS1, then wrap it in <orgName> . Use the @ref attribute to point to its xml:id in ORGS1. If your library does not appear in ORGS1, please email lemdo@uvic.ca and we will add it in.
A fully encoded <authority> element will look like this:
<authority>
  <orgName ref="org:BOST1">Boston Public Library</orgName>
</authority>

Practice: Encode Copyright

We use two elements to encode copyright: <availability> and <licence> . (Note that the <licence> element follows Canadian/British spelling conventions.)
The <availability> element includes the copyright statement for your playbook. Search the holding library’s website for specific information about fair use of images. If you cannot find any, copy and paste this statement:
<p>Although old books are not protected by copyright, the images thereof may be protected by copyright laws in the country where the book is held. Check with the holding library before you use these images for anything other than classroom use.</p>
The <licence> element is a child of <availability> . It includes the copyright licence for your facsimile images. Ensure your <licence> element looks like this:
<licence>Licensed under a <ref target="https://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</ref> (CC BY-SA 4.0)</licence>

Practice: Encode Identification Information

The <msIdentifier> element includes information about how to identify your particular playbook. It contains an <institution> element followed by at least one <idno> element.
In your <institution> element, provide the name of the holding library wrapped in the <orgName> element. The text nodes of the <institution> element and the <authority> should be the same.
Ideally, you will have multiple <idno> elements as children of your <msIdentifier> . Put the call number from the playbook’s holding library in the text node of the first <idno> element. For additional <idno> elements, you must find the other available identifiers (e.g., DEEP and STC numbers). On each <idno> element following the call number, add the @type attribute with the appropriate value from the drop-down list that appears in Oxygen. List the identifiers following the call number in alphabetical order by their @type value.
A fully encoded <msIdentifier> element will look like this:
<msIdentifier>
  <institution>
    <orgName ref="org:BOST1">Boston Public Library</orgName>
  </institution>
  <idno>G.176.8</idno>
  <idno type="ESTC">S111150</idno>
  <idno type="Greg">138b</idno>
  <idno type="ShakCensus">363</idno>
  <idno type="STC">21006a</idno>
</msIdentifier>
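The alphabetical ordering of @type values can be checked from the command line. The sketch below is an illustration only: check_idno_order is a hypothetical helper, it assumes each <idno> sits on its own line with @type as its first attribute (as in the example above), and it treats the ordering as case-insensitive.

```shell
# Hypothetical helper (not LEMDO tooling).
# check_idno_order FILE: succeed if the @type values on <idno> elements appear
# in case-insensitive alphabetical order. The call number <idno> has no @type,
# so the sed expression skips it.
check_idno_order() {
  sed -n 's/.*<idno type="\([^"]*\)".*/\1/p' "$1" | LC_ALL=C sort -f -c
}

# Demonstration with the identifiers from the example above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<idno>G.176.8</idno>
<idno type="ESTC">S111150</idno>
<idno type="Greg">138b</idno>
<idno type="ShakCensus">363</idno>
<idno type="STC">21006a</idno>
EOF
check_idno_order "$sample" && echo "idno order OK"
rm -f "$sample"
```

A line-based check like this is only a quick aid; validation in Oxygen remains the authoritative test of the file.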

Practice: Encode Table of Contents

The <msContents> element functions like a table of contents for your printed playbook. It allows you to index your playbook’s contents, which may otherwise be difficult to navigate. Each entry in your <msContents> will follow this pattern:
<msItem>
  <locus from="015" to="015"/>
  <title>Title page</title>
</msItem>
The <locus> element specifies the page range, while the <title> element describes the content included within that range. Given the diversity of information that can appear in early printed playbooks, every <msContents> entry must follow these guidelines:
Create a new <msItem> for each of the following: book cover(s), endpaper, spine, title page, blank page(s), and the play text itself.
Group other miscellaneous content as frontmatter (if it appears before the play text) or backmatter (if it appears after the play text).
Leave a comment if your playbook is missing any pages.
A fully encoded <msContents> will look like this:
<msContents>
  <msItem>
    <locus from="001" to="001"/>
    <title>Book cover</title>
  </msItem>
  <msItem>
    <locus from="002" to="014"/>
    <title>Blank page</title>
  </msItem>
  <msItem>
    <locus from="015" to="015"/>
    <title>Title page</title>
  </msItem>
  <msItem>
    <locus from="016" to="016"/>
    <title>Blank page</title>
  </msItem>
  <msItem>
    <locus from="017" to="075"/>
    <title>Henry VI, Part III</title>
  </msItem>
  <msItem>
    <locus from="076" to="089"/>
    <title>Blank page</title>
  </msItem>
  <msItem>
    <locus from="090" to="090"/>
    <title>Book cover</title>
  </msItem>
</msContents>
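Because later steps depend on the numbers in <msContents>, it can be useful to check the <locus> ranges before going further. The following sketch assumes that the ranges should follow one another without gaps or overlaps, as they do in the example above; that requirement is an assumption, and check_locus_ranges is a hypothetical helper rather than a LEMDO diagnostic.

```shell
# Hypothetical helper (not a LEMDO diagnostic).
# check_locus_ranges FILE: read each <locus from="..." to="..."/> in document
# order and fail if a range runs backwards or does not start immediately
# after the previous range ends.
check_locus_ranges() {
  sed -n 's/.*<locus from="\([0-9]*\)" to="\([0-9]*\)".*/\1 \2/p' "$1" |
  awk '
    { from = $1 + 0; to = $2 + 0 }   # "015" becomes 15
    to < from                  { print "range runs backwards: " $1 "-" $2; bad = 1 }
    NR > 1 && from != prev + 1 { print "gap or overlap before " $1; bad = 1 }
    { prev = to }
    END { exit bad }
  '
}

# Usage (the file name is illustrative):
# check_locus_ranges facs_FEm_Q2_BPL_1.xml && echo "locus ranges OK"
```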

Practice: Encode Physical Description

We use the <physDesc> element to describe the physical characteristics of a printed playbook. <physDesc> elements generally include three pieces of information:
The total number of pages, size in centimetres, and type of publication you are encoding (e.g. folio, quarto, octavo, or manuscript). Format this information as follows: [64] p ; 18 cm ; 4⁰.
A physical description of the item, quoted directly from your holding library’s website and wrapped in <quote> with a parenthetical citation pointing to the webpage from which you got the information.
A <typeDesc> element, which describes the type used in your playbook. This will generally be either Roman or Gothic type.
A fully encoded <physDesc> will look like this:
<physDesc>
  <p>[64] p ; 18 cm ; 4⁰. <quote>Boston Public Library (Rare Books Department) copy halfbound in early calfskin and brown paper, rebacked in brown goatskin with the title stamped in gilt along the spine. Armorial bookplate of the Barton Library. BPL perforation stamp on title page. Bibliographical inscription in the hand of J.O. Halliwell-Phillipps on front flyleaf with description from Halliwell sale of May, 1857, laid on. Final leaf (H4) lined in brown paper</quote> (<ref target="https://bpl.bibliocommons.com/v2/record/S75C4565407">Boston Public Library catalogue entry</ref>).</p>
  <typeDesc>
    <p>Printed in Roman type.</p>
  </typeDesc>
</physDesc>

Name a Facsimile File

Once you have created your facsimile metadata file, you must name it and add it to the lemdo/data/facsimiles folder. When naming your file, use the prefix facs_ followed by the name of the facsimile folder that your file corresponds to.
Examples:
facs_FEm_Q2_BPL_1.xml contains the metadata for the Boston Public Library copy of the second quarto of Fair Em with the shelfmark Copy 1.
facs_MND_Q2_BL.xml contains the metadata for the British Library copy of the second quarto of A Midsummer Night’s Dream.

Encode Images in Facsimile Files

Rationale

Because we store facsimiles on an HCMC server outside the LEMDO repository, we need to make links to them from within our XML files. These links are created in the facs_ metadata file that corresponds to each facsimile collection.
Before you encode image links in the body of the facs file, you need to complete the <teiHeader> section in the file and capture the information (metadata) about the facsimile. To learn how to encode facsimile metadata, see Capture Facsimile Metadata.
The body content of a facs_ file contains two elements for each facsimile image: <surface> and <graphic> . Via the <surface> element, we create a unique xml:id for each image and indicate its signature number, while the <graphic> element captures the image link itself. These elements must be encoded correctly for the facsimile links in our semi-diplomatic transcriptions to work properly.

Use XSLT to Encode Images

Before creating any image links, please ensure that your <msContents> is correct. The accuracy of our XSLT transformation relies on the numbers you give in your <msContents> .
We have written an XSLT transformation that can create most of the encoding for all of your facsimile images for you. To run the transformation, follow these steps:
Navigate to your file in Oxygen’s Project View.
Right-click the file, navigate to the drop-down labelled Transform, and click Transform with….
Find the transformation called LEMDO: Add surfaces in facsimile. Run the transformation.
Oxygen will ask you to indicate what type of images you have (either .jpg or .png). Use the drop-down menu to select the correct option.

Encode Signature Numbers

After running the XSLT, you will need to manually add signature numbers to the @n attribute of your <surface> element. Begin adding signature numbers from the title page of your printed playbook. Do not add signature numbers to pages that are not part of the playbook’s original contents (e.g., protective blank pages added by libraries).
We use signature numbers as a navigational tool. To account for any printing errors, we use inferred editorial signature numbers in our facsimile files. If your playbook’s pages are missing signature numbers, follow a logical numerical sequence from the last visible signature number until you reach the next one.
A fully encoded facsimile image should look like this:
<surface xml:id="facs_H5_Q1_BL_001" n="001">
  <graphic url="sourcefacs:H5_Q1_BL/H5_Q1_BL_001.png" mimeType="image/png"/>
</surface>
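To make the relationship between folder name, image name, xml:id, and link explicit, the sketch below prints <surface> skeletons that follow the pattern of the example above. It is not the LEMDO XSLT transformation and is not meant to replace it: surface_entries is a hypothetical function, the empty @n is a placeholder for the signature number you add by hand, and the image/jpeg value for .jpg files is an assumption based on the standard MIME type.

```shell
# Hypothetical illustration only; use the LEMDO transformation for real files.
# surface_entries FOLDER TOTAL EXT: print one <surface> per image, numbered
# with (digits in TOTAL) + 1 digits.
surface_entries() {
  width=$(( ${#2} + 1 ))
  case $3 in
    jpg) mime=image/jpeg ;;
    png) mime=image/png ;;
    *)   echo "unsupported extension: $3" >&2; return 1 ;;
  esac
  i=1
  while [ "$i" -le "$2" ]; do
    num=$(printf "%0${width}d" "$i")
    # n is left empty: signature numbers are entered manually afterwards.
    printf '<surface xml:id="facs_%s_%s" n="">\n' "$1" "$num"
    printf '  <graphic url="sourcefacs:%s/%s_%s.%s" mimeType="%s"/>\n' \
      "$1" "$1" "$num" "$3" "$mime"
    printf '</surface>\n'
    i=$(( i + 1 ))
  done
}

# Print the skeleton for the first image of a 58-image collection.
surface_entries H5_Q1_BL 58 png | sed -n '1,3p'
```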

Terminal Code for Naming Facsimiles

This documentation is intended for experienced encoders. It presupposes that you have the basic skills and knowledge required to work in the command line. If you do not yet have this experience, please email lemdo@uvic.ca for guidance.

Rationale

In the process of naming and storing your facsimile images, you can use terminal code in place of some repetitive tasks. Terminal code is executed from the command line.
This documentation contains two important commands: one for converting unusable file formats into .jpg format, and one for naming a set of images according to LEMDO standards. These commands work by treating your images as a set, eliminating the need to modify each one individually.

Prepare Your Images

Complete the following tasks before running terminal code in the command line:
Download each of your facsimile images individually.
Place your images in one folder.
Ensure that your images appear in the correct order within the folder.
Name your folder according to LEMDO’s facsimile naming convention, which is Work_Siglum_Library_Copy. For more detailed information on our naming protocol, see Name and Store Facsimiles.
Always create a backup of your folder before running any commands to account for potential errors.

Step-by-Step: Use Terminal Code to Convert Images

In the command line, navigate to the folder containing your facsimiles using the cd command. You can also drag and drop your folder into the command line to insert its path automatically.
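Copy and paste this command, which uses sips (a command-line image tool included with macOS):
for f in *.file; do sips -s format jpeg "$f" --out "${f%.file}.jpg" && rm "$f"; done
Replace both instances of .file with your current image format (e.g. .tif, .jp2, or .png). Note that the command deletes each original file once its conversion succeeds.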
Press enter.
You should see the conversion happen within Terminal. Once the files have finished processing, double check that they have been converted correctly by refreshing your folder.
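Before converting, it can help to preview the names the command will produce. The following is a minimal sketch, assuming a POSIX shell; jpg_name and preview_conversion are hypothetical helpers, and nothing is converted or deleted.

```shell
# Hypothetical helpers (not part of sips or LEMDO tooling).

# jpg_name FILE EXT: print the .jpg name FILE would receive, using the same
# suffix-stripping expansion as the conversion command.
jpg_name() {
  printf '%s\n' "${1%.$2}.jpg"
}

# preview_conversion EXT: dry run that lists each planned conversion in the
# current folder without touching any file.
preview_conversion() {
  for f in *."$1"; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    printf '%s -> %s\n' "$f" "$(jpg_name "$f" "$1")"
  done
}

jpg_name Ham_Q1_BL_scan.tif tif   # prints Ham_Q1_BL_scan.jpg
```

Run preview_conversion tif (or another extension) inside the facsimile folder and compare the list against what you expect before running the real command.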

Step-by-Step: Use Terminal Code to Batch Rename Images

In the command line, navigate to the folder containing your facsimiles using the cd command. You can also drag and drop your folder into the command line to insert its path automatically.
Ensure your images are saved in .jpg format. The command will not work otherwise.
Copy and paste this command:
i=1; for f in *.jpg; do printf -v num "%03d" "$i"; mv "$f" "filename_$num.jpg"; ((i++)); done
Replace filename with your folder name (e.g., Ham_Q1_BL), keeping the underscore that follows it.
Replace the number “3” in %03d with the number of digits you need to number all of your images. To figure out how many digits you need, count the digits in the total number of images in your folder and then add one. For example, if you have 100 images in total (a three-digit total), you will need 4 digits, so you will replace the “3” in %03d with “4”.
Press enter.
You will not be able to see this transformation happen within Terminal. Refresh your folder to make sure your images have been named correctly. Ensure each image name has at least one leading zero and is not missing any underscores.
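The final check can also be scripted. The sketch below is an illustration under stated assumptions: check_names is a hypothetical helper, it only inspects .jpg files, and it tests for the folder-name prefix, a leading zero, and a consistent name length rather than validating every character.

```shell
# Hypothetical helper (not LEMDO tooling).
# check_names DIR PREFIX: succeed only if every .jpg in DIR is named
# PREFIX_0....jpg and all names have the same length (same digit count).
check_names() {
  status=0
  expected=
  for f in "$1"/*.jpg; do
    name=${f##*/}
    case $name in
      "$2"_0*.jpg) ;;
      *) echo "unexpected name: $name"; status=1 ;;
    esac
    [ -n "$expected" ] || expected=${#name}
    [ "${#name}" -eq "$expected" ] || { echo "digit count differs: $name"; status=1; }
  done
  return "$status"
}

# Usage, run from inside the facsimile folder:
# check_names . Ham_Q1_BL && echo "names OK"
```

If the folder contains no .jpg files, the function reports the unexpanded pattern as an unexpected name and fails, which signals that the conversion step has not been run.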

Prosopography

Isabella Seales

Isabella Seales is a fourth-year undergraduate completing her Bachelor of Arts in English at the University of Victoria. She has a special interest in Renaissance and Metaphysical Literature. She is assisting Dr. Jenstad with the MoEML Mayoral Shows anthology as part of the Undergraduate Student Research Award program.

Janelle Jenstad

Janelle Jenstad is a Professor of English at the University of Victoria, Director of The Map of Early Modern London, and Director of Linked Early Modern Drama Online. With Jennifer Roberts-Smith and Mark Beatrice Kaethler, she co-edited Shakespeare’s Language in Digital Media: Old Words, New Tools (Routledge). She has edited John Stow’s A Survey of London (1598 text) for MoEML and is currently editing The Merchant of Venice (with Stephen Wittek) and Heywood’s 2 If You Know Not Me You Know Nobody for DRE. Her articles have appeared in Digital Humanities Quarterly, Elizabethan Theatre, Early Modern Literary Studies, Shakespeare Bulletin, Renaissance and Reformation, and The Journal of Medieval and Early Modern Studies. She contributed chapters to Approaches to Teaching Othello (MLA); Teaching Early Modern Literature from the Archives (MLA); Institutional Culture in Early Modern England (Brill); Shakespeare, Language, and the Stage (Arden); Performing Maternity in Early Modern England (Ashgate); New Directions in the Geohumanities (Routledge); Early Modern Studies and the Digital Turn (Iter); Placing Names: Enriching and Integrating Gazetteers (Indiana); Making Things and Drawing Boundaries (Minnesota); Rethinking Shakespeare Source Study: Audiences, Authors, and Digital Technologies (Routledge); and Civic Performance: Pageantry and Entertainments in Early Modern London (Routledge). For more details, see janellejenstad.com.

Joey Takeda

Joey Takeda is LEMDO’s Consulting Programmer and Designer, a role he assumed in 2020 after three years as the Lead Developer on LEMDO.

Mahayla Galliford

Project manager, 2025-present; research assistant, 2021-present. Mahayla Galliford (she/her) graduated with a BA (Hons with distinction) from the University of Victoria in 2024. Mahayla’s undergraduate research explored early modern stage directions and civic water pageantry. Mahayla continues her studies through UVic’s English MA program and her SSHRC-funded thesis project focuses on editing and encoding girls’ manuscripts, specifically Lady Rachel Fane’s dramatic entertainments, in collaboration with LEMDO.

Martin Holmes

Martin Holmes has worked as a developer in UVic’s Humanities Computing and Media Centre for over two decades, and has been involved with dozens of Digital Humanities projects. He has served on the TEI Technical Council and as Managing Editor of the Journal of the TEI. He took over from Joey Takeda as lead developer on LEMDO in 2020. He is a collaborator on the SSHRC Partnership Grant led by Janelle Jenstad.

Navarra Houldin

Training and Documentation Lead 2025–present. LEMDO project manager 2022–2025. Textual remediator 2021–present. Navarra Houldin (they/them) completed their BA with a major in history and minor in Spanish at the University of Victoria in 2022. Their primary research was on gender and sexuality in early modern Europe and Latin America. They are continuing their education through an MA program in Gender and Social Justice Studies at the University of Alberta where they will specialize in Digital Humanities.

Sofia Spiteri

Sofia Spiteri is currently completing her Bachelor of Arts in History at the University of Victoria. During the summer of 2023, she had the opportunity to work with LEMDO as a recipient of the Valerie Kuehne Undergraduate Research Award (VKURA). Her work with LEMDO primarily includes semi-diplomatic transcriptions for The Winter’s Tale and Mucedorus.

Tracey El Hajj

Junior Programmer 2019–2020. Research Associate 2020–2021. Tracey received her PhD from the Department of English at the University of Victoria in the field of Science and Technology Studies. Her research focuses on the algorhythmics of networked communications. She was a 2019–2020 President’s Fellow in Research-Enriched Teaching at UVic, where she taught an advanced course on Artificial Intelligence and Everyday Life. Tracey was also a member of the Map of Early Modern London team, between 2018 and 2021. Between 2020 and 2021, she was a fellow in residence at the Praxis Studio for Comparative Media Studies, where she investigated the relationships between artificial intelligence, creativity, health, and justice. As of July 2021, Tracey has moved into the alt-ac world for a term position, while also teaching in the English Department at the University of Victoria.

Orgography

Boston Public Library (BOST1)

Metadata