Chapter 21. Programming

This chapter of our documentation is still in beta. We welcome feedback, corrections, and questions while we finalize the page in our 2024–2025 work cycle.

Introduction to Programming

Programming Principles

This document lays out basic programming practices, style rules, and naming conventions for programmers working on the XSLT, HTML, JavaScript, and CSS in the LEMDO project. We do not yet conform to these principles, but we are committed to doing so in time.

Names for Variables, Classes, and Other Identifiers

All LEMDO programming code identifiers and file names should be constrained to the ASCII range.
All LEMDO identifiers should be as long as necessary to describe their meaning and function unambiguously, with information going from general to specific. Examples: note-editorial, title-monograph. This makes it easier to sort, search for, and compare items that belong to groups.
Generally speaking, the following should all be lower-case with components separated by dashes:
CSS filenames (including SCSS source files).
JavaScript filenames.
Image filenames (for site chrome etc.; facsimiles etc. have their own naming conventions).
HTML custom attribute names (data-*).
Note that HTML data-* attribute values will often be derived from TEI element names and other similar constructs, so these may take whatever form is dictated by their source: <div data-el="persName">.
ECMAScript practices are different:
ECMAScript variable and function names should be lower camelCase (function getSomething()).
ECMAScript constants should be UPPERCASE_WITH_UNDERSCORES.
ECMAScript class names should be upper CamelCase (class NotePopup {...}).
Identifiers should be as descriptive as possible, with information going from general to specific, so: noteEditorial, titleMonograph.

Principles for HTML

Wherever possible, prefer HTML5 semantic elements over generic elements. Examples include nav, section, main, article, header, and footer.
The only elements requiring id attributes are those which need to be pointed at, linked to, or manipulated by ECMAScript.
To identify a category of element that needs to be styled (for example), prefer the use of data-* attributes derived from the source TEI wherever possible. So:
<span data-el="persName">
<span data-el="speaker">
<div data-el="ab">
This provides a solid link between the underlying TEI code and the HTML which then propagates easily into the CSS, making maintenance and debugging easier.
Reserve the use of style and class attributes for the propagation of primary source style description from the source TEI into the HTML. Examples:
<span data-el="stage" style="font-size: 80%;">
<span data-el="speaker" class="rnd_gothic">
Here the source element name is identified using the data-el attribute, so default house styling can be applied uniformly if necessary, but the appearance from the original source text is captured in a local style attribute or a class.

Principles for CSS

Following on from the principles above, site-level CSS (as opposed to text-level CSS derived from the TEI encoding of primary source features) should avoid the use of class attributes wherever possible. To select an element for styling (see the sketch after this list):
First prefer its semantic tag name (header, footer, aside).
Next, use TEI-derived data-* attributes (<span data-el="fw" data-type="catch">).
If that is not enough, fall back on a selector based on position in the hierarchy or on a semantic feature of an ancestor (q q {...}, div[data-el="speech"] aside).
Resort to classes only as a desperate measure.
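For example, a site stylesheet following this order of preference might contain rules along these lines (a sketch only; the property values and the .site-notice class are invented for illustration):
/* 1. Semantic tag name. */
header { background-color: #eee8dd; }
/* 2. TEI-derived data-* attributes. */
span[data-el="fw"][data-type="catch"] { float: right; }
/* 3. Position in the hierarchy, or a semantic feature of an ancestor. */
div[data-el="speech"] aside { font-size: 80%; }
/* 4. A class, only as a desperate measure. */
.site-notice { border: 1px solid #999999; }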

Principles for XSLT

XSLT files are named in lower-case with underscores, and use the extension .xsl.
All root (master) XSLT file names end in _master.xsl.
All modules which are included or imported into master files have names ending in _module.xsl, and are stored in a modules subfolder alongside the master file(s) that use them.
XSLT files are documented using elements in the XSLT Stylesheet Documentation Namespace, as supported by the Oxygen XML Editor. They are also validated using project-specific custom Schematron in Oxygen, ensuring adherence to basic good practices and style rules.
Variable and function names should use lower camelCase.
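By way of illustration, the top of a hypothetical module might look like this (the filename, variable name, and documentation text are all invented; only the conventions are the point):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
  version="3.0">
  <xd:doc scope="stylesheet">
    <xd:desc>A hypothetical toc_module.xsl, showing naming and documentation conventions.</xd:desc>
  </xd:doc>
  <!-- Variable names use lower camelCase. -->
  <xsl:variable name="maxTocEntries" select="20"/>
</xsl:stylesheet>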

Static Build

What is a Static Build?

A static build is a process which takes a stack of input documents such as TEI files, along with related resources such as images, and creates a website from them. It does this in such a way that the resulting website has no dependency on any back-end engine such as a database or a PHP processor; all the content consists of static HTML and related files, and the site can be hosted on any webserver simply by copying the files up to it. This process, championed by UVicʼs Project Endings, is described in detail in the many presentations and publications of the Endings team. The great value of a static site is that it requires no ongoing maintenance, is easy to replicate and archive, and has a good chance of surviving for decades without much attention.

Practice: Run a Static Build

The LEMDO project and its related sites are all built using a Jenkins Continuous Integration Server run by HCMC at the University of Victoria, so unless you are a project administrator or programmer, you should never need to run a build yourself. However, if you are writing or editing code for the project, you can consult the documentation on running builds to get detailed information on how to run complete builds on your computer, and also how to run partial, rapid builds to test specific processes or outcomes.

Running Builds

Introduction

The LEMDO repository has a complex build process that can be difficult to understand; it also takes quite a long time to do a complete build, so when developers or designers are working on the site, they need to be able to do rapid partial builds to see the results of their work. This documentation describes the software requirements for running builds, and outlines some of the strategies for rapid test builds that are available.

Software Requirements

This is a list of software that is required for running the various build processes. Some of it is actually stored in the repository, and some must be installed on the machine doing the build.

Software Included in the Repository

The following software is stored in the SVN repository, so does not need to be installed locally:
Saxon XSLT processor (saxon-he-10.jar).
Schematron library for Ant (ant-schematron-2010-04-14.jar).
The W3C HTML validator (vnu.jar).
The Jing RELAX NG validator (jing.jar).

Software to be Installed Locally

To run the various LEMDO build processes, you will need the following software installed on your machine. At present most of the build processes have to be run on *NIX systems because they depend on command-line utilities. If you are forced to use Windows, you’ll probably have to install the Windows Subsystem for Linux. For running specific components of the build, you may not need all of these applications or libraries.
Java.
Ant.
ant-contrib.
svn.
git.
zip.
dart-sass for building CSS (https://sass-lang.com/install).
texlive-full (a complete LaTeX install, needed for building PDF for print).

LEMDOʼs Build Processes

The LEMDO build processes are controlled by two Ant files, build.xml and build_globals_module.xml in the project root folder.
You run a complete build by running this at the command line, in the project root folder: ant.

A Quick Way to Validate All the XML

As part of a full build process, the XML documents from the data folder are copied over to an output folder (products/lemdo-dev/site/xml/source) and validated there, with RNG and Schematron. You can invoke this process without having to run the rest of the build by running: ant quickValidateSource.
This is a useful way to check that you havenʼt broken anything while doing a multiple-file search-and-replace or a similar global operation.

Practice: Run Partial Builds

The complete static build process takes a long time. If you’re working on fixing a build problem and you need to test your changes, it is obviously not practical to run the entire build process and wait to see the results. However, in most cases, you don’t need to. Here are a number of examples of how you can run only a small component of the build process to test specific changes.
Important note: In most cases, you must have an existing completed build in place before you can successfully run partial builds. That means that once in a while, you will need to run a complete local build for yourself. You can, of course, do that over lunch or overnight.
Once you have a full completed build available locally, you can start running only the part of the build that you are interested in. For example, if you are trying to work on a problem that relates to the generation of the Original XML, you might do this: ant createOriginalXml validateOriginalXml.
This will perform only those two steps, and you can then examine the results in the products/lemdo-dev/site/xml/original folder.
Similarly, if you are working on the XHTML generation, you could run: ant createXhtml.
To see a full list of the subtasks available, type ant and press the tab key twice. To see more info, including descriptions of each of the tasks, type: ant -p.
If you’re working on something more substantial that requires several steps, you can just chain them together as appropriate. Make sure you run them in the order they would normally run, because each process may depend on the output from a preceding process. You can discover the order by looking at the @depends attribute on the target named all.
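As a schematic illustration (not a verbatim copy of the build file), such a target looks something like this, with the @depends attribute recording the running order:
<target name="all"
  depends="createOriginalXml, validateOriginalXml, createStandaloneXml, createXhtml">
  <!-- The real dependency list is considerably longer. -->
</target>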

Practice: Process a Subset of Documents

Another useful approach to rapid building is to process only a specific subset of documents. For example, imagine that you are dealing with an HTML problem that affects lots of documents, but you know that one particular document (emdH5_F1.xml) exemplifies the issue, and can be used as a test. You can run this: ant createXhtml -DdocsToBuild=emdH5_F1.
This will run the part of the build that transforms the Standalone XML into HTML files, but it will only process a single document, making it very fast indeed; you can then inspect the changes to that specific document. To process more than one document, separate them with commas: ant createStandaloneXml -DdocsToBuild=emdH5_F1,emdH5_Q1.
You can even use a regular expression, so you could build all of the Henry V documents by running this: ant createStandaloneXml -DdocsToBuild=emdH5_.*.
Finally, there is a specific target named quick, which is designed to do the minimum processing to get from the source XML to the XHTML output (in other words, the most important stages in the build process). If you run: ant quick -DdocsToBuild=emdH5_F1,emdH5_Q1, you’ll pass those two documents through the entire process from source to HTML output, but the process should be relatively fast. Again, it’s important to remember that you must have a complete set of build products in place in your products/lemdo-dev/site folder before this will work properly.

Strategies for Building and Testing

The various strategies described above provide the basis for a programmer to work efficiently on solving a specific problem or adding a specific feature without having to wait for long periods to see the results of changes. If you triage the issue you’re working on carefully, you’ll be able to break it down into small steps and identify a specific subset of documents to use for testing; you can then develop and test your changes carefully, so that when you do commit changes to the repository, it’s much less likely that the full build will fail because of something you did.

Build and Customize an Anthology

In the LEMDO universe, an anthology is a collection of texts along with supporting information and documentation, presented through a web interface which is based on a set of defaults, but is customized through CSS styling, menu configuration, and so on.

The lemdo-dev Anthology

The lemdo-dev anthology is a special instance which includes all the texts that exist in the repository, whatever their state of development. It is never published at any formal URL, but is available through a continuous integration build server which enables everyone working on any LEMDO project to see the latest state of their own data. The process of building lemdo-dev also detects and reports errors, invalidities and other problems related to any texts in the collection.

Other Anthologies

All other anthologies exist as subfolders of the data/anthologies folder in the LEMDO repository. Each anthology is based primarily on these features (a skeletal example follows this list):
A subfolder in data/anthologies named for the id of the anthology (for example qme or dre).
A TEI corpus file (a TEI file whose root element is <teiCorpus> rather than <TEI>), with an @xml:id attribute identical to the anthology id. This file contains several key components:
A <teiHeader> element which holds the anthology-level metadata (the anthology editor(s), publication statements, etc.).
A hard-coded <TEI> element with @xml:id={anthId}_index. This element contains the content for the anthology home page, and its presence makes the anthologyʼs <teiCorpus> valid.
A sequence of import instructions in the form of Processing Instructions, looking like this: <?lemdo-import ref="emdFV_edition"?>. Each one of these points to the id of an edition file, whose contents are to be published in the anthology.
A collection of other XML files populating folders within the anthology folder, each of which consists of a TEI document that is to be turned into a web page in the rendered anthology site. These are all in the category ldtBornDigital, and they have filenames and IDs beginning with the anthology ID followed by an underscore. All such files found in the anthology folder will be converted into web pages. Note that these files are also transformed as part of the regular lemdo-dev build, so you can see them in their default state in that build; when the anthology itself is built, those files will be converted to the anthology rendering style, and will have their prefixes removed (so qme_index.html will become simply index.html in the products/qme/site output folder).
A site folder containing all the site-level customizations which apply to the anthology output, including template/sitePage.html, which will form the basic template for all pages in the site; this is where the banner, menu, footer, and other standard site components are defined. Other folders include images, for images needed on the site; fonts, for any specialized fonts that need to be included; and css, which contains one or more SCSS files that will be transformed to CSS using SASS and added to the output pages after the standard CSS inherited from the main lemdo-dev project, enabling a designer to override any display defaults. There should be one main SCSS file named {anthId}.scss which imports any other files; it will be built into {anthId}.css and linked into all the output files.
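Putting the corpus-file components together, a skeletal anthology configuration file might look like this (metadata abbreviated; the ids are taken from the examples above):
<teiCorpus xml:id="qme">
  <teiHeader>
    <!-- Anthology-level metadata: editor(s), publication statements, etc. -->
  </teiHeader>
  <TEI xml:id="qme_index">
    <!-- Header and text supplying the anthology home page content. -->
  </TEI>
  <?lemdo-import ref="emdFV_edition"?>
  <!-- One lemdo-import PI for each edition published in the anthology. -->
</teiCorpus>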

Build Process

The build process for anthologies is quite complex:
Before anything else happens, the main lemdo-dev build process must run. This processes all the texts and identifies problems that might need to be rectified before an anthology build can proceed.
Next, the main build process runs a diagnostic for each anthology that is in the repository, checking to see whether it is in a state where it can be successfully built. That process produces two outputs: an HTML file (products/{anthId}/anthologyStatus.html) which lists problems found, and a text file (products/{anthId}/anthologyStatus.txt) which simply says "OK" or "FAIL". The checking process is described below.
The anthology file itself is then checked for two things: the <revisionDesc>/@status for the anthology itself must be set to published, and so must the <revisionDesc>/@status attribute in the first embedded <TEI> element, which contains the home page content. If either of these is not published, the build does not proceed.
The anthology home page, which is the <TEI> element embedded in the anthology configuration corpus file, is processed into a page in a temporary folder.
Each item in the collection of site-page XML documents located in the anthology folder is first checked to make sure its <revisionDesc>/@status value is published.
Each LEMDO edition which is claimed by the anthology (using a lemdo-import processing instruction, as described above) is processed as follows:
The edition file itself is checked to ensure that it has a <licence> element permitting its inclusion in the anthology. This is an example of such a <licence> element:
<licence resp="pers:JENS1" from="2020-10-21" corresp="anth:dre"> This content is licensed for inclusion in the DRE anthology. </licence>
The text inside the <licence> element is informational and not required; the key components are a person taking responsibility for the licence, a starting date for it, and a pointer to the target anthology using the @corresp attribute and the anth prefix. The build process checks only that there is at least one valid value in @resp. It does not have any way to know whether the person concerned actually has the right to make such a declaration; that check is the responsibility of the anthology lead.
The <revisionDesc>/@status value for the edition file is checked to make sure it is set to published.
If the edition is allowed for inclusion and is published, it is then parsed, and each text linked from it is checked for the same licence permission and publication status.
At this point, we have a complete list of all the documents needed for the anthology, but they are all still in the vanilla lemdo-dev style. These documents are then processed into the output directory. During this process, all boilerplate components such as banners, menus, and footers are replaced with those from the anthologyʼs sitePage.html template file, and the anthologyʼs CSS and JS files are added into the headers of the files.
A post-build diagnostics process is now run on the built site to check that every link points to something which is where it is expected to be, and every file is valid. (Not yet implemented.)

Customizing Anthologies

To customize the functionality and appearance of your anthology, you will need a complete checkout of the repository, and you must be working on a computer which is set up to run the required software. Practically speaking, this means that you need either a Linux or a Mac operating system (Windows cannot run the required scripts), and you will need both curl and dart-sass installed and working from the command line. Of course you will also need an svn client in order to check out and commit to the repository, as well as a NetLink ID and permission to commit to the required folders. In addition, if you want to follow the recommended working method described below, you will need ant and ant-contrib installed.
There are three aspects of an anthology that can be customized:
Some components of the basic HTML framework used to create pages can be changed, by editing the file data/anthologies/{anthId}/template/sitePage.html. This file basically contains a banner/header, a site menu, and a footer; these are used to replace the generic versions that come from the lemdo-dev build pages.
The style/design/appearance of the site can be modified by editing the SCSS file lemdo/data/anthologies/{anthId}/site/css/{anthId}.scss. This file is compiled to create {anthId}.css, which is then placed in the css folder in the anthology build output and linked into the HTML files after the default lemdo-dev.css file, so that it can override any rules in that file (see the example after this list).
Site functionality can be customized by editing the file lemdo/data/anthologies/{anthId}/site/js/{anthId}.js, which is linked into the HTML files after any other JS, so it can override (for example) object methods and add new functionality.
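For example, a designer might add a rule like this to {anthId}.scss (the selector and value are purely illustrative):
/* Hypothetical override: because {anthId}.css is linked after the
   default lemdo-dev.css, this rule wins over the default styling. */
header {
  background-color: #3a5f7d;
}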

A Suggested Working Method for Customization Work

For developers or designers working on customization, we suggest the following working method.
First of all, if you are only working on the CSS, images, fonts, and/or the JavaScript, you do not need to run a complete site build yourself. You can download a pre-built version of the site from our Jenkins server to your local machine, then make changes and simply rebuild the CSS. Here is how (in the following, {anthId} means the id of the target anthology, such as dre or qme):
At the command line, in the lemdo folder, run the following command:
./getSiteFromJenkins.sh {anthId}
This will download a complete copy of the latest build of the anthology from the Jenkins server, and place it in the products folder inside the lemdo folder. The main site content appears in the site subfolder. You can view this content directly in a browser, but if you want to work on JavaScript interactivity, you may need to run a local web server in that folder so that browsers will permit all the code to run. Most developers will be familiar with this situation.
Now that you have a local copy, you can make changes to the CSS and JavaScript code in the lemdo data/anthologies/{anthId}/site folder, then run this command to update the local site and see your changes:
ant -f build_anthology.xml updateAnthology -Danthology.id={anthId}
This will copy all JS, images, and font files over to the local built site, and run SASS to build the SCSS file to create a CSS file in the css folder.
Once you are happy with the changes youʼve made, you can commit them to the svn repository (donʼt forget to svn add any new images or fonts files), and then Jenkins will rebuild the site in due course.
If you are editing the site template, then the process is much more complex, because every page will need to be rebuilt before your changes will appear. In that case, you will also need a complete local copy of the base lemdo-dev anthology. You can get that by running: ./getSiteFromJenkins.sh.
That will give you a copy of the lemdo-dev site, which is the source from which the anthology pages are built. After editing the anthology site template file in data/anthologies/{anthId}/template/sitePage.html, run this command to rebuild your entire anthology: ant -f build_anthology.xml -Danthology.id={anthId} -DbuildFailedAnthologyAnyway=true buildAnthologies.
This process may take a little time, depending on the size of your anthology.

Page and Document Titles

Here we lay out basic assumptions and encoding guidelines that lie behind what ends up right at the top of every page: the page title.

Basics

The anthology template files all include an XHTML <h2> element which appears as the first item in the <article> element. That <h2> is the trigger leading to the insertion of the page title(s).
We never take the page title from the content of the TEI <text> element. Our basic assumption is that the title appearing at the top of a page on the site is drawn from the content of the <titleStmt> in the <teiHeader>, according to these rules (illustrated below):
If there is a single <title> element with @type=main, then the contents of that element are processed into an XHTML <h2> element.
If there is no <title> with @type=main, but there is a single <title> element with no @type attribute, then the contents of that element are processed into an XHTML <h2> element.
If there is a <title> element with @type=sub, then that element is processed into an XHTML <h3> element.
All other <title> elements in the header <titleStmt> are ignored for the purposes of rendering the main heading on the page.
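For example, a <titleStmt> like this (the titles are invented for illustration):
<titleStmt>
  <title type="main">The Life of Henry the Fift</title>
  <title type="sub">A Semi-Diplomatic Transcription</title>
</titleStmt>
would produce an <h2> containing the main title, followed by an <h3> containing the subtitle.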

TEI <front> Elements

<front> elements should be used only in primary source documents, and should contain only genuine transcribed front matter from the source. Any <front> element in a primary source document will be processed and rendered following the main page title created according to the rules above.

Bylines and Other Front-Like Things in Born-Digital Documents

If you need to provide bylines, prefatory statements, or other such content at the head of the page in a born-digital or mixed-content document, put this information at the beginning of the <body>.

How the Table of Contents is Generated for Semi-Diplomatic Transcriptions

Semi-diplomatic transcriptions vary widely in their original form and in the encoding practices of their editors, but when a play is rendered for the end-user, a table of contents needs to be constructed to appear in the slide-out Content tab on the website. For a non-diplomatic text, the TOC is constructed from the text node of the <head> element of any <div> elements in the text which have a <head>, since these are obviously the major divisions in the text (usually act-scene numbers or scene numbers); for semi-diplomatic texts, however, there may well be no such obvious structure to draw on.
Therefore a rather complicated algorithm tries to decide what components of the text should best be used to create a useful TOC. This is how the process works:
By default, <pb> elements having a signature or folio number in their @n attribute will be used.
However, if the text contains 20 or more <label> elements, then these are assumed to be more suitable text-division headings, and will be used instead. (For an example, see the texts in the Douai Shakespeare Manuscript Project.)
If 20 or more <label> elements have @n attributes, then only the <label> elements having @n attributes will be used, and the text of the TOC entries will be taken from the @n attributes.
If 20 or more <label> elements exist, but fewer than 20 have @n attributes, then all <label> elements will be used; whenever a <label> has @n, its value will be used for the TOC entry text instead of the content of the label.
Why so complicated? While a TOC constructed from <pb> elements may be very straightforward, it is not very helpful to a general reader looking for the major sections of the text, and it may end up being extremely long, so <label> is usually a better choice if the text contains headings or similar markers which can be tagged as labels. However, the text content of a <label> element may not be very helpful in itself; it might look like this:
<label>Act<choice>
  <abbr>
    <g ref="g:us_Eng"></g>
  </abbr>
  <expan>us</expan>
</choice> jmus</label>
which may be puzzling to a reader. In such a case, the encoder can add the @n attribute to provide a more helpful label for the TOC:
<label n="Actus Primus">Act<choice>
  <abbr>
    <g ref="g:us_Eng"></g>
  </abbr>
  <expan>us</expan>
</choice> jmus</label>
and this will be used in preference to the textual content. The @n attribute can be used on all <label>s to create an entirely curated TOC if that is preferred.
LEMDOʼs long-term plan for most texts other than Douai texts is to mobilize the <milestone> element in order to note correspondences between places in the semi-diplomatic transcription and the modern text. Many semi-diplomatic transcriptions already contain commented-out <milestone> elements. When a modern text is finalized, we will be able to finalize the milestone elements in the semi-diplomatic transcription by adding a @corresp attribute whose value is the @xml:id of a scene or act <div> in the modern text. We will also add an @n attribute whose value will be used to generate a TOC of act-scene or scene beginnings. Ideally, users will be able to toggle between a signature TOC (A1r, A1v, A2r, A2v, etc.) and a milestone TOC.
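When that work is done, a finalized milestone might look something like this (a sketch only; the attribute values are invented):
<milestone unit="scene" n="1.2" corresp="doc:emdH5_M.xml#emdH5_M_1_2"/>
Here @corresp points to the @xml:id of the corresponding scene <div> in the modern text, and @n supplies the text for the TOC entry.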

Processing Pointers, Links, and References

Processing of References

LEMDO has thousands of <ref> elements that point to editions, anthologies, components therein, and to resources outside LEMDO. These are all processed into HTML hyperlinks for the static site.
Our processing does not do anything with <ref> elements in PDFs. Any pointing you do inside a portfolio with the <ref> element will result in nothing in a PDF.
The <ref> and <ptr> elements can co-exist because they are pointing to xml:ids, but only the <ptr> element can be converted to strings at build time in LaTeX.

Example: Processing of a Reference in a PDF

Encoding:
<note type="gloss">Monarchs (<ref type="bibl" target="bibl:OEDT2">
  <title level="m">OED</title>
</ref>
  <term>prince</term>, n. 1.).</note>
Rendering in the PDF:
princes Monarchs (OED prince, n. 1.). The Chorus … emdH5_FM_annotations line 95

Processing of Pointers

We would like to have a canonically-structured textual reference in the output. In the digital edition, we want the A.S.Sp. system (e.g., 5.1.2) plus a precise hyperlink. In the print edition, we want the A.S.L. reference system (e.g., 2.3.101). We do not want the author of the critical text to write a literal 2.3.101 into their document, because lineation may change as the text is edited; but we do want a critical introduction to be able to contain A.S.L. citations when it is printed. The actual text in the parenthetical citation must be generated at build time.
LEMDO therefore has two different processing chains for pointers: one for the digital edition and one for the print edition (a camera-ready PDF that can be downloaded or printed through print-on-demand services).
For online publication, we generate a parenthetical citation that gives LEMDOʼs canonical reference system (A.S.Sp.). Clicking on the citation takes one directly to the part of the speech being cited.
For the PDF, we generate a parenthetical citation that gives A.S.L. (act, scene, line number) using the line numbers generated at the time we make the PDF.
For example, an editor might use a <ptr> element in their critical introduction to point to anchors in the middle of a long speech in their modern text. In the processing for the PDF, LEMDO will calculate and supply the A.S.L., so that the reader may find the exact line(s) being cited in the generated parenthetical citation. For the online version, the parenthetical citation will be A.S.Sp., but the hyperlink on the citation will go directly to the target point in the speech.

Citing Published Texts

One approach to the mutability of online texts produced both within and outside LEMDO is to choose a specific print edition of source texts (such as the New Oxford Shakespeare) to which all references can point. This has the obvious disadvantage that such a pointer cannot be made into any kind of link, but links are fragile anyway, and this approach also fits with the numerous citations of critical print texts which occur throughout critical material.

Pointing within Portfolios

We have seen above that certain types of citation between texts are not very robust, because texts are (for the foreseeable future, anyway) steadily evolving, and even our principles for lineation and our citation styles are not absolutely finalized. However, when youʼre editing critical material that will be bundled with an edition of a text in (for example) a print volume, you need to be able to point into the text, just as you need to be able to attach annotations to specific points in the text. There are two scenarios in which we do this:

Pinning Annotations

Annotations are <note> elements (documented in detail in Writing Annotations) which live in separate files outside the plays to which they refer. At build time, annotations may be rendered as links with popups, or as footnotes in a print edition.
Annotations are pinned to a location in the text using a system of anchors and pointers which is documented in Writing Annotations.

Local Pointers

Although we know that pointing between electronic texts which are in constant development is inherently fragile, there are situations in which we need to be able to create a canonically-formatted text reference in a critical text to a specific point in the text which is being discussed. If these texts are in the same portfolio, then we know they will be built at the same time, and therefore any output created will be consistent across that build.
This enables us to solve the particular problem noted above, where we would like to have a canonically-structured textual reference such as 2.3.45 appearing in the critical text, and this is particularly important for the print texts that we are going to publish. We donʼt want to have the author of the critical text write a literal 2.3.45 into their document, because lineation may change as the text is edited, but we do want a critical introduction to be able to contain such text when it is printed; therefore the actual text must be generated at build time. We do this using a <ptr> element with @type=localCit:
<div><!-- In the critical text: -->
  <p><!-- ... --> the king addresses the <quote>noble English</quote> (<ptr type="localCit" target="doc:emdH5_FM.xml#emdH5_FM_anc_2000"/>) separately <!-- ... --></p>
  <!-- In the play text: -->
  <l>Oh, <anchor xml:id="emdH5_FM_anc_2000"/>noble English,</l>
</div>
At build time, this will be expanded to (for example) (1.2.111). You will notice that we use the same mechanism for creating a point in the text that can be addressed as we do for annotations: we insert an anchor (see Writing Annotations for instructions on how to do that). To specify a range, include pointers to two anchors with a space between them:
<ptr type="localCit" target="doc:emdH5_FM.xml#emdH5_FM_anc_2000 doc:emdH5_FM.xml#emdH5_FM_anc_2001"/>
If youʼre pointing at an entire line, speech, scene, or act, then thereʼs no need to insert two anchors. You can instead add an @xml:id to the target element (<l>, <sp>, or <div>), if there isnʼt one there already, and point to that instead. To create a new @xml:id, the simplest way is to insert an anchor element in the usual way, then take its @xml:id, which is guaranteed to be unique, and use that, discarding the rest of the <anchor>.

Inline Processing Instructions in LEMDO

This documentation lists and explains a number of different custom processing instructions used by the LEMDO project to include content from elsewhere, and to trigger the automatic generation of content.

Simple Inclusions

LEMDO prefers the use of processing instructions for the purposes of inclusion over other methods such as XInclude because they are more flexible: processing for XPointers in XInclude instructions is not widely supported, and some processors and validators may act upon XInclude instructions when theyʼre not intended to be processed. There are two PI-based inclusions in LEMDO: the lemdo-import PI used in anthology corpus files (described above), and the lemdo-include PI: <?lemdo-include href="doc:learn_encodeLinks_intro"?>
This lemdo-include PI is used in the lemdo.odd file to assemble the separate documentation files found in the data/documentation folder into a single structured document before that is processed into the documentation web pages. This PI should not be used outside of the ODD file. See Documentation and the ODD File for more information.

Generating Content

Another set of processing instructions provides a way to generate content in an output page based on metadata elsewhere in the project. Here are three examples: <?taxonomy-table ref="emdAudiences"?>
This tells the processor to find the <taxonomy> element in TAXO1 whose @xml:id=emdAudiences, and process it to create a TEI <table> element laying out all the categories and their definitions. That table is later processed into an HTML table in the documentation page for the site.
<?charDecl-table ref="characters"?>
This tells the build process to generate a table from character declarations (<charDecl> elements) in TAXO1.
<?listPrefixDef-table ref="global_listPrefixDef"?>
This generates a table from a <listPrefixDef>, also in TAXO1.
Processing for these PIs is specified in two places: first in the documentation_inclusion_master.xsl file, which handles the majority of cases, since these PIs occur mainly in documentation; and also in the xml_original_templates.xsl module, to handle any cases in which a PI is used in a page which is not part of the documentation. These templates should be updated in a synchronized way.
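In outline, the processing for such a PI is a template matching the processing instruction by name, along these lines (a sketch, not the actual project code):
<!-- Sketch: handle the taxonomy-table PI by extracting its ref
     pseudo-attribute and building a TEI table from the taxonomy. -->
<xsl:template match="processing-instruction('taxonomy-table')">
  <xsl:variable name="taxId"
    select="substring-before(substring-after(., 'ref=&quot;'), '&quot;')"/>
  <!-- ...locate the taxonomy whose @xml:id equals $taxId and emit a <table>... -->
</xsl:template>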

The staticSearch Build Process in LEMDO

This documentation explains how the staticSearch codebase is stored, updated, and used in build processes.

Introduction

LEMDO sites are static sites conformant with Endings principles, so they use the Endings-developed staticSearch system to provide search functionality for individual anthologies. Rather than check out a fresh copy of the staticSearch codebase every time we build something, we store a static copy of the codebase in our repository in code/search/staticSearch. This should be an up-to-date stable version of the codebase.

How staticSearch Is Used at Build Time

When any anthology is built, the staticSearch codebase is copied into a staticSearch folder in the anthologyʼs products folder. We could run the staticSearch indexing for all anthologies directly from the code/search/staticSearch folder, but making a copy enables us to test alternative versions of staticSearch on a single anthology if we need to.
For details on how staticSearch itself works, see the staticSearch documentation.

Keeping the staticSearch Codebase Updated

The file build_globals_module.xml contains a property called staticSearchBranch which specifies which branch we want to use for our staticSearch codebase copy. It should normally be set to a stable release branch, unless we are doing some unusual testing. Release branches are updated periodically, for bugfixes and minor enhancements, so there is also an Ant task in the same file called refreshStaticSearchCode, which will update the files in the code/search/staticSearch folder automatically. After running the update (ant -f build_globals_module.xml refreshStaticSearchCode), check the svn status of the code/search/staticSearch folder to see whether there are any new files that need to be added, or perhaps files that need to be deleted.
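The property itself is an ordinary Ant property in build_globals_module.xml, along these lines (the branch name shown is a placeholder, not the current value):
<property name="staticSearchBranch" value="release-x.y"/>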

Witness List Processing

This documentation describes the process by which witness lists (TEI <listWit> and <witness> elements) are processed into output.

Witness Lists in the Source XML

The encoder documentation provides detailed information on how witnesses should be encoded. A <listWit> appears in the <sourceDesc> of a collation file, accompanying the apparatus list which is encoded in the body of the document. Each <witness> element represents a single source text which was used by the editor in creating the collation.
A Schematron rule constrains the <witness> element such that it either has @corresp and is itself empty, or has a tagged prose description of the witness and does not have @corresp. The first scenario is used when the BIBL1 entry pointed at by @corresp provides sufficient information and no further explanation is needed. In the second, the editor provides a more elaborate prose description, but is expected to link some of its contents to one or more entries in BIBL1.
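A witness list might therefore contain both forms, along these lines (the ids and descriptions are invented for the example):
<listWit>
  <!-- Scenario 1: an empty witness; the BIBL1 entry supplies the description. -->
  <witness xml:id="emdExample_wit_F" corresp="bibl:exampleFolio"/>
  <!-- Scenario 2: a tagged prose description, with no @corresp. -->
  <witness xml:id="emdExample_wit_Q">The first quarto (<ref type="bibl"
    target="bibl:exampleQuarto"><title level="m">Q1</title></ref>), from which
    press variants were collated.</witness>
</listWit>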

Processing of Witness Lists

Witness lists are part of collations, and are stored in collation files, but the build process that creates a fully-realized text combines the text of the work with the collations to produce an integrated whole. During the generation of standalone XML, the <listWit> is first copied into the play source file (always a modern edition). At that point, if a <witness> element is empty and has @corresp, the target <bibl> element is copied into the content of the witness (minus its @xml:id), and the @corresp attribute is removed.
At the HTML stage, the witness list is processed into a sequence of <div> elements in the appendix of the document, along with the bibliography items, person elements and so on. These elements are hidden by CSS and shown only in response to user actions such as clicking on a linked name. Apparatus elements are also part of the appendix. When a user clicks on a collation icon, the relevant apparatus item appears in a popup. In that popup, each collation siglum is a link to its witness element, and clicking on that link causes the JavaScript to retrieve the witness info and append it to the bottom of the popup. Thus the detailed witness info is always available from any apparatus popup.

Editor Tools

LEMDO has created a few tools to make your encoding work easier. This documentation will guide you through using our file templates and transformations. Another useful tool (keyboard shortcuts) is documented in Keyboard Shortcuts and Special Characters.

Prior Reading

This documentation assumes that you are familiar with how to use Oxygen and have read LEMDO Oxygen Project and Work in Oxygen.

Templates

You can use LEMDOʼs file templates when creating new files for your edition. These files are created and maintained by the LEMDO Team and provide you with metadata, basic file structure, necessary elements, and helpful information and documentation links for the type of file that you are creating. For example, our critical paratext template gives the metadata required for critical paratexts, sample <div> and <p> elements, and sample block quotes (using the <cit> and <quote> elements).
To create a file using a template, follow these steps:
At the top of your Oxygen window, click File and then select New from the drop-down menu.
In the window that pops up, scroll down to the Framework templates folder. Click on the LEMDO subfolder. This will show you a list of the templates that we have created.
Select the template that you wish to use.
At the bottom of the New file window, select Save as. If you know the path where you wish to save your file, you can type it into the available field (e.g., lemdo/data/texts/{your edition abbreviation}/{the appropriate folder}). Otherwise, click on the folder icon to the right of the text field and browse for the correct directory. Name your file according to LEMDOʼs naming conventions.
Click Create.
Follow the instructions outlined in your newly created file. We use XML comments liberally in template files to provide you with instructions and helpful tips. You may delete comments as you complete the tasks therein.

Transformations

In addition to making templates to create new files, LEMDO has written XSLTs (eXtensible Stylesheet Language Transformations) to help you complete encoding tasks. Some are designed to create a new file from an existing one (e.g., our transformation to create a baseline modernized text from semi-diplomatic transcriptions), while some are simply meant to complete repetitive tasks (e.g., our transformation to number <lb> elements with @type="wln" in semi-diplomatic transcriptions). Regardless, these transformations are meant to save you time and effort so that you can focus on other editorial tasks.

Step-By-Step: Run Transformations on Your Files

Running transformations is generally fairly straightforward. Follow these steps:
In Oxygenʼs project view, find the file that you wish to run a transformation on. Right-click that file.
Hover your mouse over Transform.
Select Transform with…
Scroll down the list to find the transformation that you are interested in. Select that transformation.
Click Apply selected scenarios (1). If there is a number greater than 1 in the parentheses on that button, your file likely has other associated transformations. Generally, we do not want this. Unselect any transformations that you do not want to run before clicking to apply the selected scenarios.
Open the file that you have run a transformation on. Check that the transformation has worked.
Validate your file.
Commit your file.

Example: Number Lines Using a Transformation

This example will show the process for running a transformation. It will number <lb> elements with a @type value of wln in the file emd1H4_F1.xml.
The first step is to right click on the file in Oxygenʼs project view:

[Screenshot: the project view pane with data/texts/1H4/main open.]
Here, we want to transform emd1H4_F1.xml, which lives in data/texts/1H4/main.
Next, hover over Transform and select Transform with…:

[Screenshot: the Transform submenu, listing Apply Transformation Scenarios, Configure Transformation Scenarios, and Transform With….]
Note that we generally do not need to configure transformation scenarios for specific files: doing so permanently associates a specific transformation with the file that you are working on. Most of the time, we use a transformation only once on a file, and we do not want the same transformation applied repeatedly in the long term.
When you click Transform with…, a window will open allowing you to select the appropriate transformation:

[Screenshot: the Transform With window with lemdo_number_wlns_lb_in_semi-dip selected.]
In this case, we want to number line beginnings in a semi-diplomatic transcription, so we will select lemdo_number_wlns_lb_in_semi-dip. If you are uncertain which transformation to use, or you want us to add a new transformation to our list, please email lemdotech@uvic.ca.
After clicking the Apply selected scenarios button, we open the file to check that the transformation has worked as expected:

[Screenshot: an Oxygen file open to show numbered <lb> elements.]
The <lb> elements with @type="wln" now have consecutively numbered @n attributes: the transformation has worked as expected.
As always, the last step in Oxygen is to validate the file.

Print Editions

LEMDO publishes print editions of some of its modern texts. This section of the documentation is intended to cover how those print editions are generated, and is aimed primarily at programmers, since itʼs unlikely that anyone other than programmers will venture into this part of the codebase.

Requirements for Building Print Editions

The print editions are generated using LaTeX, specifically the XeLaTeX compiler, so anyone wanting to generate a print edition will need not only Ant and ant-contrib but also the (substantial) LaTeX codebase installed. On Linux, we recommend installing the texlive-full package, which should include everything you need. On Mac, you can use the MacTeX distribution. You can also install TeX Live on Windows, but we do not expect most of our build processes to work on Windows for a variety of reasons; *NIX-based OSes are a much better bet. The distributions are large, so donʼt install this stuff just for fun; only do it if you need to build PDFs for print.
The PDF build file includes a quick check for the availability of the XeLaTeX compiler, so starting from the repository root, you can do this:
Change directories into the PDF build directory:
cd code/pdf
Run the check task:
ant checkForXelatex
If this works, youʼre probably OK, although itʼs always possible that a particular package required by the build process is not installed. If thatʼs the case, when you try to run a build, you should see helpful error messages from LaTeX.
All other requirements (specifically, the fonts used to build the PDF) should be in the repository.

Codebase for Building Print Editions

As you might expect, the codebase for building a print edition lives in code/pdf. It is basically very simple:
build.xml, the Ant build file.
Several XSLT files, in the xsl folder, of which the root file is latex_master.xsl. These files are inadequately organized at the time of writing, because they have developed as part of a learning process; when there is time, they will be reorganized. The content should be well-commented, though.
A fonts folder, in which there are two open-source fonts, Vollkorn and Josefin-Sans. These are configured respectively as the main font and the sans font for the PDF build.
A README file and a TODO file, which are essentially ad-hoc notes.

How to Build a Print Edition

Once you have ensured that your system is set up with all the requirements, and you have set up your TEI print edition document, youʼre ready to try a build. Starting from the LEMDO repository root, this is what you do:
Change directories into the PDF build directory:
cd code/pdf
Run the build process, supplying the id of the document you want to build:
ant -DdocsToBuild=emdOth_Print
(You can supply multiple IDs, comma-separated, if you want to.)
The build process will create a folder called pdf inside the main folder of the text you are building. In there, a number of files will be saved, including a log file, the .tex file containing the LaTeX code which is generated and then compiled, and the PDF file of the print edition. If anything goes wrong, you should see either helpful messages from our code or mysterious messages from the LaTeX compiler.
During the build process you will see many very puzzling emanations such as the common Underfull \hbox (badness 1033) message from the compiler. These are mostly informational, warning you when the layout engine has had to stretch or squash a line a little more than it would like to in order to get the justification to work. However, if the build actually fails, you will need to pay attention to whatever message coincides with the termination of the build.
You will notice that the XeLaTeX compiler is run at least four times on the .tex file. This is because at each run, the layout engine is able to do a slightly better job of adjusting spacing, pagination, and so on; but every time it does this, page references and similar content generated during the previous run are potentially no longer accurate, so another run is recommended. The number of runs required to get a final version is not easy to determine, so we run four times by default, but this may need to be adjusted.

Print Edition XML Files

A LEMDO print edition is established by creating a standard LEMDO TEI file, but with a filename ending in _Print.xml. This should be created in the main folder of the work itself. So for example, if youʼre creating a print edition of Othello, you would create this file: data/texts/Oth/main/emdOth_Print.xml
This file is like any other TEI text in the project; it has metadata, responsibility statements and so on. But it will mostly consist of content from other parts of the work folder. Primarily, it will include a modern-spelling edition of the play, but it will also have other components such as critical materials and a bibliography. The following is a simplified example which will be explained below, and should cover all the main components.
<TEI version="5.0" xml:id="emdOthX_Print">
  <teiHeader> [Normal metadata should appear here...] <textClass>
    <catRef scheme="tax:emdDocumentTypes" target="cat:ldtBornDigPrint"/>
  </textClass> [More metadata...] </teiHeader>
  <text>
    <text>
      <front>
        <linkGrp type="printInclude">
          <ptr target="doc:emdOth_TextIntro"/>
        </linkGrp>
      </front>
      <body>
        <div>
          <linkGrp type="printInclude">
            <ptr target="doc:emdOth_M#emdOth_M_castList"/>
            <ptr target="doc:emdOth_M"/>
          </linkGrp>
        </div>
      </body>
      <back>
        <div>
          <divGen type="bibliography"/>
          <linkGrp type="printInclude">
            <ptr target="doc:emdOth_M_annotation"/>
          </linkGrp>
        </div>
      </back>
    </text>
  </text>
</TEI>
The header is a normal header except for the particular document type specified with <catRef>/@target="cat:ldtBornDigPrint".
But in <text>, the first thing you will notice is that the <front>, <body>, and <back> elements do not include any content directly. They can include content if necessary, and there may well be components that are intended to be used only for one particular print edition, and therefore belong directly in this file, but most content is in the form of <linkGrp> elements containing pointers. These pointers specify other files in the repository, or sections of files; they use the doc: prefix to point to the ids of files, and an optional fragment identifier to point to a specific part of the file. These includes are processed by the first stage of the build code to create a complete TEI file incorporating all the components; that complete file is then processed into LaTeX.
Notice the organization: critical materials come in the <front> element, the cast list and the play itself come in the <body>, and the bibliography (of which more below) appears in the <back>. Also in the <back> are any annotation files which are needed; these are processed into footnotes at build time. Only annotations which are actually referenced in the included texts will be used; the rest are discarded.
Finally, note the special case of the <divGen> element for the bibliography. This is acted on by the build code, which retrieves from BIBL1.xml all bibliography items which are actually mentioned in the rest of the content, and generates a bibliography from them automatically. Note that if there is a reference to an item which does not appear in BIBL1.xml, the PDF build will fail and stop.

Prosopography

Janelle Jenstad

Janelle Jenstad is a Professor of English at the University of Victoria, Director of The Map of Early Modern London, and Director of Linked Early Modern Drama Online. With Jennifer Roberts-Smith and Mark Kaethler, she co-edited Shakespeare’s Language in Digital Media: Old Words, New Tools (Routledge). She has edited John Stow’s A Survey of London (1598 text) for MoEML and is currently editing The Merchant of Venice (with Stephen Wittek) and Heywood’s 2 If You Know Not Me You Know Nobody for DRE. Her articles have appeared in Digital Humanities Quarterly, Elizabethan Theatre, Early Modern Literary Studies, Shakespeare Bulletin, Renaissance and Reformation, and The Journal of Medieval and Early Modern Studies. She contributed chapters to Approaches to Teaching Othello (MLA); Teaching Early Modern Literature from the Archives (MLA); Institutional Culture in Early Modern England (Brill); Shakespeare, Language, and the Stage (Arden); Performing Maternity in Early Modern England (Ashgate); New Directions in the Geohumanities (Routledge); Early Modern Studies and the Digital Turn (Iter); Placing Names: Enriching and Integrating Gazetteers (Indiana); Making Things and Drawing Boundaries (Minnesota); Rethinking Shakespeare Source Study: Audiences, Authors, and Digital Technologies (Routledge); and Civic Performance: Pageantry and Entertainments in Early Modern London (Routledge). For more details, see janellejenstad.com.

Joey Takeda

Joey Takeda is LEMDO’s Consulting Programmer and Designer, a role he assumed in 2020 after three years as the Lead Developer on LEMDO.

Kate LeBere

Project Manager, 2020–2021. Assistant Project Manager, 2019–2020. Textual Remediator and Encoder, 2019–2021. Kate LeBere completed her BA (Hons.) in History and English at the University of Victoria in 2020. During her degree she published papers in The Corvette (2018), The Albatross (2019), and PLVS VLTRA (2020) and presented at the English Undergraduate Conference (2019), Qualicum History Conference (2020), and the Digital Humanities Summer Institute’s Project Management in the Humanities Conference (2021). While her primary research focus was sixteenth and seventeenth century England, she completed her honours thesis on Soviet ballet during the Russian Cultural Revolution. She is currently a student at the University of British Columbia’s iSchool, working on her master’s in library and information science.

Martin Holmes

Martin Holmes has worked as a developer in UVicʼs Humanities Computing and Media Centre for over two decades, and has been involved with dozens of Digital Humanities projects. He has served on the TEI Technical Council and as Managing Editor of the Journal of the TEI. He took over from Joey Takeda as lead developer on LEMDO in 2020. He is a collaborator on the SSHRC Partnership Grant led by Janelle Jenstad.

Navarra Houldin

Project manager 2022–present. Textual remediator 2021–present. Navarra Houldin (they/them) completed their BA in History and Spanish at the University of Victoria in 2022. During their degree, they worked as a teaching assistant with the University of Victoriaʼs Department of Hispanic and Italian Studies. Their primary research was on gender and sexuality in early modern Europe and Latin America.

Tracey El Hajj

Junior Programmer 2019–2020. Research Associate 2020–2021. Tracey received her PhD from the Department of English at the University of Victoria in the field of Science and Technology Studies. Her research focuses on the algorhythmics of networked communications. She was a 2019–2020 President’s Fellow in Research-Enriched Teaching at UVic, where she taught an advanced course on Artificial Intelligence and Everyday Life. Tracey was also a member of the Map of Early Modern London team, between 2018 and 2021. Between 2020 and 2021, she was a fellow in residence at the Praxis Studio for Comparative Media Studies, where she investigated the relationships between artificial intelligence, creativity, health, and justice. As of July 2021, Tracey has moved into the alt-ac world for a term position, while also teaching in the English Department at the University of Victoria.

Bibliography

OED: The Oxford English Dictionary. 2nd ed. Oxford: Oxford University Press, 1989.
