This chapter of our documentation is still in beta. We welcome feedback, corrections,
and questions while we finalize the page in our 2024–2025 work cycle.
This document lays out basic programming practices, style rules, and naming conventions
for programmers working on the XSLT, HTML, JavaScript, and CSS in the LEMDO project.
We do not yet conform to these principles, but we are committed to doing so in time.
¶ Names for Variables, Classes, and Other Identifiers
All LEMDO programming code identifiers and file names should be constrained to the
ASCII range.
All LEMDO identifiers should be as long as necessary to describe their meaning or function unambiguously, with information going from general to specific. Examples: note-editorial, title-monograph. This makes it easier to sort, search for, and compare items that belong in groups.
Generally speaking, the following should all be lower-case with components separated
by dashes:
CSS filenames (including SCSS source files).
JavaScript filenames.
Image filenames (for site chrome etc.; facsimiles etc. have their own naming conventions).
HTML custom attribute names (data-*).
Note that HTML data-* attribute values will often be derived from TEI element names and other similar constructs, so these
may take whatever form is dictated by their source: <div data-el="persName">.
ECMAScript practices are different:
ECMAScript variable and function names should be lower camelCase (function getSomething()).
ECMAScript constants should be UPPERCASE_WITH_UNDERSCORES.
ECMAScript class names should be upper CamelCase (class NotePopup {...}).
Identifiers should be as descriptive as possible, with information going from general
to specific, so: noteEditorial, titleMonograph.
Wherever possible, prefer HTML5 Semantic Elements over generic elements. Examples would be nav, section, main, article, header, footer.
The only elements requiring id attributes are those which need to be pointed at, linked to, or manipulated by ECMAScript.
To identify a category of element that needs to be styled (for example), prefer the
use of data-* attributes derived from the source TEI wherever possible. So:
<span data-el="persName">
<span data-el="speaker">
<div data-el="ab">
This provides a solid link between the underlying TEI encoding and the HTML, which then propagates easily into the CSS, making maintenance and debugging easier.
Reserve the use of style and class attributes for the propagation of primary source style description from the source TEI into the HTML. Examples:
<span data-el="stage" style="font-size: 80%;">
<span data-el="speaker" class="rnd_gothic">
Here the source element name is identified using the data-el attribute, so default house styling can be applied uniformly if necessary, but the
appearance from the original source text is captured in a local style attribute or a class.
Following on from the principles above, site-level CSS (as opposed to text-level CSS
derived from the TEI encoding of primary source features) should avoid the use of
class attributes wherever possible. To select an element for styling:
First prefer its semantic tag name (header, footer, aside).
Next, use TEI-derived data-* attributes (<span data-el="fw" data-type="catch">).
If that is not enough, fall back on a selector based on position in the hierarchy
or a semantic feature of an ancestor (q q{...}, div[data-el="speech"] aside).
XSLT files are named in lower-case with underscores, and use the extension .xsl.
All root (master) XSLT file names end in _master.xsl.
All modules which are included or imported into master files have names ending in
_module.xsl, and are stored in a modules subfolder alongside the master file(s) that use them.
XSLT files are documented using elements in the XSLT Stylesheet Documentation Namespace, as supported by the Oxygen XML Editor. They are also validated using project-specific
custom Schematron in Oxygen, ensuring adherence to basic good practices and style
rules.
Variable and function names should use lower camelCase.
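As a minimal sketch (the parameter and template names are invented for illustration and are not taken from the LEMDO codebase), a documented component in an XSLT module might look like this:
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
  exclude-result-prefixes="#all">

  <xd:doc>
    <xd:desc>Folder into which output files are written.</xd:desc>
  </xd:doc>
  <xsl:param name="outputFolder" as="xs:string" select="'site'"/>

  <xd:doc>
    <xd:desc>Adds a data-el attribute to the current HTML output element,
      derived from the name of the TEI source element.</xd:desc>
    <xd:param name="sourceEl">The TEI element being processed.</xd:param>
  </xd:doc>
  <xsl:template name="makeDataEl">
    <xsl:param name="sourceEl" as="element()"/>
    <xsl:attribute name="data-el" select="local-name($sourceEl)"/>
  </xsl:template>

</xsl:stylesheet>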
A static build is a process which takes a stack of input documents such as TEI files, along with
related resources such as images, and creates a website from them. It does this in
such a way that the resulting website has no dependency or requirement for any back-end
engine such as a database or a PHP processor; all the content consists of static HTML
and related files, and the site can be hosted on any webserver just by copying the
files up to the server. This process, championed by UVicʼs Project Endings, is described in detail in the many presentations and publications of the Endings team. The great value of a static site is that it requires no ongoing
maintenance, is easy to replicate and archive, and has a good chance of surviving
for decades without much attention.
The LEMDO project and its related sites are all built using a Jenkins Continuous Integration
Server run by HCMC at the University of Victoria, so unless you are a project administrator
or programmer, you should never need to run a build yourself. However, if you are
writing or editing code for the project, you can consult the documentation on running builds to get detailed information on how to run complete builds on your computer, and also
how to run partial, rapid builds to test specific processes or outcomes.
The LEMDO repository has a complex build process that can be difficult to understand;
it also takes quite a long time to do a complete build, so when developers or designers are working on the site, they need to be able to do rapid partial builds to see the results of their work. This documentation describes the requirements for running builds, and outlines some of the strategies for rapid test-builds that are available.
This is a list of software that is required for running the various build processes.
Some of it is actually stored in the repository, and some must be installed on the
machine doing the build.
To run the various LEMDO build processes, you will need the following software to
be installed on your machine. At present most of the build processes have to be run
on *NIX systems because they depend on command-line utilities. If you are forced to
use Windows, you’ll probably have to install the Windows Subsystem for Linux. For
running specific components of the build, you may not need all of these applications or libraries.
As part of a full build process, the XML documents from the data folder are copied over to an output folder (products/lemdo-dev/site/xml/source) and validated there, with RNG and Schematron. You can invoke this process without
having to run the rest of the build by running: ant quickValidateSource.
This is a useful way to check that you havenʼt broken anything while doing a multiple-file
search-and-replace or a similar global operation.
The complete static build process takes a long time. If you’re working on fixing a
build problem and you need to test your changes, it is obviously not practical to
run the entire build process and wait to see the results. However, in most cases,
you don’t need to. Here are a number of examples of how you can run only a small component
of the build process to test specific changes.
Important note: In most cases, you must have an existing completed build in place before you can
successfully run partial builds. That means that once in a while, you will need to
run a complete local build for yourself. You can, of course, do that over lunch or
overnight.
Once you have a full completed build available locally, you can start running only
the part of the build that you are interested in. For example, if you are trying to
work on a problem that relates to the generation of the Original XML, you might do this: ant createOriginalXml validateOriginalXml.
This will perform only those two steps, and you can then examine the results in the
products/lemdo-dev/site/xml/original folder.
Similarly, if you are working on the XHTML generation, you could run: ant createXhtml.
To see a full list of the subtasks available, type ant and press the tab key twice. To see more info, including descriptions of each of
the tasks, type: ant -p.
If you’re working on something more substantial that requires several steps, you can
just chain them together as appropriate. Make sure you run them in the order they
would normally run, because each process may depend on the output from a preceding
process. You can discover the order by looking at the
@depends attribute on the target named all.
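Purely as an illustration (the real target in the build file will list more steps, and possibly in a different order), the declaration you would consult looks something like this:
<target name="all"
        depends="createOriginalXml, validateOriginalXml, createStandaloneXml, createXhtml"
        description="Runs the complete build process."/>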
Another useful approach to rapid building is to process only a specific subset of
documents. For example, imagine that you are dealing with an HTML problem that affects
lots of documents, but you know that one particular document (emdH5_F1.xml) exemplifies the issue, and can be used as a test. You can run this: ant createXhtml -DdocsToBuild=emdH5_F1.
This will run the part of the build that transforms the Standalone XML into HTML files, but it will only process a single document, making it very fast
indeed; you can then inspect the changes to that specific document. To process more
than one document, separate them with commas: ant createStandaloneXml -DdocsToBuild=emdH5_F1,emdH5_Q1.
You can even use a regular expression, so you could build all of the Henry V documents
by running this: ant createStandaloneXml -DdocsToBuild=emdH5_.*.
Finally, there is a specific target named quick, which is designed to do the minimum processing to get from the source XML to the
XHTML output (in other words, the most important stages in the build process). If
you run: ant quick -DdocsToBuild=emdH5_F1,emdH5_Q1, you’ll pass those two documents through the entire process from source to HTML output,
but the process should be relatively fast. Again, it’s important to remember that
you must have a complete set of build products in place in your products/lemdo-dev/site folder before this will work properly.
The various strategies described above provide the basis for a programmer to work
efficiently on solving a specific problem or adding a specific feature without having
to wait for long periods to see the results of changes. If you triage the issue you’re
working on carefully, you’ll be able to break it down into small steps, and identify
a specific subset of documents which can be used for testing, then develop and test
your changes carefully, so that when you do commit changes to the repository, it’s
much less likely that the full build will fail because of something you did.
In the LEMDO universe, an anthology is a collection of texts along with supporting
information and documentation, presented through a web interface which is based on
a set of defaults, but is customized through CSS styling, menu configuration, and
so on.
The lemdo-dev anthology is a special instance which includes all the texts that exist in the repository,
whatever their state of development. It is never published at any formal URL, but
is available through a continuous integration build server which enables everyone
working on any LEMDO project to see the latest state of their own data. The process
of building lemdo-dev also detects and reports errors, invalidities and other problems related to any texts
in the collection.
All other anthologies exist as subfolders of the data/anthologies folder in the LEMDO repository. Each anthology is based primarily on these features:
A subfolder in data/anthologies named for the id of the anthology (for example qme or dre).
A TEI Corpus file (a TEI file whose root element is <teiCorpus> rather than <TEI>), with an @xml:id attribute identical to the anthology id. This file contains several key components:
A <teiHeader> element which holds the anthology-level metadata (the anthology editor(s), publication statements, etc.).
A hard-coded <TEI> element with @xml:id={anthId}_index. This embedded element contains the content for the anthology home page, and its presence makes the anthologyʼs <teiCorpus> valid.
A sequence of import instructions in the form of processing instructions, looking like this: <?lemdo-import ref="emdFV_edition"?>. Each one of these points to the id of an edition file whose contents are to be published in the anthology. (A skeletal example of such a corpus file appears after this list.)
A collection of other XML files populating folders within the anthology folder, each
of which consists of a TEI document that is to be turned into a web page in the rendered
anthology site. These are all in the category ldtBornDigital, and they have filenames and IDs beginning with the anthology ID followed by an underscore.
All such files found in the anthology folder will be converted into web pages. Note
that these files are also transformed as part of the regular lemdo-dev build, so you
can see them in their default state in that build; when the anthology itself is built,
those files will be converted to the anthology rendering style, and will have their
prefixes removed (so qme_index.html will become simply index.html in the products/qme/site output folder).
A site folder containing all the site-level customizations which apply to the anthology
output, including template/sitePage.html, which will form the basic template for all pages in the site. This is where the
banner, menu, footer and other standard site components are defined. Other folders
include images, for images needed on the site; fonts, for any specialized fonts that need to be included; and css, which contains one or more SCSS files which will be transformed to CSS using SASS,
and added to the output pages after the standard CSS inherited from the main lemdo-dev
project, enabling a designer to override any display defaults. There should be one main SCSS file, named {anthId}.scss, which imports any other files; it will be built into {anthId}.css and linked into all the output files.
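The skeletal example below shows the overall shape of an anthology corpus file; the comments stand in for real content, and the ids and the single import are illustrative only:
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0" xml:id="qme">
  <teiHeader>
    <!-- Anthology-level metadata: editor(s), publication statement, etc. -->
  </teiHeader>
  <TEI xml:id="qme_index">
    <!-- Content for the anthology home page. -->
  </TEI>
  <!-- One import per edition to be published in the anthology: -->
  <?lemdo-import ref="emdFV_edition"?>
</teiCorpus>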
The build process for anthologies is quite complex:
Before anything else happens, the main lemdo-dev build process must run. This processes
all the texts and identifies problems that might need to be rectified before an anthology
build can proceed.
Next, the main build process runs a diagnostic for each anthology that is in the repository,
checking to see whether it is in a state where it can be successfully built. That
process produces two outputs: an HTML file (products/{anthId}/anthologyStatus.html) which lists problems found, and a text file (products/{anthId}/anthologyStatus.txt) which simply says "OK" or "FAIL". The checking process is described below.
The anthology file itself is then checked for two things: the <revisionDesc>/@status for the anthology itself must be set to published, and so must the <revisionDesc>/@status attribute in the first embedded <TEI> element which contains the home page content. If either of these is not published, the build does not proceed.
The anthology home page, which is the TEI element embedded in the anthology configuration
corpus file, is processed into a page in a temporary folder.
Each item in the collection of site-page XML documents located in the anthology folder is first checked to make sure its <revisionDesc>/@status value is published.
Each LEMDO edition which is claimed by the anthology (using a lemdo-import processing
instruction, as described above) is processed as follows:
The edition file itself is checked to ensure that it has a <licence> element which permits the inclusion of the edition file itself in the anthology. This is an example of such a <licence> element:
<licence resp="pers:JENS1" from="2020-10-21" corresp="anth:dre">
This content is licensed for inclusion in the DRE anthology.
</licence>
The text inside the <licence> element is informational and not required; the key components are a person taking
responsibility for the licence, a starting date for it, and a pointer to the target
anthology using the
@corresp attribute and the anth prefix. The build process checks only that there is at least one valid
@resp value in the attribute. It does not have any way to know whether the person concerned
actually has the right to make such a declaration; this check is the responsibility
of the anthology lead.
The <revisionDesc>/@status value for the edition file is checked to make sure it is set to published.
If the edition is allowed for inclusion and is published, it is then parsed, and each
text linked from it is checked for the same licence permission and publication status.
At this point, we have a complete list of all the documents needed for the anthology,
but they are all in the vanilla lemdo-dev style etc. These documents are then processed into the output directory.
During this process, all boilerplate components such as banners, menus, footers and
so on are replaced with those from the anthologyʼs sitePage.html template file, and the anthologyʼs CSS and JS files are added into the headers of
the files.
A post-build diagnostics process is now run on the built site to check that every
link points to something which is where it is expected to be, and every file is valid.
(Not yet implemented.)
To customize the functionality and appearance of your anthology, you will need to
have a complete checkout of the repository, and you must be working on a computer
which is set up to run the required software. Practically speaking, this means that
you need either a Linux or Mac operating system (Windows is not able to run required
scripts), and you will need both curl and Dart Sass installed and working from the command line. Of course you will also need an svn
client in order to check out and commit to the repository, as well as a NetLink ID and permission to commit to the required folders. In addition, if you want to follow
the recommended working method described below, you will need to have ant and ant-contrib installed.
There are three aspects of an anthology that can be customized:
Some components of the basic HTML framework used to create pages can be changed, by
editing the file data/anthologies/{anthId}/template/sitePage.html. This file basically contains a banner/header, a site menu, and a footer; these are
used to replace the generic versions that come from the lemdo-dev build pages.
The style/design/appearance of the site can be modified by editing the SCSS file lemdo/data/anthologies/{anthId}/site/css/{anthId}.scss. This file is compiled to create {anthId}.css, which is then placed in the css folder in the anthology build output, and linked into the HTML files after the default lemdo-dev.css file, so that it can be used to override any rules in that file.
Site functionality can be customized by editing the file lemdo/data/anthologies/{anthId}/site/js/{anthId}.js, which is linked into the HTML files after any other JS, so it can override (for
example) object methods and add new functionality.
¶ A Suggested Working Method for Customization Work
For developers or designers working on customization, we suggest the following working
method.
First of all, if you are only working on the CSS, images, fonts and/or the JavaScript,
you do not need to run a complete site build yourself. You can download a pre-built
version of the site from our Jenkins server to your local machine, then make changes
and simply rebuild the CSS. This is how: (In the following, {anthId} means the id
of the target anthology, such as dre or qme.)
At the command line, in the lemdo folder, run the following command:
./getSiteFromJenkins.sh {anthId}
This will download a complete copy of the latest build of the anthology from the
Jenkins server, and place it in the products folder inside the lemdo folder. The main site content appears in the site subfolder. You can view this content directly in a browser, but if you want to work
on JavaScript interactivity, you may need to run a local web server in that folder
so that browsers will permit all the code to run. Most developers will be familiar
with this situation.
Now that you have a local copy, you can make changes to the CSS and JavaScript code
in the lemdo data/anthologies/{anthId}/site folder, then run this command to update the local site and see your changes:
ant -f build_anthology.xml updateAnthology -Danthology.id={anthId}
This will copy all JS, images, and font files over to the local built site, and run
SASS to build the SCSS file to create a CSS file in the css folder.
Once you are happy with the changes youʼve made, you can commit them to the svn repository
(donʼt forget to svn add any new images or fonts files), and then Jenkins will rebuild the site in due course.
If you are editing the site template, then the process is much more complex, because
every page will need to be rebuilt before your changes will appear. In that case,
you will also need a complete local copy of the base lemdo-dev anthology. You can
get that by running: ./getSiteFromJenkins.sh.
That will give you a copy of the lemdo-dev site, which is the source from which the
anthology pages are built. After editing the anthology site template file in data/anthologies/{anthId}/template/sitePage.html, run this command to rebuild your entire anthology: ant -f build_anthology.xml -Danthology.id={anthId} -DbuildFailedAnthologyAnyway=true
buildAnthologies.
This process may take a little time, depending on the size of your anthology.
The anthology template files all include an XHTML <h2> element which appears as the first item in the <article> element. That <h2> is the trigger leading to the insertion of the page title(s).
We never take the page title from the content of the TEI <text> element. Our basic assumption is that the title appearing at the top of a page on the site is drawn from the content of the <titleStmt> in the <teiHeader>, according to these rules:
If there is a single <title> element with @type=main, then the contents of that element are processed into an XHTML <h2> element.
If there is no <title> with @type=main, but there is a single <title> element with no @type attribute, then the contents of that element are processed into an XHTML <h2> element.
If there is a <title> element with @type=sub, then that element is processed into an XHTML <h3> element.
All other <title> elements in the header <titleStmt> are ignored for the purposes of rendering the main heading on the page.
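For example (the titles here are invented for illustration), a <titleStmt> like the following would produce an <h2> from the main title and an <h3> from the subtitle:
<titleStmt>
  <title type="main">The Merry Devil of Edmonton</title>
  <title type="sub">A Modern-Spelling Edition</title>
</titleStmt>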
<front> elements should only be used in primary source documents, and should only contain genuine transcribed front matter from a primary source document. Any <front> element in a primary source document will be processed and rendered following the main page title created according to the rules above.
¶ Bylines and Other Front-Like Things in Born-Digital Documents
If you need to provide bylines, prefatory statements, or other such content at the
head of the page in a born-digital or mixed-content document, put this information
at the beginning of the <body>.
¶ How the Table of Contents is Generated for Semi-Diplomatic Transcriptions
Semi-diplomatic transcriptions vary widely in their original form and in the encoding
practices of their editors, but when a play is rendered for the end-user, a table
of contents needs to be constructed to appear in the slide-out Content tab on the
website. For a non-diplomatic text, the TOC will be constructed using the text node
of the
<head>
element of any
<div>
elements in the text which have a
<head>
, since these are obviously the major divisions in the text (usually act-scene numbers
or scene numbers), but for semi-diplomatic texts, there may well be no such obvious
structure to draw on.
Therefore a rather complicated algorithm tries to decide what components of the text
should best be used to create a useful TOC. This is how the process works:
By default, <pb> elements having a signature or folio number in their @n attribute will be used.
However, if the text contains 20 or more <label> elements, then these are assumed to be more suitable text-division headings, and will be used instead. (For an example, see the texts in the Douai Shakespeare Manuscript Project.)
If 20 or more <label> elements have @n attributes, then only the <label> elements having @n attributes will be used, and the text of the TOC entries will be taken from the @n attributes.
If more than 20 <label> elements exist, but fewer than 20 have @n attributes, then all <label> elements will be used, but whenever a <label> has @n, its value will be used for the TOC entry text instead of the content of the label.
Why so complicated? While a TOC constructed from <pb> elements may be very straightforward, it is not very helpful to a general reader looking for the major sections of the text, and it may end up being extremely long, so <label> is usually a better choice if the text contains headings or similar markers which can be tagged as labels. However, the text content of a <label> element may not be very helpful in itself; in such cases, an @n attribute can be added to the <label>, and its value will be used in preference to the textual content. The @n attribute can be used on all <label>s to create an entirely curated TOC if that is preferred.
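For instance (a hypothetical fragment, not drawn from a real transcription), a label whose transcribed text is unhelpful to modern readers can be given a curated @n value:
<label n="Act 1, Scene 2">Actus primus, Scœna secunda.</label>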
LEMDOʼs long-term plan for most texts other than Douai texts is to mobilize the <milestone> element in order to note correspondences between places in the semi-diplomatic transcription and the modern text. Many semi-diplomatic transcriptions already contain commented-out <milestone> elements. When a modern text is finalized, we will be able to finalize the milestone elements in the semi-diplomatic transcription by adding a @corresp attribute whose value is the @xml:id of a scene or act <div> in the modern text. We will also add an @n attribute whose value will be used to generate a TOC of act-scene or scene beginnings. Ideally, users will be able to toggle between a signature TOC (A1r, A1v, A2r, A2v, etc.) and a milestone TOC.
LEMDO has thousands of <ref> elements that point to editions, anthologies, components therein, and to resources outside LEMDO. These are all processed into HTML hyperlinks for the static site.
Our processing does not do anything with <ref> elements in PDFs. Any pointing you do inside a portfolio with the <ref> element will result in nothing in a PDF.
The <ref> and <ptr> elements can co-exist because they are pointing to xml:ids, but only the <ptr> element can be converted to strings at build time in LaTeX.
We would like to have a canonically-structured textual reference in the output. In
the digital edition, we want the A.S.Sp. system (e.g., 5.1.2) plus a precise hyperlink.
In the print edition, we want the A.S.L reference system (e.g., 2.3.101). We do not
want to have the author of the critical text write a literal 2.3.101 into their document,
because lineation may change as the text is edited, but we do want a critical introduction
to be able to contain A.S.L citations when it is printed. The actual text in the parenthetical
citation must be generated at build time.
LEMDO therefore has two different processing chains for pointers: one for the digital
edition and one for the print edition (a camera-ready PDF that can be downloaded or
printed through print-on-demand services).
For online publication, we generate a parenthetical citation that gives LEMDOʼs canonical
reference system (A.S.Sp.). Clicking on the citation takes one directly to the part
of the speech being cited.
For the PDF, we generate a parenthetical citation that gives A.S.L (act, scene, line
number) using the line numbers generated for the PDF at the time we make the PDF.
For example, an editor might use a <ptr> element in their critical introduction to point to anchors in the middle of a long speech in their modern text. In the processing for the PDF, LEMDO will calculate and supply the A.S.L, so that the reader may find the exact line(s) being cited in the generated parenthetical citation. For the online version, the parenthetical citation will be A.S.Sp., but the hyperlink on the citation will go directly to the target point in the speech.
If editors are writing a critical introduction to Othello, they will naturally want to provide links to other relevant texts in the LEMDO collection.
This is normally done using the <ref> element, pointing to the modern edition of Titus by its document id. However, if you want to point to a specific location in that text, things become more complicated: although you may point at an id in that text, itʼs not clear what textual reference you should provide. For example, if you call it 2.5.35 (Act 2, Scene 5, Line 35), and the editor of Titus then determines that there are two missing lines at the beginning of the scene and adds them in, the text inside your <ref> element is now misleading. The following sections provide two solutions to this problem,
one used within a specific play folder (a portfolio), and one used more generally.
One approach to the mutability of online texts produced both within and outside LEMDO
is to choose a specific print edition of source texts (such as the New Oxford Shakespeare) to which all references can point. This has the obvious disadvantage that such a
pointer cannot be made into any kind of link, but links are fragile anyway, and this
approach also fits with the numerous citations of critical print texts which occur
throughout critical material.
We have seen above that certain types of citation between texts are not very robust,
because texts are (for the foreseeable future, anyway) steadily evolving, and even
our principles for lineation and our citation styles are not absolutely finalized.
However, when youʼre editing critical material that will be bundled with an edition
of a text in (for example) a print volume, you need to be able to point into the text,
just as you need to be able to attach annotations to specific points in the text.
There are two scenarios in which we do this:
Annotations are <note> elements (documented in detail in Writing Annotations) which live in separate files outside the plays to which they refer. At build time,
annotations may be rendered as links with popups, or as footnotes in a print edition.
Annotations are pinned to a location in the text using a system of anchors and pointers which is documented
in Writing Annotations.
Although we know that pointing between electronic texts which are in constant development
is inherently fragile, there are situations in which we need to be able to create
a canonically-formatted text reference in a critical text to a specific point in the
text which is being discussed. If these texts are in the same portfolio, then we know
they will be built at the same time, and therefore any output created will be consistent
across that build.
This enables us to solve the particular problem noted above, where we would like to
have a canonically-structured textual reference such as 2.3.45 appearing in the critical text, and this is particularly important for the print
texts that we are going to publish. We donʼt want to have the author of the critical
text write a literal 2.3.45 into their document, because lineation may change as the text is edited, but we do
want a critical introduction to be able to contain such text when it is printed; therefore
the actual text must be generated at build time. We do this using a <ptr> element with @type=localCit:
<div>
  <!-- In the critical text: -->
  <p><!-- ... --> the king addresses the <quote>noble English</quote>
    (<ptr type="localCit" target="doc:emdH5_FM.xml#emdH5_FM_anc_2000"/>) separately <!-- ... --></p>
  <!-- In the play text: -->
  <l>Oh, <anchor xml:id="emdH5_FM_anc_2000"/>noble English,</l>
</div>
At build time, this will be expanded to (for example) (1.2.111). You will notice that we use the same mechanism for creating a point in the text that
can be addressed as we do for annotations: we insert an anchor (see Writing Annotations for instructions on how to do that). To specify a range, include pointers to two
anchors with a space between them:
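For example (the second anchor id here is hypothetical):
<ptr type="localCit" target="doc:emdH5_FM.xml#emdH5_FM_anc_2000 doc:emdH5_FM.xml#emdH5_FM_anc_2001"/>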
If youʼre pointing at an entire line, speech, scene or act, then thereʼs no need to insert two anchors. You can instead add an @xml:id to the target element (<l>, <sp>, or <div>), if there isnʼt one there already, and point to that instead. To create a new @xml:id, the simplest way is to insert an anchor element in the usual way, then take its @xml:id, which is guaranteed to be unique, and use that, discarding the rest of the <anchor>.
This documentation lists and explains a number of different custom processing instructions
used by the LEMDO project to include content from elsewhere, and to trigger the automatic
generation of content.
LEMDO prefers the use of processing instructions for the purposes of inclusion over other methods such as XInclude because this approach is more flexible; processing for XPointers
in XInclude instructions is not widely supported, and some processors and validators
may act upon XInclude instructions when theyʼre not intended to be processed. There
are two PI-based inclusions in LEMDO: <?lemdo-include href="doc:learn_encodeLinks_intro"?>
This lemdo-include PI is used in the lemdo.odd file to assemble the separate documentation files found in the data/documentation folder into a single structured document before that is processed into the documentation
web pages. This PI should not be used outside of the ODD file. See Documentation and the ODD File for more information.
Another set of processing instructions provides a way to generate content in an output
page based on metadata elsewhere in the project. These are three examples: <?taxonomy-table
ref="emdAudiences"?>
This tells the processor to find the <taxonomy> element in TAXO1 whose @xml:id=emdAudiences, and process it to create a TEI <table> element laying out all the categories and their definitions. That table is later processed into an HTML table in the documentation page for the site.
<?charDecl-table ref="characters"?>
This tells the build process to generate a table from character declarations (<charDecl> elements) in TAXO1.
A third processing instruction generates a table from a <listPrefixDef>, also in TAXO1.
Processing for these PIs is specified in two places: first in the documentation_inclusion_master.xsl file, which handles the majority of cases since they occur mainly in documentation;
but also in the xml_original_templates.xsl module, to handle any cases in which a PI may be used in a page which is not part
of the documentation. These templates should be updated in a synchronized way.
LEMDO sites are static sites conformant with Endings principles, so they use the Endings-developed staticSearch system to provide search functionality for individual anthologies. Rather than check
out a fresh copy of the staticSearch codebase every time we build something, we store
a static copy of the codebase in our repository in code/search/staticSearch. This should be an up-to-date stable version of the codebase.
When any anthology is built, the staticSearch codebase is copied into a staticSearch
folder in the anthologyʼs products folder. We could run all our staticSearch build
indexing for all anthologies directly from the code/search/staticSearch folder, but making a copy enables us to do tests with alternative versions of staticSearch
if we need to, using a single anthology.
The file build_globals_module.xml contains a property called staticSearchBranch which specifies which branch we want to use for our staticSearch codebase copy. It
should normally be set to a stable release branch, unless we are doing some unusual
testing. Release branches are updated periodically, for bugfixes and minor enhancements,
so there is also an Ant task in the same file called refreshStaticSearchCode, which will update the files in the code/search/staticSearch folder automatically. After running the update (ant -f build_globals_module.xml refreshStaticSearchCode), check the svn status of the code/search/staticSearch folder to see whether there are any new files that need to be added, or perhaps files
that need to be deleted.
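For example, the property declaration in build_globals_module.xml looks something like this (the branch name shown is purely illustrative):
<property name="staticSearchBranch" value="release-1.4.0"/>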
The encoder documentation provides good info on how witnesses should be encoded. A <listWit> appears in the <sourceDesc> of a collation file, accompanying the apparatus list which is encoded in the body of the document. Each <witness> element represents a single source text which was used by the editor in creating the collation.
A Schematron rule constrains the <witness> element such that it either has @corresp and is itself empty, or has a tagged prose description of the witness and does not have @corresp. The first scenario is used when the BIBL1 entry pointed at by @corresp provides sufficient information and no further explanation is needed. In the second case, the editor provides a prose description which is more elaborate, but is expected to link some of the contents of that description to one or more entries in BIBL1.
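As a rough illustration (the ids, the bibl: prefix, and the linking markup inside the prose description are assumptions, not necessarily the exact LEMDO encoding), the two permitted forms look like this:
<listWit>
  <!-- First scenario: the BIBL1 entry pointed at by @corresp says all that is needed. -->
  <witness xml:id="emdOth_wit_F1" corresp="bibl:SHAK_F1"/>
  <!-- Second scenario: a tagged prose description linked to BIBL1, with no @corresp. -->
  <witness xml:id="emdOth_wit_Q1">The first quarto (<bibl corresp="bibl:SHAK_Q1">Q1</bibl>),
    cited from the Folger Shakespeare Library copy.</witness>
</listWit>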
Witness lists are part of collations, and stored in collation files, but the build
process that creates a fully-realized text combines the text of the work with the
collations to produce an integrated whole. During the generation of standalone XML,
the <listWit> is first copied into the play source file (always a modern edition). At that point, if a <witness> element is empty and it has @corresp, the target bibl element is copied into the content of the witness (minus its @xml:id), and the @corresp attribute is removed.
At the HTML stage, the witness list is processed into a sequence of <div> elements in the appendix of the document, along with the bibliography items, person
elements and so on. These elements are hidden by CSS and shown only in response to
user actions such as clicking on a linked name. Apparatus elements are also part of
the appendix. When a user clicks on a collation icon, the relevant apparatus item
appears in a popup. In that popup, each collation siglum is a link to its witness
element, and clicking on that link causes the JavaScript to retrieve the witness info
and append it to the bottom of the popup. Thus the detailed witness info is always
available from any apparatus popup.
LEMDO has created a few tools to make your encoding work easier. This documentation
will guide you through using our file templates and transformations. Another useful
tool (keyboard shortcuts) is documented in Keyboard Shortcuts and Special Characters.
You can use LEMDOʼs file templates when creating new files for your edition. These
files are created and maintained by the LEMDO Team and provide you with metadata,
basic file structure, necessary elements, and helpful information and documentation
links for the type of file that you are creating. For example, our critical paratext
template gives the metadata required for critical paratexts, sample <div> and <p> elements, and sample block quotes (using the <cit> and <quote> elements).
To create a file using a template, follow these steps:
At the top of your Oxygen window, click File and then select New from the drop down menu.
In the window that pops up, scroll down to the Framework templates folder. Click on the LEMDO subfolder. This will show you a list of the templates that we have created.
Select the template that you wish to use.
At the bottom of the New file window, select Save as. If you know the path where you wish to save your file, you can type it into the available field (e.g., lemdo/data/texts/{your edition abbreviation}/{the appropriate folder}). Otherwise, click on the folder icon to the right of the text field and browse for the correct directory. Name your file according to LEMDOʼs naming conventions.
Click Create.
Follow the instructions outlined in your newly created file. We use XML comments liberally
in template files to provide you with instructions and helpful tips. You may delete
comments as you complete the tasks therein.
In addition to making templates to create new files, LEMDO has written XSLTs (eXtensible
Stylesheet Language Transformations) to help you complete encoding tasks. Some are
designed to create a new file from an existing one (e.g., our transformation to create
a baseline modernized text from semi-diplomatic transcriptions), while some are simply
meant to complete repetitive tasks (e.g., our transformation to number <lb> elements with @type="wln" in semi-diplomatic transcriptions). Regardless, these transformations are meant to
save you time and effort so that you can focus on other editorial tasks.
Running transformations is generally fairly straightforward. Follow these steps:
In Oxygenʼs project view, find the file that you wish to run a transformation on.
Right click that file.
Hover your mouse over Transform.
Select Transform with…
Scroll down the list to find the transformation that you are interested in. Select
that transformation.
Click Apply selected scenarios (1). If there is a number greater than 1 in the parentheses on that button, your file
likely has other associated transformations. Generally, we do not want this. Unselect
any transformations that you do not want to run before clicking to apply the selected
scenarios.
Open the file that you have run a transformation on. Check that the transformation
has worked.
This example will show the process for running a transformation. It will number <lb> elements with a @type value of wln in the file emd1H4_F1.xml.
The first step is to right-click on the file in Oxygenʼs project view. Here, we want to transform emd1H4_F1.xml, which lives in data/texts/1H4/main.
Next, hover over Transform and select Transform with…
Note that we generally do not need to configure transformation scenarios for specific files; doing so permanently associates a specific transformation with the file that you are working on. Most of the time, we only need to use a transformation once on a file, and we do not want it to be associated with the file long-term, as we do not want to repeatedly apply the same transformation.
When you click Transform with…, a window will open allowing you to select the appropriate transformation:
In this case, we want to number line beginnings in a semi-diplomatic transcription,
so we will select lemdo_number_wlns_lb_in_semi-dip. If you are uncertain which transformation to use, or you want us to add a new transformation
to our list, please email lemdotech@uvic.ca.
After clicking the Apply selected scenarios button, we open the file to check that the transformation has worked as expected:
The <lb> elements with @type="wln" now have consecutively numbered @n attributes. The transformation has worked as expected.
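Schematically (this fragment is invented, not copied from the actual file), line beginnings encoded as <lb type="wln"/> before the transformation come out afterwards as:
<lb type="wln" n="1"/>
<lb type="wln" n="2"/>
<lb type="wln" n="3"/>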
As always, the last step in Oxygen is to validate the file.
LEMDO publishes print editions of some of its modern texts. This section of the documentation
is intended to cover how those print editions are generated, and is aimed primarily
at programmers, since itʼs unlikely that anyone other than programmers will venture
into this part of the codebase.
The print editions are generated using LaTeX, and specifically the Xelatex compiler,
so anyone wanting to generate a print edition will need to have not only Ant and ant-contrib
but also the (substantial) LaTeX codebase installed. On Linux, we recommend installing
the texlive-full package, which should include everything you need. On Mac, you can use the MacTeX distribution. You can also install TeX Live on Windows, but we do not expect most of our
build processes to work on Windows for a variety of reasons. *NIX-based OSes are a
much better bet. The distributions are large, so donʼt install this stuff just for
fun; only do it if you have a need to build PDFs for print.
The PDF build file includes a quick check for the availability of the Xelatex compiler,
so starting from the repository root, you can do this:
Change directories into the PDF build directory: cd code/pdf
Run the check task: ant checkForXelatex
If this works, youʼre probably OK, although itʼs always possible that a particular
package required by the build process is not installed. If thatʼs the case, when you
try to run a build, you should see helpful error messages from LaTeX.
All other requirements (specifically, the fonts used to build the PDF) should be in
the repository.
As you might expect, the codebase for building a print edition lives in code/pdf. It is basically very simple:
build.xml, the Ant build file.
Several XSLT files, in the xsl folder, of which the root file is latex_master.xsl. These files are inadequately organized at the time of writing, because they have
developed as part of a learning process; when there is time, they will be reorganized.
The content should be well-commented, though.
A fonts folder, in which there are two open-source fonts, Vollkorn and Josefin-Sans. These
are configured respectively as the main font and the sans font for the PDF build.
A README file and a TODO file, which are essentially ad-hoc notes.
Once you have ensured that your system is set up with all the requirements, and you have set up your TEI print edition document, youʼre ready to try a build.
Starting from the LEMDO repository root, this is what you do:
Change directories into the PDF build directory: cd code/pdf
Run the build process, supplying the id of the document you want to build: ant -DdocsToBuild=emdOth_Print
(You can supply multiple IDs, comma-separated, if you want to.)
The build process will create a folder called pdf inside the main folder of the text you are building. In there, a number of files will be saved, including
a log file, the .tex file containing the LaTeX code which is generated and then compiled, and the PDF
file of the print edition. If anything goes wrong, you should see either helpful messages
from our code or mysterious messages from the LaTeX compiler.
During the build process you will see many very puzzling emanations such as the common
Underfull \hbox (badness 1033) message from the compiler. These are mostly informational, warning you when the layout
engine has had to stretch or squash a line a little more than it would like to in
order to get the justification to work. However, if the build actually fails, you
will need to pay attention to whatever message coincides with the termination of the
build.
You will notice that the Xelatex compiler is run at least four times on the .tex file. This is because at each run, the layout engine is able to do a slightly better
job of adjusting spacing, pagination and so on, but every time it does this, page
references and similar content which were generated at the time of the previous build
are potentially no longer accurate, so another run is recommended. The number of runs
required to get a final version is not easy to determine, so we run four times by
default, but this may need to be adjusted.
A LEMDO print edition is established by creating a standard LEMDO TEI file, but with
a filename ending in _Print.xml. This should be created in the main folder of the work itself. So for example, if youʼre creating a print edition of
Othello, you would create this file: data/texts/Oth/main/emdOth_Print.xml
This file is like any other TEI text in the project; it has metadata, responsibility
statements and so on. But it will mostly consist of content from other parts of the
work folder. Primarily, it will include a modern-spelling edition of the play, but
it will also have other components such as critical materials and a bibliography.
The following is a simplified example which will be explained below, and should cover
all the main components.
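The sketch below gives the general shape of such a file; the pointer targets, the divGen @type value, and other details are illustrative assumptions rather than the canonical LEMDO markup:
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="emdOth_Print">
  <teiHeader>
    <!-- Normal LEMDO metadata, including a <catRef> with @target="cat:ldtBornDigPrint". -->
  </teiHeader>
  <text>
    <front>
      <!-- Critical materials, pulled in by pointers (targets invented for illustration). -->
      <linkGrp>
        <ptr target="doc:emdOth_CriticalIntro.xml"/>
      </linkGrp>
    </front>
    <body>
      <!-- The castlist and the modern-spelling edition of the play. -->
      <linkGrp>
        <ptr target="doc:emdOth_CharacterList.xml"/>
        <ptr target="doc:emdOth_M.xml"/>
      </linkGrp>
    </body>
    <back>
      <!-- Annotation files; only annotations actually referenced become footnotes. -->
      <linkGrp>
        <ptr target="doc:emdOth_annotations.xml"/>
      </linkGrp>
      <!-- The bibliography is generated automatically from BIBL1.xml. -->
      <divGen type="bibliography"/>
    </back>
  </text>
</TEI>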
The header is a normal header except for the particular document type specified with <catRef>/@target="cat:ldtBornDigPrint".
But in <text>, the first thing you will notice is that the <front>, <body> and <back> elements do not include any content directly. They can include content if necessary, and there may well be components that are intended to be used only for one particular print edition, and therefore belong directly in this file, but most content is in the form of <linkGrp> elements containing pointers. These pointers specify other files in the repository, or sections of files. They use the doc: prefix to point to the ids of files, and an optional fragment identifier to point to a specific part of the file. These includes will be processed by the first stage of the build code to create a complete TEI file incorporating all these components. That structure is then processed into the LaTeX file and compiled into the PDF.
Notice the organization: critical materials come in the <front> element, the castlist and the play itself come in the <body>, and the bibliography (of which more below) appears in the <back>. Also in the <back> are any annotation files which are needed; these are processed into footnotes at build time. Only annotations which are actually referenced in the included texts will be used, and the rest will be discarded.
Finally, note the special case of the <divGen> element for the bibliography. This is acted on by the build code, which retrieves
from BIBL1.xml all bibliography items which are actually mentioned in the rest of the content, and
generates a bibliography from them automatically. Note that if there is a reference
to an item which does not appear in BIBL1.xml, the PDF build will fail and stop.
Prosopography
Janelle Jenstad
Janelle Jenstad is a Professor of English at the University of
Victoria, Director of The Map
of Early Modern London, and Director of Linked Early Modern Drama
Online. With Jennifer Roberts-Smith and Mark Kaethler, she
co-edited Shakespeare’s Language in Digital Media: Old
Words, New Tools (Routledge). She has edited John Stow’s
A Survey of London (1598 text) for MoEML
and is currently editing The Merchant of Venice
(with Stephen Wittek) and Heywood’s 2 If You Know Not
Me You Know Nobody for DRE. Her articles have appeared in
Digital Humanities Quarterly, Elizabethan Theatre, Early Modern
Literary Studies, Shakespeare
Bulletin, Renaissance and
Reformation, and The Journal of Medieval
and Early Modern Studies. She contributed chapters to Approaches to Teaching Othello (MLA); Teaching Early Modern Literature from the Archives
(MLA); Institutional Culture in Early Modern
England (Brill); Shakespeare, Language, and
the Stage (Arden); Performing Maternity in
Early Modern England (Ashgate); New
Directions in the Geohumanities (Routledge); Early Modern Studies and the Digital Turn (Iter);
Placing Names: Enriching and Integrating
Gazetteers (Indiana); Making Things and
Drawing Boundaries (Minnesota); Rethinking
Shakespeare Source Study: Audiences, Authors, and Digital
Technologies (Routledge); and Civic
Performance: Pageantry and Entertainments in Early Modern
London (Routledge). For more details, see janellejenstad.com.
Joey Takeda
Joey Takeda is LEMDO’s Consulting Programmer and Designer, a role he
assumed in 2020 after three years as the Lead Developer on
LEMDO.
Kate LeBere
Project Manager, 2020–2021. Assistant Project Manager, 2019–2020. Textual Remediator
and Encoder, 2019–2021. Kate LeBere completed her BA (Hons.) in History and English
at the University of Victoria in 2020. During her degree she published papers in The Corvette (2018), The Albatross (2019), and PLVS VLTRA (2020) and presented at the English Undergraduate Conference (2019), Qualicum History
Conference (2020), and the Digital Humanities Summer Institute’s Project Management
in the Humanities Conference (2021). While her primary research focus was sixteenth
and seventeenth century England, she completed her honours thesis on Soviet ballet
during the Russian Cultural Revolution. She is currently a student at the University
of British Columbia’s iSchool, working on her masters in library and information science.
Martin Holmes
Martin Holmes has worked as a developer in UVicʼs Humanities Computing and Media Centre for
over two decades, and has been involved with dozens
of Digital Humanities projects. He has served on
the TEI Technical Council and as Managing Editor of
the Journal of the TEI. He took over from Joey Takeda as
lead developer on LEMDO in 2020. He is a collaborator on
the SSHRC Partnership Grant led by Janelle Jenstad.
Navarra Houldin
Project manager 2022–present. Textual remediator 2021–present. Navarra Houldin (they/them)
completed their BA in History and Spanish at the University of Victoria in 2022. During
their degree, they worked as a teaching assistant with the University of Victoriaʼs
Department of Hispanic and Italian Studies. Their primary research was on gender and
sexuality in early modern Europe and Latin America.
Tracey El Hajj
Junior Programmer 2019–2020. Research Associate 2020–2021. Tracey received her PhD
from the Department of English at the University of Victoria in the field of Science
and Technology Studies. Her research focuses on the algorhythmics of networked communications. She was a 2019–2020 President’s Fellow in Research-Enriched
Teaching at UVic, where she taught an advanced course on Artificial Intelligence and Everyday Life. Tracey was also a member of the Map of Early Modern London team, between 2018 and 2021. Between 2020 and 2021, she was a fellow in residence
at the Praxis Studio for Comparative Media Studies, where she investigated the relationships
between artificial intelligence, creativity, health, and justice. As of July 2021,
Tracey has moved into the alt-ac world for a term position, while also teaching in
the English Department at the University of Victoria.