LEMDO: Structure of a Single Documentation File

The root element of all documentation files is a <div> element with an @xmlns attribute whose value is "http://www.tei-c.org/ns/1.0". Make the xml:id of the root <div> element identical to the name of the file.
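
A minimal sketch of such a root element, assuming a hypothetical file named learn_encodeTerm.xml and assuming that the xml:id matches the file name without its extension (the heading text is illustrative only):

<div xmlns="http://www.tei-c.org/ns/1.0" xml:id="learn_encodeTerm">
  <head>Encode a Term</head>
  <!-- Child <div> elements, each with its own xml:id and <head>, go here -->
</div>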

XML files are primarily made up of nested <div> elements that allow us to indicate the hierarchical structure of the documentation file and add descriptive navigational <head> elements to each <div> element.

Before you encode a file, establish the structure of the file by doing a document analysis. When we write documentation in Teams, we usually use Styles to indicate the levels of headers and the subordination of sections in the document hierarchy. When we write documentation directly in an XML file, we tend to have the Outline View open and the filter set to show <div> elements so that we can see our document hierarchy at a glance.

The complexity of the structure of your file increases with the length of the file. Consider whether your documentation should be a single file or several files. Separate files are each listed in the ODD file, where we can order pieces however we like and effectively chain them back together. We can also make links between files and between <div> elements within files.

LEMDO uses four main elements to organize content within documentation files:

<div> (always with an @xml:id attribute and a child <head> element)

<p>

<list>

<table>

These elements are not used exclusively in documentation files, but this section explains how to use them to encode documentation.

¶ Divisions

Divisions are the highest-level structural unit of documentation files. Each documentation file is rooted in a <div> element. The root <div> element has one or more child <div> elements structuring the content of the documentation file. <div> elements can be nested to capture the hierarchical and subordinate sections within a document.

XML is hierarchical and works by nesting content in a series of containers. For our purposes, those containers are <div> elements. You need to know how to properly nest <div> elements in order to structure the content in your documentation correctly. It is a good idea to plot the structure of your document prior to encoding it because rearranging <div> and <head> elements can be difficult.

Nest all other <div> elements in the documentation file within the root <div> element.

¶ Create IDs for Document Divisions

Each <div> element must have an @xml:id attribute with a value based on the position of the division in the hierarchy. Make the xml:id of each <div> element (besides the root <div> element) the xml:id of the <div> element within which it is nested, plus a word or phrase (usually the heading of the section) that makes it unique to that file. We use these xml:ids across the rest of the documentation to generate document pointers and HTML links. The URLs of these links will become very long if the division is nested far into the document hierarchy, because every nested <div> element must have an xml:id that includes the xml:ids of all the parent <div> elements. This system quickly becomes unwieldy if xml:ids are not constructed economically.

Follow these principles to keep xml:ids short:

Think carefully about nesting <div> elements within <div> elements. Do you really need deeply nested divisions? Is the content actually a set of paragraphs or a list? One advantage of creating a new <div> is that we can link to it from elsewhere, but think about whether we really need to link to the very specific information that would be contained within a deeply nested <div> or if users might more profitably be pointed to a higher level in the hierarchy.

Be economical in the wording of the xml:id. Say perf rather than performance, ed rather than editing, eg rather than example, encode rather than encoding, and so on.

The final part of the xml:id on a <div> needs to be similar to the text node of the <head> element but does not have to replicate it. If possible, choose a single representative word from the heading.
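
As a rough illustration of these principles, compare two ways of identifying the same division. Both xml:ids are hypothetical; the first spells out the heading in full, while the second abbreviates and keeps only one representative word:

<!-- Wordy: harder to type and produces a longer link -->
<div xml:id="learn_encodeTerm_examples_uncommonWords">
  <head>Uncommon Words</head>
</div>

<!-- Economical: same position in the hierarchy, shorter xml:id -->
<div xml:id="learn_encodeTerm_eg_uncommon">
  <head>Uncommon Words</head>
</div>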

In this fictional example showing three nested <div> elements, the xml:id of the root <div> element is "learn_encodeTerm". The xml:id of the file is the first part of the xml:id of each <div> element within that file. Each subsequent nested <div> inherits the full xml:id of its parent <div> element and adds to the xml:id of its parent an underscore and a brief phrase.

<div>
  <div xml:id="learn_encodeTerm">
    <head>Encode a Term</head>
    <div xml:id="learn_encodeTerm_eg">
      <head>Examples</head>
      <div xml:id="learn_encodeTerm_eg_uncommon">
        <head>Uncommon Words</head>
      </div>
    </div>
  </div>
</div>

"learn_encodeTerm" is the xml:id of the root <div> element, so the xml:ids of all of its child <div> elements in the file must contain that phrase. The xml:id of the root <div> element has the word Term in it because it reflects the content of the file (which is also captured in the heading in the <head> element associated with this <div> element). The first <div> element is the largest container that holds the other two <div> elements, and the second <div> element contains the third <div> element.

¶ Headings

Every <div> must have a heading, tagged with the <head> element. Add an appropriate heading for that section of the document in the text node of the <head> element.

The content of the <head> elements on all the <div> elements in the document will be processed on the site into tables of contents, page content lists, and direct links to sections of documents. Processing is not the domain of the encoder, but some knowledge of it helps us understand why our work is valuable: when the file is transformed into HTML, LEMDO’s output format, the <div> hierarchy determines the level of each heading (h1, h2, h3, and so on).

For example, the <head> element just below the root <div> element renders as the title of the document (i.e., the highest level of heading), and the <head> element below the second <div> element in the document renders as a level one heading (i.e., the second highest level of heading). The text nodes of the <head> elements will also be rendered as a list of page contents (accessible via a navigation pane) that users will use to navigate the page once it is rendered on the site.
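
The relationship between nesting and heading level can be sketched by annotating the fictional "learn_encodeTerm" file. The comments paraphrase the description above; the note on the third heading is an inference from the statement that the <div> hierarchy determines the heading level, not a documented rendering rule:

<div xml:id="learn_encodeTerm">
  <!-- Renders as the title of the document -->
  <head>Encode a Term</head>
  <div xml:id="learn_encodeTerm_eg">
    <!-- Renders as a level one heading -->
    <head>Examples</head>
    <div xml:id="learn_encodeTerm_eg_uncommon">
      <!-- Presumably renders one level lower again -->
      <head>Uncommon Words</head>
    </div>
  </div>
</div>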

¶ Paragraphs

The basic structural unit within the <div> element is the paragraph, wrapped in the <p> element. Three paragraphs will often suffice instead of three divisions with headings, especially if the paragraphs convey information that an editor or encoder is likely to read as a unit. In other words, paragraphs are the obvious way of organizing information when you do not want to split the information across multiple <div> elements.

Prose is the default mode of writing documentation, but do think carefully about how people are most likely to need information presented. You are not writing an argument. You are giving explanations and instructions. Instructions are often for step-by-step procedures and often include forks in the road where an editor has to make a forced choice between options. Step-by-step procedures lend themselves to numbered lists. Forced choices lend themselves to nested lists. A paragraph may contain a list and/or nested lists. A paragraph may be used to introduce or comment on a list.
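
A step-by-step procedure with a fork in the road might be sketched as follows. The wording of the items is placeholder text, and the "numbered" and "bulleted" values on @rend are the ones described under Types of Lists; treat this as one possible arrangement rather than a required pattern:

<p>Follow these steps: <list rend="numbered">
    <item>Step One described.</item>
    <item>Step Two, which requires a choice: <list rend="bulleted">
        <item>If the first condition applies, do this.</item>
        <item>If the second condition applies, do that.</item>
      </list>
    </item>
    <item>Step Three described.</item>
  </list>
</p>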

¶ Lists

Lists are an excellent way to organize information in documentation. They are highly readable on screen and make it easy for readers scanning for a solution to find what they need. If you find that you are creating comma-separated prose lists, consider turning them into formal lists, wrapped in the <list> element with each item wrapped in the <item> element.

Conversely, do not encode as lists long blocks of text that are really paragraphs. If you find that your list items are becoming long, then consider converting the list back to paragraphs or even to small <div> elements with <head> elements to guide the reader to the right information.

Decide whether your list should be independent of a paragraph or embedded in a paragraph. Consider the information in this table when deciding whether to embed a list in a paragraph or keep it separate:

Lists Independent of Paragraphs	Lists Embedded in Paragraphs
Usually longer	Usually shorter
Extrinsic to the paragraph in terms of content	Intrinsic to the paragraph in terms of content
Probably not discussed in the next paragraph	Discussed in the remainder of the paragraph

Example: List independent of paragraph

<div>
  <p>Running prose about something.</p>
  <list rend="bulleted">
    <item>Item</item>
    <item>Item</item>
  </list>
  <p>Running prose about something.</p>
</div>

Example: List embedded in paragraph

<div>
  <p>Running prose introducing the list: <list rend="bulleted">
      <item>Item</item>
      <item>Item</item>
    </list> Running prose commenting on the list.</p>
</div>

For those readers who are thinking ahead to the XSLT processing that turns our XML into HTML: HTML does not allow lists to appear inside paragraphs. However, we donʼt convert our lists and paras to HTML lists and paras; we use more generic block elements.

¶ Types of Lists

LEMDO permits numbered, bulleted, and simple lists.

Use "numbered" as the value on the @rend attribute if the items in the list are in a particular order (a step-by-step process whereby one step must be completed before the next step) or if enumerating them is important. In the latter case, there might be a preceding comment and an introductory colon.

<div>
  <p>Follow these three steps:</p>
  <list rend="numbered">
    <item>Step One described, beginning with a second-person, imperative action verb.</item>
    <item>Step Two described.</item>
    <item>Step Three described.</item>
  </list>
</div>

Use "bulleted" as the value on the @rend attribute if the items are not sequential or ordered in any particular way:

<div>
  <p>Tips for Encoding:</p>
  <list rend="bulleted">
    <item>Learn Keyboard Shortcuts to save time</item>
    <item>Update (<code>svn up</code>) often</item>
    <item>Use the outline view in Oxygen to quickly see the structure of your file</item>
  </list>
</div>

Simple lists have neither numbers nor bullets. We generally discourage the use of simple lists because they have no distinguishing feature that marks them as a list in the HTML output, other than indentation and a new line beginning. They are less useful for the scanning reader than the two other types of lists. One use-case where using the "simple" value on the @rend attribute is the right choice is the linked table of chapter contents that we create in the introductory file for each chapter.

Example from the table of contents for the documentation chapter:

<list rend="simple">
  <item>
    <ref target="doc:learn_docStructure">Structure of Documentation</ref>
  </item>
  <item>
    <ref target="doc:learn_docPrinciples">General Documentation Principles</ref>
  </item>
  <item>
    <ref target="doc:learn_docStyle">Write Documentation: Style Guide</ref>
  </item>
  <item>
    <ref target="doc:learn_docCreate">Create and Name Documentation</ref>
  </item>
</list>

¶ Tables

Tables are an excellent way to display information. Use tables to convey information that requires the user to look up something in order to learn how to encode or edit that particular thing. For example, the list of abbreviations for various bibliographic <idno> elements lends itself to a two-column table with the name of the bibliographic resources (e.g., Short Title Catalogue) in the first column and the abbreviation (STC) in the second.

Tables have the advantage of being sortable in alphabetical and reverse alphabetical order on either column. The table foreseen in the preceding paragraph would give the editor or encoder the ability to sort on the second column if they need to know the details of the value on an <idno>, or to sort on the first column if they know the name of the resource and want to discover its <idno> value.
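
A sketch of how the table imagined above might begin, using the <row> and <cell> model given later in this section. The table heading and column labels are hypothetical, and only the one resource named in the text is included:

<table>
  <head>Abbreviations for Bibliographic Resources</head>
  <row role="label">
    <cell>Resource</cell>
    <cell>Abbreviation</cell>
  </row>
  <row role="data">
    <cell>Short Title Catalogue</cell>
    <cell>STC</cell>
  </row>
</table>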

Tables may have a <head> element. Use the first <row> element (with the value "label" on the @role attribute) to label the columns. Each subsequent <row> will have the "data" value and must have the same number of cells as the <row> with the "label" value.

Note that tables with more than four columns do not render well in our HTML output and are therefore less usable to the editor/encoder.

To encode a table, use the following model:

<table>
  <head>Optional Head</head>
  <row role="label">
    <cell>Head of first column</cell>
    <cell>Head of second column</cell>
  </row>
  <row role="data">
    <cell>Data for first column of first row</cell>
    <cell>Data for second column of first row</cell>
  </row>
  <row role="data">
    <cell>Data for first column of second row</cell>
    <cell>Data for second column of second row</cell>
  </row>
</table>

¶ Tips for Managing Complex Structures

¶ Tips for Divisions

You must close the <div> elements in your documents in the same order that you opened them. Since the first <div> element is the largest container, its opening and closing tags must enclose all the other <div> elements. Be careful when closing <div> elements because a misplaced closing <div> tag can, in rare cases, disrupt the content hierarchy without making the file invalid. Furthermore, it is easy to lose sight of your opening <div> tag in a long, complex document. Errors are most likely to occur when you are moving <div> elements and their contents to new places in your document; you may well accidentally disrupt your document hierarchy.

Strategies you can use to avoid misplaced closing tags:

If you added your content to the file (perhaps by copying and pasting content from Teams, a .docx file, a .txt file, GoogleDrive, or any other non-XML context) and are adding tags after the fact:

Highlight the text that you want to contain under a particular heading and press Ctrl+e (PC/Windows keyboards) or Cmd+e (Mac keyboards) to wrap it in a <div> element.

If you have done a document analysis and are adding tags first and copying in (or writing) content after:

Add all the <div> elements and nest them, add <p> elements as necessary, add your content to them, and encode it.

Use the Outline view in Oxygen (Window → Show View → Outline) and filter for <div> elements (by typing div in the filter box).
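
To illustrate the kind of error these strategies guard against, here is a hypothetical pair of fragments. In the first, the two inner divisions are siblings, as intended. In the second, the closing tag of the first inner division has slipped below the second one, so "Tips" has silently become a subsection of "Examples". The second fragment is still well-formed XML, which is why the mistake is easy to miss (whether a project's validation would flag the now-inconsistent xml:id is not something this sketch assumes):

<!-- Intended: "Examples" and "Tips" are siblings -->
<div xml:id="learn_encodeTerm">
  <head>Encode a Term</head>
  <div xml:id="learn_encodeTerm_eg">
    <head>Examples</head>
    <p>Running prose.</p>
  </div>
  <div xml:id="learn_encodeTerm_tips">
    <head>Tips</head>
    <p>Running prose.</p>
  </div>
</div>

<!-- Misplaced closing tag: "Tips" is now nested inside "Examples" -->
<div xml:id="learn_encodeTerm">
  <head>Encode a Term</head>
  <div xml:id="learn_encodeTerm_eg">
    <head>Examples</head>
    <p>Running prose.</p>
    <div xml:id="learn_encodeTerm_tips">
      <head>Tips</head>
      <p>Running prose.</p>
    </div>
  </div>
</div>

With the Outline view filtered on div, the second version would show "Tips" indented one level deeper than intended.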

¶ Tips for Encoding Lists

Wrap all of the text that you want to render as a list in a <list> element using Ctrl+e (PC/Windows keyboards) or Cmd+e (Mac keyboards), then wrap each list item in an <item> element. Nest lists by making a <list> element a child of an <item> element:

<list rend="bulleted">
  <item>Check Tagging of Verse and Prose <list rend="numbered">
    <item>Remove Mode Milestones</item>
  </list>
  </item>
</list>

Prosopography

Isabella Seales

Isabella Seales is a fourth year undergraduate completing her Bachelor of Arts in English at the University of Victoria. She has a special interest in Renaissance and Metaphysical Literature. She is assisting Dr. Jenstad with the MoEML Mayoral Shows anthology as part of the Undergraduate Student Research Award program.

Janelle Jenstad

Janelle Jenstad is a Professor of English at the University of Victoria, Director of The Map of Early Modern London, and Director of Linked Early Modern Drama Online. With Jennifer Roberts-Smith and Mark Kaethler, she co-edited Shakespeare’s Language in Digital Media: Old Words, New Tools (Routledge). She has edited John Stow’s A Survey of London (1598 text) for MoEML and is currently editing The Merchant of Venice (with Stephen Wittek) and Heywood’s 2 If You Know Not Me You Know Nobody for DRE. Her articles have appeared in Digital Humanities Quarterly, Elizabethan Theatre, Early Modern Literary Studies, Shakespeare Bulletin, Renaissance and Reformation, and The Journal of Medieval and Early Modern Studies. She contributed chapters to Approaches to Teaching Othello (MLA); Teaching Early Modern Literature from the Archives (MLA); Institutional Culture in Early Modern England (Brill); Shakespeare, Language, and the Stage (Arden); Performing Maternity in Early Modern England (Ashgate); New Directions in the Geohumanities (Routledge); Early Modern Studies and the Digital Turn (Iter); Placing Names: Enriching and Integrating Gazetteers (Indiana); Making Things and Drawing Boundaries (Minnesota); Rethinking Shakespeare Source Study: Audiences, Authors, and Digital Technologies (Routledge); and Civic Performance: Pageantry and Entertainments in Early Modern London (Routledge). For more details, see janellejenstad.com.

Joey Takeda

Joey Takeda is LEMDO’s Consulting Programmer and Designer, a role he assumed in 2020 after three years as the Lead Developer on LEMDO.

Martin Holmes

Martin Holmes has worked as a developer in UVicʼs Humanities Computing and Media Centre for over two decades, and has been involved with dozens of Digital Humanities projects. He has served on the TEI Technical Council and as Managing Editor of the Journal of the TEI. He took over from Joey Takeda as lead developer on LEMDO in 2020. He is a collaborator on the SSHRC Partnership Grant led by Janelle Jenstad.

Navarra Houldin

Project manager 2022–present. Textual remediator 2021–present. Navarra Houldin (they/them) completed their BA in History and Spanish at the University of Victoria in 2022. During their degree, they worked as a teaching assistant with the University of Victoriaʼs Department of Hispanic and Italian Studies. Their primary research was on gender and sexuality in early modern Europe and Latin America.

Nicole Vatcher

Technical Documentation Writer, 2020–2022. Nicole Vatcher completed her BA (Hons.) in English at the University of Victoria in 2021. Her primary research focus was womenʼs writing in the modernist period.

Tracey El Hajj

Junior Programmer 2019–2020. Research Associate 2020–2021. Tracey received her PhD from the Department of English at the University of Victoria in the field of Science and Technology Studies. Her research focuses on the algorhythmics of networked communications. She was a 2019–2020 President’s Fellow in Research-Enriched Teaching at UVic, where she taught an advanced course on Artificial Intelligence and Everyday Life. Tracey was also a member of the Map of Early Modern London team, between 2018 and 2021. Between 2020 and 2021, she was a fellow in residence at the Praxis Studio for Comparative Media Studies, where she investigated the relationships between artificial intelligence, creativity, health, and justice. As of July 2021, Tracey has moved into the alt-ac world for a term position, while also teaching in the English Department at the University of Victoria.

Orgography

LEMDO Team (LEMD1)

The LEMDO Team is based at the University of Victoria and normally comprises the project director, the lead developer, project manager, junior developer(s), remediators, encoders, and remediating editors.

Metadata

Authority title	Structure of a Single Documentation File
Type of text	Documentation
Short title
Publisher	University of Victoria on the Linked Early Modern Drama Online Platform
Series	Linked Early Modern Drama Online
Source	TEI Customization created by Martin Holmes, Joey Takeda, and Janelle Jenstad; documentation written by members of the LEMDO Team
Editorial declaration	n/a
Edition	Released with Linked Early Modern Drama Online 1.0
Encoding description	Encoded in TEI P5 according to the LEMDO Customization and Encoding Guidelines
Document status	prgGenerated
Funder(s)	Social Sciences and Humanities Research Council of Canada
License/availability	This file is licensed under a CC BY-NC-ND 4.0 license, which means that it is freely downloadable without permission under the following conditions: (1) credit must be given to the author and LEMDO in any subsequent use of the files and/or data; (2) the content cannot be adapted or repurposed (except in quotations for the purposes of academic review and citation); and (3) commercial uses are not permitted without the knowledge and consent of the editor and LEMDO. This license allows for pedagogical use of the documentation in the classroom.