The TEI XML Document

Please note that the TEI XML document is a work in progress. The working document, PROJECT_2023-1100_5May2023.xml, can be downloaded from my GitHub page.

Encoding Manuscript Features

Unicode

The tables below list a few of the Unicode characters used in the diplomatic transcription to represent the morphology of medieval abbreviations.

Using Diacritical Marks

  • These are combined with the immediately preceding character. 

 

Symbol  NCR code¹  Unicode Name
◌̃       771        Combining Tilde
◌̄       772        Combining Macron
◌ͣ       867        Combining Latin Small Letter A
◌ͥ       869        Combining Latin Small Letter I
◌ͯ       879        Combining Latin Small Letter X
◌ᷢ       7650       Combining Latin Letter Small Capital R²

¹ The Numeric Character Reference (NCR) code is written in the TEI XML document with a preceding ampersand and hash sign and a closing semicolon, like this: &#771;

² Modifier Letter Capital R (ᴿ) is another option for encoding the ‘ur’ abbreviation with a superscript R. It is more widely supported than the combining diacritical mark. (Diacritical marks combine with the preceding character to form a single glyph.)

 See W3 Schools for more information. 
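
The note on NCR codes can be illustrated with a short Python sketch. It is only an illustration and not part of the project's workflow: `to_ncr` is a hypothetical helper name, and the sample string (a `q` followed by the combining tilde, 771) is invented for the example.

```python
import unicodedata


def to_ncr(text):
    """Write every non-ASCII character in text as a decimal NCR (&#N;)."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# A combining mark follows the base letter it attaches to.
abbreviated = "q" + chr(771)

print(to_ncr(abbreviated))                   # q&#771;
print(unicodedata.name(chr(771)))            # COMBINING TILDE
print(unicodedata.combining(chr(771)) != 0)  # True: it is a combining mark
```

Looking a code up with `unicodedata.name` is a quick way to check that an NCR in the transcription points at the intended character.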

Using Unicode Characters for Medieval abbreviations 

  • These are used to approximate the morphology of some specific abbreviations.  
Symbol  NCR code  Unicode Name                                Medieval Abbreviation
ę       281       Latin Small Letter E with Ogonek            ae
ꝝ       42845     Latin Small Letter Rum Rotunda              -rum
❜       10076     Heavy Single Comma Quotation Mark Ornament  -us (as in eius)
ȝ       541       Latin Small Letter Yogh                     -us (as in -bus)
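
As a rough illustration of how these characters stand in for abbreviations, the Python sketch below maps each NCR code from the table to its expansion. The `EXPANSIONS` table and the `expand` helper are hypothetical names, and the sample words are invented for the example; this is not a description of how the project itself resolves abbreviations.

```python
# NCR code -> expansion, taken from the table above.
EXPANSIONS = {
    281: "ae",     # e with ogonek
    42845: "rum",  # rum rotunda
    10076: "us",   # comma-shaped mark, as in eius
    541: "us",     # yogh, as in -bus
}


def expand(word):
    """Replace each abbreviation character in word with its expansion."""
    return "".join(EXPANSIONS.get(ord(ch), ch) for ch in word)


print(expand("ei" + chr(10076)))    # eius
print(expand("omnib" + chr(541)))   # omnibus
print(expand("bono" + chr(42845)))  # bonorum
```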

 

Using Unicode Characters for Medieval symbols 

  • These are used for a variety of abbreviations. 
Symbol  NCR code  Unicode Name                             Medieval Abbreviations
ħ       295       Latin Small Letter H with Stroke         Ihesus, antiphonis
ꝉ       42825     Latin Small Letter L with High Stroke    gloria, apostolus
ꭈ       43848     Latin Small Letter Double R              Occurrit, inter
Ꞑ       42896     Latin Capital Letter N with Descender    Introit, Natiuitate

 

TEI XML Elements & Attributes

Feature      TEI XML Encoding
Rubrication  <hi rend="color: red">…</hi>
Small caps   <hi rend="sc">…</hi>, OR use Latin Small Capital Letter characters
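
To show how this markup and the NCR codes behave once the XML is parsed, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The sample line and the `highlighted` helper are invented for illustration, and the TEI namespace that a real TEI document declares is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Invented sample: a rubricated word ending in a combining macron (772)
# and a word in small caps. No TEI namespace is declared here.
SAMPLE = (
    '<p><hi rend="color: red">Introit&#772;</hi> '
    '<hi rend="sc">gloria</hi></p>'
)


def highlighted(xml_text):
    """Return (rend value, text) for every <hi> element in xml_text."""
    root = ET.fromstring(xml_text)
    return [(hi.get("rend"), hi.text) for hi in root.iter("hi")]


for rend, text in highlighted(SAMPLE):
    # ascii() makes the decoded combining mark visible as an escape.
    print(rend, ascii(text))
```

The parser decodes `&#772;` into the combining macron itself, so the NCR and the literal character are equivalent once the document is read.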