Atelier Villejuif, LAMOP, Paris, May 2008
Digital Philology: from XML & XSLT to XPath 2.0 & XQuery in Medieval Transcriptions
The Arts & Humanities Research Council
The Roberts Fund
CNRS
Agence Nationale de la Recherche
www.python.org
xmlfr.org
A laboratory for digital philology: a grant won from the Roberts Fund by Mansfield in 2007 to prepare XML tools for research engineers working in Middle French
UTF-8 is an 8-bit Unicode Transformation Format
It forgives non-accented characters, but you must manage things when your web page has accented characters
e.g. Georgian commas • ii •
An ampersand (une esperluette) is shown as the HTML entity &amp;
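A minimal sketch in Python of the two points above, assuming nothing beyond the standard library. The sample word and the variable names are illustrative, not taken from the project's files. It shows that an accented letter occupies more than one byte in UTF-8, what happens when such bytes are read with the wrong character set, and how an ampersand is escaped as an entity.

```python
from xml.sax.saxutils import escape

plain = "numerique"   # unaccented: every letter is one byte in UTF-8
word = "numérique"    # "é" is encoded as two bytes in UTF-8

print(len(plain.encode("utf-8")))  # 9
print(len(word.encode("utf-8")))   # 10

# Reading UTF-8 bytes as Latin-1 garbles only the accented letter,
# which is why unaccented text appears to be "forgiven".
mojibake = word.encode("utf-8").decode("latin-1")
print(mojibake)  # numÃ©rique

# A literal ampersand must be written as an entity in XML and HTML.
print(escape("XML & XSLT"))  # XML &amp; XSLT
```

Declaring the encoding explicitly in the web page and saving the file as UTF-8 avoids the garbling shown in the middle step.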
November 2007 TEI P5 Announced
Full elements need an opening and a closing tag, while an 'empty' element is complete in one tag.
A full element is opened <name> and closed </name>, while an empty element is complete within itself: <abbr expan="n"/>
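The difference can be checked with Python's standard XML parser. This is only a sketch: the wrapper element `<w>` and the name "Paris" are assumptions made up for the illustration, not part of the transcription scheme.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: <w> and "Paris" are hypothetical.
fragment = '<w><name>Paris</name>tie<abbr expan="n"/>t</w>'
root = ET.fromstring(fragment)

name = root.find("name")
abbr = root.find("abbr")

print(name.text)          # Paris  (content between opening and closing tag)
print(abbr.text)          # None   (an empty element has no content)
print(abbr.get("expan"))  # n      (its information lives in the attribute)
print(abbr.tail)          # t      (text that follows the empty element)

# A full element left without its closing tag is not well-formed XML.
try:
    ET.fromstring("<w><name>Paris</w>")
    well_formed = True
except ET.ParseError:
    well_formed = False
print(well_formed)        # False
```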
Our empty elements, most of which Python converts from simple rules:
tie<abbr expan="n"/>t
<cb n="b"/> here marking the start of column b
<lb n="CDAM.369c:01"/>
<layout ruledLines="38"/>
<pb n="241" side="r"/> page break, here marking the start of fol. 241r
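The slides do not give the conversion rules themselves, so the following is only a sketch of how rule-based conversion to empty elements might look in Python. The shorthand marks (`{n}`, `[cb b]`, `[pb 241r]`) and the function name `to_empty_elements` are hypothetical, invented for this illustration; only the output elements come from the list above.

```python
import re

# HYPOTHETICAL shorthand: the project's real typing rules are not
# stated in the slides. Each rule maps a mark to an empty element.
RULES = [
    (re.compile(r"\{(\w+)\}"), r'<abbr expan="\1"/>'),               # {n}
    (re.compile(r"\[cb (\w)\]"), r'<cb n="\1"/>'),                   # [cb b]
    (re.compile(r"\[pb (\d+)([rv])\]"), r'<pb n="\1" side="\2"/>'),  # [pb 241r]
]

def to_empty_elements(text):
    """Apply each rule in turn and return the tagged text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

print(to_empty_elements("[pb 241r][cb b]tie{n}t"))
# <pb n="241" side="r"/><cb n="b"/>tie<abbr expan="n"/>t
```

Keeping the rules in one table makes it easy to add or change a mark without touching the conversion loop.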
The only one you will need to type in will be: soubz<space/>rire will show as 'soubz rire' in the diplomatic transcription and as 'soubzrire' in the modern edition
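The two renderings of `<space/>` can be illustrated with a short Python sketch. The function names are assumptions; in the project itself the rendering is presumably done by the XSLT transform, which the slides do not show.

```python
import re

# Matches the empty element, with or without a space before the slash.
SPACE = re.compile(r"<space\s*/>")

def diplomatic(text):
    """Keep the scribe's word division: <space/> becomes a visible space."""
    return SPACE.sub(" ", text)

def modern(text):
    """Follow modern word division: <space/> is dropped."""
    return SPACE.sub("", text)

print(diplomatic("soubz<space/>rire"))  # soubz rire
print(modern("soubz<space/>rire"))      # soubzrire
```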
XML + XSL = HTML Web Page
EditiX and Oxygen have built-in transform processors so that you can apply XSLT scripts to your XML transcription at any time during your tagging work
You cannot change the HTML web page. It is frozen. However, it does highlight for you any potential problems
Our Portfolio of Essential Files
transcription.xml is the text you are transcribing.
4431.dtd is our Document Type Definition, which lists in alphabetical order the elements and their attributes that we have chosen from the TEI P5 guidelines.
mat/table.xsl is our main transform program for the whole manuscript and for individual books. It differs from our specialised XSLT programs, which include glossary.xsl for rendering a table of glosses from all those we have typed in inline.
Please be careful when downloading XSLT files, because some systems (especially on Windows) add a further file extension of .xml after the .xsl.
classes.css is our full list of class definitions for our cascading style-sheet which renders font size and colour on the final web version.
javascript.js is CM's JavaScript or ECMAScript code for the interactive glosses and name pop-up windows.
fleur.png is the triangular red notes pointer image for the web versions of the online Harley transcription.
get.html is the pilot version of our Glossary File and system which passes a link to the DMF2 at our partner laboratory ATILF.
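For the warning about downloaded XSLT files that gain an extra .xml extension, a small Python sketch can repair a folder in one pass. The function name `strip_extra_xml_extension` is an assumption for this illustration; it renames files such as glossary.xsl.xml back to glossary.xsl and never overwrites an existing file.

```python
from pathlib import Path

def strip_extra_xml_extension(folder):
    """Rename *.xsl.xml files in folder to *.xsl; return the new names."""
    renamed = []
    for path in sorted(Path(folder).glob("*.xsl.xml")):
        target = path.with_suffix("")  # removes only the final ".xml"
        if target.exists():
            continue                   # do not overwrite a good copy
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling `strip_extra_xml_extension(".")` from the folder holding the portfolio files would return the list of files it corrected, or an empty list if nothing needed changing.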
A Native XML Database
Native, so we do not have to do any additional coding beyond our TEI P5 XML
http://sligachan.lib.ed.ac.uk:8080/exist/admin/admin.xql
FIN
charlie.mansfield à vodafone.net
arnoul.vjf.cnrs.fr/actes
Future Funding
HERA Joint Research Programmes
http://www.heranet.info/Default.aspx?ID=274
National Contacts:
France - CNRS, Bruno [email protected]
United Kingdom - AHRC, Christelle Pellecuer [email protected]