Monthly Archives: April 2015

Digital Humanities Internship Blog Post #2 – ‘Why So Called I Know Not:’ Transcribing and TEI Marking Up Joseph Holloway’s ‘Impressions of a Dublin Playgoer’

The following is the second blog post written to document progress on ‘The Lost Theatres of Dublin’ internship as part of the MPhil in Digital Humanities and Culture. This post will detail the process of transcribing excerpts from Joseph Holloway’s ‘Impressions of a Dublin Playgoer’ that refer to the Queen’s Theatre, and the process of translating this transcription into a TEI document marked up according to the TEI-C guidelines.

It was decided at the start of this internship that the status of the manuscript itself would not form part of the TEI code and what mattered was the recording of the content, independent of the form in which it appeared. As such, features such as line breaks, blemishes or annotations were not transcribed.

Holloway’s punctuation is often inconsistent. His periods, commas and hyphens are used interchangeably, and sometimes he refrains from punctuating his sentences at all, allowing them to run into one another. This led to procedural difficulties, not only because transcription becomes more difficult when dealing with jumbled syntax, but because at this point in the project it was decided that the end product would probably be provided on an open-access website. As Holloway’s diaries are presumably of interest to both amateur theatre enthusiasts and researchers, it was decided that punctuation and spelling would be standardised as part of the transcription process, both so that the encoded text would make sense to the end-user or reader and so as not to give the impression that mistakes had been made at the encoding or transcription stage. In one instance, in the entry given for the production of Sisyphus or the Forgotten Friend (1900), Holloway’s misspelling of the name of the character ‘Sisyphus’ as ‘Sisiphus’ was maintained and encoded using the <choice> element. The mistake was encoded within the <sic> element and the corrected spelling ‘Sisyphus’ was encoded by use of the <corr> element.
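The treatment of the misspelling can be sketched as follows. This is a minimal illustration, assuming the standard TEI elements <choice>, <sic> and <corr>; the surrounding sentence is invented for the example and is not a transcription of Holloway's entry.

```xml
<!-- Hypothetical sentence: only the two spellings come from the entry discussed above -->
<p>The part of
  <choice>
    <sic>Sisiphus</sic>   <!-- Holloway's spelling, kept as written -->
    <corr>Sisyphus</corr> <!-- editorial correction -->
  </choice>
  was well received.</p>
```

Pairing the two readings inside one <choice> lets a stylesheet display either the original or the corrected spelling without losing the other.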

The TEI Header is a fundamental component of TEI documents and contains metadata relevant to the text that is being marked up. The <titleStmt> element in turn contains <title> and <author> elements. It was decided that the title of this particular TEI document would be Excerpts from the microfilmed manuscript of Joseph Holloway’s ‘Impressions of a Dublin Playgoer’ from the years 1895, 1896, 1900, 1905 & 1910 as regards the Queen’s Theatre. This somewhat cumbersome title was used because the title Impressions of a Dublin Playgoer alone could have been regarded as inaccurate or misleading, as if the full manuscript were being encoded rather than just a series of excerpts. The <author> was encoded as ‘Joseph Holloway.’ His birth and death dates (1861-1944) were also provided inside this element.

The <respStmt> lists ‘Joseph Holloway’ as the person responsible for originally preparing the manuscript. However, crediting Holloway as the sole individual responsible for the creation of the text would ignore the role of those responsible for creating the microfilm of the manuscript. Unfortunately, in the ‘credits’ for the microfilm at the beginning of the reel, the individuals responsible for carrying out the work of converting the manuscript to film are not named. Instead, The American Microfilm Company, the company responsible for the project of converting the Manuscripts of the Irish Literary Renaissance, is named. The American Microfilm Company was therefore named in the <respStmt> as converting the manuscript into microfilm. ‘Chris Beausang’ was named as editor and transcriber.
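A title statement along the lines described might look like the sketch below. The element names (<titleStmt>, <title>, <author>, <respStmt>, <resp>, <name>) are standard TEI but are assumptions as far as this project's file is concerned, and the wording of each <resp> is invented for illustration.

```xml
<titleStmt>
  <title>Excerpts from the microfilmed manuscript of Joseph Holloway's
    'Impressions of a Dublin Playgoer' from the years 1895, 1896, 1900,
    1905 &amp; 1910 as regards the Queen's Theatre</title>
  <author>Joseph Holloway (1861-1944)</author>
  <!-- One respStmt per responsibility; the resp wording is hypothetical -->
  <respStmt>
    <resp>Original preparation of the manuscript</resp>
    <name>Joseph Holloway</name>
  </respStmt>
  <respStmt>
    <resp>Conversion of the manuscript into microfilm</resp>
    <name>The American Microfilm Company</name>
  </respStmt>
  <respStmt>
    <resp>Editor and transcriber</resp>
    <name>Chris Beausang</name>
  </respStmt>
</titleStmt>
```

Note that the ampersand in the title has to be escaped as &amp;amp; for the file to remain well-formed XML.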

The <publicationStmt> element gave the publication status of the manuscript as “Unpublished” and recorded the fact that it is currently held by the National Library of Ireland. The address of the National Library of Ireland, Kildare Street, Dublin 2, was provided in the <address> element.

The <date> element was given as 2015, the year that this project commenced, rather than the year that the manuscript was written or the conversion into microfilm was carried out, in order to reflect accurately when the TEI file was created.

The <sourceDesc> element contained a <bibl> element, to indicate the presence of bibliographic information about the resource being marked up. The following statement was inserted into this element: “Selections from Joseph Holloway’s ‘Impressions of a Dublin Playgoer,’ microfilmed by The American Microfilm Company in 1968.”
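The publication and source information described above could be encoded roughly as follows. This is a sketch under stated assumptions: the use of <publicationStmt>, <authority>, <address>, <availability> and <sourceDesc> is a guess at one valid TEI arrangement, not a copy of the project's header.

```xml
<publicationStmt>
  <!-- authority is an assumed choice for naming the holding institution -->
  <authority>National Library of Ireland</authority>
  <address>
    <addrLine>Kildare Street</addrLine>
    <addrLine>Dublin 2</addrLine>
  </address>
  <!-- Year the TEI file was created, not the year of the manuscript or microfilm -->
  <date>2015</date>
  <availability>
    <p>Unpublished</p>
  </availability>
</publicationStmt>
<sourceDesc>
  <bibl>Selections from Joseph Holloway's 'Impressions of a Dublin Playgoer,'
    microfilmed by The American Microfilm Company in 1968.</bibl>
</sourceDesc>
```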

The <encodingDesc> and <projectDesc> elements provide a good opportunity to summarise the rationale behind the project. The following text was placed within these tags, between <p> tags: “This TEI document was prepared as part of a ‘Lost Theatres of Dublin’ internship. This was done in order to potentially provide a basis for a future, fully digitised, TEI version of the diaries. Each date, performance, actor, playwright and company was marked up. Each entry and year is contained in a separate ‘div’ element.”

The <editorialDecl> in turn contains two further elements. In the first, the <correction> element, and through use of the <p> tags, it was indicated that punctuation is corrected ‘silently,’ meaning that the corrections are not stated in the TEI code itself. As was stated earlier, this is because the preservation of the manuscript and its bibliographic codes was not a priority for this project. In the second element, the following was embedded between the <p> tags: “Notes are not encoded as notes. Syntax and punctuation is corrected when the lack of punctuation corrupts the sense.”
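Put together, the encoding description might be laid out as in the sketch below. The quoted statements come from the text above; the element names <encodingDesc>, <projectDesc>, <editorialDecl>, <correction> and <normalization>, the method attribute, and the wording inside <correction> are assumptions made for illustration.

```xml
<encodingDesc>
  <projectDesc>
    <p>This TEI document was prepared as part of a 'Lost Theatres of Dublin'
      internship. This was done in order to potentially provide a basis for a
      future, fully digitised, TEI version of the diaries. Each date,
      performance, actor, playwright and company was marked up. Each entry and
      year is contained in a separate 'div' element.</p>
  </projectDesc>
  <editorialDecl>
    <!-- method="silent" records that corrections are not marked in the text -->
    <correction method="silent">
      <p>Punctuation is corrected silently.</p> <!-- hypothetical wording -->
    </correction>
    <!-- normalization is an assumed home for the second statement -->
    <normalization>
      <p>Notes are not encoded as notes. Syntax and punctuation is corrected
        when the lack of punctuation corrupts the sense.</p>
    </normalization>
  </editorialDecl>
</encodingDesc>
```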

When the first year being transcribed is encoded, the <div> element is used. The div type is ‘year’ and the year is provided by using n= immediately afterwards. Here is an example of encoding the <div> for the year 1895: <div type="year" n="1895">


When beginning a new entry for a particular performance, the <div> element is also used, but the div type is, in this instance, given as the word ‘performance.’ The <title> element is also used, but its use necessitated use of the <bibl> element, as the TEI-C assumes that <title> would always be used in a context pertaining to bibliographic information. The first entry in Holloway’s diary that pertains to the Queen’s Theatre is for the play Forty Thieves, and the result reads as follows: “<bibl><title>Forty Thieves</title></bibl>.”
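The nesting of year and performance divisions can be sketched as follows. The type values ‘year’ and ‘performance’ and the n attribute come from the description above; the <head>, <bibl> and <title> wrapping of the play's name is an assumption about how the heading was encoded.

```xml
<div type="year" n="1895">
  <div type="performance">
    <!-- head/bibl/title is an assumed arrangement for the play's name -->
    <head><bibl><title>Forty Thieves</title></bibl></head>
    <p><!-- text of Holloway's entry goes here, as a single paragraph --></p>
  </div>
  <!-- further performance divs for the same year follow here -->
</div>
```

Keeping each performance as a child of its year makes it straightforward to select all entries for a given year by the n attribute.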

The text of Holloway’s entries is contained within the <p> tags. Though this was not a priority, breaking paragraphs only when a new entry is started has the serendipitous effect of being faithful to the layout of the manuscript, as Holloway never uses paragraph breaks inside a single entry.

Holloway provides the date of each entry in the following format: “21 September” followed by a dash or hyphen. This was embedded inside the <date> element, while a more expansive and detailed account of the date was provided inside the angled brackets of the opening tag. The result is as follows: “21 Monday”.
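A date encoded in this way might look like the following sketch. It assumes that the fuller date is carried by the standard TEI when attribute on <date>, and the year and month in the attribute value are illustrative only, since the entry quoted above does not state them.

```xml
<!-- The when value is hypothetical: the text shows only the element content -->
<date when="1896-09-21">21 Monday</date>
```

The element content keeps Holloway's own wording, while the attribute gives a machine-readable date in ISO format (year, month, day) that can be sorted and searched.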

When a theatre company is referenced in one of Holloway’s entries, it is encoded as such, using the <orgName> element. By way of example, the following is one instance of how the Milton-Rays company is encoded in TEI: <orgName>Milton-Rays</orgName>.

When Holloway’s writing was too difficult to decipher, or the ink Holloway used had blotted, or the text had faded (either on the microfilm or in the manuscript itself), the <unclear> tag was used. Initially a guess as to what was said was embedded between the opening and closing tags, but this practice was discontinued because of the likelihood that the guess was inaccurate. The unclear tags appear in the TEI document without these guesses and read as follows: “<unclear/>.” This has the advantage of allowing those who may wish to build on the work of this project to see where its gaps are, and will allow them to be filled in more easily.

Performers’ names are encoded using the <persName> tag, and the fact that they are performers is declared through use of the @type attribute. An example of this from Holloway’s first entry reads as follows: “<persName type="performer">Miss Ellie White</persName>.”

When Holloway refers to a character in the play by name, the same element and @type attribute are used, as in the following example: “Lassim.”
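The name encodings discussed above can be gathered into one sketch. The element names <orgName>, <persName> and <unclear> are assumed here, and of the type values only ‘performer’ is attested in the text; ‘character’ is a hypothetical value. The names are listed side by side purely for illustration and do not reproduce a sentence from the diary.

```xml
<p>
  <!-- A theatre company -->
  <orgName>Milton-Rays</orgName>
  <!-- A performer, declared through the type attribute -->
  <persName type="performer">Miss Ellie White</persName>
  <!-- A character in the play; the value "character" is an assumption -->
  <persName type="character">Lassim</persName>
  <!-- An illegible passage, left empty rather than guessed at -->
  <unclear/>
</p>
```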

Holloway sometimes notes the time at which he gets home or at which the performance ended. Initially, this was marked up using the <time> element with the @when attribute, but this was discontinued, as Holloway only records the time a handful of times in the entire corpus, which made it structurally insignificant.

When a playwright is mentioned, the <persName> element with the @type attribute is used to declare them as such. However, it seems that playwrights often appeared as performers in the plays that they wrote. Therefore, how they were declared depended on context and was decided on a case-by-case basis.

On one or two occasions, Holloway mentions the person responsible for the scenery, probably in instances where the scenery was of sufficient quality to merit discussion. As the scenographer is mentioned so infrequently, creating a new @type value seemed gratuitous. Instead, the @type attribute ‘performer’ was used, which is not inaccurate, considering the porous nature of different roles in travelling theatre companies.


Digital Scholarly Editing Blog Post #3: A Derridean Critique of TEI Mark-up and XML Language

The development of TEI mark-up has had the effect of standardising the encoding practice that surrounds the creation of machine-readable texts. TEI has the advantage of rendering these texts in a form that is more easily indexed and searched. Although this non-proprietary, agreed code of practice facilitates more dynamic interaction than would otherwise be possible, there are critics who argue that the hierarchical arrangement of elements in a text is antithetical to the practice of the humanities, and that its overt standardisation contravenes openness to differing interpretations and understandings of a text.

Jerome McGann is probably the best known of these critics of TEI encoding. This is due to his association with early digital humanities projects such as the Rossetti Archive and his writings on the theoretical implications of using an encoding language for literary data. The hierarchical nature of the language is a particular bugbear for McGann. As he writes in Radiant Textuality: Literature after the World Wide Web (2001), “its [XML’s] hierarchical principles and other design characteristics set permanent and unacceptable limits on its usefulness with arts and humanities materials.”[1] This reluctance to engage in an exercise that essentially amounts to the standardisation of literary texts can be explained by a brief and simplified survey of the contemporary critical environment within the humanities, primarily as regards literary criticism.

The trend towards systematised critical approaches such as structuralism and formalism in the early twentieth century has been wholly reversed following the development of post-structuralism, deconstructionism, post-colonialism, feminism and related approaches, which established themselves in response to and in conjunction with these more schematic critical approaches in order to reveal what many of their practitioners had neglected or ignored. This wave of critics in the sixties and seventies was influential in emphasising the need to resist over-arching grand narratives and neat answers to textual questions. Humanities practitioners now exist in an environment where the theories of Gayatri Chakravorty Spivak, Michel Foucault and Jacques Derrida have entered the mainstream and are no longer as revolutionary or ‘against the grain’ as they once were.

What we have inherited from the theorists cited above is a post-Derridean notion of what a text is. A text is infinite, endlessly referential and fundamentally indeterminate, embroiled as it is in an endless play of deferred meaning. This is one of the cornerstones of the contemporary humanities landscape and is one of the first things I learnt as an undergraduate. One can see why there is a vocal cohort of critics speaking out against the application of a rigid TEI code that leaves no room for ambiguity in approaching a text. However, it is the contention of this blog post that many of these critics, who inherit their sense of textuality from deconstructionists such as Derrida, are at least partially unaware of what exactly Derrida was positing when he developed his own critical approaches. This blog post will attempt, as far as possible, to argue that TEI is not as antithetical to deconstructionism as these critics maintain.

Firstly, this post will provide a (very) brief summary of Derrida’s ideas as expressed in the first part of Of Grammatology (1967), ‘Writing Before the Letter.’ If Derrida can be said to have one chief argument to make in this section, it is an attack on the implicitly held beliefs of Western philosophy, which upholds the existence of a particular kind of ‘presence’ embodied in the spoken word. This has the consequence of downgrading the value of writing, which is, in the discourse of Western philosophy, a mere representation or image of full speech. This distrust of an alleged representation and preference for an ideal is a tenet of Western philosophy that Derrida identifies as existing as far back as Plato’s Phaedrus (c. 370 B.C.E.). Derrida dismisses this notion of speech as ‘theological,’ a kind of scholastic purism that has no place in a post-Enlightenment philosophical framework. This binary opposition and privileging of speech is difficult for contemporary thinkers to evade, however. If it can be located in the originary Western philosophical texts, the trace of the idea remains in the philosophical schemata that have been devised since. It is therefore necessary for Derrida to make use of the discourse of Western philosophy as it exists now, for better or worse, without perpetuating these binary oppositions, thereby preventing the implicit privileging of one term over the other. An example of how to go about this is Martin Heidegger’s use of the word ‘Being’ with a line through it, to draw attention to the fact that he is discussing Being in a way that the reader would traditionally understand it, while simultaneously distancing the word from inherited notions of Being. This is one of the methodologies behind Derrida’s ‘science’ of grammatology.

This raises the question of what deconstruction means for a markup language such as TEI. Derrida, in deconstructing a text, is attempting to rectify this relegation of writing to the margins of philosophy. Derrida proposes, as a corrective, that we emphasise the endless ‘play’ of meaning and recognise the self-contradictory nature of all utterance. This would be problematic for TEI for a number of reasons. The marking up of tortured ambiguity is not a straightforward task for any encoder. When marking up a text, questions are generally either/or in nature. Can this alleged inflexibility of TEI be used meaningfully to approach a text of any kind? The thoughts of Derrida’s translator Spivak on the act of translation and interpretation will be productive in this context:

“Any act of reading is besieged and delivered by the precariousness of intertextuality. And translation is after all, one version of intertextuality. If there are no unique words, if, as soon as a privileged concept-word emerges, it must be given over to the chain of substitutions and to the ‘common language,’ why should that act of substitution that is translation be suspect? If the proper name or sovereign status of the author is as much a barrier as a right of way, why should the translator’s position be secondary?”[2]

If we understand the act of marking up a text as a kind of translation, involving many of the same skills, procedures and means of interpretation, then according to Spivak there is no reason why such a product should be devalued. If deconstructionists oppose distinctions being made between low/high, original/copy, why is TEI so open to criticism? In my own (admittedly limited) reading, Derrida and the critics who use his understanding of text as devoid of inherent meaning, and as more of a trace effect in the mind of the reader, have yet to produce a convincing reason why the most basic units of text should be exempt from classification. Simply put, a word, a sentence or a paragraph is not always simply an effect of the reader’s interpretation and can be quite easily identified as a textual feature in itself.

Therefore, it is hoped that this blog post has demonstrated that one can be a committed Derridean and still carry out the work of the TEI-C with a clear conscience. As Derrida himself writes, intellection about the instability of the signifier must be consciously forgotten occasionally in order for things to move forward. Derrida never advocates the complete destruction of the economy of signs as it exists today, regardless of some of the more violent implications of the word deconstruction.

“Up to a certain point, such repression is even necessary to the progress of positive investigation. Beside the fact that it would still be held within a philosophising logic, the ontophenomenological question of essence…could, by itself, only paralyze or sterilise the typological or historical research of facts.”[3]

The question as to where this certain point is located is, as always, up to the editor.

Rather than continuing to rail against that which Derrida did, we should recognise the changed critical landscape which over time has developed its own orthodoxies and blasphemies. It is these that the critic and editor should be trying to overturn and re-conceptualise. This post will conclude with the argument that it was not Derrida’s intention that we incorporate his ideas into our critical approaches only to arrive at a point of stasis. Instead, we should continue to apply and develop the kind of critical rigour that we find in his writings and re-invigorate deconstruction as an ongoing process, rather than a dead-end. This is not to argue for a return to value judgements or the worst excesses of ‘rational’ humanist thought that Derrida, Foucault and Jacques Lacan read against the grain. It is instead an invocation to recognise that deconstruction does not end; it is instead an ongoing process of undoing hierarchies in a productive manner.

[1] McGann, Jerome, Radiant Textuality: Literature After the World Wide Web (Palgrave: 2001), p.17

[2] Derrida, Jacques, Of Grammatology (The Johns Hopkins University Press: 1997), p.lxxxvi

[3] Ibid, p.28


Derrida, Jacques, Spivak, Gayatri Chakravorty (Translator), Of Grammatology (The Johns Hopkins University Press: 1997)

McGann, Jerome, Radiant Textuality: Literature after the World Wide Web (Palgrave: 2001)

Schreibman, Susan, Siemens, Ray and Unsworth, John, A Companion to Digital Humanities (Blackwell: 2004)

Schreibman, Susan and Siemens, Ray, A Companion to Digital Literary Studies (Blackwell: 2008)

Text Encoding Initiative Consortium Website,