Tag Archives: a portrait of the artist as a young man

A (Proper) Statistical analysis of the prose works of Samuel Beckett


Content warning: If you want to get to the fun parts, the results of an analysis of Beckett’s use of language, skip to sections VII and VIII. Everything before that is navel-gazing methodology stuff.

If you want to know how I carried out my analysis, and utilise my code for your own purposes, here’s a link to my R code on my blog, with step-by-step instructions, because not enough places on the internet include that.

I: Things Wrong with my Dissertation’s Methodology

For my master’s, I wrote a 20,000-word dissertation which took as its subject an empirical analysis of the works of Samuel Beckett. I had a corpus of his entire works with the exception of his first novel, Dream of Fair to Middling Women, which is a forgivable lapse, because he ended up cannibalising it for his collection of short stories, More Pricks than Kicks.

Quantitative literary analysis is generally carried out in one of two open-source programming languages, Python or R. The former you’re more likely to have heard of, being one of the few languages designed with usability in mind. The latter, R, would be more familiar to specialists, or people who work in the social sciences, as it is more abstruse than Python, doesn’t have many language cousins and has a very unfriendly learning curve. But I am attracted to difficulty, so I am using it for my PhD analysis.

I had about four months to carry out my analysis, so the idea of taking on a programming language in a self-directed learning environment was not feasible, particularly since I wanted to make a good go at the extensive body of secondary literature written on Beckett. I therefore made use of a corpus analysis tool called Voyant. This was a couple of years ago, so this was before its beta release, when it got all tricked out with some qualitative tools and a shiny new interface, which would have been helpful. Ah well. It can be run out of any browser, if you feel like giving it a look.

My analysis was also chronological, in that it looked at changes in Beckett’s use of language over time, with a view to proving the hypothesis that he used a narrower vocabulary as his career continued, in pursuit of his famed aesthetic of nothingness or deprivation. As I wanted to chart developments in his prose over time, I dated the composition of each text, and built a corpus for each year, from 1930–1987, excluding, of course, years in which he wrote only drama or poetry, genres that wouldn’t be helpful to quantify in conjunction with prose. Which didn’t stop me doing so for my masters analysis. It was a disaster.

II: Uniqueness

Uniqueness, the measurement used to quantify the general spread of Beckett’s vocabulary, was obtained by the generally accepted formula below:

unique word tokens / total words
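As a rough illustration of the formula above, here is a minimal Python sketch (not the R script linked earlier). The `tokenize` helper and its regular expression are assumptions made for the example; a real analysis would need a more careful tokeniser.

```python
import re


def tokenize(text):
    """Lowercase the text and pull out runs of letters as word tokens.

    This is a simplifying assumption: punctuation, digits and typographic
    apostrophes are simply ignored.
    """
    return re.findall(r"[a-z']+", text.lower())


def uniqueness(tokens):
    """Unique word tokens divided by total words."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# A short phrase from 'The Way': 9 words, 6 of them distinct.
sample = "Nor of how. Nor of whom. None of anything."
print(uniqueness(tokenize(sample)))
```

On the nine-word sample the score is 6/9; on a full-length novel the same function returns a much smaller number.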

There is a problem with this measurement, in that it takes no account of a text’s relative length. As a text gets longer, the likelihood that any given word has already been used approaches 1, so each additional word is less and less likely to be new. Therefore, a text gets less unique as it gets bigger. I have the correlations to prove it:

[Figure: correlations between text length and uniqueness]

There have been various solutions proposed to this quandary, which somewhat stymies our comparative analyses. One among them is the use of vectorised measurements, which plot the text’s declining uniqueness against its word count, so we see a more impressionistic graph, such as this one, which should allow us to compare the declining uniqueness of James Joyce’s novel A Portrait of the Artist as a Young Man with that of his short story collection Dubliners.

[Figure: uniqueness plotted against word count for A Portrait of the Artist as a Young Man and Dubliners]
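A curve of this kind can be sketched by recomputing uniqueness at regular intervals as the text is read from start to finish. The Python below is only an illustration; the function name and the `step` parameter are hypothetical and not taken from any particular tool.

```python
def uniqueness_curve(tokens, step=1000):
    """Return (word_count, uniqueness) pairs measured every `step` tokens.

    The final point is always included, even if the text length is not
    a multiple of `step`.
    """
    seen = set()
    curve = []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i % step == 0 or i == len(tokens):
            curve.append((i, len(seen) / i))
    return curve


# Toy example: a two-word vocabulary repeated, measured every two tokens.
toy = "a b a b a b".split()
for word_count, score in uniqueness_curve(toy, step=2):
    print(word_count, score)
```

Plotting the pairs for two texts on the same axes gives the kind of comparison described above, with the score falling as the word count rises.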

All well and good for two or maybe even five texts, but one can see how, with large scale corpora, this sort of thing can get very incoherent very quickly. Furthermore, if one were to examine the numbers on the y-axis, one can see that the differences here are tiny. This is another idiosyncrasy of stylostatistical methods; because of the way syntax works, the margins of difference wouldn’t be regarded as significant by most statisticians. These issues relating to the measurement are exacerbated by the fact that ‘particles,’ the atomic structures of literary speech (it, is, the, a, an, and, said, etc.), make up most of a text. In pursuit of greater statistical significance for their papers, digital literary critics remove these particles from their texts, which is another unforgivable thing that we do anyway. I did not, because I was concerned that I was complicit in the neoliberalisation of higher education. I also wrote a 4000 word chapter that outlined why what I was doing was awful.

IV: Ambiguity

Ambiguity was calculated with the following formula:

number of indefinite pronouns/total word count

I derived this measurement from Dr. Ian Lancashire’s study of the works of Agatha Christie, and counted Beckett’s use of a set of indefinite pronouns, ‘everyone,’ ‘everybody,’ ‘everywhere,’ ‘everything,’ ‘someone,’ ‘somebody,’ ‘somewhere,’ ‘something,’ ‘anyone,’ ‘anybody,’ ‘anywhere,’ ‘anything,’ ‘no one,’ ‘nobody,’ ‘nowhere,’ and ‘nothing.’ Those of you who know that there are more indefinite pronouns than just these are correct; I had found an incomplete list of indefinite pronouns, and I assumed that that was all. This is just one of the many things wrong with my study. My theory was that there would be correlations to be detected between Beckett’s decreasing vocabulary and his increasing deployment of indefinite pronouns, relative to the total word count. I called the vocabulary measure ‘uniqueness,’ and the indefinite pronouns measure I called ‘ambiguity.’ This is tenuous, I know; indefinite pronouns advance information even as they elide the provision of information. It is, like so much else in the quantitative analysis of literature, totally unforgivable, yet we do it anyway.
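The ambiguity formula, restricted to the sixteen pronouns listed above, might be sketched in Python as follows. The tokenising regular expression and the choice to match against the raw text (so that the two-word ‘no one’ is caught) are assumptions made for the example.

```python
import re

# The (admittedly incomplete) list of indefinite pronouns given above.
INDEFINITE_PRONOUNS = [
    "everyone", "everybody", "everywhere", "everything",
    "someone", "somebody", "somewhere", "something",
    "anyone", "anybody", "anywhere", "anything",
    "no one", "nobody", "nowhere", "nothing",
]

# "no one" spans two words, so match on the lowercased text rather than
# on single tokens; any run of whitespace may separate "no" and "one".
PRONOUN_RE = re.compile(
    r"\b(?:" + "|".join(p.replace(" ", r"\s+") for p in INDEFINITE_PRONOUNS) + r")\b"
)


def ambiguity(text):
    """Number of indefinite pronouns divided by total word count."""
    lowered = text.lower()
    total_words = len(re.findall(r"[a-z']+", lowered))
    if total_words == 0:
        return 0.0
    return len(PRONOUN_RE.findall(lowered)) / total_words


# Nine words, three of which ('anything', 'somewhere', 'someone') are on the list.
print(ambiguity("None of anything. Somewhere again. Somehow again. Someone again."))
```

Note that ‘somehow’ is not counted, because it is not on the list; extending `INDEFINITE_PRONOUNS` would change every score.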

V: Hapax Richness

I initially wanted to take into account another phenomenon known as the hapax score, which charts occurrences of words that appear only once in a text or corpus. The formula to obtain it would be the following:

number of words that appear once/total word count

I believe that the hapax count would be of significance to a Beckett analysis because of the points at which his normally incompetent narrators have sudden bursts of loquaciousness, like when Molloy says something like ‘digital emunction and the peripatetic piss,’ before lapsing back into his ‘normal’ tone of voice. Once again, because I was often working with a pen and paper, this became impossible, but now that I know how to code, I plan to go over my masters analysis, and do it properly. The hapax score will form a part of this new analysis.
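For illustration, the hapax formula can be sketched in a few lines of Python. The function name is hypothetical, and the input is assumed to be a list of already-lowercased word tokens.

```python
from collections import Counter


def hapax_score(tokens):
    """Number of words that appear exactly once, divided by total word count."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / len(tokens)


# Every word in Molloy's phrase occurs once, so the score is at its maximum.
print(hapax_score("digital emunction and the peripatetic piss".split()))

# Here only 'how' and 'whom' occur once, out of six words.
print(hapax_score("nor of how nor of whom".split()))
```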

VI: Code & Software

A much more accurate way of analysing vocabulary, for the purposes of comparative analysis when your texts are of different lengths, therefore, would be to randomly sample them. Obviously not very easy when you’re working with a corpus analysis tool online, but far more straightforward when working through a programming language. I found a formula for representative sampling, and integrated it into the code. My script is essentially a series of nested loops and if/else statements that randomly and sequentially sample a text, calculate the uniqueness, indefiniteness and hapax density ten times, store the results in a variable, and then calculate the mean value for each by dividing the summed result by ten, the number of times that the first loop runs. I inputted each value into the statistical analysis program SPSS, because it makes pretty graphs with less effort than R requires.
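The sample-and-average procedure might look something like the Python sketch below. It is not a port of the R script: the sample size of 500 tokens, the contiguous random window (one reading of ‘randomly and sequentially’) and all function names are assumptions, and the representative-sampling formula mentioned above is not reproduced here. Only uniqueness and hapax density are computed; a pronoun count could be averaged in the same way.

```python
import random
from collections import Counter


def sample_window(tokens, size, rng):
    """Take a contiguous run of `size` tokens starting at a random position.

    Texts shorter than `size` are returned whole (an assumption).
    """
    if len(tokens) <= size:
        return list(tokens)
    start = rng.randrange(len(tokens) - size + 1)
    return tokens[start:start + size]


def mean_scores(tokens, size=500, runs=10, seed=None):
    """Average uniqueness and hapax density over `runs` random samples."""
    if not tokens:
        raise ValueError("cannot sample an empty text")
    rng = random.Random(seed)
    uniqueness_total = 0.0
    hapax_total = 0.0
    for _ in range(runs):
        window = sample_window(tokens, size, rng)
        counts = Counter(window)
        uniqueness_total += len(counts) / len(window)
        hapax_total += sum(1 for c in counts.values() if c == 1) / len(window)
    # Divide the summed results by the number of runs to get the means.
    return {"uniqueness": uniqueness_total / runs, "hapax": hapax_total / runs}


# Toy text: any four-token window of this cycle contains four distinct words.
toy = ["a", "b", "c", "d"] * 100
print(mean_scores(toy, size=4, runs=10, seed=1))
```

Because every sample has the same length, scores for a 500-word piece and a 100,000-word novel become comparable, at the cost of some run-to-run variation unless a seed is fixed.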

VII: Results

I used SPSS’ box plot function first to identify any outliers for uniqueness, hapax density and ambiguity. 1981 was the only year which scored particularly high for relative usage of indefinite pronouns.

[Figure: box plots of uniqueness, hapax density and ambiguity]

It should be said that this measure, too, is correlated to the length of the text, which only stands to reason; as a text gets longer the relative incidence of a particular set of words will decrease. Therefore, as the only texts Beckett wrote this year, ‘The Way’ and ‘Ceiling,’ together add up to about 582 words (the fifth lowest year for prose output in his life), one would expect indefiniteness to be somewhat higher in comparison to other years. However, this doesn’t wholly account for its status as an outlier value. Towards the end of his life Beckett wrote increasingly short prose pieces. Comment C’est (How It Is) was his last novel, and was written almost thirty years before he died. This probably has a lot to do with his concentration on writing and directing his plays, but in his letters he attributed it to a failure to progress beyond the third novel in his so-called trilogy of Molloy, Malone meurt (Malone Dies) and L’Innommable (The Unnamable). It is in the year 1950, the year in which L’inno was completed, that Beckett began writing the Textes pour rien (Texts for Nothing), scrappy, disjointed pieces, many of which seem to be taking up from where L’inno left off, as do the Fizzles and the Faux Départs. ‘The Way,’ I think, is an outgrowth of a later phase in Beckett’s prose writing, which dispenses with the peripatetic loquaciousness and the understated lyricism of the trilogy and replaces it with a more brute and staccato syntax, one which is often dependent on the repetition of monosyllables:

No knowledge of where gone from. Nor of how. Nor of whom. None of whence come to. Partly to. Nor of how. Nor of whom. None of anything. Save dimly of having come to. Partly to. With dread of being again. Partly again. Somewhere again. Somehow again. Someone again.

Note also the prevalence of particle words, which will have been stripped out for the analysis, and the ways in which words with a ‘some’ prefix are repeated as a sort of refrain. This essential structure persists in the work, or at least in the artefact of the work that the code produces, and hence its status as the outlier that it is.

[Figure: uniqueness, hapax density and ambiguity plotted together]

From plotting all the values together at once, we can see that uniqueness is partially dependent on hapax density; the words that appear only once in a particular corpus would be important in driving up the score for uniqueness. While there could be said to be a case for the hypothesis that Beckett’s texts get less unique and more ambiguous up until 1944, when he completed his novel Watt, and if we’re feeling particularly risky, up until 1960 when Comment C’est was completed, it would be wholly disingenuous to advance it beyond this point, when his style becomes far too erratic to categorise definitively. Comment C’est is Beckett’s most uncompromising prose work. It has no punctuation, no capitalisation, and narrates the story of two characters, in a kind of love, who communicate with one another by banging kitchen implements off one another:

as it comes bits and scraps all sorts not so many and to conclude happy end cut thrust DO YOU LOVE ME no or nails armpit and little song to conclude happy end of part two leaving only part three and last the day comes I come to the day Bom comes YOU BOM me Bom ME BOM you Bom we Bom

VIII: Conclusion

I would love to say that the general tone is what my model is being attentive to, which is why it identified Watt and How It Is as nadirs in Beckett’s career, but I think their presence on the chart is more a product of their relative length, as novels, versus the shorter pieces which he moved towards in his later career. Clearly, Beckett’s decision to write shorter texts makes this means of summing up his oeuvre in general insufficient. Whatever changes Beckett made to his aesthetic over time, we might not need such complicated graphs to map them, and I could have just used a word processor to find the one that matters: length. Bom and Pim aside, for whatever reason after having written L’inno none of Beckett’s creatures presented themselves to him in novelistic form again. The partiality of vision and modal tone which pervades the post-L’inno works demonstrates, I think, far more effectively what it was that Beckett was ‘pitching’ for: a new conceptual aspect to his prose, which re-emphasised its bibliographic aspects, the most fundamental of which was their brevity, or the appearance of an incompleteness, by virtue of being honed to sometimes less than five hundred words.

The quantification of differing categories of words seems like a radical, and the most fun, thing to quantify in the analysis of literary texts, as the words are what we came for, but the problem is similar to one that overtakes one who attempts to read a literary text word by word by word, and unpack its significance as one goes: overdetermination. Words are kaleidoscopic, and the longer you look at them, the more threatening their darkbloom becomes, the more they swallow, excrete, the more alive they are, all round. Which is fine. Letting new things into your life is what it should be about, until their attendant drawbacks become clear, and you start to become ambivalent about all the fat and living things you have in your head. You start to wish you read poems instead, rather than novels, which make you go mad, and worse, start to write them. The point is words breed words, and their connections are too easily traced by computer. There’s something else about knowing their exact correlations to a decimal point. They seem so obvious now.


Augustus Young’s ‘Light Years’ and the anti-bildungsroman

I’ve already written about the sub-genre of bildungsroman, but just as there are antiromana, capricious responses to the bristling and audacious baggy monsters, there will be anti-bildungsromana. Categories, as they always are in order for the endless conversation about literature to continue, are difficult and the lines that separate one from the other are fraught.

James Joyce’s A Portrait of the Artist as a Young Man is fraught with indeterminacy. We can be fairly sure we’re not meant to take this Dedalus all that seriously, but are we right to dismiss him totally when his life seems to mirror that of Joyce? Could we rightly envision him growing up to become the kind of writer who was capable of writing Ulysses?

As such, we already have a complicating factor in one of the foundational examples of the genre: Portrait is both for and against the emergence of self through the muddy waters of abstract thought and the wholemeal bread of life experience; the mechanisms are deployed in conjunction and opposition with one another.

I have already mentioned that the tendency of J.M. Coetzee’s Boyhood is more oppositional than not. Coetzee’s flight from the experience of life, choosing hermetic seclusion and repudiating the bildung aspect, is firstly the kind of choice that befits his chilly aesthetic and secondly probably more realistic, bearing in mind the amount of solitude that is required to commit a novel to paper. Augustus Young’s work Light Years could also be viewed as existing in this contrarian tradition.

Young emigrates from his native Cork to London, to begin pursuing his career as a poet, an avant-garde, modernist one no less. This, predictably enough, is more difficult than he thought. Young’s nationality, coupled with the sometimes obscure nature of his poetry, makes him prone to being pigeon-holed; his readers seem prone to detecting a Celtic note, much to Young’s chagrin, anticipating some of the vitriol in Storytime, a memoir detailing Young’s touring with Light Years.

Growing tired of this, and of the disappointments that arise from carousing with literary narcissists, motivates Young’s exile from exile and his declaration of a utilitarian manifesto for his life: “I see myself as a socially useful human being but with a harmless secret. When I die some poems will be discovered. If any are good enough, they will survive. If not, so be it.” This is not only a long way off Dedalus’ plan to “encounter for the millionth time the reality of experience and to forge in the smithy of my soul the uncreated conscience of my race,” it is its exact opposite.

Contributing further to this sense of Young’s writing against the bildungsroman tradition is the book’s structure, which begins with a number of childhood and adolescent memories, continues with the adolescent flight from home and then, in its third section, enacts a regression back into childhood and early adolescence, almost as if the embracing of ‘life’ in London repels the book on a structural level, forcing it to move backwards into its earlier stages.

This third part is a short memoir of Young’s childhood. Siblings, parents, childhood friends and ancestral memory, handed down in the form of anecdotes and oral history loom large, much in keeping with Young’s attitude to memory and the genre in which he writes in general (“Memories aren’t true. But you can be true to them”). It equally expresses Young’s wish to be ‘useful,’ immersed in the idiosyncrasies of lived lives, rather than a shallow and solipsistic urban bohemia.

J.M. Coetzee’s ‘Boyhood’ and the childhood bildungsroman

The bildungsroman is straightforward enough to define, despite the intimidating length of the word itself. In fact, I’d hazard to estimate that a substantial chunk of the novelistic canon of the twentieth century probably falls into the category. The bildungsroman translates literally from the German to ‘novel of formation/education/culture,’ and as such generally narrates a sequence of key events in the life of the narrator, specifically those that turn out to be foundational in shaping the kind of self-consciousness that a writer of literary art seems to require. There are some works which chart this development of an artistic consciousness from the perspective of the author as a child and J.M. Coetzee’s fictionalised memoir Boyhood is one such example.

While it is not a novel, William Wordsworth’s epic poem The Prelude is definitely of this ilk. It describes how the young Wordsworth came to appreciate the revealed religion in nature and how it shaped his awareness of the world around him. One thinks of Wordsworth and Samuel Taylor Coleridge’s stance on child rearing, i.e. allowing them to run free in nature, rollicking through fields, smelling the daisies, what-have-you and begins to feel a bit uneasy about this iambic tract extolling the virtues of natural freedom, given what an atrocious job Coleridge did in raising his son, Hartley Coleridge.

From our contemporary perspective, it can seem as though twentieth-century examples of the genre are less extravagantly irresponsible and more realistic to boot. This is possibly due to the rise of the Viennese School which, whatever one’s opinion on its more outlying ideas, at least began to make mainstream some elementary form of developmental psychology and stressed the importance of the child’s inner world.

Proust’s In Search of Lost Time may be the archetypal bildungsroman, if only for the monumental task that reading it represents for most people; the edition that I own runs to some 3500 pages. This is not said in order to attenuate our sense of Proust’s achievement; he succeeded in marking the genre indelibly and it is rare that one finds the memoirs of an artist that are not conducted relative to Proust’s foundations. The young Marcel is an acutely sensitive young man; the vividness of his impressions of the world, the intensity of his attachment to those who populate it and the pull that art exerts on him from a young age are difficult to shake once one has put down The Way by Swann’s, the first part of this sprawling six-volume arc.

James Joyce’s A Portrait of the Artist as a Young Man is a far more withdrawn work, a tone that suits Joyce’s languorous and easy irony. It depends on a deceptively straightforward system of symbolic patterning, anchored in Catholic iconography which in equal measure oppresses and enlivens Stephen Dedalus’ (a Joyce analogue) creativity. There’s no quote that can demonstrate this process taking place in its totality, but I really like this one from the first chapter: “The Vances lived in number seven. They had a different father and mother. They were Eileen’s father and mother. When they were grown up he was going to marry Eileen. He hid under the table. His mother said:

—O, Stephen will apologise.

Dante said:

—O, if not, the eagles will come and pull out his eyes.—

Pull out his eyes,

Apologise,

Apologise,

Pull out his eyes.

Apologise,

Pull out his eyes,

Pull out his eyes,

Apologise.”

J.M. Coetzee’s Boyhood is in a similar tradition of childhood memoir, but with a key difference. As he was born in South Africa, racial politics are to the fore and as such, violence is never far from the domestic space. The brutal oppression of the land and its native people spills over into the young, unnamed boy’s awareness and from his perspective, his family is a stage of primal anger and barely suppressed, unrefined and unarticulated feelings. In one scene, the boy memorably compares himself to a spider, one of the most abject of creatures, but fittingly, one that is often seen in the home: “Always, it seems, there is something that goes wrong. Whatever he wants, whatever he likes, has sooner or later to be turned into a secret. He begins to think of himself as one of those spiders that live in a hole in the ground with a trapdoor. Always the spider has to be scuttling back into its hole, closing the trapdoor behind it, shutting out the world, hiding.” It is one of the only instances in the genre of bildungsroman where seclusion, a self-imposed exile from experience is represented as beneficial to the subject. It is an image that befits the development of a consciousness as cerebral and distant as Coetzee’s.

My Dissertation

I finished my dissertation – a quantitative analysis of the works of Samuel Beckett. There’s a copy available in Hodges & Figgis because I left one there.

Alternatively, here is the PDF.

Against the Wordle

Thoughts on James Joyce’s A Portrait of the Artist as a Young Man or ‘Revenge of the Cringe-Inducing Marginalia’ Part 2

In the previous post I confessed to having a first-year-of-undergraduate-itis when it came to annotating books that I was reading, taking up space in margins that should probably be reserved for my future self who (hopefully) knows a thing or two more about a thing or two than I do.

In the library, it’s generally the texts that are prescribed in first year that are in the worst nick, not least for the often jaw-dropping levels of hubris exhibited by their readers. If you want to see a sequence of teenagers who have recently encountered Karl Marx for the first time quibble uselessly with Terry Eagleton about his definition of a novel, you’ll know where to look. It sometimes impresses me that students in later years make an effort to respond, as if the page functions as an analogue comment board and the conversation is in some way ongoing.

As was made clear below, I wasn’t immune from the tendency myself; I also once explained Roland Barthes’ theory of the honest sign as reminiscent of the way Heath Ledger’s Joker moves in the Christopher Nolan film The Dark Knight. But occasionally my notes aren’t as oppressively baffling, as I found in my copy of James Joyce’s novel A Portrait of the Artist as a Young Man. The paragraph in question reads as follows:

“Now it seemed as if he would fail again but, by dint of brooding on the incident, he thought himself into confidence. During this process all those elements which he deemed common and insignificant fell out of the scene. There remained no trace of the tram itself nor of the tram-men nor of the horses: nor did he and she appear vividly. The verses told only of the night and the balmy breeze and the maiden lustre of the moon. Some undefined sorrow was hidden in the hearts of the protagonists as they stood in silence beneath the leafless trees and when the moment of farewell had come the kiss, which had been withheld by one, was given by both. After this the letters L. D. S. were written at the foot of the page, and, having hidden the book, he went into his mother’s bedroom and gazed at his face for a long time in the mirror of her dressing-table.”

My note helpfully notes: “Women, Freud, Lacan.”

What set me off on this trail was the presence of the mirror in the above scene, a bit of home décor that can handily get the interpretative ball rolling in any novel.

This is due to French psychoanalyst Jacques Lacan’s theory of the mirror stage, a juncture in a person’s life in which their self begins to exist. According to Lacan, this happens when a child first perceives themselves as an individual subject, a being that is distinct from their mother. It doesn’t necessarily involve an actual mirror.

This is fitting, and makes for a loaded scene, because Portrait is a novel concerned with how its precocious child Stephen Dedalus grows into a pretentious aesthete. Portrait is an extended exploration of Dedalus’ mirror stage, as he begins to see himself ‘mirrored’ as a literary artist. This can be seen in Dedalus’ emulation of Narcissus, cosying up to his new self-image as a writer.

Anne Enright once said that becoming a writer is to adopt a position of importance. Dedalus’ swollen ego certainly comes across in his preening, gazing and autographing a piece of juvenilia with his whimsical pseudonym “L. D. S.,” as if mindful of future antiquarian Christmas addicts who will come calling for the relic of the author’s manuscripts.

Joyce is ambivalent about his creature, not just in the above quotation, but in this novel in general. Throughout, he leans a bit more heavily on the irony dial than he does in Dubliners, giving us plenty of hints that the reader shouldn’t be taking the antics of this aesthete seriously. Far from a budding Joyce, Dedalus may be what Joyce was at risk of becoming, if his self-regard and consciousness had overwhelmed his capacity to write anything of note.

The rather ingenious way that Joyce has this come across in this scene is the fact that Dedalus’ mirror stage takes place while he inspects his reflection in his mother’s mirror, after having written what sounds like a horrendous poem.

It is just as likely that Dedalus’ mirror stage marks the futility of his adolescent declaration of “Non serviam!” He pinched the line from Milton anyway.