Tag Archives: james joyce

Collocations in Modernist Prose

I have recently begun to experiment with Natural Language Processing to determine how particular words in modernist texts are correlated. I’m still getting my head around Python and NLTK, but so far I’m finding it much more user-friendly than similar packages in R.
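As a rough illustration of what a collocation list involves, here is a minimal sketch in plain Python. It is not the NLTK routine used for the lists in this post: it simply counts adjacent word pairs and ranks them by raw frequency, whereas NLTK's collocation finder scores pairs with a statistical association measure. The stopword list, the `top_collocations` name and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; NLTK ships a much longer one).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "he", "she",
             "it", "that", "his", "her", "with", "on", "at", "for", "from"}

def top_collocations(text, n=20, min_count=2):
    """Return the n most frequent adjacent word pairs.

    Pairs are dropped if they occur fewer than min_count times, or if
    either word is a stopword or shorter than three letters.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    pairs = Counter(zip(tokens, tokens[1:]))
    kept = {
        pair: count
        for pair, count in pairs.items()
        if count >= min_count
        and all(len(word) >= 3 and word not in STOPWORDS for word in pair)
    }
    # Most frequent first; ties are broken alphabetically.
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return [" ".join(pair) for pair, _ in ranked[:n]]

# Invented sample text, not drawn from the corpus.
sample = ("The young man saw the old man. A young man from New York "
          "met the old man in New York. The young man left.")
print("; ".join(top_collocations(sample)))  # young man; new york; old man
```

Because this version ranks by raw counts, it will favour pairs that are merely common; an association measure also rewards pairs whose words rarely appear apart.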

Long-term I hope to map these collocations in a high-dimensional vector space, so that I can graph them, but for the moment, I’m interested in noting the prevalence of the term ‘young man’, the fact that Self and Baume are the only authors that have female adjective-noun phrases, and the usage of titles which convey particular social hierarchies; Joyce, Woolf and Bowen’s collocations are almost exclusively composed of these, as are Stein’s, with the qualification that Stein’s appear shorn of their ‘Mr.’, ‘Miss.’ or ‘Doctor’.

Here are all the collocations in the modernist corpus:

young man; robert jordan; new york; gertrude stein; old man; could see; henry martin; every one; years ago; first time; long time; hugh monckton; great deal; come back; david hersland; good deal; every day; edward colman; came back; alfred hersland

Canonical modernist texts:

young man; robert jordan; gertrude stein; henry martin; new york; every one; old man; could see; years ago; long time; hugh monckton; first time; great deal; david hersland; come back; good deal; every day; edward colman; alfred hersland; mr. bettesworth

Contemporary texts, Enright, Self, Baume, McBride:

fat controller; phar lap; von sasser; first time; per cent; could see; old man; one another; even though; years ago; new york; front door; young man; either side; someone else; dave rudman; last night; living room; steering wheel; every time

Djuna Barnes

frau mann; nora said; english girl; someone else; long ago; leaned forward; london bridge; come upon; could never; god knows; doctor said; sweet sake; first time; five francs; terrible thing; francis joseph; hôtel récamier; orange blossoms; bowed slightly; would say

Eimear McBride

kentish town; someone else; first time; last night; jesus christ; something else; years ago; five minutes; every day; hail mary; take care; next week; arms around; never mind; every single; little girl; little boy; two years; soon enough; come back

Elizabeth Bowen

mrs kerr; lady waters; mrs heccomb; major brutt; mme fisher; lady naylor; miss fisher; good deal; said mrs; first time; lady elfrida; one another; young man; colonel duperrier; aunt violet; last night; ann lee; one thing; sir robert; sir richard

Ernest Hemingway

robert jordan; old man; could see; colonel said; gran maestro; catherine said; jordan said; richard gordon; long time; pilar said; thou art; pablo said; nick said; bill said; girl said; captain willie; young man; automatic rifle; mr. frazer; david said

F. Scott Fitzgerald

new york; young man; years ago; first time; sally carrol; several times; fifth avenue; ten minutes; minutes later; richard caramel; thousand dollars; five minutes; young men; evening post; old man; next day; saturday evening; long time; last night; come back

Gertrude Stein

gertrude stein; every one; david hersland; alfred hersland; angry feeling; family living; independent dependent; jeff campbell; julia dehning; mrs. hersland; daily living; whole one; bottom nature; madeleine wyman; good deal; mary maxworthing; middle living; miss mathilda; mabel linker; every day

James Joyce

buck mulligan; said mr.; martin cunningham; aunt kate; says joe; mary jane; corny kelleher; ned lambert; mrs. kearney; stephen said; mr. henchy; ignatius gallaher; father conmee; nosey flynn; mr. kernan; myles crawford; cissy caffrey; ben dollard; mr. cunningham; miss douce

Marcel Proust

young man; faubourg saint-germain; long ago; caught sight; first time; every day; one day; great deal; des laumes; young men; could see; quite well; next day; one another; would never; nissim bernard; victor hugo; would say; louis xiv; long time

Samuel Beckett

said camier; said mercier; miss counihan; lord gall; miss carridge; mr. kelly; panting stops; said belacqua; mr. endon; said wylie; said neary; one day; otto olaf; dr. killiecrankie; come back; vast stretch; mrs gorman; push pull; something else; ground floor

Sara Baume

even though; tawny bay; living room; old man; passenger seat; bird walk; maggot nose; shut-up-and-locked room; stone fence; food bowl; lonely peephole; low chair; old woman; kennel keeper; rearview mirror; shih tzu; shore wall; safe space; every day; oneeye oneeye

Virginia Woolf

miss barrett; mrs. ramsay; mrs. hilbery; young man; st. john; could see; years ago; peter walsh; mrs. thornbury; miss allan; said mrs.; young men; mrs. swithin; human beings; wimpole street; mrs. flushing; mr. ramsay; mrs. manresa; sir william; door opened

Anne Enright

new york; per cent; eliza lynch; dear friend; years old; even though; first time; came back; years ago; long time; michael weiss; señor lópez; living room; every time; looked like; could see; one day; said constance; pat madigan; mrs hanratty

Will Self

fat controller; phar lap; von sasser; one another; old man; could see; first time; per cent; dave rudman; let alone; front door; young man; skip tracer; quantity theory; jane bowen; los angeles; young woman; either side; charing cross; long since

Flann O’Brien

father fahrt; good fairy; father cobble; said shanahan; mrs crotty; said furriskey; said lamont; mrs laverty; one thing; sergeant fottrell; said slug; old mathers; public house; far away; cardinal baldini; monsignor cahill; mrs furriskey; red swan; black box; said shorty

Ford Madox Ford

henry martin; hugh monckton; edward colman; privy seal; mr. bettesworth; mr. fleight; young man; mr. sorrell; sergius mihailovitch; young lovell; new york; jeanne becquerel; lady aldington; kerr howe; anne jeal; miss peabody; mr. pett; great deal; marie elizabeth; robert grimshaw

Jorge Luis Borges

ts’ui pên; buenos aires; pierre menard; eleventh volume; richard madden; nils runeberg; yiddische zeitung; stephen albert; hundred years; erik lönnrot; firing squad; henri bachelier; madame henri; orbis tertius; vincent moon; paint shop; seventeenth century; anglo-american cyclopaedia; fergus kilpatrick; years ago

Joseph Conrad

mrs. travers; mrs verloc; mrs. fyne; peter ivanovitch; doña rita; miss haldin; mrs. gould; assistant commissioner; charles gould; san tomé; chief inspector; years ago; captain whalley; could see; van wyk; old man; dr. monygham; gaspar ruiz; young man; mr. jones

D.H. Lawrence

young man; st. mawr; mr. may; mrs. witt; blue eyes; miss frost; could see; one another; mrs bolton; ‘all right; come back; said alvina; two men; of course; good deal; long time; mr. george; next day

William Faulkner

uncle buck; aleck sander; miss reba; years ago; dewey dell; mrs powers; could see; white man; four years; old man; ned said; division commander; general compson; miss habersham; new orleans; uncle buddy; let alone; one another; united states; old general

Literary Cluster Analysis

I: Introduction

My PhD research will involve arguing that there has been a resurgence of modernist aesthetics in the novels of a number of contemporary authors. These authors are Anne Enright, Will Self, Eimear McBride and Sara Baume. All these writers have, at various public events and in the course of many interviews, given very different accounts of their specific relation to modernism, and even if the definition of modernism wasn’t totally overdetermined, we could spend the rest of our lives defining the ways in which their writing engages, or does not engage, with the modernist canon. Indeed, if I have my way, this is what I will spend a substantial portion of my life doing.

It is not in the spirit of reaching a methodology of greater objectivity that I propose we analyse these texts through digital methods; having begun my education in statistical and quantitative methodologies in September of last year, I can tell you that these really afford us no *better* a view of any text than just reading it would, but fortunately I intend to do that too.

This cluster dendrogram was generated in R, and owes its existence to Matthew Jockers’ book Text Analysis with R for Students of Literature, from which I developed a substantial portion of the code that creates the output above.

What the code is attentive to is the words that these authors use the most. When analysing literature qualitatively, we tend to have a magpie sensibility, zoning in on words which produce more effects or stand out in contrast to the literary matter which surrounds them. As such, the ways in which a writer uses the words ‘the’, ‘an’, ‘a’, or ‘this’ tend to pass us by, but they may be far more indicative of a writer’s style, at least in the way that a computer would be attentive to it; sentences that are ‘pretty’ are generally statistically insignificant.

II: Methodology

Every corpus that you can see in the above image was scanned into R, and then run through a script which counted the number of times every word was used in the text. The resulting figure is called the word’s frequency, and was then reduced down to its relative frequency, by dividing the figure by the total number of words in the corpus, and multiplying the result by 100. Every word with a relative frequency above a certain threshold was put into a matrix, and a function was used to cluster the corpora together based on the similarity of the figures they contained, according to a Euclidean metric I don’t fully understand.
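The relative-frequency step can be sketched in a few lines. The original analysis was done in R; this is a Python approximation with assumed function names (`relative_frequencies`, `frequency_row`) and a deliberately crude tokeniser, so the figures it produces would not match the R output exactly.

```python
import re
from collections import Counter

def relative_frequencies(text):
    """Map each word to its share of all words in the text, as a percentage."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    # Relative frequency = raw count / total number of words * 100.
    return {word: 100 * count / total for word, count in counts.items()}

def frequency_row(text, vocabulary):
    """One matrix row: the relative frequency of each vocabulary word, 0 if absent."""
    freqs = relative_frequencies(text)
    return [freqs.get(word, 0.0) for word in vocabulary]

# Invented five-word "corpus": 'the' accounts for 2 of the 5 words.
print(relative_frequencies("the cat and the dog")["the"])        # 40.0
print(frequency_row("the cat and the dog", ["the", "and", "of"]))  # [40.0, 20.0, 0.0]
```

Stacking one such row per corpus, over a shared vocabulary of high-frequency words, gives the kind of matrix described above.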

The final matrix was 21 X 57, and compared these 21 corpora on the basis of their relative usage of the words ‘a’, ‘all’, ‘an’, ‘and’, ‘are’, ‘as’, ‘at’, ‘be’, ‘but’, ‘by’, ‘for’, ‘from’, ‘had’, ‘have’, ‘he’, ‘her’, ‘him’, ‘his’, ‘I’, ‘if’, ‘in’, ‘is’, ‘it’, ‘like’, ‘me’, ‘my’, ‘no’, ‘not’, ‘now’, ‘of’, ‘on’, ‘one’, ‘or’, ‘out’, ‘said’, ‘she’, ‘so’, ‘that’, ‘the’, ‘them’, ‘then’, ‘there’, ‘they’, ‘this’, ‘to’, ‘up’, ‘was’, ‘we’, ‘were’, ‘what’, ‘when’, ‘which’, ‘with’, ‘would’, and ‘you’.
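For anyone else unsure about the Euclidean metric: it is the straight-line distance between two rows of the matrix, found by squaring the difference in each column, summing the squares and taking the square root. The sketch below computes it and then builds clusters bottom-up. It assumes complete linkage (the default in R's `hclust`), which may or may not be what the original code used, and the three rows are invented numbers rather than real corpus figures.

```python
import math

def euclidean(row_a, row_b):
    """Straight-line distance between two equal-length rows of figures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row_a, row_b)))

def merge_order(rows):
    """Agglomerative clustering with complete linkage.

    rows maps a label to its list of relative frequencies. Returns the
    merges in the order they happen, as (height, cluster_a, cluster_b).
    """
    clusters = [[name] for name in sorted(rows)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: judge two clusters by their furthest members.
                height = max(euclidean(rows[a], rows[b])
                             for a in clusters[i] for b in clusters[j])
                if best is None or height < best[0]:
                    best = (height, i, j)
        height, i, j = best
        merges.append((height, clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical relative frequencies for two words in three made-up texts.
rows = {"text_a": [6.0, 3.0], "text_b": [6.0, 4.0], "text_c": [1.0, 3.0]}
for height, left, right in merge_order(rows):
    print(round(height, 2), left, right)
```

The heights at which clusters merge are what a dendrogram draws on its vertical axis: here `text_a` and `text_b` join first, at a distance of 1.0, and `text_c` joins much higher up.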

Anyway, now we can read the dendrogram.

III: Interpretation

Speaking about the dendrogram in broad terms can be difficult for precisely the reason that I indicated above; quantitative and qualitative methodologies for text analysis are totally opposed to one another, but what is obvious is that Eimear McBride and Gertrude Stein are extreme outliers, and comparable only to each other. This is in one way unsurprising, because of their brutish, repetitive styles, and in other ways very surprising, because McBride is on record as dismissing Stein’s work for being ‘too navel-gaze-y.’

Jorge Luis Borges and Marcel Proust have branched off in their own direction, as has Sara Baume, which I’m not quite sure what to make of. Franz Kafka, Ernest Hemingway and William Faulkner have formed their own nexus. More comprehensible is the Anne Enright, Katherine Mansfield, D.H. Lawrence, Elizabeth Bowen, F. Scott Fitzgerald and Virginia Woolf cluster; one could make admittedly sweeping judgements about how this could be said to be modernism’s extreme centre, in which the radical experimentalism of its more revanchiste wing was fused rather harmoniously with nineteenth-century social realism, which produced a kind of indirect discourse, at which I think each of these authors excels.

These revanchistes are well represented in the dendrogram’s right wing, with Flann O’Brien, James Joyce, Samuel Beckett and Djuna Barnes having clustered together, though I am not quite sure what to make of Ford Madox Ford/Joseph Conrad’s showing at all, being unfamiliar with the work.

IV: Conclusion

The basic rule in interpreting dendrograms is that the lower down two ‘leaves’ are joined, the more similar they can be said to be. Therefore, Anne Enright and Will Self are the contemporary modernists most closely aligned to the forebears, if indeed forebears they can be said to be. It would be harder, from a quantitative perspective, to align Sara Baume with this trend in a straightforward manner, and McBride only seems to correlate with Stein because of how inalienably strange their respective prose styles are.

The primary point to take away here, if there is one, is that more investigations are required. The analysis is hardly unproblematic. For one, the corpus sizes vary enormously. Borges’ corpus is around 46 thousand words, whereas Proust’s reaches somewhere around 1.2 million. In one way, the results are encouraging: Borges and Barnes, two authors with only one text in their corpus, aren’t prevented from being compared to novelists with serious word counts. In another way, it is pretty well impossible to derive literary measurements from texts without taking their length into account. The next stage of the analysis will probably involve breaking the corpora up into units of 50 thousand words, so that the results for individual novels can be compared.
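Breaking a corpus into equal-sized units is simple enough to sketch. The function below is an assumption about how that next stage might look, not code from the analysis; in particular, folding a short final fragment into the previous unit is only one of several reasonable choices (dropping it is another).

```python
def split_into_chunks(tokens, chunk_size=50_000):
    """Split a list of word tokens into consecutive units of chunk_size words.

    A final fragment shorter than half a unit is folded into the previous
    unit, so that no very short sample skews the relative frequencies.
    """
    chunks = [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]
    if len(chunks) > 1 and len(chunks[-1]) < chunk_size // 2:
        tail = chunks.pop()
        chunks[-1].extend(tail)
    return chunks

# Small demonstration with a unit size of 5 rather than 50,000.
print([len(c) for c in split_into_chunks(list(range(11)), 5)])  # [5, 6]
print([len(c) for c in split_into_chunks(list(range(13)), 5)])  # [5, 5, 3]
```

Each unit could then be given its own row of relative frequencies, so that a 46-thousand-word corpus is compared with samples of similar length rather than with 1.2 million words at once.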

Re-reading Eimear McBride’s ‘A Girl is a Half-Formed Thing’

A book that I’m looking forward to reading, that doesn’t exist yet, is an academic account of how Irish contemporary fiction went, in such a short space of time, from social realism, to the precociously sentenced art writing with dissociative narrators that now composes the Irish literary milieu. It’s the sort of thing that was probably brewing for a long time, these trends tend to be, but I first became aware of it when Eimear McBride’s A Girl is a Half-Formed Thing was published in 2013. It caused a bit of a stir in the literary press at the time, for its supposed uncompromising experimentalism, and its fraught, J.K. Rowling-esque publication history. Critics compared it to Marcel Proust or Samuel Beckett, but I don’t think there was a single review that didn’t mention James Joyce.

In the works of Sara Baume, Joanna Walsh or Claire-Louise Bennett, there are certainly comparisons to be made along these lines, but I think McBride is the novelist of the current generation who is suffering most egregiously under these comparisons. This leads to a kind of distortion that McBride has spoken about recently, saying that it’s ‘a way of not being seen’. Claire Lowdon, writing on McBride’s prose style in Areté, has used the Joyce comparisons as a way of demeaning the novel’s experimental qualities, saying that they are ‘redundant’ and ‘artificial’:

Having invoked Joyce, Joyce has to be McBride’s standard. She has taken all the difficulty and none of the brilliance.

Lowdon’s reading is important and thorough, but I have problems with it. The most significant one is that I think it’s nonsensical to say that just because a work is in some way formally indebted to Joyce it has to be 1) as good, 2) as innovative and 3) as good and as innovative in exactly the same ways. I think it’s a very strange point to make that we should benchmark a writer relative to their influences, particularly when this is a comparison furthered more by the laziness of critics than something that McBride has taken upon herself. It’s also inadequate to assume McBride and Joyce’s modernisms are coterminous; I happen to think that they’re rather distinct in a number of significant ways.

Firstly, it’s clear that A Girl is more formally aligned with the Wake than with Ulysses, but taken relative to the former, A Girl manifests far less attention to the materiality of language. In A Girl, there are fewer puns, fewer references, fewer leitmotifs. It’s also possible to make sense of A Girl without reference to other works. But it’s a mistake to regard this as McBride’s failure to live up to her twentieth-century modernist aesthetics. An example from the novel’s opening that Lowdon cites reads as follows:

For you. You’ll soon. You’ll give her name. In the stitches of her skin she’ll wear your say. Mammy me? Yes you. Bounce the bed I’d say. I’d say that’s what you did. Then lay you down. They cut you round. Wait and hour and day.

‘Wait and hour and day’ carries with it a vague association with the phrase ‘a year and a day’, but it doesn’t strictly make sense in that context; there’s no clear reason for the semantic distortion. But there’s also no requirement that there be one, nor that it add up to some enormous mythic framework in the same way that the Wake does. I think that once we approach the novel from this position, one which takes account of McBride’s actual concerns, we’ll be able to come to a more sophisticated understanding that doesn’t amount to downgrading her because of her perceived inadequacy in relation to Joyce.

By her own admission McBride retains an interest in nineteenth-century novels with less self-consciousness about their language or processes of meaning-making. She has cited the work of the Russian novelist Fyodor Dostoevsky as significant, particularly as an example of proto-modernism, or modernism in a nascent stage of its development, wherein human intersubjectivity was beginning to make itself known within the novel while the tenets of realistic fiction were still trying to accommodate it. Being aware of the fact that The Lesser Bohemians is not the novel under discussion, it’s important to note the way in which it demonstrates this interplay. Within the context of what has been referred to by the author as a ‘modernist monologue’ there is a very sensationalistic narrative in which a character lays out their life story in a very direct and straightforward manner in the same way that you might find extended and directly rendered narratives nested within nineteenth-century novels. McBride has said that this is a very deliberate formal mechanic which is pertinent to the text’s thematic concerns, as it is a novel about relating to another person in spite of one’s traumatic past:

In the end you tell a person and you have to use the words that they’ll understand.

What makes McBride’s modernism distinct then, is the centrality it gives to the conveying of narrative information, deploying it as a means of bringing the reader closer to

physical experience, to write about the female experience…the reader can partake in the experience.

McBride has said that the language of A Girl was written in a way that would create a physical experience for the reader, an immediacy on the page that is reminiscent of theatre. She’s expressed frustration at the content of many of her reviews which have emphasised the quality of the language at the expense of the novel’s content, which she regards as very significant. This stands in contrast to the tradition of the Wake or other modernist works famed for their unintelligibility, such as Gertrude Stein’s The Making of Americans: Being a History of a Family’s Progress, a novel that she has spoken about dismissively for being ‘too navel-gaze-y.’

This stated interest in what the book is ‘about’ and a reader-centric ethic, is I think at least a partial reversal of expectations within the modernist tradition. McBride’s modernism is therefore conceptualised, not as a constructed textual estrangement from reality, but an attempt to bring it closer, to a dwelling-place of authentic being. Not that it’s likely to close off such comparisons in the future.

Can a recurrent neural network write good prose?

At this stage in my PhD research into literary style I am looking to machine learning and neural networks, and moving away from stylostatistical methodologies, partially out of fatigue. Statistical analyses are intensely process-based and always open, it seems to me, to fairly egregious ‘nudging’ in the name of reaching favourable outcomes. This brings a kind of bathos to some statistical analyses, as they account, to a greater extent than I’d like, for methodology and process, with the result that the novelty these approaches might have brought us is neglected. I have nothing against this emphasis on process necessarily, but I do also have a thing for outcomes, as well as the mysticism and relativity machine learning can bring, alienating us as it does from the process of the script’s decision making.

I first heard of the sci-fi writer from a colleague of mine in my department. It’s Robin Sloan’s plug-in for the text editor Atom which allows you to ‘autocomplete’ texts based on your input. After sixteen hours of installing, uninstalling, moving directories around and searching Stack Overflow, I got it to work. I typed in some Joyce and got stuff about Chinese spaceships as output, which was great, but science fiction isn’t exactly my area, and I wanted to train the network on a corpus of modernist fiction. Fortunately, I had the complete works of Joyce, Virginia Woolf, Gertrude Stein, Sara Baume, Anne Enright, Will Self, F. Scott Fitzgerald, Eimear McBride, Ernest Hemingway, Jorge Luis Borges, Joseph Conrad, Ford Madox Ford, Franz Kafka, Katherine Mansfield, Marcel Proust, Elizabeth Bowen, Samuel Beckett, Flann O’Brien, Djuna Barnes, William Faulkner & D.H. Lawrence to hand.

My understanding of this recurrent neural network, such as it is, runs as follows. The script reads the entire corpus of over 100 novels, and calculates the distance that separates every word from every other word. The network then hazards a guess as to what word follows the word or words that you present it with, then checks this guess against the word that actually follows. It then does so over and over and over, getting ‘better’ at predicting each time. The size of the corpus is significant in determining the length of time this will take, and mine would have required something around twelve days. I had to cut it off after twenty-four hours because I was afraid my laptop wouldn’t be able to handle it. At this point it had carried out the process 135,000 times, just below 10% of the full process. Once I get access to a computer with better hardware I can look into getting better results.
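The guess-check-adjust loop can be shown with something far smaller than a recurrent network. The sketch below is a deliberately simplified stand-in, not the plug-in's model: it predicts the next character from the current character only, with no memory of anything earlier, which is precisely what a recurrent network adds. The function names, the learning rate and the toy training text are all assumptions.

```python
import math

def train_next_char_model(text, epochs=20, learning_rate=0.5):
    """Learn to predict each character from the one before it.

    Returns the vocabulary, the learned score table, and the average
    prediction loss recorded after each pass over the text.
    """
    vocab = sorted(set(text))
    index = {ch: i for i, ch in enumerate(vocab)}
    size = len(vocab)
    # weights[i][j] scores how likely character j is to follow character i.
    weights = [[0.0] * size for _ in range(size)]
    pairs = [(index[a], index[b]) for a, b in zip(text, text[1:])]
    losses = []
    for _ in range(epochs):
        total_loss = 0.0
        for current, actual in pairs:
            row = weights[current]
            # Guess: turn the scores into probabilities (softmax).
            peak = max(row)
            exps = [math.exp(score - peak) for score in row]
            norm = sum(exps)
            probs = [e / norm for e in exps]
            # Check: penalise a low probability for what actually came next.
            total_loss += -math.log(probs[actual])
            # Adjust: raise the score of the actual character, lower the rest.
            for j in range(size):
                target = 1.0 if j == actual else 0.0
                row[j] -= learning_rate * (probs[j] - target)
        losses.append(total_loss / len(pairs))
    return vocab, weights, losses

def generate(vocab, weights, seed, length):
    """Extend seed by repeatedly choosing the highest-scoring next character."""
    index = {ch: i for i, ch in enumerate(vocab)}
    output = seed
    current = index[seed[-1]]
    for _ in range(length):
        row = weights[current]
        current = row.index(max(row))
        output += vocab[current]
    return output

# Toy training text, standing in for a corpus of novels.
text = "the cat sat on the mat. " * 5
vocab, weights, losses = train_next_char_model(text)
print(round(losses[0], 3), round(losses[-1], 3))  # the loss falls as training repeats
print(generate(vocab, weights, "a", 1))           # at
```

Even at this scale the pattern is the same: each pass over the text is another round of guessing and correcting, and the loss falling from the first pass to the last is what getting ‘better’ means.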

How this will feed into my thesis remains nebulous; I might move in a sociological direction and take survey data on how closely readers reckon the final result approximates literary prose. But at this point I’m interested in what impact it might conceivably have on my own writing. I am currently trying to sustain progress on my first novel alongside my research, so, in a self-interested enough way, I pose the question, can neural networks be used in the creation of good prose?

There have been many books written on the place of cliometric methodologies in literary history. I’m thinking here of William S. Burroughs’ cut-ups, Mallarmé’s infinite book of sonnets, and the brief flirtation the literary world had with hypertext in the 90s, but beyond the avant-garde, I don’t think I could think of an example of an author who has foregrounded their use of numerical methods of composition. A poet friend of mine has dabbled in this sort of thing but finds it expedient to not emphasise the aleatory aspect of what she’s doing, as publishers tend to give a frosty reception when their writers suggest that their work is automated to some extent.

And I can see where they’re coming from. No matter how good they get at it, I’m unlikely to get to a point where I’ll read automatically generated literary art. Speaking for myself, when I’m reading, it is not just about the words. I’m reading Enright or Woolf or Pynchon because I’m as interested in them as I am in what they produce. How synthetic would it be to set Faulkner and McCarthy in conversation with one another if their congruencies were wholly manufactured by outside interpretation or an anonymous algorithmic process as opposed to the discursive tissue of the literary sphere, if a work didn’t arise from material and actual conditions? I know I’m making a lot of value-based assessments here that wouldn’t have a place in academic discourse, and on that basis what I’m saying is indefensible, but the probabilistic infinitude of it bothers me too. When I think about all the novelists I have yet to read I immediately get panicky about my own death, and the limitless possibilities of neural networks to churn out tomes and tomes of literary data in seconds just seems to me to exacerbate the problem.

However, speaking outside of my reader-identity, as a writer, I find it invigorating. My biggest problem as a writer isn’t writing nice sentences, given enough time I’m more than capable of that, the difficulty is finding things to wrap them around. Mood, tone, image, aren’t daunting, but a text’s momentum, the plot, I suppose, eludes me completely. It’s not something that bothers me, I consider plot to be a necessary evil, and resent novels that suspend information in a deliberate, keep-you-on-the-hook sort of way, but the ‘what next’ of composition is still a knotty issue.

The generation of text could be a useful way of getting an intelligent prompt that stylistically ‘borrows’ from a broad base of literary data, smashing words and images together in a generative manner to get the associative faculties going. I’m not suggesting that these scripts would be successful were they autonomous, I think we’re a few years off one of these algorithms writing a good novel, but I hope to demonstrate that my circa 350 generated words would be successful in facilitating the process of composition:

be as the whoo, put out and going to Ingleway effect themselves old shadows as she was like a farmers of his lake, for all or grips — that else bigs they perfectly clothes and the table and chest and under her destynets called a fingers of hanged staircase and cropping in her hand from him, “never married them my said?” know’s prode another hold of the utals of the bright silence and now he was much renderuched, his eyes. It was her natural dependent clothes, cattle that they came in loads of the remarks he was there inside him. There were she was solid drugs.

“I’m sons to see, then?’ she have no such description. The legs that somewhere to chair followed, the year disappeared curl at an entire of him frwented her in courage had approached. It was a long rose of visit. The moment, the audience on the people still the gulsion rowed because it was a travalious. But nothing in the rash.

“No, Jane. What does then they all get out him, but? Or perfect?”

“The advices?”

Of came the great as prayer. He said the aspect who, she lay on the white big remarking through the father — of the grandfather did he had seen her engoors, came garden, the irony opposition on his colling of the roof. Next parapes he had coming broken as though they fould

has a sort. Quite angry to captraita in the fact terror, and a sound and then raised the powerful knocking door crawling for a greatly keep, and is so many adventored and men. He went on. He had been her she had happened his hands on a little hand of a letter and a road that he had possibly became childish limp, her keep mind over her face went in himself voice. He came to the table, to a rashes right repairing that he fulfe, but it was soldier, to different and stuff was. The knees as it was a reason and that prone, the soul? And with grikening game. In such an inquisilled-road and commanded for a magbecross that has been deskled, tight gratulations in front standing again, very unrediction and automatiled spench and six in command, a

I don’t think I’d be alone in thinking that there’s some merit in parts of this writing. I wonder if there’s an extent to which Finnegans Wake has ‘tainted’ the corpus somewhat, because stylistically, I think that’s the closest analogue to what could be said to be going on here. Interestingly, it seems to be formulating its own puns, words like ‘unrediction,’ ‘automatiled spench’ (a tantalising meta-textual reference I think) and ‘destynets’, I think, would all be reminiscent of what you could expect to find in any given section of the Wake, but they don’t turn up in the corpus proper, at least according to a ctrl + f search. What this suggests to me is that the algorithm is plotting relationships on the level of the character, as well as phrasal units. However, I don’t recall the sci-fi model turning up paragraphs that were quite so disjointed and surreal — they didn’t make loads of sense, but they were recognisable, as grammatically coherent chunks of text. Although this could be the result of working with a partially trained model.
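The ctrl + f check can be automated. The sketch below lists every word in a generated passage that never appears in the training corpus; the `coinages` name is an assumption, and the one-line corpus is a stand-in for the real files. Note that it matches whole words, whereas a ctrl + f search also finds a string buried inside a longer word, so the two checks can disagree.

```python
import re

def words_in(text):
    """The set of distinct lower-cased words in a text."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))

def coinages(generated, corpus):
    """Words in the generated text that never occur in the training corpus."""
    return sorted(words_in(generated) - words_in(corpus))

# A phrase from the generated output above, checked against a stand-in corpus.
generated = "very unrediction and automatiled spench"
corpus = "and it was very late"
print(coinages(generated, corpus))  # ['automatiled', 'spench', 'unrediction']
```

Run against the full corpus, a list like this would show at a glance how much of the output’s vocabulary is genuinely the network’s own.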

So, how might they feed our creative process? Here’s my attempt at making nice sentences out of the above.

— I have never been married, she said. — There’s no good to be gotten out of that sort of thing at all.

He’d use his hands to do chin-ups, pull himself up over the second staircase that hung over the landing, and he’d hang then, wriggling across the awning it created over the first set of stairs, grunting out eight to ten numbers each time he passed, his feet just missing the carpeted surface of the real stairs, the proper stairs.

Every time she walked between them she would wonder which of the two that she preferred. Not the one that she preferred, but the one that were more her, which one of these two am I, which one of these two is actually me? It was the feeling of moving between the two that she could remember, not his hands. They were just an afterthought, something cropped in in retrospect.

She can’t remember her sons either.

Her life had been a slow rise, to come to what it was. A house full of men, chairs and staircases, and she wished for it now to coil into itself, like the corners of stale newspapers.

The first thing you’ll notice about this is that it is a lot shorter. I started off by translating the above, in as much as possible, into ‘plain words’ while remaining faithful to the n-grams I liked, like ‘bright silence’, ‘old shadows’ and ‘great as prayer’. In order to create images that play off one another, and to account for the dialogue, sentences that seemed to be doing similar things began to cluster together, so paragraphs organically started to shrink. Ultimately, once the ‘purpose’ of what I was doing started to come out, a critique of bourgeois values, memory loss, the nice phrasal units started to become spurious, and the eight or so paragraphs collapsed into the three and a half above. This is also one of my biggest writing issues: I’ll type three full pages and after the editing process they’ll come to no more than 1.5 paragraphs, maybe?

The thematic sense of dislocation and fragmentation could be a product of the source material, but most things I write are about substance-abusing depressives with broken brains cos I’m a twenty-five year old petit-bourgeois male. There’s also a fairly pallid Enright vibe to what I’ve done with the above, I think the staircases line could come straight out of The Portable Virgin.

Maybe a better-trained model could provide better prompts, but overall, if you want better results out of this for any kind of creative praxis, it’s probably better to be a good writer.

100 Great Novels (A Potentially Unending Work in Progress)

Aspiration: 50/50 gender & POC split (currently at a lame and terrible 20% and 0% respectively)

  1. Samuel Beckett — How It Is

Reaching the conclusion that How It Is represents Beckett’s prose writing reaching its most concentrated point of distillation and intensity is somewhat inevitable, seeing as it was his last novel; the longest prose work subsequent to How It Is barely reaches the length of a novella, almost as if the weight of the novelistic tradition, a form known for its expansiveness and maximalism, couldn’t withstand Beckett’s striving towards a more hermetic and taciturn literature.

Having said this, I don’t wish to fetishise How It Is for its impecuniousness alone, for there are plenty of sections in which traditionally pretty descriptive prose appears:

we are on a veranda smothered in verbena the scented sun dapples the red tiles yes I assure you the huge head hatted with birds and flowers is bowed down over my curls the eyes burn with severe love I offer her mine pale upcast to the sky whence cometh our help and which I know perhaps even then with time shall pass away

The ‘yes I assure you’ is demonstrative of How It Is’ overriding push/pull dynamic, in advancing an almost sickly description, almost reminiscent of Keats alongside its subverting narrative commentary. But this doesn’t deaden the effect of the writing, just as setting imagery of abject ugliness and inhumanity amid these lyrical digressions intensifies the effects of both:

as it comes bits and scraps all sorts not so many and to conclude happy end cut thrust DO YOU LOVE ME no or nails armpit and little song to conclude happy end of part two leaving only part three and last the day comes I come to the day Bom comes YOU BOM me Bom ME BOM you Bom we Bom

2. Jorge Luis Borges — Labyrinths

In talking about the short story as one of the more concentrated literary forms, one in which space is at a premium and there can’t be too many words that don’t belong there, I think the work of Jorge Luis Borges is most deserving of mention. No other writer that I’m aware of is capable in under five hundred words of totally challenging the ways in which you think, how you think about how you think, and how you think about how you think about how you think. His capacity to do so through use of a style that is predominantly unadorned and perhaps uninviting makes him all the more fit to be praised.

Since ‘On Exactitude in Science’ is the length of just one paragraph, I’ll present it here:

In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.

The great premium of literary art is its capacity to open up entire worlds with just words on a page. I encourage those who believe world-building to be a preserve of genre fiction only to read Borges.

3. J.M. Coetzee — Waiting for the Barbarians

The allegory, and playing with the conventions around allegory, is a way in which Coetzee’s writing career in its entirety has been characterised by critics, but it might be a line of interpretation advanced too tenuously; it might be more accurate to say that his novels reflect a radical scepticism regarding narrative itself; an unwillingness to confront anything directly. In the Heart of the Country is one of the most deft examples of metafiction I’ve ever come across, and in its refusal to fix its plot around any one sequence of events, we see a narrative force that is as congenial to the forces of its unmaking as its genesis.

Waiting for the Barbarians is more contained than In the Heart of the Country in this sense, but in no other. That it has parallels to South African society under apartheid will surprise no one familiar with the rich literary tradition of that political milieu of the past fifty years, but it has also an uncanny capacity to encompass and seemingly respond to the nature of racially and ethnically based prejudice in general. I was so sure that it was a product of the Bush years that I Googled it to find out whether it was written in 2007 or 2005, only to discover that it was published in 1980. Not to turn my ignorance into a virtue, but I think this speaks to its universality.

Which is not to say that the narrative entire is grounded in geopolitics — in the colonial administrator’s love affair with one of the supposed barbarians, we are permitted to meditate on the unknowability of any love object, and by extension ourselves, how ‘In all of us, deep down, there seems to be something granite and unteachable.’

4. Don DeLillo — Underworld

To write a Great American Novel has, thankfully, become rather passé, after feminist critics drew attention to how unusual it is for a female author to be feted with this title, and after the liberal commentariat’s realisation that they had committed the error of elevating Jonathan Franzen to the role of cultural commentator. Underworld, I would say, is one of the few published in recent years that’s worth reading, for the reason that it is a novel about America that won’t allow real life in.

Underworld is a novel supposedly about baseball, the lost era of old New York, the faux-simplicity of the Cold War, and yet there is nothing ordinary, white bread or milquetoast about the America in this novel; the closest we get to a ‘nuclear’ family is the most distorted and unsettling sections in the text.

It is a novel about subterranean connections and invisible intersections. As you read it, you may find yourself compulsively noticing, drawing analogies, knowing that you’re missing others that only reveal themselves the second time around. This is Underworld’s underworld; more so than many other novels from the time, it is pointing you again and again to what is beyond the page, to what’s beneath the words. You could go mental doing it, wonder why some chapters would be more aptly named with the title that a different chapter has, in what precise order the baseball passes from one character to another, which I suppose is only fitting for a novel in which a baseball is semi-seriously analogous to the famous magic bullet. But for once, I’d encourage any potential reader not to spend their time trying to read past Underworld, not when the prose is this good.

Civilisation did not rise and flourish as men hammered out hunting scenes on bronze gates and whispered philosophy under the stars, with garbage as a noisome offshoot, swept away and forgotten. No, garbage rose first, inciting people to build a civilisation in response, in self-defense.

5. Anne Enright — The Green Road

Enright is one of those few authors that refuses to write the same book twice, and never makes you regret it. Because there is, as publishers well know, a great seductive quality in becoming used to one writing style. Many authors who are too protean simply do not catch on in a crowded marketplace. Well, Enright is interested in, and good at, change. This is how she can move from the hilariously picaresque and surreal The Wig my Father Wore through the tortured monologue of The Gathering to an adept Irish family novel about land, which one could almost call realist, so subtle is the indirect discourse which drives it.

Enright is a deeply intellectual author, but unlike many book-readin’ writers, her ideas exist beneath the surface of the words, just gestured towards, to be decoded on repeated readings. For first readings, just allow the sentences to do their thing. You could read The Green Road all the way through and have no notion of the fact that it’s in conversation with William Shakespeare’s King Lear. You wouldn’t want to, of course, but you could.

It is a novel of many parts. Each of Rosaleen Madigan’s children gets their own section, and so the novel roves from Clare to New York to Mali and back, before they are all assembled for the set piece of the Christmas dinner. I really can’t emphasise enough how well this is done. It is in the novel’s closing sections that the function behind its structure becomes clear: seeing exactly where these people are coming from and their ambivalence regarding their role in the family before their adult lives, then watching those roles slowly overcome them, is great, hilarious and sad. A novel with characters you care about, things to say and great writing is too rare, which makes The Green Road all the more valuable.

6. William Faulkner — As I Lay Dying

7. David Foster Wallace — Infinite Jest

David Foster Wallace might be said to be undergoing his D.H. Lawrence moment, in having his reputation defined for too long by a reading community of dudey-bro-y dudebro brodudes, and y’know, to look at his representations of women, here and in The Pale King, not to mention his opinions, or life, it can be hard to say his books don’t deserve scrutiny. It is slightly disappointing all the same to see him, of all the authors of phallogocentric literary fiction, tarred as such, considering he’s among the most giving of them. Infinite Jest apportions its fun about twenty per cent more generously than your average example of the genre, and reading about Eschaton is about as much fun as you can have with your eyes open.

Its flaws, the sections dealing with the Québécois separatists, the exposition-laden conversations between Hal Incandenza and his older brother Orin, don’t totally come good in the end, but the unavoidable ambivalence one develops when reading a novel of Infinite Jest’s length and ambition is a feature, rather than a bug. As in any important relationship, the challenge is what matters.

So give yourself the chance to read it. It’s more than readable, and far more interesting than Foster Wallace’s persona as it has been construed in the pop-culture landscape since his death; as an icon, he simply cannot compare with the questions that his work throws up.

8. William Gaddis — The Recognitions

William Gaddis’ The Recognitions is a very conflicted novel. It is a profoundly generative work, one which may have given us every maximalist, encyclopaedic 500+ page text in contemporary American letters since, and it is also a profoundly angry text, one which lashes out at everything: organised religion, the commodification of great art, the hyper-mediation of our reality via advertising, the complacently bourgeois creative class, all these and more are targets of Gaddis’ ire.

However, it is also a novel based on profound erudition and cultural awareness. Its most proximate literary cousin is Marcel Proust’s In Search of Lost Time and just as gallantly as Proust does, Gaddis manages to balance many portentous thematic concerns with Being, death and sex, alongside a vibrant social comedy. If I had to guess, I would say about sixty-five percent of it is spent convincing the reader how shallow the hipsters of 1950s New York are.

And of course, the sentences are very powerful:

Undisciplined lights shone through the night instructed by the tireless precision of the squads of traffic lights, turning red to green, green to red, commanding voids with indifferent authority: for the night outside had not changed, with the whole history of night bound up inside it had not become better or worse, fewer lights and it was darker, less motion and it was more empty, more silent, less perturbed, and like the porous figures which continued to move against it, more itself.

It can often be a struggle; Jonathan Franzen tried, and mostly failed, to deal with it (in a public article no less). But the bonus of my edition is a foreword by William H. Gass himself, who provides us with a great key to the work, as well as a get-out clause, should we find it too difficult:

No great book is explicable, and I shall not attempt to explain this one. An explanation…would defile it, for reduction is precisely what a work of art opposes…Interpretation replaces the original with the lamest sort of substitute. It tames, disarms.

9. William H. Gass — The Tunnel

10. James Joyce — Ulysses

I was once challenged to sum up a novel’s plot in six words, and for Ulysses, my attempt was ‘2 sad men meet. a woman thinks.’ This is a perfect example of how, when it comes to summing up Ulysses, it’s hard to know where to begin. Humour, bathos, beauty, poetry, history, love, death, family, sex, great writing, it has everything you could ever want.

I won’t contest that it’s a grower, and if you come to it fresh (‘fresh’ in this case meaning, having read Dubliners and A Portrait of the Artist as a Young Man, which will be necessary), expect to find yourself moving your eyes over large tracts of text without quite knowing exactly what’s happening. Reading aloud helps.

For those who may be used to more genre fare, there are sections for you too; there’s an episode written in the manner of a nineteenth-century romance novel. And while the line attributed to Joyce about enigmas codified into the text in sufficient quantities to keep the professors busy for hundreds of years is definitely apocryphal, what it tells us about the novel is definitely true: the novel is so dense with allusion, red herrings and unresolved questions that you’ll find yourself in the role of a sort of detective. This is not a wholly inappropriate tack to take with Ulysses, since Joyce designed his one day in Dublin with meticulous attention to detail; his notes on how long it takes to walk down particular stretches of urban walkways, or on the businesses Bloom encounters in his perambulations, were all derived from sources and from correspondence with people Joyce contacted in Dublin. A staggering work; everyone should make time for it.

11. Ben Marcus — The Flame Alphabet

12. Flann O’Brien — The Third Policeman

13. Marcel Proust — In Search of Lost Time

The term ‘baggy monster’, so often applied to the novel, is a rather ingenious one, as it captures a central ambivalence regarding the form in relation to itself. Both terms can be read negatively, in fact, they are perhaps more on the negative end of the spectrum than not, but taken together there’s something alluring about it, particularly when you have come to know, over the course of reading many of them, how successful a novel can be in reaching for exactly the kind of excess that ‘good taste’ might seem to advise against. Well there’s plenty baggy and monstrous in Proust’s seven volume work In Search of Lost Time, but, as much as it could be said to be in need of an editor, its vices are perhaps indissociable from its virtues.

And this is itself a virtue. What other work of fiction can be so assuming as to impose itself on you for 1,267,069 words? Well, it isn’t for no reason, and a close reading of fin-de-siècle French bourgeois culture next to the metaphysician Bergson is more than worth the time you’d spend on it. Yes, it is occasionally tedious, and seemingly repetitive, but you’re unlikely to come away from Proust without recognising yourself in at least a few of the characters, or without coming to some disturbing conclusions regarding the way you live your life. Write down your definitions of habit, love and time before getting into these novels. It’s unlikely they’ll remain intact on your journey through these texts.

But don’t come to it with a pious reverence. James Grieve, a translator of À l’ombre des jeunes filles en fleurs, writes in his introduction to the second volume that

Proust’s reflections, his enunciation of philosophical and psychological truths…are often of more importance to him than his verisimilitudes. His composition was often not linear; he wrote in bits and pieces; transitions from one scene to another are sometimes awkward, clumsy even…His paragraphing often seems idiosyncratic.

Far from being a virtuoso of words, or a fluent weaver of imaginative reality, Proust is in many ways inept, or amateurish, and it is in this way that we should appreciate him; the idiosyncrasies are what make In Search of Lost Time such a brilliantly bizarre novel.

14. Thomas Pynchon — Gravity’s Rainbow

15. J.D. Salinger — The Catcher in the Rye

Yes, I know, I should definitely have grown out of thinking this novel is great. Well, every time I’ve gotten back to it, convinced that this time, this time, I’ll realise that I am an adult, and that Holden Caulfield is an annoying idiot, and The Catcher in the Rye is a novel for teenagers, well, it doesn’t happen, and I could read a hundred novels with him just going about his business, being judgemental and obnoxious inside his own head forever and ever. My liking him is somewhat beside the point, and perhaps proves my immaturity, so I’ll try to deal with why his critics are wrong: they seem to miss the rather big reveal at the end that Holden’s been institutionalised, and the oscillation between two different periods of time in his narrative, a representation of his thoughts in the moment and his recollection, attests further to his divided state of mind. It’s a bit odd to hear literary critics condemn him so roundly when his curmudgeonly attitude surely doesn’t lack for a cause.

It’s a great testament to Salinger’s skill as a writer that the surface level of the text, a brash, abusive narrator, can seem so available, that going any deeper into it would seem wrongheaded, but I think he, like all unreliable narrators, provides you with a clue up front. The novel begins, after all, with an act of self-censorship, an invocation to silence, as Holden refuses to provide a holistic appraisal of his self or his place in the world, something that he dismisses as “all that David Copperfield kind of crap.”

16. Will Self — How The Dead Live

17. William Shakespeare — King Lear

18. Virginia Woolf — To The Lighthouse

Will Self’s Umbrella and post-modern modernity


As has been repeated in any number of the literary outlets which give Will Self column inches, Self has thumbed his nose at the British literary establishment, readers and writers alike, by returning to the ground zero of avant-garde prose writing in his trilogy of Umbrella, Shark and the forthcoming Phone. I held off reading Umbrella for some time, for the same reason that one generally doesn’t read a novel written by one of the authors that one might rate highly, sensing in advance that it will be in some way a disappointment, particularly when said author has set themselves the task of re-invigorating a dormant genre in which one is steeped, on a semi-professional basis.

But I did listen to, and read, an awful lot of interviews in which Self spoke on why he’s returning to modernism as a wellspring for his own fiction. In one of these interviews, which unfortunately, I can’t seem to find, Self says that one of the things he was trying to avoid, was writing a post-modern version of modernity. At the time I heard it, I had no idea what that might mean, or what a post-modern modernity might look like. After having read Umbrella, whether Self intended it or not, I have a far better understanding of the phrase, because I think that a post-modern modernity is exactly what Self has stumbled upon in Umbrella.

The plot moves between roughly three time frames, centred around four individuals. The primary one is Zack Busner, a fixture in many of Self’s works, who generally functions as a composite of the author and the late neurologist Oliver Sacks. In Umbrella, Busner is a psychiatrist based in London, treating Audrey Death for her encephalitis lethargica, which has left her in a catatonic state for decades. In some parts of the novel, Busner is doing so in 1970, and in other parts, he looks back on the affair in 2010. While this is happening, the narrative will jump back to Audrey’s early adulthood in the opening decades of the century, working in a munitions factory, getting involved in radical socialist circles. Her brothers, Stanley and Albert, are also focalisers of the narrative at points, albeit in very different ways. Indirect discourse and interior monologue are probably the two best known characteristics of modernist prose, and these two take the lion’s share of the novel’s foray into experimentation, allowing for the characters’ voices to blend suggestively with the narrator’s, making it difficult to tell where Audrey, Busner, Albert and Stanley are speaking amidst the barrage of music-hall pieces, street rhymes and song lyrics. Side Note: Azealia Banks and The Kinks feature. Unfortunately, Self generally signals these shifts through use of italics. Here’s a typical example:

The boyfriend hadn’t minded gotta split, man and Busner was split…a forked thing digging its way inside her robe. She fiddled with bone buttons at her velvety throat. His skin and hairs snagged on the mirrors, his fingers did their best with her nipples. She looked down on me from below … one his calves lay cold on the floorboards. There was the faint applause of pigeons from outside the window —

Italics are used here to allow us access to Busner’s mind, his memory, and for Lear references. There’s nothing bad in here (or in the novel overall; Self’s sentences are staggering for how rhythmically attuned they are, particularly when he dallies with academic verbiage and sub-clauses to the extent that he does), the problem is you sort of know where these turns are coming from the typography. There was a ‘Remastered’ version of Ulysses published about six years ago, produced by Robert Gogan, in which the interior monologue appeared in italics. The three or four people in the world who care about such things were outraged at the simplification, seeing the text as having been purged of its ambiguity. I think this periodic italicisation is to Umbrella’s detriment overall; it substitutes a reading that might have demanded even more of you for a more surreal-looking typeface.

My own notion of Umbrella’s modernism would therefore be rather distinct from the identification made between Umbrella and this rather inflexible and monolithic modernism made in some literary journalism, because I don’t see it as modernist in the same way that the ‘men of 1914’ are modernists. Although they might have one thing in common.


Self’s modernism is a selling point serving a rather specific function in today’s literary marketplace. Self’s modernism builds upon his persona as a surly performer on television news-panel shows and in newspaper columns, going out of his way to discourage people from reading his books by his performative hauteur and dismissive attitude regarding everything. Returning to a praxis of literary art some six decades out of date is the logical conclusion of being Will Self. For Self, being a latter day modernist is to reject the commodification of the literary artwork, and insist upon the right of the author to write something wholly non-commercial. Umbrella therefore carries with it a critique of commodity culture, and the proliferation of screens, which Self also decries regularly, believing it to signal an end to the novel. However, the canard of modernism’s opposition to commodity culture has been overhyped after postmodern novelists made such a point of engaging with the novel as a commodity, and one should remember that modernism was deeply involved in the marketplace of its time; Ezra Pound began using zeitgeist-y words like ‘modern’ and ‘futurity’ to draw Marinetti’s audiences, who were substantially larger than his own when he first came to London. Performative modernism, cultivated for the purchasing attentions of a well-groomed and discerning élite, is one of the things that Self gets right regarding his channeling of the genre.

Umbrella also seems to draw on modernism’s sometimes overlooked heritage, as it is at least somewhat to blame for the volume of secondary literature written subsequent to its boom and bust. From even a vague knowledge of these texts we might produce some foundational aspects of modernism; that it is taken to entail a shift in consciousness and human subjectivity, that exposure to slaughter and death on an industrial scale led to an ambivalence regarding technology and a sundering of rigid social hierarchies, an increasing mediation of our reality through mass media, growth of radical political movements such as feminism and socialism, etc. etc. etc. Our responses to these texts are thereby pre-determined; we know what we can expect from a canonical modernist text.

Which is why the modernism of Umbrella seems post-modern. It’s hard not to read Audrey’s re-animation in the 1970s, or Busner’s recollection of the time in 2010, as a meta-commentary on Umbrella’s resuscitation of the genre. The fact that Audrey worked in a munitions factory, as a radical socialist and feminist, that one of her brothers, Stanley, went to fight in the war, while her other brother, Albert, Pynchon-like, became an arms manufacturer selling weapons which fuelled the conflict, that in her comatose state she rehearses the actions of her time at the lathe, seems to have been dictated by our relationship to modernism in our contemporary setting. In the novel’s closing stages, Audrey’s status as a symbol of technology’s encroachment into our subjectivity is made overt:

The final words Audrey Death had spoken before relapsing into a merciful swoon were a string of nonsensical fractions — eighteen over four-point-two, ninety-four over fourteen-point-seven, sixty-six-point-three over thirty-three…that, even as he accepted the futility of the exercise, Busner had tried to fit into some conceptual framework. Were they, perhaps, the numerical analogue of her brain-chemistry’s intro-conversions between the discrete and the continuous, the quantifiable and the relativistic?

The irony here is that the paragraph in which Self is telling you exactly what the novel is about, features a character attempting to make sense of a random string of numbers. This is far from what the book is, a novel which has been compulsively over-determined in any number of columns, interviews and lectures which, taken collectively, probably come to a length equal to the text. While the modernists can be considered guilty of pushing particular interpretations — they often wrote about their own work, in the way that authors often do, by pretending to write objectively on other authors, The Waste Land came with annotations (parodic ones, but annotations nonetheless) — it feels as though Self’s foray into it is too overtly packaged as such. It’s probably my own fault for consuming it as I did, a book has to be sold after all, and no one made me read those six Guardian interviews. I should wrap up by saying that this novel is very good, and that you should read it, and, in true modernist style, ‘the rest is noise’.

A (Proper) Statistical analysis of the prose works of Samuel Beckett


Content warning: If you want to get to the fun parts, the results of an analysis of Beckett’s use of language, skip to sections VII and VIII. Everything before that is navel-gazing methodology stuff.

If you want to know how I carried out my analysis, and utilise my code for your own purposes, here’s a link to my R code on my blog, with step-by-step instructions, because not enough places on the internet include that.

I: Things Wrong with my Dissertation’s Methodology

For my masters, I wrote a 20,000-word dissertation, which took as its subject an empirical analysis of the works of Samuel Beckett. I had a corpus of his entire works with the exception of his first novel Dream of Fair to Middling Women, which is a forgivable lapse, because he ended up cannibalising it for his collection of short stories, More Pricks than Kicks.

Quantitative literary analysis is generally carried out in one of two ways, through either one of the open-source programming languages Python or R. The former you’re more likely to have heard of, being one of the few languages designed with usability in mind. The latter, R, would be more familiar to specialists, or people who work in the social sciences, as it is more abstruse than Python, doesn’t have many language cousins and has a very unfriendly learning curve. But I am attracted to difficulty, so I am using it for my PhD analysis.

I had about four months to carry out my analysis, so the idea of taking on a programming language in a self-directed learning environment was not feasible, particularly since I wanted to make a good go at the extensive body of secondary literature written on Beckett. I therefore made use of a corpus analysis tool called Voyant. This was a couple of years ago, so this was before its beta release, when it got all tricked out with some qualitative tools and a shiny new interface, which would have been helpful. Ah well. It can be run out of any browser, if you feel like giving it a look.

My analysis was also chronological, in that it looked at changes in Beckett’s use of language over time, with a view to proving the hypothesis that he used a narrower vocabulary as his career continued, in pursuit of his famed aesthetic of nothingness or deprivation. As I wanted to chart developments in his prose over time, I dated the composition of each text, and built a corpus for each year, from 1930–1987, excluding, of course, years in which he just wrote drama or poetry, which wouldn’t be helpful to quantify in conjunction with one another. Which didn’t stop me doing so for my masters analysis. It was a disaster.

II: Uniqueness

Uniqueness, the measurement used to quantify the general spread of Beckett’s vocabulary, was obtained by the generally accepted formula below:

unique word tokens / total words

There is a problem with this measurement, in that it takes no account of a text’s relative length. As a text gets longer, the likelihood that any given word has already been used approaches 1, so each additional word is less and less likely to be a new one. Therefore, a text gets less unique as it gets bigger. I have the correlations to prove it:

[Figure: screenshot of the correlations between text length and uniqueness]
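For anyone who wants to see the ratio and its length problem in action, here is a minimal Python sketch. It is not the R script linked above; the function names and the crude regex tokenizer are assumptions made purely for illustration, and the sample string is the tail of the How It Is passage quoted earlier.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens.
    A deliberately crude tokenizer (assumption): it splits on anything
    that is not a letter, digit or underscore, so "don't" becomes two tokens."""
    return re.findall(r"\w+", text.lower())

def uniqueness(text):
    """Unique word tokens divided by total words, as in the formula above."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

sample = "the day comes I come to the day Bom comes"

# 7 distinct words out of 10
print(uniqueness(sample))                 # 0.7

# Doubling the text adds words but no new vocabulary, so the score halves.
print(uniqueness(sample + " " + sample))  # 0.35
```

The second call is the whole problem in miniature: nothing about the vocabulary has changed, only the length, and the score has dropped by half.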

There have been various solutions proposed to this quandary, which somewhat stymies our comparative analyses. One among them is the use of vectorised measurements, which plot the text’s declining uniqueness against its word count, so we see a more impressionistic graph, such as this one, which should allow us to compare James Joyce’s novel A Portrait of the Artist as a Young Man and his short story collection, Dubliners, across their word counts.

[Figure: graph of declining uniqueness against word count for A Portrait of the Artist as a Young Man and Dubliners]

All well and good for two or maybe even five texts, but one can see how, with large scale corpora, this sort of thing can get very incoherent very quickly. Furthermore, if one were to examine the numbers on the y-axis, one can see that the differences here are tiny. This is another idiosyncrasy of stylostatistical methods; because of the way syntax works, the margins of difference wouldn’t be regarded as significant by most statisticians. These issues relating to the measurement are exacerbated by the fact that ‘particles,’ the atomic structures of literary speech (it, is, the, a, an, and, said, etc.), make up most of a text. In pursuit of greater statistical significance for their papers, digital literary critics remove these particles from their texts, which is another unforgivable thing that we do anyway. I did not, because I was concerned that I was complicit in the neoliberalisation of higher education. I also wrote a 4000 word chapter that outlined why what I was doing was awful.

IV: Ambiguity

Ambiguity was arrived at by the following formula:

number of indefinite pronouns/total word count

I derived this measurement from Dr. Ian Lancashire’s study of the works of Agatha Christie, and counted Beckett’s use of a set of indefinite pronouns, ‘everyone,’ ‘everybody,’ ‘everywhere,’ ‘everything,’ ‘someone,’ ‘somebody,’ ‘somewhere,’ ‘something,’ ‘anyone,’ ‘anybody,’ ‘anywhere,’ ‘anything,’ ‘no one,’ ‘nobody,’ ‘nowhere,’ and ‘nothing.’ Those of you who know that there are more indefinite pronouns than just these, you are correct; I had found an incomplete list of indefinite pronouns, and I assumed that that was all. This is just one of the many things wrong with my study. My theory was that there were correlations to be detected between Beckett’s decreasing vocabulary and his increasing deployment of indefinite pronouns, relative to the total word count. I called the vocabulary measure ‘uniqueness,’ and the indefinite pronouns measure I called ‘ambiguity.’ This is tenuous, I know; indefinite pronouns advance information as they elide the provision of information. It is, like so much else in the quantitative analysis of literature, totally unforgivable, yet we do it anyway.
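The ambiguity measure can be sketched in Python along the same lines. This is an illustration under stated assumptions, not the original analysis: the word list is the (admittedly incomplete) one given above, the tokenizer is a crude regex, and ‘no one’ is counted as a single hit even though it adds two words to the total.

```python
import re

# The sixteen forms listed above; "no one" is handled separately
# because it spans two tokens.
INDEFINITE_PRONOUNS = {
    "everyone", "everybody", "everywhere", "everything",
    "someone", "somebody", "somewhere", "something",
    "anyone", "anybody", "anywhere", "anything",
    "nobody", "nowhere", "nothing",
}

def ambiguity(text):
    """Number of indefinite pronouns divided by total word count."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in INDEFINITE_PRONOUNS)
    # Count each adjacent pair ("no", "one") as one occurrence (assumption).
    hits += sum(1 for a, b in zip(tokens, tokens[1:]) if (a, b) == ("no", "one"))
    return hits / len(tokens)

# An invented test sentence, not a quotation: 3 hits in 6 words.
print(ambiguity("someone said nothing to no one"))  # 0.5
```

Extending the measure to the pronouns the list misses would only mean adding them to the set.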

V: Hapax Richness

I initially wanted to take into account another phenomenon known as the hapax score, which charts occurrences of words that appear only once in a text or corpus. The formula to obtain it would be the following:

number of words that appear once/total word count

I believe that the hapax count would be of significance to a Beckett analysis because of the points at which his normally incompetent narrators have sudden bursts of loquaciousness, like when Molloy says something like ‘digital emunction and the peripatetic piss,’ before lapsing back into his ‘normal’ tone of voice. Once again, because I was often working with a pen and paper, this became impossible, but now that I know how to code, I plan to go over my master’s analysis, and do it properly. The hapax score will form a part of this new analysis.
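The hapax formula is equally short in code. What follows is a minimal sketch rather than the planned analysis itself: the function name `hapax_score` and the tokenising rule are assumptions made for illustration.

```python
import re
from collections import Counter


def hapax_score(text):
    """Number of words that appear exactly once divided by total word count."""
    # Assumed tokenising rule: lowercase, runs of letters, internal apostrophes kept.
    tokens = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / len(tokens)


if __name__ == "__main__":
    print(hapax_score("Somewhere again. Somehow again. Someone again."))
```

In the sample sentence ‘again’ appears three times and the other three words once each, so three of the six tokens are hapaxes.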

VI: Code & Software

A much more accurate way of analysing vocabulary, for the purposes of comparative analysis when your texts are of different lengths, would therefore be to randomly sample each text. This is obviously not very easy when you’re working with a corpus analysis tool online, but it is far more straightforward when working through a programming language. A formula for representative sampling was found, and integrated into the code. My script is essentially a series of nested loops and if/else statements that randomly and sequentially sample a text, calculate the uniqueness, indefiniteness and hapax density ten times, store the results in a variable, and then calculate the mean value for each by dividing the result by ten, the number of times that the first loop runs. I inputted each value into the statistical analysis program SPSS, because it makes pretty graphs with less effort than R requires.
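The sampling loop described above might look something like the following in Python. It is a sketch under several assumptions, not the original script: the representative-sampling formula is not given here, so the sample size is simply passed in as a parameter; ‘randomly and sequentially’ is read as a random starting point followed by a run of consecutive words; and ‘uniqueness’ is taken to be distinct words divided by total words. All function and variable names are hypothetical.

```python
import random
import re
from collections import Counter

RUNS = 10  # the number of samples drawn, as in the description above

SINGLE_WORD_PRONOUNS = frozenset({
    "everyone", "everybody", "everywhere", "everything",
    "someone", "somebody", "somewhere", "something",
    "anyone", "anybody", "anywhere", "anything",
    "nobody", "nowhere", "nothing",
})


def tokenize(text):
    """Lowercase the text and split it into word tokens (an assumed rule)."""
    return re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())


def metrics(tokens):
    """Return the three scores for one non-empty sample of tokens."""
    total = len(tokens)
    counts = Counter(tokens)
    pronouns = sum(counts[p] for p in SINGLE_WORD_PRONOUNS)
    pronouns += sum(1 for a, b in zip(tokens, tokens[1:]) if (a, b) == ("no", "one"))
    return {
        "uniqueness": len(counts) / total,  # assumed: distinct words / total words
        "hapax_density": sum(1 for c in counts.values() if c == 1) / total,
        "ambiguity": pronouns / total,
    }


def sampled_means(text, sample_size, runs=RUNS, seed=None):
    """Average each score over `runs` random windows of consecutive words."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no words")
    size = min(sample_size, len(tokens))  # short texts are used whole
    rng = random.Random(seed)
    totals = {"uniqueness": 0.0, "hapax_density": 0.0, "ambiguity": 0.0}
    for _ in range(runs):
        start = rng.randint(0, len(tokens) - size)  # random start, inclusive bounds
        window = tokens[start:start + size]         # sequential words from there
        for name, value in metrics(window).items():
            totals[name] += value
    return {name: total / runs for name, total in totals.items()}


if __name__ == "__main__":
    sample = "Somewhere again. Somehow again. Someone again."
    print(sampled_means(sample, sample_size=4, seed=0))
```

Passing a `seed` makes a run repeatable, which is useful if the means are later typed into SPSS by hand and need to be checked.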

VII: Results

I used SPSS’ box plot function first to identify any outliers for uniqueness, hapax density and ambiguity. 1981 was the only year which scored particularly high for relative usage of indefinite pronouns.

[Figure: box plots for uniqueness, hapax density and ambiguity]

It should be said that this measure, too, is correlated with the length of the text, which only stands to reason; as a text gets longer the relative incidence of a particular set of words will decrease. Therefore, as the only texts Beckett wrote that year, ‘The Way’ and ‘Ceiling,’ together add up to about 582 words (the fifth lowest year for prose output in his life), one would expect indefiniteness to be somewhat higher in comparison to other years. However, this doesn’t wholly account for its status as an outlier value. Towards the end of his life Beckett wrote increasingly short prose pieces. Comment C’est (How It Is) was his last novel, and was written almost thirty years before he died. This probably has a lot to do with his concentration on writing and directing his plays, but in his letters he attributed it to a failure to progress beyond the third novel in his so-called trilogy of Molloy, Malone meurt (Malone Dies) and L’innommable (The Unnamable). It is in the year 1950, the year in which L’inno was completed, that Beckett began writing the Textes pour rien (Texts for Nothing), scrappy, disjointed pieces, many of which seem to be taking up from where L’inno left off, as do the Fizzles and the Faux Départs. ‘The Way,’ I think, is an outgrowth of a later phase in Beckett’s prose writing, which dispenses with the peripatetic loquaciousness and the understated lyricism of the trilogy and replaces them with a more brute and staccato syntax, one which is often dependent on the repetition of monosyllables:

No knowledge of where gone from. Nor of how. Nor of whom. None of whence come to. Partly to. Nor of how. Nor of whom. None of anything. Save dimly of having come to. Partly to. With dread of being again. Partly again. Somewhere again. Somehow again. Someone again.

Note also the prevalence of particle words, which will have been stripped out for the analysis, and the ways in which words with a ‘some’ prefix are repeated as a sort of refrain. This essential structure persists in the work, or at least in the artefact of the work that the code produces, and hence makes of it the outlier that it is.

[Figure: uniqueness, hapax density and ambiguity plotted together]

From plotting all the values together at once, we can see that uniqueness is partially dependent on hapax density; the words that appear only once in a particular corpus would be important in driving up the score for uniqueness. While there could be said to be a case for the hypothesis that Beckett’s texts get less unique and more ambiguous up until 1944, when he completed his novel Watt, and, if we’re feeling particularly risky, up until 1960, when Comment C’est was completed, it would be wholly disingenuous to advance it beyond this point, when his style becomes far too erratic to categorise definitively. Comment C’est is Beckett’s most uncompromising prose work. It has no punctuation, no capitalisation, and narrates the story of two characters, in a kind of love, who communicate with one another by banging kitchen implements off one another:

as it comes bits and scraps all sorts not so many and to conclude happy end cut thrust DO YOU LOVE ME no or nails armpit and little song to conclude happy end of part two leaving only part three and last the day comes I come to the day Bom comes YOU BOM me Bom ME BOM you Bom we Bom

VIII: Conclusion

I would love to say that the general tone is what my model is being attentive to, which is why it identified Watt and How It Is as nadirs in Beckett’s career, but I think their presence on the chart is more a product of their relative length, as novels, versus the shorter pieces which he moved towards in his later career. Clearly, Beckett’s decision to write shorter texts makes this means of summing up his oeuvre in general insufficient. Whatever changes Beckett made to his aesthetic over time, we might not need such complicated graphs to map them, and I could have just used a word processor to find the one that matters: length. Bom and Pim aside, for whatever reason, after having written L’inno none of Beckett’s creatures presented themselves to him in novelistic form again. The partiality of vision and modal tone which pervades the post-L’inno works demonstrates, I think, far more effectively what it was that Beckett was ‘pitching’ for: a new conceptual aspect to his prose, which re-emphasised its bibliographic aspects, the most fundamental of which was brevity, or the appearance of incompleteness, by virtue of being honed to sometimes less than five hundred words.

The quantification of differing categories of words seems like the most radical, and the most fun, thing to attempt in the analysis of literary texts, as the words are what we came for, but the problem is similar to the one that overtakes one who attempts to read a literary text word by word by word, and unpack its significance as one goes: overdetermination. Words are kaleidoscopic, and the longer you look at them, the more threatening their darkbloom becomes, the more they swallow, excrete, the more alive they are, all round. Which is fine. Letting new things into your life is what it should be about, until their attendant drawbacks become clear, and you start to become ambivalent about all the fat and living things you have in your head. You start to wish you read poems instead, rather than novels, which make you go mad, and worse, start to write them. The point is words breed words, and their connections are too easily traced by computer. There’s something else about knowing their exact correlations to a decimal point. They seem so obvious now.