Walkthrough of Textual Analysis Through R

The attached PDF (used because WordPress doesn’t support .txt files, let alone .R ones) contains a script of 100-some lines of R, written in RStudio, that I’ve used for some basic forays into textual analysis.

The script quantifies three textual phenomena: richness of vocabulary, density of indefinite pronouns, and density of hapax legomena (words that appear only once). All three measures are computed over sequential samples of words, and the size of each sample is based on the size of the text that is ‘fed’ into the code.
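For a rough sense of what these three measures look like in R, here is a minimal sketch. It is not the attached script: the function name `measure_chunks`, the short `indefinite_pronouns` list, and the choice to pass `chunk_size` in by hand are all assumptions made for illustration (the script derives its sample size from the length of the text, by a rule not reproduced here). The tokenizer is also deliberately crude and only handles unaccented a-z letters and straight apostrophes.

```r
# Illustrative (assumed) list of indefinite pronouns; not the list used in the script.
indefinite_pronouns <- c("all", "another", "any", "anybody", "anyone", "anything",
                         "each", "everybody", "everyone", "everything", "few",
                         "many", "nobody", "none", "nothing", "one", "several",
                         "some", "somebody", "someone", "something")

# Hypothetical helper: split a text into sequential chunks of `chunk_size` words
# and compute the three measures for each chunk.
measure_chunks <- function(text, chunk_size) {
  # Lower-case, then split on anything that is not a letter or an apostrophe
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words <- words[words != ""]

  # Keep only whole chunks; a trailing partial chunk is dropped
  n_chunks <- length(words) %/% chunk_size
  stopifnot(n_chunks >= 1)
  words <- words[seq_len(n_chunks * chunk_size)]
  chunks <- split(words, rep(seq_len(n_chunks), each = chunk_size))

  one_chunk <- function(w) {
    freqs <- table(w)  # frequency of each distinct word in this chunk
    c(richness           = length(freqs) / length(w),                    # distinct words / all words
      indefinite_density = sum(w %in% indefinite_pronouns) / length(w),  # indefinite pronouns / all words
      hapax_density      = sum(freqs == 1) / length(w))                  # words used once / all words
  }

  # One row per chunk, one column per measure
  as.data.frame(t(sapply(chunks, one_chunk)))
}

# Example call on a toy sentence, two chunks of six words each
sample_text <- "the cat sat on the mat and nobody saw anything at all"
measure_chunks(sample_text, chunk_size = 6)
```

Each row of the result is one sequential sample, so the three columns can be plotted or compared across the length of a text.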

The measures are essentially useless, because all of the variables are contingent on one another. If uniqueness goes up, indefinite pronoun density has to go down and hapax density goes up, though not to the same extent that indefinite pronoun density decreases. Since these last two are arbitrary groupings of words, their increases would of course be to uniqueness’ detriment. Mostly I just needed to get some code up and running for a statistics project.

Comments are included to give a sense of what each line is doing, because not enough people using R for literary analysis do that.

Thanks are due to Matthew L. Jockers for his book, Text Analysis with R for Students of Literature, which I found literally indispensable.







8 responses to “Walkthrough of Textual Analysis Through R”

  1. Are you sure this isn’t a Jackson MacLow poem?

  2. This post is neither analog nor humanist.

  3. Hey, I’m a philosophy/English double-major who writes SDK documentation for a living, I won’t cast any stones. : )

  4. Well, is not one of the roots of Humanism a tradition of Biblical exegesis not a million miles removed from what this script does?

  5. Pingback: A (Proper) Statistical analysis of the prose works of Samuel Beckett | Analogue Humanist/Chris Beausang
