Literary Fun With Text Mining


My wife is doing her PhD in political science on how political interest groups use social media to disseminate information and reach new audiences, and how they use this new(ish, wow we’re old) medium to affect voting behaviour. Part of this has meant learning how to mine Twitter data and analyze it in the R programming language; to provide technical support and give her someone to troubleshoot coding issues with, I’ve also been learning to use R to mine and analyze texts. To learn the language and the processes, I’ve been concentrating on mining and visualizing data gathered from fictional texts, specifically the bibliography of Stephen King. What I want to do is analyze plot trajectories drawn from sentiment data – quantitative measures of emotionally charged words, scored against established sentiment dictionaries. Research questions would include things like: does King follow a pattern in his plots, based on emotional language cues? Is this pattern, if any, different from those of other well-known horror writers? Furthermore, are there established “archetypal” emotional plot patterns for horror books, and do these patterns differ when you switch genres – say, to fantasy, military science fiction, paranormal romance, and so on down the fracture lines of human experience?

So to start I’ll be going book by book through the King bibliography and presenting what are basically preliminary findings based on the sentiment dictionaries included in the quanteda package for R: AFINN, Bing et al., and the NRC emotional sentiment dictionary. Ultimately none of these will be ideal; a custom dictionary for emotional sentiment specifically in literature would be necessary to capture a more accurate picture, but this is where linguistics comes in and I don’t have much formal training in that area. My on-paper expertise is in English literature and political science, and while Stuart Soroka’s Lexicoder program is what I’ll likely use to build and code the custom dictionary, the Lexicoder topic dictionaries that exist to date are meant to examine political speech rather than literary texts. Building my own will take a lot of time and research.
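For anyone wondering what dictionary-based scoring actually amounts to, here is a minimal sketch, written in Python rather than R purely for illustration. The tiny lexicon is made up: it imitates the AFINN style of mapping a word to an integer score, but the words and values are placeholders, not entries from any real dictionary. The function names are likewise hypothetical.

```python
import re

# Hypothetical toy lexicon in the AFINN style (word -> integer score).
# These entries are placeholders, not values from a real dictionary.
TOY_LEXICON = {
    "love": 3, "happy": 3, "calm": 2,
    "fear": -2, "scream": -2, "blood": -2, "dead": -3,
}

def tokenize(text):
    """Lowercase the text and pull out runs of letters/apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())

def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon score of every word; words not in the lexicon count as 0."""
    return sum(lexicon.get(word, 0) for word in tokenize(text))

def chapter_scores(chapters, lexicon=TOY_LEXICON):
    """Score each chapter in order, giving one point per chapter for a plot line."""
    return [sentiment_score(chapter, lexicon) for chapter in chapters]

# Three stand-in "chapters" to show the shape of the output.
chapters = [
    "A calm and happy morning.",
    "She heard a scream and felt fear.",
    "Blood on the floor, and everyone dead.",
]
print(chapter_scores(chapters))  # [5, -4, -5]
```

Plotting those per-chapter totals against chapter number gives the kind of trajectory described above. A summed score like this favours long chapters, so dividing by word count is one obvious refinement.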

This should be fairly quick work up until about 1989 or so – The Dark Half, at any rate. I happen to think it’s the absolute nadir of his oeuvre, a self-indulgent author-insert story used to deal with the professional regret of having his pseudonym outed and failing at his dream of being Donald Westlake/Richard Stark. From Carrie to The Tommyknockers, though, I’m familiar enough with the books that I can look at specific chapters pointed out on the graphs and quickly grasp what they mean in the story as a whole. After that…I’m going to end up having to read a number of King books I haven’t actually read yet, like Dolores Claiborne, The Girl Who Loved Tom Gordon, and any of the newer crime trilogy books he’s written. I mean, I guess it’s as good an excuse as any, right? For those, I plan on running the data through the process first and reading afterwards, to see what kind of predictive power is embedded in the visualized data.

There is minimal pre-processing being done to the texts. They are epubs that are converted to .txt files through calibre. They are then trimmed to get rid of all the extraneous matter – the list of other books, the reviews of other books, the acknowledgements, the endless introductions, and in some cases the other stories tacked on after the main story is finished. An example of this is the inclusion of “One For The Road” and “Jerusalem’s Lot” at the end of “Salem’s Lot”. Both stories are included in short fiction collections, so cutting them isn’t much of a concern. I also go through the texts to make the chapter breaks reasonably consistent; this is necessary because chapter breaks are how I track plot progress as the x variable. Carrie, for example, has no chapter breaks; breaks were inserted at the beginning of each epistolary passage, since those mark natural breaks in the story. Salem’s Lot, meanwhile, has sixteen chapters, but each of those chapters has multiple sub-chapters within it; these were all used as breaks, after converting each one to a “Chapter (n)” format. Rage and The Stand, the two other books I have processed to date, have luckily been blessed with a more normal chapter break format. One other that I know off the top of my head will require more pre-processing is The Running Man, since it has that weird “(n) And Counting” chapter heading.
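As a rough sketch of that heading clean-up, again in Python for illustration rather than R, the snippet below rewrites countdown-style headings into a uniform “Chapter n” form and then splits the text on them. The heading pattern is an assumption about what the converted .txt looks like (a line such as “Minus 100 and Counting” or “99 And Counting”), and the function names are hypothetical; a real file may need a different regular expression.

```python
import re

# Assumed heading shape: a whole line like "Minus 100 and Counting" or
# "99 And Counting", optionally wrapped in punctuation. Adjust to the file.
COUNTING_HEADING = re.compile(
    r"^\W*(?:minus\s+)?\d+\s+and\s+counting\W*$", re.IGNORECASE
)

def normalize_headings(text, heading_pattern=COUNTING_HEADING):
    """Replace each matching heading line with 'Chapter n', numbered in order."""
    out, n = [], 0
    for line in text.splitlines():
        if heading_pattern.match(line):
            n += 1
            out.append(f"Chapter {n}")
        else:
            out.append(line)
    return "\n".join(out)

def split_chapters(text):
    """Split normalized text into chapter bodies, dropping front matter."""
    parts = re.split(r"^Chapter \d+$", text, flags=re.MULTILINE)
    return [part.strip() for part in parts[1:]]

sample = (
    "Front matter\n"
    "Minus 100 and Counting\n"
    "First part.\n"
    "99 And Counting\n"
    "Second part."
)
normalized = normalize_headings(sample)
print(split_chapters(normalized))  # ['First part.', 'Second part.']
```

Numbering by order of appearance, rather than reusing the number in the original heading, keeps the x variable running forward even when the book counts down.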

Some explanation of the text mining process and a number of glowing recommendations of the work of Julia Silge will follow, along with the data visualization of Carrie.


Uncle Acid & The Deadbeats – The Night Creeper


Uncle Acid & The Deadbeats are as The Sword were – a Sabbath-worshipping doom metal band with a serious case of the groove.  Where The Sword have moved on to allowing other Seventies hard rock icons into their sound, Uncle Acid have chosen to stay the course.  If you caught Mind Control two years ago, The Night Creeper will seem instantly familiar, and this can be good or bad depending on your thoughts on recycled Sabbath riffs and rotted psychedelic atmosphere.  Even if you’re getting tired of bands reappropriating classic rock, however, there’s a lot to like about The Night Creeper.  This is serious head-nodder music; the band can mine a groove like very few other bands, and they can manipulate the flow like masters.  When “Inside” starts bouncing after the hazy dream-inducement of the title track, you’ll start jumping up and down without even realizing it.  It’s “Slow Death”, though, that brings the creepy horror film vibe to its peak, with its atmosphere of dread fueled by a musical space that is not often found in Uncle Acid songs.

What it comes down to is this:  if you haven’t soured on hard rock riffs after all of these years, then The Night Creeper is right up your alley.  Otherwise, take a pass.


Chelsea Wolfe – Abyss


Abyss is, at first blush, loud and crushingly heavy.  This is, of course, not new territory for Chelsea Wolfe; the L.A. singer-songwriter has claimed black metal, doom, drone, and dark ambient music as her influences since the very beginning.  Compared to her last album, 2013’s Pain Is Beauty, however, it’s practically a doom metal album in its own right.  A good deal of this is down to the presence of guitarist Mike Sullivan, whose post-metal group Russian Circles sets the standard for crushingly heavy guitar work.  The very first moments of “Carrion Flowers” make for the most oppressive sounds Chelsea Wolfe has ever engaged in, and the way her dusky voice cuts through the thickness is a moment of sheer frisson.  The album cover sets the tone perfectly:  the singer falling into deep water, sinking beyond breath, light, and life.

Unlike many of her influences, however, she manages to expertly balance oppressive heaviness with passages of lighter (though no less eerie) folk work; “Iron Moon” is the standard-bearer for this, shifting from the pound of sledgehammer guitars to fingerpicked strings and vocals with ease and a deftness of which a thousand grunge bands from two decades prior could only dream.  “Maw” and “Crazy Love” focus more on the quieter parts, outlining a masterful interplay between acoustic instrumentation and the singer’s emotive voice.  She even manages, on “Grey Days”, to incorporate programmed drums without having it sound out-of-place, or like bad Evanescence.  It’s gothic-tinged rock done correctly, without angst or pandering to the over-makeup’d karaoke set.

Abyss takes Chelsea Wolfe’s music to a new, heavier level that plays up her influences while still keeping the proceedings firmly in her own camp.  At times it feels as though the music is creeping out of your speakers to surround you, and smother you in darkness.  Rather than go over-the-top in this, like many of her influences, she keeps her music agile, dynamic, and always interesting.

The Nightmare Collective: A New Horror Anthology



THE NIGHTMARE COLLECTIVE: AN ANTHOLOGY OF SHORT STORIES, a great site for general horror, has released their first curated anthology of creepy, terrifying, spine-tingling tales.  Why am I telling you this?  Because I’m published in it, of course.  You can check out 12 killer pieces of short fiction, including my own “The First Mark Of Survival”, by grabbing it from the Amazon link I’ve very helpfully provided.

Don’t forget to leave a review when you do grab it!