Text Mining: Salem’s Lot


SALEM’S LOT (1975)

Alright, now that we’ve established there’s some preliminary evidence of a link between emotional sentiment peaks and the plot progress of a Stephen King novel, let’s keep going. The goal is to see whether there are patterns, and also to build a corpus of King material that we can use for topic modeling and other fun supervised/unsupervised machine learning stuff.

So, let’s go to the Lot as it slowly turns into a vampire colony.

First of all, let’s start with the soundwave-graph that gives us an overall look at the sentiment values for Salem’s Lot.

SLSoundwave

This gives us a fun snapshot of what the book looks like as a whole, but it doesn’t impart as much useful data as the one for Carrie did. This is likely because there are far more data points to draw from in Salem’s Lot than in Carrie. This is going to happen when you compare a novel of 61,000-odd words to the 153,000 words that comprise Salem’s Lot. We’ll have to set boundaries and break things down to get anything useful. We’ll start with words.

SLContributionToSentiment

Here’s how much each word contributed to the overall sentiment. Notice that once again negative words completely overwhelmed their positive counterparts. Pretty expected for a horror novel, especially one as raw and creepy as King’s take on the Dracula mythos. “Fell” is once again up there but “Dark” and “Dead” win the sentiment sweepstakes with each racking up over 100 negative points to the total.
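For anyone who wants to play along, here is a minimal sketch of how a per-word contribution tally like this could be computed. It assumes each word's contribution is simply its count multiplied by its lexicon value. The tiny `TOY_LEXICON`, the sample sentence, and the function name `word_contributions` are all made up for illustration; they are not the dictionary or the code behind the chart above.

```python
from collections import Counter
import re

# Hypothetical toy lexicon (word -> sentiment value), for illustration only.
# A real run would load a published sentiment dictionary instead.
TOY_LEXICON = {"dark": -1, "dead": -1, "fell": -1, "love": 1, "good": 1}

def word_contributions(text, lexicon):
    """Return {word: count * value} for every lexicon word found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in lexicon)
    return {w: n * lexicon[w] for w, n in counts.items()}

sample = "The dark house fell silent. The dead stayed dead, and the dark was good."
contributions = word_contributions(sample, TOY_LEXICON)

# List the words from most negative to most positive contribution.
for word, score in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(word, score)
```

Swapping in the full text of a novel and a real lexicon should give the same kind of ranked list, with the heavy hitters at the top.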

In further fact, the mean sentiment score for the book is negative. The most negative chapter clocked in at -75 and the most positive chapter was only scored at 46. The mean sentiment score for the Lot as a whole was -3.519. This score will mean more when we stack it up against other books, but for now the takeaway is that Salem’s Lot is overall a pretty negative book.
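As a rough sketch of where numbers like these come from: if each chunk of text gets a score by summing the values of its scored words, then the mean, minimum, and maximum fall out directly. That per-chunk sum is an assumption on my part here, and the lexicon, the three sample chunks, and the name `score_chunk` are hypothetical stand-ins rather than the real data.

```python
from statistics import mean

# Hypothetical toy lexicon (word -> value); not the dictionary used in the post.
TOY_LEXICON = {"dark": -1, "dead": -1, "afraid": -1, "love": 1, "warm": 1}

def score_chunk(text, lexicon):
    """Sum the lexicon values of every scored word in one chunk of text."""
    cleaned = (word.strip(".,;:!?\"'") for word in text.lower().split())
    return sum(lexicon.get(word, 0) for word in cleaned)

# Made-up chunks standing in for the book's scenes.
chunks = [
    "The house was dark and the town was dead.",
    "She felt warm and in love.",
    "He was afraid of the dark, afraid of the dead.",
]

scores = [score_chunk(c, TOY_LEXICON) for c in chunks]
summary = {"mean": mean(scores), "min": min(scores), "max": max(scores)}
print(scores, summary)
```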

In fact…

SLOverTime

Unlike Carrie, Salem’s Lot actually has a linear relationship between the sentiment score and the chapter number; that is to say, the book starts at roughly zero and gets more negative the further in we go. A linear regression model fit to this data shows a statistically significant relationship between sentiment and chapter (p = 0.000541, which satisfies even the “p must be less than 0.005” people), although it is a rather weak one (R² = 0.06417, meaning chapter position explains only about 6% of the variation in sentiment, or not damn much). This is an interesting observation on its own, but how interesting it looks will depend on whether this sort of relationship holds across books, authors, and genres.
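To make the slope and R² concrete, here is a small ordinary least squares sketch in plain Python. The five data points are invented for illustration and are not the book's scores, and `linear_fit` is just a name I picked. It leaves out the p-value, which needs a t distribution that a statistics library would normally supply.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor fit, R squared is the squared correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared

# Invented example: chunk number vs. sentiment score.
chunk_numbers = [1, 2, 3, 4, 5]
sentiment = [2, 0, -1, -3, -3]

slope, intercept, r_squared = linear_fit(chunk_numbers, sentiment)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
```

A negative slope is the "gets more negative the further in we go" part; R² says how much of the bouncing around that downward line actually accounts for.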

So, as with Carrie, the best way of fitting the sentiment plot graph to the actual plot progression of Salem’s Lot is to watch the peaks. In this case, the magic number seems to be 25: that is, most sentiment in the novel is scored between 25 and -25; the peaks beyond these points are where the real action happens. To make the data easier to read there will be two graphs: the first illustrates where the peaks over 25 occur (positive guide posts) and the second where the peaks under -25 occur (negative guide posts).

SLHighPoints

SLLowPoints

Notice that the last positive guide post occurs just over a quarter of the way through the book. Did I mention it gets negative over time?
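The guide-post rule is easy to sketch in code: keep the chunk numbers whose score is strictly beyond the threshold in either direction. The scores below are invented for illustration, and `guide_posts` is a hypothetical helper name, not something from the original analysis.

```python
THRESHOLD = 25  # the "magic number" for this book

def guide_posts(scores, threshold=THRESHOLD):
    """Return 1-based chunk numbers of the positive and negative guide posts."""
    positive = [i for i, s in enumerate(scores, start=1) if s > threshold]
    negative = [i for i, s in enumerate(scores, start=1) if s < -threshold]
    return positive, negative

# Invented scores; note that exactly 25 or -25 does not count as a guide post.
example_scores = [3, 30, -12, -40, 25, -25, 26]

positive, negative = guide_posts(example_scores)
print("positive guide posts:", positive)
print("negative guide posts:", negative)
```

Using a strict comparison means a score sitting bang on the threshold is left unmarked.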

Our first positive guide post is 4, which is a news report after the fact about weird disappearances in the sleepy Maine town of Salem’s Lot; its matter-of-fact Seventies-newsman language is where the positive score is coming from. What it’s describing is ominous, and after the fact it’s rather horrifying, but it’s a brisk, newsy account stripped of the sort of “horror language” that King uses elsewhere.

12 is the first major high point and there’s good reason for that: it’s the chapter where writer-protagonist Ben Mears finds himself falling for restless heroine Susan Norton. They get along like a house on fire and that really drives the sentiment score upward for this scene.

16, the first negative guide post, is the flip side of this; this is the scene where Ben talks to Susan and explains his lifetime obsession with the Marsten House, and we also get some background on Hubie Marsten, the ostensible Man of the House. Hubie was a character – made his money with the mob as a killer, murdered his wife, hanged himself, but not before inviting the ancient vampire Barlow to come shack up at his house when the old bastard got a chance.

We then have a pair of guideposts, first negative then positive, although the language here is deceiving; the negative chapter introduces us to a very positive character, the young Mark Petrie, while the positive chapter introduces us to Straker, who is the Renfield to Barlow’s Dracula. The Mark chapter details his fight with the schoolyard bully (hence the negative language score) and the Straker chapter is much more neutral and business-oriented. Like the newsy scene at 4, Straker’s scene at 28 is told in very positive, business-oriented language. Larry Crockett, the Lot’s premier real estate wheeler and dealer, trades the Marsten House and an old laundromat for $4 million in land elsewhere. The real underlying context of the chapter, of course, is contained in the last line.

When they crossed the street, Lawrence Crockett was thinking about deals with the devil.

Then there’s a period of lull where the sentiment scores slowly begin to creep up into wider ping-pongs; while none of them manage to cross the 25 threshold, there’s some definite logic to the peaks contained in this section. The high points at 33 and 52 concern Ben Mears meeting and winning over Susan’s father, if not so much her mother. The low point at 42 is the creeping scene where Royal Snow and Henry, a couple of Crockett’s day labourers, unload the vampire’s casket from the ship it came on. None of them are particularly important points in the plot line – hence their being below the magic number for the book – but they add some colour and their sentiment-context match bears mentioning.

The high point at 57 breaks the threshold finally and with good cause; the scene is where Ben Mears is introduced to the Van Helsing of Salem’s Lot, high school English teacher Matt Burke. King, meet King. It’s a major scene, since the two of them together are going to do some serious damage to Barlow. It’s also the last real positive height of the novel, since the final peak, 61, is carried over into being a positive guide post on the strength that it’s Danny Glick’s funeral, and Father Don Callahan’s Christian invocations at the ceremony make what is a rather dark experience – the funeral of a child – into a positive sentiment peak. That it is a peak makes sense from a plot standpoint: Danny Glick’s funeral is where it all goes south, and everything that comes after that is the chronicle of the spread of vampirism in the Lot. However, it’s marked as highly positive because much of the Christian language employed by Callahan in this scene is in the positive section of the NRC sentiment dictionary.

“He was nourished with your body and blood; grant him a place at the table in your heavenly kingdom. We ask this in faith.”

62/63 immediately follow and plot out important scenes. 62 finds gravedigger Mike Ryerson meeting the vampire, in particularly creepy fashion at the cemetery as he’s trying to bury the coffin of Danny Glick. 63 finds Mark Petrie wrestling with the concept of death – an important factor in his actions and character growth throughout the rest of the book.

After that, as the linear regression model shows, it’s negativity all the way down. Not marked but worth mentioning is 92: it’s the peak that’s bang on -25 and thus just barely missed getting marked out by the graph. That scene is the beginning of the second half of the book, “The Lot (III)”, where the spread of the vampire really starts ratcheting up. In particular, 92 is the scene where dull, violent Sandy McDougall discovers that the infant son she’s been abusing has succumbed to being a vampire.

The remaining negative peaks tell the rest of the story. At 103 Mark Petrie discovers that vampires exist when Danny Glick shows up floating at his window begging to be let in. The even-lower scene at 114 is when the Lot’s good doctor Jimmy Cody also discovers that vampires exist, this time with Ben Mears. The similarly negative scene at 118 details the ill-fated trip by Susan and Mark to break into the Marsten House and figure out what was going on. This is the scene where Susan more-or-less dies; at any rate, she’s captured by the vampire and put out of play. Mark barely escapes with his life but that’s in a later scene, of course.

They found themselves listening to the silence, fascinated by it. There did not even seem to be the faint, high hum that comes in utter stillness, the sound of nerve endings idling in neutral. There was only a great dead soundlessness and the beat of blood in their own ears.

The final two negative peaks are the only two scenes that really matter in the climax of the book. The first is where Callahan faces the vampire in a battle of Christian faith versus ancient evil and loses. He eventually redeems himself, of course, but that’s not here: here, Father Callahan fights the vampire and ends alive but unclean and cast out. It’s a climax for the Barlow movement, since it caps off the deaths of Susan Norton, Matt Burke, and something like 80% of the minor characters that people the Lot. The second is the reverse – where Ben and Mark finally face Barlow and defeat him. It’s markedly negative, the most negative scene in the entire book. It starts, in the background, with the death of Jimmy Cody even as Ben and Mark discover the whereabouts of Barlow’s coffin, which he has hidden beneath the very rooming house that Ben had taken up residence in at the beginning of the book. It ends with Barlow’s writhing death at Ben’s hands, and Ben and Mark fleeing with a vow to come back and finish the entire vampire colony off.

After that final negative guide post we get the denouement, which, like Carrie’s, peaks a little positive before ending in a negative downturn. The final downturn is another whole mess of death, this time of all the vampires remaining in the Lot, by fire.

So what can we learn from this? Well, a few observations. There is a definite pendulum effect in both Carrie and Salem’s Lot where scenes bounce back and forth between positive and negative sentiment. As in Carrie, there are also three sections of middling sentiment that lie between sharp peaks in sentiment – stable periods of build-up. The period between 63 and 103 becomes a bit wilder, sentiment-wise, near the end of it, and slowly trends more negative; the periods between 118 and 145 and between 145 and 174 have wild swings right near their peaks as well but remain much more stable in terms of overall sentiment.

Another thing is that Salem’s Lot is, as we can see from the visualized data above, a natural fit for the stages of plot development. There is a section of exposition encompassing guideposts 4, 12, and 28 – meet the town, meet Ben and Susan, meet Straker and by implication Barlow. Mark’s introduction comes much later, but his introduction always feels late to me, every time I read the book, so this makes some sense to me. Following 28’s sentiment peak we see a period of rising action (or, in this case, “lowering action”) as the language in the book slowly becomes more negative. This rising action hits bigger and bigger negative peaks until we hit the climax at 174. Following that climax we have the denouement as mentioned above, with a little downturn-twist on the tail.

Coming soon: Rage and the beginning of the question of King vs Bachman and the visualization of that data.
