Text Mining: Cujo

When it comes to my least favourite King novels, Cujo is third. Why? It’s disjointed, for one; a lot of the book is taken up by the foibles of the Sharp Cereal Professor and honestly I can’t bring myself to care enough about the dying art of marketing kids’ cereals in the early 1980s. Also, the Trentons are not sympathetic characters. Look, I’ve written elsewhere about how your characters don’t necessarily need to be likable. I’ve gone off at length about how needing your characters to be the reader’s best friend is just a trap that encourages an immature fanbase that will rise up and kidnap you when you decide to kill those characters off…

Wait, actually, I think that was Misery.

Anyway, the Trentons are middle-class assholes. Vic somehow seems to be completely oblivious to the fact that his wife is feeling trapped in their bourgeois life of comfort and Donna spins in place, rebelling against all that awful success that has her feeling like youth has passed her by. Rather than, I don’t know, taking up a hobby or volunteering in her community, she decides to lie, by having an affair with the dirtbag painter/poet who seems more than a little nuts. I mean, if you want to have sex with someone else because you’re feeling bored, maybe communicate some of those feelings with your significant other. So many of the problems in this book could have been solved if the characters would just talk to each other about more than petty surface-level bullshit. Meanwhile, little Tad is having the psychic equivalent of a nervous breakdown but his parents are too self-absorbed to even notice.

The Chambers family, meanwhile, is much more sympathetic, despite being the nominal antagonists, sort of. Cujo is the antagonist, sure, because he kills a lot of people (including Sheriff Bannerman from The Dead Zone) but he’s just a rabid dog. Poor guy. Joe Chambers should have gotten him his rabies shots, but in the end Cujo’s rabies is the product of a very tragic, very archetypal human story. Joe is a solid, working-class man who’s been bitten both by the drink and by the bigotry of low expectations. He’s an alcoholic, sure, but his place in rural Maine society is such that no one expects anything else out of him. He drinks, fucks off work, fishes with the boys, and lives that good ol’ boy life that marks him out as a certain sort of person. His son Brett is following quickly in his footsteps, and that’s why Charity Chambers is desperate to get him away from his father’s influence. When her lottery win comes along, it’s the perfect chance to do just that, and herein is a far more interesting tale than the Trentons’ narcissistic story. Will the boy become the man? Will the son become the father? Can Charity make sure her son grows up right, and not become the latest in a long line of Maine tosspots?

Plus, King breaks the cardinal rule of mainstream horror, which is that if one of your main characters is a kid, they have plot armour. When Tad dies, I prefer to think of it as the cocaine talking, rather than King, because he should know better. He’s seen all the movies. I know, I’ve read Danse Macabre.

Anyway, Cujo looks like this:

CujoSoundwave

So we have here a book whose major emotional volume seems to occur at the beginning and through the middle; the end tends to be quite a bit quieter.

CujoDistro

Unlike a lot of King’s previous books, the sentiment in Cujo seems more evenly distributed. Also of note is that nothing goes over the +20 mark. The full stats range looks like this:

Min: -61

Max: 19

Median: -9

Mean: -12.36

Fairly negative, as far as King books go.
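For anyone who wants to reproduce this kind of summary, here is a minimal Python sketch (the analysis in these posts was done in R, so this is a stand-in rather than the original code). It assumes each section of the book has already been reduced to a single net score, positive words minus negative words. The function name `summarize` and the numbers in `toy_scores` are made up for illustration; they are not the Cujo data.

```python
from statistics import mean, median


def summarize(scores):
    """Return min, max, median and mean of per-section net sentiment scores."""
    return {
        "min": min(scores),
        "max": max(scores),
        "median": median(scores),
        "mean": round(mean(scores), 2),
    }


# Hypothetical per-section scores (positive words minus negative words).
# These are invented numbers, not the actual Cujo results.
toy_scores = [4, -9, -61, 19, -12, -3, -20]

summary = summarize(toy_scores)
print(summary)
```

A mean that sits below the median, as it does in the real stats above, suggests that a few very negative sections are dragging the average down.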

CujoNegative

There are a lot of negative spikes here but there are three that stand out as guideposts for the book: 10, 45, and 77. They divide the book into three basic parts, with the last one of course being quite short (which seems to be a King hallmark). 10 kicks off Cujo’s downward spiral, 45 is where Donna and Tad arrive at Chez Chambers only to find a rabid dog prowling the yard, and 77 is where Vic realizes that Tad and Donna are missing and the whole “Tad dreaming about dying” thing comes crashing down on him.

Note that the negative sentiment peak isn’t even where Tad dies. It isn’t even at 80, the little spike that comes afterwards; that’s where Cujo dies. Tad’s death is relegated to the muckery near the end of the book, close to the zero mark. Like Wilson in The Naked And The Dead, Tad Trenton’s death occurs between one moment and the next, and his passing isn’t even surrounded with sentiment cues.

Which brings up my ultimate beef with this book: Tad exists in this book just to die. We don’t focus on him much, his story gets lost among everyone else’s and he exists in the car just to provide a foil for Donna to try to keep going. He’s a sketch of a character and his death is used as a token in the ongoing saga of the Trenton Marriage.

So there’s that.

Finally, the word contributions:

CujoWordContri

Note that “hot” is quite misleading here; Tad dies of thirst in a hot car, but the lexicon counts “hot” as positive, meaning the book is probably more negative than the mean indicates.
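To make the effect of one miscounted word concrete, here is a small Python sketch. The lexicon, the sentence, and the function names are all illustrative assumptions; nothing here is copied from the NRC lexicon or from the book. It scores a word list once with “hot” treated as positive and once with “hot” removed from the lexicon.

```python
from collections import Counter

# Toy sentiment lexicon. The words and polarities are illustrative
# assumptions, not entries copied from the NRC lexicon.
TOY_LEXICON = {"hot": 1, "love": 1, "dead": -1, "scream": -1}


def net_sentiment(words, lexicon):
    """Sum lexicon polarities over all words; unknown words count as zero."""
    return sum(lexicon.get(w, 0) for w in words)


def contributions(words, lexicon):
    """Per-word contribution: polarity times number of occurrences."""
    counts = Counter(w for w in words if w in lexicon)
    return {w: lexicon[w] * n for w, n in counts.items()}


# Made-up sentence, already lowercased and split into words.
words = "the car was hot so hot and the dog was dead".split()
original = net_sentiment(words, TOY_LEXICON)

# Drop "hot" from the lexicon to see how much it was propping up the score.
without_hot = {w: s for w, s in TOY_LEXICON.items() if w != "hot"}
adjusted = net_sentiment(words, without_hot)

print(original, adjusted, contributions(words, TOY_LEXICON))
```

In this toy case the net score flips from positive to negative once “hot” stops counting as a good thing, which is the same kind of distortion described above, just in miniature.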

Text Mining: Firestarter

Firestarter: another classic King tale of a troubled young girl who develops strange psychic powers and uses them to literally burn people alive. Charlie and her dad are chased by a mysterious U.S. alphabet agency bent on weaponizing the intersection of science and paranormal research. Half the book is the chase; the other half is the catch, and that combination makes for some interesting results, as we’ll see.

The stats:

MIN: -237

MAX: 28

MEDIAN: -1

MEAN: -5.734

Just as a quick aside, the min/max values themselves aren’t going to show much in the end, as a lot depends on the length of chapters and how King divides them up (or, in some cases, how I divide them up, like when there are no traditional chapter breaks). I include them for the sake of completeness but they’re not neatly comparable across texts. The median is a little more useful but the mean value will likely be the most useful statistic that comes out of the NRC sentiment analysis.
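One possible way around the chapter-length problem, sketched here in Python as an assumption rather than something these posts actually do, is to express each section’s net score as a rate per 1,000 words. The function name `per_1000_words` and the numbers are hypothetical.

```python
def per_1000_words(net_score, word_count):
    """Scale a section's net sentiment score to a rate per 1,000 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000 * net_score / word_count


# Hypothetical sections as (net score, word count). Invented numbers.
short_section = (-20, 2000)
long_section = (-100, 10000)

rates = [per_1000_words(s, n) for s, n in (short_section, long_section)]
print(rates)
```

The long section has a raw score five times lower than the short one, but both come out at the same rate, so the difference in the raw minimum is purely a matter of length.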

So, Firestarter looks like this:

FirestarterSoundwave

Notice the values on the Y axis. Negative scores range down to -1000, although only on a couple of occasions. There are two bombs dropped in terms of emotional sentiment in this book, and, interestingly, not a lot of major activity for large parts of the book.

There’s some mild activity right at the beginning, as we go in media res to Andy and Charlie’s first escape from the Shop. Then there’s nothing until the early 30s when the showdown at the farm happens. Then the twin bombs, without much between or after them.

FirestarterDistro

In fact, 90% of the book falls between -50 and 50, which is pretty calm overall.
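A figure like that 90% is just the share of sections whose score lands inside a band. A minimal Python sketch, with a hypothetical helper `share_within` and invented scores rather than the Firestarter data:

```python
def share_within(scores, low, high):
    """Fraction of scores that fall in the closed interval [low, high]."""
    if not scores:
        raise ValueError("scores must not be empty")
    inside = sum(1 for s in scores if low <= s <= high)
    return inside / len(scores)


# Hypothetical per-section scores; one big negative outlier among calm ones.
toy_scores = [3, -12, 28, -237, -1, 7, -45, 10, -8, 0]

share = share_within(toy_scores, -50, 50)
print(f"{share:.0%}")
```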

FirestarterNegative

Based on where most of the distribution tends to fall, these are the peaks. Note there is movement along the line but it remains within a definite range, briefly going down in a very minor way in 31 and 37 – the former where we meet psychotic antagonist John Rainbird, and the latter where Charlie blows the shit out of the Shop men who’ve come to the farm to abduct her.

In fact, except for 53 and 66, it keeps an even keel. 53 and 66, incidentally, hit those peaks because they are a lot longer – all the big action happens in them. 53 is where Charlie and her dad finally get picked up by the Shop. 66 is where everyone confronts each other and the Shop gets ripped apart and Charlie’s dad finally finishes the job of dying.

This whole pattern is interesting because of a Goodreads review I came across. GR user Councillor panned it with a one-star rating and his thoughts on it can be summed up like this:

“Firestarter was one of the most boring, long-stretched, boring, uninspiring, badly-written and – did I mention this already? – boring novels I have read for a long time.”

At one point he bemoans the idea that there is nothing to sink yourself into for 350 pages and guess what? The sentiment analysis bears this out. There are a number of stretches of this book where nothing happens, emotionally speaking. The line graph above shows what looks more like a slow slog than anything else.

Look, I liked Firestarter, but there’s definitely a whole lot of nothing going on for long parts of it. When it hits, it hits hard, but a lot of it is Charlie being a scared kid and Andy trying to soothe her while worrying about blowing out his brain, intentionally or accidentally.

FirestarterSmoothLine

The smooth line graph shows this quite well, I feel. From 0 until about 50 or so there’s nothing going on in terms of an up or down movement in sentiment. That’s 62.5% of the book. Councillor is really on to something here: unlike a lot of other King books, this one looks stretched out, staid, and not very interesting if examining it solely on a statistical basis. Some objective empirical evidence to support a feeling about a book is always a result of interest.

To cap it off, here are the word contribution scores:

FirestarterWordContri

Text Mining: The Dead Zone

You want to talk about an out-there outlier for what we’ve seen of Stephen King’s bibliography so far, let’s talk about The Dead Zone.

A quick run-down: John Smith suffers a head injury as a kid but comes out mostly ok. Greg Stillson is a crazy but wildly charismatic traveling salesman. Johnny becomes a teacher, falls in love, and then is driven into a coma by a car accident. When he emerges he has wild psychic powers where he can touch people and know both their secrets and their future. He endures some tabloid celebrity, solves murders, tries to keep teaching and being normal, saves some kids from dying, and then discovers that Stillson, now running for office, is going to win and eventually become President briefly before destroying the world in a nuclear holocaust. Johnny becomes a would-be assassin, dying but also revealing Stillson to be a huge coward and an electoral loser after he grabs a kid as a human shield. It’s a timely examination of the American hunger for an end to the seemingly endless corrupt two-party circus and a bit of a satire of the then-blossoming American Tabloid market.

Text Mining: The Shining

I completely skipped The Shining somehow, so we’ll circle back and do that one now.

The Shining (1977)

Stephen King’s third novel finds him cycling through his own takes on all the classic horror bits: the avenging revenant of Carrie, updating Bram Stoker’s Dracula to the modern (in 1976) age in Salem’s Lot, and now the Haunted House – in this case, a whole haunted hotel. There’s an element of Shirley Jackson’s The Haunting Of Hill House in Salem’s Lot as well; the house that the villain Barlow moves into in the Lot is a long-time haunted house inhabited by cursed individuals. The Overlook Hotel has been the destination of rich, shady people since its inception and by the time full-time alcoholic/on-his-last-chance writer Jack Torrance comes around to be its winter caretaker, it’s charged with their energies: the awful, unspeakable emotions that were left behind and whose ghosts now bestow a strong, malevolent force of will upon the hotel.

Making matters worse, Jack’s sweet son Danny has psychic powers called “shining” or “the shine.” These powers lend him a sensitivity to the problems with both the hotel and his father; he’s taught something about them by Magic-Negro-In-Chief Dick Hallorann. After the winter storms come to the mountains of Colorado, Jack, Wendy, and Danny Torrance are locked away in the hotel, dealing with boredom, dread, and madness. As you might imagine (or as you probably saw in Kubrick’s adaptation), murder ensues, although in the book it doesn’t actually ensue. In comparison to King’s other novels there’s very little outright death – funny, given it’s a haunted house novel full of ghosts. The only person who dies is Jack himself, though; Wendy, Danny, and (despite what you saw on film) Dick all survive to relax by a pool in the novel’s last chapter. The dead that haunt the Overlook are the manifestations of dead memories; the people they depict didn’t necessarily die at the hotel, save of course for the bathtub suicide. In this sense, Kubrick’s film misses the point entirely.

The difference between the book and the movie goes like this:

The Shining, the Stephen King novel, is a book about a man struggling to overcome his addictions and his own innate nature in order to be the best husband and father he can be for his family. It’s a book about how the past is a millstone hung around our necks by our parents, and about how that millstone will try to drag us down whenever we let our guard down. Jack Torrance’s father was a sodden, abusive drunk and Jack Torrance is afraid that he’s really no different. He knows the hotel gig in Colorado is his last chance; if he fucks this up, his wife and kid are leaving him and he’s probably going to kill himself. He struggles so hard against it but eventually it drags him under; he manages to defeat that base nature in the end and saves his family, although he destroys both himself and the hotel. The ghosts in the hotel are a part of this metaphor – they are the negative, awful emotions of the past lingering to trip Jack up and make him into what he most fears.

The Shining, the Stanley Kubrick film, is a movie about a haunted hotel.

ShiningSoundwaveTrue

This is actually one of those instances where the initial soundwave graph can be a little misleading. Looking at this one might think that the second and third quarter are where the peaks are, but they’re really just showing the overall volume of those chapters, in terms of emotional sentiment. Other graphs show some different takes, which reveal something that brings us back to Salem’s Lot.

ShiningLeastSquaresScatterPlot

Yes, that’s right ladies and gentlemen, just like Salem’s Lot, The Shining has a statistically significant negative linear relationship between sentiment and time. The coefficient here is -0.5648 – that is, for every chapter that goes on we get an average drop of about 0.56 in emotional sentiment scores (P=0.0179). Something interesting I found, in terms of its similarity to Salem’s Lot, lies in a question asked on Goodreads. GR user Beesarahlee asked:

This is my favorite Stephen King book,I love the how it slowly stresses you out! Anyone on this site know of any other books that are very similar to this?! I love anything to do with witches and vampires! Thanks for the help!

Well Beesarahlee, you are exactly correct. Salem’s Lot slowly gets more negative over time in terms of its sentiment scores, and so it does in fact slowly stress you out. Excellent observation, and one that we can empirically prove. How much fun is that?

Anyway, The Shining does the same thing. It starts off near neutral and carefully structures the positive and the negative so that it creeps toward the bottom at an even pace. Rather than striking big emotional chords at strategic times, it slowly terrifies you, like a boa constrictor, or that proverbial frog in a pot of water that’s really just a metaphor for human extinction.
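For readers who want to see where a coefficient like the -0.5648 quoted earlier comes from, here is a minimal Python sketch of an ordinary least-squares slope of score against chapter number. The function name `least_squares_slope` and the toy scores are assumptions for illustration, not the Shining data, and the sketch only computes the slope; the P value would need a proper statistics package.

```python
from statistics import mean


def least_squares_slope(ys):
    """Slope of the least-squares line through (1, ys[0]), (2, ys[1]), ..."""
    if len(ys) < 2:
        raise ValueError("need at least two scores to fit a line")
    xs = range(1, len(ys) + 1)  # chapter numbers 1..n
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-style numerator and variance-style denominator.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical per-chapter scores that drift downward over time.
toy = [2, 1, -1, -2, -5]

slope = least_squares_slope(toy)
print(round(slope, 2))
```

A negative slope means the fitted line loses that many points of sentiment, on average, with each additional chapter.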

ShiningSmoothLine

The smooth-line graph shows much the same, with the interesting caveat that there is a brief reprieve somewhere in the mid-30s that doesn’t last long before the descent picks up speed again.

ShiningHist

The distribution histogram, though, shows that a good third of the book is actually positive sentiment, which is strange to see given the basic stats for it:

Min: -94

Max: 47

Median: -14

Mean: -19.28

That’s quite a low mean sentiment score, given our other examples. The Stand and The Long Walk are the only ones with lower mean sentiment scores, including two books I have the stats and graphs for but haven’t done write-ups on yet. Yet a third of the book is above the zero line.

ShiningPosti

Here are the positive sentiment peaks for the book, as well as a look at the heartbeat/line graph for the overall piece. 9-12 (with 11 to break up the flow) look like a plateau before everything starts to go to hell; if I had to guess without examining the text, I would think it’s the part where they first move into the hotel and Jack/Wendy/Danny get lulled into thinking that everything’s going to be okay, that they’ll pull through the winter, Jack will finish his play, and life will get better from there.

Examining the text, it turns out I’m partially correct, just moved slightly “to the left”. 9 is where the family meets with Ullman, technically Jack’s boss and the manager of the Overlook Hotel. 10 introduces Hallorann, 11 is actually the introduction to the concept of “shining” (and boy are there some disturbing things embedded in there, mirrored in the sentiment) and 12 is the “grand tour” of the Overlook. Later, on the other side of the crash, at 44, Jack dances and drinks and has a grand old time at the Hotel’s Party For The Dead – the glitter and glamour of the glammer outweighs the sheer number of mentions of “blood” in that chapter, just to show you how shiny and glorious it seems to Jack.

ShiningNegative

More negative peaks, of course, but a few things of interest. 16 and 46 are the big negative spikes, and 50-57 is an entire valley of negative sentiment.

16 is where Danny starts falling into weird trances because of the hotel – dreaming about a crashing madman coming through the hallways, getting stung by creepy ghost-wasps, and first croaking “REDRUM” which everyone knows and loves from the film adaptation. It also is the first real kickstart of Jack’s drinking-without-drinking downfall. 46, on the other end of a lot of negative emotion, is where Wendy and Danny play cat-and-mouse with Jack in the hotel amidst a screaming blizzard, and manage to lock him in a room. They make good guideposts for the progression of the novel: 16 is where the hotel actively starts trying to do harm to the family, and 46 is where the hotel has to ramp up its efforts to kill Wendy and Danny. The sentiment scores between 16 and 46 are low but higher than those between 50 and 57, where the hotel attempts its endgame and it’s only through Jack’s honest love of his family that he saves their lives, if not his own. It’s interesting, then, to see that the sentiment scores here match the severity of the antagonist’s efforts to do harm to the protagonists.

I guess that while both Salem’s Lot and The Shining have that negative linear relationship between sentiment and time, they have it for different reasons. In Salem’s Lot it’s because, as Beesarahlee mentions in her GR review, it is structured to slowly stress you out and get more negative over time. In The Shining it’s because the book is structured in a way that brings to mind shifting gears: it starts at one level, a big spike happens, and then it goes on for a time at a lower sentiment level, until it bottoms out right near the end. Not the end, of course, because the final chapter features Wendy, Danny, and Dick relaxing and planning out how to get their lives back in order and it shoots us back up to positive – a happy ending, which is not always in the cards for later King books.

To finish off, the word contributions:

ShiningWordContri

Text Mining: The Long Walk

Now that we’ve established that there is a link between key scenes in the plot progress of a Stephen King novel and mapped sentiment peaks coded from the text, we can spend significantly less time on analyzing each peak to show this. This will allow us to go through books with a little less ponderous text.

The Long Walk (1979)

The Long Walk is another short Bachman novel about sexually frustrated young men. This time it’s about the contestants of a gruelling, cruel national sport instituted after America’s loss in the Second World War and the institution of military rule by “The Squads.” The backdrop is only briefly described, but it is evocative whenever it is mentioned. At any rate, the protagonist is one of 100 contestants who start the Long Walk. They have to keep walking at a certain speed or they are shot by soldiers who are driving around beside them. They get three warnings to get their speed back up, otherwise the guns ring out and down goes another contestant. It’s a pretty horrifying idea when it comes right down to it, if only for how weirdly plausible it is given the modern love of both spectacle and fascism. It’s also pretty psychologically taxing, especially once the weakest contestants die off and it becomes a game to walk your opponents into the ground.

Text Mining: The Stand

The Stand (1978)

So…it may behoove you to know that The Stand, King’s gigantic, bloated, sprawling epic, was picked by American adults in 2008 as their fifth-favourite book of all time. The Bible was #1 – this is America that was being polled, after all – but The Stand kept company with other books you may be familiar with: Gone With The Wind, The Lord Of The Rings, and the Harry Potter series. Generational touchstones, in other words. As a further fact, Generation X picked it as their #1 favourite (again, behind the Bible). That’s some big company, so an examination of this one should yield some interesting results.

Text Mining: Rage

Rage (1977)

Today we turn our attention to the first Richard Bachman book, Rage, a book that lives up to its name in as pure a fashion as you could imagine. If you haven’t found a copy of this yet, you might want to get on that: they aren’t making any more of them, at the behest of the author. As the events depicted in the book came into depressing vogue in the 21st Century, King feared that the portrayal of Charlie Decker would give aid and comfort to others in similarly desperate emotional situations.

It’s about a school shooter, you see.

Text Mining: Intro + Carrie

As mentioned in my previous post I’m examining Stephen King texts through the magic of text mining, using a number of tools in the R language, but especially through Julia Silge’s tidytext package. The book Text Mining With R: A Tidy Approach by Julia Silge and David Robinson was a godsend in explaining the process of using tidy data formats to store and analyze text-as-data. I will roughly summarize the basics to give you an idea as to what’s involved but there is a great deal more that can be done than I am covering here.

Literary Fun With Text Mining

My wife is doing her PhD in political science on the topic of political interest groups and how they use social media to disseminate information and reach new audiences, and how they utilize this new(ish, wow we’re old) medium to affect voting behaviour. Part of this has meant learning how to mine Twitter data and analyze it through the R programming language; in order to provide technical support and to have someone to troubleshoot coding issues, I’ve also been learning to use R to mine and analyze texts. What I’ve been concentrating on, in order to learn the language and the processes, is using it to mine and visualize data gathered from fictional texts, specifically the bibliography of Stephen King. What I want to do is to analyze plot trajectories drawn from sentiment data – quantitative measures of emotional sentiment words based on established dictionaries used for that sort of thing. Research questions on this would include things like: is there a pattern that King has for his plots, based on emotional language cues? Is this pattern, if any, different from other well-known horror writers? Furthermore, are there established “archetypal” emotional plot patterns for horror books, and do these patterns differ when you switch genres – say, to fantasy, military science fiction, paranormal romance, etc. etc. down the fracture lines of human experience?

Interstitial Burn-Boy Blues

Stuart watched the kid shake and mutter to himself in the seat across the aisle. His skin looked waxy in the dingy interior bus lights, and Stuart was sure that if he reached across and caressed the kid’s forehead with the back of his hand that skin would be near to scalding. He ran his tongue along the back of his teeth and watched the kid carefully. No one else in the general vicinity seemed to be concerned. Stuart noticed an old man dozing in the seat behind the kid, and a young couple murmuring to each other beneath a blanket in the seat ahead of him.