Lexical Analysis of the First 2016 Presidential Debate

Like 80.6 million other viewers, we watched the presidential debate on Monday, but our perspective was a little different. We're here to offer our presidential debate analysis.

Scripted is built to connect businesses with great writers, so we care deeply about language. During the debate there was some back and forth about how much words matter, and we wanted to show just how much you can tell by looking closely at them. Using our platform's technology, we analyzed the transcript of the debate to learn more about each participant's viewpoint and speaking style.

Specifically, we scored each candidate's words for:

Complexity
Grade Level
Subjectivity
Positivity
Repetition

Getting Started

The first thing we needed for our presidential debate analysis was a baseline for comparison. For that, we turned to the database of political articles that have been submitted through Scripted over the years. We also included moderator Lester Holt's portion of the transcript as another point of comparison. Once we'd collected all those texts, we were ready to start measuring.

Complexity

One of the most basic lexical measurements is the complexity of words, which can be approximated by counting the number of syllables. It tells us who was throwing out SAT words and who was pulling from Hooked on Phonics.
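
For the curious, here is a rough sketch of how syllable counting can be approximated in Python. It is not the code our platform runs; the function names are made up for illustration, and the vowel-group heuristic is an assumption that will miscount some English words.

```python
import re

def count_syllables(word):
    """Estimate syllables by counting runs of consecutive vowels (heuristic)."""
    word = re.sub(r"[^a-z]", "", word.lower())
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("debate"), except in "-le" endings ("people").
    if count > 1 and word.endswith("e") and not word.endswith("le"):
        count -= 1
    # Every word is assumed to have at least one syllable.
    return max(count, 1)

def mean_syllables_per_word(text):
    """Average the per-word estimates over a whole transcript."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(count_syllables(w) for w in words) / len(words)

print(count_syllables("temperament"))         # 4
print(mean_syllables_per_word("the debate"))  # 1.5
```

A dictionary-based lookup would be more accurate than this heuristic, at the cost of needing a pronunciation word list.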

Grade Level

The Flesch-Kincaid readability scale allows us to grade the presidential candidates against a typical school curriculum. To put it simply, Mr. Trump's political language reads like The Phantom Tollbooth, Mr. Holt and Secretary Clinton clock in at To Kill a Mockingbird, and our baseline tops the graph at roughly The Brothers Karamazov.
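
The Flesch-Kincaid grade level is a published formula built from two ratios: words per sentence and syllables per word. The sketch below takes the three counts as inputs; the function name is ours, and the numbers in the usage lines are toy values, not figures from the debate.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level from raw counts."""
    if total_words == 0 or total_sentences == 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

# Toy input: "The cat sat. The dog ran." has 6 words, 2 sentences, 6 syllables.
grade = flesch_kincaid_grade(6, 2, 6)
print(round(grade, 2))  # -2.62
```

Very simple text can score below zero, as the toy example does. Longer sentences and longer words both push the grade up.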

Subjectivity

So we know our candidates spoke in short words at a junior high (or elementary school) level, but how did they drive their points home? Objective sentences lay out a balanced view of a topic, while subjective sentences tend to favor one side of an issue. Our algorithm searches for words that indicate subjectivity. For example, "I believe that..." is the start of a subjective statement, while "It is a fact that..." is the start of an objective one.

Of course, just because a candidate claims something is a fact doesn't mean that it is. We'll leave that to the fact checkers. This is merely a measure of how often our candidates relied on personal anecdotes and opinions as opposed to objective statements.
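
To make the idea concrete, here is a minimal sketch of a marker-based subjectivity score: the share of sentences that contain an opinion phrase. The marker list and function name are hypothetical stand-ins, and a real lexicon would be far larger than these four phrases.

```python
import re

# Hypothetical marker phrases; a production lexicon would be much larger.
SUBJECTIVE_MARKERS = re.compile(r"\b(?:i believe|i think|i feel|in my opinion)\b")

def subjectivity_score(text):
    """Fraction of sentences that contain at least one subjective marker."""
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    flagged = sum(1 for s in sentences if SUBJECTIVE_MARKERS.search(s))
    return flagged / len(sentences)

sample = "I believe that taxes are too high. It is a fact that the debate aired on Monday."
print(subjectivity_score(sample))  # 0.5
```

Note that this only detects how a statement is framed. It says nothing about whether the statement is true.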

Positivity

Positivity is a measure of the disposition (dare we say "temperament") of each candidate. Words with negative connotations ('terrible', 'angry', 'frown') bring the score down, while words with positive connotations ('great', 'happy', 'smile') send it back up.

All the transcripts returned a positive polarity, with Secretary Clinton leading the optimism pack. Naturally, we expect the baseline and Mr. Holt to maintain as neutral a polarity as possible, since they address positive and negative topics alike.
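
A word-list approach to polarity can be sketched in a few lines. The two tiny lexicons below reuse the example words from this section, and both the scoring rule (positive minus negative hits, divided by total hits) and the function name are assumptions for illustration rather than our production scoring.

```python
import re

# Tiny illustrative lexicons built from the example words in this section.
POSITIVE = {"great", "happy", "smile"}
NEGATIVE = {"terrible", "angry", "frown"}

def polarity(text):
    """Score from -1.0 (all negative hits) to 1.0 (all positive hits)."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    matched = pos + neg
    if matched == 0:
        return 0.0  # no lexicon words found, so treat the text as neutral
    return (pos - neg) / matched

print(polarity("a great and happy day"))  # 1.0
print(polarity("a terrible, angry day"))  # -1.0
```

A score near zero means positive and negative words roughly cancel out, which is what we would expect from a moderator or a balanced article.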

Repetition

Secretary Clinton, whose answers clocked in at a total length of 5,251 words, spoke significantly less than Mr. Trump, who spoke 7,514 words. Mr. Holt, meanwhile, spoke 1,783 words. But the overall count of words doesn't actually indicate how many distinct ideas were communicated. Debates (and campaigns more generally) are full of repetitive speech. Secretary Clinton's low score indicates she spoke with the least repetitive language, while Mr. Trump and Mr. Holt repeated themselves more often.
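
One simple way to put a number on repetition is to compare the count of distinct words with the total word count. The sketch below is one possible metric, not necessarily the one behind our scores, and the function name is hypothetical.

```python
import re

def repetition_score(text):
    """1 minus the share of distinct words; higher means more repetition."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 1 - len(set(words)) / len(words)

print(repetition_score("jobs jobs jobs taxes"))  # 0.5
```

One caveat: this ratio naturally rises as a text gets longer, so transcripts of very different lengths should be compared over equal-sized samples rather than in full.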

Curious what the buzz phrases of the debate were? We organized them below, removing common "stopwords" such as pronouns and articles, which add structure but can interfere with gleaning context.
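
Here is a minimal sketch of how buzz phrases can be pulled from a transcript: drop the stopwords, then count the two-word phrases that remain. The stopword set is a deliberately short, hypothetical list of pronouns and articles, and the function name is ours.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stopword list (pronouns and articles only).
STOPWORDS = {"a", "an", "the", "i", "you", "he", "she", "it", "we", "they"}

def top_phrases(text, n=10):
    """Return the n most frequent two-word phrases after removing stopwords."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    # Pair each remaining word with its successor. Pairs may span a removed
    # stopword or a sentence boundary, which is acceptable for a rough count.
    bigrams = (" ".join(pair) for pair in zip(words, words[1:]))
    return Counter(bigrams).most_common(n)

counts = dict(top_phrases("We need the jobs. They need the jobs."))
print(counts["need jobs"])  # 2
```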

We'll let you draw your own conclusions about the meaning of this presidential debate analysis. For more insights like these, make sure to check out the Scripted blog.

Published by Boris Vassilev on Thursday, September 29, 2016 in Content Marketing.