Words by Numbers

The Word Choices That Explain Why Jane Austen Endures

Kathleen A. Flynn and

Two hundred years after her death, Jane Austen commands a cultural empire — fan fiction, adaptations, merchandise — with her six novels at the center. It raises the question: Why her, as opposed to someone else?

Franco Moretti, founder of the Stanford Literary Lab, which applies data analysis to the study of fiction, argues that certain books survive through the choices of ordinary readers, a process something like evolution: “Literary history is shaped by the fact that readers select a literary work, keeping it alive across the generations, because they like some of its prominent traits.”

What traits make Austen special, and can they be measured with data? Can literary genius be graphed?

Her novels, reasonably successful in their day, were innovative, even revolutionary, in ways her contemporaries did not fully recognize. Some of the techniques she introduced — or used more effectively than anyone before — have been so incorporated into how we think about fiction that they seem to have always been there.

One thing early readers did notice was naturalism. Austen’s novels, unlike those she grew up reading, owed nothing to improbabilities. No settings of spooky Italian castles (she mocks such Gothic devices in “Northanger Abbey”); no characters kidnapped by rakes or bequeathed a fortune with strange provisions attached. Walter Scott, a popular author of historical novels mostly forgotten today, praised her “art of copying from nature as she really exists in the common walks of life and presenting to the reader, instead of the splendid scenes of an imaginary world, a correct and striking representation of that which is daily taking place around him.”

Let’s try to see this naturalism with data.

We started with a set of English novels published between 1710 and 1920. Using a technique called principal components analysis, we plotted each work on a two-dimensional chart based on the vocabulary in each book. Books closer together on the chart use more similar words.
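For readers curious about the mechanics, here is a minimal sketch of that kind of analysis, not the code behind the chart. It assumes each book is represented by the relative frequency of a handful of words; the three toy "books," the six-word vocabulary and all function names are invented for illustration. It finds the two principal components by power iteration, using only Python's standard library.

```python
import math
import random
import re
from collections import Counter


def relative_frequencies(text, vocabulary):
    """Share of all tokens in the text taken up by each vocabulary word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[word] / total for word in vocabulary]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract each column's mean so variation is measured around the average book."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def top_component(rows, iterations=200, seed=0):
    """Power iteration on X^T X: the unit direction of greatest variance."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in rows[0]]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = [dot(row, v) for row in rows]  # X v
        w = [sum(s * row[j] for s, row in zip(scores, rows))
             for j in range(len(v))]  # X^T (X v)
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def pca_2d(rows):
    """Project each row onto the first two principal components."""
    centered = center(rows)
    first = top_component(centered)
    # Remove the first direction from the data, then find the next strongest one.
    deflated = [[x - dot(row, first) * f for x, f in zip(row, first)]
                for row in centered]
    second = top_component(deflated)
    coords = [(dot(row, first), dot(row, second)) for row in centered]
    return coords, first, second


# Invented toy texts and vocabulary, for illustration only.
vocabulary = ["affection", "conduct", "virtue", "fingers", "grass", "dark"]
books = {
    "abstract_a": "her affection and conduct showed virtue and affection",
    "abstract_b": "his conduct and virtue earned affection and virtue",
    "physical_c": "dark grass under her fingers in the dark grass",
}
rows = [relative_frequencies(text, vocabulary) for text in books.values()]
coords, first, second = pca_2d(rows)
for title, (x, y) in zip(books, coords):
    print(f"{title}: ({x:+.3f}, {y:+.3f})")
```

In this toy example the two "abstract" books land near each other and far from the "physical" one. A real analysis would use full texts, a much larger vocabulary and, most likely, an established statistics library.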

On the horizontal dimension, words farther to the left tend to be more abstract, concerned with states of mind and social relationships: acquaintance, affection, attended, conduct, depend, desire, endeavoured, favour, gratitude, indulgence, merit, obliged, occasion, prevailed, received, resentment, resolution, resolved, suffered and virtue.

Words farther to the right connect to the physical world and the senses: blue, close, dark, edge, empty, fingers, grass, head, hot, outside, picked, rolled, round, shoulder, slipped, slowly, stand, top, watch and white.

The most abstract are the earliest, and Austen’s novels, published from 1811 to 1818, fit right in. The vertical dimension is intriguing, though. The bottom is associated with Macbeth-ish words like banquet, beheld, slain, sword and thee. Scott’s medieval romp “Ivanhoe” (1820) lands at the bottom.

But what about the top, where Austen is nearly alone, with “Diary of a Nobody” (1892) and “New Grub Street” (1891)? This is related to a higher-than-average propensity for words like quite, really and very — the sort that writers are urged to avoid if they want muscular prose. It also connects to time markers and states of mind: always, fortnight and week; awkward, decided, dislike, glad, sorry, suppose.

No “splendid scenes of an imaginary world,” then. But as Virginia Woolf observed about Austen, “Of all great writers she is the most difficult to catch in the act of greatness.” To capture that “correct and striking representation” of which Scott spoke, we need more data.

Another study analyzed words in Austen’s novels, comparing them with a collection of contemporary British fiction, and of British fiction from 1780 to 1820. It found several distinctive aspects.

Austen uses comparatively more words referring to women (“she,” “her,” “Miss”) and to family relationships like “sister,” which is unsurprising considering her subject matter. We also see more words like “content,” “respected” and “expected.” This corresponds with our chart’s horizontal axis.

Austen used intensifying words — like very, much, so — at a higher rate than other writers. The study linked this intensifier use to a crucial trait of her writing, one that might at first seem to resist quantification: irony.
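A rate like this can be approximated by counting. The sketch below is a rough illustration, not the study's method: it assumes the intensifiers are just the three words named above and reports occurrences per 1,000 words. The function name and word list are our own, and the sample sentence is the line from “Emma” with its punctuation simplified.

```python
import re

# Only the three examples named in the article; the study's full list is not given.
INTENSIFIERS = {"very", "much", "so"}


def intensifier_rate(text, intensifiers=INTENSIFIERS):
    """Occurrences of intensifier words per 1,000 tokens of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in intensifiers)
    return 1000 * hits / len(tokens)


sample = ("He appeared to have a very open temper, "
          "certainly a very cheerful and lively one.")
rate = intensifier_rate(sample)
print(f"{rate:.1f} intensifiers per 1,000 words")
```

Counting bare tokens is a crude proxy: “so” and “much” are not always intensifiers, so a careful study would need to check each use in context.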

Traditional literary approaches to Austen have long focused on this aspect of her work: “the incongruities between pretence and essence, between the large idea and the inadequate ego,” as the critic Marvin Mudrick put it. A look at passages where words like very are used frequently often finds the stated meaning conceivably at odds with the real one, the exaggeration subtly inviting doubt.

In “Emma,” Emma and her friend Mrs. Weston discuss the charming Frank Churchill, who Emma, though she has just met him, thinks ought to be attracted to her:

Mrs. Weston was very ready to say how attentive and pleasant a companion he made himself — how much she saw to like in his disposition altogether. He appeared to have a very open temper — certainly a very cheerful and lively one; she could observe nothing wrong in his notions, a great deal decidedly right...This was all very promising; and, but for such an unfortunate fancy for having his hair cut, there was nothing to denote him unworthy of the distinguished honour which [Emma’s] imagination had given him; the honour, if not of being really in love with her, of being at least very near it, and saved only by her own indifference — (for still her resolution held of never marrying) — the honour, in short, of being marked out for her by all their joint acquaintance.

A first-time reader cannot know that Frank is harboring a significant secret, or that Emma does not know her own heart, but the welter of intensifiers creates a sense of insisting too much, that not all is as it seems.

Austen was also a heavy user of words like could and must: those indicating likelihood, ability, permission and obligation. “Could” in her novels often combines with “not,” and tends to be found with words or phrases connected to thinking, perceiving or talking. This reflects the challenge that Austen’s characters, especially her heroines, face in seeing things as they truly are, or in speaking out.
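Patterns like these can be tallied mechanically. Here is a small sketch under stated assumptions: the list of thinking, perceiving and talking verbs is hypothetical (the study's own list is not given here), the three-word window is an arbitrary choice, and the sample sentences are invented rather than quoted from Austen.

```python
import re

# Hypothetical list of verbs of thinking, perceiving and talking.
MENTAL_VERBS = {"think", "believe", "imagine", "see", "hear",
                "perceive", "say", "speak", "tell"}


def could_patterns(text, window=3):
    """Count uses of 'could', how many are negated, and how many sit near a mental verb."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = negated = near_mental = 0
    for i, token in enumerate(tokens):
        if token != "could":
            continue
        total += 1
        if i + 1 < len(tokens) and tokens[i + 1] == "not":
            negated += 1
        # The window ignores sentence boundaries, a simplification.
        following = tokens[i + 1 : i + 1 + window]
        if any(word in MENTAL_VERBS for word in following):
            near_mental += 1
    return {"could": total, "could_not": negated, "near_mental_verb": near_mental}


# Invented sentences, for illustration only.
sample = ("She could not speak. He could see the house, "
          "but she could not believe it. They could walk.")
result = could_patterns(sample)
print(result)
```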

The naïve Catherine of “Northanger Abbey” views the world through the prism of her beloved Gothic novels, misreading reality. Anne Elliot, of “Persuasion,” brought back into contact with the ex-fiancé she still loves, must determine if he retains any feelings for her, but cannot voice her own.

“Must” tends to be found near “to be” and “to have,” in constructions often used in the characters’ speculations about people’s true thoughts or emotions:

She felt that something must be the matter. The change was indubitable. The difference between his present air and what it had been in the Octagon Room was strikingly great. “Persuasion”
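To see which words tend to follow “must,” one simple approach is to tally the word that comes immediately after it. This sketch assumes nothing more than that; the function name is our own, and the sample is the opening of the “Persuasion” passage above.

```python
import re
from collections import Counter


def words_after(text, target):
    """Tally the word that immediately follows each occurrence of the target word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(nxt for tok, nxt in zip(tokens, tokens[1:]) if tok == target)


sample = "She felt that something must be the matter. The change was indubitable."
followers = words_after(sample, "must")
print(followers.most_common(5))
```

Run over a whole novel, a tally like this would show how often “must” is followed by “be” or “have” compared with other verbs.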

This question is at the heart of Austen’s work: What is going on behind the veneer that politeness demands? These distinctive words, word clusters and grammatical constructions highlight her writerly preoccupations: states of mind and feeling, her characters’ unceasing efforts to understand themselves and other people.

Human nature (together with the operation of time) is the true subject of all novels, even those full of ghosts, pirates, plucky orphans or rides to the guillotine. By omitting the fantastical and dramatic elements that fuel the plots of more conventional novels both of her own time and ours, Austen keeps a laser focus.

That points to the traits most crucial to her endurance in the Darwinian struggle for literary immortality: acute emotional intelligence, and a rare ability to render it in stories that amuse even as they instruct. Austen seems to see people so clearly that we, her readers, cannot fail to improve in perceptiveness, too.

A correction was made on July 30, 2017:

A chart on July 16 with an article about how data science can be used to analyze the vocabulary in Jane Austen’s novels misstated the number of works by other British authors being compared with Austen’s. There were 125, not 127.

The Upshot provides news, analysis and graphics about politics, policy and everyday life.

A version of this article appears in print on Page 13 of the Sunday Book Review with the headline: Style and Substance.
