
MIT claims to have found a “language universal” that ties all languages together

A language universal would lend support to Chomsky's controversial theories.


Language takes an astonishing variety of forms across the world—to such a huge extent that a long-standing debate rages around the question of whether all languages have even a single property in common. Well, there’s a new candidate for the elusive title of “language universal” according to a paper in this week’s issue of PNAS. All languages, the authors say, self-organise in such a way that related concepts stay as close together as possible within a sentence, making it easier to piece together the overall meaning.

Language universals are a big deal because they shed light on heavy questions about human cognition. The most famous proponent of the idea of language universals is Noam Chomsky, who suggested a “universal grammar” that underlies all languages. Finding a property that occurs in every single language would suggest that some element of language is genetically predetermined and perhaps that there is specific brain architecture dedicated to language.

However, other researchers argue that there are vanishingly few candidates for a true language universal. They say that there is enormous diversity at every possible level of linguistic structure from the sentence right down to the individual sounds we make with our mouths (that’s without including sign languages).

There are widespread tendencies across languages, they concede, but they argue that these patterns are just a signal that languages find common solutions to common problems. Without finding a true universal, it’s difficult to make the case that language is a specific cognitive package rather than a more general result of the remarkable capabilities of the human brain.

Self-organising systems

A lot has been written about a tendency in languages to place words with a close syntactic relationship as close together as possible. Richard Futrell, Kyle Mahowald, and Edward Gibson at MIT were interested in whether all languages might use this as a technique to make sentences easier to understand.

The idea is that when sentences bundle related concepts in proximity, it puts less of a strain on working memory. For example, adjectives (like “old”) belong with the nouns that they modify (like “lady”), so it’s easier to understand the whole concept of “old lady” if the words appear close together in a sentence.

You can see this effect by deciding which of these two sentences is easier to understand: “John threw out the old trash sitting in the kitchen,” or “John threw the old trash sitting in the kitchen out.” To many English speakers, the second sentence will sound strange—we’re inclined to keep the words “threw” and “out” as close together as we can. This process of limiting distance between related words is called dependency length minimisation, or DLM.

Do languages develop grammars that force speakers to neatly package concepts together, making sentences easier to follow? Or, when we look at a variety of languages, do we find that not all of them follow the same pattern?

The researchers wanted to look at language as it’s actually used rather than make up sentences themselves, so they gathered databases of language examples from 37 different languages. Each sentence in the database was given a score based on the distances between its related words: those sentences where conceptually related words were far apart had high scores (little DLM), and those where related words sat snugly together had low scores (strong DLM).
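As a rough illustration of this kind of score, the sketch below sums the distances between each word and the word it depends on for the two "trash" sentences above. The dependency annotations (the `heads` lists) are hand-made assumptions for illustration and are not taken from the paper's databases; the function name `total_dependency_length` is likewise hypothetical.

```python
def total_dependency_length(heads):
    """Sum of distances between each word and its head.

    heads[i] is the 0-based position of the word that word i depends on,
    or None if word i is the root of the sentence.
    """
    return sum(abs(i - h) for i, h in enumerate(heads) if h is not None)


# "John threw out the old trash sitting in the kitchen"
#   0     1    2   3   4    5      6     7   8     9
# Hand-annotated (assumed) heads: e.g. "out" depends on "threw" (1).
heads_a = [1, None, 1, 5, 5, 1, 5, 6, 9, 7]

# "John threw the old trash sitting in the kitchen out"
#   0     1    2   3    4      5     6   7     8     9
# Same relations, but "out" now sits eight words away from "threw".
heads_b = [1, None, 4, 4, 1, 4, 5, 8, 6, 1]

score_a = total_dependency_length(heads_a)
score_b = total_dependency_length(heads_b)
print(score_a, score_b)  # 14 20
```

Under these assumed annotations, the sentence that strands "out" at the end gets the higher score, which matches the intuition that it is harder to follow.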

Then, the researchers compared these scores to a baseline. They took the words in each sentence and scrambled them so that related words had random distances between them. If DLM wasn’t playing a role in developing grammars, they argued, we should be seeing random patterns like these in language: related words should be able to have any amount of distance between them. If DLM is important, then the scores of real sentences should be significantly lower than the random sentences.
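The comparison can be sketched in a few lines. This is a simplified illustration of the scrambling idea as described here, not the researchers' actual procedure, which may constrain the random orders in ways this sketch does not. The `heads` annotation is a hand-made assumption for the sentence "John threw out the old trash sitting in the kitchen", and the function names are hypothetical.

```python
import random


def dependency_length(heads, positions):
    """Sum of distances between each word and its head, given word positions."""
    return sum(
        abs(positions[i] - positions[h])
        for i, h in enumerate(heads)
        if h is not None
    )


def random_baseline(heads, trials=1000, seed=0):
    """Average score when the words are placed in random order.

    The dependency relations stay the same; only positions are shuffled.
    """
    rng = random.Random(seed)
    n = len(heads)
    total = 0
    for _ in range(trials):
        positions = list(range(n))
        rng.shuffle(positions)
        total += dependency_length(heads, positions)
    return total / trials


# Assumed annotation: heads[i] is the position of the word that word i
# depends on; None marks the root ("threw").
heads = [1, None, 1, 5, 5, 1, 5, 6, 9, 7]

observed = dependency_length(heads, list(range(len(heads))))
baseline = random_baseline(heads)
print(observed, round(baseline, 1))
```

If DLM is at work, `observed` should come out well below `baseline`, which is the pattern the researchers looked for across their language samples.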

They found what they expected: “All languages have average dependency lengths shorter than the random baseline,” they write. This was especially true for longer sentences, which makes sense—there isn’t as much difference between “John threw out the trash,” and “John threw the trash out” as there is between the longer examples given above.

They also found that some languages display DLM more than others. Those languages that don’t rely just on word order to communicate the relationships between words tended to have higher scores, meaning their related words sat further apart. Languages like German and Japanese have markings on nouns that convey the role each noun plays within the sentence, allowing them to have freer word order than English. The researchers suggest that the markings in these languages contribute to memory and understanding, making DLM slightly less important. However, even these languages had scores lower than the random baseline.

The family tree

This research adds an important piece of the puzzle to the overall picture, says Jennifer Culbertson, who researches evolutionary linguistics at the University of Edinburgh. It’s “an important source of evidence for a long-standing hypothesis about how word order is determined across the world’s languages,” she told Ars Technica.

Although the paper only looked at 37 languages, it’s actually incredibly difficult to build these databases of language in use, which makes it a reasonably impressive sample, she said. There is a problem here, though: many of the languages studied are related to one another, representing only a few of the huge number of language families, so we’d expect them to behave in similar ways. More research is going to be needed to control for language relatedness.

This paper joins a lot of previous work on the topic, so it’s not the lone evidence of DLM—it’s corroborating, and adding to, a fair bit of past research. It’s “a lot of good converging evidence,” she said.

“There are many proposed universal properties of language, but basically all of them are controversial,” she explained. But it’s plausible, she added, that DLM—or something like it—could be a promising candidate for a universal cognitive mechanism that affects how languages are structured.

For a debate as sticky as the one about language universals, there could be multiple ways of interpreting this evidence. Proponents of Chomsky's school might argue that it's evidence for a dedicated language module, but those who favour a different interpretation could suggest that working memory affects all brain functions, not just language.


PNAS, 2015. DOI: 10.1073/pnas.1502134112

