Distant Reading

I'm playing with Voyant Tools today, which is "a web-based tool for reading and analyzing text." The software lets you do what is called distant reading: looking across a large body of texts (a corpus) to pull out themes.

For example, consider the U.S. Constitution in Voyant. 

US Constitution in Voyant Tools

Bet you didn't know that the number one word in the Constitution (other than common words such as "the") is "shall." Also, the word "president" is used 34 times and "congress" is used 29 times.
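For anyone curious what a count like this involves, here is a minimal sketch in Python. It is not how Voyant itself works under the hood; the function name `top_words`, the tiny stop-word list, and the sample text (three abbreviated snippets from the Constitution) are my own assumptions for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list (an assumption for this sketch);
# real tools ship with much longer lists.
STOP_WORDS = {"the", "of", "and", "to", "in", "a", "be", "or", "by", "for"}

def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, ignoring stop words."""
    # Lowercase first so "Shall" and "shall" are counted together.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

# Abbreviated snippets, just enough to show the idea.
sample = (
    "The Congress shall have Power to lay and collect Taxes. "
    "The President shall be Commander in Chief. "
    "No State shall enter into any Treaty."
)

print(top_words(sample, n=3))
```

Even on three sentences, "shall" floats to the top once the common words are filtered out.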

Now, actual linguists, as opposed to dilettantes like myself, point out that just counting words in a single text doesn't help explain meaning, and certainly not why those words matter or how word usage changes over time. Still, I'm going to introduce distant reading to students so they can see the possibilities. For example, and this would take a bit more work than cutting and pasting a URL as I did with the Constitution example, what if we put all of the constitutions written across the world in the period 1776-1825 into Voyant?

What analysis might we be able to do based on that body of texts? Themes can emerge from these data tools that only the most creative thinkers would be able to string together by reading the documents in a textbook. There's a chapter in the textbook I've used that asks students to compare constitutions from this period, and for what it is, the chapter works, with strong introductory material and nicely excerpted laws. Still, when class rolls around and the students have to compare and contrast particular issues, they spend as much time flipping pages as talking or analyzing. In short, it's hard to compare a, b, c, and d if you can only view each one separately. My hope is that Voyant and other distant reading tools will help my students better compare historical texts.
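To make the "compare a, b, c, and d side by side" idea concrete, here is a rough sketch of one way it could be done by hand in Python. The function name `term_rates` is hypothetical, and the two "documents" are made-up placeholder sentences, not real constitutional text. Counts are scaled per 1,000 words so that a long document and a short one can be compared fairly.

```python
import re
from collections import Counter

def term_rates(docs, terms):
    """For each document, return occurrences of each term per 1,000 words."""
    table = {}
    for name, text in docs.items():
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        total = len(words) or 1  # avoid dividing by zero on an empty text
        table[name] = {t: round(1000 * counts[t] / total, 1) for t in terms}
    return table

# Made-up placeholder sentences, NOT real constitutional text.
docs = {
    "Document A": "the people shall elect the congress and the congress shall make law",
    "Document B": "the king shall name the ministers and the people shall obey the law",
}

rates = term_rates(docs, ["shall", "people", "congress", "king"])
for name, row in rates.items():
    print(name, row)
```

With real texts loaded in place of the placeholders, each row would show at a glance which documents lean on which terms, which is the kind of view students can't get by flipping pages.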
