Distant Reading Assignment

An introduction to using computers to analyze large amounts of text, or distant reading.

Learning objectives:

Students will be able to:

  1. Enter URLs or text into Voyant.

  2. Demonstrate an understanding of the Voyant tool through counting words and drawing conclusions from those counts.

  3. Demonstrate in their paragraphs the use of stopwords.

  4. Demonstrate that they can draw conclusions based on their use of Voyant, including word counts and word patterns.

1. Distant vs. close reading. A lesson using historical philosophical, religious, and legal texts.

Most reading we do is “close reading.” We read each word, place each word in the sentence or context, and then create meaning out of the words all strung together. For example, “Today, I ate cake.” You must read those words in context and in sequence to understand them.

Sometimes, we read in ways that aren’t so “close.” For example, if you go to a weather website, and look up the forecast, you don’t read all the words in their context. You scan for the information you need, and ignore the rest. This is the first step to distant reading: recognizing that not all information included in a text is relevant and looking only for the material (or data) that is important.

For part of the Computers and Words module, we’re going to use distant reading websites to analyze large amounts of text.

For example, below I’ve placed a URL, https://github.com/jackhistorynorton/history_1101/blob/master/readings/Mahabharata_Gutenberg.txt, which holds the entire text of the Mahabharata, one of the two great Indian epics (along with the Ramayana), into Voyant.

This tool counts words and looks for patterns. It is almost impossible for a human to count every word across several long books, but computers can do it for us. This is what “distant reading” means: humans are far away from the texts and computers hold and manipulate the texts.
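You do not need to write any code to use Voyant. If you are curious what “counting words” looks like to a computer, though, here is a minimal sketch in Python. It is only an illustration, not how Voyant itself is built: the sample sentence is made up, and the rule for what counts as a word (runs of the letters a to z and apostrophes) is a simplifying assumption.

```python
import re
from collections import Counter

def count_words(text):
    """Lowercase the text, pull out the words, and count each one."""
    # Assumption: a "word" is any run of the letters a-z or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# A made-up sentence, used only for illustration.
sample = "The king said to the sage, the sage said nothing."
counts = count_words(sample)

# Show the three most frequent words with their counts.
print(counts.most_common(3))
```

The computer does for an entire epic exactly what it does for this one sentence. It just does it much faster than we can.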

You can access Voyant at https://voyant-tools.org/ or through one of the mirror sites listed at https://voyant-tools.org/docs/#!/guide/mirrors. If these sites are not responding, you can also download and install a local version of Voyant on your own computer.

2. The entire Mahabharata, by its words and numbers.

I clicked “Reveal,” and Voyant analyzed the entire text and counted every word, generating what we call a word cloud. A word cloud shows the words used most often in a text. The more often a word or symbol appears, the bigger it is drawn. In the word cloud below, “said,” “great,” “continued,” and “like” are the most frequent words, which tells us little about what the text is actually about. So, we need to tell Voyant to leave out those common words. We call common words we don’t want “stopwords.”

3. Editing stopwords

I clicked on the switch icon below the word cloud, which gave me the option to add stopwords.

4. Select Edit List and make sure “Apply Globally” is checked

You can also manually add words for Voyant to ignore by clicking on “Edit List.”

5. Add any words to the list that you don’t want.

Here I’ve added “said,” “unto,” “hath,” “continued,” and several other common words, pressing the return key after each word, then clicking “Save” and then “Confirm.”
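For the curious, the idea behind a stopword list can be sketched in a few lines of Python. This is only an illustration of the concept, not Voyant’s actual code, and the word counts below are invented for the example rather than taken from the Mahabharata.

```python
from collections import Counter

# Hypothetical word counts, invented for illustration (not real Voyant output).
counts = Counter({"said": 50, "great": 40, "king": 30, "unto": 25, "sons": 20})

# The stopword list: common words we want the count to ignore.
stopwords = {"said", "unto", "hath", "continued"}

def remove_stopwords(counts, stopwords):
    """Return a new Counter that leaves out every word on the stopword list."""
    return Counter({word: n for word, n in counts.items() if word not in stopwords})

filtered = remove_stopwords(counts, stopwords)
print(filtered.most_common(3))  # [('great', 40), ('king', 30), ('sons', 20)]
```

Notice that the stopwords are not deleted from the text itself. They are simply skipped when the results are displayed, which is why different words rise to the top.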

6. New word cloud revealed.

New words emerge as the most frequent once the stopwords are removed.

Clicking on any slider icon in the upper right gives you more options.

7. Clicking on icons gives you access to more options.

Clicking on the slider icon reveals more tools.

Clicking on the arrow-in-a-box icon reveals more of that section of Voyant.

8. What matters more: sons, daughters, wives, or husbands?

I added the words daughter, son, husband, and wife to the trends tool on the right (1).

Voyant split the Mahabharata into 10 segments and shows the number of times each word is used in each segment of the ancient Indian epic.

Looking at this graph, I might be tempted to argue that this text is most concerned with sons. Still, I would need to know more before I offered that argument. Voyant just counts words and shows those counts. It doesn’t offer explanations for anything.
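If you want to see the logic behind a graph like this, here is a rough Python sketch of counting one word segment by segment. It assumes the text is cut into segments with roughly equal numbers of words, which may not match exactly how Voyant divides a text. The function name and the tiny sample “text” are made up for illustration.

```python
import re

def segment_counts(text, word, segments=10):
    """Split the text into equal-sized segments and count `word` in each one."""
    # Assumption: a "word" is any run of the letters a-z or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = []
    for i in range(segments):
        # Integer arithmetic keeps the segments as even as possible.
        start = i * total // segments
        end = (i + 1) * total // segments
        counts.append(words[start:end].count(word))
    return counts

# A made-up, eight-word "text" split into two segments of four words each.
sample = "son son wife son daughter husband son wife"
print(segment_counts(sample, "son", segments=2))   # [3, 1]
print(segment_counts(sample, "wife", segments=2))  # [1, 1]
```

Like Voyant, this sketch only counts. It tells us that “son” appears more often in the first half of the sample, but it cannot tell us why.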

9. How can distant reading help us understand the past?

As humans we can only read a limited number of words at a time. Computers in the form of software, however, can “read” huge numbers of words, entire libraries in fact, fairly easily. More importantly, software can count, compare, and display patterns in ways we can’t.

Computers also help by revealing patterns that are contrary to our assumptions, which is especially important when studying complicated texts, like philosophy, religion, and law.


10. Assignment

Copy and paste the URLs below for your class into Voyant Tools.

Be sure to choose the appropriate URLs for your course.

World History 1

Ibn Rushd (Averroës), 1126–1198 CE, Religion & Philosophy, c. 1190 CE

Moses ben Maimon (Maimonides), 1138–1204 CE, The Guide of the Perplexed, 1185–1190 CE

World History 2

1858 Government of India Act

French Second Republic Constitution of 1848

Minnesota History

Minnesota 1858 Constitution

In complete sentences, answer the following questions:

a. What are the most important words in the text(s) you are analyzing?
b. What three historical questions do you have about the most important words?

Pause: Do a brief "add Wikipedia" search for the authors or the texts you are reading.

c. What new information did you learn from Wikipedia that helps you understand your word counts?
d. If you had to offer a hypothesis on the two most important ideas in your document(s), what would you argue? Put another way, what major ideas appear in your documents, based solely on the word counts?

11. Grading Criteria

The student:

  1. Entered the relevant URLs or text into Voyant.

  2. Demonstrated an understanding of the Voyant tool through counting words and drawing conclusions from those counts.

  3. Demonstrated lateral reading by using the SIFT "add Wikipedia" practice.

  4. Demonstrated that they can draw conclusions based on their use of Voyant, including word counts and word patterns.

  5. Demonstrated standard college writing expectations, including punctuation, capitalization, proper quotation of words not their own, and citations.