How many times did the word “God” appear in the title of books published in England from the year 1789 to 1914? And how about “science”? And “truth”? Others?
Well, Dan Cohen and Fred Gibbs grabbed Google’s book scan data and counted. So now we know that in 1840 there were 118 titles containing “God”, but in 1841 there were only 76. Were people saturated with “God” books from the year before and thus weary of the topic? Or was the drop a coincidence?
What a fun use of statistics!
According to their blog, Cohen is the Director of the Center for History and New Media and an Associate Professor in the History and Art History Department at George Mason University, and Gibbs is the Director of Digital Scholarship and an Assistant Professor at the same place.
Both are historians and are curious about Victorian-era thought. To augment—and certainly not replace—scholarship on Victorian literature, the pair decided to create a compilation of keywords in book titles, and then look for trends in these keywords.
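To make the counting step concrete, here is a minimal sketch of tallying titles by year for one keyword. It is not Cohen and Gibbs's code: the function name `keyword_counts_by_year`, the `(year, title)` record layout, and the sample titles are all invented for illustration, and the whole-word, case-insensitive match is an assumption about how a "keyword in a title" might be defined.

```python
import re
from collections import Counter

def keyword_counts_by_year(records, keyword):
    """Count, per year, the titles containing `keyword` as a whole word.

    `records` is an iterable of (year, title) pairs; this layout is a
    hypothetical one chosen for the sketch, not the real data format.
    """
    # \b keeps "God" from matching inside longer words such as "Godly".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    counts = Counter()
    for year, title in records:
        if pattern.search(title):
            counts[year] += 1
    return dict(counts)

# Invented sample titles, for illustration only.
sample = [
    (1840, "The Goodness of God"),
    (1840, "A Treatise on Natural Science"),
    (1840, "God and the Soul"),
    (1841, "Godly Living"),
    (1841, "The Word of God"),
]

god_counts = keyword_counts_by_year(sample, "God")
science_counts = keyword_counts_by_year(sample, "science")
```

Note that the matching rule matters: a plain substring search would also count "Godly Living", which is exactly the kind of choice that can quietly change a trend line.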
They are well aware of the caveats:
First, we are well aware that the meaning of words change over time, as does word choice. “Science,” for instance, starts the long nineteenth century as an expansive term not so far from “knowledge,” but ends the era with a more narrow focus on the natural sciences. “Evil” might be a theme of Victorian thought but not necessarily the term most frequently used by authors when they discuss the subject.
They know not to read too much into their results. For an example of how easily things can go awry, consider a New York Times profile describing similar work by Princeton professor Meredith Martin:
She recalled finding a sudden explosion of the words “syntax” and “prosody” in 1832, suggesting a spirited debate about poetic structure. But it turned out that Dr. Syntax and Prosody were the names of two racehorses.
Another scholar wisely says that “Fewer references to a subject do not necessarily mean that it has disappeared from the culture, but rather that it has become such a part of the fabric of life that it no longer arouses discussion.”
Perhaps the cleverest thing is what Cohen and Gibbs did not do: they did not attempt to overlay (perhaps straitjacket is a better word) any kind of formal statistical model on the data. As the old saying goes, they let the data speak for themselves.
Even better, the two gentlemen, in a Victorian spirit of open debate, have made their data freely available. I downloaded it and used it to create the pictures below. I plot only what they did not (or at least have not yet shown us), in order to make a small point about the possibilities for misinterpretation.
But go to their site and examine the many pictures they have for about two dozen keywords. A lot of curiosities there.
Meanwhile, here’s a plot that starts everything: the number of books published by year (they did not show this one).
Isn’t that slick? See those spikes at 1800, 1850, and 1900? These are accompanied by decreases in the years immediately after. Fatigue?
Even better is this next plot (this one not shown either):
This is the per capita number of books published. I used the population of England only (from this site), and used a simple interpolation to fill in the years for which no population figure was available. This is crude, but what a difference normalizing by population makes!
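For readers who want to try the normalization themselves, here is a rough sketch of the idea. The post does not say which method was used to fill the gaps, so straight-line interpolation between known years is an assumption here, and the population figures (in millions) and book counts are placeholders, not real data. The function names are hypothetical.

```python
def interpolate_population(known, year):
    """Estimate population for `year` by straight-line interpolation.

    `known` maps year -> population (in millions) for the years that
    actually have a figure. Linear interpolation is an assumption.
    """
    if year in known:
        return known[year]
    years = sorted(known)
    if year < years[0] or year > years[-1]:
        raise ValueError("year lies outside the range of known figures")
    # Find the pair of known years that brackets the requested year.
    for lo, hi in zip(years, years[1:]):
        if lo < year < hi:
            frac = (year - lo) / (hi - lo)
            return known[lo] + frac * (known[hi] - known[lo])

def per_capita(book_counts, known_population):
    """Books published per million people, year by year."""
    return {
        year: count / interpolate_population(known_population, year)
        for year, count in book_counts.items()
    }

# Placeholder numbers, invented for illustration only.
known_population = {1801: 8.0, 1811: 10.0}   # millions
books = {1801: 400, 1806: 450, 1811: 500}    # titles published

rates = per_capita(books, known_population)
```

In this toy example the raw counts climb from 400 to 500, yet the per capita rate is flat, which is the kind of difference the plot above is meant to expose.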
That spike ending at 1800 now looks suspicious. First guess suggests something is screwy in Google’s data. It seems less likely to me that there was a major shift in the publishing industry (but I make this statement based on limited knowledge, i.e. ignorance). I also recall reading elsewhere that the scanning of a book’s metadata is rife with error.
This next picture is the per capita number of books with “God” in their title, followed by the raw counts.
An inexorable decline until about 1890, then a steady but small flow of books. All of these pictures so far are to be contrasted against this last one, which Cohen and Gibbs do show: the percent of all books with “God” in their title (the per capita normalization appears in the numerator and denominator and thus disappears).
The reason for making and comparing these extra plots is now obvious. It appears the decrease in book titles having “God” is because of a lack of interest in publishing them, and not, say, from just an increase in the number of books of all topics published. We can say this because the per capita “God” plot closely matches the percent “God” plot.
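The comparison can be sketched in a few lines. All numbers below are invented for illustration (population in millions), and the variable names are hypothetical. The sketch shows two things: the population term cancels when one per capita series is divided by another, and a falling share can be checked against the per capita keyword series to see whether the keyword itself is declining or is merely being swamped by more books overall.

```python
def percent_share(keyword_counts, total_counts):
    """Percent of all titles in each year that contain the keyword."""
    return {
        year: 100.0 * keyword_counts[year] / total_counts[year]
        for year in keyword_counts
    }

# Invented figures, for illustration only.
population = {1840: 10.0, 1890: 20.0}   # millions
total = {1840: 3000, 1890: 4500}        # all titles
god = {1840: 90, 1890: 45}              # titles containing the keyword

share = percent_share(god, total)

# Per capita versions of both series.
god_pc = {y: god[y] / population[y] for y in god}
total_pc = {y: total[y] / population[y] for y in total}

# Dividing one per capita series by the other removes the population term:
# (god / pop) / (total / pop) == god / total.
share_from_pc = {y: 100.0 * god_pc[y] / total_pc[y] for y in god}

# If the share falls AND the per capita keyword count falls, the decline is
# in the keyword itself, not just an artifact of more books being published.
share_fell = share[1890] < share[1840]
per_capita_fell = god_pc[1890] < god_pc[1840]
```

Had the per capita keyword series stayed level while the share dropped, the explanation would instead be growth in books on every other topic.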
Well, this is just a start. If you have any interest in Victorian-era thought, I encourage you to keep up with Cohen and Gibbs’s site for updates.
Update: Population is in the millions.