Big data and Downton Abbey
The hit British TV series Downton Abbey has drawn some fire for alleged anachronisms: phrases that just don’t fit Georgian-era Britain.
Ben Schmidt has unleashed gigabytes of data on this problem, using the Google Books n-grams. When Google digitized lots of books, it also tabulated the frequencies of words, pairs of words, triples of words, and so on, by year of publication. In two posts, Ben compares word pairs from the TV script with the Google frequencies for books published in the 1910s and the 1990s. The comparison turns up several two-word phrases that appear in the script even though they were much less common in Downton Abbey’s historical period than they are now. In some cases these phrases were not observed at all in written English until much later; in other cases they existed but were rare.
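To make the comparison concrete, here is a minimal sketch of the idea in Python. It is not Ben's actual code: the function names, the `ratio` threshold, and the shape of the frequency tables (a dictionary mapping each word pair to its occurrences per million pairs) are all assumptions made for illustration. Real frequencies would have to come from the Google Books n-gram tables.

```python
import re


def bigrams(text):
    """Return the lowercased two-word phrases in `text`, in order.

    Tokenization is deliberately crude: runs of ASCII letters and
    straight apostrophes count as words; everything else separates them.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


def flag_anachronisms(script, then_freq, now_freq, ratio=10.0):
    """Flag script bigrams that are far rarer 'then' than 'now'.

    `then_freq` and `now_freq` are hypothetical inputs: dictionaries
    mapping a bigram to its frequency per million bigrams in the earlier
    and later period. `ratio` is an arbitrary illustrative threshold.
    Returns (bigram, now/then) pairs, most suspicious first.
    """
    flagged = {}
    for pair in set(bigrams(script)):
        then = then_freq.get(pair, 0.0)
        now = now_freq.get(pair, 0.0)
        if now == 0.0:
            continue  # no modern evidence, so nothing to compare against
        if then == 0.0:
            flagged[pair] = float("inf")  # never seen in the earlier period
        elif now / then >= ratio:
            flagged[pair] = now / then
    return sorted(flagged.items(), key=lambda item: item[1], reverse=True)
```

Calling `flag_anachronisms(script_text, freq_1910s, freq_1990s)` with two such tables returns the phrases that were unseen in the earlier period first, followed by those that existed but were rare. These are the same two cases described above.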
As a check on the process, he also looks at a genuine play from the period, George Bernard Shaw’s Heartbreak House, which passes the phrase test with flying colors.
Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.
I found his list of phrases very interesting. Clearly “pansystolic” is very anachronistic, but the phrase “unicorn if” seemed very odd to me. The Ngram site at http://books.google.com/ngrams/graph?content=unicorn+if&year_start=1900&year_end=2000&corpus=0&smoothing=3
shows the phrase suddenly dropping out of favour from about 1912 until 1925, but
http://books.google.com/ngrams/graph?content=Unicorn+if&year_start=1900&year_end=2000&corpus=0&smoothing=3 shows the capitalized phrase “Unicorn if” as much more common in this period.
13 years ago
Interesting social commentary?
http://books.google.com/ngrams/graph?content=chip+on+his+shoulder%2C+chip+on+her+shoulder&year_start=1900&year_end=2008&corpus=0&smoothing=3
This search was inspired by the following link suggesting that “chip on his shoulder” was not seen in print in England till the 1930s: http://www.phrases.org.uk/meanings/chip-on-your-shoulder.html
Evidently it was a common phrase in US print well before then.
13 years ago