Friday, December 17, 2010

Google Books Ngram Viewer (Was: Fifty Years of Cataloging artifacts)

Obliquely related to all of this is Google's textual analysis tool, which searches the text of Google Books (announced in the NY Times: http://www.nytimes.com/2010/12/17/books/17words.html). The article shows some interesting searches, so I tried my own, searching for "catalog card" and "card catalog" in both US and UK spellings: http://tinyurl.com/32pn8vv

Apart from a strange anomaly in the mid-1700s, we see that once the Industrial Revolution makes card production possible in the mid-1800s, usage skyrockets until computerization in the 1970s. At that point usage of the terms plummets, along with card production, although the terms are still used a little today. When selecting the "different Englishes" (English Fiction, English One Million, whatever those mean), the trends still appear to be the same.

I am not really sure how this tool works. For example, I *believe* that typing multiple words automatically provides a phrase search, and I don't even want to bring up the problems of OCR. In any case, this could prove to be an interesting tool. How *useful* it will actually be remains to be seen.
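To make the "phrase search" guess concrete, here is a minimal sketch of how counting a multi-word phrase by year might work, assuming the tool counts sequences of consecutive words (n-grams), as its name suggests. This is not Google's code: the function names and the tiny corpus are made up for illustration, and the sketch ignores punctuation, OCR errors, and everything else a real system has to handle.

```python
from collections import Counter

def ngrams(text, n):
    """Lowercase the text, split on whitespace, and return all runs of n consecutive words."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def phrase_frequency_by_year(corpus, phrase):
    """Return {year: share of that year's n-grams that match the phrase}.

    corpus is a hypothetical structure: {year: [text, text, ...]}.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        # Relative frequency, so years with more text are comparable to years with less
        result[year] = counts[target] / total if total else 0.0
    return result

# Made-up toy corpus, for illustration only
corpus = {
    1950: ["the card catalog is large", "consult the card catalog first"],
    2000: ["the online catalog replaced the card catalog", "search the online catalog"],
}

freq = phrase_frequency_by_year(corpus, "card catalog")
for year, share in freq.items():
    print(year, round(share, 3))
```

Under this assumption, "card catalog" and "catalog card" are counted separately because word order matters, which would explain why the two searches produce different curves.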

The data comes from scanned books, and it would be interesting to compare usage of these words in "born digital" web pages as well.
