Google Books Ngram Viewer

When writing a novel set at the beginning of the 20th century, the language used in dialogue is an issue. A modern reader expects snazzy, easy-to-follow repartee, so an exact imitation of the way people talked at the time isn’t always called for. On the other hand, you don’t want to use anachronistic words either. Moreover, the characters are mainly British, and Americanisms hadn’t taken over the language yet. It’s a minefield. A good thesaurus can help; I use the Oxford Thesaurus of English. But to be honest, the tool I turned to most when in doubt was Google Books Ngram Viewer.

Now, what is an n-gram? I’m just copying the Wikipedia definition here: ‘In the fields of computational linguistics and probability, an n-gram is a contiguous sequence of n items from a given sample of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application.’
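To make that definition concrete for the word-level case, here is a minimal Python sketch. The function name word_ngrams and the sample sentence are made up for illustration, and splitting on whitespace is a simplifying assumption; it is not how Google actually tokenizes its books.

```python
def word_ngrams(text, n):
    """Return every contiguous run of n words in text, in order of appearance."""
    # Assumption: lowercase the text and split on whitespace.
    # Real tokenizers also handle punctuation, hyphens, and so on.
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sentence = "the informer told the police"  # invented sample sentence

print(word_ngrams(sentence, 1))  # 1-grams: the single words
print(word_ngrams(sentence, 2))  # 2-grams: every pair of neighbouring words
```

A 1-gram is simply a single word, so looking up one word in Ngram Viewer is a 1-gram search, while a phrase such as ‘the filth’ is a 2-gram.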

Google scanned millions of books and made them available in Google Books. It then used this gigantic database of words to create Google Books Ngram Viewer. It’s straightforward to use. You simply enter a word and a range of years, then select whether you want the engine to search all English texts, American or British English, or English Fiction. Hit the Enter key, and you immediately get a graph showing how frequently the word was used over the selected years. You can even look up several words at the same time: just separate them with commas, and the graph will plot a line for each.
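For the curious, the idea behind such a graph can be sketched in a few lines of Python. This is only a toy illustration, not Google’s implementation: the function name relative_frequency and the tiny two-year ‘corpus’ are invented for the example, and it assumes that frequency means the share of all words in a given year that match the search word.

```python
from collections import Counter


def relative_frequency(texts_by_year, word):
    """Map each year to the share of that year's words that equal `word`."""
    result = {}
    for year, texts in sorted(texts_by_year.items()):
        # Count every word in all texts for this year (whitespace split is a simplification).
        counts = Counter(w for text in texts for w in text.lower().split())
        total = sum(counts.values())
        result[year] = counts[word.lower()] / total if total else 0.0
    return result


# Invented sentences, purely for illustration; not real historical data.
toy_corpus = {
    1900: ["the informer spoke to the constable"],
    1950: ["the grass told the rozzer", "a grass is an informer"],
}

for term in ("informer", "grass"):
    print(term, relative_frequency(toy_corpus, term))
```

Working with shares rather than raw counts matters because far more books survive from some years than from others; a raw count would mostly reflect how much was printed.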

Here is an example that’ll make things clear. I typed in a few synonyms for informer. Try it for yourself. It’s fun.

Ngram Viewer is not a panacea, however. Words can be around for a long time before they acquire a specific meaning. In the next graph, you’ll find several British synonyms for policeman. Two of them, Bobby and Peeler, refer to Robert Peel, who created the Metropolitan Police Force in 1829 (the origin of rozzer is less certain). But, of course, the name Bobby was around long before that without signifying the police. The same issue applies to ‘the filth’, probably the crudest term ever to designate the police: ‘filth’ has been (and still is) used in senses that have nothing to do with the police. So Ngram Viewer is useless for finding out whether, for example, ‘the filth’ already meant ‘police’ in 1918. The same can be said about ‘pigs’, which I excluded from the search for obvious reasons.