Two centuries, two authors, two platforms


Before we became overwhelmed with Olympic fever in the summer of 2012, I celebrated two bicentenaries.

The first and more obvious was the birth of Charles Dickens, which I hope prompted you to read some of his finely crafted novels; if you have yet to read one, I recommend Bleak House, with its many highly contemporary messages, and its 2006 BBC adaptation is also worth watching.

Dickens harboured some remarkable secrets, such as his long affair with actress Ellen Ternan, which, true to Victorian standards, was not fully made public until 1939.

Even less recognised was how much he was influenced by the near-verbatim transcripts of interviews with lower-class Londoners painstakingly collected and published by Henry Mayhew, who was born nine months after Dickens and is my second bicentenary.

In addition to co-founding Punch and playing a key part in the establishment of the Illustrated London News, Mayhew published hundreds of interviews with patterers, pickpockets, and prostitutes first in the Morning Chronicle, then in four collected volumes entitled London Labour and the London Poor.

With the complete works of Dickens freely available online, and limited access to Mayhew’s books at the Tufts Digital Library (such as here), I thought it a good opportunity to explore their language.

This required a concordancing tool, which compiles word frequency counts and performs sophisticated searches in context, features seldom found in mere word processors. From experience I know this to be of little interest to the bland commercial world of Microsoft Word, so I looked instead at apps intended for serious wordsmiths.
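To make "search in context" concrete, here is a minimal Python sketch of a keyword-in-context (KWIC) listing, the classic display of a concordance tool. It is an illustration only, not how AntConc or CasualConc work internally; the function name `kwic` and its `width` parameter are my own assumptions.

```python
import re

def kwic(text, keyword, width=30):
    """Return one line per whole-word match of keyword, with context either side.

    Hypothetical helper for illustration; 'width' is the number of
    characters of context kept on each side of the match.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Replace newlines so that each hit stays on a single line.
        left = left.replace("\n", " ")
        right = right.replace("\n", " ")
        # Right-justify the left context so the keywords line up in a column.
        lines.append(left.rjust(width) + match.group() + right)
    return lines

if __name__ == "__main__":
    sample = "It was the best of times, it was the worst of times."
    for line in kwic(sample, "times", width=15):
        print(line)
```

Because every line pads the left context to the same width, the matched words form a vertical column, which is what makes patterns in the neighbouring words easy to spot.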

My current first-choice word processor, Nisus Writer Pro, is known among other things for its search tool and its extensions, including word frequency analysis. For all-round authoring, Scrivener is uniquely equipped, whether you want to generate the names of characters or produce quick synopses, but it lacks the analytical features for such introspection. BBEdit, king of text editors, is aimed more at the hewer of code than the former of phrases, although I am sure that it could be extended to compile frequency lists.

Settling for Nisus, I ran my first word frequency count, only to be baffled as to why all the commonest words, such as “and” and “get”, were conspicuously absent. Thankfully this feature is implemented as a macro, and a quick glance at its code revealed that it was generously ignoring all the most common words. I carefully carved that out to create a custom macro that yielded some fascinating results.
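The effect of such a stop list is easy to reproduce outside Nisus. The following Python sketch is not the Nisus macro, whose code is not shown here; the `STOP_WORDS` set and the function name `word_frequencies` are assumptions made up for illustration. It counts words with and without a stop list, so you can see how the commonest words vanish from the filtered results.

```python
import re
from collections import Counter

# Hypothetical stop list for illustration only; the list used by the
# Nisus macro is not reproduced here.
STOP_WORDS = frozenset({"a", "and", "the", "of", "to", "in", "it", "was", "get"})

def word_frequencies(text, stop_words=frozenset()):
    """Count lower-cased words, skipping any that appear in stop_words."""
    # Words are runs of letters, optionally joined by straight or curly apostrophes.
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    return Counter(word for word in words if word not in stop_words)

if __name__ == "__main__":
    sample = "The cat and the dog and the bird."
    print(word_frequencies(sample).most_common(3))
    print(word_frequencies(sample, STOP_WORDS).most_common(3))
```

Passing an empty stop list, the default here, is the equivalent of carving the filter out of the macro: every word is counted, however common.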

Although Nisus has many of the best writing tools around, it does not quite match the search in context offered by a classical concordance tool.

AntConc is an excellent free concordance tool for text files.

For that I had to download AntConc, free from Laurence Anthony here, and CasualConc, free from Yasu Imao here.

CasualConc is another excellent free concordance tool for text files, with special support for East Asian languages.

Armed with these and some large chunks of interesting English, I have spent many happy hours watching progressive tenses flourish, and more.
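As a rough idea of the kind of search involved, this Python sketch looks for candidate progressive constructions: a form of "be" followed directly by a word ending in "-ing". It is a crude heuristic of my own, not a feature of either tool, and it will produce false positives such as "was nothing" and miss cases where an adverb intervenes. The names `PROGRESSIVE` and `progressive_candidates` are hypothetical.

```python
import re

# A form of "be", then whitespace, then a word ending in -ing.
# Crude heuristic only: it cannot tell "was walking" from "was nothing".
PROGRESSIVE = re.compile(
    r"\b(am|is|are|was|were|be|been|being)\s+([a-z]+ing)\b",
    re.IGNORECASE,
)

def progressive_candidates(text):
    """Return lower-cased 'be + -ing' pairs found in text, in order."""
    return [" ".join(match.groups()).lower() for match in PROGRESSIVE.finditer(text)]

if __name__ == "__main__":
    sample = "He was walking home while they were singing. It is a thing."
    print(progressive_candidates(sample))
```

A concordance tool improves on this by showing each hit in context, so that the false positives can be weeded out by eye.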

Not perhaps my first choice of platform for authoring or linguistic exploration, the iPad has gathered a fine collection of tools, most recently augmented by Agile Tortoise’s Terminology and Phraseology apps. Although the latter doesn’t quite offer regular concordance features, it parses works and offers word frequencies by part of speech, a novel tool for the writer (or linguistic analyst).

I am not yet sure how well it copes with Dickensian grammar, and it could struggle with some of the accounts shared between Mayhew and Dickens.

All this leads me to my word of the week: bechuxt, a seldom-encountered Hiberno-English variant of betwixt, which is also the basis of some excellent Googlewhacks.

Updated from the original, which was first published in MacUser volume 28 issue 04, 2012.