Tell Mac speak human

When first launched, AppleScript had one truly wonderful feature: recordability.

Being an active commercial developer at the time, I was enthralled to discover that I could step through a series of actions in Apple’s specially re-engineered demo text editor, and see them transcribed into code. Sadly, like so many radically new ideas, recordability gradually faded into the sunset of Mac OS Classic, and has all but disappeared from OS X.

This came back to me as I struggled to work out how best to tackle a problem. I had downloaded a couple of old but worthwhile Scottish Gaelic to English dictionaries, tidied up their HTML tagging, and now wanted to cast them into more useful formats.

Among the strongest candidate formats was the one designed for another unsung and free hero, the Dictionary application. This expects source content to be presented in rigorous XML conforming to a RELAX NG schema, quite different from the source that I had to hand. How helpful it would have been to have a text tool that would record what I did to a single dictionary entry as a script, then let me repeat that on each of the many thousands of others.

Several snags stopped this naive dream dead.

First, there are remarkably few programming languages – even those designed for sophisticated manipulation of text – that recognise words and markup tags usefully. There is no shortage of languages that specialise in text: AWK, Icon, OmniMark, Perl, Python, REXX, Ruby, SNOBOL, and Tcl to name but a few. SNOBOL and its successor Icon were the life work of one of the computer science greats, Ralph E Griswold; AWK and Perl are darlings of Unix wizards; Python, REXX, Ruby, and Tcl have all been vogue scripting languages in their day.

To those I should add XSLT, spawned from the SGML-XML behemoth, but for the life of me I cannot think of a more cryptic and esoteric collection of tools. Any one of them would have solved my problem with great elegance, but only after protracted mental gymnastics, and that was the second snag.
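To give a flavour of those gymnastics, here is a minimal Python sketch of converting a single entry. It rests on several assumptions that are not from the dictionaries themselves: that each source entry is one HTML paragraph with its headword in bold, that the helper name entry_to_xml and the id scheme are purely illustrative, and that the target uses the d:entry, d:index and d:title names recalled from Apple's Dictionary Development Kit, which should be checked against the kit's own schema before relying on them.

```python
import re
from html import unescape
from xml.sax.saxutils import escape, quoteattr

# Hypothetical source layout: <p><b>headword</b> definition text</p>
ENTRY_RE = re.compile(
    r"<p>\s*<b>(?P<head>.*?)</b>\s*(?P<body>.*?)\s*</p>", re.DOTALL
)
TAG_RE = re.compile(r"<[^>]+>")


def plain_text(fragment):
    """Strip any remaining tags and decode HTML entities."""
    return unescape(TAG_RE.sub("", fragment)).strip()


def entry_to_xml(html_entry, entry_id):
    """Return one dictionary entry as XML, or None if the layout is not recognised."""
    match = ENTRY_RE.search(html_entry)
    if match is None:
        return None
    head = plain_text(match.group("head"))
    body = plain_text(match.group("body"))
    # Element and attribute names below are assumed, not verified against the schema.
    return (
        f"<d:entry id={quoteattr(entry_id)} d:title={quoteattr(head)}>\n"
        f"  <d:index d:value={quoteattr(head)}/>\n"
        f"  <h1>{escape(head)}</h1>\n"
        f"  <p>{escape(body)}</p>\n"
        f"</d:entry>"
    )


if __name__ == "__main__":
    print(entry_to_xml("<p><b>cù</b> <i>n.</i> a dog</p>", "entry_1"))
```

Even this toy version discards the italics around the part of speech and would stumble on any entry that departs from the assumed layout, which is exactly where hand correction would come in.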

The final and most damning issue was that I wanted to transform my dictionary source content interactively, as I expected some entries to need a bit of fine tuning afterwards.

I did not fancy trudging through all the XML output, which was far more verbose than the current source. And that was before I had got to my wishlist of interactively merging multiple dictionaries to produce a composite. TextSoap and BBEdit are great in their way, but we who forge words into documents lack anything as powerful for text as Illustrator and Photoshop are for graphics.

Much of this comes back to the fact that text is still seen by computers as a string of (albeit Unicode) characters. Automatic parsing of text into structure remains the subject of research, and tools to untangle tags are about as far as we get. What hope have we when industry-standard products purporting to be ‘word’ processors still cannot position punctuation correctly when we rejig our writing?

The celebrations of the centenary of Alan Turing’s birth, held in 2012 and still reverberating in the likes of the movie The Imitation Game, sometimes invoked the Turing Test, a Holy Grail of artificial intelligence.

Heavyweight contenders such as IBM’s Watson system might be able to handle a question-and-answer session as well as Watson competed in Jeopardy, but computer languages are still far from those used between humans. As computer languages go, AppleScript is radical in being quite ‘natural’ in syntax. A very recent attempt in TypeCoder comes closer still, but for the time being at least is far less expressive.

If you want to get a feel for the gulf that remains between humans and computers, try chatting to a friend in fluent AppleScript, or TypeCoder.

Updated from the original, which was first published in MacUser volume 27 issue 17, 2011.