Inside Dictionaries and Keyboards

Macs have always provided good tools to support writing in different languages, and Yosemite has continued to advance them. But how can you tailor and extend them?

Whether you are writing a terse email or a trilogy of blockbuster novels, you need to get spelling correct, and to use a keyboard layout that is appropriate to the language that you are using.

Spelling is important not only to give the right impression, but also to ensure that your meaning is conveyed correctly. Although combined spelling and grammar checkers do not pick up all your mistakes, and some will always slip through even the most assiduous proof-reading, they can act as a good initial screen.

If you are going to use your Mac’s built-in spelling checker, it is essential that you run it with an appropriate dictionary. If you have ever unintentionally checked a British English document with a US English dictionary, or the other way around, you will know how infuriating it is when completely different spelling rules are applied.

Check a document in a different language or, worse still, text that mixes two or more languages when your application is using a single dictionary, and you will be familiar with the innumerable words that are underlined as being incorrect. This renders the feature useless, and ensures that many errors will slip out unnoticed.

Open Spell

Since OS X 10.6, Apple has settled on the open source Hunspell spell checker, which is related to Ispell and MySpell, and you may come across it being referred to (internally within OS X, at least) as Open Spell.

Hunspell, originally developed to check Hungarian with its rich list of accented characters, has become a popular choice for open source products, and is now the standard system in LibreOffice, OpenOffice, Mozilla’s Firefox and relatives, Google Chrome, Opera, and Adobe CS and CC, but not Microsoft Office.

It is descended from the original and primitive Unix spell-checker named simply ‘spell’, which gave rise to first the improved Ispell, then the even better MySpell, which was used by older versions of some major open source products.

Hunspell has the potential to be both powerful and reliable at detecting errors, but is completely reliant on applying rules supplied in the dictionary that is currently in use. Commercially-sourced dictionaries, such as those supplied as part of Mac OS X, not only contain very large wordlists, but sophisticated rules that are applied to many words to determine how they can be modified: for instance, different types of plurals from rule/rules to study/studies, and the conjugation of verbs.

When I carried out detailed tests a few years ago, the Hunspell-based checker in Snow Leopard found more genuine errors than any competitor product, 67% of those embedded in our test document, and did not report as many false positives as several others. Its British English dictionary appears sound, although those who prefer to use z instead of s in words like specialise will find its improved versions since Lion more accommodating.

OS X 10.7 also brought a marked increase in the number of localised dictionaries available, now in Yosemite including five different flavours of English (but still not Scottish/Scots English), all the major European languages, and most significant world tongues too.

However, significant regional languages, such as Irish, Welsh, and Scottish Gaelic, are not yet included in the standard Yosemite distribution. If you wish to add specialist dictionaries to support such languages, details are given below.

Alternatives

There are remarkably few languages that remain entirely unsupported by spell checkers. If you are unable to locate Hunspell/Ispell format dictionaries, you are likely to find them for GNU Aspell, a rather older alternative to Hunspell that uses a different dictionary format. Its free Mac port, cocoAspell, will allow you to choose from the vast range of dictionaries available by FTP from GNU here.

If you try installing cocoAspell on a fresh copy of Yosemite, you may find that it fails because of an installer issue. You can fix this manually, either by copying folders from a previous installation or setting them up yourself. You need a folder named aspell-0.60 in /usr/local/lib, inside which are several libraries with names starting with libaspell or libpspell, which are owned by root. You will also need to put the file named cocoAspell in /Library/Application Support.

Another free alternative is Excalibur, which is particularly suited to checking marked up text such as TeX and LaTeX.

Dictionaries for reference

The Mac OS X spelling checker does not, of course, tell you what words mean, nor provide any thesaurus facility. For those purposes, Apple provides the under-used Dictionary application, a standard item in the top-level Applications folder.

The Dictionary app offers an excellent range of reference dictionaries: this is but a small sample.

If you have not explored this thoroughly yet, it is worth using, and now gives access to the Oxford Dictionary of English, a comprehensive and detailed reference for those on the eastern shores of the Atlantic. The bundled version includes phonetic pronunciation, example usage, and etymology, and you can also enable a companion British English thesaurus.

In successive releases of OS X, Apple has steadily increased the number of languages supported with dictionaries, and these now include English, French, German, Russian, Italian, Spanish, Portuguese, Dutch, Turkish, and several Asian languages. You will not yet find any for Celtic tongues, but several specialists offer a very wide range of single-language and translating dictionaries.

Unlike spelling dictionaries, this dictionary format is derived from XML source files, so developing your own is more laborious than technically challenging, given Apple’s developer information here and Xcode’s tools.

If you can locate StarDict format dictionaries, the Mac Dictionary Kit here may be able to convert them, as detailed here.

Add further reference dictionaries using the Dictionary app’s Preferences dialog.

Reference dictionaries for use by the Dictionary app should be installed in /Library/Dictionaries.

A custom Georgian-English dictionary accessed using the Dictionary app.

Keyboards

Armed with localised spell-checking and perhaps a full dictionary or two, the final essential for working in many languages is a proper localised keyboard layout. The Character Viewer allows you to enter any Unicode character, invaluable for the occasional incursion into Cyrillic, for instance, but is not the way to compose a tract in Arabic or Urdu.

Once you need regular access to accented or non-Roman characters, you will find it preferable to switch to an appropriate localised keyboard. In Yosemite, the choice has increased further over the already rich selection in previous releases of OS X, and there are third-party keyboard layouts available for others.

The Input Sources tab in Keyboard sets up those keyboard layouts to be accessed through the menu.

When you press a key, Mac OS X receives a message containing the virtual key code, such as 0 for the letter ‘a’ and 14 for ‘e’. This is looked up, with the state of the modifier keys (Shift, Caps Lock, Control, Alt/Option, and Command), in the active software keyboard layout, and a Unicode character (or string of characters) representing the assigned key meaning is then returned to the active application.

For instance, you could design a software keyboard layout in which pressing the left Shift modifier and ‘e’ keys returned the Unicode character 0119 ‘Latin small letter e with ogonek’ or ę (if you have a font supporting its display), or even the string 0065 006E 0064 ‘end’.
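That lookup can be pictured as a simple table, sketched here in Python. This is only an illustration of the idea, not how OS X implements it: the table, the SHIFT constant and the translate_key function are all hypothetical, and only the key codes 0 and 14 and the two example outputs come from the text above.

```python
# Hypothetical layout table: (virtual key code, modifiers) -> output string.
# Key codes 0 ('a') and 14 ('e') are taken from the text; the rest is
# purely illustrative and not an OS X data structure.
SHIFT = "shift"

LAYOUT = {
    (0, frozenset()): "a",
    (14, frozenset()): "e",
    (14, frozenset({SHIFT})): "\u0119",  # Latin small letter e with ogonek
}

# A variant layout in which Shift and 'e' return a whole string instead.
STRING_LAYOUT = dict(LAYOUT)
STRING_LAYOUT[(14, frozenset({SHIFT}))] = "\u0065\u006e\u0064"  # 'end'


def translate_key(layout, key_code, modifiers=()):
    """Return the text assigned to a key press, or "" if none is defined."""
    return layout.get((key_code, frozenset(modifiers)), "")


# ascii() keeps the output readable even if the terminal lacks the glyph.
print(ascii(translate_key(LAYOUT, 14)))
print(ascii(translate_key(LAYOUT, 14, [SHIFT])))
print(ascii(translate_key(STRING_LAYOUT, 14, [SHIFT])))
```

Keying the table on both the key code and the set of modifiers mirrors the description above: the same physical key can return quite different text depending on which modifiers are held.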

Further information about keyboards and how keyboard input works is in this article.

If you find yourself working mostly in a non-English language, you can buy localised keyboards to match the software layouts that you use. Although this expense may appear unnecessary, it is far preferable to amateur efforts to modify a keyboard, which can easily force you to replace it prematurely. You should then have a Mac that helps you create good, correctly-spelled documents in whatever language you choose, from Sanskrit to Klingon.

Technique: Creating and Installing Spelling Dictionaries

Creating your own spelling dictionary for Snow Leopard or Lion is not particularly technically challenging, once you understand the standard dictionary format. Each consists of two files, named according to the standard abbreviations for languages. The main word list is placed in a text file with the extension .dic, and definitions to refine and implement word mutations are given in a text file with the extension .aff.
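To give a feel for how the two files work together, here is a minimal Python sketch that expands a tiny word list using suffix rules. The file contents are made-up fragments in the style of Hunspell's SFX rules (flag, characters to strip, characters to add, and a condition on the end of the word), and the parser handles only that one directive. The function names are hypothetical, and real dictionaries use many more features than this.

```python
import re

# Simplified, made-up fragments of a .aff and a .dic file.
# Real Hunspell files support many more directives than this sketch handles.
AFF_TEXT = """\
SFX S Y 2
SFX S y ies [^aeiou]y
SFX S 0 s [^sxzhy]
"""

DIC_TEXT = """\
2
rule/S
study/S
"""


def parse_suffix_rules(aff_text):
    """Return {flag: [(strip, add, condition), ...]} for SFX rule lines."""
    rules = {}
    for line in aff_text.splitlines():
        parts = line.split()
        # Rule lines have five fields; four-field SFX lines are headers.
        if len(parts) == 5 and parts[0] == "SFX":
            _, flag, strip, add, condition = parts
            strip = "" if strip == "0" else strip  # "0" means nothing
            add = "" if add == "0" else add
            rules.setdefault(flag, []).append((strip, add, condition))
    return rules


def expand(dic_text, rules):
    """Return the set of word forms produced by the word list and rules."""
    forms = set()
    for entry in dic_text.splitlines()[1:]:  # first line is the word count
        word, _, flags = entry.partition("/")
        forms.add(word)
        for flag in flags:
            for strip, add, condition in rules.get(flag, []):
                # The condition must match the end of the unmodified word.
                if re.search(condition + "$", word) and word.endswith(strip):
                    forms.add(word[: len(word) - len(strip)] + add)
    return forms


WORD_FORMS = expand(DIC_TEXT, parse_suffix_rules(AFF_TEXT))
print(sorted(WORD_FORMS))  # ['rule', 'rules', 'studies', 'study']
```

Two entries in the word list and one flag are enough to cover both rule/rules and study/studies, which is why a well-written .aff file can keep the .dic word list compact.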

The best documentation for the Hunspell dictionary format is in its man pages, although these are not readily accessible either at its home page, or even through Terminal’s command line. They are, though, available from Linux documentation sites such as here. The Hunspell home page does now give access to better documentation which can supplement the man page.

You will also find it helpful to browse some existing dictionaries using a plain text editor such as BBEdit or TextWrangler. Probably the most extensive collection of Hunspell dictionaries is that for OpenOffice here.

Having created your own, or acquired existing, dictionary files, all you have to do is place them in the Spelling folder in an appropriate Library folder. In most cases it makes best sense to use the main, top-level Library folder, where they will be accessible to all users.
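If you would rather script that step, something like the following Python sketch would do. It assumes that the top-level folder is /Library/Spelling and that both files share the same language name. The install_dictionary function and the cy_GB file names are hypothetical, and writing to /Library needs administrator rights, so the demonstration below copies into a temporary folder instead.

```python
import shutil
import tempfile
from pathlib import Path


def install_dictionary(dic_path, aff_path, spelling_dir="/Library/Spelling"):
    """Copy a .dic/.aff pair into a Spelling folder and return the new paths."""
    dic_path, aff_path = Path(dic_path), Path(aff_path)
    if dic_path.suffix != ".dic" or aff_path.suffix != ".aff":
        raise ValueError("expected one .dic file and one .aff file")
    if dic_path.stem != aff_path.stem:
        # Assumption: both files are named for the same language.
        raise ValueError("the two files should share the same language name")
    target = Path(spelling_dir)
    target.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(path, target)) for path in (dic_path, aff_path)]


# Demonstration in a temporary folder, so nothing is written to /Library.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "source")
    source.mkdir()
    for name in ("cy_GB.dic", "cy_GB.aff"):  # hypothetical file names
        (source / name).write_text("", encoding="utf-8")
    installed = install_dictionary(
        source / "cy_GB.dic", source / "cy_GB.aff", Path(tmp, "Spelling")
    )
    INSTALLED_NAMES = sorted(path.name for path in installed)

print(INSTALLED_NAMES)
```

To install for a single user rather than all users, pass the Spelling folder inside that user's own Library folder as the third argument.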

The Text tab of the Keyboard pane defaults to selecting the spelling dictionary automatically, but it is worth using the Set Up… item in that popup menu to ensure that your custom dictionary is selected, and placed in appropriate order.

Spelling dictionaries are added through the Text tab of the Keyboard pane.

Better applications, such as the word processor Nisus Writer Pro which has long excelled in its multilingual support, will then allow you to define and apply language sets, including chosen spelling dictionaries, to different sections within each document. Support in the more popular commercial products such as Microsoft Word will differ, though.

Technique: Creating and Installing Custom Keyboard Layouts

To make system-wide changes to your keyboard to support another language, create an installable layout in XML Keyboard Definition format. The complete contents of an XML .keylayout file, and the Keyboard Bundle composed of one or more of those files, are detailed here, and the definition is stored as a DTD in /System/Library/DTDs/KeyboardLayout.dtd.
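As a rough guide to the overall shape of such a file, the Python sketch below generates a skeleton layout with just two key maps, one for unmodified keys and one for Shift. The element and attribute names reflect my understanding of the format, and should be checked against the DTD before you rely on them. The layout name, the group and id values, and the build_keylayout function itself are placeholders for illustration.

```python
import xml.etree.ElementTree as ET

DOCTYPE = ('<!DOCTYPE keyboard SYSTEM '
           '"file://localhost/System/Library/DTDs/KeyboardLayout.dtd">')


def utf16_length(text):
    """Number of UTF-16 code units in text, used for the maxout attribute."""
    return len(text.encode("utf-16-le")) // 2


def build_keylayout(name, layout_id, plain_keys, shift_keys, group=126):
    """Return a minimal <keyboard> element with plain and Shift key maps."""
    outputs = list(plain_keys.values()) + list(shift_keys.values())
    keyboard = ET.Element("keyboard", {
        "group": str(group),  # placeholder value, check before use
        "id": str(layout_id),
        "name": name,
        "maxout": str(max(utf16_length(text) for text in outputs)),
    })
    layouts = ET.SubElement(keyboard, "layouts")
    ET.SubElement(layouts, "layout", {
        "first": "0", "last": "0", "modifiers": "mods", "mapSet": "keys",
    })
    # Map index 0 applies with no modifiers, index 1 with either Shift key.
    modifier_map = ET.SubElement(
        keyboard, "modifierMap", {"id": "mods", "defaultIndex": "0"})
    for index, modifier_keys in enumerate(["", "anyShift"]):
        select = ET.SubElement(
            modifier_map, "keyMapSelect", {"mapIndex": str(index)})
        ET.SubElement(select, "modifier", {"keys": modifier_keys})
    # One keyMap per modifier combination, each listing key code and output.
    key_map_set = ET.SubElement(keyboard, "keyMapSet", {"id": "keys"})
    for index, mapping in enumerate([plain_keys, shift_keys]):
        key_map = ET.SubElement(key_map_set, "keyMap", {"index": str(index)})
        for code, output in sorted(mapping.items()):
            ET.SubElement(key_map, "key",
                          {"code": str(code), "output": output})
    return keyboard


KEYBOARD = build_keylayout(
    name="Sketch Layout",  # hypothetical name
    layout_id=-5000,       # placeholder id
    plain_keys={0: "a", 14: "e"},
    shift_keys={0: "A", 14: "end"},
)
XML_TEXT = "\n".join([
    '<?xml version="1.0" encoding="UTF-8"?>',
    DOCTYPE,
    ET.tostring(KEYBOARD, encoding="unicode"),
])
print(XML_TEXT)
```

A usable layout would need entries for every key and for the other modifier combinations, which is where a dedicated editor saves a great deal of typing.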

Unfortunately, you cannot use the standard keyboard layouts installed as part of Mac OS X as models for your own modification, as they are pre-compiled into large binary files. However, an excellent and well-documented example is provided by Jan Borchers here, and the free online service here will generate any of the Mac formats for you.

Ukelele, free from here, is a good tool for offline work that is replete with prepared .keylayout files for a host of different languages, and this list is a valuable additional resource.

There are some undocumented wrinkles that can catch the unwary. You may have to name your Localised XML Keyboard Bundle Roman.bundle, and give it a recognised keyboard name (such as “U.S.”) and ID (such as 5000). If that does not work, selecting your custom layout in Input Sources may not stick properly, and Mac OS X might suddenly switch you back to one of its standard layouts.

Avoid installing a custom layout in the ~/Library/Keyboard Layouts folder, as some releases of OS X would then become unable to cope with password entry; use the /Library/Keyboard Layouts folder instead. Further idiosyncrasies are described in the lengthy discussion on Ukelele’s home page.

The other essential aids include the Keyboard and Character Viewers, and keyboards that can readily have individual caps removed and reordered to match your software mapping.

Updated from the original, which was first published in MacUser volume 27 issue 20, 2011.