Forming unusual characters using Unicode and typography

Last week, I was asked the apparently simple question of how to form the mathematical symbol commonly used for the mean or average of x, x̄ (that should display as the letter x with a bar over it). In this article I explore several answers to that question, and the aids available in macOS to help you manipulate characters in this way. These apply not only to mathematics but also to many Roman and non-Roman languages, and can be used to add effects such as strikeout, which many apps don’t support directly.

This problem divides neatly into two, depending on whether you’re trying to create plain text, or characters laid out in a document which might be viewed in a page description language such as PDF.

Text combinations

In general, text consists of a series of Unicode characters, which are rendered one after the other. If you can find a single Unicode character which combines all the elements you want, then that’s by far the simplest solution. In the case in point, Unicode offers ready-made vowels combined with the macron accent or diacritic, like ā and ē. Unfortunately, if you look through the Accented Latin Characters section in Latin, using the Character Viewer (Emoji & Symbols), you won’t find x with a macron there.
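If you’d rather check for a ready-made character programmatically than scroll through the Character Viewer, here is a minimal sketch in Python using the standard unicodedata module. The helper name find_precomposed is my own, and it assumes the usual Unicode naming pattern “LATIN SMALL LETTER … WITH …”, which not every accented character follows.

```python
import unicodedata
from typing import Optional


def find_precomposed(letter: str, accent: str) -> Optional[str]:
    """Look up a single ready-made character by its Unicode name.

    Hypothetical helper: it assumes the name follows the pattern
    'LATIN SMALL LETTER <X> WITH <ACCENT>'.
    """
    name = f"LATIN SMALL LETTER {letter.upper()} WITH {accent.upper()}"
    try:
        return unicodedata.lookup(name)
    except KeyError:
        # Unicode has no character with that name
        return None


for letter in "aex":
    found = find_precomposed(letter, "macron")
    if found is None:
        print(f"{letter} + macron: no single character available")
    else:
        print(f"{letter} + macron: {found} (U+{ord(found):04X})")
```

The vowels come back as single characters (U+0101 and U+0113), while x with a macron returns nothing, matching what the Character Viewer shows.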

For this type of addition to a regular character, Unicode offers what it terms combining marks, in this particular case a Combining Diacritical Mark, found at code points from U+0300. You have a choice of COMBINING MACRON at U+0304 and COMBINING OVERLINE at U+0305. Scroll down through the different code points and you’ll come across further sets of entries which are categorised as Combining Diacritical Marks, each of which will be combined with the preceding character in what appears to be a single character.
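To see what is happening underneath, this short Python sketch builds x with each of the two marks and inspects the result. The variable names are only illustrative; the code points are those given above.

```python
import unicodedata

# The base letter comes first, the combining mark follows it
x_macron = "x" + "\u0304"    # COMBINING MACRON
x_overline = "x" + "\u0305"  # COMBINING OVERLINE

for text in (x_macron, x_overline):
    names = [unicodedata.name(ch) for ch in text]
    print(text, len(text), names)

# A non-zero combining class marks a character that attaches to its predecessor
print(unicodedata.combining("\u0304"))

# Normalising to composed form (NFC) merges a + macron into the single
# character U+0101, but leaves x + macron as two code points
print(len(unicodedata.normalize("NFC", "a\u0304")))
print(len(unicodedata.normalize("NFC", x_macron)))
```

Although each string should display as one character, it still consists of two code points, which matters when you count, search or compare text.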

There are two good ways to add these combining marks in macOS: either from the Character Viewer or, if you know the key combination, directly from the keyboard. In the case of the Combining Macron, you can add that by enabling the ABC – Extended keyboard in the Input Sources tab of the Keyboard pane. Normally, when adding an accent to form a character such as é, you press Option-e for the accent, followed by the character e. The ABC – Extended keyboard works in reverse order here: press the x first, then Option-Shift-A will add the macron to it.

You could also use the Unicode Hex Input keyboard. Hold the Option key down, type in the hexadecimal UTF-16 code for any Unicode character, and that will insert it. The snag with this is that you’ll probably need the Character Viewer to look up the correct UTF-16 code, and it’s then easier just to double-click on the character there to insert it.
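If you already have the character somewhere and only need its code, you can work that out without the Character Viewer. This is a small sketch in Python; the function name utf16_code_units is my own invention.

```python
def utf16_code_units(char: str) -> list:
    """Return the UTF-16 code units of a character as 4-digit hex strings."""
    data = char.encode("utf-16-be")  # big-endian, without a byte order mark
    # Each UTF-16 code unit occupies two bytes
    return [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]


print(utf16_code_units("\u0304"))      # COMBINING MACRON
print(utf16_code_units("é"))           # a ready-made accented letter
print(utf16_code_units("\U0001D465"))  # MATHEMATICAL ITALIC SMALL X
```

Most characters you’re likely to need have a single four-digit code, such as 0304 for the combining macron. Those beyond U+FFFF, like the last example, are represented in UTF-16 by a pair of code units.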

There’s one problem you must also be wary of. Many plain text editors use fonts with limited support for these features in Unicode. Although the characters in your text may be perfectly correct, if the editor is using a limited font, you still won’t see the result you’re expecting. It’s worth switching the font to one which you know supports an extensive range of Unicode features, such as the free Junicode, which supports an amazing range of diacritics and other variations of Roman letters.

The plain text which you have now assembled using Unicode Combining Diacritical Marks should work in all apps which support Unicode, when set in a font which can cope. The limitation here is what Unicode offers: if you want to combine two arbitrary characters, then it can’t help in plain text.
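The same approach gives you strikeout in plain text. The sketch below is one way of doing it in Python, and rests on an assumption of mine: that COMBINING LONG STROKE OVERLAY at U+0336 is the mark you want, rather than one of the other overlay marks in the same block. How well the strokes join up depends on the font.

```python
import unicodedata

LONG_STROKE = "\u0336"  # COMBINING LONG STROKE OVERLAY (assumed choice of mark)


def strike(text: str) -> str:
    """Follow every character with the overlay mark so each appears struck out."""
    return "".join(ch + LONG_STROKE for ch in text)


print(unicodedata.name(LONG_STROKE))
print(strike("old price"))
```

Because every character gains a mark, the result is twice as long as the original in code points, and searching for the original words will no longer find it.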

Typographic tricks

Better page layout apps provide a range of tools which allow you to place glyphs wherever you want on the page. You could, if you wanted to, put the x in one text box, the macron in another, and align the two together. The text content of that page would then contain the two characters separately, which won’t help those using screen readers, for example, but they’ll look right when turned into a page description language like PostScript or PDF.

Thankfully, bringing two or more glyphs together into a single compound glyph is quite a common task in typography, and normally involves changing the kerning of the second and subsequent characters. For example, in Pages you select the two (or more) glyphs to be combined and use the Advanced Options in the Text sidebar. Reduce the Character Spacing there until the two glyphs are correctly aligned, and they will be visibly combined. However, if you copy them, or export them as text, for example, what you will see are the two separate characters, x followed by the macron diacritic.

When working with maths and scientific symbols, particularly with more extensive notation such as sums and integrals, there are two specialist options to consider. Apps based on the TeX typesetting system, including LaTeX, make typesetting maths and science remarkably simple and powerful. A more recent alternative, MathML, has also gained support in apps like Microsoft Word. While these provide standard ways to exchange laid-out text, they aren’t the same as Unicode, and will probably have to be rendered into PDF or similar for others to be able to access.
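As an illustration of how little is needed in LaTeX, here is a minimal document which sets the mean of x in two ways. It’s only a sketch using the standard \bar and \overline commands; which looks better will depend on the font and on how wide the expression under the bar is.

```latex
\documentclass{article}
\begin{document}
% \bar places a short, accent-like bar over a single symbol
The mean of $x$ is written $\bar{x}$.

% \overline draws a rule spanning the whole expression beneath it
For longer expressions, use $\overline{x + y}$.
\end{document}
```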

Which to use?

Whenever possible, it’s preferable to set combined characters using Unicode Combining Diacritical Marks, as they’ll be preserved in (almost) any format derived from that text, including apps which don’t support typographic controls. The only exception to this should be when you have to use a font which can’t display the combined character correctly, in which case typography will save the day provided that the document remains in laid-out format, for example as a PDF.

Thanks to EcleX for sparking this off, and to Iljitsch van Beijnum for suggesting one solution. The latter’s website may also be helpful if you can’t do this on your Mac.