The macOS Natural Language framework and Nalaprop

One of the Mac’s great attractions has been its support for those whose first language isn’t English. That means many of you, as WordPress tells me that you speak German, Dutch, Chinese, Spanish, French, Italian, Japanese, Polish, Swedish, and more, although perhaps not all at once. While English is great as a lingua franca, our mother tongue is our culture and our literary tradition, and a multilingual world is far richer for all our languages.

What you may not realise is the deep support for your languages in macOS. I’m not here referring to Language & Region settings, or translation support, but to the features in the Natural Language framework, introduced in macOS 10.14. It provides support for apps to analyse text in many different natural languages and do useful things with those analyses. These days, that not only includes support provided by Apple, but enables apps to deploy custom natural language models using Machine Learning, or AI if you prefer the term.

AI seems a particular problem for non-English languages at present. In the headlong rush to be first with the most powerful Large Language Model, an industry dominated by monolingual US corporations has focussed its efforts almost entirely on English. Although most of the leading LLMs are claimed to be multilingual, and some include over 50 languages, their models are in reality overwhelmingly built on English, with less than 10% representing all other languages. And that small minority breaks down to even less when you consider individual languages: even major European languages like Italian barely get a look-in.

I’d be interested to hear of your experience accessing LLMs using non-English languages.

This is an area where Apple’s enthusiastic support for smaller, local models could make them more useful than hugely expensive LLMs built in all those US-run data centres.

When the Natural Language framework was first released for macOS, I built an app to demonstrate some of its powers, Nalaprop, and its current version still runs happily in Tahoe. Although it remains useful for some, I feel the time has come to make better use of this framework, or let Nalaprop slip away quietly with the arrival of macOS 27 this autumn/fall. Let me explain what it currently does.

Nalaprop relies on linguistic support modules loaded into macOS. As far as I can tell at present, those provide full support for English, French, Spanish, German, Italian, Portuguese, Russian and Turkish. It can also recognise many other languages, but support for those doesn’t extend to analysing them more fully.
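To give a flavour of that distinction between recognising a language and analysing it, here is a minimal Swift sketch. It isn’t Nalaprop’s own code: the function names languageGuesses and supportsLemmas are my own inventions for illustration, and the sample sentences are arbitrary. It asks NLLanguageRecognizer for its best guesses, then checks whether NLTagger offers a lemma scheme for the top language on this Mac.

```swift
import NaturalLanguage

/// Returns the most likely languages for `text`, best guess first.
/// (Hypothetical helper, not part of the framework.)
func languageGuesses(for text: String, maximum: Int = 3) -> [(language: NLLanguage, confidence: Double)] {
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(text)
    // languageHypotheses gives a dictionary of language to probability.
    return recognizer.languageHypotheses(withMaximum: maximum)
        .sorted { $0.value > $1.value }
        .map { (language: $0.key, confidence: $0.value) }
}

/// True when the tagger reports a lemma scheme for that language on this Mac.
/// (Hypothetical helper, used here as a rough proxy for fuller analysis.)
func supportsLemmas(_ language: NLLanguage) -> Bool {
    NLTagger.availableTagSchemes(for: .word, language: language).contains(.lemma)
}

let samples = [
    "Bonjour tout le monde, comment allez-vous aujourd'hui ?",
    "Der schnelle braune Fuchs springt über den faulen Hund."
]
for sample in samples {
    guard let best = languageGuesses(for: sample).first else { continue }
    print(best.language.rawValue, best.confidence, "lemmas:", supportsLemmas(best.language))
}
```

What this reports is likely to depend on which language support has been downloaded, so results may differ between Macs.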

Load your Mac up with a good selection of those languages, along with some you’d like it to aspire to, and give it an hour or so to download and install additional language support. Then open Nalaprop’s bundled demonstration text file drawn from Wikipedia’s many languages.

It then analyses the text (on the left) for the common parts of speech, such as nouns, verbs and adjectives, and colours all the words according to that classification (in the centre). As you can see here, it’s not afraid to do this on texts containing multiple languages, and appears to make a good job of all those it supports.
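For those curious how an app might obtain that classification, the following is a rough sketch using NLTagger with its lexicalClass scheme. It’s an illustration rather than Nalaprop’s actual source, and the function name partsOfSpeech is my own; colouring the words is left out, as that’s a matter for the text view.

```swift
import NaturalLanguage

/// Returns each word in `text` paired with the name of its part of speech.
/// (Hypothetical helper; words the tagger can't classify are marked "Unknown".)
func partsOfSpeech(in text: String) -> [(word: String, tag: String)] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    var results: [(word: String, tag: String)] = []
    // Skip spaces and punctuation so that only words are reported.
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: options) { tag, range in
        results.append((word: String(text[range]), tag: tag?.rawValue ?? "Unknown"))
        return true // carry on to the next word
    }
    return results
}

for item in partsOfSpeech(in: "The quick brown fox jumps over the lazy dog.") {
    print(item.word, item.tag)
}
```

The tagger works out the language for itself here, which is why no language is specified.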

The next stage is initiated by clicking on the MultiParse button, which performs an even more thorough analysis that includes lemmatisation, converting words into their ‘root’ form or lemma. For example, the English word ‘is’ is a form of the verb ‘to be’, just as the French ‘est’ is of ‘être’, so Nalaprop displays that root form of the word in the centre panel. As you can see, this doesn’t do much for English, which doesn’t inflect words much, but for many languages it can be a great help when you’re trying to understand them.
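Lemmatisation can be sketched in much the same way, this time asking NLTagger for its lemma scheme. Again this is only an illustration under my own assumptions, not what MultiParse does internally: the function name lemmas is hypothetical, and I’ve chosen to fall back to the original word when the tagger offers no lemma.

```swift
import NaturalLanguage

/// Returns each word in `text` paired with its lemma.
/// (Hypothetical helper; falls back to the word itself when no lemma is offered.)
func lemmas(in text: String) -> [(word: String, lemma: String)] {
    let tagger = NLTagger(tagSchemes: [.lemma])
    tagger.string = text
    var results: [(word: String, lemma: String)] = []
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lemma,
                         options: options) { tag, range in
        let word = String(text[range])
        // For the lemma scheme, the tag's raw value is the lemma itself.
        results.append((word: word, lemma: tag?.rawValue ?? word))
        return true
    }
    return results
}

for item in lemmas(in: "The children were running.") {
    print(item.word, "->", item.lemma)
}
```

With very short texts the tagger may not be able to settle on a language, in which case you might see only the fallback words.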

Given all those lemmatised forms, Nalaprop can then build word lists by parts of speech, classifying the word ‘young’ as an adjective, and finding a total of 28 examples (on the right) in the text of Charles Dickens’ novella A Christmas Carol.
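A word list of that kind could be built along these lines. This is a sketch under my own assumptions rather than Nalaprop’s code: the function name wordList is hypothetical, lemmas are lower-cased before counting, and the sample sentence is made up.

```swift
import NaturalLanguage

/// Counts the lemmas of all words in `text` that belong to one part of speech.
/// (Hypothetical helper; keys are lower-cased lemmas, values are occurrences.)
func wordList(in text: String, for lexicalClass: NLTag) -> [String: Int] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass, .lemma])
    tagger.string = text
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]

    // First pass: collect the ranges of words with the wanted part of speech.
    var ranges: [Range<String.Index>] = []
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: options) { tag, range in
        if tag == lexicalClass { ranges.append(range) }
        return true
    }

    // Second pass: look up each word's lemma and tally it.
    var counts: [String: Int] = [:]
    for range in ranges {
        let (lemmaTag, _) = tagger.tag(at: range.lowerBound, unit: .word, scheme: .lemma)
        let key = (lemmaTag?.rawValue ?? String(text[range])).lowercased()
        counts[key, default: 0] += 1
    }
    return counts
}

let adjectives = wordList(in: "The young boy met a young girl in the old town.", for: .adjective)
for (lemma, count) in adjectives.sorted(by: { $0.value > $1.value }) {
    print(lemma, count)
}
```

Counting by lemma rather than by the word as written means that different forms of the same word are gathered under a single entry.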

Since I wrote Nalaprop, the Natural Language framework has extended its capabilities, and there’s a great deal more that the app could do, even down to building gazetteers of place-names, exploring similarities between words and sentences using semantic distance, and of course integrating AI built into macOS.
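As a taste of one of those newer features, here’s a minimal sketch of semantic distance between words using NLEmbedding, which requires macOS 10.15 or later. The function name semanticDistance is my own, and I’m assuming the built-in English word embedding is available; if it isn’t, the function simply returns nil.

```swift
import NaturalLanguage

/// Returns the distance between two words in the system's word embedding,
/// or nil when no embedding is available for that language.
/// (Hypothetical helper; smaller values mean the words are closer in meaning.)
func semanticDistance(_ first: String, _ second: String,
                      language: NLLanguage = .english) -> Double? {
    guard let embedding = NLEmbedding.wordEmbedding(for: language) else { return nil }
    // Lower-cased here on the assumption that the vocabulary is lower case.
    return embedding.distance(between: first.lowercased(), and: second.lowercased())
}

for (a, b) in [("cat", "dog"), ("cat", "carburettor")] {
    if let distance = semanticDistance(a, b) {
        print(a, b, distance)
    } else {
        print("No word embedding available")
    }
}
```

You would expect the first pair to come out closer than the second, although I wouldn’t rely on any particular values.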

Nalaprop is available from its Product Page.

Should I put it into retirement, or do something more useful with it, and if so, what would you find most useful?