What’s going on with AI in Sequoia?

If you only read the headlines, you’d now presume that ChatGPT was the major new feature in macOS Sequoia and its sister operating systems due for release in about three months. This article explains why that’s not correct, and what is really going on with AI and ML on the Mac. Because they’re the clearest to understand, I’ll focus here on changes coming in how macOS works with text, features that Apple calls Writing Tools. To understand where we are now, I’ll start forty years ago.

Spell-check

Back in the early 1980s, when personal computers including the Mac were starting off, one big breakthrough was being able to check the spelling of the words you typed into word processors. When the Mac brought the Desktop Publishing revolution, that extended to text in laid-out pages.


Although this started simply, checking spelling turned out to be more complex, as it had to cope with language variations, and even then wasn’t as clever as a human. How could it tell whether that word should be there, their or they’re, for example? As we became increasingly unimpressed, so checking spelling extended into grammatical context, and came to examine our grammar too.
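To illustrate why word-by-word checking falls short, here is a minimal sketch in Python of the simplest kind of checker: a dictionary lookup with suggestions ranked by edit distance. The tiny word list and the names edit_distance and check_word are my own assumptions for illustration, not how any real spell-checker in macOS is built.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


# Hypothetical miniature dictionary, for illustration only.
WORDS = {"there", "their", "they're", "spelling", "check", "the", "is", "over"}


def check_word(word, dictionary=WORDS, max_distance=2):
    """Return None if the word is known, else nearby dictionary words."""
    w = word.lower()
    if w in dictionary:
        return None
    scored = sorted((edit_distance(w, d), d) for d in dictionary)
    return [d for dist, d in scored if dist <= max_distance]


suggestions = check_word("thier")   # misspelt, so suggestions are offered
accepted = check_word("their")      # correctly spelt, so nothing is flagged
```

A checker like this accepts there, their and they’re alike, as each is a valid word in isolation. Telling which one belongs in a sentence requires looking at the words around it.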


Word completion

On devices where typing is performed by tapping on tiny keyboards or on-screen, predictive text became universal. Using a system of simple rules, it enables you to enter the most likely words with the fewest keystrokes. Since its introduction this has steadily become more sophisticated, so that it learns which words a user is most likely to type, and in which contexts. This has now extended to word completion for Mac text entry, with considerable ability to learn what to suggest next. That machine learning is performed on-device, and doesn’t require any assistance from external resources.
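The idea of learning which words a user is most likely to type, and in which contexts, can be sketched with simple word counts. This Python example is a toy under my own assumptions: the Completer class and its methods are hypothetical names, and the learning built into macOS is certainly more sophisticated than counting word pairs.

```python
from collections import Counter, defaultdict


class Completer:
    """Toy word completion that learns from what the user has typed."""

    def __init__(self):
        self.unigrams = Counter()            # how often each word was typed
        self.bigrams = defaultdict(Counter)  # what followed each word

    def learn(self, text):
        words = text.lower().split()
        self.unigrams.update(words)
        for prev, nxt in zip(words, words[1:]):
            self.bigrams[prev][nxt] += 1

    def suggest(self, prefix, previous=None, limit=3):
        """Rank known words starting with prefix, favouring the context."""
        prefix = prefix.lower()
        context = Counter()
        if previous is not None:
            context = self.bigrams.get(previous.lower(), Counter())
        candidates = [w for w in self.unigrams if w.startswith(prefix)]
        # Context count first, then overall frequency, then alphabetical.
        candidates.sort(key=lambda w: (-context[w], -self.unigrams[w], w))
        return candidates[:limit]


completer = Completer()
completer.learn("machine learning machine learning machine learning the mac")
without_context = completer.suggest("ma")
after_the = completer.suggest("ma", previous="the")
```

With no context, the most frequently typed word comes first; after "the", the word that has previously followed it is promoted instead. Everything here stays in local memory, in keeping with learning that takes place on-device.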


Optical character recognition

Over the same period, computers and devices have become able to recognise text in images. At first this was the preserve of dedicated Optical Character Recognition (OCR) software used to convert scanned pages into text, and sometimes was performed off-device. It was built into macOS Monterey as Live Text, alongside Visual Look Up.

Live Text doesn’t transfer any data off device, although it’s dependent on linguistic support data that may have to be downloaded. Much of the work performed in Visual Look Up also remains on-device, although image recognition may call on servers containing data for specific types of image, such as paintings.

One popular technique is to calculate locally a form of hash that is distinctive to part or the whole of an image. The remote system then matches that against hashes for known objects, and proposes the closest match as its identification of the image. As the hash function used is one-way (and normally derived from a neural network), there’s no way to reconstruct the original image from that hash. That contrasts with online systems that require the image to be uploaded for remote analysis and identification, so putting your privacy at risk.
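The general shape of hash matching can be sketched as follows. I must stress the assumptions: this Python example swaps in a simple "average hash" of pixel brightness for the neural network hashes described above, the images are tiny made-up grids, and the function names are hypothetical. It shows the principle only, not how Visual Look Up is implemented.

```python
def average_hash(pixels):
    """Hash a greyscale image (2D list of 0-255 values) to one bit per pixel.

    Each bit records only whether a pixel is brighter than the mean, so the
    original pixel values can't be recovered from the result.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def closest_match(query_hash, known):
    """Server side: return the (name, distance) of the nearest known hash."""
    name = min(known, key=lambda n: hamming(query_hash, known[n]))
    return name, hamming(query_hash, known[name])


# Hypothetical reference images held by the remote system, as hashes only.
split = [[0, 0, 255, 255]] * 4
checker = [[0, 255, 0, 255], [255, 0, 255, 0]] * 2
known_hashes = {"split": average_hash(split), "checker": average_hash(checker)}

# A noisy local image: only its hash would ever leave the device.
photo = [[10, 20, 240, 250],
         [5, 30, 245, 235],
         [0, 15, 255, 250],
         [12, 8, 230, 200]]
match, distance = closest_match(average_hash(photo), known_hashes)
```

In this arrangement the remote system sees only the compact hash, never the image itself, and it identifies the image by choosing the known hash with the fewest differing bits.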

Writing Tools

These are extensions of existing text analysis that look beyond a word and its immediate context, to analyse paragraphs or whole documents. To do this requires more substantial models, extending up to Large Language Models (LLMs) that have become so famous in popular AI like ChatGPT.

Apple has developed an LLM that is powerful enough to provide Writing Tools, but is small enough to run on a device, or an Apple silicon Mac. This has been fine-tuned into a task-specific version for performing text-based functions such as summarising and proofreading text. It can turn your text into a summary of key points, or generate lists or tables from it. These work both with editable text in most editors and word processors, and even with non-editable text from other sources.

Writing Tools don’t themselves generate new content in text, but use the original text to produce derivatives. I’m particularly looking forward to using its proofreading feature, which can suggest improvements that I can choose to ignore, or adapt to my own style, as I wish.

Private Cloud Compute

Some of the more demanding tasks in Apple Intelligence can’t be run on-device at present, although as techniques and hardware improve that may become possible in a year or two. For those challenges that need more computing power, Apple has devised what it terms Private Cloud Compute, using servers with Apple silicon chips designed to preserve privacy at all times, and overseen by independent experts to verify privacy measures. Apple has just published an article explaining how that will work.

Until there’s more experience of which tasks can be run on-device, it’s not clear which will benefit from Apple’s servers. Given the capable hardware in an Apple silicon Mac, it currently looks unlikely that any Writing Tools will be required to be run remotely.

Generative AI

Writing Tools don’t create new content; they use your original to generate derivatives, much in the way that a sub-editor might proofread and create a summary for an author. Generative AI uses LLMs and other methods to create new content, perhaps bringing together text from a range of other sources to write you an essay on a topic you want to know about. While some of us (me included) have no interest in using ChatGPT or its competitors, Apple recognises that many who use Macs and devices do want easy access to those services, and has promised to integrate access to them into Sequoia and its sisters. However, their use is entirely optional, and at present there are no plans to incorporate third-party LLMs into everyday macOS features.

Summary

Starting this autumn, rather than having to write my own summary of an article like this, I will be able to use the Writing Tools in Apple Intelligence to do it for me. This will take place within the confines of my Mac, where it remains private. I retain complete control, and can reject the summary provided, or modify it to my own taste. For some more demanding tasks, macOS may decide to enlist the help of Apple’s Private Cloud Compute, which will also respect my privacy and not retain any of my text once the job is complete. And no, I’m not going to use ChatGPT, thank you, but if that’s what you like, it will be there and free to use.