How good is Monterey’s Visual Look Up?

Visual Look Up might have come as a surprise in Monterey 12.3, but it’s been brewing for a good while. It’s the next step beyond facial recognition and the extraction of text from images. It was almost announced last summer when Apple first discussed its ill-fated plans to perform image recognition to detect Child Sexual Abuse Material (CSAM). The technical account of the methods that Apple was proposing to use clearly had a more general purpose. Now we’ve got it, not for CSAM detection, but to recognise paintings, breeds of pets, landmarks, and some other image content.

If you’ve updated to 12.3, it’s very easy to test. Bring up the image of a painting in one of the supported apps, currently including Safari, Photos and Preview. In Safari, Control-click on the image to produce the contextual menu.

At the foot, you should see (sometimes not immediately) the command Look Up.

If that’s successful, a white circle with the Visual Look Up icon inside will appear in the middle of the image. In some cases, the image may be opened in a new window for this to happen, and it’s common for this step to run straight through into the next. If not, simply click on the white circle.

A window then pops up over the circle and displays information about the painting, together with a menu of suggested links at the bottom. The amount of detail given in these windows varies considerably. For the Mona Lisa, there’s a brief essay with a thumbnail image.

For some paintings containing other well-known works of art, Visual Look Up goes one step further and recognises one or more of the paintings within the painting, in this case Louis Béroud’s The Mona Lisa in the Louvre (1911).

However, for the moment, Visual Look Up doesn’t seem capable of recognising different objects within a painting, such as breeds of dog, which it happily does in photographs. It’s very good at those, recognising our daughter’s Havanese, a fairly unusual breed, even in a poor photo.

To perform Visual Look Up in Preview, open the image and wait until the Inspector button shows a + sign at its upper left, indicating that the feature is available. Click on that Inspector button, and as well as the Inspector window appearing, the white Visual Look Up button should appear in the centre of the image. If you leave the Inspector window open, this should happen automatically when you open other images.

Photos is similar to Preview: open the image you want to Look Up, and after a couple of moments its Get Info button should show a + sign at its upper left if Look Up is available. Click on the button to open the Info window, and all Visual Look Up buttons available should be revealed on the image.

Visual Look Up is successful in identifying a very wide range of Western paintings, and seldom reports that it can’t identify a painting. It does make the occasional mistake, which gives us a glimpse into the mechanism it uses for recognition. For instance, the only error it has made here so far was misidentifying Pissarro’s Setting Sun and Fog, Éragny (1891).

Camille Pissarro (1830–1903), Setting Sun and Fog, Éragny (1891), oil on canvas, 54 x 65 cm, Private collection. WikiArt.

Visual Look Up told me that was Akseli Gallen-Kallela’s Landscape in Kuhmo (1890), an error no human would make. This reflects the fact that the mechanism used here doesn’t resemble that of human vision.

Akseli Gallen-Kallela (1865–1931), Landscape in Kuhmo (1890), oil on canvas, 14 x 31 cm, Gösta Serlachiuksen taidesäätiö, Mänttä-Vilppula, Finland. Wikimedia Commons.

From my initial exploration, Visual Look Up is dependent on multiple internet connections to several different Apple servers. Much of the work is performed by VisionKit and VisualSearch. These initially analyse the image and determine which type it is. If it’s a painting, they then calculate what Apple terms a Neural Hash, a near-unique signature which is largely independent of colour balance, resolution, even cropping within limits. That Neural Hash is then looked up online in Apple’s dictionary to find the nearest match, for which details are then fetched from Apple’s servers.
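The general idea of hashing an image and then finding its nearest match can be sketched in a few lines. This is only an illustration: Apple’s Neural Hash is derived by a neural network and isn’t reproduced here. The sketch swaps in a much simpler ‘average hash’, and the tiny 4 x 4 greyscale images, the titles and the function names are all invented for the example. Unlike the real thing, this toy hash is shown coping only with a change in brightness, not with cropping or resizing.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # greyscale pixels, 0-255 (hypothetical toy images)


def average_hash(image: Image) -> int:
    """Toy perceptual hash, NOT Apple's Neural Hash.

    Produces one bit per pixel, set when the pixel is brighter than the
    mean brightness of the whole image.
    """
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def nearest_match(query: int, dictionary: Dict[str, int]) -> Tuple[str, int]:
    """Return the title whose stored hash is closest to the query, with its distance."""
    title = min(dictionary, key=lambda t: hamming(query, dictionary[t]))
    return title, hamming(query, dictionary[title])


# Two invented 'paintings' standing in for entries in the online dictionary.
painting_a = [
    [200, 190, 60, 50],
    [210, 180, 70, 40],
    [60, 50, 200, 190],
    [70, 40, 210, 180],
]
painting_b = [
    [30, 220, 30, 220],
    [30, 220, 30, 220],
    [30, 220, 30, 220],
    [30, 220, 30, 220],
]
dictionary = {
    "Painting A": average_hash(painting_a),
    "Painting B": average_hash(painting_b),
}

# A brighter copy of Painting A, as if photographed with a different exposure.
brighter_a = [[min(255, p + 30) for p in row] for row in painting_a]

match = nearest_match(average_hash(brighter_a), dictionary)
print(match)
```

Because each bit records only whether a pixel is brighter than the image’s own mean, brightening every pixel by the same amount leaves the hash unchanged, so the altered copy is still matched to Painting A.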

Confusion between the Pissarro and Gallen-Kallela paintings above resulted from a ‘collision’, in which their Neural Hashes are sufficiently similar to result in misidentification. Such collisions were one of the reasons that Apple was dissuaded from using this technique to detect CSAM. Where a misidentification matters as little as it does with a painting, errors like this are tolerable.
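A collision of this kind is easy to picture with a toy matcher. In the sketch below the 16-bit hash values, the titles and the matching threshold are all made up for illustration, and say nothing about the values or tolerances that Apple actually uses. It assumes only that a lookup accepts the nearest stored hash when it differs from the query in no more than a few bits.

```python
from typing import Dict, Optional

MATCH_THRESHOLD = 2  # hypothetical: maximum differing bits still accepted as a match


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def identify(query: int, dictionary: Dict[str, int],
             threshold: int = MATCH_THRESHOLD) -> Optional[str]:
    """Return the nearest title if it is within the threshold, otherwise None."""
    best = min(dictionary, key=lambda t: hamming(query, dictionary[t]))
    return best if hamming(query, dictionary[best]) <= threshold else None


# Invented 16-bit hashes; real hashes would be far longer.
dictionary = {"Painting in dictionary": 0b1111000011110000}

different_painting = 0b1111000011110001  # a different image, one bit away
unrelated_painting = 0b0000111100001111  # differs in every bit

collision = identify(different_painting, dictionary)
no_match = identify(unrelated_painting, dictionary)
print(collision, no_match)
```

The different painting is reported as the one in the dictionary because its hash falls within the threshold, while the unrelated one is correctly rejected. Tightening the threshold reduces collisions, but at the cost of failing to recognise genuine copies that have been altered.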

Currently for paintings, Visual Look Up appears remarkably accurate, with vast coverage, far greater than you’d expect any single human to be capable of. However, as implemented it has two significant shortcomings.

The first is in the data that it links to. Look up most Turners in the Tate, for example, and Look Up doesn’t link to their pages on the Tate Gallery site, but in many cases to commercial vendors of replicas, whose information about these works is often limited and unhelpful, sometimes incorrect. The second is the information it gives, which doesn’t appear to have been structured by an art expert: media and location are rarely given, and the only dimension provided is the height of the painting in feet. Hopefully Apple will improve both in the future, and make Visual Look Up the uniquely powerful tool that it should be.

The sad thing for me, though, is that I’ll never be able to get away with spoofs like this again. Visual Look Up gets the artist right every time, and identifies all but one of the paintings correctly.