Spotlight on search: How to diagnose and fix problems

As with so much of macOS, diagnosing problems with Spotlight is made much easier when you understand how it works, and where those problems can arise. This article presents an overview of Spotlight in the context of diagnosing its problems. Here’s a handy summary of the two main processes which take place in Spotlight: the indexing of files to maintain its databases, and how it processes search requests.

[Diagram: the steps in Spotlight indexing and search]

Much of what happens with Spotlight is invisible, and doesn’t even get recorded in the log. In the diagram, the only parts which are visible to, or under direct control of, the user are shown in green.

Indexing

The process of indexing a file and its contents is initiated when an app (or other software) writes a new or changed file, which is recorded in the hidden FSEvents database. This triggers an XPC call to process that file if it’s in a location which is within Spotlight’s scope.

The first cause of problems is that the file you’re interested in may be in a location which is excluded from indexing. So the first step in diagnosing any Spotlight problem is to inspect the Privacy tab in the Spotlight pane in System Preferences, and confirm that the location isn’t excluded there. Items also won’t get indexed when that has been blocked by another mechanism, including:

  • appending the extension .noindex to the folder name (this previously worked using .no_index instead);
  • making the folder invisible to the Finder by prefixing a dot ‘.’ to its name;
  • putting an empty file named .metadata_never_index inside the folder; this no longer works in recent macOS.
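The two naming rules in that list can be checked mechanically. Here is a minimal sketch, assuming only what is stated above: it looks at each component of a path and reports the first one that ends in .noindex or starts with a dot. The function name and the example path are my own inventions, and the check knows nothing about the Privacy list or any other exclusion.

```shell
#!/bin/sh
# Report whether any component of a path matches the naming rules above.
# Prints the reason and returns 0 on a match; returns 1 if nothing matches.
# (spotlight_name_exclusion is a hypothetical helper, not a macOS tool.)
spotlight_name_exclusion() {
    path=$1
    old_ifs=$IFS
    IFS=/          # split the path into its components
    set -f         # don't expand wildcards while splitting
    for part in $path; do
        case $part in
            ''|.|..)   ;;   # ignore empty parts and relative markers
            *.noindex) IFS=$old_ifs; set +f; echo "noindex: $part"; return 0 ;;
            .*)        IFS=$old_ifs; set +f; echo "hidden: $part";  return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Example with a made-up path: prints "noindex: Demo.noindex"
spotlight_name_exclusion "/Users/me/Library/Caches/Demo.noindex/data.db"
```

If this reports nothing for your file, the naming rules aren't the reason it's missing from search results, and you should look further along the indexing chain.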

Provided the changed file isn’t in an excluded location, an mdworker process should then start to add its contents to the volume indexes. To do this, it first checks what type of file it is, in terms of its Uniform Type Identifier (UTI). If that’s incorrect, the remaining steps won’t work properly. In most cases these days, that means the file must have the correct extension for its type; if it doesn’t, mdworker won’t be able to index it correctly.
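One way to see the type that has been recorded for a file is to ask for its kMDItemContentType attribute using mdls. This is a sketch rather than a complete test: the function name is hypothetical, and I’m assuming that the value mdls reports is the same UTI that mdworker used.

```shell
#!/bin/sh
# Print the content type (UTI) recorded in a file's metadata.
# (content_type is a hypothetical wrapper; mdls is the macOS command.)
content_type() {
    if ! command -v mdls >/dev/null 2>&1; then
        echo "mdls not found; this check needs macOS" >&2
        return 127
    fi
    # -name selects one attribute; -raw prints just its value.
    mdls -raw -name kMDItemContentType "$1"
}

# Usage (on a Mac):
#   content_type "$HOME/Documents/notes.txt"
```

A plain text file would normally be expected to report public.plain-text. If you see something generic such as public.data instead, the extension is the first thing to check.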

Spotlight then looks up the correct mdimporter for that type. For many file types, those are provided as part of macOS and stored in the system, in /System/Library/Spotlight. Importers for third party apps may be in /Library/Spotlight or its equivalent in your Home folder Library, or in the Contents/Library/Spotlight folder inside the app bundle itself. Use the command
mdimport -L
to list all mdimporter plugins currently installed, with their paths.
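If you’d rather look in those folders yourself, the following sketch lists the .mdimporter bundles in whichever folders you give it. The function name is my own, and it only looks at the top level of each folder, so it won’t find importers which are embedded inside apps unless you pass that folder explicitly.

```shell
#!/bin/sh
# List .mdimporter bundles found directly inside the given folders.
# (list_importers is a hypothetical helper, not a macOS tool.)
list_importers() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue            # skip folders that don't exist
        for bundle in "$dir"/*.mdimporter; do
            if [ -e "$bundle" ]; then        # unmatched patterns stay literal
                echo "$bundle"
            fi
        done
    done
    return 0
}

# The three standard locations named above.
list_importers /System/Library/Spotlight /Library/Spotlight "$HOME/Library/Spotlight"
```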

Spotlight importers and mdworker itself can crash when there’s a bug, or when the mdimporter encounters a malformed file. If that happens, the log normally records repeated crashes and restarts of that mdworker process. In the past, this has been a common cause of wider problems, including high CPU use. If you can identify and remove the file that’s causing the crashes, that can allow indexing and CPU use to return to normal.
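To look for those repeated entries, you can filter the log by process name. This sketch wraps log show with a predicate on the process; the function name, the LOG_BIN variable and the default of one hour are my own choices, and you may need to adjust the period to cover the time of the problem.

```shell
#!/bin/sh
# Show recent log entries from processes whose name starts with a prefix.
# $1 = process name prefix (default mdworker), $2 = period (default 1h).
# (spotlight_log and LOG_BIN are hypothetical names used for this sketch.)
spotlight_log() {
    proc=${1:-mdworker}
    window=${2:-1h}
    # The full path avoids the unrelated 'log' builtin that some shells have.
    "${LOG_BIN:-/usr/bin/log}" show --last "$window" \
        --predicate "process BEGINSWITH \"$proc\""
}

# Usage (on a Mac):
#   spotlight_log mdworker 30m
#   spotlight_log mds_stores 10m
```

The same function can be pointed at mds or mds_stores when you want to follow the later stages of indexing.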

Once the mdworker has extracted the data from the file, those are added to the volume’s indexes in the hidden folder .Spotlight-V100. This is typically seen in a log entry from mds_stores containing the message
compressing 5686 bytes to <private>
or similar, for each file which has had text content extracted and added to the indexes. Those indexes may be missing or damaged, in which case you’ll need to force them to be rebuilt.
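Rebuilding is normally done with mdutil. The sketch below wraps its two most relevant options: -s to report the indexing status of a volume, and -E to erase its indexes so that they’re rebuilt. The function names are hypothetical, the default of the startup volume is my assumption, and a rebuild of a large volume can take a long time.

```shell
#!/bin/sh
# Report whether indexing is enabled on a volume (default: startup volume).
# (index_status and rebuild_index are hypothetical wrappers around mdutil.)
index_status() {
    mdutil -s "${1:-/}"
}

# Erase the volume's Spotlight indexes so that they are rebuilt.
# This needs admin rights, hence sudo.
rebuild_index() {
    sudo mdutil -E "${1:-/}"
}

# Usage (on a Mac):
#   index_status /Volumes/Data
#   rebuild_index /Volumes/Data
```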

Search

There are at least four different flavours of Spotlight, which can include internet search, local search with or without limits, and Core Spotlight’s in-app search. Although their scope differs, local searches perform a query using a search predicate on the Spotlight indexes. There is plenty of opportunity for error here: the scope of the search may be incorrect, for instance excluding specific folders, or Mail’s messages. The search predicate can be incorrect, and the search may fail partially or completely. As there’s no other way to make a direct query of the Spotlight database, these are almost impossible to check.
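One partial cross-check is to run a simple query of your own from Terminal using mdfind, where you control both the scope and the predicate. This sketch searches indexed text content for a word within a single folder. The function name is hypothetical, it makes no attempt to escape quotation marks in the word, and it can’t tell you what predicate an app actually used.

```shell
#!/bin/sh
# Search indexed text content for a word, limited to one folder.
# $1 = folder to search in, $2 = word to look for.
# (search_text is a hypothetical wrapper around mdfind.)
search_text() {
    folder=$1
    word=$2
    # * is a wildcard; the trailing c and d make the match insensitive
    # to case and to diacritics respectively.
    mdfind -onlyin "$folder" "kMDItemTextContent == \"*$word*\"cd"
}

# Usage (on a Mac):
#   search_text "$HOME/Documents" spotlight
```

If mdfind finds the file but the app’s search doesn’t, that suggests the indexes are sound and the problem lies in the scope or predicate of that search.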

Searching is normally started with a log entry from the Spotlight server reading something like
KEEP!!!!! Client <private>[2892 2892] xpc checking in
following which there’s a directQueryOpenReply and QueryOpen is successful. The Spotlight index then reports that it has sorted a number of flat pages, giving the time taken to do so. The response is then marked by the server returning a directQueryFetchResultsReply, which gives a Quality of Service (qos) value for that search response.

Spotlight searches continue until they’re complete, so a calling app which doesn’t wait long enough may not gather all the results.

To diagnose your Spotlight problems, try my free utility Mints, which gives you detailed information about each process, and will examine custom file types and their Importers too.