Is it time for a good Scrub? Building a tool to enhance privacy

At the start of this week, I published an incomplete list of some of the different places that macOS can cache potentially private data. Since then, I have been looking through that list and wondering what we can do about it. The answer seems to be an app which can systematically empty those caches when needed, which I have dubbed Scrub.

While we’re using bathroom metaphors, I don’t want to throw the baby out with the bathwater. Disabling many of the important tools which we rely on every day reduces macOS to a quirky Unix with a fancy front-end, and losing key tools like Spotlight and versions completely would make life far less productive, and much more frustrating. Several times versions have come to my aid when something has gone missing from an important document.

The purpose of the app is to remove as much ‘leaked’ data as possible, and to make the contents of a folder or volume as private as they can be, within the limitations of macOS and without compromising important macOS features.

Some example use cases might include someone working on sensitive documents stored on encrypted removable media who must not leave traces of those documents on their unencrypted boot volume, or someone wanting to minimise the forensic footprint of a batch of documents, wherever they are stored.

In the former case, that user may only need to disable QuickLook caching and empty that cache from their startup volume, but they might also want to play safe and clear the unified log once they have finished their work.

In the latter case, the user will probably want to strip all extended attributes (which can contain the download paths of files, for example), remove Spotlight metadata, strip all old versions of documents, empty the QuickLook cache, set all files to a uniform date of creation, and even trash their logs. Those would severely limit the information which could leak with those documents.
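As a rough illustration, those two use cases could be written down as presets over the kinds of action described in this article. This is only a sketch in Python: the class name, field names, preset names and the xattr level names are all hypothetical, and are not taken from Scrub itself.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ScrubOptions:
    """Hypothetical set of switches, one per kind of action discussed here."""
    xattrs: str = "leave"            # assumed level names: "leave", "download", "all"
    spotlight: bool = False          # remove Spotlight metadata
    versions: bool = False           # strip old versions of documents
    quicklook_cache: bool = False    # empty the QuickLook cache
    logs: bool = False               # clear the logs
    uniform_creation_date: bool = False


# Former case: sensitive documents kept on encrypted removable media.
REMOVABLE_MEDIA = ScrubOptions(quicklook_cache=True, logs=True)

# Latter case: minimise the forensic footprint of a batch of documents.
MINIMAL_FOOTPRINT = ScrubOptions(
    xattrs="all",
    spotlight=True,
    versions=True,
    quicklook_cache=True,
    logs=True,
    uniform_creation_date=True,
)


def enabled(options):
    """Return the names of the actions which a preset would perform."""
    return [name for name, value in asdict(options).items()
            if value not in (False, "leave")]
```

Calling enabled(REMOVABLE_MEDIA) lists just the QuickLook cache and the logs, while enabled(MINIMAL_FOOTPRINT) lists all six actions.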

This is what I have at the moment:

[Screenshot: scrub01]

Its current options include:

  • Stripping extended attributes (xattrs). Some types are easy pickings, and provide metadata which a user may well want to remove, such as the URL from which a downloaded file was obtained. However, other users might want to strip all xattrs, so this is a popup menu offering three different levels, from leaving them alone to removing everything.
  • Spotlight metadata. This uses the mdutil command to turn indexing off for that volume, and erase the current local store. There doesn’t seem to be any finer control, though.
  • Versions. This iterates through all the documents, and strips all old versions which can be removed, leaving just the current document. Version data are stored locally on each volume, but previous versions of documents can still contain all sorts of material that you wouldn’t want anyone else to see. Like removing xattrs, this is a very specific action which only affects those documents in the selected folder.
  • QuickLook cache. This is stored on the boot volume, and as I have explained previously, can leak potentially sensitive information beyond the safety of encryption. Turning caching off and emptying the existing cache is an effective and almost instant solution.
  • Unified and traditional logs. This erases them. Although the unified log does work hard to preserve privacy, we have already seen in High Sierra that some software can breach that protection. Erasing logs is not something that I would choose to do, but some may need to.
  • Setting all document creation dates to an old date. This prevents the use of those dates in evidence-gathering.
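To make some of those options more concrete, here is a minimal sketch in Python of how the xattr levels and the volume-wide actions might be turned into command lines. It only builds the commands and never runs them. The level names and function names are hypothetical, and the choice of com.apple.metadata:kMDItemWhereFroms and com.apple.quarantine as the download-related xattrs is my assumption. Apart from mdutil, which is named above, the commands are stand-ins using the standard macOS tools xattr, qlmanage and log, not necessarily what Scrub calls. Versions and creation dates are left out.

```python
# Assumption: the xattrs treated as download metadata at the middle level.
DOWNLOAD_XATTRS = (
    "com.apple.metadata:kMDItemWhereFroms",  # URL a file was downloaded from
    "com.apple.quarantine",                  # quarantine flag set on downloads
)


def xattrs_to_strip(names, level):
    """Pick which of a file's xattr names to remove at a given level.

    Levels (hypothetical names): "leave", "download", "all".
    """
    if level == "leave":
        return []
    if level == "download":
        return [name for name in names if name in DOWNLOAD_XATTRS]
    if level == "all":
        return list(names)
    raise ValueError(f"unknown xattr level: {level!r}")


def xattr_commands(path, names, level):
    """Build one 'xattr -d NAME PATH' command per xattr to be removed."""
    return [["xattr", "-d", name, path]
            for name in xattrs_to_strip(names, level)]


def volume_commands(volume, spotlight=False, quicklook=False, logs=False):
    """Build the volume-wide commands; several of these need root to run."""
    commands = []
    if spotlight:
        commands.append(["mdutil", "-i", "off", volume])  # turn indexing off
        commands.append(["mdutil", "-E", volume])         # erase the local store
    if quicklook:
        commands.append(["qlmanage", "-r", "cache"])      # reset the thumbnail cache
    if logs:
        commands.append(["log", "erase", "--all"])        # erase the unified log
    return commands
```

Keeping the choice of what to remove separate from the act of removing it means that the same functions can feed a report of what would be done, or be passed one at a time to subprocess.run when the user really does want to proceed.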

There are two areas where I don’t think Scrub can go.

The first is Notifications, as there is no global feature which can remove all notifications before a certain date and time, as far as I am aware, and apps cannot go round managing notifications posted by other apps.

The other is the file system events record, FSEvents. This doesn’t record what changed in the content of a file, only that a change occurred. It is of critical importance to Time Machine, which uses it to work out which documents have changed since the last backup, and it is used by a range of other apps and tools. I don’t know of any reliable way of managing it, so the only action would be to remove the whole database, which I think would be a foolish move.

I am currently adding those features listed above to an initial alpha release. It has two modes: audit, which simply checks what would be removed if you were to run a Scrub, and the Scrub itself, which can record its actions and findings in normal or verbose modes.
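The split between the two modes can be sketched as a single pass over the documents, in which only the Scrub mode is allowed to change anything. The Python below is a minimal illustration rather than Scrub’s own code: the function name, the report keys, and the two callables passed in (list_xattrs and remove_xattr, standing for whatever actually reads and removes xattrs) are all hypothetical, and only xattrs are covered.

```python
def run(documents, mode, list_xattrs, remove_xattr, verbose=False):
    """Audit or scrub the xattrs of some documents.

    mode is "audit" (report only) or "scrub" (report and remove).
    list_xattrs(doc) returns the xattr names of a document;
    remove_xattr(doc, name) removes one of them.
    """
    if mode not in ("audit", "scrub"):
        raise ValueError(f"unknown mode: {mode!r}")
    report = {"documents": 0, "with_xattrs": 0, "xattrs": 0,
              "removed": 0, "log": []}
    for doc in documents:
        report["documents"] += 1
        names = list(list_xattrs(doc))
        if names:
            report["with_xattrs"] += 1
        report["xattrs"] += len(names)
        for name in names:
            if mode == "scrub":
                # Only the Scrub mode ever modifies a document.
                remove_xattr(doc, name)
                report["removed"] += 1
            if verbose:
                verb = "removed" if mode == "scrub" else "would remove"
                report["log"].append(f"{verb} {name} from {doc}")
    return report
```

Because both modes share the same loop, an audit reports exactly the items which a later Scrub would act on, and passing the two callables in makes it easy to try the logic against a fake set of documents before letting it near real ones.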

Running it in Audit mode is already quite a surprising experience: as you’ll see in the screenshot, one fairly well-used folder in my Documents folder has over 500 MB of xattr data stored for around half of the documents there, and 833 of its documents have a total of more than 17,000 old versions between them. That’s an awful lot of material which could be studied by someone who gained access to that folder, or during a thorough forensic analysis.

Would you ever use this app? If so, what other features would you want to use? Please pass on your ideas and suggestions, either as comments here or by email.