Last Week on My Mac: Still struggling with snapshots

It’s strange to think that most of us have now been using APFS for well over four years, but it still surprises and baffles us. Much of this week I’ve been looking at snapshots, one way or another, whether they’re the vanilla variety which lives alongside the volume it has captured, or those cunningly created as backups by Time Machine.

It all started with what should have been a simple question which we’ve now learned not to ask unless we’re desperate: where did all the free space go on that disk? Last night, there was around 120 GB free, but this morning it’s down to less than 30 GB. What was that MacBook Pro up to in its sleep?


The last place to look, of course, is the Storage view in About This Mac, which as usual just writes off most of that disk to “System Data”, saying that it’s in use but macOS isn’t going to tell you what’s using it, so there’s nothing to be done about it. Because this is Monterey, we can at least rely on Disk Utility to tell us how much is taken by snapshots, and that turns out to be a major culprit, with two 25 GB snapshots made in the late evening of the previous day.
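If you’d rather check when local snapshots were made without opening Disk Utility, the names that `tmutil listlocalsnapshots /` prints carry their creation time. What follows is a minimal sketch, not a supported tool: it assumes names of the form `com.apple.TimeMachine.YYYY-MM-DD-HHMMSS.local`, which is what I’d expect on recent macOS, and the sample text is made up for illustration rather than captured from a real Mac. It can’t tell you sizes, which still need Disk Utility.

```python
import re
from datetime import datetime

# Assumed name format: com.apple.TimeMachine.YYYY-MM-DD-HHMMSS.local
SNAPSHOT_NAME = re.compile(
    r"com\.apple\.TimeMachine\.(\d{4}-\d{2}-\d{2}-\d{6})\.local"
)


def snapshot_times(listing):
    """Return creation times, oldest first, found in a snapshot listing."""
    return sorted(
        datetime.strptime(match.group(1), "%Y-%m-%d-%H%M%S")
        for match in SNAPSHOT_NAME.finditer(listing)
    )


# Hypothetical sample text standing in for real tmutil output.
sample = """Snapshots for disk /:
com.apple.TimeMachine.2021-12-10-221503.local
com.apple.TimeMachine.2021-12-10-231612.local
"""

for made in snapshot_times(sample):
    print(made.isoformat(sep=" "))
```

On a Mac, the listing could come from `subprocess.run(["tmutil", "listlocalsnapshots", "/"], capture_output=True, text=True).stdout` instead of the sample string.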

It’s at this point that we enter conceptually confusing territory. Knowing that those large snapshots coincided with Time Machine backups lures us into comparing those snapshots with Time Machine’s backups. After all, they’re both snapshots, aren’t they? Yes and no, and if you don’t make a clear distinction between them, you’ll quickly steer into further danger.

When Time Machine is set to back up a volume on APFS to backup storage on APFS, it provides you with two snapshots each hour. The first is a normal snapshot of the volume which is stored locally, alongside the volume itself, and the second is a synthetic snapshot used to create the backup on backup storage.

Both snapshots contain a copy of the whole of the file system for that volume at the instant that they’re made. Otherwise, they’re quite different.

The normal snapshot then accumulates all changed data for that volume over time. When a file on that volume is deleted, instead of its data being marked for re-use, it’s retained, and forms part of that snapshot. That applies to everything in that volume, including files in the Trash, and those in folders which aren’t backed up. This enables you to roll back the whole volume to the state it was in when the snapshot was made, and to retrieve folders and files which have since changed or been removed, even if they’re not included in the regular backup. Those snapshots are removed automatically after 24 hours, to prevent them from growing too large.
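To see why that eats free space, here’s a toy model of a volume with a single snapshot. It’s only an illustration of the bookkeeping described above, not how APFS is implemented: the class `ToyVolume`, its methods and the sizes are all invented, and it assumes that every write replaces the whole of a file’s data.

```python
import itertools


class ToyVolume:
    """Toy model of one volume with at most one snapshot (not real APFS)."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.extents = {}    # extent id -> size in GB, for all data still stored
        self.live = {}       # file name -> extent id currently in use
        self.pinned = set()  # extent ids held by the snapshot
        self._ids = itertools.count()

    def write(self, name, size_gb):
        old = self.live.get(name)
        new = next(self._ids)
        self.extents[new] = size_gb
        self.live[name] = new
        self._release(old)

    def delete(self, name):
        self._release(self.live.pop(name, None))

    def _release(self, extent):
        # Data is freed for re-use only if the snapshot doesn't hold it.
        if extent is not None and extent not in self.pinned:
            del self.extents[extent]

    def take_snapshot(self):
        self.delete_snapshot()  # this toy keeps only one snapshot at a time
        self.pinned = set(self.live.values())

    def delete_snapshot(self):
        # Free everything that only the snapshot was keeping.
        for extent in self.pinned - set(self.live.values()):
            del self.extents[extent]
        self.pinned = set()

    def free_gb(self):
        return self.capacity_gb - sum(self.extents.values())


vol = ToyVolume(500)
vol.write("vm.img", 50)
vol.write("documents", 330)
print(vol.free_gb())     # 120
vol.take_snapshot()
vol.write("vm.img", 50)  # rewritten: the old 50 GB stays in the snapshot
print(vol.free_gb())     # 70
vol.delete_snapshot()
print(vol.free_gb())     # 120
```

The file that costs you space isn’t necessarily a new one: rewriting an existing 50 GB file after the snapshot was made is enough to lose 50 GB until that snapshot goes.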

The synthetic snapshot created on the backup storage contains the same file system data covering the whole volume, but Time Machine copies across only the data it needs to turn that into a backup.

For example, on one volume which Time Machine backs up on my Mac, I’ve set the exclusions so that only one folder is actually backed up. The normal snapshot stored with that volume retains all data changed on that volume, which might include temporary files, versions in the macOS versioning system database, Spotlight indexes, and any Virtual Machines. But when Time Machine creates a backup of that, while its file system still contains directories for the whole volume, the only data it’s connected to is that for the single folder it’s backing up. Spotlight indexes, the versioning database and other folders are excluded by Time Machine’s defaults, as are all folders except the one it’s set to back up.

Unfortunately, what you’ll see in the Finder is misleading. Select one of those backups and inspect its size using Get Info. You’re then shown the size of the whole volume as would be expected for that snapshot, which could be hundreds of GB, although within that all you see is the single folder of much smaller size.

So, if you want to discover what made any given snapshot so large, looking at the matching backup is of no help unless it happens to contain the large items which swelled the size of that snapshot. Indeed, it demonstrates that it’s probably best not to call Time Machine’s backups snapshots in the first place. One tool which could help is a snapshot diff, something which Time Machine can do when determining which files need to be backed up, although it’s normally not used.
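The idea behind a diff can be sketched in a few lines. This isn’t Time Machine’s mechanism, which works on the file system itself; it’s a hypothetical comparison of two listings that you’d have to gather yourself, each mapping a path to its size in bytes and modification time. Anything that was present when the snapshot was made and has since been deleted or changed is data the snapshot has to keep. The function name and sample figures are invented for illustration.

```python
def retained_by_snapshot(at_snapshot, now):
    """List (size, path) for items the snapshot must retain, largest first.

    Both arguments map path -> (size_in_bytes, modification_time).
    """
    held = [
        (meta[0], path)
        for path, meta in at_snapshot.items()
        if now.get(path) != meta  # deleted since, or changed since
    ]
    return sorted(held, reverse=True)


# Hypothetical listings, not taken from a real volume.
at_snapshot = {
    "VMs/monterey.img": (40_000_000_000, 1),
    "Downloads/installer.dmg": (12_000_000_000, 1),
    "Documents/report.txt": (20_000, 1),
}
now = {
    "VMs/monterey.img": (40_000_000_000, 2),  # same size, modified since
    "Documents/report.txt": (20_000, 1),      # untouched
}

for size, path in retained_by_snapshot(at_snapshot, now):
    print(f"{size / 1e9:.1f} GB  {path}")
```

Treating any change as retaining the whole file overstates the cost for files which are only partly rewritten, so take the result as an upper bound.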

By the time that you’ve struggled through all this, over 24 hours will have elapsed and the offending snapshots will have been removed automatically, a feature over which you have absolutely no control. A few moments later, with a bit of luck, you should see free space jump to something more comfortable. At least, until the next time that macOS decides you need a huge snapshot, when you’ll start the cycle all over again.

Snapshots are wonderful tools, and Time Machine backing up to APFS is a great advance and a technical accomplishment. What they need, for both ordinary and advanced users, are interfaces which don’t condescend or confuse. The Storage view pats you on the head and reassures you that most of your disk has now been taken over by System Data, and it doesn’t really matter that free space is fast vanishing. We also need good documentation which is both conceptual and practical, so that we can understand and solve problems.

There’s an important lesson for us with respect to apps which make snapshots, like Time Machine and Carbon Copy Cloner. Where you have large files which change a great deal but don’t need to be backed up as frequently, don’t simply exclude them from your backups, but put them on a separate volume which doesn’t have snapshots taken. This could apply to folders of temporary and cached data, Virtual Machines, downloads perhaps, even databases and Photos libraries. There’s something to be said for a user-equivalent to the hidden VM volume.