Time Machine to APFS: Evolution

For anyone who uses Time Machine, one of the major improvements in Big Sur is that it can now make its backups to APFS volumes. In this series, I’m going to explore what this means, how it works, and how it can go wrong. To distinguish between what are effectively two different systems, I’m going to refer to Time Machine backing up to APFS as TMA, and when backing up to HFS+ as TMH. One other caution: although I’ve been testing and using TMA since last summer, knowledge of and experience with it remain limited. If you spot any errors, please don’t hesitate to correct me.

The Past

It’s easy to forget, but TMH was released in Mac OS X 10.5 not just as an enhancement for all users, but to support new hardware products, Time Capsules, which Apple introduced in January 2008, just three months after we first started using TMH.

From its release, TMH has been dependent on features of the HFS+ file system to create the illusion, in the Finder, that each backup is a complete copy of the source. Every hour the backup service examined the record of changes made to the file system since the last backup was made, using its FSEvents database. It worked out what had changed and needed to be copied into the backup. During the backup phase itself, it only copied across those files which had been created or changed since the last backup was made.
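
The idea of consulting a change record rather than rescanning the whole volume can be sketched in a few lines of Python. This is only an illustration: the FSEvent class, the in-memory log and the changed_dirs_since function are hypothetical names of my own, not the FSEvents API, and I'm assuming that each event carries an increasing ID and the path of a directory in which something changed.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FSEvent:
    """Hypothetical stand-in for one record in a change log."""
    event_id: int  # assumed to increase with every recorded change
    path: str      # directory in which something changed


def changed_dirs_since(log: Iterable[FSEvent], last_backup_event_id: int) -> List[str]:
    """Return the directories touched after the previous backup.

    Only these need to be examined for new or changed files; everything
    else is assumed to be unchanged since the last backup.
    """
    return sorted({e.path for e in log if e.event_id > last_backup_event_id})


# Example log: the last backup finished after event 2.
EXAMPLE_LOG = [
    FSEvent(1, "/Users/me/Documents"),
    FSEvent(2, "/Users/me/Pictures"),
    FSEvent(3, "/Users/me/Documents"),
    FSEvent(4, "/Users/me/Desktop"),
    FSEvent(5, "/Users/me/Documents"),
]
```

Calling changed_dirs_since(EXAMPLE_LOG, 2) returns just the Desktop and Documents folders, so Pictures, which hasn't changed since the last backup, needn't be examined at all.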

It still does this in TMH by using hard links in the backup, and Apple added a new feature to its HFS+ file system to support this: directory hard links. Where an entire folder has remained unchanged since the last backup, TMH simply creates a hard link to the existing folder in that backup. Where an existing file has been changed, though, the new file is written to the backup inside a changed folder, which in turn can contain hard links to unchanged contents.

This preserves the illusion that each backup consists of the complete contents of the source, but only requires the copying of changed files, and creation of a great many hard links to files and folders. It’s also completely dependent on the backup volume using the HFS+ file system, to support those directory hard links.

Without the directory hard link, backups would quickly be overwhelmed by hard links to files. If you had a million files and folders on the backup source volume, every hourly backup would have to create a million items in total: a copy of each changed file, and a hard link to each one which remained unchanged. Directory hard links enable the efficiency needed for this scheme to work.
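
To make the cost of the scheme without directory hard links concrete, here's a minimal Python sketch of a backup which copies changed files and hard-links unchanged ones. It isn't how Time Machine is implemented: the function names are my own, it decides what has changed by comparing file contents rather than consulting FSEvents, and it doesn't attempt directory hard links, so it creates one link for every unchanged file.

```python
import filecmp
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def hardlink_backup(source: Path, previous: Optional[Path], dest: Path) -> dict:
    """Back up source into dest, hard-linking files unchanged since previous."""
    counts = {"copied": 0, "linked": 0}
    for root, _dirs, files in os.walk(source):
        rel = Path(root).relative_to(source)
        # Every folder has to be recreated, as folders aren't linked here.
        (dest / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            src = Path(root) / name
            new = dest / rel / name
            old = previous / rel / name if previous is not None else None
            if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: a second name for the same data
                counts["linked"] += 1
            else:
                shutil.copy2(src, new)  # new or changed: a real copy
                counts["copied"] += 1
    return counts


def demo() -> tuple:
    """Make two backups of a three-file source, changing one file in between."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        source = base / "source"
        (source / "docs").mkdir(parents=True)
        (source / "docs" / "a.txt").write_text("alpha")
        (source / "docs" / "b.txt").write_text("bravo")
        (source / "c.txt").write_text("charlie")

        first = hardlink_backup(source, None, base / "backup1")
        (source / "c.txt").write_text("charlie, edited")
        second = hardlink_backup(source, base / "backup1", base / "backup2")
        # The unchanged file in both backups should be the very same file.
        shared = os.path.samefile(base / "backup1" / "docs" / "a.txt",
                                  base / "backup2" / "docs" / "a.txt")
        return first, second, shared
```

Running demo() shows that the second backup copies only the one changed file, but still has to create a link for each of the other two. Scale that up to a million items and the value of linking a whole unchanged folder in a single operation becomes clear.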

Apple later introduced what it called Mobile Time Machine, intended for laptops which could be away from their normal backup destination for some time. In around 10,000 lines of code, Mac OS X came to create something like a primitive snapshot, but on HFS+.

When Apple released the first version of APFS for macOS in High Sierra, its new snapshot feature was incorporated into TMH. Snapshots were initially used instead of the FSEvents database to determine what should be backed up.

Since then, making each backup of an APFS volume has involved creating a snapshot, but that is stored locally on the APFS volume being backed up. The structure of backups themselves didn’t change: they still required an HFS+ volume and used directory hard links.

Catalina introduced a more complicated scheme to replace snapshots as the normal means for determining what to back up. This was presumably because computing a snapshot delta proved slow, and because Catalina introduced the Volume Group, with specialist types of APFS volume for which snapshot deltas would be inappropriate or impossible.

Determining what gets backed up in Catalina can thus depend on snapshots, and snapshots are made during each backup, but because the backup destination remains in HFS+ format it can’t use snapshots itself, and instead still has to rely on directory hard links.

TMH

Big Sur retains the option to continue using TMH, with backups stored on HFS+ volumes, where it fixes one bug which for many made TMH unusable in Catalina: in Big Sur, TMH should no longer choke when trying to back up large hidden folders containing a volume’s version database, as those are now excluded from backups. They were in any case of limited value, as they could only be restored when a whole volume was being restored, and not when restoring only part of the volume.

TMA

TMA interestingly reverses the design of TMH in High Sierra: instead of using snapshots to determine what needs to be backed up before creating a backup using traditional hard links, most of the time TMA determines what’s changed using the traditional method with FSEvents, then creates its backup as a snapshot on the backup volume. The latter is essential, as without directory hard links, there’s no way of using the TMH method to make backups to an APFS volume.

TMA backup volumes are special: you can’t just use any convenient APFS volume to store backups. They use the case-sensitive variant of APFS, which normally isn’t used in macOS but is the standard for iOS/iPadOS. They also have volume ownership enabled, and are assigned the role of Backup.

Backups are scheduled using the same timer system involving DAS (Duet Activity Scheduler) and CTS (Centralized Task Scheduling). TMA (at least) might now bring forward a backup when there have been many events recorded in FSEvents: I believe that I have precipitated this when writing over 130 GB of test files during SSD benchmarking, for example.

As with TMH, several strategies are available to determine which files need to be backed up from each volume. Local snapshots are made and stored on each APFS volume to be backed up, then the decision is made as to which strategy to use. On most occasions, when backing up a Data volume or similar, this prefers the use of FSEvents, as used prior to High Sierra. Note that, unlike previous versions of Time Machine, in Big Sur neither TMH nor TMA can back up the System volume.

A detailed assessment of the items to be backed up is then made, to forecast the total size of the backup. Once that is complete, the local snapshot is copied to an .inprogress folder on the backup volume, and backup copying proceeds. Where possible, only changed blocks of files are copied, rather than having to copy the whole of every file’s data, an option termed delta-copying, which can lead to significant savings. Old backups are removed both according to age, and to maintain sufficient free space on the backup volume, in what TMA refers to as age-based and space-based thinning.
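
The two kinds of thinning can be sketched as follows. This is a guess at the logic, not TMA’s actual rules: the retention schedule I’ve used (every backup from the last 24 hours, one a day for the last 30 days, and one a week before that) follows what Time Machine’s settings have long described, and whether TMA applies exactly those rules is an assumption. The function names are mine, as is the idea that the space reclaimed by deleting each backup is known in advance.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def age_based_thinning(backups: List[datetime], now: datetime) -> List[datetime]:
    """Return the backups to keep under the assumed retention schedule."""
    keep = []
    seen_days = set()
    seen_weeks = set()
    for stamp in sorted(backups, reverse=True):  # work from newest to oldest
        age = now - stamp
        if age <= timedelta(hours=24):
            keep.append(stamp)                   # keep every recent backup
        elif age <= timedelta(days=30):
            if stamp.date() not in seen_days:    # newest backup of each day
                seen_days.add(stamp.date())
                keep.append(stamp)
        else:
            week = tuple(stamp.isocalendar()[:2])  # (ISO year, ISO week)
            if week not in seen_weeks:           # newest backup of each week
                seen_weeks.add(week)
                keep.append(stamp)
    return sorted(keep)


def space_based_thinning(
    backups: List[Tuple[datetime, int]], free: int, needed: int
) -> Tuple[List[Tuple[datetime, int]], int]:
    """Delete the oldest backups until there's enough free space.

    Each backup is (timestamp, space reclaimed by deleting it), which is a
    simplification: with snapshots, that figure depends on what is shared.
    Returns the backups which remain and the resulting free space.
    """
    kept = sorted(backups)
    while kept and free < needed:
        _stamp, size = kept.pop(0)  # the oldest backup goes first
        free += size
    return kept, free
```

Given hourly backups for the last two days, age_based_thinning keeps everything from the most recent 24 hours and only the newest backup from each earlier day. space_based_thinning is then a last resort which sacrifices the oldest backups first.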

Data copied to assemble the backup on the backup volume is formed into a snapshot, which is then used to present the contents of that backup both in the Time Machine app and the Finder. Those snapshots appear in /Volumes/.timemachine/ although they’re still stored on the backup volume.

One puzzling process which appears periodically is what’s shown as backup activity on the backup volume, when TMA isn’t making a backup. This appears to be housekeeping performed on the backup volume, including Spotlight indexing of the snapshots stored there.

TMA appears to be considerably quicker and more efficient than TMH, and doesn’t seem prone to the choking problems which have troubled recent versions of TMH. However, its snapshot system is less flexible than the hard links of TMH. In Big Sur, TMA’s snapshots aren’t whole-volume, only including the folders which are being backed up. Unlike in TMH, at present it doesn’t appear possible to delete any of the contents of an existing backup, only the whole of the backup stored in any given snapshot. These new part-volume snapshots don’t yet appear to be available to users.

The end result is, in almost every respect, superior to TMH, and Apple rightly recommends that users prefer it.