How sparse files can explode backups

Two of the commonest problems with making backups, whether you use Time Machine or not, are that the process takes far too long, and that the backups take too much space. Thanks to a report from Andrew, I’ve realised that there are circumstances in which a single file can cause both, bringing Time Machine to its knees, if the file in question happens to be a sparse file.

Sparse files are a new feature in APFS, and on Macs didn’t exist before High Sierra. They’re generally thought to be rare but, like anything unusual, they tend to catch you unawares, and in this case can create havoc. This is because of their dual nature: a sparse file is one in which much of the content is empty, with just a relatively small amount of real data within it. Further technical details are given here.

Let’s say for a moment that, somewhere on your Mac, there’s a sparse file containing 10 GB of void, and less than 1 KB of real data. Provided that the app which creates this does so according to the rules, that file will have a ‘size’ of just over 10 GB, but will only take a few KB of disk space, because that’s how sparse files work in APFS.
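To make that difference between a file’s ‘size’ and its space on disk concrete, here’s a minimal Python sketch. It assumes a file system which supports sparse files (such as APFS) and relies on the POSIX `st_blocks` field; the function names `make_sparse_file` and `sizes` are my own for illustration, and it uses a 10 MB gap rather than 10 GB so that it’s harmless on a volume which can’t store the file sparsely.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes, data=b"real data"):
    """Write `data` after a gap of `hole_bytes` that is never written.

    On a file system that supports sparse files, the gap should take
    little or no disk space; elsewhere it is filled with stored zeros.
    """
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # move past the gap without writing to it
        f.write(data)


def sizes(path):
    """Return (logical_size, allocated_size) in bytes."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on macOS and Linux
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sparse.bin")
        make_sparse_file(path, 10 * 1024 * 1024)
        logical, allocated = sizes(path)
        print(f"size: {logical} bytes, on disk: {allocated} bytes")
```

If the volume stores the file sparsely, the second figure should be far smaller than the first; if it doesn’t, the two will be roughly equal.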


When Time Machine wants to back that file up to an HFS+ disk, it checks what size it is on disk, which is only 8 KB. Because Time Machine backing up to HFS+ copies new and changed files complete, and HFS+ doesn’t support sparse files anyway, when it comes to copy the sparse file all 10 GB is copied across, giving the backup a 10 GB headache. What took only 8 KB of space on the source disk has now expanded to require a full 10 GB in the backup, without a single warning. Now increase the size of that sparse file to 100 GB or more, and you can see how Time Machine’s backups can readily grind to a halt and run out of space when trying to back up your 8 KB sparse file.

The only way to avoid this with Time Machine – or anything else – backing up to HFS+ is to exclude the sparse file from your backups.
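Before you can exclude sparse files, you have to find them. The following is a rough Python sketch of one way to look for candidates, by walking a folder and reporting files which occupy much less disk space than their size suggests. It isn’t how Time Machine or any particular utility does this: the function name `find_sparse_candidates` and the `ratio` and `min_size` thresholds are assumptions of mine, and files which the file system has compressed can also pass this test, so treat the results as candidates to check rather than confirmed sparse files.

```python
import os
import stat


def find_sparse_candidates(root, ratio=0.5, min_size=1024 * 1024):
    """Return (path, logical, allocated) for files that look sparse.

    A file is reported when it is at least `min_size` bytes long and
    its allocated space is less than `ratio` times its logical size.
    Both thresholds are arbitrary choices for illustration.
    """
    found = []
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                st = os.lstat(path)  # don't follow symbolic links
            except OSError:
                continue  # unreadable or vanished; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # only regular files can be sparse
            logical = st.st_size
            allocated = st.st_blocks * 512  # 512-byte units
            if logical >= min_size and allocated < logical * ratio:
                found.append((path, logical, allocated))
    return found


if __name__ == "__main__":
    for path, logical, allocated in find_sparse_candidates(os.getcwd()):
        print(f"{path}: size {logical}, on disk {allocated}")
```

Anything this lists could then be checked and, if it does turn out to be a large sparse file, added to Time Machine’s exclusion list.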

We now have another way, though, thanks to the new ability in Big Sur to make Time Machine Backups to APFS (TMA).


Not only does TMA spot that this is a sparse file and create a sparse file in its backup, but it isn’t daft enough to try copying the whole file across either. This is reflected in the file sizes given for the backup, although BackupLoupe doesn’t seem aware of this difference, and still reports that the file occupies 10 GB.


Prior to Big Sur, all my searches for sparse files in the wild had drawn a blank, and I presumed that they were pretty unusual. I’ve recently enhanced my utility Sparsity, which can create sparse files, so that it can also hunt them down. Although they’re not common in Big Sur, they certainly seem to be on the rise. I’ll write more of this when it’s ready to distribute. For the moment, at least we know that it’s safe to back them up to APFS.