Explainers: Sparse things

Over the last couple of weeks, the word sparse has cropped up a lot here, in two very different contexts, leading to some confusion. This brief article draws a distinction between its two main uses, and adds a third you might come across too.

Sparse bundles

A sparse bundle isn’t a file at all, but a folder in disguise, and is a form of disk image: when you mount it in the Finder, it presents itself as an additional volume, in which you can save files. These are often required to store Time Machine backups on networked storage, and there (just to confuse) may be named .sparseimage, although their normal extension is .sparsebundle. You can also create your own, for instance using my free utility Spundle.

Sparse bundles have a fixed structure. If you open one in the Finder, you’ll find an Info.plist file, its backup, a token, and a folder named bands. The bands folder contains all the individual band files which the sparse bundle uses to store data in its virtual file system. Sparse bundles provide a very flexible way to store volumes of files in Mac native formats, including HFS+ and APFS, on other file systems. Unlike some other types of disk image, they can grow and (when maintained) shrink in size according to what you store in them.
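To make that structure more concrete, here’s a minimal Python sketch which looks inside a sparse bundle folder without mounting it. It relies only on the layout described above: an Info.plist file and a bands folder. The function name summarize_sparse_bundle is my own invention, and the band-size key is an assumption about what Info.plist contains, so the sketch returns None for it if that key is absent.

```python
import plistlib
from pathlib import Path


def summarize_sparse_bundle(bundle_path):
    """Return basic facts about a sparse bundle folder without mounting it."""
    bundle = Path(bundle_path)

    # Info.plist is a property list describing the bundle.
    with open(bundle / "Info.plist", "rb") as f:
        info = plistlib.load(f)

    # Each regular file inside the bands folder is treated as one band file.
    bands = [p for p in (bundle / "bands").iterdir() if p.is_file()]

    return {
        "band_count": len(bands),
        "bytes_in_bands": sum(p.stat().st_size for p in bands),
        # Assumed key name; None if this bundle doesn't record it.
        "band_size": info.get("band-size"),
    }
```

Because a sparse bundle only holds band files for the parts of the volume that are in use, the band count gives a rough idea of how much of the virtual volume is occupied.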

One potentially confusing feature of sparse bundles is that individual band files can also be sparse files!

Sparse files

Almost any file on your Mac, if it’s stored on an APFS volume, can be a sparse file: one stored in a way which makes the most efficient use of storage space when it contains significant quantities of empty or void data. This applies only to files, though: there’s no such thing as a sparse folder.

You have no control over whether any given file is stored as a sparse file: it’s determined by the apps which create and save to it, together with the APFS file system. Apps which simply write out gigabytes of zeroes to fill blank space in a file, such as a database, won’t be using sparse files. Instead, the app needs to skip over the voids in the data, and only write out the bytes which aren’t zeroes.
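As a rough illustration of what skipping over the voids means, this Python sketch writes some data, moves the file position forward without writing anything, then writes more data. The function name write_with_void is hypothetical. Whether the result is actually stored as a sparse file remains up to the file system, so treat this as a sketch of the technique rather than a guarantee.

```python
import os


def write_with_void(path, head, void_length, tail):
    """Write head, skip void_length bytes without writing them, then write tail."""
    with open(path, "wb") as f:
        f.write(head)
        # Move the file position forward instead of writing out zeroes.
        f.seek(void_length, os.SEEK_CUR)
        f.write(tail)
    # The logical size includes the void, whether or not it occupies storage.
    return os.path.getsize(path)
```

When read back, the skipped region appears as zeroes, so other apps see the same contents as if the zeroes had been written out in full.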

You can create test sparse files, and check for existing ones, using my free utility Sparsity (Big Sur only, I’m afraid). Once thought to be rarities, sparse files are increasingly common in macOS and can be used in disk images and the band files in sparse bundles to make more efficient use of storage. But they don’t really give the user any more space, as they could be expanded to their full size.
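If you’d like a rough check of your own, one heuristic is to compare a file’s logical size with the storage actually allocated to it. The Python sketch below isn’t how Sparsity works; the function names are hypothetical, and it assumes a Unix-like system where st_blocks is reported in units of 512 bytes. Other features, such as file system compression, can also make a file occupy less storage than its logical size, so a result of True is only a hint.

```python
import os


def is_sparse_by_sizes(logical_bytes, allocated_bytes):
    """A file looks sparse when it occupies less storage than its logical size."""
    return allocated_bytes < logical_bytes


def looks_sparse(path):
    """Heuristic check using the sizes reported by the file system."""
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units (Unix-like systems only).
    return is_sparse_by_sizes(st.st_size, st.st_blocks * 512)
```

The logical size is also the space the file would need if it were ever expanded to its full size, which is why sparse files don’t really gain you free space.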

Sparse files are different from APFS clones, in which one file is copied or duplicated, and for the moment shares space with the original. Clones are even more common than sparse files.

Sparse matrices

You’re far less likely to come across these, although they are used in some maths, AI, and graphics. A matrix is an array of numbers in two or more dimensions. It’s common for matrices to contain mainly zeroes, with relatively few non-zero values. As these matrices can be huge, macOS has advanced routines which work with these sparse matrices, storing them efficiently but still allowing access to their values. They might be stored in sparse files, but that’s quite a separate issue.
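To show how a sparse matrix can be stored efficiently while still allowing access to its values, here’s a small Python sketch of one common scheme, compressed sparse row. It’s purely an illustration with names of my own choosing, and I’m not claiming this is what the routines in macOS do internally.

```python
def to_csr(dense):
    """Convert a list of rows into compressed sparse row form."""
    values, columns, row_starts = [], [], [0]
    for row in dense:
        for col, x in enumerate(row):
            if x != 0:
                # Keep only the non-zero value and the column it came from.
                values.append(x)
                columns.append(col)
        # Record where the next row's entries begin in values.
        row_starts.append(len(values))
    return values, columns, row_starts


def get(csr, row, col):
    """Look up one element; anything not stored is zero."""
    values, columns, row_starts = csr
    for i in range(row_starts[row], row_starts[row + 1]):
        if columns[i] == col:
            return values[i]
    return 0
```

For a 3 × 3 matrix with only three non-zero values, to_csr stores just those three values with their column numbers and four row offsets. The saving is trivial at that size, but grows rapidly as the matrix gets larger and the proportion of zeroes rises.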

I think those are all the uses of sparse that you’re likely to come across on Macs.