Lightweight VMs are sparse files, and how to keep them compact

You never know where macOS is going to use APFS features like sparse files next. We initially assumed these would be unusual, and later came to understand how they’re used by some databases, but they’re now becoming more commonplace still. Use a macOS or Linux lightweight virtual machine (VM) on an Apple silicon Mac, for example, and that’s probably running on a sparse file or two.


They’re easy to find: select a VM in the Finder and Get Info (Command-I) for it. If you notice a disparity between the Size, in my example 141 GB, and the space taken on disk, a mere 27.98 GB, then that VM has at least one sparse file inside it. In both lightweight macOS and GUI Linux VMs, the boot disk Disk.img is normally sparse, and the macOS AuxiliaryStorage component is also a sparse file.
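The comparison that Get Info makes can also be sketched in Python, as a rough illustration rather than a definitive tool. It assumes a file system where `os.stat` reports `st_blocks` in 512-byte units, as macOS and Linux do. The function names and the bundle path are hypothetical.

```python
import os

BLOCK_UNIT = 512  # assumption: st_blocks counts 512-byte units (macOS, Linux)


def is_sparse(logical_bytes, allocated_bytes):
    """A file looks sparse when less space is allocated than its logical size."""
    return allocated_bytes < logical_bytes


def find_sparse_files(bundle_path):
    """Walk a VM bundle and list (path, logical, allocated) for sparse-looking files."""
    found = []
    for folder, _subfolders, names in os.walk(bundle_path):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.islink(path):
                continue  # symbolic links have no storage of their own to compare
            info = os.stat(path)
            allocated = info.st_blocks * BLOCK_UNIT
            if is_sparse(info.st_size, allocated):
                found.append((path, info.st_size, allocated))
    return found


if __name__ == "__main__":
    # Hypothetical path: substitute the VM bundle you want to inspect.
    for path, logical, allocated in find_sparse_files("/path/to/MyVM.bundle"):
        print(f"{path}: {logical / 1e9:.2f} GB logical, "
              f"{allocated / 1e9:.2f} GB on disk")
```

Other features, such as file system compression, can also make a file occupy less than its logical size, so treat anything this reports as a candidate rather than proof.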

My most spectacular examples of sparse storage are 21.47 GB of Disk.img taking just 1.58 GB on disk for a file sparsity ratio of about 14, and a hefty 128.85 GB Disk.img taking 15.89 GB for a ratio of about 8. AuxiliaryStorage is smaller and less impressive, typically with 33.6 MB taking 17.8 MB, a ratio of just under 2.
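For anyone wanting to check the arithmetic, the ratio is simply the logical size divided by the space taken on disk. This short sketch uses the figures quoted above. The function name is my own, for illustration only.

```python
def sparsity_ratio(logical_size, on_disk_size):
    """Logical size divided by the space actually taken on disk (same units)."""
    return logical_size / on_disk_size


# Sizes as quoted in the text: (logical, on disk)
examples = {
    "Disk.img, 21.47 GB": (21.47, 1.58),
    "Disk.img, 128.85 GB": (128.85, 15.89),
    "AuxiliaryStorage, 33.6 MB": (33.6, 17.8),
}

for name, (logical, on_disk) in examples.items():
    print(f"{name}: ratio {sparsity_ratio(logical, on_disk):.1f}")
# Disk.img, 21.47 GB: ratio 13.6
# Disk.img, 128.85 GB: ratio 8.1
# AuxiliaryStorage, 33.6 MB: ratio 1.9
```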

This is an interesting and significant design decision for lightweight virtualisation, and what appears to be a new type of disk image. Existing variants of UDIF disk images can be compressed, but I’m not aware of any that are stored as sparse files. The documented formats that support sparse storage (sparse bundles and sparse disk images) aren’t themselves sparse files, but use other techniques such as band files to minimise their storage use.

In theory, writing to a sparse file could achieve higher performance than other types of disk image, although the current implementation may need further optimisation to realise that. It will also be interesting to see if Apple makes this format available to developers and users through Disk Utility and hdiutil.

Safe movement

These and other sparse files raise the perennial question of how you can move them around without causing them to explode to full size. Although macOS Ventura 13.1 is much better than older versions at preserving sparse files, here are some guidelines derived from testing.

Only ever move or copy sparse files to APFS volumes, unless you protect them inside some form of archive. If they move to another file system, like HFS+, any Windows/PC file system, ZFS or Btrfs, then their sparseness won’t be preserved, and they’ll assume their full size when you copy them. This also prolongs the act of copying, as they’re read from the source at full size. If being transferred over a network, this is annoying to say the least.

Copying or moving sparse files between two APFS volumes connected to the same Mac, even on different disks, should always preserve their size. Although file sharing between Macs should also preserve their size, never use AirDrop, which doesn’t.

Surprisingly, copying or moving a sparse file to or from iCloud Drive doesn’t cause any increase in size, provided that it’s only accessed by that one Mac. If you try to copy or move a sparse file from iCloud Drive to a different Mac, the sparse file will then increase to full size.

If you need to move a sparse file by a means that would normally expand it, copy it into the protection of an APFS-format sparse bundle or sparse disk image for transit. On the destination Mac, mount that sparse bundle or disk image, and you should be able to copy the sparse file to that Mac without any increase in size.
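As a minimal sketch of preparing such a container, the following Python builds an `hdiutil create` command for an APFS sparse bundle. The image path, maximum size and volume name are hypothetical placeholders, and the command only runs on macOS. The resulting sparse bundle still has to be mounted and the sparse file copied into it as described above.

```python
import subprocess


def sparsebundle_command(image_path, max_size="200g", volume_name="Transit"):
    """Build an hdiutil command that creates an APFS-format sparse bundle.

    max_size is the upper limit the bundle may grow to, not the space
    it takes when first created.
    """
    return [
        "hdiutil", "create",
        "-size", max_size,
        "-type", "SPARSEBUNDLE",
        "-fs", "APFS",
        "-volname", volume_name,
        image_path,
    ]


def create_sparsebundle(image_path, max_size="200g", volume_name="Transit"):
    """Run the command; raises CalledProcessError if hdiutil reports failure."""
    subprocess.run(sparsebundle_command(image_path, max_size, volume_name),
                   check=True)


if __name__ == "__main__":
    # Printed rather than run, so the sketch is harmless on any system.
    print(" ".join(sparsebundle_command("transit.sparsebundle")))
```

Pick a maximum size comfortably larger than the full, expanded size of the sparse file, in case it does expand on its way in.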

Archiving and compression are even more complicated. In general, popular methods such as Zip won’t preserve size, and some result in archived or compressed files that are even larger than the fully expanded size. The one format which appears able to keep sparse files sparse is Apple Archive, but even there it’s easy to end up with the file at full size. Archive Utility, Keka and my own Cormorant will reliably conserve the sparse file format and size when the file(s) are put inside a folder prior to compression. Use them on a bare file and you could be disappointed with the result.

Backing up sparse files can be tricky. Time Machine backing up from APFS to APFS backup storage normally preserves sparse files well. When backing up to HFS+ storage, though, they will explode to full size during copying, so they take a long time to copy and occupy a lot of space on the backup storage. Other backup methods should be tested carefully before assuming they can both store and restore sparse files faithfully. You should in any case avoid backing up VMs in your regular backups because of the space they will consume, even when kept as sparse files.
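One way to test a backup or transfer method is to create a small sparse test file, put it through a full round trip, and compare sizes afterwards. The Python below is only a sketch of that idea: the function names are hypothetical, it assumes `st_blocks` is reported in 512-byte units as on macOS and Linux, and it assumes that seeking past the end of a new file before writing leaves a hole, as file systems that support sparse files normally do.

```python
import os
import tempfile


def make_sparse_test_file(path, logical_size=16 * 1024 * 1024):
    """Create a file of logical_size bytes by writing one byte at its end."""
    with open(path, "wb") as f:
        f.seek(logical_size - 1)
        f.write(b"\0")


def size_report(path):
    """Return the logical size and the space allocated on disk, in bytes."""
    info = os.stat(path)
    return {"logical": info.st_size, "allocated": info.st_blocks * 512}


def still_sparse(original, restored):
    """The restored copy keeps its logical size and still occupies less than that."""
    return (restored["logical"] == original["logical"]
            and restored["allocated"] < restored["logical"])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        test_file = os.path.join(folder, "sparse-test.img")
        make_sparse_test_file(test_file)
        print(size_report(test_file))
```

Take a report on the test file before backing up, restore it, then pass both reports to `still_sparse`. A result of `False` suggests the method expanded the file somewhere along the way.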

To keep sparse files sparse:

  • store them only on APFS volumes;
  • transfer over a network using macOS file sharing;
  • never use AirDrop, and avoid iCloud Drive for moving them between Macs;
  • when needed, protect them inside sparse bundles or sparse disk images;
  • compress inside a folder using Apple Archive;
  • back up using Time Machine to APFS.