Inside Apple Archive: more than a compression format

This week I’m moving away from fractious external storage and looking in more detail at Apple Archive (AA), sometimes written AppleArchive without the space, a compression system first opened up in Big Sur and enhanced with encryption in Monterey. This article is a general introduction before I dive deeper into how you can get the best from Apple Archive.

Apple first introduced AA’s primary compression scheme LZFSE (for Lempel-Ziv Finite State Entropy) back in OS X 10.9, and provided limited access for disk image compression in OS X 10.11. Because it has become extensively used in Apple’s operating systems, considerable work has been put into optimising its performance, particularly on ARM systems, where it’s thought to have the benefit of hardware assistance.

AA was first opened up fully to third parties in Big Sur, and is classed as part of Apple’s Accelerate libraries for high-performance computation. The most recent account of AA in its current incarnation was that given by Jonathan Hogg at WWDC 2021, when he introduced Apple Encrypted Archives.

AA is now the basis for compression and decompression provided by Archive Utility, which is tucked away in /System/Library/CoreServices/Applications. It’s an app which is so useful that I routinely put it into the Dock to ensure I have direct access. I normally use Archive Utility to work with Zip archives, but it can also create and expand AA archives, although it doesn’t appear to support encryption yet.

Monterey also comes with three command-line tools which provide good access to most of AA’s features.

  • aa is the grandfather, which has a wide range of verbs for archiving folders and files, extracting them, verification and conversion
  • compression_tool provides rather simpler access to AA’s compression and decompression features, including choice of algorithm and control over the number of threads used
  • aea is the primary tool for working with Apple Encrypted Archives, with an extensive set of options for both compression and encryption.

Archives compressed using AA’s LZFSE may have the obvious .lzfse extension, or the older .aar, and by default are opened and immediately decompressed by Archive Utility.

Writing your own code using AA is exceptionally well supported by examples covering most use cases, and those have now been supplemented with examples using encryption. Although its documentation still has plenty of gaps, the examples should make it straightforward to incorporate AA into third-party apps.

AA using LZFSE compression has the following advantages when working with Apple systems:

  • it supports the full suite of HFS+ and APFS file attributes
  • it supports extended attributes (xattrs), preserving those so designated
  • it’s multithreaded for optimum performance
  • it supports APFS special files, including clone files and sparse files, although currently not in decompression
  • it has optional support for error correction, digests and manifests
  • it’s fully streamable.

If you compress APFS special file types such as clones and sparse files, unlike with other compression tools, those files don’t explode to their full size for compression, and their space-efficient state is preserved in the archive. If you were to compress a folder containing one full 1 GB file, a clone of that, and a sparse file requiring only 4 MB to store a whole file of 10 GB, the size of the archive would be little larger than that required for the compressed 1 GB file, as the clone and sparse files would occupy next to nothing.
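
To make the difference between a sparse file’s nominal size and the space it actually occupies more concrete, here is a minimal Python sketch. It doesn’t use AA at all: it simply creates a sparse file and compares its logical size with the storage allocated to it. The function names and sizes are hypothetical, chosen only for illustration, and whether the hole really is left unallocated depends on the file system in use, so treat the printed figures as indicative.

```python
import os
import tempfile

def make_sparse_file(path, logical_size, payload):
    """Write payload at the start, then extend the file to logical_size.

    Hypothetical helper: the extension is done with truncate(), so no data
    is written for the remainder, which a file system supporting sparse
    files can leave as an unallocated hole.
    """
    with open(path, "wb") as f:
        f.write(payload)
        f.truncate(logical_size)

def sizes(path):
    """Return (logical size, allocated size) in bytes for path."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units on POSIX systems
    return st.st_size, st.st_blocks * 512

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sparse.bin")
    # 4 KB of real data in a file with a logical size of 100 MB
    make_sparse_file(path, 100 * 1024 * 1024, b"x" * 4096)
    logical, allocated = sizes(path)
    print(f"logical size: {logical} bytes, allocated on disk: {allocated} bytes")
```

A tool that reads such a file naively sees the full logical size, which is the explosion that AA avoids when compressing.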

Unfortunately, this doesn’t hold good when that archive is decompressed, which writes the clone and sparse files out at their full size. However, their decompression is faster than if those files had been exploded before compression.

The other notable limitation is that, while AA LZFSE archives can transit through other systems, for instance as email attachments or stored on servers, they can’t be decompressed normally on non-Apple operating systems at present.

By default, when using AA for compression or decompression, it creates one thread for each available core, whether the processor is Intel or ARM. This delivers good if not excellent performance, but can be disruptive when handling large archives. compression_tool in particular lets the user set the number of threads to be used, and third-party apps using AA can also give the user that control.
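
The idea of defaulting to one thread per core, while letting the user cap that number, can be sketched in a few lines of Python. This is not AA’s implementation or its API: it uses zlib from the Python standard library in place of LZFSE, and compress_blocks, decompress_blocks, the block size and the sample data are all assumptions made for the sake of the example.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_blocks(data, block_size=1 << 20, threads=None):
    """Compress data as independent blocks using a pool of worker threads.

    threads=None imitates the default described above, one worker per core
    reported by the system; pass a smaller number to leave cores free.
    """
    if threads is None:
        threads = os.cpu_count() or 1
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in block order, whichever worker finishes first
        return list(pool.map(zlib.compress, blocks))

def decompress_blocks(compressed):
    """Reassemble the original data from the compressed blocks."""
    return b"".join(zlib.decompress(block) for block in compressed)

data = b"Apple Archive " * 500_000
compressed = compress_blocks(data, threads=2)  # user-imposed cap of two workers
assert decompress_blocks(compressed) == data
```

Capping the pool trades some speed for a machine that remains responsive while a large archive is being processed, which is the control that compression_tool offers.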

When run at the high Quality of Service settings typical of most apps, the default of one thread per core means that, for the duration of the task, AA will fully occupy all the cores of an Intel processor or ARM chip. This is readily seen in Activity Monitor’s CPU History window, which is my point of departure for the next article, in which I explore AA’s performance.