Quantum computing and APFS: free and used space

On SSDs, most users now seem fairly happy with APFS, but there’s one repeated cause of complaint and confusion: inconsistencies in measuring free and used space. This article looks at why this is so difficult, and how to arrive at the best figures.

In the old Mac Extended File System, HFS+, there is little doubt about how much of a disk is used, and how much remains free. Some small anomalies appear on occasion, but with volumes of fixed size and files having definite size requirements, there isn’t much scope for error. The only slight exception is sparse bundles, which can change size as required, but at any given moment they occupy a known amount of disk space.

APFS poses five significant problems in measuring free space.

Volumes share the space in a Container

In HFS+, volumes have a fixed size, although you can often adjust partitioning without destroying data. When you divide a 1 TB drive into two volumes of 500 GB, each volume has that fixed size, and can’t steal or borrow free space from the other.

In APFS, volumes share the total space within a Container (which does have fixed size). Most APFS disks have a single Container, so when the space used by one of the volumes within that container grows, the free space available to each of the volumes is reduced. If a 1 TB drive has a single Container with two APFS volumes inside it, when the space taken by one increases by 100 GB, the free space available to each of those two volumes falls by 100 GB.
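That arithmetic can be sketched in a few lines of Python. This is only a toy model of the accounting, not of how APFS stores anything: the Container class, the volume names Work and Media, and their starting sizes are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Toy model of an APFS Container: fixed capacity, shared by its volumes."""
    capacity_gb: int
    used_gb: dict = field(default_factory=dict)  # volume name -> GB used (hypothetical)

    def free_gb(self) -> int:
        # Every volume sees the same figure: capacity minus space used by ALL volumes.
        return self.capacity_gb - sum(self.used_gb.values())


# A 1 TB Container holding two volumes; names and sizes are invented.
container = Container(capacity_gb=1000, used_gb={"Work": 200, "Media": 300})
before = container.free_gb()        # free space as seen from either volume
container.used_gb["Media"] += 100   # one volume grows by 100 GB
after = container.free_gb()         # both volumes now see 100 GB less
print(before, after)
```

Neither volume has a free space figure of its own; there is one pool, and growth in either volume drains it for both.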

In normal use, this doesn’t have any significant effect in Mojave. Your boot disk has a single Container with at least four separate volumes, of which your visible volume Macintosh HD is the only one of significant size, and likely to grow with use. If Apple suddenly grew Recovery volumes from their present 520 MB size to several GB, you would notice a reduction in free space available to Macintosh HD. And on external drives with multiple APFS volumes, this needs to be borne in mind.

Clone files

In HFS+, when you copy any file, the whole of its contents are copied into a new file, with a separate directory entry. That’s it.

In APFS, when you copy a file within the same volume, the file system creates a new directory entry for the new file, but doesn’t copy any of its data until either copy is changed. From then on it saves only the changed data, so the two copies share whatever they still have in common; only if they diverge completely do they end up occupying completely separate disk space.

It’s easy to demonstrate this on an APFS volume: find a huge file, and duplicate it, or drag-copy it to a different folder. This happens instantly, unlike copying between different volumes.

This is a really neat trick, but causes confusion when estimating free and used space. To demonstrate that, I took a 6 GB file and made 19 copies of it in one folder.

The Finder then reports that the folder contains 20 items, occupying a total of nearly 120 GB. However, those copies made no difference at all to the volume space reported by Disk Utility and diskutil in Terminal. When those copies are changed, though, the space required to save those changes will grow, potentially until completely separate copies are required.
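The two ways of counting can be illustrated with a deliberately crude Python model. It assumes each file's data lives in a single extent which clones share until one of them is completely rewritten; the ToyVolume class and the file names are hypothetical, the 6 GB size is rounded from my test, and real APFS works at a much finer grain than this.

```python
GB = 1000**3  # decimal gigabytes


class ToyVolume:
    """Crude copy-on-write model: clones share one extent until rewritten."""

    def __init__(self):
        self.files = {}    # file name -> id of the extent holding its data
        self.extents = {}  # extent id -> size in bytes
        self._next_id = 0

    def write(self, name, size):
        # Writing (or completely rewriting) a file gives it a fresh extent of its own.
        self._next_id += 1
        self.extents[self._next_id] = size
        self.files[name] = self._next_id

    def clone(self, src, dst):
        # A clone is just another directory entry pointing at the same extent.
        self.files[dst] = self.files[src]

    def finder_total(self):
        # Adds up the size of every file, so shared data is counted once per file.
        return sum(self.extents[e] for e in self.files.values())

    def space_used(self):
        # Counts each extent once, however many files refer to it.
        return sum(self.extents[e] for e in set(self.files.values()))


vol = ToyVolume()
vol.write("original.mov", 6 * GB)
for i in range(1, 20):                      # 19 clones, 20 items in all
    vol.clone("original.mov", f"copy{i}.mov")

finder_before = vol.finder_total() // GB    # 20 files x 6 GB
used_before = vol.space_used() // GB        # one shared extent

vol.write("copy1.mov", 6 * GB)              # one copy is completely rewritten
used_after = vol.space_used() // GB         # now two extents are in use
print(finder_before, used_before, used_after)
```

The folder total never changes in this model, while the space actually used creeps up from 6 GB towards 120 GB as more of the copies are rewritten.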

So which is right: Finder or the disk utilities? The quantum computing answer is both, depending on what you want to know, although I personally don’t like answers as given by the Finder, which are simply based on adding together the file sizes reported for the contents of that folder. Unless Apple changes the way that this is reported by macOS, we’ll just have to live with this inconsistency.

Snapshots

In HFS+, Mobile Time Machine makes ‘snapshots’ when it is enabled, and those are regular collections of files and folders with easily measured sizes. Because there is no file system support for snapshots, this backup mechanism requires many thousands of lines of code, but it doesn’t hide away any of your storage.

In APFS, snapshots are made as part of Time Machine backups, on certain occasions such as just before installing a macOS update, and when the user initiates them. When an APFS snapshot is made, the file system makes a complete copy of its metadata, which is very quick indeed and doesn’t involve copying any file data.

However, to preserve all the files at the moment that the snapshot is made, as those files subsequently change, their original data are retained so long as the snapshot is kept. Let’s say that, in one snapshot, there’s a certain file of 1 GB in size, which then changes completely so the whole 1 GB is rewritten. So long as that snapshot is retained, its original 1 GB of data is retained, as well as its new 1 GB. So although the snapshot itself doesn’t take up much space, it stops a lot of old data from being freed up for reuse.
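The 1 GB example can be written out as a small Python sketch. It assumes a snapshot is nothing more than a frozen copy of the table mapping files to extents; the file name, the extent labels and the space_held function are invented for illustration and don't correspond to any APFS structure.

```python
GB = 1000**3

# Live file system metadata: file name -> (extent id, size). Names are hypothetical.
live = {"movie.dat": ("extent-A", 1 * GB)}

# Taking a snapshot copies the metadata only; no file data is copied.
snapshot = dict(live)

# The file is then completely rewritten, so the live entry points at new data.
live["movie.dat"] = ("extent-B", 1 * GB)


def space_held(*views):
    """Bytes in extents referenced by at least one of the given metadata views."""
    extents = {ext: size for view in views for ext, size in view.values()}
    return sum(extents.values())


with_snapshot = space_held(live, snapshot) // GB   # old and new data both kept
without_snapshot = space_held(live) // GB          # old data can be freed
print(with_snapshot, without_snapshot)
```

The snapshot itself added almost nothing, but deleting it is what lets the old 1 GB go.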

Time Machine purges old snapshots automatically, but by default retains the last 24 hours of hourly snapshots, which will take a total space similar to the amount of data backed up over that period. In my case, that’s typically around 30 GB at any time, but if you manipulate large media files, or old snapshots aren’t purged properly, it could easily require hundreds of GB.

The overhead of snapshots is not shown in Finder folder totals, though, as it’s known only to the file system. This normally makes up the great majority of what Disk Utility refers to as ‘purgeable’ space on a volume. macOS does manage it too, according to pressure on free space. If that starts getting low, macOS is supposed to delete old snapshots.

Because snapshots aren’t shown in the Finder, the best estimate is that given by Disk Utility. Although there have been problems with this in the past, it now seems fairly reliable.

Sparse files

In HFS+, if a file requires 1 GB of storage, the only way to get it to occupy less space is to compress it.

APFS, though, supports sparse files. If that 1 GB of data is almost all empty, as can happen in some file formats, then the file system should be able to omit the empty data, and just store the few MB which actually contain data. If you were then to open that file and change it, so that it didn’t have any empty data, the file would grow from a few MB to 1 GB.

I’ve not come across such a sparse file yet in APFS, but the way this is currently handled appears correct: the storage space taken is that actually used at present, not its potential maximum. If you work with files which are stored in sparse format, you should be mindful of this, but most users can safely ignore it.
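For anyone who wants to experiment, here is a minimal Python sketch of one way to try making a sparse file and compare its two sizes. It assumes that seeking past the end of a new file before writing leaves a hole, which depends on the file system, and that st_blocks is counted in 512-byte units, as it is on macOS and Linux. The helper names and the 16 MiB size are my own choices.

```python
import os
import tempfile


def make_sparse_file(path, logical_size, payload=b"data"):
    """Create a file of logical_size bytes with payload written only at the very end."""
    with open(path, "wb") as f:
        f.seek(logical_size - len(payload))  # skip ahead rather than writing zeros
        f.write(payload)


def sizes(path):
    """Return (logical size, space allocated on disk), both in bytes."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units (true on macOS and Linux).
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sparse.bin")
    make_sparse_file(path, 16 * 1024 * 1024)
    logical, allocated = sizes(path)
    print(f"logical size {logical} bytes, allocated {allocated} bytes")
```

If the file system has stored the file sparsely, the allocated figure should be far smaller than the logical size; if the two are similar, the hole has been filled with zeros on disk.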

Hard links

Hard links are not new to APFS, and in fact are central to the way that Time Machine builds its backups (still on HFS+). If you’ve ever browsed your backup folder in the Finder, you will have noticed that each backup appears to occupy an impossibly large amount of disk space, equal to the total used space at the time that backup was made. Multiply that by the number of backups, and the total size is many times larger than the capacity of the storage.

The problem with hard links and file sizes is essentially the same in APFS as it is in HFS+: in reality, hard links occupy no space at all, as they are merely additional directory entries to a common file. If you have one original file of 1 GB and ten hard links to that, the total space occupied is still only 1 GB, although the Finder will report a total of 11 GB.
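The difference between the two totals is easy to reproduce in Python, using a 1 MB file to stand in for the 1 GB one. This sketch totals a folder twice: once by simply adding file sizes, and once counting each underlying file only once, identified by its device and inode number. The function names are mine, and the first is only a guess at what a simple file-size total does, not the Finder's actual code.

```python
import os
import tempfile


def naive_total(folder):
    """Add up the size of every directory entry, as a simple file-size total does."""
    return sum(os.stat(os.path.join(folder, n)).st_size for n in os.listdir(folder))


def linked_total(folder):
    """Count each underlying file once, identified by (device, inode number)."""
    seen = {}
    for name in os.listdir(folder):
        st = os.stat(os.path.join(folder, name))
        seen[(st.st_dev, st.st_ino)] = st.st_size
    return sum(seen.values())


with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "original.bin")
    with open(original, "wb") as f:
        f.write(b"\0" * 1_000_000)            # 1 MB stands in for the 1 GB file
    for i in range(10):
        os.link(original, os.path.join(folder, f"link{i}.bin"))  # ten hard links
    naive = naive_total(folder)               # eleven entries of 1 MB each
    actual = linked_total(folder)             # one underlying file
    link_count = os.stat(original).st_nlink   # directory entries sharing the file
print(naive, actual, link_count)
```

Note that this trick only works for hard links. Clone files each have their own inode number, so they can't be detected this way.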

Few of us use hard links enough for this to become a problem. The only situation in which it is likely to confuse is in Time Machine backups, which currently can’t be stored on APFS volumes anyway. It’s another example of where simply totalling file sizes, as the Finder does, can mislead.

What not to trust

Given the limitations of the Finder that I have explained above, is there any other tool which you should avoid trusting?

The Storage tab in About This Mac is the most likely to be fooled by Finder Aliases, links, and the factors above. It can be useful as it is the only guide to what types of data are taking up storage space, but when you want to know how much space is used or free on any volume or disk, use Disk Utility: it’s much more likely to be accurate.

8 Comments

1

Russ Tolman on April 4, 2019 at 1:38 pm

Just wondering if you have heard anything about a problem moving a Time Machine backup to a new drive. Any suggestions about where I might look?
It says it is moving the backup; then it shows 0 files to copy and shows that it has copied over twice the size of the backup folder. Thanks for any suggestions; I know you are busy.

- 2
  
  hoakley on April 4, 2019 at 2:14 pm
  
  Russ,
  Unfortunately this is a not uncommon problem, particularly with large backup folders which go back some time.
  First, ensure that you’re following Apple’s recommended procedure for this, detailed here. You can’t do this by cloning the backup, unfortunately – I have checked with Mike Bombich’s support pages and CCC won’t copy backup folders because of their internal data.
  If you’re still not getting any joy, I’d look at the existing backup to see if it really is healthy. I have looked at the command line tools for this here, and you may find that a help.
  I wish you success,
  Howard.
  
  - 3
    
    Russ Tolman on April 4, 2019 at 2:37 pm
    
    Thanks for your time. I will look at the command line tools. See if I can figure out what is going on. Don’t we just love Technology.
    
4

Tony on April 6, 2019 at 5:26 pm

Thank you for that useful round-up of issues. I think that the shared free space in a partitioned container and the ‘unclaimed’ space for clones/sparse files (the first looking like an extreme case of the second) are the real UI problems.

The Finder could perhaps address the shared free space simply by reporting the actual free space in the container (as now) but suffixing some warning label, perhaps as simple as just adding “(shared)”. This would draw attention to the difference between the value on an APFS versus HFS+ etc disk.

The second is, as you suggest, somewhat intractable. It does seem to me that there is somehow a deficit in the reported used space figure, a little like an overdraft that has yet to be called upon. The problem is that the likely actual usage of that space is dependent upon the user’s intentions when creating the clones/files. I suspect that most of us would only clone a file with the intention of then changing it but even that may be a behaviour born out of previous filing systems where we understood that to copy was wasteful compared to referencing/linking.

I wonder if reporting (presumably as an option) the total potential usage of the existing files would provide any practical value. It would put a ceiling on the instantaneous disk usage but the number might be so high as to be useless: a little like asking the consequences of a 100% run on a bank.

In the meantime, I am now dreading the advent of the error message “The disk is probably full”!

- 5
  
  hoakley on April 6, 2019 at 5:53 pm
  
  Thanks.
  I think the most important thing now is for Disk Utility to remain fairly accurate and consistent, and for us all to get used to this brave new world. I’m actually happy with the purgeable space concept, and it might be useful for Disk Utility to break that down more explicitly so that we can see what it actually involves.
  Howard.
  
6

Tony on April 7, 2019 at 4:04 pm

I agree on purgeable space, that seems a valid approach to me.

It’s interesting that you say you have yet to see a sparse file. I had assumed without thinking about it that all files would start as (potentially) sparse. However, the nature of many (most?) files is such that they would be unlikely to be sparse. Most document, xml, code etc files tend to have content that starts at the beginning and is continuous, the end of the data being the end of the file; files also need to be new since potential sparse files created with HFS+ will just be stuffed with zeros. I expect that there are some applications that create sparse files often so they will gradually appear.

- 7
  
  hoakley on April 7, 2019 at 4:15 pm
  
  Thanks, Tony.
  I did have a little play trying to create a sparse file, but didn’t get very far before I was dragged back to reality. Maybe I’ll have another go this week.
  Howard.
  
8

Michael Tsai - Blog - Quantum Computing and APFS: Free and Used Space on April 11, 2019 at 7:44 pm

[…] Howard Oakley: […]
