Finder can’t total file sizes correctly: how hard links confuse it

The Finder is often used to give total file sizes. Select a folder, Get Info on it, and you’ll see how many items are within that folder and their total size. Except that the total file size given may be completely wrong.

To demonstrate this, create a folder and copy a fairly chunky file into it. Let’s say this is 15 MB in size. Now open Terminal, change directory to that folder using cd, and create a few hard links to that file within the same folder, using commands like
ln myfile.pdf myfilehdlink1.pdf

Now select that same folder and Get Info. Although each hard link takes no actual space on disk – it is just an additional entry in the volume metadata – the size given by the Finder for that folder has grown by the full size of the file for every hard link you created. The Finder is unable to cope with hard links to the same file, and considers that each is a separate file, with its own data on disk. In the example below, there’s only one 15 MB file in that folder; all the other items are actually links.

hardlinks40

There’s another way of seeing this using the Finder’s Get Summary Info feature, which you may not be aware of. Select any files and folders in a Finder window and press Control-Command-I (or Option-Command-I), and the Finder provides a summary of all the selected items together. Select the original file and one of its hard links, press Control-Command-I, and you’ll see a total file size exactly twice the actual space on disk.

hardlinks41

This problem originates in the way that macOS works with files. One way to estimate the total size of a folder is to perform a deep enumeration of the folder’s contents, and total their sizes. This is traditionally performed using the path of each of the items found. Because a hard link has a different path to the file that it links to, it is seen as being a different file, and its size is added to the total.
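To make that overcounting concrete, here is a minimal Python sketch of a path-based enumeration. It isn’t the Finder’s own code, just an illustration under stated assumptions: the names naive_total and demo_naive are hypothetical, the file is a small zero-filled stand-in for the 15 MB PDF, and sizes are the logical sizes reported by stat rather than space allocated on disk.

```python
import os
import tempfile


def naive_total(folder):
    """Total the size of every file found by path, treating each path as a separate file."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            # lstat reports on the directory entry itself and doesn't follow symbolic links.
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def demo_naive(size=1_000_000, links=3):
    """Create one file plus several hard links to it, then total the folder by path."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "myfile.pdf")
        with open(original, "wb") as f:
            f.write(b"\0" * size)
        for i in range(1, links + 1):
            # Equivalent to: ln myfile.pdf myfilehdlink<i>.pdf
            os.link(original, os.path.join(folder, f"myfilehdlink{i}.pdf"))
        return naive_total(folder)


if __name__ == "__main__":
    print(demo_naive())
```

With one file and three hard links, the total comes out at four times the size of the single file, because each path is counted separately.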

macOS can distinguish symbolic links and Finder aliases from their originals, and thus can use the size of the link itself rather than that of the file to which it points, in its calculations. But a hard link appears identical to the original file, so macOS counts 15 MB for myfile.pdf in the example at the top, and adds another 15 MB for the hard link myfilehdlink1.pdf.
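The difference can be seen from the metadata available for each item. The following Python sketch covers symbolic links only (Finder aliases are a macOS feature that it doesn’t attempt to handle), and the names describe and demo_links are hypothetical. It assumes a file system that supports both kinds of link.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (kind, size) for a directory entry without following symbolic links."""
    st = os.lstat(path)
    if stat.S_ISLNK(st.st_mode):
        # For a symbolic link, st_size is the size of the link itself, not its target.
        return "symbolic link", st.st_size
    return "regular file", st.st_size


def demo_links(size=1000):
    """Create a file, a hard link and a symbolic link to it, and describe all three."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "myfile.pdf")
        hard = os.path.join(folder, "myfilehdlink1.pdf")
        soft = os.path.join(folder, "myfilesymlink.pdf")
        with open(original, "wb") as f:
            f.write(b"\0" * size)
        os.link(original, hard)
        os.symlink(original, soft)
        return {
            "original": describe(original),
            "hard": describe(hard),
            "soft": describe(soft),
        }


if __name__ == "__main__":
    for name, info in demo_links().items():
        print(name, info)
```

The symbolic link is reported as a link with its own small size, while the original and its hard link return exactly the same kind and size, so nothing here marks either of them as ‘the link’.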

One way around this, were hard links to be used extensively, would be to enumerate contents using inode numbers, as in the volfs path or FileRefURL (both of which are shown in my utility Precize). The ‘original’ file and a hard link to it have exactly the same inode number, hence the same volfs paths and FileRefURLs, even though their normal paths and URLs are quite different.
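As a rough sketch of that approach, the Python below counts each inode only once. It relies on the device and inode numbers returned by POSIX stat rather than volfs paths or FileRefURLs, so it illustrates the idea and isn’t a description of how the Finder or Precize works. The names totals_by_path_and_inode and demo_deduped are hypothetical.

```python
import os
import tempfile


def totals_by_path_and_inode(folder):
    """Return (total counting every path, total counting each inode once)."""
    seen = set()
    by_path = 0
    by_inode = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            by_path += st.st_size
            # Inode numbers are only unique within a volume, so pair them with the device.
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                by_inode += st.st_size
    return by_path, by_inode


def demo_deduped(size=1_000_000, links=3):
    """Create one file plus several hard links to it, then compare the two totals."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "myfile.pdf")
        with open(original, "wb") as f:
            f.write(b"\0" * size)
        for i in range(1, links + 1):
            os.link(original, os.path.join(folder, f"myfilehdlink{i}.pdf"))
        return totals_by_path_and_inode(folder)


if __name__ == "__main__":
    print(demo_deduped())
```

Here the first total is inflated by the hard links, while the second matches the size of the single file that actually exists.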

This isn’t a recent problem, and it applies to HFS+ as much as to APFS. If you look at volumes which consist of very large numbers of hard links, such as Time Machine backups, you’ll see what a nonsense it makes of their total file sizes. If an app were to perform a deep enumeration of your entire backup folder, the total size it arrived at would exceed the capacity of that disk many times over.

Thankfully this doesn’t affect volume sizes, which are obtained directly from file system records and aren’t affected by the presence of hard links.
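For comparison, volume figures can be obtained without enumerating any files. This short Python sketch uses shutil.disk_usage from the standard library, which asks the file system for the totals of the volume containing a given path; the function name volume_usage is hypothetical.

```python
import os
import shutil


def volume_usage(path):
    """Return (total, used, free) in bytes for the volume containing path."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free


if __name__ == "__main__":
    total, used, free = volume_usage(os.getcwd())
    print(f"total {total}, used {used}, free {free}")
```

Because no files are walked, the number of hard links on the volume has no bearing on these figures.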

(Thanks to Luis for pointing out the Get Summary Info feature, and this anomaly.)