Aliases, hard links, symlinks, and copies in Mojave’s APFS

If there’s one topic most likely to cause confusion, it’s the different types of links and aliases available in Unix-like file systems, and macOS running on APFS is no exception. This article aims to clarify the types, their behaviour, and their best use in Mojave.

There are now five different types of copy/clone/alias/link: the regular copy, APFS clone (copy on write clone), symbolic link (symlink), hard link, and Finder alias. I’ll tackle them in that order.

Regular copy

This is what happens when you drag and drop a file to a different volume. There doesn’t appear to be a way of forcing a regular copy in the Finder when copying files within the same volume, where copies are made by cloning by default. It should be possible to do so using the normal cp command in Terminal, without the -c option which requests a clone rather than a copy.

Regular copies are made by macOS creating a new file, and copying the contents of the old file into that. Any extended attributes are also copied within the file system metadata. You then end up with an entire fresh copy of the file, which has a different inode number and is completely separate from the original. There is no scope for saving any space on the disk: a copy takes up just the same space as the original.
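A quick way to see this for yourself is to compare inode numbers in Terminal. The following is only a minimal sketch: the file names and the temporary folder are my own placeholders, and it relies on ls -i printing the inode number ahead of the name.

```shell
# Work in a throwaway folder (hypothetical file names throughout).
work=$(mktemp -d)
printf 'hello\n' > "$work/original.txt"

# Plain cp, without -c, asks for a regular copy.
cp "$work/original.txt" "$work/copy.txt"

# ls -i prints the inode number first; awk keeps just that field.
orig_inode=$(ls -i "$work/original.txt" | awk '{print $1}')
copy_inode=$(ls -i "$work/copy.txt" | awk '{print $1}')
echo "original inode: $orig_inode, copy inode: $copy_inode"

# Overwriting the copy leaves the original untouched.
printf 'changed\n' > "$work/copy.txt"
original_text=$(cat "$work/original.txt")
echo "original still reads: $original_text"
```

The two inode numbers should differ, and the original should still read hello after the copy has been changed.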

You can rename and move copies around as you wish. They have no dependence on the original, once they have been created.

APFS (copy on write) clone

These are new in APFS, and have been fully functional since macOS 10.13. They are the default means of creating a ‘copy’ when copying or duplicating files within the same volume: simply hold the Option key down when dragging the old file to its new location. You can also create them in Terminal using the command
cp -c oldfilename newfilename
where the -c option requests cloning rather than a regular copy.

Although a clone may appear to be like a hard link, it behaves quite differently when the data are changed. A clone is a separate file in the file system which initially refers to the same data stored on disk as the original. With a hard link, two or more references in the file system are fixed to the same object in storage. Any changes made to the data in that object are therefore reflected identically no matter which reference you use to access the data.

In a clone, the two references are actually to distinct objects in the file system. When you make changes to the data via one of those references, the data are split for the two references so that they will actually see quite different content. This is illustrated below.

[Diagram: clones1]

Clones are completely transparent to the user. As far as the user is concerned, they appear as and act like quite separate files; only the file system knows how much of the data may be in common storage, and that will tend to reduce as they diverge in content with editing.
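The divergence between a clone and its original can be sketched in Terminal. The file names here are placeholders of mine, and the fallback to a plain cp is an assumption I have added so that the sketch still runs on a file system without cloning. There, you would simply be looking at a regular copy.

```shell
# Work in a throwaway folder (hypothetical file names throughout).
work=$(mktemp -d)
printf 'shared text\n' > "$work/original.txt"

# cp -c requests a clone on macOS. The plain cp after || is only a
# fallback added for this sketch, for systems where -c is unavailable.
cp -c "$work/original.txt" "$work/clone.txt" 2>/dev/null \
  || cp "$work/original.txt" "$work/clone.txt"

# Edit the clone only: from this point the two files diverge.
printf 'edited in the clone\n' >> "$work/clone.txt"

original_text=$(cat "$work/original.txt")
clone_text=$(cat "$work/clone.txt")
echo "original: $original_text"
echo "clone:    $clone_text"
```

Whichever route was taken, the original keeps its single line while the clone gains a second, which is the behaviour you would not get from a hard link.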

Symbolic link

A symbolic link is the simplest form of link to an existing file: it is just a reference containing a directory path to the original file, stored exactly as you supply it, so it is relative if you give a relative path. Symbolic links are normally created in Terminal, using a command of the form
ln -s oldfilename newlinkname
and when listed using ls -la the directory path is conveniently displayed.

Symbolic links are the most fragile of all the links available in macOS. Anything which changes that directory path will break a symbolic link. This includes renaming the original file, moving it, and, when the stored path is relative, moving the symbolic link itself. Moving the symbolic link doesn’t always completely break it, though: QuickLook thumbnails and previews may still work fine, but you won’t be able to open the linked file in the Finder, for example. Although symbolic links are very efficient in terms of storage, they should generally be avoided because of their fragility.
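That fragility is easy to reproduce. This is a minimal sketch using placeholder file names of my own: it creates a symbolic link with a relative path, reads back the stored path with readlink, then renames the original.

```shell
# Work in a throwaway folder (hypothetical file names throughout).
work=$(mktemp -d)
printf 'data\n' > "$work/original.txt"

# The link stores the text "original.txt", resolved relative to the
# folder which contains the link.
ln -s original.txt "$work/link.txt"
stored_path=$(readlink "$work/link.txt")
echo "link stores: $stored_path"

# While the original keeps its name, the link opens it normally.
linked_text=$(cat "$work/link.txt")

# Rename the original: the stored path no longer leads anywhere.
mv "$work/original.txt" "$work/renamed.txt"
if [ -e "$work/link.txt" ]; then link_state=ok; else link_state=broken; fi
echo "after renaming the original, the link is: $link_state"
```

The link itself is still present in the folder after the rename, but the path it holds no longer resolves.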

Hard link

File systems contain directories of files, each of which references the data for that file in storage. Normally, there is a one-to-one relationship between the directory entry for a file and its storage. A hard link is simply a second (or third, or …) directory reference to exactly the same storage object. Look at any distinctive feature of the two or more files, such as their inode numbers, and they will be identical because they are exactly the same storage object.

Hard links can only be made in Terminal, using a command of the form
ln oldfilename newlinkname
but when listed using ls -la you will see identical information for the original file and its hard link, because the two different names refer to exactly the same object.

Hard links require no additional storage space on disk, merely another directory entry in the file system metadata. This is one reason that they are used in Time Machine backups, in which the vast majority of the files shown in each backup are simply hard links to the data written when that file was last changed.

Hard links are robust to path changes in the original file and the link. No matter which link you use to access the file data, you will always get the same object from storage. You can edit a file under its original name, and every hard link to that file will show the same identical changes, as there is only one copy of its data.
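Here is a small sketch of that behaviour, again with placeholder file names of my own. It makes a hard link, compares the inode numbers of the two names, then edits through one name and reads through the other.

```shell
# Work in a throwaway folder (hypothetical file names throughout).
work=$(mktemp -d)
printf 'first\n' > "$work/original.txt"

# A second directory entry for the same storage object.
ln "$work/original.txt" "$work/hardlink.txt"

# Both names should report the same inode number.
orig_inode=$(ls -i "$work/original.txt" | awk '{print $1}')
link_inode=$(ls -i "$work/hardlink.txt" | awk '{print $1}')
echo "original inode: $orig_inode, hard link inode: $link_inode"

# Append through the hard link, then read through the original name.
printf 'second\n' >> "$work/hardlink.txt"
cat "$work/original.txt"

# Renaming the original doesn't disturb the hard link.
mv "$work/original.txt" "$work/renamed.txt"
link_text=$(cat "$work/hardlink.txt")
```

Reading the original name shows both lines, even though the second was written via the hard link, and the link still opens the same data after the original has been renamed.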

However, some apps can break hard links, which is confusing. One trick which has been used to increase the robustness of saving changed files is to save the whole file to a temporary file, delete the original, and rename the temporary file to the same as the original. If you look at the inode of the file before and after such a ‘safe’ save, you will notice that it changes. This has a very strange effect on hard links to the original file: they still refer to the original unchanged object on disk.

Although the original reference to it has been removed from the directory, the file system cannot remove the stored object to which that referred until all hard links to that data have been deleted. So the previous hard link now points to original data which no longer exist under their original name, and can only be reached through the hard link. Meanwhile, the data referenced by the original name are in fact a different file, unrelated to the old hard link(s). The user is completely confused by this, which is possibly why Apple has never offered an easier way of creating hard links.

[Diagram: clones2]
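The ‘safe’ save can be imitated by hand in Terminal. This is only a sketch of the sequence described above, not the code used by any particular app, and the file names are placeholders of mine.

```shell
# Work in a throwaway folder (hypothetical file names throughout).
work=$(mktemp -d)
printf 'version 1\n' > "$work/doc.txt"
ln "$work/doc.txt" "$work/doc-link.txt"
inode_before=$(ls -i "$work/doc.txt" | awk '{print $1}')

# Imitate a 'safe' save: write a temporary file, delete the original,
# then rename the temporary file to the original name.
printf 'version 2\n' > "$work/doc.tmp"
rm "$work/doc.txt"
mv "$work/doc.tmp" "$work/doc.txt"
inode_after=$(ls -i "$work/doc.txt" | awk '{print $1}')
echo "doc.txt inode before: $inode_before, after: $inode_after"

# The original name now holds the new data; the hard link still holds
# the old, unchanged object.
doc_text=$(cat "$work/doc.txt")
link_text=$(cat "$work/doc-link.txt")
echo "doc.txt:      $doc_text"
echo "doc-link.txt: $link_text"
```

After the save, doc.txt reads version 2 under a new inode number, while doc-link.txt still reads version 1.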

Hard links also behave differently when copied to other volumes. Because they are a reference to an existing file, the referenced file is copied under the link name. What is apparently the same link on the two volumes now opens two quite different copies of the file, one stored on each volume. Although perfectly logical, this can cause confusion, particularly if the hard link is then copied back, as it may result in two identically-named links (in different folders) referring to different files.

Hard links are invaluable, but must be used with great care and understanding of their unique properties.

Finder Alias

Apple created Finder Aliases – the form of link you will get if you select a file and use the Finder’s Make Alias command – to address the shortcomings and fragility of symbolic links. A Finder Alias consumes the greatest amount of disk space of any of these link methods, but is also the most consistent and reliable in its behaviour for most users.

Finder Aliases are a halfway house between symbolic and hard links. macOS first tries to locate the file to which their directory path points, as if they were a symbolic link. The path used is absolute, rather than relative, so moving the alias within the same volume, or copying it to another volume, doesn’t affect this.

In the event that it can’t locate the original using a directory path, the alias also contains an inode reference, which is used as the fallback. This should enable macOS to locate the original even if both original and alias have been moved, so long as the original is on the same volume as it was when the alias was created.

The only common situation in which a Finder Alias will break is when it has been moved to another volume, and the original file has also been moved to a different volume. In that case, neither its full pathname nor the inode reference will locate the original file, and you will have to create a new Finder Alias to it.
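The two-stage lookup can be imitated in Terminal to show the logic. This is purely an illustration of the path-then-inode order described above: resolve_alias is a hypothetical helper of my own, not a macOS command, it doesn’t read real Finder Alias files, and it assumes that a slow search with find is an acceptable stand-in for the inode lookup.

```shell
# Hypothetical helper: try the stored path first, then fall back to
# searching a folder for the stored inode number.
resolve_alias() {
  stored_path=$1
  stored_inode=$2
  search_root=$3
  if [ -e "$stored_path" ]; then
    printf '%s\n' "$stored_path"
  else
    find "$search_root" -inum "$stored_inode" -print 2>/dev/null | head -n 1
  fi
}

# Set up an 'original' and record what an alias would remember.
work=$(mktemp -d)
mkdir "$work/a" "$work/b"
printf 'x\n' > "$work/a/target.txt"
alias_path="$work/a/target.txt"
alias_inode=$(ls -i "$alias_path" | awk '{print $1}')

# Stage 1: the stored path still works.
found_by_path=$(resolve_alias "$alias_path" "$alias_inode" "$work")

# Move and rename the original within the same volume, so the path
# fails but the inode number is unchanged.
mv "$work/a/target.txt" "$work/b/moved.txt"
found_by_inode=$(resolve_alias "$alias_path" "$alias_inode" "$work")
echo "by path:  $found_by_path"
echo "by inode: $found_by_inode"
```

Once the original has been moved, the first stage fails and the search by inode number finds it under its new name. Had it been moved to a different volume, it would have been given a new inode number there, and this fallback would fail too.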

Until recently, there was no straightforward way of creating Finder Aliases from the command line, nor of resolving them. My free command-line tool alisma (available from Downloads above) does both, and is fully compatible with Mojave and APFS.

References

Aliases, links, clones, and Bookmarks
Taking Stock: Using APFS in High Sierra 10.13.1
alisma and its source