How can you tell whether a file has been ‘cloned’ in APFS?

One of the much-touted features of APFS is its ability to ‘clone’ files which are copied or duplicated. It’s easy to demonstrate this with a suitably large file of a few GB or more. Select the large file and drag-copy it to a different volume, noting how long it takes to copy across. Then select the same large file and duplicate it in the Finder (⌘D), which is instantaneous. The duplicate appears to be a completely new file, but we all know that a genuine copy of that much data simply can’t be made in such a short space of time.

What has actually happened is that APFS has created a new file record (not a hard link), but the old and new files share the same data storage. No additional disk space has been used, although the Finder tells you that each of those files is the same, full size. It’s only when you start editing either of those files that changed data is written to fresh storage, and the space actually required starts to increase towards the sum of the two file sizes.

This is easy to confuse with a hard link, in which two or more file records link to the same inode. Hard links are long-established tools used widely in different file systems, including HFS+, but all the linked ‘files’ are fixed links to the same data. Change one, and they all change. APFS clones, by contrast, each have their own inode and often diverge over time as one or both files are edited, until eventually they may become completely distinct files.
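The difference in behaviour between a hard link and a copy can be sketched in Swift. This is a minimal illustration, not a test for clones: the file names and temporary folder are made up for the example, and whether copyItem() actually produces a clone depends on the file system it runs on. The behaviour shown (the copy keeping its old contents) is the same for a clone as for an ordinary copy.

```swift
import Foundation

let fm = FileManager.default
// Hypothetical scratch folder and file names, used only for this sketch.
let dir = fm.temporaryDirectory.appendingPathComponent(UUID().uuidString)
try fm.createDirectory(at: dir, withIntermediateDirectories: true)

let original = dir.appendingPathComponent("original.txt")
let hardLink = dir.appendingPathComponent("link.txt")
let copy = dir.appendingPathComponent("copy.txt")

try Data("first".utf8).write(to: original)
try fm.linkItem(at: original, to: hardLink)  // second name for the same inode
try fm.copyItem(at: original, to: copy)      // separate file; may be a clone on APFS

// Non-atomic write, so the existing inode is modified in place.
try Data("second".utf8).write(to: original)

let linkText = try String(contentsOf: hardLink, encoding: .utf8)
let copyText = try String(contentsOf: copy, encoding: .utf8)
print("hard link:", linkText)  // prints "hard link: second"
print("copy:", copyText)       // prints "copy: first"
```

The hard link follows the change because it is the same file under another name, while the copy keeps the contents it had when it was made.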

According to Apple’s developer documentation, the two conditions required for a clone to be made are:

  • both the original and copy files must be on the same APFS volume, so sharing the same file system;
  • copying must be performed using either of two specific commands (both forms of copyItem()) in the FileManager.

In practice, these include all copies and duplicates made within the same volume by the Finder, and most made by apps. This can also apply to whole folders, provided that they’re copied according to these rules.
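As a rough sketch of those two conditions in code, the following Swift copies a file with FileManager’s copyItem(at:to:) and reports whether source and destination folder sit on the same file system. The helper names deviceNumber(of:) and copyWithinVolume(from:to:) are hypothetical, as are the file names. The check only compares device numbers: it doesn’t confirm that the volume is APFS, nor that a clone was actually made.

```swift
import Foundation

/// Hypothetical helper: the file system (device) number holding the item at `url`.
func deviceNumber(of url: URL) throws -> Int {
    let attrs = try FileManager.default.attributesOfItem(atPath: url.path)
    guard let number = attrs[.systemNumber] as? NSNumber else {
        throw CocoaError(.fileReadUnknown)
    }
    return number.intValue
}

/// Hypothetical helper: copies `source` to `destination` and returns true when
/// both are on the same file system, the first condition for a clone.
func copyWithinVolume(from source: URL, to destination: URL) throws -> Bool {
    let sameVolume = try deviceNumber(of: source)
        == deviceNumber(of: destination.deletingLastPathComponent())
    try FileManager.default.copyItem(at: source, to: destination)
    return sameVolume
}

// Usage with made-up file names in a scratch folder.
let scratch = FileManager.default.temporaryDirectory
    .appendingPathComponent(UUID().uuidString)
try FileManager.default.createDirectory(at: scratch, withIntermediateDirectories: true)
let sourceFile = scratch.appendingPathComponent("big.bin")
let duplicate = scratch.appendingPathComponent("big copy.bin")
try Data(repeating: 0x41, count: 1_000_000).write(to: sourceFile)

let sameVolume = try copyWithinVolume(from: sourceFile, to: duplicate)
print(sameVolume ? "Same volume: eligible for cloning on APFS"
                 : "Different volumes: an ordinary copy")
```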

What Apple doesn’t explain is how you can check whether a copy is a clone. Unlike APFS sparse files, which give strong clues in the Get Info dialog with file size disparities, there’s nothing out of the ordinary to be seen with a clone. Nor can you test anything at the command line, where they appear to be completely separate files.

Although clones and clone-copying were introduced in macOS High Sierra, it has only recently become feasible for apps to inspect whether any given file has at some time been part of a clone, using the mayShareFileContentKey resource value, similar to isSparseKey for sparse files. Unfortunately, its documentation isn’t exactly informative, and as its name suggests it doesn’t actually tell you whether a clone currently exists, only that one may have existed at some time in the past.
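Reading that resource value takes only a few lines of Swift. This is a sketch under assumptions: the helper name mayShareContent(_:) is hypothetical, and I’m assuming that the key requires macOS 11 (or iOS 14) and later. A result of true means only that the file may share content with another, not that a clone still exists.

```swift
import Foundation

/// Hypothetical helper: returns the "may share file content" resource value,
/// or nil when it can't be read (missing file, older or unsupported system).
func mayShareContent(_ url: URL) -> Bool? {
    #if canImport(Darwin)
    // Assumption: the key is only available from macOS 11 / iOS 14 onwards.
    guard #available(macOS 11.0, iOS 14.0, tvOS 14.0, watchOS 7.0, *) else {
        return nil
    }
    let values = try? url.resourceValues(forKeys: [.mayShareFileContentKey])
    return values?.mayShareFileContent
    #else
    return nil
    #endif
}

// Usage: pass one or more file paths on the command line.
for path in CommandLine.arguments.dropFirst() {
    switch mayShareContent(URL(fileURLWithPath: path)) {
    case .some(true):  print("\(path): may share content (possible clone)")
    case .some(false): print("\(path): not flagged as sharing content")
    case .none:        print("\(path): value unavailable")
    }
}
```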

More information is available in the man page for the equivalent C function getattrlist(), where the corresponding flag EF_MAY_SHARE_BLOCKS is explained thus:
“If this bit is set then the file may share blocks with another file (i.e. it is a clone of another file).”

Both mayShareFileContentKey and EF_MAY_SHARE_BLOCKS go back to the same j_inode_flag listed in Apple’s APFS Reference, where INODE_WAS_EVER_CLONED is described as
“The inode has been cloned at least once. If this flag is set, the blocks on disk that store this inode might also be in use with another inode. For example, when deleting this inode, you need to check reference counts before deallocating storage.”
There’s also a note concerning a bug in this flag affecting macOS 10.13 to 10.13.3. However, I’ve been unable to see any explanation of how to determine whether any given file has clones which still exist in the file system. There is another j_inode_flag named INODE_WAS_CLONED, which appears different, but its explanation doesn’t clarify:
“The inode was created by cloning another inode.”

I’m currently testing a new version of Sparsity which can not only discover sparse files, but will also list all the files which have been cloned. One slight snag with this is that potential clones are exceedingly numerous: essentially every file which has, at some time since it first arrived in its current volume, been copied. In some folders, that’s more than 10% of all the files. Whether Apple will ever let us in on the secret of how to confirm that any clones currently exist is anyone’s guess, but it has taken three years to get this far.