Last Week on My Mac: Dataless files arrive

From its introduction in Yosemite, iCloud Drive has relied on illusions created by the Finder. Until the release of Sonoma, those involved juggling hidden stub files, which kept the place of files whose local data had been removed, or evicted. In the Finder, they were displayed with the icon indicating that they had to be downloaded again before they could be accessed locally. Although the illusion might look the same in Sonoma, what happens with those files is now completely different: there are no stubs any more; instead, the files simply become dataless.

Stub files

Files in iCloud Drive that have been downloaded to local storage appear in special folders as regular files, and take the full space on disk that you’d expect. For the sake of this example, I’ll consider what happens to a downloaded file named MyDocument.extn. Once it has been evicted to remove the local copy, it’s replaced by a hidden file, most recently named .MyDocument.extn.icloud. That stub file retains all the attributes and extended attributes of the original, but has no data, and is typically less than 200 bytes in size, plus any extended attributes.

Although this is an ingenious solution that works equally well with both the Mac’s native file systems, HFS+ and APFS, it relies on the iCloud services in macOS maintaining stub files correctly. It’s also significant that this mechanism is specific to iCloud, and not available to other cloud service providers. If you’re writing a script to access those files in iCloud Drive that have been downloaded, all you need to do is skip those whose names start with a full stop, or period.
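As a minimal sketch of that pre-Sonoma approach, the Python below lists the items in a folder that have local data by skipping hidden names, and separately recognises the stub naming described above. The function names are hypothetical, and the pattern assumes the most recent stub naming, .MyDocument.extn.icloud.

```python
import os

STUB_SUFFIX = ".icloud"


def is_stub_name(name: str) -> bool:
    """True if name follows the stub pattern .<original name>.icloud."""
    return (name.startswith(".")
            and name.endswith(STUB_SUFFIX)
            and len(name) > len(".") + len(STUB_SUFFIX))


def original_name(stub_name: str) -> str:
    """Recover the original file name from a stub name."""
    if not is_stub_name(stub_name):
        raise ValueError(f"not a stub name: {stub_name!r}")
    # Drop the leading period and the trailing .icloud
    return stub_name[1:-len(STUB_SUFFIX)]


def downloaded_items(folder: str) -> list[str]:
    """Names in folder that don't start with a period, so can't be stubs."""
    return sorted(n for n in os.listdir(folder) if not n.startswith("."))
```

This only works where stubs are still in use; in Sonoma an evicted file keeps its name, so a test on the name alone can no longer tell you whether its data is local.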

Dataless files

Apple actually announced the coming of Sonoma’s new dataless files in Johannes Fortmann’s presentation on the FileProvider framework at WWDC in 2021, although that concerned itself with third-party cloud service providers rather than iCloud. Little more seems to have been said until a few weeks before WWDC 2023, when Apple published a developer Technical Note on dataless files, which appears to be the first reference to their use in iCloud Drive. While both the presentation and the Technical Note refer to changes made in APFS to support dataless directories and files, Apple hasn’t revised its APFS reference since June 2020, and there’s no mention of them there at all.

fileobjects

This basic diagram of an APFS file shows the main components involved. In the absence of any better information from Apple, what’s most likely to happen when a normal file becomes dataless is that its file extent information is cleared, and a flag is set in the file’s attributes to indicate that it is now dataless. The file’s size information remains, though, as do any extended attributes.

dataless1

This is reflected in a folder of files that have all been downloaded from iCloud Drive in Sonoma: their total size and size on disk show that they’re normal files. Once the contents of that folder have been evicted, though, their total size on disk falls to less than 100 KB, an average of just over 3 KB per dataless file.
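One way to see the same effect from a script is to compare each file’s logical size with the space allocated to it. This is a rough sketch, not how the Finder arrives at its figures: it assumes that st_blocks from stat counts 512-byte units, as it does on macOS, and the function name is hypothetical.

```python
import os


def folder_sizes(folder: str) -> tuple[int, int]:
    """Return (total logical size, total allocated size) in bytes
    for the regular files directly inside folder."""
    logical = 0
    on_disk = 0
    with os.scandir(folder) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            info = entry.stat(follow_symlinks=False)
            logical += info.st_size          # size the file reports
            on_disk += info.st_blocks * 512  # space actually allocated
    return logical, on_disk
```

If the file’s size information does remain after eviction, as suggested above, the first figure should stay much the same for an evicted folder while the second falls sharply.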

dataless2

There’s no change of name, no juggling of hidden stub files, and this can largely be handled by APFS.

Consequences

This change to dataless files should be entirely transparent to the user. Files in iCloud Drive that have been evicted and no longer have local data are distinguished using the iCloud icon with a downward-pointing arrow, just as before. They can be manually downloaded by clicking on that icon, or by any operation that needs to access their data.

Problems arise for those writing scripts or software that needs to know whether a file is dataless or not. Apple’s TN provides valuable further information, although it assumes that dataless files have already been detailed elsewhere, and the man pages of command tools updated to incorporate these changes.

You should be able to tell whether a file is dataless by calling stat or getattrlist and checking whether SF_DATALESS is set in stat.st_flags, but I’ve been unable to find any information on this in the available man pages. My free iCloud utility Cirrus will also tell you which items are dataless. Select the item in Cirrus’s browser (opened from its Window menu) and read the Status box towards the foot of the browser window: dataless files are shown as NotDownloaded, while those stored locally are Current.
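A minimal sketch of the stat check in Python might look like the following. It assumes that SF_DATALESS has the value 0x40000000, as defined in macOS’s sys/stat.h; Python’s stat module only provides that constant from version 3.13, hence the fallback. The function names are hypothetical.

```python
import os
import stat

# Assumed value of SF_DATALESS from macOS's <sys/stat.h>; used as a
# fallback when the stat module doesn't define the constant itself.
SF_DATALESS = getattr(stat, "SF_DATALESS", 0x40000000)


def flags_are_dataless(flags: int) -> bool:
    """True if the SF_DATALESS bit is set in a st_flags value."""
    return bool(flags & SF_DATALESS)


def is_dataless(path: str) -> bool:
    """Check a file's flags without reading any of its data."""
    info = os.stat(path)
    # st_flags only exists on macOS and the BSDs; treat its absence as 0.
    return flags_are_dataless(getattr(info, "st_flags", 0))
```

Because this only inspects the file’s metadata and never opens it, it shouldn’t itself cause the file to be downloaded.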

Cirrus obtains that information from ubiquitousItemDownloadingStatus in URLResourceValues, which here returns one of two strings, NSURLUbiquitousItemDownloadingStatusNotDownloaded or NSURLUbiquitousItemDownloadingStatusCurrent. It simply strips the duplicated leading 36 characters, NSURLUbiquitousItemDownloadingStatus, to return the Status shown.
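That stripping step is simple enough to sketch in a few lines of Python. This isn’t Cirrus’s own code, and the function name is hypothetical; it just removes the prefix shared by the status strings.

```python
PREFIX = "NSURLUbiquitousItemDownloadingStatus"


def status_label(raw: str) -> str:
    """Drop the shared prefix, leaving the short status name."""
    if raw.startswith(PREFIX):
        return raw[len(PREFIX):]
    return raw  # leave anything unexpected untouched
```

Given NSURLUbiquitousItemDownloadingStatusNotDownloaded, this returns NotDownloaded, which is what appears in the Status box.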

In the longer term, changing to dataless directories and files can only be a good move, and one that I suspect Apple has intended for several years. If only it had paid a little attention to its documentation first, and maybe informed users of the change.