Files come in many types and forms, but common to them are the file’s attributes, such as its identifiers and permissions, its data, and any additional metadata, typically on the Mac in the form of extended attributes. For the great majority of files, it’s the data that determines that file’s purpose and use. Without that, the file is but a stub, a placeholder for its real self.
For the last decade, iCloud has been able to evict files entrusted to it for storage, by removing locally stored data for that file and replacing it with a stub, a dataless file which must be downloaded or materialised before the contents of its data can be accessed locally. Other forms of cloud storage behave similarly, and Apple rightly draws attention to the growing problem that such dataless files pose for both developers and users, in TN3150: Getting ready for dataless files. As put succinctly there: “materializing a dataless file can take time if the file is large or there are poor network conditions. In such a scenario, the app may become unresponsive.”
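To make the idea of a dataless file more concrete, here is a minimal sketch of how a script might check for one without opening it. It assumes that macOS marks dataless files with the SF_DATALESS bit in the file's BSD flags, and that the bit has the value 0x40000000 as defined in the system's sys/stat.h header. The function names are my own, and the constant is hard-coded rather than taken from a library.

```python
import os

# Assumed value of SF_DATALESS from macOS's <sys/stat.h>; hard-coded here
# because it isn't available in every Python version.
SF_DATALESS = 0x40000000


def has_dataless_flag(flags: int) -> bool:
    """Return True if the given BSD file flags include the dataless bit."""
    return bool(flags & SF_DATALESS)


def is_dataless(path: str) -> bool:
    """Return True if the file at path appears to be dataless.

    Only the file's attributes are read, not its data. On systems whose
    stat results have no st_flags field, this always returns False.
    """
    info = os.lstat(path)  # lstat so that symlinks aren't followed
    return has_dataless_flag(getattr(info, "st_flags", 0))
```

Calling is_dataless() on a path in iCloud Drive should, under those assumptions, report whether that file has been evicted. As far as I know, reading attributes in this way doesn't itself trigger materialisation, but that is worth confirming before relying on it.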
Over the last couple of weeks I have been exploring how macOS and its features handle dataless files. While apps that take advantage of AppKit’s NSDocument to read and write files should handle these problems seamlessly, there are some definite seams when it comes to macOS services. These result from three constraints:
- features reliant on the contents of file data can’t be used with dataless files;
- features reliant on file data stored outside the file aren’t available to other systems accessing that file from iCloud;
- limitations on the total size of extended attributes in iCloud storage may require some to be removed.
Features relying on file data
Perhaps the most obvious feature that’s lost with any dataless file is the ability to back it up locally. Although storage in iCloud should not only guarantee files against loss, but also preserve them in the event of loss of local backup storage, there are circumstances where iCloud storage is insufficient to preserve older versions of files, depending on when they are evicted and materialised from iCloud. It’s not hard to envisage conditions in which a file that is changed frequently, but seldom stored locally for long before it’s evicted again, can escape all local backups. The more frequent those backups, the less likely they are to miss a version of that file before it’s evicted again. It seems unlikely, though, that macOS keeps track of that and takes it into account in eviction or backup.
Although less critical, absence of dataless files from the results of Spotlight searches is more obtrusive and frustrating to users. Further work is required before it can be determined whether Spotlight simply excludes dataless files from its search results, or whether the file’s entries in Spotlight’s indexes are removed on eviction. In the latter case, re-indexing is required after a dataless file is materialised to local storage again.
Least significant of these lost features is the file’s QuickLook thumbnail and preview. Selecting a dataless file in the Finder and triggering regeneration of its thumbnail is a trivial method of forcing files to be materialised.
Features relying on other data
The prime example of this constraint is document versions. These are normally stored locally on the same volume as the document, in a hidden folder. In the case of files in iCloud Drive, that almost invariably means the top level of that Mac’s Data volume. Stored versions aren’t transferred when a file is moved to a different local volume, nor are they transferred into iCloud storage. As a result, versions remain accessible only on the system that saved them locally. Fortunately, this appears to be unaffected by eviction and materialisation.
While many users never access saved versions of documents, and some apps don’t create them in the first place, for anyone wanting to use versions this is a significant constraint: to access a previous version, they might have to switch to the system that saved it while they’re still editing the document. This is a longstanding limitation on the use of iCloud.
Limited size of extended attributes
Apple sets a maximum file size for iCloud Drive of 50 GB, but nowhere have I found any explicit limit placed on the total size of a file’s extended attributes. Experimentation demonstrates that the effective limit is about 32,650 bytes (slightly less than 32,767). Although this is unlikely to affect many files, it almost guarantees that Resource Forks (com.apple.ResourceFork) will be removed from files that are either:
- uploaded to iCloud Drive from one system, then accessed from another system, or
- evicted from local storage, then materialised to the same system.
This occurs irrespective of the xattr flag applied.
This is most apparent when customising icons on folders and files, which still relies on using Resource Forks, and therefore can only work locally, and not via iCloud Drive. While this is a nice reminder of the history of Mac OS, it’s high time that it was replaced by a mechanism that’s better suited to modern macOS and iCloud.
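The size limit described above can be turned into a simple check. This sketch assumes the approximate figure of 32,650 bytes from my experiments, and that only the sizes of the extended attribute values count towards the total, not their names. Both are assumptions rather than documented behaviour, and the function and constant names are hypothetical. The sizes themselves are supplied by the caller, as gathering them is left outside this example.

```python
from typing import Mapping

# Approximate effective limit observed by experiment; not a documented figure.
ICLOUD_XATTR_LIMIT = 32_650


def total_xattr_size(xattr_sizes: Mapping[str, int]) -> int:
    """Sum the sizes in bytes of a file's extended attribute values."""
    return sum(xattr_sizes.values())


def exceeds_icloud_limit(xattr_sizes: Mapping[str, int],
                         limit: int = ICLOUD_XATTR_LIMIT) -> bool:
    """Return True if the combined xattr size is over the assumed limit."""
    return total_xattr_size(xattr_sizes) > limit


# A file with a custom icon: the Resource Fork alone is over the limit.
with_icon = {"com.apple.FinderInfo": 32, "com.apple.ResourceFork": 50_000}
# A file carrying only a couple of small xattrs.
plain = {"com.apple.FinderInfo": 32, "com.apple.quarantine": 57}

print(exceeds_icloud_limit(with_icon))  # True
print(exceeds_icloud_limit(plain))      # False
```

The sizes used in the two examples are illustrative only, and aren't measurements taken from real files.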
Is macOS ready for dataless files?
Currently, macOS has several undocumented constraints when working with dataless files. Although most might appear to be edge cases, the following need further attention:
- Time Machine backups,
- Spotlight searches,
- Document versions,
- Custom icons on folders and documents.
To quote Apple: “The system, or a person using the device, can make dataless files whenever they determine it’s appropriate, and your app needs to be ready to handle them.” So does macOS.