Every file in macOS has two different types of information: the content of the file itself, its data, and information about the file and those data, its metadata. This article looks at the second: what metadata are stored, where they are kept, and how you can get the most out of them. I include here not just regular files, but those compound documents which are actually folders (bundles) dressed up to look like files, such as RTFD and Word .docx documents.
Attributes
The most fundamental metadata of each file are its file system attributes: information such as the name of the file and any extension, datestamps for its creation and modification, what type of file it is, its permissions, and so on. You can’t directly edit many of these, as they’re set by the file system itself. The notable exception is permissions.
A selection of the most important of these attributes is shown in the uppermost and General sections of the Finder’s Get Info dialog, with Sharing & Permissions at its foot.
Although attributes are dull and everyday, they’re the metadata most commonly used for searching and sorting files, for instance in Finder windows. Even when you search on other more specialist metadata, the results are usually ordered according to one of their attributes, such as the filename, date of creation, or size.
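As a rough illustration of what these attributes look like to code, here is a minimal Python sketch which reads a few of them through the standard os.stat call. The function name file_attributes and the choice of fields are assumptions made for this example, and the creation date is only reported where the platform provides st_birthtime, as macOS does.

```python
import os
import stat
from datetime import datetime, timezone

def file_attributes(path):
    """Return a few file system attributes for path as a dictionary."""
    info = os.stat(path)
    # st_birthtime is available on macOS and some BSDs; elsewhere report None.
    created = getattr(info, "st_birthtime", None)
    return {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1],
        "size": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "created": (datetime.fromtimestamp(created, tz=timezone.utc)
                    if created is not None else None),
        "permissions": stat.filemode(info.st_mode),
        "is_directory": stat.S_ISDIR(info.st_mode),
    }
```

Calling file_attributes on any path you can read returns the same kind of information that the Finder sorts on: name, dates and size.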
Of my apps, the one which gives the most detailed access to file system attributes is Precize.
Embedded metadata
Until relatively recently, most popular file systems only supported one type of metadata, attributes, which excludes almost all the information we have come to rely on – for photos, details of the camera, lens, and settings when taking the photograph; for many document types, author and copyright. The only generally acceptable way to add those metadata was within the file data itself, to embed them in it, which isn’t the way that metadata should work at all.
For example, many Rich Text editors, even TextEdit, allow you to set metadata such as the name of the author, copyright, and keywords. Those are then saved within the file data, for example in content such as:
\info {\author Howard Oakley}{\title Dintch Help file}{\doccomm Convert to PDF for use in Ditnch build folder.}{\*\company EHN & DIJ Oakley}{\*\copyright \u169 ? 2020 EHN & DIJ Oakley}{\keywords help, Dintch, documentation}{\creatim \yr2019 \mo2 \dy16 \hr19 \min36 }{\revtim \yr2020 \mo5 \dy5 \hr8 \min48 }{\printim \yr2020 \mo3 \dy21 \hr19 \min44 }\nofpages9
Because these are embedded, only those apps which can read that file type can access them. They often don’t appear in the Finder’s Get Info dialog, nor in any of its methods of viewing files. Provided that Spotlight’s mdworker processes can read these metadata, they should get indexed in the main Spotlight database, and so become accessible. This depends on there being the right Spotlight Importer, a bundle with the extension mdimporter in one of the Library/Spotlight folders.
Spotlight maps embedded metadata to the types which it recognises. This means that, in the example Rich Text above, that document will be associated with the Keywords given in the \keywords data inside the document, and searching for the keyword help will result in that document being a hit.
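To show why embedded metadata can only be read by software which understands the file format, the following Python sketch pulls simple text fields out of Rich Text like the example above. It is not how a Spotlight Importer works, only an illustration under stated assumptions: the names rtf_info_fields and rtf_keywords are hypothetical, and the regular expression deliberately skips fields whose values contain further control words, such as the dates and the copyright line.

```python
import re

# Matches simple groups such as {\author Howard Oakley} or {\*\company ...},
# but not groups whose value contains further control words or nested groups.
INFO_FIELD = re.compile(r"\{(?:\\\*)?\\([a-z]+) ([^{}\\]*)\}")

def rtf_info_fields(rtf_text):
    """Return plain-text fields found in RTF groups as a dictionary."""
    return {name: value.strip() for name, value in INFO_FIELD.findall(rtf_text)}

def rtf_keywords(rtf_text):
    """Return the comma-separated \\keywords value as a list of keywords."""
    keywords = rtf_info_fields(rtf_text).get("keywords", "")
    return [word.strip() for word in keywords.split(",") if word.strip()]
```

Given the example text, rtf_keywords should return the three keywords help, Dintch and documentation, which are the terms a search would then need to match.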
Extended attributes
On macOS, the preferred way to store metadata is in extended attributes (xattrs), which are stored on each APFS volume with the file system metadata, appropriately. Anyone can designate their own types of extended attribute, but Apple provides a standard set which are supported by macOS, integrate with Spotlight and, to a much more limited extent, with the Finder.
The most visible xattrs are:
- com.apple.metadata:_kMDItemUserTags, which labels and sets the colour of Finder tags,
- com.apple.metadata:kMDItemFinderComment, which determines what is displayed in Finder comments,
- com.apple.metadata:kMDItemKeywords, which are displayed in the Get Info dialog’s More info,
- com.apple.metadata:kMDItemWhereFroms, which may be displayed in More info for downloaded items,
- com.apple.metadata:kMDItemDownloadedDate, which may be displayed in More info for downloaded items.
Of those, only Finder tags and comments can be readily edited by the user.
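The values of these xattrs are not plain text. As commonly observed, com.apple.metadata:_kMDItemUserTags holds a binary property list containing an array of strings, each giving the tag name optionally followed by a newline and a colour index. The Python sketch below decodes such a value on that assumption; the function name decode_user_tags is hypothetical, and the sample bytes are built in the code rather than read from a real file, as fetching the raw xattr is left out here.

```python
import plistlib

def decode_user_tags(raw_value):
    """Decode the bytes of a _kMDItemUserTags xattr into (name, colour) pairs."""
    tags = []
    for entry in plistlib.loads(raw_value):
        # Each entry is assumed to be "name" or "name\ncolour index".
        name, _, colour = entry.partition("\n")
        tags.append((name, int(colour) if colour else 0))
    return tags

# Hypothetical sample value, constructed here purely for illustration.
sample = plistlib.dumps(["Important\n6", "Drafts"], fmt=plistlib.FMT_BINARY)
print(decode_user_tags(sample))
```

Keeping the decoding separate from the reading means the same function can be fed bytes obtained by whatever means you prefer for fetching xattrs.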
macOS currently supports a very large range of standard xattrs, many of which are listed under their Spotlight names here. Many of those match the items listed in the search criteria popup menu in a Finder Find window, which also includes embedded and other metadata. Unfortunately their names don’t match well, and working out which Spotlight attribute matches which xattr name matches which Find criterion is often a matter of intelligent guesswork.
With so many to choose from, it can be tempting to stick to the most popular, such as Finder comments. If you want to use those metadata, particularly for searching or sorting, this often proves to be a big mistake. When researching this article, I wanted to find all executable code which contained my name in its com.apple.metadata:kMDItemCopyright xattr, known to Spotlight as kMDItemCopyright, and to the Finder as Copyright. I have used this xattr in the past to tag many of my photos, which is also perfectly reasonable. I therefore had to add more search criteria to block thousands of images appearing in the list of hits.
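A search of that kind can be sketched in Python by building a Spotlight query and handing it to the mdfind command tool. This is only an outline under assumptions: the function names are hypothetical, public.executable is assumed to be a suitable content type for executable code, and no attempt is made to escape quotation marks in the name.

```python
import shutil
import subprocess

def copyright_query(name, content_type="public.executable"):
    """Build a Spotlight query for one content type whose copyright mentions name."""
    # The trailing c makes the copyright comparison case-insensitive.
    return (f'kMDItemCopyright == "*{name}*"c'
            f' && kMDItemContentTypeTree == "{content_type}"')

def find_by_copyright(name):
    """Run the query with mdfind where available and return a list of paths."""
    if shutil.which("mdfind") is None:
        return []  # not on macOS, or mdfind isn't on the PATH
    result = subprocess.run(["mdfind", copyright_query(name)],
                            capture_output=True, text=True, check=False)
    return result.stdout.splitlines()
```

The second term of the query does the work of excluding images which carry the same copyright metadata, which is the extra criterion described above.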
The general principle with metadata, and xattrs are no exception, is to use the most specific type of metadata which will generate the cleanest search results. You should avoid, if at all possible, reusing types of metadata for several different purposes. macOS supports so many different types that there’s no excuse for reuse, but many do it for convenience, for example with Finder tags.
Another concern in using xattrs is their limited support when files are moved from a Mac, either through a cloud service or to a different file system. Many different xattrs are now preserved in iCloud and some other cloud services, and most file systems provide mechanisms to support them too. Before choosing an xattr to use for any significant purpose, you should check that it’s preserved wherever you want it to go, or your metadata may be stripped the moment it leaves your Mac. You also need an app which allows you to add and maintain that type of metadata.
I have several free apps which work with xattrs. The simplest are SearchKey and SearchKeyLite, and the master toolkit is xattred, all of which are detailed here.
Info.plist in a bundle
Apps are the bundles most commonly encountered by users, and contain their own metadata, which are set in the Info.plist property list. Developers are required to set many metadata items in those files, some of which are exposed to the user. These aren’t normally converted to xattrs, are protected by that app’s signature, and some may be displayed in the Finder.
Among the most common and important to the user are:
- CFBundleIdentifier, which gives the app/bundle’s ID,
- CFBundleShortVersionString, which gives a concise human-readable version number,
- CFBundleVersion, which gives a terse bundle version number,
- LSApplicationCategoryType, which gives its App Store category as a UTI.
In theory, there’s nothing to stop the user from editing these, at least in apps which aren’t protected by SIP. Doing so breaks their signature, which is likely to make macOS, Catalina at least, refuse to open the app.
Metadata defined in a bundle’s Info.plist are indexed by Spotlight, and can therefore be used for sorting, searching and more.
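Because Info.plist is an ordinary property list, those keys can be read without opening the app. The Python sketch below does so with the standard plistlib module, assuming the usual macOS app layout in which the file sits at Contents/Info.plist; the function name bundle_metadata and the selection of keys are choices made for this example.

```python
import os
import plistlib

# The keys listed above; any of them may be absent from a given bundle.
KEYS = (
    "CFBundleIdentifier",
    "CFBundleShortVersionString",
    "CFBundleVersion",
    "LSApplicationCategoryType",
)

def bundle_metadata(bundle_path):
    """Read selected keys from a bundle's Contents/Info.plist."""
    plist_path = os.path.join(bundle_path, "Contents", "Info.plist")
    with open(plist_path, "rb") as handle:
        info = plistlib.load(handle)  # handles both XML and binary plists
    return {key: info.get(key) for key in KEYS}
```

This only reads the file, so it leaves the app’s signature untouched.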
Embedded Info.plist data
The final category of metadata is the least known. Single-file executable code such as a command tool has nowhere to store metadata which would, in a full bundle, be kept in its Info.plist file. Developers can instead opt for its Info.plist data to be appended to the executable code in the single file. This appears to be little used, and these embedded metadata may not be accessible to Spotlight anyway.