How can you check the integrity of backed up files?

A couple of weeks ago, I looked at how you can check your Time Machine backups to ensure that their file system is healthy and free from errors. The question which I ducked was how to check the integrity of files within any given backup.

This is important, as a backup may have a perfectly good file system while the data within important files has become damaged or corrupted. Ideally, you might wish to be able to do this twice: first when the backup has just been made, to verify its fidelity against the original, and again when you want to access its contents, for instance when restoring.

HFS+ backups

Both the Time Machine menu bar entry and its command tool equivalent tmutil let you check any given backup to see if its checksums match those expected. This relies on checksum information made and saved at the time of backup, and the command is based on
tmutil verifychecksums

This is only available for Time Machine backups made to HFS+ from OS X 10.11 onwards.

Unfortunately, the output from tmutil isn’t particularly helpful. If no errors are found, the command doesn’t return any result. Output appears only when there are errors: a file whose checksum doesn’t match that expected is listed with an exclamation mark !, and one whose recorded checksum is simply invalid is listed with a question mark ?. However, this does appear to be reliable.
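
Because the report is so terse, it can help to sort it into categories. The following Python is only a sketch: it assumes that each reported line begins with the ! or ? marker followed by whitespace and the file’s path, which is an assumption about the layout rather than a documented format. The function names run_verify and classify_report are invented for this illustration, and tmutil may need to be run with elevated privileges.

```python
import subprocess


def run_verify(backup_path: str) -> str:
    """Run tmutil verifychecksums on one backup and return whatever it prints."""
    completed = subprocess.run(
        ["tmutil", "verifychecksums", backup_path],
        capture_output=True,
        text=True,
    )
    # Collect both streams, as it is not assumed which one carries the report.
    return completed.stdout + completed.stderr


def classify_report(report: str) -> dict[str, list[str]]:
    """Sort report lines into mismatched (!) and invalid (?) paths.

    ASSUMPTION: each line looks like '<marker><whitespace><path>'.
    Anything else is kept under 'other' so that nothing is silently lost.
    """
    results = {"mismatched": [], "invalid": [], "other": []}
    for line in report.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        parts = stripped.split(None, 1)
        marker = parts[0]
        path = parts[1] if len(parts) > 1 else ""
        if marker == "!" and path:
            results["mismatched"].append(path)
        elif marker == "?" and path:
            results["invalid"].append(path)
        else:
            results["other"].append(stripped)
    return results
```

An empty report then yields three empty lists, which corresponds to the silent success described above.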

APFS backups

Neither the Time Machine menu nor tmutil verifychecksums works with regular backups to APFS volumes. I’m very grateful to winmaciek for pointing out that this is possible for a special backup using the contextual menu which appears when you Control-Shift-Click on the disk icon in the Time Machine pane.

That hidden command produces a special backup which might contain checksums and could therefore be verifiable. However, kapitainsky has checked the log, and reports that all it does is verify the FSEvents database, which has nothing at all to do with integrity checking. Without documentation from Apple, we’re left to guess what this feature does. In any case, there’s no apparent way to make these special backups the default, so even if they do check integrity, they’re of very limited use.

Although APFS does use checksums within file system metadata, it currently has no option to store or check them for file data. Although demand for such a feature might be low among those using APFS exclusively on SSDs, for some years to come many of us will still be using it on inherently unreliable hard disk storage, where checking integrity becomes more critical.

Dintch, Fintch and cintch

You can use my utilities Dintch, Fintch or their command tool companion cintch to tag and check files which are stored in any Time Machine backups, including those on APFS, or in iCloud.

These utilities compute a cryptographic hash (SHA-256, from macOS CryptoKit) for every file which you give them, and save it to an extended attribute of type co.eclecticlight.dintch.hash. The hash isn’t stored in a separate file, but remains attached to the individual file no matter where it goes, even when it’s backed up or transferred to iCloud.
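
To illustrate the principle, and not the code or storage format used by these utilities, here is a minimal Python sketch of tagging a file with its SHA-256 digest in an extended attribute and checking it later. It assumes the macOS xattr command is available, stores the digest as a hexadecimal string, and uses a made-up attribute name, com.example.sketch.sha256, so that it can’t interfere with the real co.eclecticlight.dintch.hash attribute. All the function names are hypothetical.

```python
import hashlib
import subprocess

# Hypothetical attribute name for this sketch only, not the one Dintch uses.
ATTR_NAME = "com.example.sketch.sha256"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 digest of a file as hex, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tag_file(path):
    """Compute the digest and write it to the extended attribute (macOS xattr)."""
    value = sha256_of_file(path)
    subprocess.run(["xattr", "-w", ATTR_NAME, value, path], check=True)
    return value


def read_tag(path):
    """Return the stored digest, or None if the file has no such attribute."""
    result = subprocess.run(
        ["xattr", "-p", ATTR_NAME, path], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def check_file(path):
    """Report 'ok', 'changed' or 'untagged' for one file."""
    stored = read_tag(path)
    if stored is None:
        return "untagged"
    return "ok" if stored == sha256_of_file(path) else "changed"
```

On a Mac, calling tag_file on a document and check_file on the same document later, or on its copy in a backup, reports whether its data still match the saved digest. Checking only reads the attribute, so it also works on read-only copies.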

To tag files locally or in iCloud, use one of those two apps or the command tool. To check their integrity, use them again to report on whether they still match that hash in their extended attribute. Checking files stored in regular folders and iCloud is very simple. It’s a bit more complex with those in your Time Machine backup, because you may need to mount the backup snapshot first before you can check it.

Files seen in the mounted backup are strictly read-only, and should represent the full extent of the current Time Machine backup. It therefore doesn’t contain items which were deleted before that backup, but should be an exact copy of the source as it was when that backup was made.

In my example, all 97 tagged files in my test folder validated successfully against their original hashes. The only exception was a .DS_Store file, which I neither put there nor tagged.
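
A check of a whole folder like this amounts to walking it and tallying the outcome for each file. The Python below is a rough sketch of that idea, not how Dintch or cintch are implemented. The check_tree function and its read_tag parameter are hypothetical: read_tag stands in for whatever returns a file’s stored hexadecimal digest, or None when the file was never tagged, as with that .DS_Store file.

```python
import hashlib
import os
from collections import Counter


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 digest of a file as hex, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_tree(root, read_tag):
    """Walk root and tally every file as 'ok', 'changed' or 'untagged'.

    read_tag(path) is supplied by the caller and returns the stored
    hex digest for that file, or None if it has no tag.
    Nothing is written, so this is safe on a read-only backup.
    """
    tally = Counter()
    problems = []
    for folder, _subfolders, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(folder, name)
            stored = read_tag(path)
            if stored is None:
                status = "untagged"
            elif stored == sha256_of_file(path):
                status = "ok"
            else:
                status = "changed"
            tally[status] += 1
            if status != "ok":
                problems.append((status, path))
    return tally, problems
```

Passing the tag reader in as a function keeps the walk independent of where the hashes are kept, so the same loop would serve for an extended attribute or any other store.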

Dintch, Fintch and cintch are of course free, and come with detailed documentation.

Thank you to winmaciek and kapitainsky for looking at the hidden mystery command in the Time Machine pane.