How to check the integrity of files in a Time Machine backup

It turns out that you can check the integrity of the files in your Time Machine backups after all, so long as you’re running El Capitan or later. This isn’t offered in the Time Machine app or its preference pane, though, and I’m not sure what to make of the results in Terminal either.

According to the man page for tmutil, the command
tmutil verifychecksums path
computes a checksum for each item in the given path, and compares that with the checksum made at the time of that backup and stored in the backup. If they match, it doesn’t return anything. When it finds a checksum which doesn’t match, that’s reported in a simple list of errors. The first snag is the report format: the man page says that two different errors can be reported:

  • ! means that “the file’s current checksum does not match the expected recorded checksum”, and
  • ? means that “the file’s recorded checksum is invalid”.

This means that a checksum comparison can return not one of two results, but one of three, which is puzzling to me. I’m not sure how the second type of error can be identified, except when the file’s current checksum doesn’t match the recorded checksum, which is surely the first type of error?

The man page doesn’t state it, but this command must be run with elevated privileges. So armed with
sudo tmutil verifychecksums /Volumes/ThunderBay2HFS/Backups.backupdb/Howard’s\ iMac\ Pro/2020-05-08-073609
and the like, I checked a couple of my recent TM backups.

Even quite small backups take a long time – several minutes for 140 GB – to check. You can refine the path to be checked to specify individual folders within the backup, but the thought of using this on a terabyte or more of backups isn’t appealing.
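If you do want to work through a backup one folder at a time, the loop is easy to script. What follows is only a sketch, in Python rather than the shell, and the function names (build_command, subfolders, verify_each) are my own inventions rather than anything supplied by tmutil. It assumes that the script itself has enough privileges to list the folders in the backup, and it merges standard error into standard output because the man page doesn’t say which of them carries the report.

```python
import subprocess
import time
from pathlib import Path


def build_command(path):
    """Argument list for one verification run; sudo because root is needed."""
    return ["sudo", "tmutil", "verifychecksums", str(path)]


def subfolders(snapshot):
    """Immediate subdirectories of a backup folder, sorted by name."""
    return sorted(p for p in Path(snapshot).iterdir() if p.is_dir())


def verify_each(snapshot, runner=subprocess.run):
    """Check each subfolder in turn, yielding (folder, seconds, report).

    `runner` is a hypothetical hook so the loop can be exercised without
    tmutil; by default it is subprocess.run.
    """
    for folder in subfolders(snapshot):
        start = time.monotonic()
        done = runner(
            build_command(folder),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge both streams into one report
            text=True,
        )
        yield folder, time.monotonic() - start, done.stdout


if __name__ == "__main__":
    import sys

    # Usage: python3 verify_each.py /path/to/one/backup
    for folder, seconds, report in verify_each(sys.argv[1]):
        print(f"{folder}: {seconds:.0f} s")
        print(report, end="")
```

Timing each folder separately at least shows where the minutes are going, and lets you stop and resume without starting the whole backup again.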

You may also be less than impressed with the result. It doesn’t tell you how many items it checked, nor how many were missing checksums. All it does is list files in that backup which don’t match their checksums. In one backup, it reported two:
! /Volumes/ThunderBay2HFS/Backups.backupdb/Howard’s iMac Pro/2020-05-08-073609/External1/Documents/0newDownloads/PrivateDocuments1.sparsebundle/bands/0
! /Volumes/ThunderBay2HFS/Backups.backupdb/Howard’s iMac Pro/2020-05-08-073609/External1/Documents/0newDownloads/PrivateDocuments1.sparsebundle/bands/ee

which are bands in an encrypted sparse bundle, stored in a recent backup on a local SSD. So what can I do now to address this failed integrity check? I can’t tamper with the backup itself to try to fix those files, nor can I force them to be backed up again. Nor do I know whether the original is also damaged, nor the cause.
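Since the report gives no totals, you can at least build a rough summary yourself. This is a sketch which assumes that every error line consists of the marker, a single space, and then the path, as in the two lines above. The function names are hypothetical, and the file count comes from walking the folder, which may well differ from the number of items that tmutil actually checked.

```python
import os
from collections import Counter

# Marker meanings as given in the tmutil man page.
MARKERS = {
    "!": "mismatch",  # current checksum differs from the recorded one
    "?": "invalid",   # the recorded checksum is itself invalid
}


def parse_report(text):
    """Split a report into (kind, path) pairs, ignoring any other lines."""
    results = []
    for line in text.splitlines():
        marker, sep, path = line.partition(" ")
        if sep and marker in MARKERS and path:
            results.append((MARKERS[marker], path))
    return results


def count_files(root):
    """Count the non-directory entries beneath root, as a rough denominator."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def summarise(text, root):
    """Tally the errors in a report against the files found under root."""
    errors = Counter(kind for kind, _path in parse_report(text))
    return {
        "files": count_files(root),
        "mismatch": errors["mismatch"],
        "invalid": errors["invalid"],
    }
```

Feed summarise() the saved output of the command and the path that was checked, and it returns the three counts as a dictionary. It still can’t tell you how many files had no checksum to begin with.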

All this does is demonstrate my original point: backups can silently become corrupt for no apparent reason. File integrity needs to be taken more seriously, something which macOS doesn’t currently address.

Postscript

Thanks to John (see his comment below) for discovering that Time Machine records these checksums in an extended attribute added to each file which it backs up. The xattr type is com.apple.finder.copy.source.checksum, and it’s given a flag #N which means that it’s stripped when the file is copied between volumes, such as when you restore the file.

It contains a 4-byte integer, which I would have presumed was a CRC32, but on checking the CRC32 for the data fork of backed-up files, that isn’t the same as the checksum stored in the xattr. Neither does it appear to be an Adler32. This means that you can’t check the checksum manually either.
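For anyone who wants to repeat that comparison, here is a sketch of one way to do it in Python on macOS. Several things in it are assumptions: that the attribute name passed to the xattr tool includes the #N suffix, that the stored value could be in either byte order, and that the file is small enough to read into memory in one go. The function names are mine. Going by what I found, expect matches() to return an empty list.

```python
import subprocess
import zlib

# Assumption: the name as stored includes the "#N" flag suffix.
XATTR_NAME = "com.apple.finder.copy.source.checksum#N"


def parse_xattr_hex(text):
    """Turn the hex dump printed by `xattr -px` into raw bytes."""
    return bytes.fromhex("".join(text.split()))


def read_stored(path, name=XATTR_NAME):
    """Read the stored value using the macOS xattr tool (macOS only)."""
    done = subprocess.run(
        ["xattr", "-px", name, path],
        capture_output=True, text=True, check=True,
    )
    return parse_xattr_hex(done.stdout)


def candidates(data):
    """The two 32-bit checksums tried so far, computed over the data fork."""
    return {
        "crc32": zlib.crc32(data) & 0xFFFFFFFF,
        "adler32": zlib.adler32(data) & 0xFFFFFFFF,
    }


def matches(stored, data):
    """Names of the candidates equal to the stored value in either byte order."""
    values = {int.from_bytes(stored, "big"), int.from_bytes(stored, "little")}
    return [name for name, value in candidates(data).items() if value in values]


if __name__ == "__main__":
    import sys

    # Usage: python3 compare.py /path/to/backed-up/file
    with open(sys.argv[1], "rb") as f:
        data = f.read()
    print(matches(read_stored(sys.argv[1]), data))
```

Other 32-bit checksums could be added to candidates() in the same way, should anyone have a better guess as to what Time Machine uses.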

It does therefore make sense to have a ‘checksum invalid’ error, but there’s another error code which is missing: ‘checksum not present’. Ho hum.