A number of readers of this blog have recently commented on the bug which remains in macOS Sierra 10.12.6, in which Time Machine’s backups suddenly become irregular. This occurs in Macs which are left running for periods of more than about a week, but can occur sooner.
Once it has happened, the only way to restore normal automatic backups is to restart; some users instead invoke fixed-time backups, but I have warned that this doesn’t address all the problems which result from the bug.
I found a little time to explore this using the latest version of Woodpile (from Downloads above), which shows some quite distinctive and meaningful patterns, which I thought I would share here.
This is what Time Machine backups normally look like in the log: fairly, but not completely, regular bursts of about 10 log entries by com.apple.TimeMachine every hour or so. Command-click on these bursts and you’ll see that they are not at precisely hourly intervals, as the dispatching system runs them more flexibly.
This is what happens when they become irregular, or here completely missed. On this occasion, they were skipped entirely for a period of over two hours before I noticed and restarted, kicking them back into action again.
The undocumented scheduling and dispatching system DuetActivityScheduler (DAS) appears to be at fault. When Woodpile analyses a logarchive, it normally lists its activities under the process name DuetHeuristic-BM, which makes it very easy to assess. Here is a snapshot of its steady log load, ranging from 100-400 entries every 10 minutes when the Mac is being used, down to 40-90 when it is unused but not asleep, in the small hours of the morning.
This is what happens to DuetHeuristic-BM log load when the bug occurs: from 23:48 to 07:49 there is not a single log entry from it. Over that same period, not a single backup was made either.
Duet and CoreDuet are pervasive systems which monitor all sorts of environmental and other factors to do with power or energy use, and activity. Here’s the coreduetd service or daemon which runs the monitoring side over the same period that DuetHeuristic-BM was completely silent. So this part of the CoreDuet system still seems to be working fine, it is just the scheduling and dispatching service which is broken.
The impact of dispatching failure goes well beyond automatic backups. DAS schedules and dispatches over 70 different services, few of which we know much about. This chart shows log load for CalendarAgent over the period that DAS was broken: another long and uncharacteristic silence. So when your backups stop working, so do Calendar services such as notifications of events and appointments.
Here is the same pattern of silence reflected in AddressBookSourceSync, which synchronises your Address Book for Contacts. When your backups stop working, so does Address Book synchronisation.
The same pattern is shown by com.apple.geod, which is concerned with locations – another background activity which clearly stops working when automatic backups become irregular.
Recently, I have also been looking inside the DAS code. This appears to be a common component with iOS, so the buggy versions of DAS which cause this problem in Sierra are those which perform the same function for iOS 10. One significant difference, of course, is that iOS doesn’t have Time Machine, so there are no backups to worry about.
The version of DAS which ships in High Sierra is, I believe, fundamentally the same as that in iOS 11. Indications are that this finally fixes this bug, and that DAS continues to work normally in High Sierra, with Time Machine backups remaining regular after even two or three weeks of continuous running.
This is going to be difficult to confirm. The Macs most likely to manifest this bug are desktop systems, most of which have internal Fusion Drives or hard disks, and are unlikely to have been upgraded to High Sierra yet. Furthermore, the release of High Sierra’s two updates so far has required restarts, so the longest-running High Sierra systems were probably last restarted on 31 October. It will take until mid-November before the bug may manifest in systems which have been updated to 10.13.1.
Whilst I was looking at two different logarchives, I assessed how quickly log entries are weeded from Special log files too. I’m afraid that entries seem to disappear very soon after they are saved to disk, limiting their value unless they have been captured very recently.
Woodpile seems very stable now, and I am pleased with its features and function. One issue which I need to address soon is that of the time interval used to display the contents of individual log files. There is no easy way of obtaining the datestamp of the first and last entries in any individual log file, so currently Woodpile uses the time span between the first and last entries for the selected process – which is usually significantly shorter than the period of the whole log file. This makes it more difficult to compare charts for different processes in the same log file. I need to come up with a way of displaying them over a common period.
If you have experience of running High Sierra for long periods on a Mac, or suggestions for the next features to be added to Woodpile, please comment here. I’ll be delighted to hear from you.