What happens when background scheduling fails

In my first article about background ‘activity’ scheduling and dispatching using Centralized Task Scheduling (CTS), I mentioned the bug which causes it to break down and stop automatic Time Machine backups in macOS 10.12 Sierra. This article looks at that and other points of failure, which can cause problems for your Mac.

Symptoms

The most visible activities managed by CTS are the hourly automatic backups made by Time Machine, which are also probably its most important work. The symptoms were obvious to all who kept a watch on their backups: after a few days of continuous running, backups stopped or became highly irregular.

For example, you might see in the log two consecutive automatic backups nearly 5 hours apart:
02:14:27 backupd: Starting automatic backup
07:00:26 backupd: Starting automatic backup

and further backups would be cancelled because they occurred too soon after the last backup:
07:06:16 backupd-helper: Not starting scheduled Time Machine backup: Less than 10 minutes since last backup

However, periodic tasks which are managed by other systems, such as launchd, remain regular.

If you don’t run Time Machine automatic backups, problems with CTS are likely to be harder to spot, although some of the systems affected may be almost as important as backing up. In a future article, I’ll be listing all the activities managed by CTS in Catalina 10.15.7.

CTS absent from the log

Looking at the log for the period between backups, few or no entries are made by CTS or DAS. Two consecutive examples from CTS are a good example:
02:15:06 com.apple.xpc.activity.All Submitting DASActivity: <_DASActivity: "0:com.apple.backupd-auto", Utility, 60s, [2/24/17, 3:05:09 AM - 2/24/17, 3:35:09 AM], Plugin Required, Dark-Wake Eligible>
06:59:56 com.apple.xpc.activity.All DAS told us to run com.apple.backupd-auto

For a period of over four hours, although my Mac didn’t sleep, there wasn’t a single log entry from DAS, CTS, or any component of Time Machine.

The next log entry, at 06:59:56, was from DAS, which appeared to have been experiencing a problem, as it didn’t perform its normal periodic rescoring of activities, but reported that it was evaluating 72 activities based on triggers, rather than the normal scores. It then confirmed that backupd-auto should run, and CTS started its normal sequence of entries which resulted in the first backup being made.

Failure during startup

During startup, CTS and DAS make copious entries in the log. DAS is initialised quite soon after startup, typically with the announcement
Initializing DASDaemon…

A little later, CTS as the sub-system com.apple.xpc.activity registers changes, with messages stating
User Active FullWake:cpu disk net aud vid resuming activities on transition
Time Change: Registered for time change notifications

Following these, CTS starts to load DAS with activities, beginning with those which are run by root. Among the first to be installed are two Time Machine backup activities: backupd-auto, the main backup service, which repeats every 60 minutes within a grace period of 30 minutes; backupd-auto.dryspell, a non-repeating activity which runs at some time over the next day or so, follows.

With activities still being loaded into it, the DAS daemon is made fully active, with the announcement
_DASDaemon started

Error messages encountered during this initialisation should be fairly clear, and are grounds for suspecting a problem in CTS.

Sleep and wake

CTS also plays a prominent role in sleep and wake events, as it has to suspend its schedules and reactivate them respectively. When there are problems with either event, CTS can suffer its own problems. It could also play a role in systems which can’t stay asleep properly, when it merits investigation using log entries.

The first reflection of sleep being imminent is shown in the log by CTS, which reports
DarkWake:cpu disk net suspending activities on transition
The reason that this and other log entries from CTS can be duplicated, is that there are two different instances of UserEventAgent which are originating CTS events, one owned by root, the other by the user. However, DAS log entries are originated by a single process owned by root.

DAS responds to CTS in log entries with messages such as
Entering dark wake
Clearing list of activities that ran Suspending tasks due to wake state transition

CTS and DAS continue to make arrangements for sleep before it takes effect.

One of the earliest markers of the wake-up sequence is CTS, which records (from each of its two instances of UserEventAgent)
DarkWake:cpu disk net
User Active FullWake:cpu disk net aud vid resuming activities on transition

DAS then prepares itself to run its scoring system to determine which activities get dispatched, and rescores all its activities.

Solutions

As an undocumented black box system, there’s little the user can do to address problems with CTS other than to restart and hope that clears the issue. Where individual activities appear to be causing problems, you should locate the LaunchAgents or LaunchDaemons property list in which their loading is controlled, and check that. Unfortunately, bugs in background services which mess up CTS can only be addressed by fixing the bug, as CTS controls are embedded in code rather than controlled by property lists or configuration files.

I simply don’t know how common or rare CTS problems are now that the bug in Sierra has been fixed in High Sierra.