Game Mode revisited: why E cores are so important

In my first two articles (here and here) about Sonoma’s new Game Mode, I confirmed that it gives a game:

  1. exclusive access to E cores,
  2. highest priority access to the GPU,
  3. low-latency Bluetooth modes for input controllers and audio output.

How to get Game Mode

There was one outstanding question: what determines whether an app can take advantage of Game Mode? At first I suspected an entitlement, but I can now confirm that it’s far more straightforward, although still beyond the reach of the user: it’s set in the app’s Info.plist, in the LSApplicationCategoryType property. If that’s one of the many Game categories, macOS automatically puts the app into Game Mode when it’s set to Full Screen mode.
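To check which category an app declares, you can read its Info.plist directly. The following is a minimal sketch, not a description of how macOS itself decides. It assumes that the Game categories are the general public.app-category.games identifier plus the sub-categories whose identifiers end in -games, and the helper names are my own for illustration.

```python
import plistlib
from pathlib import Path

# General Games category identifier used in LSApplicationCategoryType.
GAMES_CATEGORY = "public.app-category.games"


def is_game_category(category):
    """Return True for the general Games category or one of its sub-categories.

    Assumption: game sub-categories are the identifiers that start with
    'public.app-category.' and end in '-games' (e.g. puzzle-games).
    """
    if not isinstance(category, str):
        return False
    return category == GAMES_CATEGORY or (
        category.startswith("public.app-category.")
        and category.endswith("-games")
    )


def app_category(app_path):
    """Read LSApplicationCategoryType from an app bundle, or None if absent."""
    info_plist = Path(app_path) / "Contents" / "Info.plist"
    with open(info_plist, "rb") as f:
        info = plistlib.load(f)  # handles both XML and binary plists
    return info.get("LSApplicationCategoryType")
```

For example, is_game_category(app_category("/Applications/Example.app")) reports whether a (hypothetical) Example.app declares a Game category. This only reads the file, so it leaves the app’s signature intact.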

LSApplicationCategoryType is an interesting property. For game apps destined for the App Store, it’s essential, as it determines where the app is presented there. For apps distributed independently, it makes little or no difference, and there’s no reason why other types of app couldn’t adopt a Game category if Game Mode would bring them benefits. However, because this property is set in the Info.plist file, any attempt by the user to add or change it breaks the app’s signature, and macOS won’t be amused.

Tests

For me, the great advantage of LSApplicationCategoryType is that I can now give my own apps full access to Game Mode. That’s what I have done, and as a result I have more detailed information about what Game Mode does, and a better explanation as to why it does that. Previously, I was restricted to assessing Game Mode when running a game. Now I have two test apps, AsmAttic and a Metal graphics demo, that I can run in Full Screen Mode either without Game Mode or with it.

AsmAttic is a loading benchmark app, running tight loops of assembly code that execute entirely in-core, without memory access, so they give a good indication of maximum core throughput for different numbers and types of cores. When run in Game Mode, its loops run at exactly the same speed as they do normally, whether they’re run on E or P cores. Furthermore, macOS control of core frequencies appears unchanged in Game Mode.

For assessing Metal and GPU performance, I used the Modern Rendering with Metal example code from Apple’s sample code repository. This is described as follows:
“Use advanced Metal features such as Indirect Command Buffers, Sparse Textures, and Variable Rate Rasterization to implement modern rendering algorithms. This sample uses advanced Metal features to render a complex scene with the latest rendering techniques and effects like GPU-based mesh culling, tile-based deferred lighting, ambient occlusion, volumetric fog, and cascaded shadow maps.”

For these tests, I built a version without the required LSApplicationCategoryType so that it would run in Full Screen Mode without entering Game Mode, and another version that enabled Game Mode with that property in its Info.plist. powermetrics was then used to sample CPU and GPU performance over 200 ms periods while the Metal demo was spun at full speed.
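The exact powermetrics invocation isn’t given here, so the following is only a sketch of one way to request 200 ms samples. It assumes the cpu_power and gpu_power samplers are the ones of interest, and the function name is hypothetical. powermetrics needs root privileges, hence the sudo prefix.

```python
def powermetrics_command(interval_ms=200, samples=None):
    """Build an argument list for sampling CPU and GPU power with powermetrics.

    -i sets the sampling interval in milliseconds, --samplers selects which
    samplers to run, and -n (optional) limits the number of samples taken.
    """
    cmd = [
        "sudo", "powermetrics",
        "-i", str(interval_ms),
        "--samplers", "cpu_power,gpu_power",
    ]
    if samples is not None:
        cmd += ["-n", str(samples)]
    return cmd
```

Joining the list returned by powermetrics_command(samples=50) with spaces gives a command line that can be pasted into Terminal, or the list can be passed to subprocess.run().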

Measured active residency is the percentage of available processor cycles that aren’t idle, but total throughput is a combination of both active residency and frequency, as explained below.

Results

Although the 24-core GPU in the M1 Max chip ran at high frequency and with high active residency, it remained well below the maxima found when running a Compute task involving particle simulation. Measured GPU power in watts is shown for the two conditions in the chart below.

[Chart gamegpucpu1: GPU power (W) in Game Mode and in Full Screen Mode without Game Mode]

GPU power measurements are shown here in Game Mode (solid line, + points) and in Full Screen Mode without Game Mode (broken line, x points). These range between 10-18 W once the window was set to Full Screen Mode at samples 1-3. Each mode shows peaks of 16-18 W from a steady level of about 11 W, and there’s no indication of any significant difference between them.

There were more consistent differences in measurements on the cluster of two E cores. Using the same conventions for lines and data points, the chart below demonstrates that, for much of the time, Game Mode resulted in higher active residency than Full Screen Mode alone.

[Chart gamegpucpu2: E core active residency in Game Mode and in Full Screen Mode without Game Mode]

That wasn’t true of core frequency, though, which switched between just over 1200 MHz and about 1700 MHz in each mode, as shown in the chart below.

[Chart gamegpucpu3: E core frequency (MHz) in Game Mode and in Full Screen Mode without Game Mode]

One way to combine both core frequency and active residency into a single metric is to derive the percentage maximum active residency (or throughput), as
(measured frequency / maximum frequency) × measured active residency
Thus, a core running at its maximum frequency and 100% active residency is running at 100%, while one running at half its maximum frequency and 50% active residency is at 25% of maximum active residency. Measurements of that are shown in the chart below.
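The arithmetic above can be sketched as a small function. The function name and the 2000 MHz maximum in the examples are placeholders of mine, chosen only to keep the numbers round, and aren’t measurements from these tests.

```python
def percent_of_max_active_residency(freq_mhz, max_freq_mhz, active_residency_pct):
    """Combine core frequency and active residency into a single percentage.

    Implements (measured frequency / maximum frequency) x measured active
    residency, with active residency given as a percentage (0 to 100).
    """
    if max_freq_mhz <= 0:
        raise ValueError("maximum frequency must be positive")
    return (freq_mhz / max_freq_mhz) * active_residency_pct


# Maximum frequency and 100% active residency: running at 100%.
full = percent_of_max_active_residency(2000, 2000, 100)

# Half the maximum frequency and 50% active residency: running at 25%.
quarter = percent_of_max_active_residency(1000, 2000, 50)
```

Applied to each powermetrics sample, this gives one value per sampling period that can be plotted for the two modes.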

gamegpucpu4

In Full Screen Mode alone, this metric was commonly slightly more than half its value in Game Mode, although in three of the sampling periods that difference was briefly reversed.

These results demonstrate that the E cores dedicated to Game Mode accomplished a higher instruction throughput for most of the time that the test was running, although there was no significant difference in GPU power use.

What are the E cores doing in Game Mode?

As the test didn’t use game controllers or generate any audio output, it’s not credible that the E cores in Game Mode were occupied handling low-latency Bluetooth connections. What is more likely is that the E cores were being used for processing and managing code for the GPUs, key activities for CPU cores during periods of high GPU use.

This is often overlooked in the headlong rush to have more GPU cores, but CPU cores are responsible for preparing and managing command buffers for processing by the GPU. Although the use of unified memory eliminates the expense of copying memory, and the design of Metal also minimises the work required of CPU cores, CPU and GPU cores still need to work in tandem. By giving exclusive access to E cores, Game Mode may thus ensure that the GPU cores can run at optimum speed.
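A toy model may help show why CPU time matters even when the GPU does most of the work. It assumes that CPU preparation of command buffers and GPU execution overlap fully, so the slower of the two stages sets the time per frame. The function name and the millisecond figures are hypothetical and aren’t taken from these tests.

```python
def gpu_busy_fraction(cpu_ms_per_frame, gpu_ms_per_frame):
    """Fraction of time the GPU is busy in a fully pipelined toy model.

    Assumption: the CPU prepares the next frame's command buffers while the
    GPU executes the current one, so frame time is the slower stage's time.
    """
    if cpu_ms_per_frame < 0 or gpu_ms_per_frame <= 0:
        raise ValueError("times must be non-negative, and GPU time positive")
    frame_ms = max(cpu_ms_per_frame, gpu_ms_per_frame)
    return gpu_ms_per_frame / frame_ms


# CPU keeps ahead of the GPU: the GPU is never left waiting.
unblocked = gpu_busy_fraction(cpu_ms_per_frame=2, gpu_ms_per_frame=8)

# CPU work is delayed (e.g. by competing threads): the GPU idles 20% of the time.
starved = gpu_busy_fraction(cpu_ms_per_frame=10, gpu_ms_per_frame=8)
```

In this model, anything that delays the CPU side beyond the GPU’s own frame time directly reduces how busy the GPU can be, which is consistent with the suggestion that reserving E cores protects GPU throughput.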

Conclusions

  • Any app can be run in Game Mode, if the LSApplicationCategoryType property in its Info.plist is set to one of the games categories.
  • Game Mode doesn’t appear to alter frequency control of E cores, nor does it alter core-intensive performance.
  • In the test used here, Game Mode didn’t change GPU power use during sub-maximal graphics tasks.
  • Game Mode did significantly increase E core active residency as a percentage of maximum, for most of the duration of the test.
  • The primary purpose of giving an app exclusive access to E cores in Game Mode may be to ensure that CPU core tasks in support of the GPU run at optimum speed.