hoakley May 2, 2022 Macs, Technology

Don’t trust Activity Monitor on M1 Macs

As more developers look at giving users control over which cores do the heavy lifting in their apps when running on M1 Macs, they’re puzzling over contradictory figures given by Activity Monitor. In many cases, these appear to demonstrate that running code exclusively on E cores uses more energy, not less. This article explains why, and its message is not to trust Activity Monitor’s CPU or Energy figures.

To understand what they’re seeing, I used my app AsmAttic to run tests with different numbers of threads at the two extremes of QoS: 9 or ‘background’, which constrains the code to E cores, and the highest of 33, which runs the code preferentially on the P cores until they’re fully loaded, then uses available E cores as well.
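
For reference, the two QoS extremes can be written out as a small lookup. This is a sketch only: the numeric values are those of the macOS QoS classes in `<sys/qos.h>`, the names follow the Swift spellings, and the `core_types` helper is hypothetical, summarising the scheduling behaviour described in this article rather than any API.

```python
# Numeric values of the macOS QoS classes (as defined in <sys/qos.h>).
QOS_CLASSES = {
    "background": 9,
    "utility": 17,
    "userInitiated": 25,
    "userInteractive": 33,
}

def core_types(qos):
    """Core types a thread may be scheduled on, as described in this article."""
    if qos == QOS_CLASSES["background"]:
        return ("E",)       # lowest QoS: constrained to the E cores
    return ("P", "E")       # higher QoS: P cores preferred, E cores also used

for name, value in QOS_CLASSES.items():
    print(f"{name} ({value}): {'+'.join(core_types(value))} cores")
```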

Are E cores really less efficient?

For this introductory example, I use 8 threads of floating point maths on an M1 Max (in a Mac Studio), and a tight loop run 1 billion times in each thread. The results from the two different QoS settings are:

8 threads on 8 P cores took 6.6 s.
8 threads on 2 E cores took 40.4 s.

So there’s a big performance hit from constraining that code to the E cores. What do you get in return? According to Activity Monitor’s Energy pane:

On 8 P cores, an energy value of 800 was sustained for 6.6 s, giving a total of 5280 units.s, or 660 per thread.
On 2 E cores, an energy value of 194 was sustained for 40.4 s, giving a total of 7838 units.s, or 980 per thread.
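
Those totals are simply Activity Monitor’s energy value multiplied by the elapsed time, then divided by the number of threads. A minimal sketch of that arithmetic, using only the figures quoted above (the function name is made up for illustration):

```python
def activity_monitor_energy(energy_value, seconds, threads):
    """Return (total units.s, units.s per thread) from Activity Monitor's figures."""
    total = energy_value * seconds
    return total, total / threads

# Figures quoted in the text for 8 threads on an M1 Max.
tests = {
    "8 threads on 8 P cores": (800, 6.6, 8),
    "8 threads on 2 E cores": (194, 40.4, 8),
}

for name, (value, seconds, threads) in tests.items():
    total, per_thread = activity_monitor_energy(value, seconds, threads)
    print(f"{name}: {total:.0f} units.s total, {per_thread:.0f} per thread")
```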

The clear conclusion is that running these eight threads on the E cores was considerably less efficient than running them on the P cores. And if you believe that, you drop the idea of offering the user control over QoS, and run all that app’s code at high QoS after all.

Activity Monitor’s problems

This result occurs because Activity Monitor, currently version 10.14 in macOS 12.3.1, doesn’t know the difference between processors with identical cores running at fixed frequency, and Apple’s M1 chips, with two different types of core and variable frequencies for each cluster of cores. Given that it’s now nearly 18 months since Apple started shipping its first M1 Macs, you might think that a little surprising. It’s even worse that Activity Monitor’s errors are discouraging developers from making better use of the cores in M1 chips.

If you’ve got an M1 Pro or Max available, it’s not hard to see where this is going wrong.

With an app like AsmAttic, run a single thread at minimum QoS so it runs only on the E cores. Then repeat that with two threads imposing the same computational load. Here are my results:

1 thread on 2 E cores took 19.5 s, at 98% CPU and an energy value of 99.
2 threads on 2 E cores took 10.0 s, at 194% CPU and an energy value of 194.

The single thread has a total energy of 1931 units.s, at 1931 per thread. Running two threads has a total energy of 1940, at 970 per thread. So running twice the amount of code on E cores alone takes the same amount of energy, according to Activity Monitor. That’s obviously wrong.

To discover how Activity Monitor is coming to those false results, you’ll need to run the command tool powermetrics to discover the frequencies the two E cores were running at, and their actual power consumption. Although this is fiddly, the effort is worth it: with the single thread, the E cores run at a frequency of about 972 MHz, but with two threads that increases to their maximum of 2064 MHz. Activity Monitor is oblivious of that, and doesn’t correct its CPU values to allow for frequency changes.
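
One way to gauge the size of that omission is to scale Activity Monitor’s CPU % by the frequency the cores were actually running at. This is a rough illustration only, not something Activity Monitor or powermetrics does: it assumes that work done is proportional to clock frequency, and the function name is hypothetical.

```python
E_CORE_MAX_MHZ = 2064  # maximum E core frequency given in the text

def frequency_scaled_cpu(cpu_percent, freq_mhz, max_mhz=E_CORE_MAX_MHZ):
    """Scale a CPU % figure by how fast the cores were actually clocked."""
    return cpu_percent * freq_mhz / max_mhz

one_thread = frequency_scaled_cpu(98, 972)     # 1 thread, E cores at about 972 MHz
two_threads = frequency_scaled_cpu(194, 2064)  # 2 threads, E cores at 2064 MHz

print(f"1 thread:  {one_thread:.0f}% of a full-speed E core")
print(f"2 threads: {two_threads:.0f}% of a full-speed E core")
print(f"ratio: {two_threads / one_thread:.1f}x")
```

On these figures the two-thread test is doing roughly four times the work per second of the single-thread test, in keeping with it getting through twice the code in about half the time. Activity Monitor’s raw CPU % suggests only a doubling.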

A little study of the figures given for energy shows they’re essentially the same as CPU, and that, regardless of frequency or core type, full active residency of a core counts as 100% CPU and 100 energy units, just as it would on an Intel processor with its identical cores. Thankfully powermetrics quickly disabuses you of that gross error: 100% active residency of an E core running at maximum frequency uses around 100 mW, and the same active residency of a P core running at its higher maximum frequency uses ten times that amount, 1000 mW.
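
Those power figures allow a back-of-envelope recalculation of the introductory example. This is a sketch under stated assumptions, not a measurement: it assumes every core in use sits at 100% active residency and maximum frequency for the whole run, and it counts the CPU cores alone. For real figures, measure with powermetrics.

```python
E_CORE_MW = 100    # approx. power of an E core at maximum frequency, from the text
P_CORE_MW = 1000   # approx. power of a P core at maximum frequency, from the text

def estimated_energy_joules(cores, mw_per_core, seconds):
    """Energy in joules = power in watts x time in seconds."""
    return cores * (mw_per_core / 1000) * seconds

p_run = estimated_energy_joules(8, P_CORE_MW, 6.6)    # 8 threads on 8 P cores
e_run = estimated_energy_joules(2, E_CORE_MW, 40.4)   # 8 threads on 2 E cores

print(f"P cores: about {p_run:.1f} J")
print(f"E cores: about {e_run:.1f} J")
print(f"E cores use about {100 * e_run / p_run:.0f}% of the P core energy")
```

Under those assumptions the E cores complete the same work for a small fraction of the energy, the opposite of what Activity Monitor’s energy values imply.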

Results differ on different M1 models

Because macOS controls the frequency of E cores on different M1 chips differently, your results won’t be consistent across different Mac models. Those given here were obtained on an M1 Max, and the M1 Pro is essentially identical. Use an original M1, though, and Activity Monitor considers that its four E cores and four P cores are just the same. Run the same four threads at minimum QoS on its E cores, which remain at a frequency of about 972 MHz, and they’ll compare even more unfavourably with those threads at higher QoS on its P cores, running at 3204 MHz.

Can you use Activity Monitor for rough comparisons?

Until Apple updates the figures returned by Activity Monitor for M1 chips, confounding by core type and frequency makes it not just useless, but actually misleading for comparing CPU % or energy. If you need to assess those, for example when considering whether to let the user change the QoS of threads in your code, the only reliable tool is powermetrics, which provides details of cluster frequencies and power use, as well as active residency.
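
As a starting point, here is a sketch of how powermetrics might be driven from a script. The `cpu_power` sampler and the `-i` (interval in milliseconds) and `-n` (number of samples) options should be checked against `man powermetrics` on your own Mac. The function names are hypothetical, and the tool has to be run as root.

```python
import subprocess

def powermetrics_command(interval_ms=1000, samples=10):
    """Build a powermetrics command line that samples CPU power and frequency."""
    return [
        "sudo", "powermetrics",
        "--samplers", "cpu_power",   # cluster frequencies, residency and power
        "-i", str(interval_ms),      # sampling interval in milliseconds
        "-n", str(samples),          # number of samples to take before exiting
    ]

def run_powermetrics(interval_ms=1000, samples=10):
    """Run powermetrics and return its text output (macOS only, needs root)."""
    result = subprocess.run(
        powermetrics_command(interval_ms, samples),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Show the command that would be run, without executing it.
print(" ".join(powermetrics_command()))
```

Start the sampling just before running your test, then read the cluster frequencies, active residency and power from the samples covering the test period.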

Even Activity Monitor’s CPU History window requires careful interpretation to avoid propagating error. The image below shows the window for four of those tests on an M1 Max, from the left:

1 thread on 2 E cores, 19.5 s.
2 threads on 2 E cores, 10.0 s.
8 threads on 8 P cores, 6.6 s.
10 threads on 2E+8P cores, 7.1 s.

[Image: actmonerrors, Activity Monitor’s CPU History window for the four tests]

Blue lines divide cores into their clusters.

To further illustrate the errors generated by Activity Monitor, here are its measurements of total energy used per thread in my series of tests:

1 thread on 2 E cores, 1931 units.s/thread.
2 threads on 2 E cores, 970.
8 threads on 2 E cores, 980.
8 threads on 8 P cores, 660.
10 threads on 2E+8P cores, 704.

Those appear to demonstrate that any use of the E cores, even in combination with all the P cores, results in higher energy use. Taken at face value, it makes you wonder why Apple bothered putting any E cores into its M1 series chips.

Until Apple updates Activity Monitor to give reasonable figures for M1 chips, don’t use or trust its CPU % or energy values: they’re nonsense.

Postscript

If you’d like to see what happens with real-world apps, and their performance on E and P cores, this sequel article shows how you can assess energy and power using powermetrics, and demonstrates that using the E cores can use less than 30% of the energy of the P cores in real life.

27 Comments


1

EcleX on May 2, 2022 at 9:05 am

Thanks. Really amazing. Hopefully, Apple monitors this awesome site. Maybe filing a report from “/System/Library/CoreServices/Applications/Feedback Assistant.app” or https://feedbackassistant.apple.com could also help.

- 2
  
  hoakley on May 2, 2022 at 8:39 pm
  
  Are you seriously suggesting that Apple isn’t fully aware of this? And, assuming that it is, why hasn’t it done something to rectify this in the 18 months that M1 Macs have been shipping? Of course, Apple would like to play the game of reporting, and then can ignore the report, or respond that the app works as intended. But why do you think that sending Feedback would alter that?
  Howard.
  
  - 3
    
    EcleX on May 2, 2022 at 9:31 pm
    
    In my experience, Apple addresses some issues reported that way.
    
    - 4
      
      hoakley on May 2, 2022 at 10:43 pm
      
      How often do you report bugs using Feedback? When was the last time, when did you hear back, and when was the issue satisfactorily addressed?
      Howard.
      
    - 5
      
      EcleX on May 3, 2022 at 8:48 pm
      
      I may report about 30 issues each year, got replies on about 10 (the last one was last week) and maybe five are fixed. Those are approximate figures. What I mean is that sometimes such feedback works.
      
    - 6
      
      hoakley on May 3, 2022 at 10:15 pm
      
      I think you’re the only person that I have ever heard of who has had that experience in Feedback. Everyone else just gets no response from Apple at all.
      Howard.
      
    - 7
      
      EcleX on May 3, 2022 at 10:23 pm
      
      What I have noticed is that in the last months-years they respond less than previously. But they usually do when they need more information.
      
- 8
  
  Oliver Busch on May 3, 2022 at 8:16 pm
  
  I sardonically laugh-cried reading this.
  I have never, ever, ever, received any shred of feedback with bug reports since the change from the already not optimal web-based Radar bug reporting to the utterly dysfunctional and purpose-defying Feedback Assistant app.
  Not an “engineering has decided that this is the intended behavior”, no “this is a duplicate of…”, no “we need more info”. Nothing, nada, nix.
  
  - 9
    
    hoakley on May 3, 2022 at 10:12 pm
    
    Thank you. My experience precisely.
    Howard.
    
10

hstriepe on May 2, 2022 at 3:33 pm

Thanks. Given the legion of engineers they have now, not the tight, small groups of the early Jobsian days, this kind of slack is surprising. But then, Finder display bugs and hard kernel crashes using the UDF file system should have been fixed years ago, too.

- 11
  
  hoakley on May 2, 2022 at 8:42 pm
  
  Thank you. It looks like something got put on the back-burner and forgotten.
  Howard.
  
12

hstriepe on May 2, 2022 at 3:34 pm

And why do I keep getting “Duplicate comment detected; it looks as though you’ve already said that!” logging in with WordPress?

- 13
  
  hoakley on May 2, 2022 at 3:36 pm
  
  I’m sorry, I don’t know. That seems to be a bug. The good thing is that it will probably vanish again in a few days as WordPress constantly changes.
  Howard
  
14

markbot2zero on May 2, 2022 at 5:37 pm

Interesting article in New York Times yesterday, “How Technocrats Triumphed At Apple”:

About Jony Ive’s departure from Apple, adapted from the book: “After Steve: How Apple Became a Trillion-Dollar Company and Lost Its Soul”

The machinations at the top of a trillion dollar tech company are many levels of abstraction removed from something as mundane, say, as maintaining “Activity Monitor.”

I suppose that somewhere there is a team within Apple tasked with doing just that: keeping a humble app like “Activity Monitor” functional and up to date. And I’m guessing there’s a group that develops the CLI tools that most Mac customers will never see. And managers that have to nurture these people and harmonize their work with the bigger picture of Mac and macOS, including the costs of employing them and navigating the engineering tradeoffs, one of which is always return on investment, at the heart of every decision.

I thought this was illuminating:

“Few knew the full extent of Mr. Ive’s battles. Few were aware of his clash with Apple’s finance team. Few understood how draining he found it to fight over marketing the watch, a product that had increased sales over time and become core to the company’s $38 billion wearables business. Yet many could recognize the tediousness of annually updating the company’s iPhones, iPads and Macs.”

For those of us who closely follow, and anticipate, the annual updating of iPhones, iPads and Macs, hearing that the process is “tedious” is a bit discouraging, and may be a clue as to why annoying bugs, quirks and inconsistencies persist. Sometimes I wonder whether aspects of macOS fall into a bureaucratic no man’s land, or are otherwise considered too insignificant — or too tedious — to bother with.

-Mark

- 15
  
  bdmarsh on May 2, 2022 at 6:10 pm
  
  Frequent updates are tedious. Maybe he should have changed his role to focus on just the new designs and redesigns (which occur once every 4-6 years), while leaving the iterative updates to other people.
  
- 16
  
  hoakley on May 2, 2022 at 8:50 pm
  
  Thank you.
  I know quite a few Apple engineers, past and present, and the last word I’d use to describe the annual product cycle is “tedious”! Maybe to Ive it was, but he was in a very special and privileged world.
  Howard.
  
17

bdmarsh on May 2, 2022 at 6:05 pm

Great analysis.
To say it another way in actual power draw if I’m understanding this correctly:
I’d been reading that on the original M1 with 4 E and 4 P cores, the 4 E cores combined only use up to 1.4 watts of power (total chip usage with a command run to limit processes to just the E cores), while using all 8 of the cores uses up to 18 watts. An M1 E core has 1/3 of the processing power of a P core, at approx 1/10th the power usage (according to an Apple statement when launching the M1). So assuming a process running on an E core takes 3x longer to finish, it would still only use around 1/3 of the energy of the same process running on a P core. And as you have shown, this is not at all reflected in Activity Monitor’s Energy Usage. So programmers would need to manually adjust for this power draw difference between E cores and P cores.
(I believe the power draw is potentially a bit higher on the E cores on the M1 Pro, Max and Ultra, but they are also faster than the original M1 4xE cores, because they allow a higher clock speed on those 2 cores)

- 18
  
  hoakley on May 2, 2022 at 8:52 pm
  
  Thank you.
  This analysis is largely reliant on my synthetic testing. Tomorrow I present a deep analysis of a real-world test of compression using AppleArchive. I hope you find that more tangible and illuminating.
  Howard.
  
19

markbot2zero on May 3, 2022 at 7:32 pm

I just hope that the Apple staff who work on un-sexy maintenance of macOS, the non cutting-edge applications known to, and used by, relatively few Mac users, and the documentation thereof — I hope they know that there are people, customers, out in “the field” who very much appreciate, perhaps even depend on, their doing their jobs well.

(Maybe some of them drop by Eclecticlight.co from time to time, where they would find the indefatigable Howard providing them with lots of ideas to improve their products. Maybe?)

I’m grateful that Apple hasn’t, amid its pursuit of high-fashion “wearable tech,” and consumer “services,” assigned Macs to the dreaded, dead end, “legacy hardware” category, that it has instead invested anew in the M-series silicon, at least for the foreseeable future.

Just a guess, but I wonder whether there wasn’t a big marketing/political showdown in the company 5 or so years ago about where to go with what was once their flagship product: the actual working computer. Perhaps there was an “old guard” — of “technocrats”? — in the company whose arguments in favor of a major overhaul of the Mac line won out.

If so, well good for them — and I hope it reaps big dividends.

-Mark

- 20
  
  hoakley on May 3, 2022 at 10:12 pm
  
  Thank you.
  The M1 Macs go back rather further than that, and in some respects to 2000 and earlier.
  Howard.
  
21

Oliver Busch on May 3, 2022 at 8:26 pm

Fascinating article, as always.
Still weird that this remains unfixed for so long. Sometimes I wonder if Apple developers actually use their own applications.

As for powermetrics, I find the asitop “frontend” really helpful.
https://github.com/tlkh/asitop
What do you think about the MX Power Gadget app as a replacement for the Intel Power Gadget?

- 22
  
  hoakley on May 3, 2022 at 10:13 pm
  
  Thank you – that’s helpful. I’m afraid that I rather like the numbers and fine detail.
  Howard.
  
23

Tony on May 5, 2022 at 4:26 pm

Have you also assessed the energy consumption tool in Xcode (available while running/testing within Xcode)? I don’t know what tool underlies that but it is arguably the interface most likely to be used by developers.

On the subject of feedback to Apple, I have had some reaction from them in the past. A problem with Pages caused them to request my files (but there was no further communication and the problem remains). A problem with (Intel) firmware got no reply but was fixed in the next release (could have been coincidence of course).

- 24
  
  hoakley on May 5, 2022 at 5:45 pm
  
  Thank you, but no, I decided not to.
  This is all down to the question of the measurement tool changing the environment which you’re trying to measure. While I’m sure that the tool’s information is useful, using the far lighter-weight powermetrics with a production version of the app lets you see it as nature intended. Xcode is such a heavyweight that it would surely be rather like trying to track a mouse with an elephant.
  Maybe I’ll get time to look at that in the near future, but I’m much happier that what powermetrics is measuring is my app, and that alone.
  Howard.
  
25

Michael Tsai - Blog - How macOS Manages M1 CPU Cores on May 9, 2022 at 7:52 pm

[…] Update (2022-05-09): Howard Oakley: […]

26

Andrew Jaffe on May 23, 2022 at 9:57 pm

Apologies if this is “out of scope”, but I’ve finally got hold of a Mac Studio which I will stress mostly with scientific software run from the command line. Some of it is compiled C/C++ (often using multitasking via OpenMP), some in higher-level languages like Python. What sets the QoS level for code run from within (say) Terminal? (I have some very long-running jobs I want to ensure live entirely on the P processors with the highest possible priority.)

- 27
  
  hoakley on May 23, 2022 at 10:37 pm
  
  You can’t directly set the QoS for command line tools. If you want to use QoS, then you’ll have to launch the tool from within executable code. Someone did write a simple harness tool which could be used to do that.
  However, there are two additional things to note.
  First, there’s no QoS which restricts threads to P cores. There is one (the lowest) which constrains them to E cores, but all the other three allow the threads to be scheduled on either P or E cores.
  Next, what you’re running needs to be written to run in background threads. The number of threads determines which threads get allocated to which cluster. If your code only has 1 thread, then it’ll be run on the first available P core. To make use of other cores, your code will need to create sufficient threads to use them. If there are four or fewer threads, then the code will normally only be run on the first cluster of four P cores, leaving the other cluster idle.
  Howard.
  