The cores in the M1 and the chip itself are thoroughly Apple designs, and work hand-in-glove with macOS using techniques like out-of-order execution and hints to optimise performance.
ARM
Summary and links for the latest information about what’s in the current M1 chip, from differences in caches between cores, to the Matrix Coprocessor and Fabric limitations.
How ARM64 uses its special SIMD registers in lanes, and how they can be loaded with and without de-interleaving.
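As a rough sketch of the difference, in Swift rather than assembly (the sample values and the strided copies are mine; in ARM64 assembly a single LD2 does the splitting in one instruction):

```swift
// Interleaved pairs x0,y0,x1,y1,… such as a stream of coordinates (hypothetical values).
let interleaved: [Float] = [1, 10, 2, 20, 3, 30, 4, 40]

// A plain load fills one register's four lanes in memory order, as LD1 does:
let plain = SIMD4<Float>(interleaved[0...3])                                  // (1, 10, 2, 20)

// A de-interleaving load separates the two streams into different registers, as LD2 does:
let xs = SIMD4<Float>(stride(from: 0, to: 8, by: 2).map { interleaved[$0] })  // (1, 2, 3, 4)
let ys = SIMD4<Float>(stride(from: 1, to: 8, by: 2).map { interleaved[$0] })  // (10, 20, 30, 40)

print(plain, xs, ys)
```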
Three recent WWDC sessions extol Apple’s “extensive reference material”, yet Xcode can’t find anything on these rich and extensive libraries.
More cores are great for running more processes, but how can you make individual operations within a process faster? SIMD is one solution.
Benchmarking 32-bit Float vector dot-product calculations using Swift, NEON assembly, and Apple’s SIMD libraries, on Intel and M1 Macs.
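For a rough idea of the kind of calculation being benchmarked, here is a minimal Swift sketch using the standard library’s SIMD4<Float>; the function name and the assumption that the length is a multiple of four are mine, not the article’s.

```swift
// Dot product of two Float arrays, four lanes at a time (length assumed to be a multiple of 4).
func dotProduct(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count && a.count % 4 == 0)
    var acc = SIMD4<Float>(repeating: 0)
    for i in stride(from: 0, to: a.count, by: 4) {
        let va = SIMD4<Float>(a[i ..< i + 4])
        let vb = SIMD4<Float>(b[i ..< i + 4])
        acc.addProduct(va, vb)      // multiply-accumulate in all four lanes at once
    }
    return acc.sum()                // horizontal reduction of the four partial sums
}

// let result = dotProduct([1, 2, 3, 4], [5, 6, 7, 8])   // 70
```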
Details the options available for rounding floating point numbers, and all the scalar floating point operations. There’s another cheat sheet summary too.
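For instance, Swift’s FloatingPointRoundingRule offers the same set of choices; the ARM64 mnemonics in the comments are the instructions that correspond to each rule.

```swift
let x: Float = 2.5
let away  = x.rounded(.toNearestOrAwayFromZero)   // 3.0  FRINTA: ties away from zero
let even  = x.rounded(.toNearestOrEven)           // 2.0  FRINTN: ties to even, the IEEE 754 default
let down  = x.rounded(.down)                      // 2.0  FRINTM: towards −∞
let up    = x.rounded(.up)                        // 3.0  FRINTP: towards +∞
let trunc = x.rounded(.towardZero)                // 2.0  FRINTZ: truncation
print(away, even, down, up, trunc)
```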
Floating point numbers are very different from integers, but are loaded and stored much the same. Conversion between registers, including to and from integers, is complex.
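A small Swift illustration of the conversions (the values are arbitrary; the mnemonics in the comments are the usual ARM64 conversion instructions, though what the compiler actually emits may differ):

```swift
let f: Float = -3.7
let truncated = Int32(f)                           // -3: Float to Int truncates towards zero (FCVTZS)
let nearest   = Int32(f.rounded(.toNearestOrEven)) // -4: round first if another rule is wanted
let back      = Float(truncated)                   // -3.0: Int to Float (SCVTF), exact for small values
print(truncated, nearest, back)
```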
Where code can make simple selections according to a conditional test, it may be possible to eliminate branching and ensure rapid execution.
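A Swift sketch of the idea: the scalar ternary below is the kind of pattern the compiler can lower to a conditional select (CSEL, or FCSEL for floats) rather than a branch, and the SIMD form selects per lane without any branching; clampLow is an invented name for illustration.

```swift
// Scalar: a simple conditional selection, a candidate for FCSEL rather than a branch.
func clampLow(_ x: Float, to floor: Float) -> Float {
    x < floor ? floor : x
}
let clamped = clampLow(-0.5, to: 0)   // 0.0

// SIMD: replace negative lanes with zero under a mask, with no branching at all.
var v = SIMD4<Float>(-1, 2, -3, 4)
v.replace(with: 0, where: v .< SIMD4<Float>(repeating: 0))
print(clamped, v)                     // 0.0 SIMD4<Float>(0.0, 2.0, 0.0, 4.0)
```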
Many processors like the ARM64 have instructions to perform fused multiply-add operations. Do they deliver reduced error and better performance?
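One quick way to see the reduced error, in Swift: fma() performs the multiply and add with a single rounding (FMADD on ARM64), so it can recover the bits an ordinary product throws away. The 1/3 example is mine, not taken from the article.

```swift
import Foundation

let x = 1.0 / 3.0
let p = x * 3.0               // separate multiply then round: exactly 1.0
let err = fma(x, 3.0, -p)     // fused multiply-add recovers the rounding error the product lost
print(p, err)                 // 1.0 -5.551115123125783e-17
```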