Making sense of M1 memory use

One of the headline features of Apple Silicon Macs is their use of Unified Memory. That simply means that there’s a single system memory which is accessed by the CPU, GPU and other parts of the M1 chip such as its Neural Engine. Previously, we’ve been used to our Macs having two separate memories, main memory accessed by the CPU, and what’s often referred to as VRAM in the graphics card, which is exclusively for the GPU. Some tools, such as iStat Menus and iStatistica Pro, provide usage figures for both types.

The main disadvantage of having separate main and video memory is that most of what ends up on the display will have to be moved from main to video memory, and when the GPU is used for non-graphics work such as intensive computation, data has to pass both ways. By unifying memory into a single system resource, such transfers shouldn’t occur, significantly reducing overhead.

That doesn’t mean that all resources in memory can always be accessed by both CPU and GPU. Metal offers different storage modes. Its private mode lets software designate allocations from main memory for the exclusive use of the GPU, which could be useful for data being processed by the GPU alone, perhaps in a compute, render or blit pass. It’s more common, though, for M1 software to use the shared storage mode, which ensures that both CPU and GPU can access the data.

Where Unified Memory becomes more complex is in monitoring its use.

In the traditional discrete memory architecture of main and video memory, we’re generally most interested in main memory use, as requiring more memory than is physically available results in the use of virtual memory, with some being paged out to disk, and a dramatic drop in performance. Video memory doesn’t have that option, though, and excess data would have to be copied back to main memory, which results in an even bigger performance hit. Generally, video memory is managed to make full use of what’s available, to get most benefit from GPU acceleration, but to keep sufficient in reserve so as not to run out of free memory. Don’t be surprised if your 8 GB graphics card is fairly constantly using 7 GB.

As M1 Macs don’t have video memory as such, memory used by the CPU, the GPU or both all has to be accounted for in system memory, as shown by vm_stat and Activity Monitor. At present, those tools can’t distinguish between them, so working out what’s being used as video memory isn’t easy. However, iStatistica Pro claims to be able to show what it terms “GPU memory used”.
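To see what vm_stat does offer, here is a minimal Python sketch that converts its page counts into bytes. The function names are my own, and the sample text follows the layout of vm_stat's output but uses invented page counts rather than measurements. Note that none of the counters it returns is labelled as GPU memory, which is the difficulty described above.

```python
import re
import subprocess

# Illustrative sample only: the layout follows vm_stat's output, but the
# page counts are invented for this sketch, not measured on any Mac.
SAMPLE_VM_STAT = """\
Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pages free:                               50000.
Pages active:                            300000.
Pages inactive:                          280000.
Pages wired down:                        120000.
Pages occupied by compressor:             40000.
"""


def parse_vm_stat(text):
    """Return a dict mapping each 'Pages ...' label to a size in bytes."""
    header = re.search(r"page size of (\d+) bytes", text)
    if header is None:
        raise ValueError("page size not found in vm_stat output")
    page_size = int(header.group(1))

    sizes = {}
    for line in text.splitlines():
        # Only lines that count pages are converted; event counters such
        # as "Translation faults" are deliberately skipped.
        match = re.match(r"^(Pages [^:]+):\s+(\d+)\.", line)
        if match:
            sizes[match.group(1)] = int(match.group(2)) * page_size
    return sizes


def read_vm_stat():
    """Run vm_stat and return its text output (macOS only)."""
    result = subprocess.run(
        ["vm_stat"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # On a Mac, replace SAMPLE_VM_STAT with read_vm_stat().
    for label, size in parse_vm_stat(SAMPLE_VM_STAT).items():
        print(f"{label}: {size / 1024 ** 3:.2f} GiB")
```

On a Mac, passing the result of read_vm_stat() to parse_vm_stat() should give live figures in place of the sample.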

With an M1 Pro driving just its internal display at a low resolution that ‘looks like 1168 x 755’, iStatistica reports around 140 of 379 MB used. At default resolution, that rose to 200 of 480 MB. Driving a second display increased it to 260 of 506 MB, and a third display took it to 337 of 698 MB. Although those figures appear plausible, I can’t find any other source for that information, so am unable to check them.
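To put the iStatistica readings on a common footing, this short Python sketch works out the fraction of the reported maximum in use for each configuration. The numbers are those quoted above; the labels and the function name are my own.

```python
# (description, used MB, maximum MB) as reported by iStatistica Pro above.
READINGS = [
    ("internal display, low resolution", 140, 379),
    ("internal display, default resolution", 200, 480),
    ("internal plus one external display", 260, 506),
    ("internal plus two external displays", 337, 698),
]


def used_fraction(used_mb, max_mb):
    """Fraction of the reported maximum that is in use."""
    return used_mb / max_mb


for label, used, maximum in READINGS:
    share = used_fraction(used, maximum)
    print(f"{label}: {used} of {maximum} MB = {share:.0%}")
```

In each case the fraction in use stays between a little over a third and just over a half of the reported maximum.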

What is puzzling about the figures given is the maximum, as macOS doesn’t appear to put a limit on the amount of main memory which can be accessed by the GPU. Unfortunately, iStatistica doesn’t give any explanation for those figures, but I wonder if they’re calculated from the resolutions of the available displays rather than any setting in macOS.
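For a sense of scale only, this Python sketch estimates the size of a single framebuffer at the low resolution mentioned above. It assumes the display is rendered at twice its ‘looks like’ dimensions and uses 4 bytes per pixel. Both are my assumptions, as is the function name, and it says nothing about how iStatistica arrives at its figures.

```python
def framebuffer_bytes(looks_like_width, looks_like_height,
                      scale=2, bytes_per_pixel=4):
    """Estimate the bytes needed for one framebuffer.

    Assumes rendering at `scale` times the 'looks like' size and a fixed
    number of bytes per pixel; both defaults are assumptions.
    """
    pixels = (looks_like_width * scale) * (looks_like_height * scale)
    return pixels * bytes_per_pixel


size = framebuffer_bytes(1168, 755)
print(f"One framebuffer: {size / 1_000_000:.1f} MB")
```

Under those assumptions a single buffer comes to about 14 MB, so display resolution could only account for the reported maximum if many such buffers, or other resources such as textures and window backing stores, were being counted.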

Turning to Activity Monitor, we’re none the wiser. The breakdown given at the foot of the Memory view includes:

  • Memory Used, the total amount of physical (real) memory being used by the system and apps, which should include that being used by the GPU.
  • App Memory, the total amount of physical memory allocated to system processes and apps, which might or might not include that being used by the GPU.
  • Wired Memory, physical memory which can’t be either compressed or swapped out to disk, which should include all that being used by the GPU.
  • Compressed, physical memory which hasn’t been used recently, so has been compressed to save space. That shouldn’t include any being used by the GPU.

But you can’t use any of those to even guesstimate what’s being used by the GPU.

A more useful indicator of GPU activity, although not of its memory use, is the % GPU figure given in the CPU view, as that should be the proportion of work being performed by the GPU which is directly attributable to each process. Inevitably, WindowServer is usually close to the top of the list, particularly when few graphics-intensive apps are running.

If iStatistica’s figures are anywhere near accurate, even when driving two external displays, the amount of memory being used by the GPU appears surprisingly small, and far less than that used by graphics cards in Intel Macs. Perhaps that’s the real secret of Unified Memory.