Making sense of M1 memory use

One of the headline features of Apple Silicon Macs is their use of Unified Memory. That simply means that there’s a single system memory which is accessed by the CPU, GPU and other parts of the M1 chip such as its Neural Engine. Previously, we’ve been used to our Macs having two separate memories, main memory accessed by the CPU, and what’s often referred to as VRAM in the graphics card, which is exclusively for the GPU. Some tools, such as iStat Menus and iStatistica Pro, provide usage figures for both types.

The main disadvantage of having separate main and video memory is that most of what ends up on the display has to be moved from main to video memory, and when the GPU is used for non-graphics work such as intensive computation, data has to pass both ways. Unifying memory into a single system resource should make such transfers unnecessary, significantly reducing overhead.

That doesn’t mean that all resources in memory can always be accessed by both CPU and GPU. Metal offers different storage modes, including a private mode in which software designates allocations from main memory for the exclusive use of the GPU. That could be useful for data being processed by the GPU alone, perhaps in a compute, render or blit pass. It’s more common, though, for M1 software to use the shared storage mode, which ensures that both CPU and GPU can access the data.

Where Unified Memory becomes more complex is monitoring its use.

In the traditional discrete memory architecture of main and video memory, we’re generally most interested in main memory use, as requiring more memory than is physically available results in the use of virtual memory, with some being paged out to disk, and a dramatic drop in performance. Video memory doesn’t have that option, though, and excess data would have to be copied back to main memory, which results in an even bigger performance hit. Generally, video memory is managed to make full use of what’s available, to get most benefit from GPU acceleration, but to keep sufficient in reserve so as not to run out of free memory. Don’t be surprised if your 8 GB graphics card is fairly constantly using 7 GB.

As M1 Macs don’t have video memory as such, memory used by the CPU, the GPU or both has to be accounted for in system memory, as shown by vm_stat and Activity Monitor. At present, those tools don’t distinguish between them, so working out what’s being used as video memory isn’t easy. However, iStatistica Pro claims to be able to show what it terms “GPU memory used”.
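To see what vm_stat does and doesn’t report, here is a minimal Python sketch that parses its output and converts page counts to MiB. It assumes the usual layout of a header line giving the page size followed by `label: count.` lines; the function names are my own, and nothing in the output identifies which pages are in use by the GPU.

```python
import re
import subprocess


def parse_vm_stat(text):
    """Return (page_size_bytes, {label: page_count}) from vm_stat output."""
    size_match = re.search(r"page size of (\d+) bytes", text)
    # Falling back to 4096 is an assumption, used only if the header is missing.
    page_size = int(size_match.group(1)) if size_match else 4096
    counts = {}
    for line in text.splitlines():
        # Matches lines such as 'Pages wired down:      20000.'
        m = re.match(r'^"?([^":]+)"?:\s+(\d+)\.?\s*$', line)
        if m:
            counts[m.group(1).strip()] = int(m.group(2))
    return page_size, counts


def pages_to_mib(pages, page_size):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)


def read_vm_stat():
    """Run vm_stat (macOS only) and return its parsed output."""
    out = subprocess.run(["vm_stat"], capture_output=True, text=True, check=True)
    return parse_vm_stat(out.stdout)
```

On a Mac, `page_size, counts = read_vm_stat()` followed by `pages_to_mib(counts["Pages wired down"], page_size)` gives the wired total in MiB, but that is a system-wide figure with no breakdown between CPU and GPU.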

With an M1 Pro driving just its internal display at a low resolution that ‘looks like 1168 x 755’, iStatistica reports around 140 of 379 MB used. At default resolution, those figures rose to 200 of 480 MB. Driving a second display increased them to 260 of 506 MB, and a third display took them to 337 of 698 MB. Although those figures appear plausible, I can’t find any other source for that information, so am unable to check them.
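To put those four readings side by side, this short Python sketch works out the proportion of the reported maximum in use and the change in used memory at each step. The numbers are those quoted above; the labels and function name are only illustrative.

```python
# (label, used MB, maximum MB) as reported by iStatistica in the text above
readings = [
    ("internal display, low resolution", 140, 379),
    ("internal display, default resolution", 200, 480),
    ("two displays", 260, 506),
    ("three displays", 337, 698),
]


def summarise(rows):
    """Return (label, used, maximum, percent used, change in used) per row."""
    out = []
    prev_used = None
    for label, used, maximum in rows:
        percent = 100 * used / maximum
        delta = None if prev_used is None else used - prev_used
        out.append((label, used, maximum, percent, delta))
        prev_used = used
    return out


for label, used, maximum, percent, delta in summarise(readings):
    step = "" if delta is None else f", {delta:+d} MB on previous"
    print(f"{label}: {used} of {maximum} MB ({percent:.1f}%){step}")
```

Each step adds 60 to 77 MB to the used figure, while the proportion of the reported maximum stays between roughly 37% and 51%.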


What is puzzling about the figures given is the maximum, as macOS doesn’t appear to put a limit on the amount of main memory which can be accessed by the GPU. Unfortunately, iStatistica doesn’t give any explanation for those figures, but I wonder if they’re calculated from the resolutions of the available displays rather than any setting in macOS.

Turning to Activity Monitor, we’re none the wiser. The breakdown given at the foot of the Memory view includes:

Memory Used, the total amount of physical (real) memory being used by the system and apps, which should include that being used by the GPU.
App Memory, the total amount of physical memory allocated to system processes and apps, which might or might not include that being used by the GPU.
Wired Memory, physical memory which can’t be either compressed or swapped out to disk, which should include all that being used by the GPU.
Compressed, physical memory which hasn’t been used recently, so has been compressed to save space. That shouldn’t include any being used by the GPU.

But you can’t use any of those to even guesstimate what’s being used by the GPU.

Another useful indicator is the % GPU figure given in the CPU view, as that should be the proportion of work being performed by the GPU which is directly attributable to each process. Inevitably, WindowServer is usually close to the top of the list, particularly when few graphics-intensive apps are running.

If iStatistica’s figures are anywhere near accurate, even when driving two external displays, the amount of memory being used by the GPU appears surprisingly small, and far less than that used by graphics cards in Intel Macs. Perhaps that’s the real secret of Unified Memory.

4 Comments

Bryan Christianson on March 1, 2022 at 9:53 am

The M1 GPU Memory (allocated, used) and Utilisation values are available from the IORegistry in the accelerator “PerformanceStatistics” dictionary.

This dictionary contains keys: “Device Utilization %”, “Alloc system memory”, “In use system memory” which report the corresponding values.

I haven’t looked in depth at how the allocation changes with workload, but I’m sure it is not constant.

My application CPUSetter, https://whatroute.net/cpusetter.html graphs these values.
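As a rough illustration of reading the keys named in this comment, the following Python sketch asks `ioreg` for XML output and pulls the three values out of each “PerformanceStatistics” dictionary. It assumes that the GPU appears under a class deriving from `IOAccelerator`, which is not documented, so it may need changing. The function names are my own, and the entries themselves are undocumented and may change between macOS versions.

```python
import plistlib
import subprocess

# Key names as given in the comment above
KEYS = ("Device Utilization %", "Alloc system memory", "In use system memory")


def gpu_stats_from_plist(data):
    """Extract the selected PerformanceStatistics values from ioreg -a output."""
    if not data.strip():
        return []  # ioreg found no matching registry entries
    entries = plistlib.loads(data)
    if isinstance(entries, dict):
        entries = [entries]
    results = []
    for entry in entries:
        stats = entry.get("PerformanceStatistics")
        if isinstance(stats, dict):
            results.append({key: stats.get(key) for key in KEYS})
    return results


def read_gpu_stats():
    """Run ioreg (macOS only); the IOAccelerator class name is an assumption."""
    out = subprocess.run(
        ["ioreg", "-r", "-d", "1", "-c", "IOAccelerator", "-a"],
        capture_output=True,
        check=True,
    )
    return gpu_stats_from_plist(out.stdout)
```

Calling `read_gpu_stats()` on an M1 Mac should return one dictionary per matching accelerator. The memory values appear to be in bytes, but that too is an assumption.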

  hoakley on March 1, 2022 at 10:19 am
  
  Thank you, Bryan. I’ll look at CPUSetter shortly, and apologise for overlooking it.
  This seems to illustrate the perils of undocumented IORegistry entries, and how to interpret them. iStatistica must assume that “Alloc system memory” is 100%, and fixed by macOS, whereas allocation might not be set in that way. If it were, this would be little different from Intel systems with ‘integrated graphics’, which would appear to just be a cheat using faster memory, and not at all what Apple described at WWDC.
  What stands out to me, though, is how small these allocations and usage are in comparison with video memory in graphics cards, although Unified Memory and the GPUs seem to rival them in performance.
  Howard.
  
William David Schwaderer on March 1, 2022 at 3:25 pm

I am interested in knowing what the memory path access is. Say you have a 64 GB memory space on an M1 Max processor. Is the first 16 GB only accessible on the first path? Or is the first memory word accessible on the first path, the second on the second path, etc. (interleaved)? The processor supposedly has 400 GB/sec access to memory. However, that is only true when all four paths are transferring…

  hoakley on March 1, 2022 at 8:00 pm
  
  I’m sorry, that’s too technical for me to even understand properly!
  Howard.
  