How Rosetta complicates call chains on M1 Macs

Rosetta 2 is one of the most important features of Big Sur on Apple Silicon Macs, but it can also make life considerably more complicated, particularly in command shells. This article examines some of those complications, and how best to negotiate them.

Two of Rosetta’s features are the basis for this complex behaviour. The first is its preference for running ARM-native code whenever it’s available, and the other is that it won’t mix ARM-native and Intel code in the same process. These are explained in Apple’s account of how call chains work with Rosetta 2:
“The system prefers to execute an app’s arm64 instructions on Apple silicon. If a binary includes both arm64 and x86_64 instructions, the user can tell the system to launch the app using Rosetta translation from the app’s Get Info window in the Finder. For example, a user might enable Rosetta translation to allow the app to run older plug-ins that don’t yet support the arm64 architecture.”

And this important warning: “The system prevents you from mixing arm64 code and x86_64 code in the same process. Rosetta translation applies to an entire process, including all code modules that the process loads dynamically.”

As I wrote in this article:
“In other words, when running code on Apple Silicon, macOS will keep to the same architecture through each calling chain. As some have already reported on Twitter, if you run a terminal app which is Intel-only, each shell that it launches also runs under Rosetta 2 and runs the Intel version of each tool which it calls, even though those tools may be Universal and have ARM-native code available too. When those tools haven’t already been translated by Rosetta 2, translation must occur before they can be run, causing a delay, even though they could have run their ARM-native code.”

Open Terminal and run tools in zsh: when the shell encounters a Universal binary, it runs the arm64 code in that binary by default. When only an x86_64 executable is available, that is translated by Rosetta 2 and run without complaint. Should you make the mistake of trying to run an unsigned Universal binary from an arm64 shell, that will be refused under ARM-native security rules, with an error message that the process was killed.
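To see in advance which code a tool offers, one option is to inspect it with the file command. The following is only a minimal sketch: describe_binary is a hypothetical helper name, and it assumes that file reports Universal binaries using the phrase "Mach-O universal binary" and names the architecture of single-architecture executables.

```shell
#!/bin/sh
# Hypothetical helper: classify a description string as printed by file(1).
# Assumption: file reports Universal binaries as "Mach-O universal binary".
describe_binary() {
    case "$1" in
        *"Mach-O universal binary"*) echo "universal" ;;
        *"Mach-O"*arm64*)            echo "arm64 only" ;;
        *"Mach-O"*x86_64*)           echo "x86_64 only" ;;
        *)                           echo "not a Mach-O binary" ;;
    esac
}

# Usage: describe the zsh binary, if file(1) and /bin/zsh are both present.
if command -v file >/dev/null 2>&1 && [ -e /bin/zsh ]; then
    describe_binary "$(file /bin/zsh)"
fi
```

A tool reported as x86_64 only will always need Rosetta 2, whichever shell calls it.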

There are two ways to force tools to be run in Rosetta translation: set the Rosetta option in Terminal’s Get Info window to run the whole shell in translation, or use the shell command arch. This has simple syntax:
arch -x86_64|-arm64 commandname
So
arch -x86_64 zsh
will run the zsh shell in translation. This will by default mean that Universal binaries will be run in translation, and won’t be killed in the event that they’re unsigned.

Command shells, including those run in Terminal, are still happy to run different architectures, subject to the overriding rule that a shell running ARM-native will always prefer to run the arm64 code in each Universal binary, unless it’s otherwise instructed using arch.
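One way to watch the call chain keep to a single architecture is to ask a child shell what it sees. This is a sketch rather than a definitive test: child_arch is a hypothetical helper name, and it reports "unavailable" when arch does not accept the flag or cannot launch zsh, for instance when the script is not running on macOS or Rosetta 2 is not installed.

```shell
#!/bin/sh
# Hypothetical helper: report the architecture seen by a child zsh that is
# launched through arch(1) with the given flag (-arm64 or -x86_64).
child_arch() {
    if [ "$(uname -s)" != "Darwin" ]; then
        echo "unavailable"   # these arch flags are specific to macOS
        return 0
    fi
    # The child shell and the uname it calls share the requested architecture.
    arch "$1" zsh -c 'uname -m' 2>/dev/null || echo "unavailable"
}

echo "this shell:       $(uname -m)"
echo "via arch -arm64:  $(child_arch -arm64)"
echo "via arch -x86_64: $(child_arch -x86_64)"
```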

Scripts can also detect the current architecture of the shell.
uname -m
returns arm64 when running native, or x86_64 when in translation. Another way for a process to detect whether it’s running in translation is to call
sysctl -in sysctl.proc_translated
which returns 1 if in translation, or 0 if it’s not. Details of how to use this in scripts are given in Sasmito Adibowo’s useful article.
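The two checks above can be combined in a script along the following lines. This is a minimal sketch, not taken from the article mentioned: classify_mode is a hypothetical helper name, and it assumes that the sysctl prints nothing on systems where sysctl.proc_translated doesn’t exist, which the sketch reports as unknown.

```shell
#!/bin/sh
# Hypothetical helper: turn the two values described above into one label.
#   $1 = output of uname -m
#   $2 = output of sysctl -in sysctl.proc_translated (may be empty)
classify_mode() {
    case "$2" in
        1) echo "translated" ;;   # running under Rosetta 2
        0) echo "native" ;;       # Apple Silicon, not translated
        *) echo "unknown" ;;      # assumption: no such sysctl on this system
    esac
}

machine="$(uname -m)"
# Errors are discarded so the script still runs where sysctl is absent.
translated="$(sysctl -in sysctl.proc_translated 2>/dev/null || true)"
echo "machine=$machine mode=$(classify_mode "$machine" "$translated")"
```

Keeping the decision in a function that takes both values as arguments makes it easy to check each case without needing an M1 Mac to hand.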

There are some important differences between the two architectures which can catch you out. One is that Mach Absolute Time has a very different timebase. Instead of each incremental tick occurring at an interval of 1 nanosecond, as on Intel hardware, M1 ticks occur every 41.67 nanoseconds. If you have any code which accesses Mach Absolute Time, it needs to be corrected by that timebase factor before its values are treated as nanoseconds. Similarly, values for Mach Absolute Time will differ between the architectures.

To improve compatibility, when running in translation, ticks and the timebase are automatically expressed as if on Intel hardware. Mixing code running in translation and that running native could thus result in some exceedingly large misunderstandings.
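The correction is simple arithmetic, sketched here in the shell. ticks_to_ns is a hypothetical helper name, and the timebase fractions are assumptions: 1/1 for Intel hardware and translated code, and 125/3 for native M1 code, chosen because it matches the 41.67 nanoseconds per tick given above.

```shell
#!/bin/sh
# Hypothetical helper: convert Mach Absolute Time ticks to nanoseconds.
#   $1 = ticks, $2 = timebase numerator, $3 = timebase denominator
ticks_to_ns() {
    # Multiply before dividing so that integer arithmetic keeps its precision.
    echo $(( $1 * $2 / $3 ))
}

# Assumed timebases: 125/3 for native M1 code, 1/1 for Intel or translated code.
ticks_to_ns 24000000 125 3     # one second of native M1 ticks: 1000000000
ticks_to_ns 1000000000 1 1     # one second of Intel-style ticks: 1000000000
```

Treating a native tick count as if it were already in nanoseconds would understate the elapsed time by that factor of about 41.67.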

There are some other oddities too. system_profiler running in translation returns results which are largely compatible with those on Intel systems, which are different from those given when running native on the M1. Strangely, this even results in different Provisioning UDID numbers, as shown in the following two screenshots taken on the same M1 Mac mini running in translation and natively.

[Screenshots sysprof01 and sysprof02: system_profiler output on the same M1 Mac mini, in translation and native]

I have written more about system_profiler here.

Finally, running code in translation can have some surprises in terms of performance. If Rosetta 2 has to translate the code before it can be run, that can result in a detectable delay. Calling the same subsystem in macOS can also show large discrepancies. My example here is the new framework in Big Sur supporting AppleArchive compression and decompression. There seems little difference in the performance of decompression between the two architectures on the same M1 system, but a test compression run in translation is about 2.4 times slower than when running native: compressing a suite of test files took 15.8 seconds in translation, but only 6.5 seconds in ARM-native code.