Tuning Apple Silicon will be complex at first

Having used and developed for Macs during their two previous transitions between processor architectures, I well remember how complex they can become. Apple provides powerful tools to ease the transition, and for the great majority of users, it all goes seamlessly. With Rosetta 2 to take us across to Apple Silicon, I’m confident this will go well over the coming couple of years, but there will also be some surprises in store. This short article looks at one of them: performance in toolchains.

Apple already hints at this complexity in this helpful developer article. It’s all down to the detail of how Rosetta 2 works.

When an Apple Silicon Mac runs Intel code, Rosetta 2 first has to translate it into ARM-native instructions. Whenever possible, translation is performed beforehand, normally when you install an Intel-only app or executable code. When that’s not possible, which should ideally only be when the code is compiled ‘just in time’ (on the fly), translation has to occur at runtime. That takes time before the translated code can be run, and Apple warns that “users might perceive that translated apps launch or run more slowly at times”.
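If you want to check whether a given process is being translated, macOS exposes a sysctl value named sysctl.proc_translated, which reads 1 under Rosetta 2 and 0 when running native, and which doesn't exist at all on systems that know nothing of Rosetta. What follows is only a minimal sketch in Python: the function names are my own, and it asks the sysctl command tool rather than calling the system directly, so it assumes that the child process runs with the same architecture as the script that launched it.

```python
import subprocess


def parse_proc_translated(output):
    """Interpret the text printed by `sysctl -n sysctl.proc_translated`."""
    return output.strip() == "1"


def running_under_rosetta():
    """Best-effort check for Rosetta 2 translation (hypothetical helper).

    Returns False when the sysctl tool or the key is missing, as on
    Intel Macs and other platforms.
    """
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # No sysctl tool available, so this can't be a translated process.
        return False
    if result.returncode != 0:
        # The key is unknown here: treat the process as native.
        return False
    return parse_proc_translated(result.stdout)


if __name__ == "__main__":
    print("Translated by Rosetta 2:", running_under_rosetta())
```

Treating a missing key as native keeps the same script usable on Intel Macs, where there's nothing to translate.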

Naturally, macOS running on Apple Silicon prefers to run ARM-native code, so when you launch a Universal App it will default to running its ARM code. What happens when that comes across a code module which isn’t available in ARM-native code and needs to be translated by Rosetta 2? Apple hints at the explanation when it suggests that a user might override the default and force a Universal App to launch using Rosetta translation (by opting to run the Intel version of the app) “to allow the app to run older plug-ins that don’t yet support the arm64 architecture”.
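Whether an app or tool can run native depends on which architectures its executable contains. The lipo and file command tools will report that, but as an illustration here's a rough Python sketch which reads the start of a Mach-O file itself. It recognises only the classic Universal (fat) header and 64-bit single-architecture files, names only the two architectures relevant here, and the function name is my own invention.

```python
import struct

FAT_MAGIC = 0xCAFEBABE    # Universal (fat) header, always stored big-endian
MH_MAGIC_64 = 0xFEEDFACF  # 64-bit single-architecture Mach-O
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}
FAT_ARCH_SIZE = 20        # five 32-bit fields in each fat_arch entry


def architectures(data):
    """List the architectures named in the first bytes of a Mach-O file.

    Returns an empty list for anything unrecognised. Java class files share
    the fat magic number, so only use this on files known to be Mach-O.
    """
    if len(data) < 8:
        return []
    magic, count = struct.unpack_from(">II", data, 0)
    if magic == FAT_MAGIC:
        names = []
        for index in range(count):
            offset = 8 + index * FAT_ARCH_SIZE
            if offset + FAT_ARCH_SIZE > len(data):
                break  # header is truncated, so stop reading entries
            (cputype,) = struct.unpack_from(">I", data, offset)
            names.append(CPU_NAMES.get(cputype, hex(cputype)))
        return names
    # Thin binaries for Intel and ARM Macs are little-endian.
    magic_le, cputype = struct.unpack_from("<II", data, 0)
    if magic_le == MH_MAGIC_64:
        return [CPU_NAMES.get(cputype, hex(cputype))]
    return []
```

To try it on a real file, pass in its first few kilobytes, for example architectures(open(path, "rb").read(4096)), where path points at the executable inside an app bundle or at a command tool.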

In other words, when running code on Apple Silicon, macOS will keep to the same architecture through each calling chain. As some have already reported on Twitter, if you run a terminal app which is Intel-only, each shell that it launches also runs under Rosetta 2 and runs the Intel version of each tool which it calls, even though those tools may be Universal and have ARM-native code available too. When those tools haven’t already been translated by Rosetta 2, translation must occur before they can be run, causing a delay, even though they could have run their ARM-native code.

Call the same Universal tools from a Universal terminal app, and provided each has ARM-executable code, the whole chain will run in native code instead, resulting in greatly improved performance.
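When you can't change the app at the head of the chain, the arch command tool offers a way to request a particular architecture for a single command, using its -arm64 and -x86_64 options. The Python sketch below merely builds such a command line; the helper names are mine, and it assumes you're on an Apple Silicon Mac and that the tool being called is Universal, as arch will fail if the requested architecture isn't present.

```python
import platform

SUPPORTED = ("arm64", "x86_64")


def with_architecture(command, architecture):
    """Prefix a command with `arch` to request one architecture.

    Hypothetical helper: it only builds the argument list and runs nothing.
    """
    if architecture not in SUPPORTED:
        raise ValueError("unsupported architecture: " + architecture)
    return ["arch", "-" + architecture] + list(command)


def process_architecture():
    """Architecture this Python process reports for itself.

    On an Apple Silicon Mac I'd expect "x86_64" here when Python is itself
    being translated by Rosetta 2, and "arm64" when it is running native.
    """
    return platform.machine()


if __name__ == "__main__":
    print("This process reports:", process_architecture())
    print("Native command line:", with_architecture(["uname", "-m"], "arm64"))
```

Passing the resulting list to subprocess.run() from an Intel-only parent should then start that one tool in its ARM-native code, rather than leaving it to follow the architecture of its caller.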

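There are other implications here. If you need to run the Intel version of an app in order for it to load Intel-only plug-ins, does that mean an ARM-native Terminal can’t run Intel-only command tools? Surely not. Apple’s restriction is on mixing different code in the same process, and it warns that “Rosetta translation applies to an entire process, including all code modules that the process loads dynamically.” A plug-in is loaded into its host app’s process, whereas a command tool launched from a shell runs as a process of its own, so can be translated independently of the Terminal which started it.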

Performance and other issues like these will diminish during the transition, as more Universal Apps and binaries ship. In the early days of Apple Silicon Macs, though, we’ll need to be watchful.