In the previous article, I explained the basic architecture of the registers of the ARM64 processor, and how an assembly language routine can use them to access and return values. I ended with a cliffhanger, promising to explain how arguments can be accessed from pointers passed to a routine.
There are two basic instructions for transferring data between memory and registers: LDR, which loads into a register, and STR, which stores into memory. Their simplest use is:
LDR X0, [reg]
loads what’s at the address in reg, such as register X1, into register X0
STR X0, [reg]
stores what’s in register X0 to the address in reg, such as register X1
Note how the direction of movement differs in the two: LDR treats its first operand as the destination, and loads into it, whereas STR treats its first operand as the source, and stores what’s in it. Most instructions use the first operand to specify the destination of the result; STR is an important exception.
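If you think more readily in C than in assembly, that difference in direction can be sketched with a pair of functions. This is only a model: the names model_ldr and model_str are my own inventions for illustration, and the pointer argument stands in for the address held in X1.

```c
#include <stdint.h>

/* Model of LDR X0, [X1]: the value in memory flows INTO the register,
   so the first operand (X0, here the return value) is the destination. */
uint64_t model_ldr(const uint64_t *x1) {
    return *x1;
}

/* Model of STR X0, [X1]: the value in the register flows OUT to memory,
   so the first operand (X0) is the source. */
void model_str(uint64_t x0, uint64_t *x1) {
    *x1 = x0;
}
```

In both, the second operand supplies the address; all that changes is which way the data travels.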
The reg item in those instructions can use any of five different addressing modes:
[X1] – base register, which can include SP to use the stack pointer
[X1,offset] – base register with an offset, in which the effective address is the sum of the address in the base register and the offset
[X1,offset]! – pre-indexed, which works like an offset, but before use the address in the register has the offset added to it; this is commonly used for loading or storing values in an array
[X1],offset – post-indexed, which works like an offset, but as a result of this instruction the address in the register is updated by adding the offset at the end of execution; this is commonly used for popping values from the stack
label – PC relative, which is used for accessing values defined in the code by a label.
To understand these, it’s useful to work through some examples. At the start of each of these, the register X1 contains the address 00000100.
LDR X0, [X1]
loads the 64-bit integer at the address 00000100 into register X0
LDR X0, [X1,8]
loads the 64-bit integer at the address 00000108 (00000100 + 8) into register X0
LDR X0, [X1,8]!
loads the 64-bit integer at the address 00000108 (00000100 + 8) into register X0, and leaves the address 00000108 in register X1
LDR X0, [X1],8
loads the 64-bit integer at the address 00000100 into register X0, and leaves the address 00000108 (00000100 + 8) in register X1.
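Those four examples map quite closely onto pointer arithmetic in C, which may help if the square brackets and exclamation mark still look strange. The sketch below is my own model rather than anything the assembler produces: the function names are hypothetical, a pointer to 64-bit integers stands in for X1, and adding 1 to that pointer advances it by 8 bytes.

```c
#include <stdint.h>

/* LDR X0, [X1]     base register: load from X1, which is left unchanged */
uint64_t ldr_base(const uint64_t *x1) {
    return *x1;
}

/* LDR X0, [X1,8]   offset: load from X1 + 8 bytes, X1 is left unchanged */
uint64_t ldr_offset(const uint64_t *x1) {
    return *(x1 + 1);
}

/* LDR X0, [X1,8]!  pre-indexed: add 8 to X1 first, then load from it */
uint64_t ldr_pre(uint64_t **x1) {
    *x1 += 1;
    return **x1;
}

/* LDR X0, [X1],8   post-indexed: load from X1, then add 8 to it */
uint64_t ldr_post(uint64_t **x1) {
    uint64_t value = **x1;
    *x1 += 1;
    return value;
}
```

The last two take the address of the pointer because they have to change it, just as the pre-indexed and post-indexed instructions write a new address back to X1.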
The final complication to specifying addresses is variation in the way of specifying the offset, here shown as an offset to the register X1:
[X1,10] – the offset is a fixed integer, here 10
[X1,X2] – the offset is the integer contained in register X2
[X1,X2, LSL 2] – the offset is the integer contained in register X2 shifted left by 2 places
[X1,W2, UXTW 2] – the offset is the integer contained in register W2 (32-bit), shifted left by 2 places.
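Of those, the first two are the most common. In the last two, the shift must either be 0 or match the size of the data being transferred: 2 places is valid when loading a 32-bit register such as W0, while a 64-bit load into X0 requires 3.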
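To check your understanding of those four forms, here is a sketch in C of how each effective address is calculated. The function names are hypothetical, and the registers are modelled as plain integers passed as arguments; the shift of 2 places follows the examples above.

```c
#include <stdint.h>

/* [X1,10]          fixed integer offset */
uint64_t ea_immediate(uint64_t x1) {
    return x1 + 10;
}

/* [X1,X2]          offset taken from a 64-bit register */
uint64_t ea_register(uint64_t x1, uint64_t x2) {
    return x1 + x2;
}

/* [X1,X2, LSL 2]   register offset shifted left by 2 places (times 4) */
uint64_t ea_shifted(uint64_t x1, uint64_t x2) {
    return x1 + (x2 << 2);
}

/* [X1,W2, UXTW 2]  32-bit register zero-extended to 64 bits,
                    then shifted left by 2 places */
uint64_t ea_extended(uint64_t x1, uint32_t w2) {
    return x1 + ((uint64_t)w2 << 2);
}
```

The cast in the last function matters: the extension to 64 bits happens before the shift, so the top bits of W2 aren't lost.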
These are summarised in this diagram, which also covers other operand types and return values, which I’ve already examined.
Here’s a tear-out PDF: armoperands
Now try out some of these load and store instructions with different addressing modes. This is simple using AsmAttic: first, you need to change the asmmath.h header file to provide the new call, in my case
extern double testadd(double, double*, double*);
which takes one double as a value and two as pointers, and returns a double value. To call that in Swift, use code like
let myA = theA.doubleValue
var myB = theB.doubleValue
let theTemp = theC.doubleValue
var myC = [theTemp, (theTemp + 1.0), (theTemp + 2.0)]
let myD = testadd(myA, &myB, &myC)
This first sets up the three arguments to contain an immutable Double value, a pointer to a Double, and a pointer to a three-element array of Doubles, before calling that function to return a Double result. You then display the results using
self.outputText.string = "Result = \(myD) a = \(myA) b = \(myB) c = \(myC)\n"
My assembly code then reads:
.global _testadd
.align 4
_testadd:
STR LR, [SP, #-16]!
LDR D5, MULT_TWO
// – that’s a labelled value using PC-relative access
FMUL D6, D0, D5
// – the first argument, a Double, is accessed from the D0 register
LDR D7, [X0]
// – that uses base register access. Note that the address of a Double is passed not in a floating point register, but in a general-purpose register.
FMUL D7, D7, D5
STR D7, [X0]
LDR D5, MULT_THREE
LDR D4, [X1]
FMUL D7, D4, D5
STR D7, [X1]
LDR D4, [X1,8]!
// – that uses pre-indexing to increment the address in X1 by 8, so accessing the next Double in the array.
FMUL D7, D4, D5
STR D7, [X1]
LDR D4, [X1,8]!
FMUL D7, D4, D5
STR D7, [X1]
FMOV D0, D6
// – the result is returned as a Double value in the D0 register.
LDR LR, [SP], #16
RET
MULT_TWO: .double 2.010203
MULT_THREE: .double 3.020304
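If you want to check the numbers that come back, it can help to have the same calculation written out in C. This is my own sketch of what the routine above does, not part of AsmAttic: the name testadd_model is hypothetical, and it assumes the second pointer refers to at least three Doubles, as in the Swift call.

```c
/* The same two constants as the labelled values in the assembly. */
static const double MULT_TWO = 2.010203;
static const double MULT_THREE = 3.020304;

/* Hypothetical C model of _testadd, for checking results only. */
double testadd_model(double a, double *b, double *c) {
    double result = a * MULT_TWO;  /* FMUL D6, D0, D5 */
    *b = *b * MULT_TWO;            /* load from [X0], multiply, store to [X0] */
    c[0] = c[0] * MULT_THREE;      /* load from [X1], multiply, store to [X1] */
    c[1] = c[1] * MULT_THREE;      /* [X1,8]! steps X1 on to the next Double */
    c[2] = c[2] * MULT_THREE;      /* and again for the third element */
    return result;                 /* FMOV D0, D6 */
}
```

The value passed in the first argument is left alone, while the Double and the array passed by pointer are both changed in place, which is what you should see in the output from the Swift code.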
Now we’re in a position to access arguments passed to the assembly routine, and to return results both as values and through those arguments which were passed by reference. The next step is to control program flow, which I’ll start in the next article in this series.
Previous articles in this series:
1: Building an app to develop assembly routines, including an explanation of calling assembly language from Swift, with a complete Xcode project
2: Registers explained, with a chart showing the registers of the ARM64 processor, and passing arguments by value
Downloads:
ARM register summary
AsmAttic, a complete Xcode project
References
Procedure Call Standard for the Arm 64-bit Architecture (ARM) from GitHub
Writing ARM64 Code for Apple Platforms (Apple)
Stephen Smith (2020) Programming with 64-Bit ARM Assembly Language, Apress, ISBN 978 1 4842 5880 4.
Daniel Kusswurm (2020) Modern Arm Assembly Language Programming, Apress, ISBN 978 1 4842 6266 5.
ARM64 Instruction Set Reference (ARM).