Code in ARM Assembly: Controlling flow

In the previous article in this series, I finished explaining how you can pass arguments from Swift to assembly language by detailing addressing modes and passing by reference. So far, all the assembly code that I have shown has been linear: flow starts at the top, accessing the arguments passed, and works its way down to deliver the result and return to the calling code with a RET instruction. The next step is to consider how that flow can be controlled, to allow the code to branch.

Unlike high-level languages, in which great emphasis is placed on structured code, assembly language has one basic method for this, what is in effect the dreaded GOTO, officially called branching. If you’ve never written code using this more basic system, this can come as a shock, but in practice there’s nothing that you can’t do with branching, a good instruction set, and a little thought.

There are two dangers hidden in branching: spaghetti code, in which branching becomes too complex and interwoven, and the effect of branching on execution speed. Processors such as the ARM64 operate fastest when they’re able to fetch the next instructions in advance, which requires them to predict how execution will follow branches. If the branch prediction logic gets its expectation of flow wrong, that can significantly reduce the speed at which the processor executes code. Minimising the number of branches is important if you want your assembly code to run fast, and using branching which is more easily predictable is a skill worth acquiring. In code which is critical to performance, it’s worth testing different options for branch design to tune your code.

The most fundamental form of branching is the unconditional, or infinite, loop:
loop: MOV X0,#1 // etc.

B loop

which executes the same code for ever, looping back from the branch instruction to the label loop. Obviously this is of no use on its own, but it is used in combination with conditional branching to break out of that loop. The label in this instance is encoded as an offset from the PC register, and has to be within +/– 128 MB of the branch instruction, which should cover anything you or I are likely to use in assembly for a little while to come.
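That limit follows from the way the branch is encoded: the offset is held as a signed count of 4-byte instructions, in 26 bits for B. Here's a rough C sketch of the arithmetic, using a hypothetical helper named branch_in_range which isn't part of any toolchain:

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch only: can a PC-relative branch whose offset field is imm_bits wide
   (a signed count of 4-byte instructions) reach target from the branch
   instruction at pc? Both arguments are byte addresses. */
bool branch_in_range(uint64_t pc, uint64_t target, unsigned imm_bits)
{
    int64_t offset = (int64_t)(target - pc);        /* byte offset from the branch */
    int64_t limit  = (int64_t)1 << (imm_bits + 1);  /* 2^(imm_bits - 1) instructions x 4 bytes */

    if (offset % 4 != 0)
        return false;                               /* instructions are 4-byte aligned */
    return offset >= -limit && offset < limit;
}
```

Passing 26 for imm_bits gives the reach of B, 128 MB in either direction. The same calculation with 19 bits gives the 1 MB reach of conditional branches.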

Conditions which are used to construct conditional loops and other branching are based on condition flags, sometimes referred to as the ALU flags. These are stored in PSTATE, and accessed through the global condition flag register, NZCV (shown in my ARM64 Register Architecture diagram). Its flags are:

  • N (negative) – set (1) when a signed result is negative, otherwise cleared (0).
  • Z (zero) – set (1) when a result is 0, otherwise cleared (0).
  • C (carry) – set (1) when an add-based operation produces a carry out (an unsigned overflow), or when a subtract-based operation doesn’t require a borrow; otherwise cleared (0). In 32-bit ARM, flag-setting shifts also put the last bit shifted out here, but ARM64 shift instructions don’t set flags.
  • V (overflow) – set (1) when an add- or subtract-based operation generates a signed overflow, otherwise cleared (0).

Regular arithmetic instructions don’t affect the NZCV register. Only those designated as flag-setting do so, and the general rule is that their names end with an S: ADDS sets the flags, while the regular ADD doesn’t change them. The other groups of instructions which affect the flags in the NZCV register are CMP and its relatives, and their floating point equivalents FCMP and its relatives. Note that there are no floating point equivalents to general-purpose instructions like ADDS.
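To see how a flag-setting instruction arrives at those four bits, here is a minimal C model of ADDS and SUBS working on 64-bit registers. It's a sketch for illustration, not anything taken from Arm's documentation: the names nzcv_t, flags_adds and flags_subs are hypothetical, and the model reports only the flags, not the result written to the destination register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four condition flags. */
typedef struct { bool n, z, c, v; } nzcv_t;

/* Flags as ADDS would set them for the 64-bit sum a + b. */
nzcv_t flags_adds(uint64_t a, uint64_t b)
{
    uint64_t r = a + b;                       /* wraps on unsigned overflow */
    nzcv_t f;
    f.n = (r >> 63) != 0;                     /* top bit of the result */
    f.z = (r == 0);
    f.c = (r < a);                            /* carry out of bit 63 */
    f.v = (((a ^ r) & (b ^ r)) >> 63) != 0;   /* same-sign operands, different-sign result */
    return f;
}

/* Flags as SUBS would set them for the 64-bit difference a - b. */
nzcv_t flags_subs(uint64_t a, uint64_t b)
{
    uint64_t r = a - b;
    nzcv_t f;
    f.n = (r >> 63) != 0;
    f.z = (r == 0);
    f.c = (a >= b);                           /* set when no borrow was needed */
    f.v = (((a ^ b) & (a ^ r)) >> 63) != 0;   /* signed overflow on subtraction */
    return f;
}
```

In this model, flags_subs(42, 42) sets Z and C, while flags_subs(3, 5) sets N and clears C, because that subtraction needed a borrow.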

Conditional flow control therefore commonly requires two instructions: the first applies the condition which sets one or more NZCV flags, following which there’s a conditional branching instruction of the form
B.{condition} {label}
When condition is true, the flow then branches to label; otherwise execution continues with the next instruction. The full set of condition codes is given in today’s summary chart (below). Although their use may appear odd at first, they have great flexibility. Labels used in conditional branching need to be within +/– 1 MB of the branch instruction, which again you should never find limiting.

For example,
CMP X3, #42
B.EQ next

compares the value in X3 to the number 42. If the two numbers are equal, the Z flag is set, and the code branches to the label next.

That CMP instruction is equivalent to
SUBS XZR, X3, #42
which subtracts 42 from the value in X3, writes the result to the zero register (an idiom which discards the result, as writes to that register are ignored), and sets the NZCV flags accordingly. If the two numbers are equal, then the result will be zero, and the Z flag will be set.
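How a condition code then reads the flags can also be modelled in C. The following sketch is for illustration only: nzcv_t, cond_t and cond_holds are hypothetical names, and the tests follow the standard ARM64 definitions of each condition, in which HS and LO are alternative names for CS and CC.

```c
#include <stdbool.h>

/* Hypothetical container for the four condition flags. */
typedef struct { bool n, z, c, v; } nzcv_t;

/* Condition codes as used in B.{condition}. */
typedef enum { EQ, NE, CS, CC, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE, AL } cond_t;

/* Returns true when a B.cond with this condition would branch. */
bool cond_holds(cond_t cond, nzcv_t f)
{
    switch (cond) {
    case EQ: return f.z;                    /* equal */
    case NE: return !f.z;                   /* not equal */
    case CS: return f.c;                    /* HS: unsigned higher or same */
    case CC: return !f.c;                   /* LO: unsigned lower */
    case MI: return f.n;                    /* negative */
    case PL: return !f.n;                   /* positive or zero */
    case VS: return f.v;                    /* signed overflow */
    case VC: return !f.v;                   /* no signed overflow */
    case HI: return f.c && !f.z;            /* unsigned higher */
    case LS: return !(f.c && !f.z);         /* unsigned lower or same */
    case GE: return f.n == f.v;             /* signed greater than or equal */
    case LT: return f.n != f.v;             /* signed less than */
    case GT: return !f.z && f.n == f.v;     /* signed greater than */
    case LE: return f.z || f.n != f.v;      /* signed less than or equal */
    case AL: return true;                   /* always */
    }
    return true;
}
```

With the flags left by comparing two equal values (Z and C set, N and V clear), EQ, HS, LS, GE and LE all hold, which is why a single CMP can feed several different conditional branches.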

Floating point values can be compared in a similar way, using the FCMP instruction, but care needs to be taken when testing for equality. Rounding errors mean that values which should be equal, or equal to zero, often differ very slightly, so an exact comparison will fail to detect them unless you allow a tolerance. That’s an issue which I’ll return to in a future article about floating point arithmetic.
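One common way to allow such a tolerance, sketched here in C rather than assembly, is to compare the absolute difference against a small threshold. The function name nearly_equal and its tolerance parameter are assumptions made for illustration, and a suitable tolerance depends on the magnitudes involved.

```c
#include <stdbool.h>

/* Sketch only: treats a and b as equal when they differ by no more than
   tolerance. The caller chooses the tolerance; no value is recommended here. */
bool nearly_equal(double a, double b, double tolerance)
{
    double diff = a - b;
    if (diff < 0.0)
        diff = -diff;            /* absolute difference */
    return diff <= tolerance;
}
```

Testing a value against zero is then nearly_equal(x, 0.0, tolerance), in place of an exact comparison.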

Once set, NZCV flags remain unchanged, and can be used or reused later, so long as no instruction sets or resets them in the interim.

One of the simplest Swift statements to express as an idiom using branching is switch, although, as you might expect, nuances such as whether each case needs an explicit break, and whether there’s an implicit fallthrough or default, are yours to determine. One possible implementation which is clear, simple and efficient is:
CMP X5, #10
B.NE case_11
… // code for case X5 = 10
B endcase
case_11: CMP X5, #11
B.NE case_12
… // code for case X5 = 11
B endcase
case_12:
… // code for other cases
endcase:
… // end of switch, continue
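Because C retains goto, the same chain of compare-and-branch can be written out in C to check the logic before committing it to assembly. This is only a sketch: classify is a hypothetical function, and the values it returns stand in for the elided case bodies.

```c
/* Mirrors the branching idiom above, one goto per branch instruction.
   The returned values are placeholders for the real case bodies. */
int classify(long x5)
{
    int result;

    if (x5 != 10) goto case_11;   /* CMP X5, #10 then B.NE case_11 */
    result = 10;                  /* code for case X5 = 10 */
    goto endcase;                 /* B endcase */
case_11:
    if (x5 != 11) goto case_12;   /* CMP X5, #11 then B.NE case_12 */
    result = 11;                  /* code for case X5 = 11 */
    goto endcase;                 /* B endcase */
case_12:
    result = -1;                  /* code for other cases */
endcase:
    return result;                /* end of switch, continue */
}
```

Leaving out one of the goto endcase lines makes that case fall through into the next comparison, which is the assembly equivalent of omitting a break.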

Here’s the summary of conditions and conditional branching instructions:

[Chart: ARMconditionals]

and a tear-out copy: ARMconditionals

In the next article, I’ll develop idioms which can be used in place of Swift’s other control flow statements.

Previous articles in this series:

1: Building an app to develop assembly routines, including an explanation of calling assembly language from Swift, with a complete Xcode project
2: Registers explained, with a chart showing the registers of the ARM64 processor, and passing arguments by value
3: Working with pointers, with a chart showing different types of operand, and passing arguments by reference

Downloads:

ARM register summary
ARM operand architecture
AsmAttic, a complete Xcode project
