Code in ARM Assembly: Conditions without branches

In the previous article I looked at bit operations, which almost completes this survey of instructions involving the general-purpose registers. As promised, I move on here to look in some detail at conditional selection, which lets your code apply conditions without branching.

I’ve already explained how branching can slow processing as a result of its effects on the processor pipeline. While there are many cases where branching is unavoidable, there are others where you can write code which applies conditions but won’t disrupt the pipeline. I’ve not been able to find a good account of this in books on ARM64 assembly language, so I’ll try to correct that here.

Trimming within a range

Swift contains several different ways to check whether an integer is within a specified range, but there’s a more pedestrian approach which also lets you trim the variable to that range:
var theInt: Int = 42
let theMin = 1
let theMax = 10
if (theInt < theMin) {
    theInt = theMin
} else if (theInt > theMax) {
    theInt = theMax
}

If you were to implement that using conditional branching, the direction of each branch would depend on the data being tested, so it would pose problems for branch prediction, and most probably disrupt the pipeline. Iterating through a million such sequences could thus create quite a bottleneck in your code.
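The same trim can also be written in Swift without an explicit if/else, using the standard min and max functions. This is only a sketch: the helper name trimmed is my own invention, it assumes the lower bound doesn't exceed the upper, and whether the compiler turns it into conditional selection rather than branches depends on the compiler and its settings.

```swift
// Hypothetical helper: trims value into the closed range lower...upper.
// Assumes lower <= upper.
func trimmed(_ value: Int, lower: Int, upper: Int) -> Int {
    // max raises values below the range, min lowers those above it
    return min(max(value, lower), upper)
}

var theInt = 42
theInt = trimmed(theInt, lower: 1, upper: 10)
print(theInt)  // prints 10
```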

Conditional selection instructions

Each of these instructions takes as its last operand a condition, which is the same two-letter code as used in branch conditions. In each case, this is tested against the current flags in the global condition flag register, NZCV. You’ll see a reminder of those in today’s cheat sheet below.

When using conditional selection instructions, you therefore normally use two instructions, the first to set NZCV flags, and the second to apply the action depending on those flags. The first can thus be any of the flag-setting instructions, typically ending in -S, such as ADDS, although it’s most commonly a comparison such as CMP.
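To make that two-step pattern concrete, here's a rough Swift model of it. The type and function names (Flags, cmp, Condition, holds) are mine and purely illustrative, and only six of the condition codes are included. It sets the four flags as a CMP on 64-bit registers would, then tests a condition against them.

```swift
// Model of the four NZCV condition flags (names are illustrative only).
struct Flags {
    var n = false  // Negative: the result's top bit is set
    var z = false  // Zero: the result is zero
    var c = false  // Carry: for a subtraction, set when there is no borrow
    var v = false  // oVerflow: signed overflow occurred
}

// Models CMP Xa, Xb, which sets the flags from Xa - Xb and discards the result.
func cmp(_ a: Int64, _ b: Int64) -> Flags {
    let (result, overflow) = a.subtractingReportingOverflow(b)
    return Flags(n: result < 0,
                 z: result == 0,
                 c: UInt64(bitPattern: a) >= UInt64(bitPattern: b),
                 v: overflow)
}

// A few of the two-letter condition codes, each tested against the flags.
enum Condition {
    case eq, ne, lt, ge, gt, le

    func holds(_ f: Flags) -> Bool {
        switch self {
        case .eq: return f.z
        case .ne: return !f.z
        case .lt: return f.n != f.v
        case .ge: return f.n == f.v
        case .gt: return !f.z && f.n == f.v
        case .le: return f.z || f.n != f.v
        }
    }
}

// Step 1 sets the flags; step 2 acts on them, as CMP followed by CSEL would.
let flags = cmp(3, 7)
let selected: Int64 = Condition.lt.holds(flags) ? 100 : 200
print(selected)  // prints 100
```

Because the second step only reads the flags, any flag-setting instruction could stand in for cmp here.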

The full list of these instructions reads:

CSEL Xd, Xa, Xb, condition: if condition is true, writes Xa to Xd; otherwise writes Xb to Xd
CSET Xd, condition: if condition is true, writes 1 to Xd; otherwise writes 0 to Xd
CSETM Xd, condition: if condition is true, sets all bits in Xd to 1; otherwise sets all to 0
CINC Xd, Xa, condition: if condition is true, increments Xa by 1 and writes it to Xd; otherwise writes Xa to Xd
CINV Xd, Xa, condition: if condition is true, writes the bitwise inversion of Xa [NOT(Xa)] to Xd; otherwise writes Xa to Xd
CNEG Xd, Xa, condition: if condition is true, writes the negated value of Xa [NOT(Xa)+1] to Xd; otherwise writes Xa to Xd
CSINC Xd, Xa, Xb, condition: if condition is true, writes Xa to Xd; otherwise writes Xb + 1 to Xd
CSINV Xd, Xa, Xb, condition: if condition is true, writes Xa to Xd; otherwise writes the bitwise inversion of Xb [NOT(Xb)] to Xd
CSNEG Xd, Xa, Xb, condition: if condition is true, writes Xa to Xd; otherwise writes the negated value of Xb [NOT(Xb)+1] to Xd

Each can use X or W registers, but they can’t be mixed in a single instruction.
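If it helps to see those descriptions as code, this Swift sketch models each instruction as a small function on 64-bit values. The functions are my own illustration rather than anything from a library: operands are given in the same order as in the assembly, the return value stands in for Xd, and cond is assumed to be the result of testing the condition against the NZCV flags.

```swift
// Each function models one instruction; the result is what lands in Xd.
// Wrapping operators (&+, &-) mirror the way registers wrap on overflow.
func csel(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : b }
func cset(_ cond: Bool) -> Int64 { cond ? 1 : 0 }
func csetm(_ cond: Bool) -> Int64 { cond ? -1 : 0 }  // -1 has all bits set
func cinc(_ a: Int64, _ cond: Bool) -> Int64 { cond ? a &+ 1 : a }
func cinv(_ a: Int64, _ cond: Bool) -> Int64 { cond ? ~a : a }
func cneg(_ a: Int64, _ cond: Bool) -> Int64 { cond ? 0 &- a : a }
func csinc(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : b &+ 1 }
func csinv(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : ~b }
func csneg(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : 0 &- b }

print(cinc(5, true))       // prints 6
print(csinc(5, 9, false))  // prints 10
```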

There’s an important distinction to be drawn between the plain C- instructions (CINC, CINV, CNEG) and their CS- siblings (CSINC, CSINV, CSNEG) which makes their use prone to error: the plain C- instructions apply changes to Xa, the first operand register, and write that to the destination Xd if the condition is met; the CS- versions apply changes to Xb, the second operand register, and write that to the destination Xd if the condition is not met. They are thus inverses, and care needs to be taken to set the correct operands and condition.
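One way to keep this straight is to remember that, in the Arm instruction set reference, the plain C- forms are described as aliases of the CS- forms with the operand register repeated and the condition inverted, so CINC Xd, Xa, condition behaves like CSINC Xd, Xa, Xa with the opposite condition. The Swift sketch below checks that relationship using my own illustrative models of the six instructions; it isn't a substitute for the reference itself.

```swift
// Illustrative models: the result is what the instruction writes to Xd.
func cinc(_ a: Int64, _ cond: Bool) -> Int64 { cond ? a &+ 1 : a }
func cinv(_ a: Int64, _ cond: Bool) -> Int64 { cond ? ~a : a }
func cneg(_ a: Int64, _ cond: Bool) -> Int64 { cond ? 0 &- a : a }
func csinc(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : b &+ 1 }
func csinv(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : ~b }
func csneg(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : 0 &- b }

let x: Int64 = 41
for cond in [true, false] {
    // Each C- form matches its CS- sibling given the same register twice
    // and the inverted condition.
    precondition(cinc(x, cond) == csinc(x, x, !cond))
    precondition(cinv(x, cond) == csinv(x, x, !cond))
    precondition(cneg(x, cond) == csneg(x, x, !cond))
}
print(cinc(x, true), csinc(x, x, false))  // prints 42 42
```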

A summary chart of these instructions is available as a tear-out PDF: armcondseln1

Examples

The following examples illustrate simple uses of these instructions.

if (X0 == 1) { X1 = X2 } else { X1 = X3 }
can be coded as
CMP X0, #1
CSEL X1, X2, X3, EQ

if (X1 == 2) { X0 = X0 } else { X0 = X0 + 1 }
can be coded as
CMP X1, #2
CSINC X0, X0, X0, EQ

if (X0 != 0) { X1 = X1 + 1 }
can be coded as
CMP X0, #0
CINC X1, X1, NE

if (X0 == 0) { X1 = X1 + 1 } else { X1 = X1 - 1 }
can be coded as
CMP X0, #0
SUB X2, X1, #1
CSINC X1, X2, X1, NE

Finally, trimming within a range might be coded as
// X0 contains the value to be checked
// X1 contains the range minimum
// X2 contains the range maximum
CMP X0, X1
CSEL X0, X1, X0, LT
CMP X0, X2
CSEL X0, X2, X0, GT
and there are many other ways to express this without branching, as I’m sure some of you can suggest.
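As a sanity check on that sequence, here's a Swift sketch which models the two CMP and CSEL pairs and compares the result with the if/else version from the start of this article. The function names are hypothetical, the signed comparisons stand in for CMP followed by the LT and GT conditions, and the two forms are only expected to agree when the minimum doesn't exceed the maximum.

```swift
// Illustrative model of CSEL: the result is what would be written to Xd.
func csel(_ a: Int64, _ b: Int64, _ cond: Bool) -> Int64 { cond ? a : b }

// Models the four-instruction trimming sequence, with x0 standing in for X0.
func trimBySelection(_ value: Int64, _ minimum: Int64, _ maximum: Int64) -> Int64 {
    var x0 = value
    x0 = csel(minimum, x0, x0 < minimum)  // CMP X0, X1 then CSEL X0, X1, X0, LT
    x0 = csel(maximum, x0, x0 > maximum)  // CMP X0, X2 then CSEL X0, X2, X0, GT
    return x0
}

// The if/else form, for comparison.
func trimByBranching(_ value: Int64, _ minimum: Int64, _ maximum: Int64) -> Int64 {
    if value < minimum {
        return minimum
    } else if value > maximum {
        return maximum
    }
    return value
}

// Check values below, inside and above the range 1...10.
for value in Int64(-3)...Int64(14) {
    precondition(trimBySelection(value, 1, 10) == trimByBranching(value, 1, 10))
}
print(trimBySelection(42, 1, 10))  // prints 10
```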

Having covered those, in the next article I’ll start to look at floating point operations.

Previous articles in this series:

1: Building an app to develop assembly routines, including an explanation of calling assembly language from Swift, with a complete Xcode project
2: Registers explained
3: Working with pointers
4: Controlling flow
5: Conditional loops
6: Flow, pipelines and performance
7: Moving data around
8: Integer arithmetic
9: Bit operations

Downloads:

ARM register summary
ARM operand architecture
Conditions and conditional branching instructions
Control Flow
ARM instructions for GP registers
ARM conditional selection
AsmAttic 2, a complete Xcode project (version 2)
AsmAttic, a complete Xcode project (version 1)

References

Procedure Call Standard for the Arm 64-bit Architecture (ARM) from Github
Writing ARM64 Code for Apple Platforms (Apple)
Stephen Smith (2020) Programming with 64-Bit ARM Assembly Language, Apress, ISBN 978 1 4842 5880 4.
Daniel Kusswurm (2020) Modern Arm Assembly Language Programming, Apress, ISBN 978 1 4842 6266 5.
ARM64 Instruction Set Reference (ARM).
