
ARM64 — Local variables


The 3 special registers

From now on we use the names, not the numbers, of these registers: fp (the frame pointer, x29), lr (the link register, x30) and sp (the stack pointer).

We want to get rid of the message string used on the previous pages. Its 17 bytes (16 hex digits plus a line feed) will instead be allocated as a local variable on the stack.

We take lib3.s and make lib4.s like this:

        .text
        .global _phex
        .global _exit

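_phex:  stp fp,lr,[sp, #-16]!   /* push fp and lr */
        mov fp,sp               /* fp = base of this frame */
        sub sp,sp,#0x20         /* 32 bytes for locals */
        add x2,fp,-17           /* x2 = start of the 17-byte string */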

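        movz x3,16              /* 16 hex digits to print */
again:  ror x0,x0,60            /* next digit into bits 0-3 */
        and x1,x0,0xF
        cmp x1,10
        bge isAtoF
        add x1,x1,48            /* 0-9 -> '0'-'9' */
        b  done
isAtoF: add x1,x1,55            /* 10-15 -> 'A'-'F' */
done:   strb w1,[x2]            /* store the character */
        add x2,x2,1

        subs x3,x3,1
        bne again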

        mov w1,#10      /* line feed */
        strb w1,[x2]


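        mov x0,1                /* file descriptor 1: stdout */
        add x1,fp,-17           /* message */
        mov x2,17               /* length */
        mov x8,0x40             /* system call 64: write */
        svc 0
        add sp,sp,#0x20         /* free the locals */
        ldp fp, lr, [sp], #16   /* pop fp and lr */
        ret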

_exit:  mov x0,0                /* exit status 0 */
        mov x8, 93              /* system call 93: exit */
        svc 0

A drawing will help understanding. This is an attempt to show in one picture what happens throughout the subroutine.

First fp and lr are saved on the stack, and sp is decreased by 16 bytes.

Then we save this position in fp: it marks the current frame, and the local variables are addressed relative to it. Next, 0x20 bytes (32 bytes) are allocated on the stack for the local variables. On ARM64, sp must be a multiple of 16 whenever it is used as the base address of a memory access, so allocations are rounded up to a multiple of 16 bytes. In this case we need 17, so 32 is subtracted from sp.
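The rounding can also be left to the assembler. This is only a sketch, assuming GNU as expression syntax; the symbol names LOCALS and FRAME are made up for this illustration and are not part of lib4.s:

        .equ LOCALS, 17                   /* bytes we actually need */
        .equ FRAME, (LOCALS + 15) & -16   /* next multiple of 16: 32 */

        sub sp,sp,#FRAME                  /* same as sub sp,sp,#0x20 */

Adding 15 and then clearing the low four bits rounds any size up to the next multiple of 16, so (17 + 15) & -16 gives 32. If the size of the locals changes later, only the first line needs editing.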

Local variables now sit at negative offsets from fp. In this example x2 gets the address of the string, which starts 17 bytes below fp:

add x2,fp,-17
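Written out as a comment block, the frame looks roughly like this. The offsets are worked out from the three setup instructions in lib4.s; the layout is a reading aid, not something the assembler needs:

        /* Stack after the three setup instructions (higher addresses first)

           fp+8 .. fp+15    saved lr
           fp+0 .. fp+7     saved fp           <- fp points here
           fp-1             line feed
           fp-17 .. fp-2    16 hex digits      <- x2 starts at fp-17
           fp-32 .. fp-18   15 unused bytes    <- sp points at fp-32
        */

The 15 unused bytes are the price of keeping sp a multiple of 16.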

That was setting up the subroutine. Now the hard work can begin: printing a value in hex.

Finishing the subroutine is a bit simpler, but we must do it right.

We started by subtracting 0x20 from sp, so now we must add the same value back. And whatever we pushed on the stack with stp must be popped off the stack with ldp, in reverse order.
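Put next to each other, every teardown instruction undoes one setup instruction, last one first. The fragment below repeats the lines from lib4.s with the body left out; only the comments are new:

        stp fp,lr,[sp, #-16]!   /* 1. push fp and lr */
        mov fp,sp               /* 2. fp = base of the frame */
        sub sp,sp,#0x20         /* 3. allocate the locals */

        /* body of the subroutine */

        add sp,sp,#0x20         /* undo 3: free the locals */
        ldp fp, lr, [sp], #16   /* undo 1 and 2: pop fp and lr */
        ret                     /* return to the address in lr */

A variant that is not used on this page: since fp still holds the value sp had before step 3, mov sp,fp could replace the add and would not need to know the size of the locals.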

The linefeed

One thing is different in the new _phex: we must add the line feed ourselves, which we do here:

        mov w1,#10      /* line feed */
        strb w1,[x2]

Line feed is 10 in ASCII. Since x2 is not used as a pointer after this store, there is no need to increment it.

Now assemble lib4.s, link lib4.o with hello3.o, and it should work just like before, except that we are now using the stack.

Recursion?

You said recursion? Yes, I did. The next page will show how to make a very inefficient, but recursive, version of the Fibonacci function.

Just one more thing — save the registers

Debugging is hard in a program like the Fibonacci function we are going to make. There are tools to help, but the simplest tool is to just print out values along the way.

The _phex function shown above overwrites several registers, which makes it awkward to use for debug printing.

We will save the registers x0 to x3 on the stack, and make sure the Fibonacci function uses no more than those 4 registers.

We do this in two places. At the beginning:

_phex:  stp fp,lr,[sp, #-16]!
        stp x0,x1,[sp, #-16]!
        stp x2,x3,[sp, #-16]!

and at the end:

        ldp x2, x3, [sp], #16
        ldp x0, x1, [sp], #16
        ldp fp, lr, [sp], #16
        ret

Now we can safely call _phex whenever we want.
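Here is one way the pieces might fit together. It is a sketch, not necessarily the final version of the library: it assumes that the mov fp,sp and sub sp,sp,#0x20 lines simply follow the three stp instructions, so that the body, which addresses the string as fp-17, can stay unchanged.

_phex:  stp fp,lr,[sp, #-16]!   /* save frame pointer and return address */
        stp x0,x1,[sp, #-16]!   /* save the registers the body overwrites */
        stp x2,x3,[sp, #-16]!
        mov fp,sp               /* fp points at the saved x2; locals go below it */
        sub sp,sp,#0x20

        /* body unchanged: conversion loop, line feed, write system call */

        add sp,sp,#0x20
        ldp x2, x3, [sp], #16   /* pop in reverse order of the pushes */
        ldp x0, x1, [sp], #16   /* x0 gets its old value back, not the result of write */
        ldp fp, lr, [sp], #16
        ret

A caller can then put bl _phex wherever x0 holds a value worth seeing. Two things the saves do not cover: x8 is still overwritten with the system call number, and bl itself overwrites lr, so the calling function must have saved its own lr first, just as _phex does.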


2025-06-16