Assembly Programming
C Programming for Microcontrollers
Module 5: Part 1 (M5.1)
ECE 331
Prof. Nihar Mahapatra
(adapted from Prof. A. Mason’s lecture notes; other sources listed at the end)
Module Learning Objectives
o Describe C’s data types and their meaning in hardware
o Explain structured programming techniques and flow control constructs
Background: Abstracting from Machine Language
Machine code deals directly with hardware operations
o Direct correspondence:
assembly instruction ↔ machine opcode ↔ electrical signals in the chip
o Labs 1-3 explored how this works for arithmetic and logic instructions
o Load and Store instructions manipulate the address and data lines to the
memory
Problems with assembly programming
o Every processor architecture is different
- No portability
- Requires careful study of the CPU’s design
o Time consuming and error-prone
- The programmer must keep track of everything: code vs. data, numbers vs. words, which register holds what data at each point in time
M5: Assembly Programming
Background: Abstracting from Machine Language
Higher level programming languages offer:
o Improved abstraction: hardware-agnosticism, easier to split a problem into
subtasks
o Improved readability: easier to express algorithms and use descriptive names
o Improved safety: language features prevent using data incorrectly by mistake
o Improved portability: write code on one computer architecture, deploy on many
Advantages of assembly programming
o Use of processor-specific features – operating systems, specialized applications
o Highest possible performance and smallest possible memory footprint
- However, modern processors are very complicated!
- Handwritten assembly is not a guarantee of higher performance
C strikes a nice balance of expressive power and direct correspondence
with common machine instructions.
C Data Types
A type lets us specify how a piece of data should be interpreted.
Every piece of memory that can be accessed in a C program is attached to
a type.
Every type has an associated size, always given in bytes.
o It is possible to refer to the size of a given type by name within your program
using the builtin sizeof operator
o For example, on the KL25Z, sizeof(char) is 1 byte and sizeof(int) is 4.
- These values (especially sizeof(int)) may be different on other processors.
- For precision we use the types int8_t, int16_t and int32_t instead (which have sizes 1, 2, and 4 respectively). These types are defined in the file stdint.h (part of the C standard library) and guaranteed to have the specified size.
Size is an example of how types encode properties that change the
meaning of a piece of data
o The same line of C code can generate different machine code depending on the
types involved
- Example: Overflow behavior in 8-bit versus 16-bit integers
- Signed vs. unsigned arithmetic, corresponding to signed types (the default for int types) and unsigned types such as unsigned char and uint32_t
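The effect of size and signedness on the meaning of data can be sketched in C. This is a minimal illustration rather than course material; the function names are hypothetical. It shows the same "add 1" wrapping at different points for 8-bit and 16-bit types, and the same 8-bit pattern read as unsigned and as signed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Add 1 to a value stored in 8 bits; the result wraps modulo 2^8. */
uint8_t increment_u8(uint8_t v)
{
    return (uint8_t)(v + 1u);
}

/* Add 1 to a value stored in 16 bits; the result wraps modulo 2^16. */
uint16_t increment_u16(uint16_t v)
{
    return (uint16_t)(v + 1u);
}

/* Read an 8-bit pattern as an unsigned number. */
int pattern_as_unsigned(uint8_t bits)
{
    return bits;
}

/* Read the same 8-bit pattern as a signed (two's complement) number. */
int pattern_as_signed(uint8_t bits)
{
    int8_t s;
    assert(sizeof s == sizeof bits);
    memcpy(&s, &bits, sizeof s);   /* copy the raw byte, reinterpret as int8_t */
    return s;
}
```

For example, increment_u8(255) gives 0 while increment_u16(255) gives 256, and the pattern 0xFF reads as 255 unsigned but as -1 signed.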
C Function Calls
For functions, declarations (which specify the function’s type) may be
separate from definitions (which specify the function’s behavior)
o A definition includes a function body, which is wrapped in curly braces {}
o Inside a function body is the only legal place for executable statements (loops,
assignments other than initializations, etc.)
o The machine code corresponding to one function may be in a completely
different memory location than the next
o Unless the return type is void, the function body must produce an output value
using the return statement, e.g., return 5; would always produce 5 as the
output value.
o A function body may also contain calls to other functions, which cause the
program flow to jump to the specified function:
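int add_two_numbers(int a, int b); /* declaration; the definition lives elsewhere */
int main (void)
{
int x;
x = add_two_numbers(3, 4); /* program flow jumps to add_two_numbers */
return x;
}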
C Function Calls
The arguments to a function are treated as local variables inside the function body:
These arguments may use the same names as variables defined elsewhere in the program, and the “most local” variable is the one that will be used. The initial values are determined at runtime by the values “passed in” during a function call.
The calling function picks up where it left off, using the return value as the result of
evaluating the function.
C is said to be a call-by-value language, meaning that the called function only
receives a copy of the value of each variable passed as an argument, rather than
references to the variables themselves.
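Call-by-value can be checked with a short sketch. The body of add_two_numbers is an assumption (the slides only show the call), and the other function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical definition of the function called on the slide. */
int add_two_numbers(int a, int b)
{
    return a + b;
}

/* Modifies only its own local copy of the argument. */
int try_to_change(int x)
{
    x = x + 100;            /* changes the parameter, not the caller's variable */
    return x;
}

/* Returns the caller's variable after the call to show it is unchanged. */
int caller_value_after_call(void)
{
    int x = 5;
    int result = try_to_change(x);   /* the value 5 is copied into the parameter */
    assert(result == 105);
    return x;                        /* the caller's x was never touched */
}
```

Even though both functions use the name x, the "most local" x inside try_to_change is a separate variable, so caller_value_after_call() still returns 5.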
Pointers and Memory
C, like assembly, allows direct access to addresses of variables
There is a distinction between data and the address where data resides
o Similar to the distinction between immediate and indirect addressing modes
We often wish to treat addresses themselves as ordinary values
A variable which holds an address has a pointer type.
At declaration we indicate the referent type:
uint16_t * ptr_to_halfword = &x;
o uint16_t is the referent type, ptr_to_halfword is the identifier, and &x is the address of variable x
The referent type indicates what type we should use to interpret the contents of the memory at a given address (the referent). We say that a pointer “points to” a referent.
Which address to use is determined by the value of the pointer variable, stored as an integer type.
This means all pointer variables have the same size (on most modern architectures), irrespective of the size of their referent types.
Again, different (referent) types produce different behaviors:
o Example: adding 1 to a uint32_t * actually increases the stored address value by 4: it can be thought of as “point to the next uint32_t in memory”
o When accessing memory using a pointer, the referent type of the pointer determines the number of bytes modified
o Raw memory has no type information attached (types don’t exist in machine code); the pointer referent type determines how pointed-to data behaves
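The two referent-type effects described above (address step size and number of bytes modified) can be measured directly. This is a minimal sketch with hypothetical function names; it views memory through unsigned char pointers to measure byte distances.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes between p and p + 1 when p is a uint32_t pointer. */
size_t step_of_u32_pointer(void)
{
    uint32_t words[2] = {0u, 0u};
    uint32_t *p = &words[0];
    return (size_t)((unsigned char *)(p + 1) - (unsigned char *)p);
}

/* Bytes between p and p + 1 when p is a uint16_t pointer. */
size_t step_of_u16_pointer(void)
{
    uint16_t halfwords[2] = {0u, 0u};
    uint16_t *p = &halfwords[0];
    return (size_t)((unsigned char *)(p + 1) - (unsigned char *)p);
}

/* Count how many bytes of a zeroed 4-byte buffer change after one
   store through a uint16_t pointer. */
unsigned bytes_changed_by_halfword_store(void)
{
    uint16_t x[2] = {0u, 0u};
    uint16_t *ptr_to_halfword = &x[0];
    const unsigned char *raw = (const unsigned char *)x;
    unsigned changed = 0;

    *ptr_to_halfword = 0xFFFFu;      /* the referent type selects a 2-byte store */
    for (size_t i = 0; i < sizeof x; i++) {
        if (raw[i] != 0u) {
            changed++;
        }
    }
    assert(changed <= sizeof x);
    return changed;
}
```

The step is 4 bytes for uint32_t * and 2 bytes for uint16_t *, and the halfword store changes exactly 2 of the 4 bytes.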
Assembly Programming
Flow Control & Branching
Module 5: Part 2 (M5.2)
ECE 331
Prof. Nihar Mahapatra
(adapted from Prof. A. Mason’s lecture notes; other sources listed at the end)
Module Learning Objectives
o Explain flow control constructs for conditional and looping program execution
o Describe relationship between status flags and conditional operations
o Write effective ASM branch instructions for unconditional and conditional operations
Counting Conditional Loops
Typically, a loop needs to repeat a specific number of times
Tracking the number of times a loop has executed requires a counting loop
For loop: counting loops
o most basic and common counting loop
o for x = 1 to n
- repeats process n times
- then continues to next process
o Is this a do-while or a repeat-until structure?
o do process while condition true
How do we transfer the for-next counting loop into a linear flow?
o suitable for linear storage in memory
Linear, do-while structure, for-loop counting loop
o do process n times while test condition is true
o then go to next when test condition is false
How does the loop count?
o initialize counter: x=0
o test: compare x with n-1
- why n-1? first loop: x=0, second loop: x=1, etc., so stop at n-1
o increment counter: x=x+1
How could the test be changed to: is x=n?
[Figure: flowchart of a for loop (for x = 1 to n, next) redrawn as a do-while structure]
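The linear counting loop can be sketched in C. This is one possible reading, not the slide's own flowchart: the comparison operator in the test is an assumption (the slide's operator did not survive extraction), and the function names are hypothetical. The first version repeats while the test is true (do-while); the second repeats until "is x=n" becomes true (repeat-until).

```c
#include <assert.h>

/* do-while form: initialize, process, increment, repeat while test is true. */
int count_do_while(int n)
{
    int x = 0;                 /* initialize counter */
    int runs = 0;
    do {
        runs++;                /* process to repeat */
        x = x + 1;             /* increment counter */
    } while (x <= n - 1);      /* test (assumed): repeat while x <= n-1 */
    return runs;
}

/* repeat-until form: same loop, but repeat until x == n is true. */
int count_repeat_until(int n)
{
    assert(n >= 1);            /* with n < 1 the exit test x == n is never met */
    int x = 0;
    int runs = 0;
    do {
        runs++;
        x = x + 1;
    } while (!(x == n));       /* a true test ends the loop and goes to next */
    return runs;
}

/* Equivalent C for loop, with the test at the top. */
int count_for(int n)
{
    int runs = 0;
    for (int x = 0; x < n; x++) {
        runs++;
    }
    return runs;
}
```

All three run the process n times for n >= 1. They differ for n = 0: because the test sits after the process, count_do_while(0) still runs once, whereas count_for(0) runs zero times.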
Active Learning: Linear counting loops
Using flowchart symbols, construct a linear counting loop (for-next) with a repeat-until structure
o repeat-until == repeat process until test condition is True
What do we need?
o initialize counter
o process to repeat
o increment counter
o test counter
o next process
What is the test condition?
o What test result ends the loop and goes to next (T/F)?
o What test result repeats (T/F)?
o Where does the repeat flow go?
Could the condition test be moved to the top (after counter initialization)? How?
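[Figure: for-next flowchart with blocks for counter initialization (x=), process, counter increment (x=x+), a test of x against n with T (yes) and F (no) exits, and next]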
Flow Control Management Constructs
Program flow structures
o sequential
o conditional
o looping
Implementing flow control in linear (sequential) code requires capability to:
o branch
- move forward or backward to non-sequential position in code
- unconditional - always
- conditional – if T/F
o call subroutine
- jump to subroutine defined outside of (e.g., below) the main sequential code
- when subroutine is finished return to point of call & continue executing code
- returning to point of call is the defining characteristic of a Subroutine (compared to Branch)
Constructs referred to as functions, procedures, and methods (object oriented programming) are called subroutines in ASM
[Figure: flow diagrams for conditional flow (always branch; branch if T/F) and for a subroutine (call, return)]
Branching
Branch = alter program execution order
o re-direct instruction fetch to new location in program by adjusting PC value
Branching enables conditional program execution
o permits nonlinear instruction execution
o realize if-then-else program constructs
Branching enables looping program execution
Assembly Branch Instructions
o for ARM Cortex M0+ (ARMv6-M)
Branch Conditions
o unconditional: branch will always occur
o conditional: branch only if condition is true
Branch (standard)
o adjust PC to new location and begin fetching
o no knowledge of where program was before Branch
Branch – link
o sub-type of branch for subroutines
o stores return address that points to next instruction after branch
- allows code to return to point of subroutine call
o return address stored in Link Register
- CPU register R14 (LR)
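The difference between a standard branch and a branch with link can be modeled with two variables. This is a simplified sketch, not a faithful Cortex-M0+ model: it assumes pc holds the address of the instruction being executed and that the branch-link instruction is 4 bytes long, and it ignores details such as the Thumb state bit kept in bit 0 of LR. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal CPU state for this sketch: program counter and link register. */
typedef struct {
    uint32_t pc;
    uint32_t lr;
} Cpu;

/* B: redirect instruction fetch; nothing about the old location is kept. */
void branch(Cpu *cpu, uint32_t target)
{
    assert(cpu != 0);
    cpu->pc = target;
}

/* BL: save the address of the next instruction in LR, then branch. */
void branch_link(Cpu *cpu, uint32_t target)
{
    assert(cpu != 0);
    cpu->lr = cpu->pc + 4u;    /* return address: instruction after the BL */
    cpu->pc = target;
}

/* BX LR: return to the point of call by copying LR back into PC. */
void return_from_subroutine(Cpu *cpu)
{
    assert(cpu != 0);
    cpu->pc = cpu->lr;
}
```

After branch_link from address 0x100 to 0x200, lr holds 0x104, and return_from_subroutine resumes there. After a plain branch, lr is unchanged, so there is no way back.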
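[Figure: function call diagram showing a call from the main instruction sequence (addresses starting at A) to a subroutine, and the return to the instruction after the call]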
Conditional Branches
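Conditional Instruction Suffixes
o Examples:
BNE again ; branch if not equal to label ‘again’
BXVS R1 ; branch if overflow to [R1]
BLEQ f_zero ; branch-link if equal/zero to subroutine at label ‘f_zero’
o Note: these examples show the general suffix syntax; on ARMv6-M only B accepts a condition suffix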
Suffix    Function                        Condition Flags     Condition Code
EQ        Equal (to zero)                 Z == 1              0000
NE        Not equal (to zero)             Z == 0              0001
CS (1)    Carry set                       C == 1              0010
CC (2)    Carry clear                     C == 0              0011
MI        Minus, negative                 N == 1              0100
PL        Plus, positive or zero          N == 0              0101
VS        Overflow                        V == 1              0110
VC        No overflow                     V == 0              0111
HI        Unsigned higher                 C==1 and Z==0       1000
LS        Unsigned lower or same          C==0 or Z==1        1001
GE        Signed greater than or equal    N == V              1010
LT        Signed less than                N != V              1011
GT        Signed greater than             Z==0 and N==V       1100
LE        Signed less than or equal       Z==1 or N!=V        1101
none (3)  Always (unconditional)          any                 1110
o Redundant suffixes
(1) same as HS (unsigned higher or same)
(2) same as LO (unsigned lower)
(3) same as AL (always)
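The table can be read as a lookup from condition code to a test on the N, Z, C, V flags. The sketch below is an illustration with hypothetical names: compare() models the flags a CMP a, b would set (it computes a - b), and condition_passes() applies the table row by row. Code 1111 is not in the table and is treated as "always" here purely as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool n, z, c, v;
} Flags;

/* Flags as set by a compare of a with b, i.e. by computing a - b. */
Flags compare(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    Flags f;
    f.n = (r >> 31) != 0u;                       /* result negative */
    f.z = (r == 0u);                             /* result zero */
    f.c = (a >= b);                              /* carry set means no borrow */
    f.v = ((((a ^ b) & (a ^ r)) >> 31) != 0u);   /* signed overflow on subtract */
    return f;
}

/* Evaluate one 4-bit condition code from the table against the flags. */
bool condition_passes(unsigned code, Flags f)
{
    assert(code <= 0xFu);
    switch (code) {
    case 0x0: return f.z;                        /* EQ */
    case 0x1: return !f.z;                       /* NE */
    case 0x2: return f.c;                        /* CS / HS */
    case 0x3: return !f.c;                       /* CC / LO */
    case 0x4: return f.n;                        /* MI */
    case 0x5: return !f.n;                       /* PL */
    case 0x6: return f.v;                        /* VS */
    case 0x7: return !f.v;                       /* VC */
    case 0x8: return f.c && !f.z;                /* HI */
    case 0x9: return !f.c || f.z;                /* LS */
    case 0xA: return f.n == f.v;                 /* GE */
    case 0xB: return f.n != f.v;                 /* LT */
    case 0xC: return !f.z && (f.n == f.v);       /* GT */
    case 0xD: return f.z || (f.n != f.v);        /* LE */
    default:  return true;                       /* 1110 always; 1111 assumed */
    }
}
```

Comparing 0xFFFFFFFF with 1 shows why both families exist: HI passes (a large unsigned number), while LT also passes (the same bits read as signed are -1).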
ASM Branch Instructions: Unconditional
Unconditional branches
o B label ; branch Always to label
- branch value (11-bit distance) stored with instruction (immediate addressing)
- target address must be within 2 KB of branch instruction (-2048 B to +2046 B)
o BX Rm ; branch indirect to location specified by Rm
- target address (32-bit) held in Rm; inherent addressing
- target can be anywhere in 4GB memory map
- commonly used as BX LR to return from subroutine; see next section (M5.3)
o BL label ; branch with link to subroutine at label
- BL is a 32-bit instruction; the branch distance is stored within the instruction (PC-relative addressing)
- target address must be within about ±16 MB of branch instruction (-16777216 B to +16777214 B)
- return address (instruction after the BL) stored to Link Register
o BLX Rm ; branch with link to subroutine @ [Rm]
- target address (32-bit) held in Rm; inherent addressing
- target can be anywhere in 4GB memory map
- return address stored to Link Register
Advanced understanding: since instructions are 16 bits (2 bytes) and halfword-aligned, the 11-bit signed offset counts 2-byte units, so it spans 4 KB of instruction addresses (-2048 B to +2046 B)
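The branch ranges quoted on these slides follow from sign-extending the stored offset field and multiplying by 2. The sketch below reproduces that arithmetic. The function name is hypothetical, and treating the BL offset as a single 24-bit field is a simplification of how the instruction actually packs its bits.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a signed offset field of `bits` bits, counted in halfwords,
   into a byte offset. */
int32_t branch_offset_bytes(uint32_t field, unsigned bits)
{
    assert(bits >= 1u && bits <= 24u);
    uint32_t mask = (1u << bits) - 1u;
    uint32_t sign = 1u << (bits - 1u);
    int32_t halfwords = (int32_t)(field & mask);

    if ((field & sign) != 0u) {
        halfwords -= (int32_t)(1u << bits);   /* sign-extend a negative field */
    }
    return halfwords * 2;                     /* 2 bytes per halfword */
}
```

With 11 bits the extremes are -2048 and +2046 bytes (unconditional B), with 8 bits -256 and +254 bytes (conditional B), and with 24 bits -16777216 and +16777214 bytes (BL).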
ASM Branch Instructions: Conditional
Conditional branches
o written by including {cond} conditional suffix option within branch instruction
o {cond} suffix only available for B, not BX, BL, or BLX, for ARMv6-M instruction set
o branch value (8-bit distance) stored with instruction (immediate addressing)
o target address must be within 256 B of branch instruction (-256 B to +254 B)
direct status flags
o BEQ label ; branch if Z == 1 Equal
o BNE label ; branch if Z == 0 Not equal
o BCS label ; branch if C == 1 Higher or same, unsigned ≥
- BHS label ; branch if C == 1 Higher or same, unsigned ≥
o BCC label ; branch if C == 0 Lower, unsigned <
- BLO label ; branch if C == 0 Lower, unsigned <
o BMI label ; branch if N == 1 Negative
o BPL label ; branch if N == 0 Positive or zero
o BVS label ; branch if V == 1 Overflow
o BVC label ; branch if V == 0 No overflow
derived status flags
o BHI label ; branch if C==1 and Z==0 Higher, unsigned >
o BLS label ; branch if C==0 or Z==1 Lower or same, unsigned ≤
o BGE label ; branch if N == V Greater than or equal, signed ≥
o BLT label ; branch if N != V Less than, signed <
o BGT label ; branch if Z==0 and N==V Greater than, signed >
o BLE label ; branch if Z==1 or N!=V Less than or equal, signed ≤
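In C, the choice between the unsigned branches (such as BHI) and the signed branches (such as BGT) comes from the operand types rather than from the programmer. The following sketch, with hypothetical function names, compares the same two bit patterns both ways; which branch instruction a compiler actually emits is not shown and depends on the toolchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Unsigned "higher than": the comparison a BHI-style branch expresses. */
bool higher_unsigned(uint32_t a, uint32_t b)
{
    return a > b;
}

/* Signed "greater than" on the same bit patterns: the comparison a
   BGT-style branch expresses. */
bool greater_signed(uint32_t a_bits, uint32_t b_bits)
{
    int32_t a;
    int32_t b;
    assert(sizeof a == sizeof a_bits);
    memcpy(&a, &a_bits, sizeof a);   /* reinterpret the raw bits as signed */
    memcpy(&b, &b_bits, sizeof b);
    return a > b;
}
```

The pattern 0xFFFFFFFF is higher than 1 as an unsigned value, but as a signed value it is -1, which is not greater than 1.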
Active Learning: Example Branch Loops
59: B Next
o branch always to 0x0.D
61: BEQ PT
o does what? if R2=0 (Z=1), then set PC=0xDC, i.e., jump to line 65
o else what? execute line 63+
64: BX R
o does what? branch always to PT, i.e., jump to line 67, skipping line 65
What is the value in R4 after executing this code?
o 0xFFFF.FFFF
‘.’ means ‘self’: a branch to ‘.’ branches to itself, creating an infinite loop
Stack Pointer
Stack Pointer (SP) is the CPU register that tracks the top-of-stack
o SP always contains an address value that points to the last data placed on the stack
o SP is initially defined as the bottom of the stack
- the next (higher) address after the highest address assigned to the stack
- no data is ever stored at bottom-of-stack; it sets the boundary for stack underflow
Figure 3.27. Stack picture showing three numbers first being pushed (PUSH {R0}, PUSH {R1}, PUSH {R2}), then three numbers being popped (POP {R5}, POP {R4}, POP {R3}). You can draw the stack so that the lowest address is on the top (like this one) or so that the lowest address is on the bottom. The important matter is to be clear, accurate, and consistent.
Example: PUSH {R0}, PUSH {R1}, PUSH {R2} followed by POP {R3}, POP {R4}, POP {R5}
o what will this do functionally? physically?
Stack Pointer
32b Push: store value above (lower addr.) current SP
- SP ← SP – 4
- store word to @[SP:SP+3]
32b Pop: load value from current SP
- load register from @[SP:SP+3]
- SP ← SP + 4
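The push and pop steps above can be traced with a small memory model. This is a sketch under stated assumptions: the RAM base and size follow the 0x2000.0000 to 0x2000.2FFF range used elsewhere in these notes, SP starts one address past the end of RAM, and the type and function names are hypothetical. No bounds checking is done here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RAM_BASE 0x20000000u   /* assumed start of RAM */
#define RAM_SIZE 0x3000u       /* assumed RAM size in bytes */

typedef struct {
    uint8_t  ram[RAM_SIZE];
    uint32_t sp;               /* address of the last word placed on the stack */
} Machine;

/* 32-bit push: SP <- SP - 4, then store the word at [SP:SP+3]. */
void push32(Machine *m, uint32_t value)
{
    assert(m != 0);
    m->sp -= 4u;
    memcpy(&m->ram[m->sp - RAM_BASE], &value, 4u);
}

/* 32-bit pop: load the word from [SP:SP+3], then SP <- SP + 4. */
uint32_t pop32(Machine *m)
{
    uint32_t value;
    assert(m != 0);
    memcpy(&value, &m->ram[m->sp - RAM_BASE], 4u);
    m->sp += 4u;
    return value;
}
```

Pushing 1, 2, 3 and then popping three times returns 3, 2, 1. Functionally, PUSH {R0}, PUSH {R1}, PUSH {R2} followed by POP {R3}, POP {R4}, POP {R5} therefore leaves R3 with the old R2 and R5 with the old R0; physically, SP moves down 12 bytes and back up again.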
Stack ASM Instructions
o PUSH == store register(s) to stack
- syntax: PUSH {reglist}
- stores a subset, or possibly all, of the general-purpose registers R0-R7 and LR to the stack
- the registers are stored in sequence, the lowest-numbered register to the lowest memory address, through to the highest-numbered register to the highest memory address
- { } are required
- Exs: PUSH {R1} PUSH {R0,R4-R7}
- register value stored to top-of-stack @[SP]
o POP == load register(s) from stack
- syntax: POP {reglist}
- reglist = list of registers (all or a subset of R0-R7 and PC) to be loaded; { } are required
- the lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered register from the highest memory address
- if PC is specified in the register list, the instruction causes a branch to the address (data) loaded into the PC
- Exs: POP {R1} POP {R0,R4-R7}
- register value loaded from top-of-stack @[SP]
o LDR SP,
- set initial location of SP (bottom of stack)
- Ex: LDR SP,=bottomstack
ARM University Program Copyright © ARM Ltd 2013
Stack Operations
Push some or all of registers (R0-R7, LR) to stack
o PUSH {reglist}
o Decrements SP by 4 bytes for each register saved
o Pushing LR saves return address
o Ex: PUSH {r1, r2, LR}
Pop some or all of registers (R0-R7, PC) from stack
o POP {reglist}
o Increments SP by 4 bytes for each register restored
o If PC is popped, then execution will branch to new PC value after this POP instruction (e.g., return address)
o Ex: POP {r5, r6, r7}
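The register-list ordering rule (lowest-numbered register at the lowest address) can be sketched as follows. The model is simplified and its names are hypothetical: only R0-R7 are modeled (no LR or PC), the register list is a bit mask where bit i selects Ri, and sp is a word index into a small array, so each step of 1 stands for 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

typedef struct {
    uint32_t regs[8];            /* R0-R7 */
    uint32_t mem[STACK_WORDS];   /* stack memory, one entry per 32-bit word */
    uint32_t sp;                 /* index of top-of-stack word; STACK_WORDS when empty */
} Core;

/* Number of registers selected in the list. */
static unsigned count_regs(uint8_t reglist)
{
    unsigned n = 0;
    for (unsigned i = 0; i < 8u; i++) {
        if ((reglist & (1u << i)) != 0u) {
            n++;
        }
    }
    return n;
}

/* PUSH {reglist}: lower SP by one word per register, then store the
   registers in ascending order to ascending addresses. */
void push_list(Core *c, uint8_t reglist)
{
    unsigned n = count_regs(reglist);
    assert(c->sp >= n);                      /* room left in the allocated area */
    c->sp -= n;
    uint32_t slot = c->sp;
    for (unsigned i = 0; i < 8u; i++) {
        if ((reglist & (1u << i)) != 0u) {
            c->mem[slot++] = c->regs[i];
        }
    }
}

/* POP {reglist}: load registers in ascending order from ascending
   addresses, then raise SP by one word per register. */
void pop_list(Core *c, uint8_t reglist)
{
    unsigned n = count_regs(reglist);
    assert(c->sp + n <= STACK_WORDS);        /* enough data on the stack */
    uint32_t slot = c->sp;
    for (unsigned i = 0; i < 8u; i++) {
        if ((reglist & (1u << i)) != 0u) {
            c->regs[i] = c->mem[slot++];
        }
    }
    c->sp = slot;
}
```

With this model, PUSH {R1, R2} followed by POP {R1, R2} restores both registers, because both instructions walk the list in the same ascending order.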
More Stack
Stack Allocation
o Stack is a defined block of RAM
- typically at the top (lowest addr.) or bottom
o ARM Cortex-M0+ (KL25Z) memory map
- RAM occupies 0x2000.0000 to 0x2000.2FFF
o Stack ending at bottom of RAM (suggested method)
- stack occupies 0x2000.TBD – 0x2000.2FFF (TBD = to be determined by programmer)
- initialize SP to 0x2000.
- all other RAM use limited to before 0x2000.TBD
o Stack starting at top of RAM
- stack occupies 0x2000.0000 – 0x2000.TBD
- initialize SP to 0x2000.TBD
- all other RAM use limited to after 0x2000.TBD
Stack Faults
o overflow
- stack filled beyond the top
o underflow
- stack unloaded (popped) beyond the bottom
Stack rules
o functions (subroutines) must have an equal number of pushes and pops
- otherwise, stack will become misaligned
o stack access instructions should only be used for stack-allocated area
o should not read/write (load/store) to stack-allocated area; only push/pop
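The overflow and underflow boundaries can be expressed as simple address checks. The sketch below assumes a 4096-byte stack ending at the last RAM location, so the two constants are illustrative choices rather than values given on the slides, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_LIMIT  0x20002000u   /* assumed lowest address of the stack area */
#define STACK_BOTTOM 0x20003000u   /* assumed initial SP (bottom of stack) */

/* A push first moves SP down by 4; it overflows if that leaves the area. */
bool push_would_overflow(uint32_t sp)
{
    return (sp - 4u) < STACK_LIMIT;
}

/* A pop reads at SP; it underflows if SP is already at the bottom. */
bool pop_would_underflow(uint32_t sp)
{
    return sp >= STACK_BOTTOM;
}

/* Number of 32-bit words currently on the stack. */
unsigned words_on_stack(uint32_t sp)
{
    assert(sp <= STACK_BOTTOM);
    return (unsigned)((STACK_BOTTOM - sp) / 4u);
}

/* Stack rule: a subroutine is balanced if SP on exit equals SP on entry. */
bool is_balanced(uint32_t sp_on_entry, uint32_t sp_on_exit)
{
    return sp_on_entry == sp_on_exit;
}
```

A full stack (SP at STACK_LIMIT) holds 1024 words and cannot accept another push; an empty stack (SP at STACK_BOTTOM) cannot be popped.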
[Figure: two ways to allocate a stack in RAM, each marking SP, the allocated stack area, the overflow and underflow boundaries, and the remaining RAM ("More RAM"): a stack starting at the first RAM location, and a stack ending at the last RAM location. Address labels in the figure include 0x2000.0FFC and 0x2000.7FFC.]
Examples for 4096-byte stack that can store 1024 32-bit words
o 0x2000.2FFC: last word will fill .2FFC-.2FFF
Stack Example
Program block to load two values, then call a subroutine that will generate the average of the two values
Keil simulator & Stack
o Default stack pointer: SP = 0x2000.
[Figure: simulator views before (initial) and after (final)]
Stack Example: Main
o Registers after line 85
[Figure: register views, BEFORE and AFTER]
Stack Example: Main pt. 2
Where does BL go?
o where do you look to find this?
How does program know where to return?
o where do you look to find this?
Stack Example: Subroutine
PUSH
o before
o after
o what’s in SP?
Stack Example: Subroutine Return
o after
o what is PC?
- PC set to LR (sort of)
Main pt. 3
MOVS
o before
- R0=s/r output overwritten
- R1-R3, unchanged
- R4-R7, preserved by s/r
- LR unchanged
o after
Active Learning
Assembly flow control
Assume
R0 = Ram_Data, address in RAM to store values
R1 = value
R2 = value
R3 = value
Write an assembly code segment that will use a 3-count loop to store values
from R1, R2, R3 to sequential 32-bit memory locations starting at
Ram_Data.
Step 1, prepare a flowchart to describe your loop algorithm
Step 2, adapt flowchart to linear flow and add detail to map each step in flowchart
to a single assembly instruction.
Step 3, write assembly code segment
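Before writing the assembly, the loop algorithm can be prototyped in C. This is only one possible reading of the exercise, not the expected solution: parameters stand in for R0-R3, the local variable names are hypothetical, and the register-shifting step (moving R2 into R1 and R3 into R2) is an assumption about how a single store instruction can be reused on every pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store r1, r2, r3 to three sequential 32-bit locations starting at
   ram_data, using a 3-count loop. */
void store_three(uint32_t *ram_data, uint32_t r1, uint32_t r2, uint32_t r3)
{
    assert(ram_data != NULL);
    uint32_t *r0 = ram_data;     /* R0: RAM pointer */
    unsigned count = 0;          /* initialize loop counter */

    do {
        *r0 = r1;                /* store R1 to the location R0 points at */
        r1 = r2;                 /* shift the next value into R1 */
        r2 = r3;
        r0 = r0 + 1;             /* change RAM pointer: next 32-bit word (+4 bytes) */
        count = count + 1;       /* increment counter */
    } while (count != 3u);       /* count = 3? T: done, F: repeat */
}
```

Each statement in the loop body corresponds to roughly one assembly instruction (a store, two moves, two adds, a compare, and a conditional branch back to the top of the loop), which is the level of detail Step 2 asks for.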
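Active Learning: Step 1: Loop flowchart (starter)
[Figure: starter flowchart building blocks for the copy loop. Starting state: R0 = [Ram_Data]; R1, R2, R3 hold the values. Blocks: initialize loop counter (i=), store a register to [R0], change RAM pointer, register moves involving R1 and R2, increment counter (i++), test "count = 3?" with T and F exits, done.]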
Active Learning Worksheet: Step 1: Loop flowchart
[Figure: worksheet copy of the same starter flowchart building blocks for the copy loop]