2 Introduction To Assembly
Chapter 2
Topics
Assembly Programming
AVR’s CPU
Its architecture
Some simple programs
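Data Memory access
RISC architecture
[Figure: AVR block diagram. The CPU connects to the program Flash ROM over the program bus, and to RAM, EEPROM, timers, the interrupt unit, the oscillator (OSC), ports, and other peripherals over the data bus; the ports drive the I/O pins.]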
Assembly Programming
[Figure: the roles of the compiler and the assembler]
Assembly Programming
Assembly language instructions are native
to the particular CPU used in the system.
For example, a program written in 8086
assembly language cannot be executed on
a system with a different CPU.
Programming in assembly language also
requires detailed knowledge of the
system components such as the CPU,
memory, and so on.
Why program in Assembly language?
There are two main reasons why
programming is still done in assembly
language: (i) efficiency, and (ii)
accessibility to system hardware.
Efficiency: memory efficiency and
execution time efficiency
System hardware: for example, developing
device drivers
Typical Applications
Assembly languages were once widely
used for all sorts of programming.
However, by the 1980s (1990s
on microcomputers), their use had largely
been replaced by higher-level languages,
in the search for improved programming
productivity.
Today, assembly language is still used for
direct hardware manipulation, access to
specialized processor instructions, or to
address critical performance issues
Why learn Assembly?
Learning assembly language has both
practical and educational purposes.
When you program in a high-level language
such as C, you are shielded from low-level
details on purpose and provided only a
"black-box" view of the system.
Understanding assembly language is
understanding the computer system itself!
C Versus Assembly Language
Let’s look at a bubble sort example.
Executable file size:
C version : 50,256 bytes
Assembly language version: 50,208 bytes
Execution time:
…
[Figure: AVR CPU block diagram showing the general purpose registers (R0 to R31), the PC register, the instruction register, the instruction decoder, and SREG with its flag bits I T H S V N Z C]
General Purpose Registers (GPRs)
The majority of AVR registers are 8-bit.
In AVR, there are 32 GPRs: R0 to R31.
GPRs can be used for arithmetic and
logical operations (like the accumulator
registers in microprocessors).
Some simple instructions
1. Loading values into the general purpose registers
…
LDI R16,53 ;R16 = 53
LDI R19,$27 ;R19 = 0x27
LDI R23,0x27 ;R23 = 0x27
LDI R23,0b11101100 ;R23 = 0b11101100
If the value is in hex, we put $ or 0x in front
of the value. For example, 0xAF or $AF.
If a single hex digit is loaded into the register,
the upper four bits are set to zero. For example,
LDI R19, 0x5
R19 will be 0x05 or 0000 0101b.
If you load a value larger than 255 (0xFF),
the assembler reports an error.
LDI R19, 0x7F2 ; illegal
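Because a register holds only 8 bits, a wider constant such as 0x7F2 has to be split across two registers. The following is a minimal sketch, assuming the AVR assembler's built-in LOW() and HIGH() functions; the choice of R19 and R20 is arbitrary and only for illustration.

```asm
; LDI works only with the upper registers R16 to R31.
LDI R19, LOW(0x7F2)   ;R19 = 0xF2 (low byte of 0x7F2)
LDI R20, HIGH(0x7F2)  ;R20 = 0x07 (high byte of 0x7F2)
```

Each operand now fits in 8 bits, so the assembler accepts both lines.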
Some simple instructions
2. Arithmetic calculation
There are some instructions for doing
arithmetic and logic operations, such as
ADD, SUB, MUL, AND, etc.
ADD Rd,Rs
Rd = Rd + Rs
Example:
ADD R25, R9
R25 = R25 + R9
ADD R17,R30
R17 = R17 + R30
A simple program
Write a program that calculates 19 + 95
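One possible solution is sketched here, assuming R16 and R20 as working registers (any pair from R16 to R31 would do for LDI):

```asm
LDI R16, 19    ;R16 = 19
LDI R20, 95    ;R20 = 95
ADD R16, R20   ;R16 = R16 + R20 = 114
```

The sum is left in R16, the destination register of ADD.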
A simple program
Write a program that calculates 19 + 95 + 5
LDI R16, 19 ;R16 = 19
LDI R20, 95 ;R20 = 95
LDI R21, 5 ;R21 = 5
ADD R16, R20 ;R16 = R16 + R20
ADD R16, R21 ;R16 = R16 + R21
Some simple instructions
2. Arithmetic calculation
SUB Rd,Rs
Rd = Rd - Rs
Example:
SUB R25, R9
R25 = R25 - R9
SUB R17,R30
R17 = R17 - R30
Some simple instructions
2. Arithmetic calculation
INC Rd
Rd = Rd + 1
Example:
INC R25
R25 = R25 + 1
DEC Rd
Rd = Rd - 1
Example:
DEC R23
R23 = R23 - 1
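The arithmetic instructions above can be combined in a short sequence. This is only an illustrative sketch; the register choices and the values 19 and 95 are assumptions carried over from the earlier addition example.

```asm
LDI R16, 19    ;R16 = 19
LDI R17, 95    ;R17 = 95
SUB R17, R16   ;R17 = 95 - 19 = 76
INC R17        ;R17 = 76 + 1 = 77
DEC R16        ;R16 = 19 - 1 = 18
```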
Data Address Space
Program is stored in code memory space.
Standard I/O registers (SFRs): data memory address, I/O address, and name
Address      Name     Address      Name     Address      Name
Mem.  I/O             Mem.  I/O             Mem.  I/O
$20   $00    -        $36   $16    TIFR1    $4C   $2C    SPCR0
$21   $01    -        $37   $17    TIFR2    $4D   $2D    SPSR0
$22   $02    -        $38   $18    -        $4E   $2E    SPDR0
$23   $03    PINB     $39   $19    -        $4F   $2F    -
$24   $04    DDRB     $3A   $1A    -        $50   $30    ACSR
$25   $05    PORTB    $3B   $1B    PCIFR    $51   $31    DWDR
$26   $06    PINC     $3C   $1C    EIFR     $52   $32    -
$27   $07    DDRC     $3D   $1D    EIMSK    $53   $33    SMCR
$28   $08    PORTC    $3E   $1E    GPIOR0   $54   $34    MCUSR
$29   $09    PIND     $3F   $1F    EECR     $55   $35    MCUCR
$2A   $0A    DDRD     $40   $20    EEDR     $56   $36    -
$2B   $0B    PORTD    $41   $21    EEARL    $57   $37    SPMCSR
$2C   $0C    -        $42   $22    EEARH    $58   $38    -
$2D   $0D    -        $43   $23    GTCCR    $59   $39    -
$2E   $0E    -        $44   $24    TCCR0A   $5A   $3A    -
$2F   $0F    -        $45   $25    TCCR0B   $5B   $3B    -
$30   $10    -        $46   $26    TCNT0    $5C   $3C    -
$31   $11    -        $47   $27    OCR0A    $5D   $3D    SPL
$32   $12    -        $48   $28    OCR0B    $5E   $3E    SPH
$33   $13    -        $49   $29    -        $5F   $3F    SREG
$34   $14    -        $4A   $2A    GPIOR1
$35   $15    TIFR0    $4B   $2B    GPIOR2

Data memory map:
$0000 to $001F: General Purpose Registers (R0 to R31)
$0020 to $005F: Standard I/O Registers (SFRs), I/O addresses $00 to $3F
$0060 and up: Extended I/O, followed by SRAM
up to $FFFF: External SRAM
[Figure: data memory maps for two device groups (ATmega328, ATmega64, ATmega128 and ATmega640/V, ATmega1280/V, ATmega1281/V, ATmega2560/V, ATmega2561/V); in the second group the internal SRAM ends at $21FF and the external SRAM spans $2200 to $FFFF]

LDS (Load direct from data space)
Example:
LDS R1, 0x60 ;R1 = [0x60]

STS (Store direct to data space)
Example:
STS 0x60,R15 ; [0x60] = R15

Example: What does the following instruction do?
LDS R20, 2
Answer: It copies the contents of R2 into R20, as 2 is the address of R2.

Example: Store 0x53 into the SPH register. The address of SPH is 0x5E.
Solution:
LDI R20, 0x53 ;R20 = 0x53
STS 0x5E, R20 ;SPH = R20

Example: Write a program that stores 55 into location 0x80 of RAM.
Solution:
LDI R20, 55 ;R20 = 55
STS 0x80, R20 ;[0x80] = R20 = 55

Example: Write a program that copies the contents of location 0x80 of RAM into location 0x81.
Solution:
LDS R20, 0x80 ;R20 = [0x80]
STS 0x81, R20 ;[0x81] = R20 = [0x80]

Example: Add contents of location 0x90 to contents of location 0x95 and store the result in location 0x313.
Solution:
LDS R20, 0x90 ;R20 = [0x90]
LDS R21, 0x95 ;R21 = [0x95]
ADD R20, R21 ;R20 = R20 + R21
STS 0x313, R20 ;[0x313] = R20
IN (IN from I/O location)
IN Rd,IOaddress ;Rd = [addr]
Example:
IN R1, 0x3F ;R1 = SREG
IN R17,0x3E ;R17 = SPH

OUT (OUT to I/O location)
OUT IOAddr,Rd ;[addr]=Rd
Example:
OUT 0x3F,R12 ;SREG = R12
OUT 0x3E,R15 ;SPH = R15

Using Names of I/O registers
Example:
IN R15,SREG ;IN R15,0x3F

Example: Write a program that adds the contents of the PINC I/O
register to the contents of PIND and stores the result in location 0x90
of the SRAM.
Solution:
IN R20,PINC ;R20 = PINC
IN R21,PIND ;R21 = PIND
ADD R20,R21 ;R20 = R20 + R21
STS 0x90,R20 ;[0x90] = R20
LDS (Load Direct from data Space)
LDS Example
The program adds the content of location 0x300 to the
content of location 0x302 and writes the result in
R1.
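The slide's listing is not reproduced in the text, so the following is a sketch of one way to write such a program. The use of R0 as the second working register is an assumption.

```asm
LDS R0, 0x300   ;R0 = [0x300]
LDS R1, 0x302   ;R1 = [0x302]
ADD R1, R0      ;R1 = R1 + R0
```

LDS accepts any of R0 to R31 as its destination, so no upper register is needed here.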
STS (Store direct To data Space)
STS Example
The program first loads R16 with 0x55. Then it
moves this value to Port B, Port C, and Port
D.
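A sketch of such a program follows. It assumes the ATmega328 data memory addresses of the port registers (PORTB at 0x25, PORTC at 0x28, PORTD at 0x2B); other devices may place them elsewhere.

```asm
LDI R16, 0x55   ;R16 = 0x55
STS 0x25, R16   ;PORTB = 0x55 (data memory address 0x25 assumed)
STS 0x28, R16   ;PORTC = 0x55 (data memory address 0x28 assumed)
STS 0x2B, R16   ;PORTD = 0x55 (data memory address 0x2B assumed)
```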
IN vs. LDS
The CPU executes IN faster than LDS (1
machine cycle vs. 2 machine cycles).
IN is a 2-byte instruction; LDS is a 4-byte instruction.
When we use IN, we can use the names of
I/O registers (no need to remember their
addresses).
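To make the comparison concrete, the sketch below reads the same register, PINB, both ways. It assumes an ATmega328 (PINB at I/O address 0x03, data memory address 0x23) and that the device definition file M328def.inc is available to supply the name PINB.

```asm
.INCLUDE "M328def.inc"   ;defines PINB = 0x03 (assumed device file)
IN  R20, PINB    ;I/O address 0x03: 2-byte instruction, 1 cycle
LDS R21, 0x23    ;data memory address of PINB: 4-byte instruction, 2 cycles
```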
MOV Instruction
It is used to copy data between GPRs.
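A minimal sketch of MOV in use; the registers R16 and R10 are arbitrary choices. Since LDI works only with R16 to R31, MOV is one way to get a constant into a lower register.

```asm
LDI R16, 0x27   ;R16 = 0x27
MOV R10, R16    ;R10 = R16 = 0x27 (R16 is unchanged)
```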
COM instruction
It complements (inverts) the content of the
register.
For example, the following program sends
0x55 to PORTB, then R16 is complemented
and sent to PORTB again.
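The program described above might look like the following sketch. It assumes the device file M328def.inc supplies the name PORTB, and it omits the DDRB setup that would be needed for the values to appear on the pins.

```asm
.INCLUDE "M328def.inc"   ;defines PORTB (assumed device file)
LDI R16, 0x55    ;R16 = 0b01010101
OUT PORTB, R16   ;PORTB = 0x55
COM R16          ;R16 = 0b10101010 = 0xAA
OUT PORTB, R16   ;PORTB = 0xAA
```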
Machine Language
ADD R0,R1
000011 00 0000 0001
(opcode: 000011, operand: 00 0000 0001)
LDI R16, 2
LDI R17, 3
ADD R16, R17
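As a worked illustration, the three-instruction program can be encoded by hand. The sketch assumes the standard AVR bit layouts for LDI (1110 KKKK dddd KKKK, where dddd is the register number minus 16) and ADD (0000 11rd dddd rrrr).

```asm
LDI R16, 2     ;1110 0000 0000 0010 = 0xE002
LDI R17, 3     ;1110 0000 0001 0011 = 0xE013
ADD R16, R17   ;0000 1111 0000 0001 = 0x0F01
```

The ADD encoding agrees with the 0F01 word shown for ADD R16, R17 in the flash listing later in this chapter.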
Status Register (SREG)
SREG: I T H S V N Z C
I: Interrupt
T: Temporary
H: Half carry
S: Sign (N xor V)
V: oVerflow
N: Negative
Z: Zero
C: Carry

Example: Show the status of the C, H, and Z flags after the addition of 0x38 and 0x2F in the following instructions:
LDI R16, 0x38 ;R16 = 0x38
LDI R17, 0x2F ;R17 = 0x2F
ADD R16, R17 ;add R17 to R16
Solution:
  $38   0011 1000
+ $2F   0010 1111
  $67   0110 0111   R16 = 0x67
C = 0 because there is no carry beyond the D7 bit.
H = 1 because there is a carry from the D3 to the D4 bit.
Z = 0 because the R16 (the result) has a value other than 0 after the addition.

Example: Show the status of the C, H, and Z flags after the addition of 0x9C and 0x64 in the following instructions:
LDI R20, 0x9C
LDI R21, 0x64
ADD R20, R21 ;add R21 to R20
Solution:
  $9C   1001 1100
+ $64   0110 0100
 $100 1 0000 0000   R20 = 00
C = 1 because there is a carry beyond the D7 bit.
H = 1 because there is a carry from the D3 to the D4 bit.
Z = 1 because the R20 (the result) has a value 0 in it after the addition.

Example: Show the status of the C, H, and Z flags after the subtraction of 0x23 from 0xA5 in the following instructions:
LDI R20, 0xA5
LDI R21, 0x23
SUB R20, R21 ;subtract R21 from R20
Solution:
  $A5   1010 0101
- $23   0010 0011
  $82   1000 0010   R20 = $82
C = 0 because R21 is not bigger than R20 and there is no borrow from D8 bit.
Z = 0 because the R20 has a value other than 0 after the subtraction.
H = 0 because there is no borrow from D4 to D3.

Example: Show the status of the C, H, and Z flags after the subtraction of 0x73 from 0x52 in the following instructions:
LDI R20, 0x52
LDI R21, 0x73
SUB R20, R21 ;subtract R21 from R20
Solution:
  $52   0101 0010
- $73   0111 0011
  $DF   1101 1111   R20 = $DF
C = 1 because R21 is bigger than R20 and there is a borrow from D8 bit.
Z = 0 because the R20 has a value other than zero after the subtraction.
H = 1 because there is a borrow from D4 to D3.

Example: Show the status of the C, H, and Z flags after the subtraction of 0x9C from 0x9C in the following instructions:
LDI R20, 0x9C
LDI R21, 0x9C
SUB R20, R21 ;subtract R21 from R20
Solution:
  $9C   1001 1100
- $9C   1001 1100
  $00   0000 0000   R20 = $00
C = 0 because R21 is not bigger than R20 and there is no borrow from D8 bit.
Z = 1 because the R20 is zero after the subtraction.
H = 0 because there is no borrow from D4 to D3.
Not all instructions alter the flag bits
AVR Data Formats
AVR has only one data type, which is 8-bit.
Its registers are also 8-bit.
It is the programmer's job to handle
data larger than 8 bits (Chapter 5).
AVR can do both signed and unsigned
arithmetic (Chapter 5).
The 8-bit data can be in hex, decimal,
binary, or ASCII format.
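The four formats can be compared side by side. In this sketch every line loads the same 8-bit value into R16; the value 0x41 (the ASCII code of 'A') is chosen only for illustration.

```asm
LDI R16, 0x41        ;hex
LDI R16, $41         ;hex, alternative prefix
LDI R16, 0b01000001  ;binary
LDI R16, 65          ;decimal
LDI R16, 'A'         ;ASCII character
```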
Assembler Directives
Instructions (LDI, ADD, etc.) tell the CPU what to
do.
Directives give instructions to the assembler itself.
Directives make our programs easier to
develop.
Assembler Directives
.EQU and .SET
.EQU and .SET are used to define a constant or a fixed
address.
The only difference is that a symbol defined with .SET can be
reassigned later, while one defined with .EQU cannot.
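The difference between the two directives can be sketched as follows. The symbol names COUNT and LIMIT are hypothetical and chosen only for this illustration.

```asm
.EQU COUNT = 0x25    ;COUNT is fixed at 0x25
.SET LIMIT = 10      ;LIMIT is 10 for now
LDI R21, COUNT       ;R21 = 0x25
LDI R22, LIMIT       ;R22 = 10
.SET LIMIT = 20      ;allowed: a .SET symbol can be redefined
LDI R22, LIMIT       ;R22 = 20
```

Redefining COUNT with a second .EQU would be rejected by the assembler.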
Assembler Directives
.ORG
The .ORG directive sets the address at which
the code that follows is placed.
Program.asm:
.ORG 0
LDI R16, 0x25
.ORG 0x7
LDI R17, 0x34
LDI R18, 0x31
Program memory after assembly:
Address  Contents
00       E205
01       0000
02       0000
03       0000
04       0000
05       0000
06       0000
07       E314
08       E321
09       0000
0A       0000
Assembler Directives
.INCLUDE
.INCLUDE "filename.ext"
M328def.inc:
.equ SREG = 0x3f
.equ SPL = 0x3d
.equ SPH = 0x3e
....
Program.asm:
.INCLUDE "M328def.inc"
LDI R20, 10
OUT SPL, R20
Sample AVR Assembly Program
Each line has the following form:
[label:] mnemonic [operands] [;comment]
The label and the comment are optional.
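The slide's sample program is not reproduced in the text. The sketch below is a stand-in that uses the line form with a directive, a label, and comments; the symbol SUM and its address 0x300 are assumptions, and M328def.inc is assumed to be the device file in use.

```asm
.INCLUDE "M328def.inc"
.EQU SUM = 0x300          ;SRAM location for the result (assumed address)
.ORG 0
        LDI R16, 0x25     ;R16 = 0x25
        LDI R17, 0x34     ;R17 = 0x34
        ADD R16, R17      ;R16 = 0x25 + 0x34 = 0x59
        STS SUM, R16      ;[0x300] = R16
HERE:   JMP HERE          ;stay here forever
```

Only the last line carries a label; every line carries an optional comment.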
Assembler
[Figure: the editor program produces the assembly source file myfile.asm; the assembler program translates it into machine language, and its outputs are downloaded to the AVR's EEPROM and the AVR's flash]
Your program goes in AVR's flash.
Map file
Map file shows the labels used in the
program together with their values.
List file
List file shows the binary and source code, and the
amount of memory the program uses.
Machine Language
LDI is a 2-byte instruction.
STS is a 4-byte instruction.
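The size difference shows up in the machine code. In this sketch the encodings are the ones that appear in the flash listing later in this chapter, with 0x300 assumed as the target address.

```asm
LDI R16, 0x25    ;one 16-bit word:  0xE205
STS 0x300, R16   ;two 16-bit words: 0x9300 0x0300
```

The second word of STS holds the 16-bit data memory address.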
Data Address Space
Each address in the data address space holds 8 bits (1 byte).
It contains the GPRs, the SFRs, extended I/O, internal SRAM, and external SRAM.
Program Address Space
The program is stored in the program Flash ROM.
Each address holds 16 bits (2 bytes) of
instruction machine code.
Program Address Space
For example, the ATmega32 program ROM is
16K x 16 bits. Its program counter is 14 bits
long, so it can access 2^14 = 16,384
different addresses (0x0000 to 0x3FFF).
AVR Memory Space
Flash memory, Data memory and EEPROM
have different address spaces.
Flash memory and PC register
Address  Machine code  Instruction
00       E205          LDI R16, 0x25
01       E314          LDI R17, $34
02       E321          LDI R18, 0x31
03       0F01          ADD R16, R17
04       0F02          ADD R16, R18
05       E01B          LDI R17, 11
06       0F01          ADD R16, R17
07       9300          STS SUM, R16
08       0300
09       940C          HERE: JMP HERE
0A       0009
Each flash location is 16 bits wide.
[Figure: the PC holds the flash address of the next instruction and steps through the values 0 to B as the program runs]
Fetch and execute
Old Architectures
[Figure: the CPU fetches an instruction (Instruct 1 to Instruct 4) from the program Flash ROM and then executes it, one instruction at a time]
Pipelining
[Figure: with pipelining, the CPU fetches the next instruction from the program Flash ROM while it executes the current one (Instruct 1 to Instruct 4)]
How to speed up the CPU
Increase the clock frequency
Higher frequency means more power consumption
and more heat
Limitations
Change the architecture
Pipelining
RISC
Changing the architecture
RISC vs. CISC
CISC (Complex Instruction Set Computer)
Put as many instructions as you can into the
CPU
RISC (Reduced Instruction Set Computer)
Reduce the number of instructions, and use
the hardware resources more effectively.
RISC architecture
Feature 1
RISC processors have a fixed instruction
size. This makes the task of the instruction
decoder easier.
In AVR the instructions are 2 or 4 bytes.
In CISC processors instructions have
different lengths.
E.g. in 8051:
CLR C ; a 1-byte instruction
ADD A, #20H ; a 2-byte instruction
LJMP HERE ; a 3-byte instruction
RISC architecture
Feature 2: reduce the number of
instructions
Pros: Reduces the number of used
transistors
Cons:
Can make the assembly programming more
difficult
Can lead to using more memory
RISC architecture
Feature 3: limit the addressing modes
Advantage: hardwiring
Disadvantage: can make the assembly programming more
difficult
RISC architecture
Feature 4: Load/Store: no direct
arithmetic/logical operations in RAM. First
bring data to the GPRs.
LDS R20, 0x200
LDS R21, 0x220
ADD R20, R21
STS 0x230, R20
RISC architecture
Feature 5 (Harvard architecture): separate buses
for opcodes and operands
Advantage: opcodes and operands can go in and out of
the CPU together.
Disadvantage: leads to more cost in general purpose
computers.
[Figure: fetch and execute stages for the sequence
LDS R20, 0x100 ; R20 = [0x100]
ADD R20, R21 ; R20 = R20 + R21]
RISC architecture
Feature 6: more than 95% of instructions
are executed in 1 machine cycle
RISC architecture
Feature 7
RISC processors have at least 32 registers.
This decreases the need for stack and memory
accesses.
In AVR there are 32 general purpose
registers (R0 to R31).