2 - Computer Architecture
Readings and Exercises
• P & H: Chapter 1
Objectives
At the end of this module, you will
1. Understand the basic components of a computer
2. Understand the basic architecture of a CPU
3. Differentiate between RISC and CISC methods
4. Know the basic structure of a machine instruction
High-Level Architecture
• A basic computer system consists of:
▪ Central Processing Unit (CPU)
▪ System clock
▪ Primary memory
• Also called Random Access Memory (RAM)
▪ Secondary memory
• Usually a hard disk drive (HDD)
High-Level Architecture (cont’d)
▪ Peripheral input and output devices
• Eg: Keyboard, monitor
▪ Bus
High-Level Architecture (cont’d)
[Figure: block diagram showing the CPU, the clock, and the bus]
CPU
• Is the “brains” of any computer system
▪ Executes instructions (i.e. a program)
▪ Controls the transfer of data across the bus
• Is usually contained on a single microprocessor
chip
▪ Eg: Intel Core i5, APM883208-X1, Apple A7
CPU (cont’d)
• Consists of 3 main parts:
▪ Control Unit (CU)
▪ Arithmetic Logic Unit (ALU)
▪ Registers
[Figure: CPU block diagram showing the Registers, the Arithmetic Logic Unit, and the Control Unit]
Registers
• Registers are binary storage units within the CPU
▪ May contain:
• Data
• Addresses
• Instructions
• Status information
▪ Eg: General-purpose registers are used by a
programmer to temporarily hold data and addresses
Registers (cont’d)
▪ Eg: The Program Counter (PC) contains the address
in memory of the currently executing instruction
• Is incremented to execute the next instruction
▪ Eg: The Status Register (SR) contains information
(flags) about the result of a previous instruction
• Eg: overflow, or carry
ALU
• The ALU performs arithmetic and logical
operations on data stored in registers
▪ Eg: Add numbers stored in 2 source registers, and
store the result in a destination register
▪ Eg: Do a bitwise AND using data in 2 registers
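The two ALU examples above can be sketched in Python. This is a minimal model, not a description of any real ALU: the 64-bit register width, the function names alu_add and alu_and, and the regs dictionary are assumptions made for illustration. The add also reports carry and overflow flags of the kind a status register would hold.

```python
MASK64 = (1 << 64) - 1   # assumed 64-bit register width
SIGN_BIT = 1 << 63


def alu_add(a, b):
    """Add two 64-bit register values; return (result, carry, overflow)."""
    total = a + b
    result = total & MASK64
    carry = total > MASK64   # unsigned carry out of bit 63
    # Signed overflow: both inputs have a sign bit that differs from the result's.
    overflow = ((a ^ result) & (b ^ result) & SIGN_BIT) != 0
    return result, carry, overflow


def alu_and(a, b):
    """Bitwise AND of two register values."""
    return a & b & MASK64


# Hypothetical register file; the names only mimic ARMv8 register names.
regs = {"x19": 0, "x20": 5, "x21": 7}
regs["x19"], carry, overflow = alu_add(regs["x20"], regs["x21"])
print(regs["x19"], carry, overflow)   # 12 False False
print(alu_and(0b1100, 0b1010))        # 8
```

Returning the flags alongside the result keeps the sketch stateless; a real CPU would instead latch them into the status register.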
CU
• The CU directs the execution of instructions
▪ Loads an operation code (opcode) from primary
memory into the Instruction Register (IR)
▪ Decodes the opcode to identify the operation
▪ If necessary, transfers data between primary memory
and registers
▪ If necessary, directs the ALU to operate on data in
registers
System Clock
• Generates a clock signal to synchronize the CPU
and other clocked devices
▪ Is a square wave at a particular frequency
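As a small worked example of what the clock frequency implies, the length of one clock cycle is the reciprocal of the frequency. The function name and the 2 GHz figure below are illustrative assumptions, not values taken from this module.

```python
def clock_period_ns(frequency_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / frequency_hz


# A hypothetical 2 GHz clock completes one cycle every 0.5 ns.
print(clock_period_ns(2e9))   # 0.5
```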
Primary Memory
[Figure: memory drawn as a column of storage locations, each with its own address: 0, 1, 2, ...]
Primary Memory (cont’d)
• Example sizes:
▪ iMac (2016): 8 GB
▪ Raspberry Pi (original): 256 MB
• In a von Neumann architecture, RAM contains
both data and programs (instructions)
• In contrast, a Harvard architecture uses separate
memories for data and programs
Bus
• Is a set of parallel data/signal lines
• Is used to transfer information between computer
components
• Often subdivided into address, data, and control
busses
Bus (cont’d)
[Figure: the CPU connected to Primary Memory by the Address Bus, the Data Bus, and the Control Bus]
Bus (cont’d)
• Address bus:
▪ Specifies a memory location in RAM
• Or sometimes a memory-mapped I/O device
▪ Common sizes: 32 and 64 bits
• Data bus:
▪ Used for bidirectional data transfer
▪ Common sizes: 32 and 64 bits
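The width of the address bus limits how many memory locations can be named. As a rough sketch, assuming byte-addressable memory (an assumption, not something stated above), an n-bit address can select 2^n bytes. The function name is hypothetical.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


print(addressable_bytes(32) // 2**30)   # 4  (GiB)
print(addressable_bytes(64) // 2**60)   # 16 (EiB)
```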
Bus (cont’d)
• Control bus:
▪ Used to control or monitor devices connected to the
bus
• Eg: read/write signal for RAM
• An expansion bus may be connected to the
computer’s local bus
▪ Makes it easy to connect additional I/O devices to the
computer
▪ Example bus standards: USB, SCSI, PCIe
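How the address, data, and control lines cooperate during a memory access can be sketched with a toy model. Everything here (the Ram class, the bus_cycle method, and the single read/write signal) is a hypothetical simplification for illustration; real bus protocols involve timing and additional control signals.

```python
class Ram:
    """Toy byte-addressable RAM attached to a bus (hypothetical model)."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def bus_cycle(self, address, write, data=0):
        """Perform one bus transaction.

        address: value placed on the address bus
        write:   control-bus read/write signal (True means write)
        data:    value on the data bus for a write; ignored for a read
        """
        if write:
            self.cells[address] = data & 0xFF
            return None
        return self.cells[address]


ram = Ram(16)
ram.bus_cycle(address=3, write=True, data=0x2A)   # CPU drives the data bus
print(ram.bus_cycle(address=3, write=False))      # RAM drives the data bus: 42
```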
Secondary Memory
• Is used to hold a computer’s file system
▪ Stores files containing programs or data
• Is non-volatile read/write memory
▪ Its contents persist through a power cycle
• Usually implemented as a hard disk drive (HDD)
▪ But solid state drives (SSDs) are becoming more
common
Peripheral I/O Devices
• Allow communication between the computer and
the external environment
• Example input devices:
▪ Keyboard
▪ Pointing devices: mouse, trackball, joystick
▪ Microphone
▪ Scanner
Peripheral I/O Devices (cont’d)
• Example output devices:
▪ Monitor
▪ Printer
▪ Speakers
• Example I/O devices:
▪ Hard disk drive
▪ Modem
▪ Connections to networks
Basic CPU Architectures
• Accumulator Machines
[Figure: accumulator machine; a CPU containing the ALU and the ACC register, connected to Primary Memory by the Address Bus and the Data Bus]
Basic CPU Architectures (cont’d)
▪ Operands for an instruction come from the
accumulator register (ACC) and from a single
location in RAM
▪ ALU results are always put into the ACC
▪ The ACC can be loaded from or stored to RAM
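The accumulator model described above can be sketched as a tiny interpreter. The opcode names LOAD, ADD, and STORE and the (opcode, address) instruction format are assumptions chosen for illustration, not the instruction set of any particular machine.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a toy accumulator machine."""
    acc = 0
    for opcode, address in program:
        if opcode == "LOAD":       # ACC <- memory[address]
            acc = memory[address]
        elif opcode == "ADD":      # ACC <- ACC + memory[address]
            acc = acc + memory[address]
        elif opcode == "STORE":    # memory[address] <- ACC
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return acc


memory = [5, 7, 0]                                   # addresses 0, 1, 2
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2)]
run_accumulator(program, memory)
print(memory)                                        # [5, 7, 12]
```

Note that every instruction names at most one memory address; the other operand is always the ACC.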
Basic CPU Architectures (cont’d)
• Load/Store Machines
[Figure: load/store machine; a CPU containing the ALU and the Register File, connected to Primary Memory by the Address Bus and the Data Bus]
Basic CPU Architectures (cont’d)
▪ Only load and store instructions can access RAM
▪ Other instructions operate on specified registers in the
register file, not on RAM
• Registers can be accessed much faster than RAM, so instructions
that use only registers execute quickly
▪ Typical program sequence:
• Load registers from memory
• Execute an instruction using two source registers, putting
the result into a destination register
• Store the result back into memory
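The typical load/store sequence above can be sketched the same way. The tuple-based instruction format, the opcode names, and the four-entry register file are hypothetical; the point is only that ADD touches registers while LOAD and STORE are the sole instructions that touch memory.

```python
def run_load_store(program, memory, num_regs=4):
    """Execute a toy load/store program; return the final register file."""
    regs = [0] * num_regs
    for instr in program:
        op = instr[0]
        if op == "LOAD":                     # regs[rd] <- memory[addr]
            _, rd, addr = instr
            regs[rd] = memory[addr]
        elif op == "STORE":                  # memory[addr] <- regs[rs]
            _, rs, addr = instr
            memory[addr] = regs[rs]
        elif op == "ADD":                    # regs[rd] <- regs[rs1] + regs[rs2]
            _, rd, rs1, rs2 = instr
            regs[rd] = regs[rs1] + regs[rs2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


memory = [5, 7, 0]
program = [
    ("LOAD", 0, 0),      # r0 <- memory[0]
    ("LOAD", 1, 1),      # r1 <- memory[1]
    ("ADD", 2, 0, 1),    # r2 <- r0 + r1
    ("STORE", 2, 2),     # memory[2] <- r2
]
run_load_store(program, memory)
print(memory)            # [5, 7, 12]
```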
RISC and CISC Architectures
• RISC: Reduced Instruction Set Computer
▪ Uses only simple instructions that can be executed in
one machine cycle
• Enables faster clock rates, thus faster overall execution
• But makes programs larger, more complex
• Eg: Original SPARC had no multiply instruction
▪ Multiplication done using repeated add-shift operations
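The add-shift idea can be illustrated with a short sketch. This shows the general technique for non-negative integers only; it is not the actual SPARC instruction sequence, and the function name is hypothetical.

```python
def multiply_shift_add(a, b):
    """Multiply two non-negative integers using only add, shift, and AND."""
    product = 0
    while b != 0:
        if b & 1:          # lowest bit of the multiplier is set
            product += a   # add the current shifted multiplicand
        a <<= 1            # multiplicand doubles each step
        b >>= 1            # move to the next multiplier bit
    return product


print(multiply_shift_add(6, 7))   # 42
```

The loop runs once per bit of the multiplier, which is why a software multiply costs many simple instructions instead of one complex one.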
RISC and CISC Architectures (cont’d)
▪ Machine instructions are always the same size
• Makes decoding simpler and faster
• Eg: ARMv8 A64 (AArch64) instructions are always 32 bits wide
RISC and CISC Architectures (cont’d)
• CISC: Complex Instruction Set Computer
▪ May have instructions that take many cycles to
execute
• Are provided for programmer convenience
• But slow down overall execution speed!
• Eg: Intel Core 2
▪ add: 1 cycle
▪ mul: 5 cycles
▪ div: 40 cycles
RISC and CISC Architectures (cont’d)
▪ Machine instructions vary in length, and may be
followed by “immediate” data
• Makes decoding difficult and slow
• Eg: Intel x86
▪ Can be as short as 1 byte (eg: INC of a register in 32-bit mode)
▪ But as long as 15 bytes!
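One reason variable-length instructions are harder to decode is that the start of an instruction depends on the lengths of all the instructions before it. The sketch below contrasts the two cases. The variable-length format used here (first byte holds the instruction's total length) is a made-up toy encoding, not x86, and both function names are hypothetical.

```python
def fixed_offset(index, width=4):
    """Byte offset of instruction number `index` when all are `width` bytes."""
    return index * width


def variable_offset(code, index):
    """Byte offset of instruction number `index` in a toy variable-length
    encoding where the first byte of each instruction is its total length."""
    offset = 0
    for _ in range(index):
        offset += code[offset]   # must inspect every earlier instruction
    return offset


# Four toy instructions with lengths 1, 3, 2, and 1 bytes.
code = bytes([1, 3, 0xAA, 0xBB, 2, 0xCC, 1])
print(fixed_offset(3))            # 12
print(variable_offset(code, 3))   # 6
```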
Instruction Cycle
• Also called the fetch-execute or fetch-decode-execute cycle
• The CPU executes each instruction in a series of
small steps:
1) Fetch the next instruction from memory into the
instruction register (IR)
• The Program Counter register (PC) contains its address
2) Increment PC to point to the next instruction
3) Decode the instruction
Instruction Cycle (cont’d)
4) If the instruction uses an operand in RAM,
calculate its address (repeat if necessary)
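The steps listed so far can be sketched as a loop over a toy machine. The instruction format, the opcode names, the accumulator register, and in particular the execute stage that follows step 4 are assumptions added for illustration, since the text above does not define them. Instructions and data share one memory list, in the von Neumann style.

```python
def run(memory):
    """Run a toy accumulator program stored in memory until HALT."""
    pc = 0    # Program Counter: address of the next instruction to fetch
    acc = 0   # hypothetical accumulator register
    while True:
        ir = memory[pc]          # 1) fetch the instruction into the IR
        pc += 1                  # 2) increment PC to the next instruction
        opcode, operand = ir     # 3) decode
        address = operand        # 4) operand address (direct addressing only)
        # Assumed execute stage; not part of the numbered steps above.
        if opcode == "HALT":
            return acc
        elif opcode == "LOAD":
            acc = memory[address]
        elif opcode == "ADD":
            acc += memory[address]
        elif opcode == "STORE":
            memory[address] = acc
        elif opcode == "JUMP":
            pc = address
        else:
            raise ValueError(f"unknown opcode: {opcode}")


memory = [
    ("LOAD", 4),     # address 0
    ("ADD", 5),      # address 1
    ("STORE", 6),    # address 2
    ("HALT", None),  # address 3
    5, 7, 0,         # data at addresses 4, 5, 6
]
print(run(memory), memory[6])   # 12 12
```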
Assembly Language Programs
• Consist of a series of statements, each
corresponding to a machine instruction
▪ ARMv8 example: add x20, x20, x21
Corresponds to:
1000 1011 0001 0101 0000 0010 1001 0100
Or in hexadecimal:
0x8b150294
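The correspondence between the statement and its bit pattern can be checked with a short sketch. It assumes the A64 "ADD (shifted register)" layout with no shift applied: fixed upper bits 0x8B000000, Rm in bits 20-16, Rn in bits 9-5, and Rd in bits 4-0. The function name is hypothetical, and only this one instruction form is handled.

```python
def encode_add_register(rd, rn, rm):
    """Encode the 64-bit form ADD Xd, Xn, Xm (no shift) as a 32-bit word."""
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 30:
            raise ValueError("register number must be in 0..30")
    base = 0x8B000000   # fixed bits for this form (64-bit ADD, shift amount 0)
    return base | (rm << 16) | (rn << 5) | rd


word = encode_add_register(rd=20, rn=20, rm=21)   # add x20, x20, x21
print(hex(word))                                  # 0x8b150294
print(f"{word:032b}")                             # 10001011000101010000001010010100
```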
Assembly Language Programs (cont’d)
• Each statement consists of an opcode, and a
variable number of operands
▪ Eg: add x20, x20, x21
• Opcode: add
• Operands: x20, x20, x21
Assembly Language Programs (cont’d)
• Optionally, a label can prefix any statement
▪ Form: label: statement
▪ Eg: start: add x20, x20, x21
▪ Is a symbol whose value is the address of the machine
instruction
• May be used as a target for a branch instruction
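How an assembler might turn labels into addresses can be sketched as a single pass that records the current address whenever it sees a label. This is a simplified illustration, not how GNU as is implemented: it assumes every statement is one 4-byte instruction, that the program starts at address 0, and that there are no comments or directives. The function name is hypothetical.

```python
def build_symbol_table(lines, base_address=0, instr_size=4):
    """Map each label to the address of the instruction it prefixes."""
    symbols = {}
    address = base_address
    for line in lines:
        statement = line.strip()
        if not statement:
            continue
        if ":" in statement:
            label, statement = statement.split(":", 1)
            symbols[label.strip()] = address
        if statement.strip():        # a label may stand alone on a line
            address += instr_size
    return symbols


source = [
    "start: add x20, x20, x21",
    "       add x21, x21, x20",
    "next:  add x20, x20, x21",
]
print(build_symbol_table(source))   # {'start': 0, 'next': 8}
```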
Assembly Language Programs (cont’d)
• Pseudo-ops (assembler directives) do not
generate machine instructions, but give the
assembler extra information
▪ Form: .pseudo-op
▪ Eg: .global start
Assembly Language Programs (cont’d)
• Comments may be appended to the end of a
statement
▪ In ARMv8, after a // delimiter
▪ Eg:
start: add x20, x20, x21 // add term
Assembly Language Programs (cont’d)
• The labels, opcodes, operands, and comments
should be formatted into columns:
labels:   opcodes   operands   // comments
          opcodes   operands   // comments
          opcodes   operands   // comments
Assemblers
• Translate assembly source code into machine
code
• In this course we will use the GNU as assembler
▪ Part of GNU Binutils; invoked by the gcc compiler driver
Assemblers (cont’d)
• To assemble ARMv8 source code use:
gcc myprog.s -o myprog
▪ gcc invokes the assembler as, then links the code,
producing an executable called “myprog”
• Assumes files ending in .s contain assembly source code
Macro Preprocessors
• Many assemblers support macros
▪ Allows you to define a piece of text with a macro
name
• Optionally, parameters can be specified
▪ This text will be substituted inline wherever invoked
• Called macro expansion
▪ Provided as a convenience to help make your code
more readable
Macro Preprocessors (cont’d)
• Unfortunately, gcc (actually as) has limited
support for macros
▪ We use m4 instead, before invoking gcc
• Is a standard UNIX (Linux) command
Macro Preprocessors (cont’d)
• Eg: the following defines 2 macros:
define(coef, 23)
define(z_r, x18)
The macros are then invoked:
...
add x19, z_r, coef
...
which is expanded to:
...
add x19, x18, 23
...
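The substitution shown above can be mimicked with a few lines of Python. This is only a sketch of the idea of macro expansion: it handles one-line define(name, text) definitions without parameters, and unlike real m4 it does not rescan expanded text or emit blank lines where definitions were. The expand function and both regular expressions are hypothetical.

```python
import re

DEFINE = re.compile(r"define\(\s*(\w+)\s*,\s*([^)]*)\)")
WORD = re.compile(r"[A-Za-z_]\w*")


def expand(source):
    """Record define(name, text) lines, then substitute names elsewhere."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = DEFINE.fullmatch(line.strip())
        if match:
            macros[match.group(1)] = match.group(2).strip()
            continue
        # Replace each whole word that names a macro; leave others untouched.
        output.append(WORD.sub(lambda m: macros.get(m.group(0), m.group(0)), line))
    return "\n".join(output)


source = """define(coef, 23)
define(z_r, x18)
add x19, z_r, coef"""
print(expand(source))   # add x19, x18, 23
```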
Macro Preprocessors (cont’d)
• General procedure:
▪ Put your source code containing macros into a file
ending in .asm
▪ Invoke m4, redirecting output to a file ending in .s
• Eg: m4 myprog.asm > myprog.s
▪ Run gcc as usual on the output file