Lecture 1 - Introduction



ECE 463/563

Microprocessor Architecture
Overview

Prof. Eric Rotenberg

ECE 463/563, Microprocessor Architecture, 1


Computer Architecture & Systems
Microprocessor Architecture (CPUs)

Hard: Correct & Fast CPU
Easy: Correct CPU

Simple Processor Datapath
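[Figure: datapath with five stages, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), plus a Register File and Memory (DRAM & Disk). A single instruction (1) is shown in the datapath.]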

Invention #1: Pipelining
[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File and Memory (DRAM & Disk). Instructions 1 through 6 are shown in the pipeline.]

Problem: Data-Dependent Instructions

[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File and Memory (DRAM & Disk). Instructions 1 through 6 are shown in the pipeline.]

Invention #2: Register File Bypasses
[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File and Memory (DRAM & Disk). Instructions 1 through 6 are shown in the pipeline.]

Problem: Branch Instructions
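[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File and Memory (DRAM & Disk). Instruction 1 is in the pipeline; two labels for instruction 2 and a "?" appear at the fetch end.]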

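Invention #3: Branch Prediction
[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File, a Branch Predictor, and Memory (DRAM & Disk). Labels for instructions 1 through 4 and a "?" appear on the figure.]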

Problem: “Memory Wall”
[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File, a Branch Predictor, and Memory (DRAM & Disk).]

Invention #4: Caches
[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File, a Branch Predictor, an Instr. Cache, a Data Cache, and Memory (DRAM & Disk).]

Caches (cont.)
• Locality of reference
– Temporal locality: if you access an item, you are likely to access it again in the near future
– Spatial locality: if you access an item, you are likely to access a nearby item in the near future

Problem: Stalled Instructions
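[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File, a Branch Predictor, an Instr. Cache, a Data Cache, and Memory (DRAM & Disk). Instructions 1 through 4 are in the pipeline and a cache miss is marked.]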

Invention #5: Out-of-Order Execution
[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File, a Branch Predictor, a Dynamic Scheduler, an Instr. Cache, a Data Cache, and Memory (DRAM & Disk). Instructions 1 through 7 are shown and a cache miss is marked.]

Superscalar Execution
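[Figure: five-stage pipeline, IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback), with a Register File, a Branch Predictor, a Dynamic Scheduler, an Instr. Cache, a Data Cache, and Memory (DRAM & Disk). Instructions 1 through 9 are shown, several per stage.]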

Deep Pipelining
[Figure: ten-stage pipeline, IF1, IF2, ID1, ID2, EX1, EX2, M1, M2, W1, W2, with a Register File, a Branch Predictor, a Dynamic Scheduler, an Instr. Cache, a Data Cache, and Memory (DRAM & Disk).]

[Figure: labeled regions: Branch Prediction, L1 Instr. Cache, OOO Execution Support, R.F. Bypasses (two regions), L1 Data Cache.]

Computer System

Layer (top to bottom)                                  Related courses
Application                                            ECE 209, 309
Operating System                                       ECE 306, CSC 501 or ECE 465/565
Compiler, Firmware                                     ECE 466/566
Instruction Set Architecture, Machine Organization     ECE 109, 463/563 (ECE 506, 786), ECE 721
  (Processor, Mem., I/O system)
Datapath & Control                                     ECE 310, 464/564
Digital Design                                         ECE 212
Circuit Design                                         ECE 211, 403
Layout                                                 ECE 546
Computer Architecture = Instruction Set Architecture + Machine Organization (Microarchitecture)

What is Computer Architecture?
Computer Architecture = Instruction Set Architecture + Machine Organization

Instruction Set Architecture
• Programmable storage (registers, memory)
• Data types and their encodings (integer, floating-point, SIMD)
• Instruction set
• Instruction formats and encodings
• Modes of addressing and accessing data and instructions
• Exceptions and interrupts
• Virtual memory
• Memory consistency model

Machine Organization
• Capabilities and performance characteristics of principal components
– e.g., register files, ALUs, memory system, etc.
• Ways in which these components are interconnected
• Choreography of components to realize the ISA
• Performance-enhancing techniques and components
– e.g., pipelining, caches, predictors, dynamic scheduling, superscalar execution, etc.


CPU time equation (brief version)
• CPU time = time to execute a program on CPU
• Two factors
– # cycles = number of clock cycles to execute a program
– Cycle Time (CT) = clock period = 1 / (clock frequency)

CPU time = (# cycles) × (CT)

Example:
# cycles = 10^9 cycles (1 billion cycles)
clock frequency (f) = 1 GHz = 10^9 Hz = 10^9 cycles/s
CT = 1/f = 10^-9 s/cycle = 1 ns/cycle

CPU time = (10^9 cycles) × (10^-9 s/cycle) = 1 s

Static vs. Dynamic Instructions
    #include <stdio.h>
    #include <inttypes.h>

    #define HUGE 1000000

    int main(int argc, char *argv[]) {
        uint64_t a[HUGE];
        uint64_t sum = 0;

        // I know, a[] uninitialized... just a demo.
        for (uint64_t i = 0; i < HUGE; i++)
            sum += a[i];

        // PRIu64 (from inttypes.h) is the portable format for uint64_t.
        printf("sum = %" PRIu64 "\n", sum);
        return(0);
    }

Loop has 4 static instructions.
Dynamically, at run-time, the loop executes 1 million times.
# static instructions = 4
# dynamic instructions = 4 million
Dynamic instruction count influences number of cycles.

Registers used by the compiled loop:
a5: starts with address of first element of array a[] (&a[0])
a3: contains address just after last element of array a[] (&a[1000000])
a1: contains on-going sum, initialized to 0

    1a538: ld   a4,0(a5)                  // a4 = a[i]
    1a53c: addi a5,a5,8                   // increment address to point to next element of a[]
    1a540: add  a1,a1,a4                  // sum += a[i]
    1a544: bne  a5,a3,1a538 <main+0x34>   // branch to top of loop if not after last element of a[]

Influence on CPU time
• Programmer influence (affects: CYCLES)
– Algorithm affects dynamic instruction count: more (fewer) instructions to fetch and execute may cause more (fewer) cycles
– Algorithm affects temporal and spatial locality: more (fewer) cache misses may cause more (fewer) cycles
• Compiler influence (affects: CYCLES)
– Compiler optimizations affect dynamic instruction count (up or down): more (fewer) instructions to fetch and execute may cause more (fewer) cycles
– Instruction scheduling aims to increase instruction-level parallelism and hence reduce cycles
• Influence of instruction-set architecture (ISA) (affects: ?)
– Complex interactions between ISA, compiler, and microarchitecture. Unresolved debate on whether ISA (e.g., CISC vs. RISC) influences performance as much as microarchitecture.
• Microarchitecture influence (affects: CYCLES, CT)
– Pipeline optimizations (e.g., pipelining, data bypassing, branch prediction, caches, dynamic scheduling, superscalar, etc.) aim to reduce cycles by increasing instruction-level parallelism (ILP) (the number of concurrently executing instructions and the extent of their overlapped execution)
– Pipeline optimizations may increase CT due to increased logic complexity
– Deeper pipelining aims to decrease CT
• Circuit design influence (affects: CT)
– Faster circuits aim to decrease CT
• Technology influence (affects: CT)
– Faster transistors and wires aim to decrease CT

Overview of Topics in 463/563
1. Measuring Performance and Cost
2. Caches and Memory Hierarchies
3. Instruction-Set Architecture (ISA)
– Defines software/hardware interface
4. Simple Pipelining
– Data and control (branch) dependencies
– Register file bypasses
– Branch prediction

Overview of Topics in 463/563 (cont.)
5. Complex Pipelining and Instruction-Level Parallelism (ILP)
– Data hazards
– Issue Queue (IQ): from in-order to out-of-order scheduling
– Reorder Buffer (ROB): speculation and register renaming
– Precise interrupts
– Superscalar, VLIW, and vector processors

Projects
• Three projects
– Cache simulator
– Branch predictor simulator
– Superscalar pipeline simulator
• Programming for projects is harder than anything many of you have encountered before

Course Grading
• 40% projects
– Project 1, caches: 15%
– Project 2, branch predictors: 10%
– Project 3, superscalar pipeline: 15%
• 20% Moodle quizzes
– Can be thought of as homework assignments
– Each quiz tests a unit of knowledge or a whole topic, or gives practice on manually doing a computer architecture simulation
– The current plan is 10 quizzes but there could be more
– A quiz may be assigned with a group of lectures (most common) or a pre-recorded video
• 20% Midterm
– Covers Performance/Cost, Caches
• 20% Final (NOT comprehensive)
– Covers ISA, Simple and Complex Pipelining, ILP

Course Web Page: Moodle site

• wolfware.ncsu.edu
– Login
– Select ECE 463/563
• Content
– Link to Panopto for live-stream (webcast) and recordings
– Syllabus
– Schedule
– Contact information and office hours
– Q&A and discussion forums
– Projects: specifications, benchmark traces, validation runs, etc.
– Moodle (on-line) quizzes and exams
• Check frequently (I announce updates)