CSC 252/452: Computer Organization

Fall 2024: Lecture 11

Instructor: Yanan Guo

Department of Computer Science


University of Rochester
Carnegie Mellon

Announcement
• Programming assignment 3 will be out
• Details: https://www.cs.rochester.edu/courses/252/fall2024/labs/assignment3.html
• Due on Oct. 25th, 11:59 PM
• You (may still) have 3 slip days


Announcement
• Programming assignment 3 is in x86 assembly language. Seek
help from TAs.
• TAs are best positioned to answer your questions about
programming assignments!!!
• Programming assignments do NOT repeat the lecture materials.
They ask you to synthesize what you have learned from the
lectures and work out something new.


So far in 252…
• C Program
• Assembly Program
• Instruction Set Architecture
• Processor Microarchitecture
• Circuits

Computing with Logic Gates


• And: inputs a, b; out = a && b
• Or: inputs a, b; out = a || b
• Not: input a; out = !a
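The three gates can be modeled as functions on one-bit values. A minimal Python sketch; the function names are illustrative and not from the lecture:

```python
def and_gate(a: int, b: int) -> int:
    """out = a && b for one-bit inputs."""
    return a & b


def or_gate(a: int, b: int) -> int:
    """out = a || b for one-bit inputs."""
    return a | b


def not_gate(a: int) -> int:
    """out = !a for a one-bit input."""
    return 1 - a


# Print the truth table of the two-input gates: a, b, And, Or.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```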

Bit Equality

[Figure: "Bit equal" circuit with output eq]

Combinational Circuit: Circuit for computation
Sequential Circuit: Circuit for storage
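One common way to build a bit-equality circuit from And, Or, and Not gates is eq = (a && b) || (!a && !b). The slide's figure did not survive extraction, so this gate arrangement and the second input name b are assumptions. The sketch checks the formula on all four input combinations:

```python
def bit_eq(a: int, b: int) -> int:
    """Return 1 when the one-bit inputs a and b are equal, else 0."""
    both_one = a & b               # And gate: a && b
    both_zero = (1 - a) & (1 - b)  # two Not gates feeding an And gate
    return both_one | both_zero    # Or gate combines the two cases


for a in (0, 1):
    for b in (0, 1):
        print(f"a={a} b={b} eq={bit_eq(a, b)}")
```

The output depends only on the current inputs, which is what makes the circuit combinational.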


Sequential Circuit
[Figure: data input D and control signal C feed "Some Logic", which produces output Q]

• 1-bit storage:
• D is the data I want to store (0 or 1)
• C is the control signal
• When C is 1, Q becomes D (i.e., storing the data)
• When C is 0, Q doesn’t change with D (data stored)
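The behavior in the bullets above can be sketched as a small Python class. This models only the input/output behavior, not the gates inside "Some Logic"; the class and method names are hypothetical:

```python
class OneBitStorage:
    """Level-sensitive 1-bit storage: Q follows D only while C is 1."""

    def __init__(self, q: int = 0):
        self.q = q  # the stored bit, visible on output Q

    def update(self, d: int, c: int) -> int:
        if c == 1:
            self.q = d  # C is 1: Q becomes D (store the data)
        return self.q   # C is 0: Q keeps its old value


cell = OneBitStorage()
cell.update(d=1, c=1)  # store 1
cell.update(d=0, c=0)  # D changes, but C is 0, so Q stays 1
print(cell.q)
```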

Register Operation
[Figure: register before and after C rises. Before: State = x, Input = y, Output = x. After: State = y, Output = y.]

• Stores data bits
• For most of the time, acts as a barrier between input and output
• As C rises, loads input
• Output continuously produces y after the rising edge unless you cut off power.
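Unlike the level-sensitive storage cell, a register loads its input only at the moment C rises. A minimal sketch of that behavior, with hypothetical names (`Register`, `tick`) that are not from the lecture:

```python
class Register:
    """Edge-triggered register: loads its input only when C goes from 0 to 1."""

    def __init__(self, state):
        self.state = state  # value currently stored
        self._prev_c = 0    # last value seen on C, used to detect a rising edge

    def tick(self, data_in, c: int):
        if self._prev_c == 0 and c == 1:
            self.state = data_in  # rising edge: load the input
        self._prev_c = c
        return self.state         # the output always shows the stored state


reg = Register("x")
print(reg.tick("y", 0))  # C low: output is still the old state
print(reg.tick("y", 1))  # C rises: the input is loaded
print(reg.tick("z", 1))  # C stays high: no new edge, so the input is blocked
```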

Clock Signal
[Figure: register before and after C rises, as on the Register Operation slide]

• A special C: periodically oscillating between 0 and 1
• That’s called the clock signal. It is generated by a crystal oscillator inside your computer.

[Timing diagram: Clock; In carries x0 x1 x2 x3 x4 x5; Out carries x0 x1 x2 x3 x4 x5]

Clock Signal
• Cycle time of a clock signal: the time duration between two rising edges.
• Frequency of a clock signal: how many rising (or falling) edges occur in 1 second.
• 1 GHz CPU means the clock frequency is 1 GHz
• The cycle time is 1/10^9 s = 1 ns

[Timing diagram: Clock with one cycle time marked; In carries x0 x1 x2 x3 x4 x5; Out carries x0 x1 x2 x3 x4 x5]
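Cycle time and frequency are reciprocals of each other. A minimal sketch of the arithmetic; the function name is illustrative:

```python
def cycle_time_seconds(frequency_hz: float) -> float:
    """Cycle time is the reciprocal of the clock frequency."""
    return 1.0 / frequency_hz


one_ghz = 1e9  # 1 GHz = 10^9 rising edges per second
print(cycle_time_seconds(one_ghz), "seconds per cycle")
```

The same function covers other clock rates, for example `cycle_time_seconds(2e9)` for a 2 GHz clock, which gives half the cycle time.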

Register File
• A register file consists of a set of registers that you can individually read from and write to.
• To read: give a register ID, and read the stored value out
• To write: give a register ID and a new value; the new value overwrites the old value

[Figure: register file holding 1 = z, 2 = x, 3 = w. Read: srcA = 2 gives valA = x. Write: dstW = 2 with valW = y; at the rising clock edge, register 2 changes from x to y.]


Register File
[Figure: register file holding 1 = z, 2 = x, 3 = w, with two read ports (A: srcA = 2, valA = x; B: srcB = 1, valB = z) and one write port (W: dstW = 2, valW = y). The write takes effect at the rising clock edge.]
• Stores multiple registers of data
• Address input specifies which register to read or write
• Multiple Ports: Can read and/or write multiple words in one
cycle. Each port has separate address and data input/output
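The register file above can be sketched as a small Python class with two read ports and one write port. The class and method names are hypothetical; the register contents mirror the figure's example:

```python
class RegisterFile:
    """Registers addressed by ID; reads are immediate, writes land on the clock edge."""

    def __init__(self, values: dict):
        self.regs = dict(values)

    def read(self, src_a, src_b):
        # Two read ports: each has its own address input and data output.
        return self.regs[src_a], self.regs[src_b]

    def rising_edge(self, dst_w, val_w):
        # One write port: the new value overwrites the old one at the edge.
        self.regs[dst_w] = val_w


rf = RegisterFile({1: "z", 2: "x", 3: "w"})
print(rf.read(2, 1))    # valA, valB before the write
rf.rising_edge(2, "y")  # dstW = 2, valW = y
print(rf.read(2, 1))    # register 2 now holds y
```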


Processor Microarchitecture
• Sequential, single-cycle microarchitecture implementation
• Basic idea
• Hardware implementation
• Pipelined microarchitecture implementation
• Basic Principles
• Difficulties: Control Dependency
• Difficulties: Data Dependency


Executing an ADD instruction


• How does the processor execute addq %rax,%rsi?
• The binary encoding is 60 06

addq rA, rB 6 0 rA rB
[Encoding fields: instruction code 6, function code 0 (Add), then register IDs rA and rB]

Executing an ADD instruction


• Logic 1: if (s0 == 6) select = s1;
• Logic 2: if (s0 == 6) Enable = 1; else Enable = 0;
• Logic 3: if (s0 == 6) nPC = oPC + 2;
• How about Logic 4?
• How do these logics get implemented?

addq rA, rB 6 0 rA rB

[Figure: single-cycle datapath with PC (oPC, nPC), Memory, instruction nibbles s0 to s3 = 6, 0, 0, 6, Register File (Read Reg. ID 1, Read Reg. ID 2, Write Reg. ID, Reg 1 Data, Reg 2 Data, newData, Enable), ALU (Select), Flags (Z, S, O), and Logic 1 to Logic 4. State is written on the rising clock edge.]
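The logic for addq can be sketched in Python. Logic 1 to 3 follow the slide; the flag computation standing in for Logic 4 is an assumption (the slide leaves it as a question), as are the function names and the 64-bit width:

```python
MASK64 = (1 << 64) - 1


def nibbles(code: bytes):
    """Split instruction bytes into 4-bit fields s0, s1, s2, ..."""
    out = []
    for byte in code:
        out += [byte >> 4, byte & 0xF]
    return out


def execute_addq(code: bytes, regs: dict, old_pc: int):
    s0, s1, s2, s3 = nibbles(code)[:4]
    if s0 != 6:
        raise ValueError("this sketch only handles instruction code 6")
    select = s1          # Logic 1: ALU operation select
    enable = 1           # Logic 2: register-file write enable
    new_pc = old_pc + 2  # Logic 3: addq is 2 bytes long
    if select != 0:
        raise ValueError("only function code 0 (Add) is modeled")
    new_data = (regs[s2] + regs[s3]) & MASK64  # ALU: rB <- rB + rA
    # Logic 4 (assumed): derive Z, S, O from the ALU operands and result.
    z = int(new_data == 0)
    s = new_data >> 63
    a_sign, b_sign = regs[s2] >> 63, regs[s3] >> 63
    o = int(a_sign == b_sign and a_sign != s)
    if enable:  # in hardware this write happens at the rising clock edge
        regs = {**regs, s3: new_data}
    return regs, new_pc, (z, s, o)


regs = {0: 5, 6: 7}  # hypothetical values for register IDs 0 (%rax) and 6 (%rsi)
print(execute_addq(bytes([0x60, 0x06]), regs, old_pc=0x014))
```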
• When the rising edge of the clock arrives, the register file (RF), PC, and Flags will be written.
• So the following have to be ready: newData and nPC, which means Logic 1, Logic 2, Logic 3, and Logic 4 have to finish before that edge.

Executing many ADD instructions

[Timing diagram: Ins1, Ins2, Ins3, Ins4, and Ins5 execute in consecutive clock cycles, one instruction per cycle time]

Executing a JLE instruction


• Let’s say the binary encoding for jle .L0 is 71 0123000000000000
• What are the logics now?
jle Dest 7 1 Dest

[Figure: the addq datapath with the jle bytes fetched: s0 = 7, s1 = 1, s2 = 0, s3 = 1, s4 = 2, s5 = 3, …]

Executing a JLE instruction


• Logic 1: if (s0 == 6) select = s1;
• Logic 2: if (s0 == 6) Enable = 1; else Enable = 0;


Executing a JLE instruction

jle Dest 7 1 Dest

• Logic 3??
  if (s0 == 6) nPC = oPC + 2;
  else if (s0 == 7) {
    if (s1 == 1) { // jle
      if (Z || (S ^ O)) nPC = Dest; // jump
      else nPC = oPC + 10; // don’t jump, but add 10 (why??)
    } else if (s1 == …) {…}
  }

[Figure: same datapath; Logic 3 now also takes Flags and [s2…s9] as inputs]
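The next-PC logic for the two modeled cases can be sketched in Python. The default `jump_len=10` follows the slide; the parameter itself is hypothetical, added because the encoding shown earlier (71 followed by an 8-byte destination) is 9 bytes long, which would correspond to `jump_len=9`. The destination 0x2301 assumes the bytes 01 23 00 … are read in little-endian order.

```python
def logic3_next_pc(s0, s1, old_pc, dest, z, s, o, jump_len=10):
    """Next-PC logic for the two cases on the slide: addq (6) and jle (7, 1)."""
    if s0 == 6:
        return old_pc + 2         # addq: fall through past its 2 bytes
    if s0 == 7 and s1 == 1:       # jle
        if z or (s ^ o):          # less-or-equal: Z set, or S differs from O
            return dest           # jump taken
        return old_pc + jump_len  # not taken: skip over the jump itself
    raise NotImplementedError("other instructions are not modeled here")


# Z = 1: the jump is taken, so the next PC is the destination.
print(hex(logic3_next_pc(7, 1, 0x016, 0x2301, z=1, s=0, o=0)))
# Z = 0 and S == O: not taken, so the PC moves past the jump.
print(hex(logic3_next_pc(7, 1, 0x016, 0x2301, z=0, s=0, o=0)))
```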
Executing a JLE instruction


• Logic 4? Does JLE write flags?
• Need another piece of logic.
• Logic 5: if (s0 == 7) EnableF = 0; else if (s0 == 6) EnableF = 1;

[Figure: same datapath with Logic 5 added next to Logic 2; Logic 5 produces EnableF for the Flags register]

Microarchitecture (So far)


[Figure: the clocked state elements (PC, Memory, Register File, Flags Z S O) send the current PC, the instruction, the read/write register IDs, the current register values, and the current flag values to the combinational logic. The combinational logic sends back the new PC, new register values, new flag values, and the enable signals.]

Combinational Logic contains:
• Logic for generating the ALU select signal
• Logic for generating the new flag value
• Logic for generating the new PC value
• Logic for deciding all the enable signal values
• The ALU

Read current_states;
next_states = f(current_states);
When clock rises, current_states = next_states;
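The three lines above describe a two-phase update: compute every next-state value from the current state, then commit them all at once. A minimal sketch with a made-up state (a PC and two registers `a` and `b` that swap each cycle) to show that every read sees the value from before the edge:

```python
def f(current: dict) -> dict:
    """Combinational logic: compute every next-state value from current state only."""
    return {
        "pc": current["pc"] + 2,  # e.g. advance past a 2-byte instruction
        "a": current["b"],        # swap two registers to show that all reads
        "b": current["a"],        # see the values from before the clock edge
    }


def run(current: dict, cycles: int) -> dict:
    for _ in range(cycles):
        next_states = f(current)  # settles during the cycle
        current = next_states     # rising edge: all state updates at once
    return current


print(run({"pc": 0, "a": 1, "b": 2}, cycles=1))
```

Because `f` never modifies `current`, the swap works without a temporary variable, which mirrors how all state elements load their inputs on the same rising edge.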

Executing a MOV instruction


• How do we modify the hardware to execute a move instruction?

rmmovq rA, D(rB) 4 0 rA rB D
Store the value of rA at memory address rB + D.

rmmovq %rsi,0x41c(%rsp) 40 64 1c 04 00 00 00 00 00 00

[Figure: the datapath from the jle slides (Logic 1 to Logic 5), before any changes for rmmovq]

rmmovq rA, D(rB) 4 0 rA rB D
Store the value of rA at memory address rB + D.

• Need new logic (Logic 6) to select the input to the ALU for Enable.

[Figure: datapath extended for rmmovq, with nibbles s0 = 4, s1 = 0, s2 = 6, s3 = 4, s4 = 1, s5 = c, …. New elements: a MUX in front of the ALU with inputs RA and [s4…s19]; Memory inputs Address and Data to write; a memory Enable next to Logic 6.]
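The rmmovq example can be decoded and executed in a few lines of Python. This is a sketch under simplifying assumptions: memory is a dictionary holding one value per address, the register values are made up, and the function name is hypothetical.

```python
def execute_rmmovq(code: bytes, regs: dict, memory: dict) -> int:
    """Decode a 10-byte rmmovq and store rA at address rB + D; returns the address."""
    icode, ifun = code[0] >> 4, code[0] & 0xF
    if (icode, ifun) != (4, 0):
        raise ValueError("not an rmmovq encoding")
    r_a, r_b = code[1] >> 4, code[1] & 0xF
    d = int.from_bytes(code[2:10], "little")     # D is stored little-endian
    address = (regs[r_b] + d) & ((1 << 64) - 1)  # the ALU adds rB and D
    memory[address] = regs[r_a]                  # memory write, gated by its Enable
    return address


code = bytes.fromhex("40 64 1c 04 00 00 00 00 00 00")  # rmmovq %rsi,0x41c(%rsp)
regs = {6: 0xABCD, 4: 0x1000}  # hypothetical values for register IDs 6 and 4
memory = {}
print(hex(execute_rmmovq(code, regs, memory)))
```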
How About Memory to Register MOV?


mrmovq D(rB), rA 5 0 rA rB D
Load the data at memory address rB + D into rA.

[Figure: datapath extended for mrmovq. Memory gains a "Data read back" output, and a MUX labeled Logic 7 sits on the newData path into the register file.]

Microarchitecture (with MOV)


[Figure: clocked state elements (PC, Memory, Register File, Flags Z S O) and the combinational logic, as on the "Microarchitecture (So far)" slide. Memory now also exchanges an address, new data, the data read back, and an enable signal with the combinational logic.]

Read current_states;
next_states = f(current_states);
When clock rises, current_states = next_states;

next_states has to be ready before the clock rises

Single-Cycle Microarchitecture: Illustration


Think of it as a state machine

Every cycle, one instruction gets executed. At the end of the cycle, architecture states get modified.

States (all updated as clock rises)
■ PC register
■ Cond. Code register
■ Data memory
■ Register file

[Figure: combinational logic connected to the data memory (Read, Write), CC = 100, the register file (read ports, write ports, %rbx = 0x100), and PC = 0x014]

[Timing diagram: Clock over Cycle 1 to Cycle 4, with four marked points ① ② ③ ④]
Cycle 1: 0x000: irmovq $0x100,%rbx # %rbx <-- 0x100
Cycle 2: 0x00a: irmovq $0x200,%rdx # %rdx <-- 0x200
Cycle 3: 0x014: addq %rdx,%rbx # %rbx <-- 0x300 CC <-- 000
Cycle 4: 0x016: je dest # Not taken
Cycle 5: 0x01f: rmmovq %rbx,0(%rdx) # M[0x200] <-- 0x300

• state set according to second irmovq instruction
• combinational logic starting to react to state changes

[Figure: combinational logic with read and write ports to the data memory; current state is CC = 100, %rbx = 0x100, PC = 0x014.]


• state set according to second irmovq instruction
• combinational logic generates results for addq instruction

[Figure: current state is CC = 100, %rbx = 0x100, PC = 0x014; the combinational logic outputs the pending new values CC = 000, %rbx <-- 0x300, PC = 0x016.]


• state set according to addq instruction
• combinational logic starting to react to state changes

[Figure: current state is CC = 000, %rbx = 0x300, PC = 0x016.]


• state set according to addq instruction
• combinational logic generates results for je instruction

[Figure: current state is CC = 000, %rbx = 0x300, PC = 0x016; the combinational logic outputs the pending new value PC = 0x01f.]
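
The five-instruction trace can be replayed with a small Python sketch. This is not a full Y86-64 simulator: it only handles the four opcodes in the listing, the `PROGRAM` table and the `next_state` helper are hypothetical, the instruction lengths are inferred from the addresses in the listing, the initial CC of 100 (Z = 1) is taken from the figure, and `DEST` is a made-up address because the slide does not say where `dest` is.

```python
MASK = (1 << 64) - 1
DEST = 0x100  # hypothetical jump target; the slide does not give dest's address

# address -> (opcode, operand A, operand B, instruction length in bytes)
PROGRAM = {
    0x000: ("irmovq", 0x100, "rbx", 10),
    0x00A: ("irmovq", 0x200, "rdx", 10),
    0x014: ("addq", "rdx", "rbx", 2),
    0x016: ("je", DEST, None, 9),
    0x01F: ("rmmovq", "rbx", "rdx", 10),
}


def next_state(cur):
    """Combinational logic: compute the new state from the current state only."""
    op, a, b, length = PROGRAM[cur["pc"]]
    new = {"pc": cur["pc"] + length, "cc": cur["cc"],
           "regs": dict(cur["regs"]), "mem": dict(cur["mem"])}
    if op == "irmovq":
        new["regs"][b] = a
    elif op == "addq":
        x, y = cur["regs"][a], cur["regs"][b]
        result = (x + y) & MASK
        new["regs"][b] = result
        zf = int(result == 0)
        sf = result >> 63
        of = int((x >> 63) == (y >> 63) and (result >> 63) != (x >> 63))
        new["cc"] = (zf, sf, of)  # same order as the Z S O flags
    elif op == "je":
        if cur["cc"][0]:          # jump only when Z is set
            new["pc"] = a
    elif op == "rmmovq":
        new["mem"][cur["regs"][b]] = cur["regs"][a]  # displacement is 0
    return new


state = {"pc": 0x000, "cc": (1, 0, 0), "regs": {"rbx": 0, "rdx": 0}, "mem": {}}
trace = []
for cycle in range(1, 6):
    state = next_state(state)  # the clock edge commits everything at once
    trace.append((cycle, state["pc"], state["regs"]["rbx"], state["cc"]))
    print(f"end of cycle {cycle}: PC={state['pc']:#05x} "
          f"%rbx={state['regs']['rbx']:#x} CC={state['cc']}")
```

The state at the end of cycle 2 and cycle 3 in this sketch lines up with the "state set according to second irmovq instruction" and "state set according to addq instruction" snapshots.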


Processor Microarchitecture
• Sequential, single-cycle microarchitecture implementation
• Basic idea
• Hardware implementation
• Pipelined microarchitecture implementation
• Basic Principles
• Difficulties: Control Dependency
• Difficulties: Data Dependency


Performance Model

Execution time of a program (in seconds)
  = # of dynamic instructions
    × # of cycles taken to execute an instruction (on average)
    / number of cycles per second

• CPI: # of cycles taken to execute an instruction (on average)
• Clock frequency: number of cycles per second (1 / cycle time)
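
As a quick check of the formula, here is a minimal Python sketch. The function name `execution_time_seconds` and the numbers plugged in (one billion instructions, a 2 GHz clock) are made up for illustration and do not come from the lecture.

```python
def execution_time_seconds(dynamic_instructions, cpi, clock_hz):
    """Execution time = instruction count x CPI / clock frequency."""
    return dynamic_instructions * cpi / clock_hz


# Hypothetical workload: 1 billion dynamic instructions on a 2 GHz clock.
baseline = execution_time_seconds(1_000_000_000, cpi=1.0, clock_hz=2_000_000_000)
lower_cpi = execution_time_seconds(1_000_000_000, cpi=0.5, clock_hz=2_000_000_000)
print(baseline, lower_cpi)
```

Halving any one of the three terms (instruction count, CPI, or cycle time) halves the execution time, provided the other two stay unchanged.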


Improving Performance

Execution time of a program (in seconds)
  = # of dynamic instructions
    × # of cycles taken to execute an instruction (on average)
    / number of cycles per second

• 1. Reduce the total number of instructions executed (mainly done by the compiler and/or programmer).
• 2. Increase the clock frequency (reduce the cycle time). Has huge power implications.
• 3. Reduce the CPI, i.e., execute more instructions in one cycle.
• We will talk about one technique that simultaneously achieves 2 & 3.


Limitations of a Single-Cycle CPU


• Cycle time
  • Every instruction finishes in one cycle.
  • The absolute time it takes to execute each instruction varies. Consider, for instance, an ADD instruction and a JMP instruction.
  • But the cycle time is uniform across instructions, so it needs to accommodate the worst case, i.e., the slowest instruction.
  • How do we shorten the cycle time (increase the frequency)?
• CPI
  • The entire hardware is occupied executing one instruction at a time, so multiple instructions can't execute at the same time.
  • How do we execute multiple instructions in one cycle?
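
The cycle-time point can be made concrete with a short Python sketch. The per-instruction latencies below are invented purely for illustration (the lecture gives no such numbers); the only thing the sketch relies on is that a single-cycle design must set its cycle time to the slowest instruction.

```python
# Hypothetical combinational delays in picoseconds; not measured values.
latency_ps = {"add": 250, "jmp": 150, "load": 300}

# One uniform cycle time has to cover the worst case.
cycle_time_ps = max(latency_ps.values())

# Time each instruction spends idle, waiting for the clock edge.
idle_ps = {name: cycle_time_ps - t for name, t in latency_ps.items()}

print(cycle_time_ps)
print(idle_ps)
```

In this made-up example the jump finishes its real work halfway through the cycle, yet it still occupies the whole cycle and the whole hardware.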


A Motivating Example
[Figure: a block of combinational logic (300 ps) feeding a register, Reg (20 ps), driven by the clock.]

• Computation requires total of 300 picoseconds
• Additional 20 picoseconds to save result in register
• Must have clock cycle time of at least 320 ps
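
The arithmetic behind the 320 ps figure, plus the throughput it implies, can be sketched in Python. The throughput number is derived here, not stated on the slide, and assumes one instruction completes per cycle; the variable names are hypothetical.

```python
logic_delay_ps = 300     # combinational logic
register_delay_ps = 20   # time to save the result in the register

# The clock period must cover both delays.
min_cycle_ps = logic_delay_ps + register_delay_ps

# Assuming one instruction per cycle: instructions per second, in billions (GIPS).
# 1 instruction per min_cycle_ps picoseconds = 1000 / min_cycle_ps per nanosecond.
throughput_gips = 1000 / min_cycle_ps

print(min_cycle_ps, throughput_gips)
```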
