Problem-Set #4

COE 608: Computer Organization and Architecture

Multicycle MIPS-Lite CPU and Pipelining

(a) MIPS-Lite CPU Multi-Cycle Control and Pipelining

Chapter 4:
Exercises: 4.12, 4.13, 4.14, 4.16, 4.17, 4.18, 4.19,
4.20, 4.21 and 4.22.1 & 4.22.2.

Additional Questions
Q.1. How could we modify the following code to make use of a delayed branch slot?

Loop: lw   $2, 100($3)
      addi $3, $3, 4
      beq  $3, $4, Loop

Q.2. Identify all the data dependencies in the following code. Which dependencies are data hazards
that can be resolved by forwarding?

      add $2, $5, $4
      add $4, $2, $5
      sw  $5, 100($2)
      add $3, $2, $4

© G. Khan Problem-Set-4, COE608: Multi-cycle CPU and Pipelining Page: 1

Solutions:
4.12
Done in the class

4.13
4.13.1
   Instruction sequence       Dependences
a. I1: lw  $1,40($6)          RAW on $1 from I1 to I3
   I2: add $6,$2,$2           RAW on $6 from I2 to I3
   I3: sw  $6,50($1)          WAR on $6 from I1 to I2

b. I1: lw  $5,-16($5)         RAW on $5 from I1 to I2 and I3
   I2: sw  $5,-16($5)         WAR on $5 from I1 and I2 to I3
   I3: add $5,$5,$5           WAW on $5 from I1 to I3
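
The dependence classification for sequence b can be checked mechanically. The following is a minimal sketch, assuming a hand-written tuple encoding of each instruction (label, destination register, source registers); the encoding and the name `find_dependences` are hypothetical and not part of the exercise. It compares every ordered pair of instructions and ignores intervening writes, which is sufficient for a three-instruction sequence.

```python
from itertools import combinations

# Sequence b from 4.13.1, encoded by hand as
# (label, destination register or None, source registers).
SEQ_B = [
    ("I1", "$5", ["$5"]),          # lw  $5,-16($5)
    ("I2", None, ["$5", "$5"]),    # sw  $5,-16($5): reads $5 twice, writes no register
    ("I3", "$5", ["$5", "$5"]),    # add $5,$5,$5
]

def find_dependences(seq):
    """Return a sorted list of (kind, register, earlier, later) tuples."""
    found = set()
    # combinations() keeps program order, so the first element is always earlier.
    for (n1, d1, s1), (n2, d2, s2) in combinations(seq, 2):
        if d1 is not None and d1 in s2:
            found.add(("RAW", d1, n1, n2))   # later instruction reads what earlier wrote
        if d2 is not None and d2 in s1:
            found.add(("WAR", d2, n1, n2))   # later instruction writes what earlier read
        if d1 is not None and d1 == d2:
            found.add(("WAW", d1, n1, n2))   # both write the same register
    return sorted(found)

deps = find_dependences(SEQ_B)
for dep in deps:
    print(dep)
```

For sequence b this reports two RAW, two WAR, and one WAW dependence on $5, matching the list above.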
4.13.2
In the basic five-stage pipeline, WAR and WAW dependences do not cause any hazards. Without
forwarding, any RAW dependence between an instruction and the next two instructions causes a hazard
(if the register read happens in the second half of the clock cycle and the register write happens in
the first half). The code that eliminates these hazards by inserting nop instructions is:
   Instruction sequence
a. lw  $1,40($6)              Delay I3 to avoid RAW hazard on $1 from I1
   add $6,$2,$2
   nop
   sw  $6,50($1)
b. lw  $5,-16($5)             Delay I2 to avoid RAW hazard on $5 from I1
   nop
   nop
   sw  $5,-16($5)
   add $5,$5,$5               Note: no RAW hazard on $5 from I1 now

4.13.3
With full forwarding, an ALU instruction can forward a value to the EX stage of the next instruction
without a hazard. However, a load cannot forward to the EX stage of the next instruction (but it can
forward to the instruction after that). The code that eliminates these hazards by inserting nop
instructions is:

   Instruction sequence
a. lw  $1,40($6)              No RAW hazard on $1 from I1 (forwarded)
   add $6,$2,$2
   sw  $6,50($1)
b. lw  $5,-16($5)
   nop                        Delay I2 to avoid RAW hazard on $5 from I1
   sw  $5,-16($5)             Value for $5 is forwarded from I2 now
   add $5,$5,$5               Note: no RAW hazard on $5 from I1 now
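
The nop placement with full forwarding follows one rule: a nop is needed only when an instruction reads the register written by the load immediately before it. A minimal sketch of that rule, assuming a hypothetical hand-written encoding of each instruction as (text, is_load, destination, sources) and treating every source register as needed in EX:

```python
def insert_load_use_nops(seq):
    """Insert a nop after each load whose result is read by the very next instruction."""
    out = []
    prev = None
    for instr in seq:
        text, is_load, dest, srcs = instr
        # Only a load directly followed by a consumer needs a bubble;
        # ALU results are assumed to be forwarded in time.
        if prev is not None and prev[1] and prev[2] in srcs:
            out.append("nop")
        out.append(text)
        prev = instr
    return out

SEQ_A = [
    ("lw $1,40($6)", True, "$1", ["$6"]),
    ("add $6,$2,$2", False, "$6", ["$2", "$2"]),
    ("sw $6,50($1)", False, None, ["$6", "$1"]),
]
SEQ_B = [
    ("lw $5,-16($5)", True, "$5", ["$5"]),
    ("sw $5,-16($5)", False, None, ["$5", "$5"]),
    ("add $5,$5,$5", False, "$5", ["$5", "$5"]),
]

print(insert_load_use_nops(SEQ_A))
print(insert_load_use_nops(SEQ_B))
```

Sequence a comes back unchanged and sequence b gains one nop after the load, as in the listing above.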


4.13.4
The total execution time is the clock cycle time times the number of cycles. Without any stalls, a
three-instruction sequence executes in 7 cycles (5 to complete the first instruction, then one per
instruction). The execution without forwarding must add a stall cycle for every nop we had in 4.13.2,
and execution with forwarding must add a stall cycle for every nop we had in 4.13.3. Overall, we get:

   No forwarding                With forwarding              Speed-up due to forwarding
a. (7 + 1) × 300ps = 2400ps     7 × 400ps = 2800ps           0.86 (This is really a slowdown)
b. (7 + 2) × 200ps = 1800ps     (7 + 1) × 250ps = 2000ps     0.90 (This is really a slowdown)
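
The arithmetic in this table can be reproduced with a short script. This is only a sketch; the function names are hypothetical, and the cycle counts and cycle times are the ones quoted in the table.

```python
def execution_time_ps(base_cycles, stall_cycles, cycle_time_ps):
    """Total time = (cycles without stalls + stall cycles) * clock cycle time."""
    return (base_cycles + stall_cycles) * cycle_time_ps

def speedup(old_time, new_time):
    """Speed-up of the new design over the old one; below 1 means a slowdown."""
    return old_time / new_time

# Case a: one stall without forwarding at 300 ps, no stall with forwarding at 400 ps.
a_no_fwd = execution_time_ps(7, 1, 300)
a_fwd = execution_time_ps(7, 0, 400)
# Case b: two stalls without forwarding at 200 ps, one stall with forwarding at 250 ps.
b_no_fwd = execution_time_ps(7, 2, 200)
b_fwd = execution_time_ps(7, 1, 250)

print(a_no_fwd, a_fwd, round(speedup(a_no_fwd, a_fwd), 2))
print(b_no_fwd, b_fwd, round(speedup(b_no_fwd, b_fwd), 2))
```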

4.13.5
With ALU-ALU-only forwarding, an ALU instruction can forward to the next instruction, but not to
the second-next instruction (because that would be forwarding from MEM to EX). A load cannot
forward at all, because it determines the data value in MEM stage, when it is too late for ALU-ALU
forwarding. We have:

   Instruction sequence
a. lw  $1,40($6)              Can’t use ALU-ALU forwarding ($1 loaded in MEM)
   add $6,$2,$2
   nop
   sw  $6,50($1)
b. lw  $5,-16($5)             Can’t use ALU-ALU forwarding ($5 loaded in MEM)
   nop
   nop
   sw  $5,-16($5)
   add $5,$5,$5

4.13.6
   No forwarding                With ALU-ALU                 Speed-up with ALU-ALU
                                forwarding only              forwarding
a. (7 + 1) × 300ps = 2400ps     (7 + 1) × 360ps = 2880ps     0.83 (This is really a slowdown)
b. (7 + 2) × 200ps = 1800ps     (7 + 2) × 220ps = 1980ps     0.91 (This is really a slowdown)

4.14
4.14.1
In the pipelined execution shown below, *** represents a stall when an instruction cannot be fetched
because a load or store instruction is using the memory in that cycle. Cycles are represented from left
to right, and for each instruction we show the pipeline stage it is in during that cycle:


We cannot add nops to the code to eliminate this hazard—nops need to be fetched just like any other
instructions, so this hazard must be addressed with a hardware hazard detection unit in the processor.

4.14.2
This change only saves one cycle in an entire execution without data hazards (such as the one given).
This cycle is saved because the last instruction finishes one cycle earlier (one less stage to go through).
If there were data hazards from loads to other instructions, the change would help eliminate some
stall cycles.

   Instructions   Cycles with    Cycles with
   executed       5 stages       4 stages       Speed-up
a. 4              4 + 4 = 8      3 + 4 = 7      8/7 = 1.14
b. 5              4 + 5 = 9      3 + 5 = 8      9/8 = 1.13
4.14.3
Stall-on-branch delays the fetch of the next instruction until the branch is executed. When branches
execute in the EXE stage, each branch causes two stall cycles. When branches execute in the ID stage,
each branch only causes one stall cycle. Without branch stalls (e.g., with perfect branch prediction)
there are no stalls, and the execution time is 4 plus the number of executed instructions. We have:

   Instructions   Branches    Cycles with           Cycles with
   executed       executed    branch in EXE         branch in ID          Speed-up
a. 4              1           4 + 4 + 1 × 2 = 10    4 + 4 + 1 × 1 = 9     10/9 = 1.11
b. 5              1           4 + 5 + 1 × 2 = 11    4 + 5 + 1 × 1 = 10    11/10 = 1.10
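
The cycle counts in this table all follow the same formula: pipeline fill cycles, plus one cycle per instruction, plus the stall cycles charged to each branch. A small sketch under that reading (the function name and parameter names are hypothetical):

```python
def total_cycles(instructions, branches, stalls_per_branch, stages=5):
    """Cycles = (stages - 1) to fill the pipeline + one per instruction + branch stalls."""
    return (stages - 1) + instructions + branches * stalls_per_branch

# Case a: 4 instructions, 1 branch.
a_exe = total_cycles(4, 1, stalls_per_branch=2)   # branch resolved in EXE
a_id = total_cycles(4, 1, stalls_per_branch=1)    # branch resolved in ID
# Case b: 5 instructions, 1 branch.
b_exe = total_cycles(5, 1, stalls_per_branch=2)
b_id = total_cycles(5, 1, stalls_per_branch=1)

print(a_exe, a_id, round(a_exe / a_id, 2))
print(b_exe, b_id, round(b_exe / b_id, 2))
```

Setting `stalls_per_branch=0` gives the stall-free count of 4 plus the number of executed instructions mentioned in the text.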

4.14.4
The number of cycles for the (normal) 5-stage and the (combined EX/MEM) 4-stage pipeline is
already computed in 4.14.2. The clock cycle time is equal to the latency of the longest-latency stage.
Combining EX and MEM stages affects clock time only if the combined EX/MEM stage becomes the
longest-latency stage:

   Cycle time        Cycle time
   with 5 stages     with 4 stages          Speed-up
a. 130ps (MEM)       150ps (MEM + 20ps)     (8 × 130)/(7 × 150) = 0.99
b. 220ps (MEM)       240ps (MEM + 20ps)     (9 × 220)/(8 × 240) = 1.03

4.14.5
   New ID     New EX     New cycle       Old cycle
   latency    latency    time            time            Speed-up
a. 180ps      80ps       180ps (ID)      130ps (MEM)     (10 × 130)/(9 × 180) = 0.80
b. 150ps      160ps      220ps (MEM)     220ps (MEM)     (11 × 220)/(10 × 220) = 1.10
4.14.6
The cycle time remains unchanged: a 20ps reduction in EX latency has no effect on clock cycle time
because EX is not the longest-latency stage. The change does affect execution time because it adds one
additional stall cycle to each branch. Because the clock cycle time does not improve but the number of
cycles increases, the speed-up from this change will be below 1 (a slowdown). In 4.14.3 we already
computed the number of cycles when branch is in EX stage. We have:

   Cycles with            Execution time         Cycles with            Execution time
   branch in EX           (branch in EX)         branch in MEM          (branch in MEM)        Speed-up
a. 4 + 4 + 1 × 2 = 10     10 × 130ps = 1300ps    4 + 4 + 1 × 3 = 11     11 × 130ps = 1430ps    0.91
b. 4 + 5 + 1 × 2 = 11     11 × 220ps = 2420ps    4 + 5 + 1 × 3 = 12     12 × 220ps = 2640ps    0.92

4.15
4.15.1
a. This instruction behaves like a load with a zero offset until it fetches the value from memory. The
pre-ALU Mux must have another input now (zero) to allow this. After the value is read from memory
in the MEM stage, it must be compared against zero. This must either be done quickly in the WB
stage, or we must add another stage between MEM and WB. The result of this zero comparison must
then be used to control the branch Mux, delaying the selection signal for the branch Mux until the WB
stage.
b. We need to compute the memory address using two register values, so the address computation for
SWI is the same as the value computation for the ADD instruction. However now we need to read a
third register value, so Registers must be extended to support another read register input and another
read data output and a Mux must be added in EX to select the Data Memory’s write data input between
this value and the value for the normal SW instruction.

4.15.2
a. We need to add one more bit to the control signal for the pre-ALU Mux. We also need a control
signal similar to the existing “Branch” signal to control whether or not the new zero-compare result is
allowed to change the PC.
b. We need a control signal to control the new Mux in the EX stage.

4.15.3
a. This instruction introduces a new control hazard. The new PC for this branch is computed only after
the MEM stage. If a new stage is added after MEM, this either adds new forwarding paths (from the
new stage to EX) or (if there is no forwarding) makes a stall due to a data hazard one cycle longer.
b. This instruction does not affect hazards. It modifies no registers, so it causes no data hazards. It is
not a branch instruction, so it produces no control hazards. With the added third register read port, it
creates no new resource hazards, either.

4.15.4
a. lw  Rtmp,0(Rs)
   beq Rt,$0,Label
   e.g., BEZI can be used when trying to find the length of a zero-terminated array.

b. add Rtmp,Rs,Rt
   sw  Rd,0(Rtmp)
   e.g., SWI can be used to store to an array element, where the array begins at address Rt and Rs is
   used as an index into the array.


4.15.5
The instruction can be translated into simple MIPS-like micro-operations (see 4.15.4 for a possible
translation). These micro-operations can then be executed by the processor with a “normal” pipeline.

4.15.6
We will compute the execution time for every replacement interval. The old execution time is simply
the number of instructions in the replacement interval (CPI of 1). The new execution time is the number
of instructions after we made the replacement, plus the number of added stall cycles. The new number
of instructions is the number of instructions in the original replacement interval, plus the new
instruction, minus the number of instructions it replaces:
   New execution time        Old execution time    Speed-up
a. 20 − (2 − 1) + 1 = 20     20                    1.00
b. 60 − (3 − 1) + 0 = 58     60                    1.03
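
The replacement-interval calculation can be written out as a one-line formula. The sketch below uses hypothetical function and parameter names; the interval lengths, number of replaced instructions, and added stall cycles are the values shown in the table.

```python
def new_interval_time(interval_instructions, replaced, added_stalls):
    """Cycles for one replacement interval after substituting the new instruction.

    One new instruction replaces `replaced` old ones, so the instruction count
    drops by (replaced - 1); stall cycles caused by the new instruction are added.
    """
    return interval_instructions - (replaced - 1) + added_stalls

# Case a: interval of 20, new instruction replaces 2, adds 1 stall cycle.
a_new = new_interval_time(20, replaced=2, added_stalls=1)
# Case b: interval of 60, new instruction replaces 3, adds no stall cycles.
b_new = new_interval_time(60, replaced=3, added_stalls=0)

print(a_new, round(20 / a_new, 2))
print(b_new, round(60 / b_new, 2))
```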

4.16
4.16.1
For every instruction, the IF/ID register keeps the PC + 4 and the instruction word itself. The ID/EX
register keeps all control signals for the EX, MEM, and WB stages, PC + 4, the two values read from
Registers, the sign-extended lowermost 16 bits of the instruction word, and Rd and Rt fields of the
instruction word (even for instructions whose format does not use these fields). The EX/MEM register
keeps control signals for MEM and WB stages, the PC + 4 + Offset (where Offset is the sign-extended
lowermost 16 bits of the instruction, even for instructions that have no offset field), the ALU result
and the value of its Zero output, the value that was read from the second register in the ID stage (even
for instructions that never need this value), and the number of the destination register (even for
instructions that need no register writes; for these instructions the number of the destination register is
simply a “random” choice between Rd or Rt). The MEM/WB register keeps the WB control signals,
the value read from memory (or a “random” value if there was no memory read), the ALU result, and
the number of the destination register.
4.16.2
   Need to be read    Actually read
a. $6                 $6, $1
b. $5                 $5 (twice)
4.16.3
   EX          MEM
a. 40 + $6     Load value from memory
b. $5 + $5     Nothing

4.16.4


4.16.5
In a particular clock cycle, a pipeline stage is not doing useful work if it is stalled or if the instruction
going through that stage is not doing any useful work there. In the pipeline execution diagram from
4.16.4, a stage is stalled if its name is not shown for a particular cycle, and stages in which the
particular instruction is not doing useful work are marked in red. Note that a BEQ instruction is doing
useful work in the MEM stage, because it is determining the correct value of the next instruction’s PC
in that stage. We have:

   Cycles per         Cycles in which all       % of cycles in which all
   loop iteration     stages do useful work     stages do useful work
a. 5                  1                         20%
b. 5                  2                         40%
4.16.6
The address of the first instruction of the third iteration (PC + 4 for the beq from the previous
iteration) and the instruction word of the beq from the previous iteration.
4.17
4.17.1
Of all these instructions, the value produced by this adder is actually used only by a beq instruction
when the branch is taken. We have:
a. 15% (60% of 25%)
b. 9% (60% of 15%)
4.17.2
Of these instructions, only add needs all three register ports (it reads two registers and writes one).
beq and sw do not write any register, and lw only uses one register value. We have:
a. 50%
b. 30%
4.17.3
Of these instructions, only lw and sw use the data memory. We have:
a. 25% (15% + 10%)
b. 55% (35% + 20%)
4.17.4
The clock cycle time of a single-cycle datapath is the sum of all latencies for the logic of all five
stages. The clock cycle time of a pipelined datapath is the maximum of the five stage logic latencies,
plus the latency of a pipeline register that keeps the results of each stage for the next stage. We have:

   Single-cycle    Pipelined    Speed-up
a. 500ps           140ps        3.57
b. 730ps           230ps        3.17
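
The two clock-cycle rules above can be stated compactly. The symbols $L_s$ (logic latency of stage $s$) and $L_{\text{reg}}$ (pipeline register latency) are notation introduced here for illustration only; the individual stage latencies come from the exercise statement and are not restated in this solution.

```latex
\begin{align*}
T_{\text{single}} &= \sum_{s \in \{\mathrm{IF},\,\mathrm{ID},\,\mathrm{EX},\,\mathrm{MEM},\,\mathrm{WB}\}} L_s \\
T_{\text{pipe}}   &= \max_{s} L_s + L_{\text{reg}} \\
\text{Speed-up}   &= \frac{T_{\text{single}}}{T_{\text{pipe}}},
  \qquad \text{e.g. } \frac{500\,\text{ps}}{140\,\text{ps}} \approx 3.57 \text{ for case a.}
\end{align*}
```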
4.17.5
The latency of the pipelined datapath is unchanged (the maximum stage latency does not change). The
clock cycle time of the single-cycle datapath is the sum of logic latencies for the four stages (IF, ID,
WB, and the combined EX + MEM stage). We have:
Single-cycle Pipelined
a. 410ps 140ps
b. 560ps 230ps
4.17.6
The clock cycle time of the two pipelines (5-stage and 4-stage) is the same, as explained for 4.17.5.
The number of instructions increases for the 4-stage pipeline, so the speed-up is below 1 (there is a
slowdown):

   Instructions with 5-stage    Instructions with 4-stage                         Speed-up
a. 1.00 × I                     1.00 × I + 0.5 × (0.15 + 0.10) × I = 1.125 × I    0.89
b. 1.00 × I                     1.00 × I + 0.5 × (0.35 + 0.20) × I = 1.275 × I    0.78
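
Because the clock cycle time is unchanged, the speed-up here is just the inverse of the relative instruction count. A minimal sketch of that calculation, with hypothetical names; the factor 0.5 and the load/store fractions are taken directly from the expressions in the table:

```python
def four_stage_speedup(load_fraction, store_fraction, extra_fraction=0.5):
    """Speed-up of the 4-stage pipeline relative to the 5-stage one at equal clock.

    extra_fraction is the share of loads and stores that need one additional
    instruction in the 4-stage design (0.5 in the table above).
    """
    relative_instructions = 1.0 + extra_fraction * (load_fraction + store_fraction)
    return 1.0 / relative_instructions

speedup_a = four_stage_speedup(0.15, 0.10)   # 1 / 1.125
speedup_b = four_stage_speedup(0.35, 0.20)   # 1 / 1.275

print(round(speedup_a, 2), round(speedup_b, 2))
```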

4.18
4.18.1
No signals are asserted in IF and ID stages. For the remaining three stages we have:
   EX                         MEM                          WB
a. ALUSrc = 0, ALUOp = 10,    Branch = 0, MemWrite = 0,    MemtoReg = 1,
   RegDst = 1                 MemRead = 0                  RegWrite = 1

b. ALUSrc = 0, ALUOp = 10,    Branch = 0, MemWrite = 0,    MemtoReg = 1,
   RegDst = 1                 MemRead = 0                  RegWrite = 1

4.18.2
One clock cycle.

4.18.3
The PCSrc signal is 0 for this instruction. The reason against generating the PCSrc signal in the EX
stage is that the AND must be done after the ALU computes its Zero output. If the EX stage is the
longest-latency stage and the ALU output is on its critical path, the additional latency of an AND gate
would increase the clock cycle time of the processor. The reason in favor of generating this signal in
the EX stage is that the correct next-PC for a conditional branch can be computed one cycle earlier, so
we can avoid one stall cycle when we have a control hazard.

4.18.4
   Control signal 1                 Control signal 2
a. Generated in ID, used in EX      Generated in ID, used in WB
b. Generated in ID, used in MEM     Generated in ID, used in WB

4.18.5
a. R-type instructions
b. Loads.

4.18.6
Signal 2 goes back through the pipeline. It affects execution of instructions that execute after the one for
which the signal is generated, so it is not a time-travel paradox.

4.19
4.19.1
Dependences to the 1st next instruction result in 2 stall cycles, and the stall is also 2 cycles if the
dependence is to both 1st and 2nd next instruction. Dependences to only the 2nd next instruction result
in one stall cycle. We have:

   CPI                                 Stall cycles
a. 1 + 0.45 × 2 + 0.05 × 1 = 1.95      49% (0.95/1.95)
b. 1 + 0.40 × 2 + 0.10 × 1 = 1.9       47% (0.9/1.9)
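
The CPI and stall-fraction figures above can be recomputed as follows. This is a sketch with hypothetical names; the two input fractions are the ones appearing in the table (instructions that incur two stall cycles and instructions that incur one).

```python
def cpi_without_forwarding(frac_two_stalls, frac_one_stall):
    """Return (CPI, fraction of all cycles that are stalls) for a base CPI of 1."""
    stall_cycles = 2 * frac_two_stalls + 1 * frac_one_stall
    cpi = 1 + stall_cycles
    return cpi, stall_cycles / cpi

cpi_a, stall_share_a = cpi_without_forwarding(0.45, 0.05)
cpi_b, stall_share_b = cpi_without_forwarding(0.40, 0.10)

print(round(cpi_a, 2), round(100 * stall_share_a))
print(round(cpi_b, 2), round(100 * stall_share_b))
```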

4.19.2
With full forwarding, the only RAW data dependences that cause stalls are those from the MEM stage
of one instruction to the 1st next instruction. Even this dependence causes only one stall cycle, so we
have:

   CPI                  Stall cycles
a. 1 + 0.25 = 1.25      20% (0.25/1.25)
b. 1 + 0.20 = 1.20      17% (0.20/1.20)
4.19.3
With forwarding only from the EX/MEM register, EX to 1st dependences can be satisfied without
stalls but EX to 2nd and MEM to 1st dependences incur a one-cycle stall. With forwarding only from
the MEM/WB register, EX to 2nd dependences incur no stalls. MEM to 1st dependences still incur a
one-cycle stall (no time travel), and EX to 1st dependences now incur one stall cycle because we must
wait for the instruction to complete the MEM stage to be able to forward to the next instruction. We
compute stall cycles per instruction for each case as follows:

   EX/MEM                        MEM/WB                        Fewer stall cycles with
a. 0.10 + 0.05 + 0.25 = 0.40     0.10 + 0.10 + 0.25 = 0.45     EX/MEM
b. 0.05 + 0.10 + 0.20 = 0.35     0.15 + 0.05 + 0.20 = 0.40     EX/MEM
4.19.4
In 4.19.1 and 4.19.2 we have already computed the CPI without forwarding and with full forwarding.
Now we compute time per instruction by taking into account the clock cycle time:
   Without forwarding        With forwarding             Speed-up
a. 1.95 × 100ps = 195ps      1.25 × 110ps = 137.5ps      1.42
b. 1.90 × 300ps = 570ps      1.20 × 350ps = 420ps        1.36

4.19.5
We already computed the time per instruction for full forwarding in 4.19.4. Now we compute time per
instruction with time-travel forwarding and the speed-up over full forwarding:

   With full forwarding        Time-travel forwarding    Speed-up
a. 1.25 × 110ps = 137.5ps      1 × 210ps = 210ps         0.65
b. 1.20 × 350ps = 420ps        1 × 450ps = 450ps         0.93

4.19.6
   EX/MEM                    MEM/WB                    Shorter time per instruction with
a. 1.40 × 100ps = 140ps      1.45 × 100ps = 145ps      EX/MEM
b. 1.35 × 320ps = 432ps      1.40 × 310ps = 434ps      EX/MEM

4.20
4.20.1

4.20.2
Only RAW dependences can become data hazards. With forwarding, only RAW dependences from a
load to the very next instruction become hazards. Without forwarding, any RAW dependence from an
instruction to one of the following three instructions becomes a hazard:

4.20.3
With forwarding, only RAW dependences from a load to the next two instructions become hazards
because the load produces its data at the end of the second MEM stage. Without forwarding, any RAW
dependence from an instruction to one of the following 4 instructions becomes a hazard:

4.20.4

4.20.5
A register modification becomes “visible” to the EX stage of the following instructions only two
cycles after the instruction that produces the register value leaves the EX stage. Our forwarding-
assuming hazard detection unit only adds a one-cycle stall if the instruction that immediately follows a
load is dependent on the load. We have:

4.20.6

4.21
4.21.2
We can move up an instruction by swapping its place with another instruction that has no dependences
with it, so we can try to fill some nop slots with such instructions. We can also use R7 to eliminate
WAW or WAR dependences so we can have more instructions to move up.

4.21.3
With forwarding, the hazard detection unit is still needed because it must insert a one-cycle stall
whenever the load supplies a value to the instruction that immediately follows that load. Without the
hazard detection unit, the instruction that depends on the immediately preceding load gets the stale
value the register had before the load instruction.
a. I2 gets the value of $1 from before I1, not from I1 as it should.
b. I4 gets the value of $1 from I1, not from I3 as it should.
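
The stall condition described above can be sketched as a small predicate. It follows the usual textbook load-use check; the function and parameter names are hypothetical, and the register numbers in the usage lines are illustrative only, since the exercise's instruction sequence is not reproduced in this solution.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Return True when the instruction in ID must wait one cycle.

    id_ex_mem_read: True if the instruction currently in EX is a load.
    id_ex_rt:       number of the register that load will write.
    if_id_rs/rt:    numbers of the registers read by the instruction in ID.
    """
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)

# A load into $1 followed immediately by an instruction that reads $1: stall.
stall_after_load = load_use_stall(True, 1, 1, 2)
# Same registers, but the instruction in EX is not a load: forwarding is enough.
stall_after_alu = load_use_stall(False, 1, 1, 2)

print(stall_after_load, stall_after_alu)
```

Without this check, the dependent instruction would proceed and read the stale register value, which is the failure described in parts a and b.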

4.21.4
The outputs of the hazard detection unit are PCWrite, IF/IDWrite, and ID/EXZero (which controls the
Mux after the output of the Control unit). Note that IF/IDWrite is always equal to PCWrite, and
ID/EXZero is always the opposite of PCWrite. As a result, we will only show the value of PCWrite for
each cycle. The outputs of the forwarding unit are ALUin1 and ALUin2, which control the Muxes that
select the first and second input of the ALU. The three possible values for ALUin1 or ALUin2 are 0
(no forwarding), 1 (forward ALU output from previous instruction), or 2 (forward data value for
second-previous instruction). We have:

4.21.5
The instruction that is currently in the ID stage needs to be stalled if it depends on a value produced by
the instruction in the EX stage or the instruction in the MEM stage. So we need to check the destination
register of these two instructions. For the instruction in the EX stage, we need to check Rd for R-type
instructions and Rt for loads. For the instruction in the MEM stage, the destination register is already
selected (by the Mux in the EX stage) so we need to check that register number (this is the bottommost
output of the EX/MEM pipeline register). The additional inputs to the hazard detection unit are register
Rd from the ID/EX pipeline register and the destination register number output by the EX/MEM
pipeline register. The Rt field from the ID/EX register is already an input of the hazard detection unit
in Figure 4.60. No additional outputs are needed. We can stall the pipeline using the three output
signals that we already have.
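
A possible shape for this extended hazard detection unit is sketched below. All names are hypothetical. Two details are assumptions added for the sketch rather than statements from the solution: the RegWrite qualifiers on each stage, and the exclusion of register $0. The caller is expected to pass the already-selected destination register of the instruction in EX (Rd for R-type, Rt for loads).

```python
def hazard_unit_no_forwarding(id_rs, id_rt,
                              ex_reg_write, ex_dest,
                              mem_reg_write, mem_dest):
    """Return the three hazard-unit outputs for the current cycle.

    Stall when the instruction in ID reads a register that the instruction
    in EX or in MEM is still going to write.
    """
    sources = (id_rs, id_rt)
    # Assumption: only instructions that write a register other than $0 matter.
    ex_hazard = ex_reg_write and ex_dest != 0 and ex_dest in sources
    mem_hazard = mem_reg_write and mem_dest != 0 and mem_dest in sources
    stall = bool(ex_hazard or mem_hazard)

    pc_write = 0 if stall else 1
    return {
        "PCWrite": pc_write,
        "IF/IDWrite": pc_write,        # always equal to PCWrite
        "ID/EXZero": 1 - pc_write,     # always the opposite of PCWrite
    }

# Instruction in ID reads $1 and $2; instruction in EX will write $1: stall.
stalled = hazard_unit_no_forwarding(1, 2, True, 1, False, 0)
# No pending writer of $1 or $2 in EX or MEM: proceed.
running = hazard_unit_no_forwarding(1, 2, True, 3, True, 4)

print(stalled)
print(running)
```

Treating both Rs and Rt as sources for every instruction is conservative: it may stall an instruction that does not actually read Rt, but it never misses a real hazard.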

4.21.6
As explained for 4.21.4, we only need to specify the value of the PCWrite signal, because IF/IDWrite
is equal to PCWrite and the ID/EXZero signal is its opposite. We have:

4.22
4.22.1

4.22.2
