Exercise 1 ComputerArchitecture
Hồ Hữu Hiệp - ITITIU20202
Nguyễn Duy Khang - ITITIU18057
Nguyễn Thanh Hiền - ITITIU20142
Problem 1
              P1        P2        P3
Clock rate    3.0 GHz   2.5 GHz   4.0 GHz
CPI           1.5       1.0       2.2
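a) Processor P1:
$$\text{Instructions per second (P1)} = \frac{3.0 \times 10^{9}}{1.5} = 2.0 \times 10^{9} \text{ (instructions/s)}$$
Processor P2:
$$\text{Instructions per second (P2)} = \frac{2.5 \times 10^{9}}{1.0} = 2.5 \times 10^{9} \text{ (instructions/s)}$$
Processor P3:
$$\text{Instructions per second (P3)} = \frac{4.0 \times 10^{9}}{2.2} \approx 1.82 \times 10^{9} \text{ (instructions/s)}$$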
In the same amount of time (1 second), P2 processes the greatest number of
instructions among the three processors. Hence, P2 has the highest
performance.
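The comparison can be checked with a few lines of Python. This is only a minimal sketch: the dictionary layout and the helper name `instructions_per_second` are our own illustrative choices, while the clock rates and CPI values come from the table in Problem 1.

```python
# Clock rate (Hz) and CPI for each processor, from the Problem 1 table.
processors = {
    "P1": (3.0e9, 1.5),
    "P2": (2.5e9, 1.0),
    "P3": (4.0e9, 2.2),
}


def instructions_per_second(clock_rate_hz, cpi):
    """Instructions per second = clock rate / CPI."""
    return clock_rate_hz / cpi


ips = {name: instructions_per_second(f, cpi) for name, (f, cpi) in processors.items()}
fastest = max(ips, key=ips.get)  # processor with the highest throughput

for name, value in ips.items():
    print(f"{name}: {value:.3e} instructions/s")
print("Highest performance:", fastest)
```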
b) Based on the formula for CPU time above, the number of instructions is
$$\text{Instruction count} = \frac{\text{CPU time} \times \text{Clock rate}}{\text{CPI}}$$
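Processor P1:
$$\text{Clock cycles}_1 = 1.5 \times 2.0 \times 10^{10} = 3.0 \times 10^{10} \text{ (cycles)}$$
Processor P2:
$$\text{Clock cycles}_2 = 1.0 \times 2.5 \times 10^{10} = 2.5 \times 10^{10} \text{ (cycles)}$$
Processor P3:
$$\text{Clock cycles}_3 = 2.2 \times 1.8 \times 10^{10} \approx 4.0 \times 10^{10} \text{ (cycles)}$$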
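c) Processor P1:
$$\text{Clock rate}' = \frac{12}{7} \times 3.0 \approx 5.1 \text{ (GHz)}$$
Processor P2:
$$\text{Clock rate}' = \frac{12}{7} \times 2.5 \approx 4.3 \text{ (GHz)}$$
Processor P3:
$$\text{Clock rate}' = \frac{12}{7} \times 4.0 \approx 6.9 \text{ (GHz)}$$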
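As a quick check of the scaled clock rates, the sketch below applies the factor 12/7 to each original clock rate. The factor itself is taken from the solution; reading it as 1.2/0.7 (CPI up by 20%, execution time down by 30%) is our assumption, since the problem statement is not reproduced here. The variable names are illustrative.

```python
from fractions import Fraction

# Scaling factor used in the solution; 12/7 equals 1.2 / 0.7 (assumed interpretation).
scale = Fraction(12, 7)

# Original clock rates in GHz, from the Problem 1 table.
old_clock_ghz = {"P1": 3.0, "P2": 2.5, "P3": 4.0}

new_clock_ghz = {name: float(scale) * ghz for name, ghz in old_clock_ghz.items()}

for name, ghz in new_clock_ghz.items():
    print(f"{name}: {ghz:.2f} GHz")
```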
Problem 2:
Class A: $10^{6} \times 10\% = 10^{5}$ (instructions)
Class B: $10^{6} \times 20\% = 2 \times 10^{5}$ (instructions)
Class C: $10^{6} \times 50\% = 5 \times 10^{5}$ (instructions)
Class D: $10^{6} \times 20\% = 2 \times 10^{5}$ (instructions)
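b) $$\text{global CPI} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction count}_i}{\text{Instruction count}} \right)$$
$$\text{global CPI}(1) = \frac{(10^{5} \times 1) + (2 \times 10^{5} \times 2) + \dots}{10^{6}} = 2.6$$
$$\text{global CPI}(2) = \frac{(10^{5} \times 2) + (2 \times 10^{5} \times 2) + \dots}{10^{6}} = 2.0$$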
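a) $$\text{CPU time} = \frac{\text{Instruction count} \times \text{global CPI}}{\text{Clock rate}}$$
$$\text{CPU time}(1) = \frac{10^{6} \times 2.6}{2.5 \times 10^{9}} = 1.04 \times 10^{-3} \text{ (s)}$$
$$\text{CPU time}(2) = \frac{10^{6} \times 2.0}{3.0 \times 10^{9}} \approx 0.67 \times 10^{-3} \text{ (s)}$$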
Hence, the second implementation is faster.
c) $$\text{Clock cycles} = \text{CPI} \times \text{Instruction count}$$
$$\text{Clock cycles}(1) = 2.6 \times 10^{6} \text{ (cycles)}$$
$$\text{Clock cycles}(2) = 2.0 \times 10^{6} \text{ (cycles)}$$
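The CPU times and clock cycles of the two implementations can be recomputed as follows. This is a minimal sketch that starts from the global CPI values (2.6 and 2.0) and the clock rates (2.5 GHz and 3.0 GHz) used in the solution; the function and variable names are our own.

```python
INSTRUCTION_COUNT = 1e6

# implementation number -> (global CPI, clock rate in Hz), as used in the solution
implementations = {1: (2.6, 2.5e9), 2: (2.0, 3.0e9)}


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def clock_cycles(instruction_count, cpi):
    """Clock cycles = CPI x instruction count."""
    return instruction_count * cpi


times = {k: cpu_time(INSTRUCTION_COUNT, cpi, f) for k, (cpi, f) in implementations.items()}
cycles = {k: clock_cycles(INSTRUCTION_COUNT, cpi) for k, (cpi, _) in implementations.items()}

for k in implementations:
    print(f"Implementation {k}: {times[k] * 1e3:.3f} ms, {cycles[k]:.2e} cycles")
```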
Problem 3:
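a) $$\text{CPI} = \frac{\text{CPU time}}{\text{Clock cycle time} \times \text{Instruction count}}$$
$$\text{CPI}(A) = \frac{1.1}{10^{9} \times 10^{-9}} = 1.1$$
$$\text{CPI}(B) = \frac{1.5}{1.2 \times 10^{9} \times 10^{-9}} = 1.25$$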
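b) $$\text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$
$$\Rightarrow \frac{\text{CPU time}(A)}{\text{CPU time}(B)} = \frac{\text{Instruction count}(A)}{\text{Instruction count}(B)} \times \frac{\text{CPI}(A)}{\text{CPI}(B)} \times \frac{\text{Clock rate}(B)}{\text{Clock rate}(A)}$$
$$\frac{\text{Clock rate}(A)}{\text{Clock rate}(B)} = \frac{\text{Instruction count}(A)}{\text{Instruction count}(B)} \times \frac{\text{CPI}(A)}{\text{CPI}(B)} \times \frac{\text{CPU time}(B)}{\text{CPU time}(A)}$$
$$\frac{\text{Clock rate}(A)}{\text{Clock rate}(B)} = \frac{10^{9}}{1.2 \times 10^{9}} \times \frac{1.1}{1.25} \times 1 \Rightarrow \text{Clock rate}(A) = 0.73 \, \text{Clock rate}(B)$$
Hence, the clock of the processor running compiler B's code is $\frac{1}{0.73} \approx 1.36$ times faster
than the clock of the processor running compiler A's code.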
c) $$\text{CPU time (new compiler)} = \text{CPI} \times \text{Instruction count} \times \text{Clock cycle time} = 1.1 \times 6 \times 10^{8} \times 10^{-9} = 0.66 \text{ (s)}$$
(Clock cycle time $= 10^{-9}$ s because the processor is the same.)
$$\frac{\text{CPU time}(A)}{\text{CPU time (new compiler)}} = \frac{1.1}{0.66} = 1.67$$
$$\frac{\text{CPU time}(B)}{\text{CPU time (new compiler)}} = \frac{1.5}{0.66} = 2.27$$
Therefore, on that processor the code from the new compiler runs 1.67 times
faster than the code from compiler A and 2.27 times faster than the code from compiler B.
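The CPI values and speedups in Problem 3 can be reproduced with the sketch below. The numbers (instruction counts, CPU times, the 1 ns cycle time, and the new compiler's CPI of 1.1 with 6 × 10^8 instructions) are the ones used in the solution; the helper name `cpi` and the variable names are our own.

```python
CYCLE_TIME_S = 1e-9  # 1 ns clock cycle, as used in the solution


def cpi(cpu_time_s, instruction_count, cycle_time_s=CYCLE_TIME_S):
    """CPI = CPU time / (clock cycle time x instruction count)."""
    return cpu_time_s / (cycle_time_s * instruction_count)


cpi_a = cpi(1.1, 1.0e9)  # compiler A
cpi_b = cpi(1.5, 1.2e9)  # compiler B

# New compiler: CPI 1.1 and 6e8 instructions on the same processor.
new_time = 1.1 * 6.0e8 * CYCLE_TIME_S
speedup_vs_a = 1.1 / new_time
speedup_vs_b = 1.5 / new_time

print(f"CPI(A) = {cpi_a:.2f}, CPI(B) = {cpi_b:.2f}")
print(f"New compiler: {new_time:.2f} s, "
      f"{speedup_vs_a:.2f}x vs A, {speedup_vs_b:.2f}x vs B")
```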
Problem 4:
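$$\text{Dynamic power} = \text{Capacitive load} \times \text{Voltage}^{2} \times \text{Frequency}$$
$$\text{Capacitive load} = \frac{\text{Dynamic power}}{\text{Voltage}^{2} \times \text{Frequency}}$$
a) $$\text{Capacitive load (Pentium 4 Prescott)} = \frac{90}{1.25^{2} \times 3.6 \times 10^{9}} = 1.6 \times 10^{-8} \text{ (F)}$$
$$\text{Capacitive load (Core i5 Ivy Bridge)} = \frac{40}{0.9^{2} \times 3.4 \times 10^{9}} = 1.45 \times 10^{-8} \text{ (F)}$$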
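b) $$\%\text{static power} = \frac{\text{static power}}{\text{dynamic power} + \text{static power}}$$
$$\text{Pentium 4 Prescott: } \%\text{static power} = \frac{10}{90 + 10} = 10\%$$
$$\text{Core i5 Ivy Bridge: } \%\text{static power} = \frac{30}{40 + 30} = 42.86\%$$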
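c) Pentium 4 Prescott:
$$\frac{(90 + 10) - 1.6 \times 10^{-8} \times 1.25^{2} \times 3.6 \times 10^{9}}{1.25} = \frac{(90 + 10) \times 0.9 - 1.6 \times 10^{-8} \times (\text{Voltage}')^{2} \times 3.6 \times 10^{9}}{\text{Voltage}'}$$
$$\Rightarrow 8 = \frac{90 - 57.6 \, (\text{Voltage}')^{2}}{\text{Voltage}'}$$
$$\Rightarrow 57.6 \, (\text{Voltage}')^{2} + 8 \, \text{Voltage}' - 90 = 0$$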
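The quadratic above can be solved numerically. The sketch below uses the standard quadratic formula and keeps only the positive root, on the assumption that a supply voltage must be positive; the variable names are our own.

```python
import math

# Coefficients of 57.6 * V'^2 + 8 * V' - 90 = 0, from the derivation above.
a, b, c = 57.6, 8.0, -90.0

discriminant = b * b - 4.0 * a * c
# Only the positive root is physically meaningful for a voltage (our assumption).
new_voltage = (-b + math.sqrt(discriminant)) / (2.0 * a)

print(f"Voltage' = {new_voltage:.3f} V")
```

Under these coefficients the positive root comes out a little below the original 1.25 V.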
Problem 5:
a)
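We have the equation:
$$\text{clock cycles} = \text{number of instructions} \times \text{CPI}$$
Because we have three types of instructions:
$$\text{clock cycles} = \sum_{i=1}^{3} (\text{Instruction count}_i \times \text{CPI}_i) = 2.56 \times 10^{9} \times 1 + 1.28 \times 10^{9} \times 12 + 256 \times 10^{6} \times 5 = 1.92 \times 10^{10}$$
Then,
$$\text{execution time} = \frac{\text{clock cycles}}{\text{clock rate}} = \frac{1.92 \times 10^{10}}{2 \times 10^{9}} = 9.6 \text{ (s)}$$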
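Let p be the number of processors (p > 1). We have:
$$\text{clock cycles}_p = \frac{2.56 \times 10^{9} \times 1}{0.7 \times p} + \frac{1.28 \times 10^{9} \times 12}{0.7 \times p} + 256 \times 10^{6} \times 5 = \frac{2.56 \times 10^{10}}{p} + 1.28 \times 10^{9}$$
hence,
$$\text{execution time}_p = \frac{\text{clock cycles}_p}{\text{clock rate}} = \frac{\dfrac{2.56 \times 10^{10}}{p} + 1.28 \times 10^{9}}{2 \times 10^{9}}$$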
Finally, we’ll sketch the table:
p 1 2 4 8
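The execution time for each processor count can be tabulated with a short script. This is a sketch under the model used above: the first two instruction types are divided by 0.7 × p when p > 1, the third type is not parallelized, and the clock rate is 2 GHz. The tuple layout and the function name `execution_time` are our own choices.

```python
CLOCK_RATE_HZ = 2e9

# (instruction count, CPI) for the three instruction types, as used above.
TYPE_1 = (2.56e9, 1)
TYPE_2 = (1.28e9, 12)
TYPE_3 = (256e6, 5)


def execution_time(p):
    """Execution time in seconds on p processors under the model above."""
    parallel_cycles = TYPE_1[0] * TYPE_1[1] + TYPE_2[0] * TYPE_2[1]
    serial_cycles = TYPE_3[0] * TYPE_3[1]
    if p > 1:
        # Types 1 and 2 are split across processors with the 0.7 factor.
        parallel_cycles /= 0.7 * p
    return (parallel_cycles + serial_cycles) / CLOCK_RATE_HZ


for p in (1, 2, 4, 8):
    print(f"p = {p}: {execution_time(p):.2f} s")
```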
c)
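This means that the execution time of one processor (with reduced CPI) and of 2
Then,
$$\text{clock cycles}_{\text{new}} = (2.56 \times 10^{9}) \times 1 + (1.28 \times 10^{9}) \times \text{CPI}_{2,\text{new}} + (256 \times 10^{6}) \times 5$$
$$= 3.84 \times 10^{9} + 1.28 \times 10^{9} \times \text{CPI}_{2,\text{new}} = 7.68 \times 10^{9}$$
Hence,
$$\text{CPI}_{2,\text{new}} = \frac{7.68 \times 10^{9} - 3.84 \times 10^{9}}{1.28 \times 10^{9}} = 3$$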
Problem 6:
a)
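First, we obtain the die areas:
$$\text{Die area}_1 \approx \frac{\text{Wafer area}_1}{\text{Die count}_1} = \frac{\pi (7.5)^{2}}{84} = 2.104 \text{ (cm}^2\text{)}$$
$$\text{Die area}_2 \approx \frac{\text{Wafer area}_2}{\text{Die count}_2} = \frac{\pi (10)^{2}}{100} = \pi \text{ (cm}^2\text{)}$$
Plug these into the yield equation:
$$\text{Yield}_1 = \frac{1}{\left(1 + \text{Defect rate}_1 \times \frac{\text{Die area}_1}{2}\right)^{2}} = \frac{1}{\left(1 + 0.020 \times \frac{2.104}{2}\right)^{2}} = 0.96$$
$$\text{Yield}_2 = \frac{1}{\left(1 + \text{Defect rate}_2 \times \frac{\text{Die area}_2}{2}\right)^{2}} = \frac{1}{\left(1 + 0.031 \times \frac{\pi}{2}\right)^{2}} = 0.91$$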
b)
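Cost per die:
$$\text{Cost per die}_1 = \frac{\text{Cost per wafer}_1}{\text{Dies per wafer}_1 \times \text{Yield}_1} = \frac{12}{84 \times 0.96} = 0.149$$
$$\text{Cost per die}_2 = \frac{\text{Cost per wafer}_2}{\text{Dies per wafer}_2 \times \text{Yield}_2} = \frac{15}{100 \times 0.91} = 0.165$$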
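The die area, yield, and cost per die for both wafers can be recomputed in one pass. The sketch below uses the wafer radii (7.5 cm and 10 cm), die counts, defect rates, and wafer costs that appear in the solution; the function names are our own, and the yields are not rounded before the cost is computed, so the last digit may differ slightly from a hand calculation.

```python
import math


def die_area(wafer_radius_cm, die_count):
    """Approximate die area = wafer area / die count (cm^2)."""
    return math.pi * wafer_radius_cm ** 2 / die_count


def die_yield(defect_rate, area_cm2):
    """Yield = 1 / (1 + defect rate x die area / 2)^2."""
    return 1.0 / (1.0 + defect_rate * area_cm2 / 2.0) ** 2


def cost_per_die(cost_per_wafer, dies_per_wafer, yield_value):
    """Cost per die = cost per wafer / (dies per wafer x yield)."""
    return cost_per_wafer / (dies_per_wafer * yield_value)


area_1, area_2 = die_area(7.5, 84), die_area(10.0, 100)
yield_1, yield_2 = die_yield(0.020, area_1), die_yield(0.031, area_2)
cost_1 = cost_per_die(12.0, 84, yield_1)
cost_2 = cost_per_die(15.0, 100, yield_2)

print(f"Wafer 1: area {area_1:.3f} cm^2, yield {yield_1:.2f}, cost {cost_1:.3f}")
print(f"Wafer 2: area {area_2:.3f} cm^2, yield {yield_2:.2f}, cost {cost_2:.3f}")
```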
c)
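The number of dies per wafer is increased by 10%:
$$\text{Die area}_1 \approx \frac{\text{Wafer area}_1}{\text{Die count}_1} = \frac{\pi (7.5)^{2}}{84 \times 1.1} = 1.91 \text{ (cm}^2\text{)}$$
$$\text{Die area}_2 \approx \frac{\text{Wafer area}_2}{\text{Die count}_2} = \frac{\pi (10)^{2}}{100 \times 1.1} = 2.86 \text{ (cm}^2\text{)}$$
New:
$$\text{Defect rate} = \frac{1}{\sqrt{\text{Yield}}} - 1 = \frac{1}{\sqrt{0.95}} - 1 = 0.026 \left(\frac{\text{defects}}{\text{cm}^2}\right)$$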
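The defect-rate expression is the yield equation solved for the defect rate. In the sketch below we assume a die area of 2 cm^2, which is the value for which the general inverse, (1/sqrt(Yield) - 1) × 2 / Die area, reduces to the form used above; that die area is our assumption and is not stated in the solution.

```python
import math


def defect_rate_from_yield(yield_value, die_area_cm2=2.0):
    """Invert Yield = 1 / (1 + d * A / 2)^2 for the defect rate d.

    The default die area of 2 cm^2 is an assumption that makes the
    expression collapse to 1 / sqrt(Yield) - 1.
    """
    return (1.0 / math.sqrt(yield_value) - 1.0) * 2.0 / die_area_cm2


rate = defect_rate_from_yield(0.95)
# Sanity check: plugging the rate back in should give the original yield.
recovered_yield = 1.0 / (1.0 + rate * 2.0 / 2.0) ** 2

print(f"Defect rate = {rate:.3f} defects/cm^2")
```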
Problem 7.
Instruction count   Execution time   Reference time
2.389E12            750 s            9650 s
a.
- The clock cycle time is 0.333 ns; find the CPI.
- CPI = (execution time) / ((instruction count) × (clock cycle time))
- $$\text{CPI} = \frac{750}{(2.389 \times 10^{12}) \times (0.333 \times 10^{-9})} = 0.94$$
b.
- $$\text{SPEC ratio} = \frac{\text{reference time}}{\text{execution time}} = \frac{9650}{750} \approx 12.87$$
c.
- $$\text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$
Because CPU time is proportional to the instruction count, increasing the
instruction count by 10% without affecting the clock rate or the CPI
increases the CPU time by 10%.
d.
- CPU time after increasing the instruction count by 10% and the CPI by 5%:
- $$\text{CPU time} = \frac{(1.1 \times \text{Instruction count}) \times (1.05 \times \text{CPI})}{\text{Clock rate}} = 1.155 \times \text{CPU time (old)}$$
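The four parts of Problem 7 can be checked with the sketch below. The inputs are the values from the table and the 0.333 ns clock cycle time; the variable names are our own.

```python
INSTRUCTION_COUNT = 2.389e12
EXECUTION_TIME_S = 750.0
REFERENCE_TIME_S = 9650.0
CYCLE_TIME_S = 0.333e-9

# a. CPI = execution time / (instruction count x clock cycle time)
cpi = EXECUTION_TIME_S / (INSTRUCTION_COUNT * CYCLE_TIME_S)

# b. SPEC ratio = reference time / execution time (dimensionless)
spec_ratio = REFERENCE_TIME_S / EXECUTION_TIME_S

# c. 10% more instructions, same CPI and clock rate
cpu_time_factor_c = 1.1

# d. 10% more instructions and 5% higher CPI
cpu_time_factor_d = 1.1 * 1.05

print(f"CPI = {cpi:.2f}, SPEC ratio = {spec_ratio:.2f}")
print(f"CPU time factors: c = {cpu_time_factor_c:.3f}, d = {cpu_time_factor_d:.3f}")
```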
Problem 8.
      Clock rate (GHz)   Instruction count (E9)   CPI
P1    4                  5                        0.9
P2    3                  1                        0.75
a.
- $$\text{Execution time P1} = \frac{5 \times 10^{9} \times 0.9}{4 \times 10^{9}} = 1.125 \text{ seconds}$$
- $$\text{Execution time P2} = \frac{1 \times 10^{9} \times 0.75}{3 \times 10^{9}} = 0.25 \text{ seconds}$$
The claim is false: although Processor 1 has a higher clock rate than
Processor 2, its execution time is longer than that of Processor 2.
b.
- The execution time of Processor 1 to process 1.0E9 instructions:
$$\text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} = \frac{1.0 \times 10^{9} \times 0.9}{4 \times 10^{9}} = 0.225 \text{ s}$$
c.
=> Although Processor 1 has the larger MIPS rating, we determined in part a
that Processor 2 has the better performance.
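The MIPS figures behind this comparison can be computed from the table. The sketch below assumes the usual definition MIPS = instruction count / (execution time × 10^6), which is not written out in the solution; the names are illustrative.

```python
# name -> (clock rate in Hz, instruction count, CPI), from the Problem 8 table
processors = {
    "P1": (4e9, 5e9, 0.9),
    "P2": (3e9, 1e9, 0.75),
}


def execution_time(clock_rate_hz, instruction_count, cpi):
    """Execution time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(clock_rate_hz, instruction_count, cpi):
    """MIPS = instruction count / (execution time x 10^6) (assumed definition)."""
    return instruction_count / (execution_time(clock_rate_hz, instruction_count, cpi) * 1e6)


exec_times = {name: execution_time(*spec) for name, spec in processors.items()}
mips_rating = {name: mips(*spec) for name, spec in processors.items()}

for name in processors:
    print(f"{name}: {mips_rating[name]:.0f} MIPS, {exec_times[name]:.3f} s")
```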
d.
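$$\text{MFLOPS} = \frac{\text{Number of FP operations}}{\text{Execution time} \times 10^{6}}$$
- $$\text{MFLOPS of Processor 1} = \frac{40\% \times 5 \times 10^{9}}{1.125 \times 10^{6}} \approx 1.78 \times 10^{3}$$
- $$\text{MFLOPS of Processor 2} = \frac{40\% \times 1 \times 10^{9}}{0.25 \times 10^{6}} = 1.6 \times 10^{3}$$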
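The MFLOPS values follow directly from the formula above. This sketch uses the 40% floating-point share and the execution times from part a; the function name `mflops` is our own.

```python
FP_FRACTION = 0.40  # share of floating-point instructions used in the solution


def mflops(instruction_count, execution_time_s, fp_fraction=FP_FRACTION):
    """MFLOPS = number of FP operations / (execution time x 10^6)."""
    return fp_fraction * instruction_count / (execution_time_s * 1e6)


mflops_p1 = mflops(5e9, 1.125)
mflops_p2 = mflops(1e9, 0.25)

print(f"P1: {mflops_p1:.0f} MFLOPS, P2: {mflops_p2:.0f} MFLOPS")
```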
Problem 9:
a)
New time spent running FP operations:
(1 - 0.2) x 70 = 56 (s)
The total time is reduced by:
70 - 56 = 14 (s)
or (14 : 250) x 100 = 5.6%
b)
The total time is reduced by 20% ⇒ 250 x (1 - 0.2) = 200 (s)
Then, the time for executing INT operations is: 200 - 70 - 85 - 40 = 5 (s)
whereas the time actually needed is: 250 - 70 - 85 - 40 = 55 (s)
Hence, the time for INT operations is reduced by: $\frac{55 - 5}{55} \times 100 \approx 91\%$
c)
Assume that we avoid using branch operations entirely.
The execution time is then: 55 + 70 + 85 = 210 (s)
So the reduction is: $1 - \frac{210}{250} = 0.16 = 16\%$
Hence, the total time cannot be reduced by 20% only by decreasing the time of
branch operations.
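The three parts of Problem 9 can be verified with the sketch below. The times (250 s total, 70 s FP, 85 s, 40 s branch) are taken from the solution; the 85 s component is not named in the text, so it is called `OTHER_S` here, and all variable names are our own.

```python
TOTAL_S = 250.0
FP_S = 70.0
OTHER_S = 85.0   # the 85 s component; its instruction class is not named in the text
BRANCH_S = 40.0
INT_S = TOTAL_S - FP_S - OTHER_S - BRANCH_S  # 55 s

# a) FP time reduced by 20%
time_saved = 0.2 * FP_S
total_reduction_pct = time_saved / TOTAL_S * 100.0

# b) total time reduced by 20% by shrinking only the INT time
target_total = TOTAL_S * (1.0 - 0.2)
new_int_time = target_total - FP_S - OTHER_S - BRANCH_S
int_reduction_pct = (INT_S - new_int_time) / INT_S * 100.0

# c) best case for branches: branch time drops to zero
best_branch_reduction = 1.0 - (TOTAL_S - BRANCH_S) / TOTAL_S

print(f"a) total time reduced by {total_reduction_pct:.1f}%")
print(f"b) INT time must shrink by {int_reduction_pct:.0f}%")
print(f"c) removing branches saves at most {best_branch_reduction:.0%}")
```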
Problem 10:
a)
The execution of $50 \times 10^{6}$ FP instructions
$110 \times 10^{6}$ INT instructions
b)
We want the program to run two times faster
⇒ the execution time = $\frac{0.256}{2}$ = 0.128 (s)
4
c)
The CPI of INT and FP instructions is reduced by 40%:
$$\text{CPI}_{\text{INT}} = (1 - 0.4) \times 1 = 0.6$$
Then,
4