B. Tech, High Performance Computer Architecture (CS-3010), Autumn End Semester Examination 2021

Uploaded by

aksad1991

KIIT Deemed to be University

Online End Semester Examination (Autumn Semester-2021)

Subject Name & Code: HPC & CS3010
Applicable to Courses: B.Tech

Full Marks = 50    Time: 2 Hours

SECTION-A (Answer All Questions. Each question carries 2 Marks)

Time: 30 Minutes (7×2=14 Marks)

Question No    Question Type (MCQ/SAT)    Question    CO Mapping    Answer Key (For MCQ Questions only)
Q.No:1 ISA serves as an interface between: CO1 B
A) Hardware and User
B) Hardware and Software
C) Software and User
D) User and Compiler
Multiprocessing can be achieved in a SISD CO1 B
computer by:
A) Keeping multiple PEs
B) Using pipelining
C) Converting it to MIMD
D) Never possible
Which of the following is not included under CO2 D
the dimensions of ISA:
A) Memory addressing
B) Addressing mode
C) Control flow instructions
D) Structural hazard
Assume a pipelined processor shares a CO1 A
single memory pipeline for data and
instructions. What type of hazard may
this lead to?
A) Structural Hazard
B) Data Hazard
C) Control Hazard
D) Instruction Hazard
Q.No:2 A 4 stage pipeline has stage delays as 150, CO2 C
120, 160, 140 ns respectively. There is a
latch delay of 5 ns each. Assuming constant
clock rate, the total time taken to process
1000 data items on this pipeline will be

A) 120.5 µs
B) 160.5 µs
C) 165.5 µs
D) 190 µs
A 4 stage pipeline has stage delays as 300, CO2 B
240, 320, 280 ns respectively. There is a
latch delay of 10 ns each. Assuming constant
clock rate, the total time taken to process
1000 data items on this pipeline will be

A) 241.5 µs
B) 331 µs
C) 298 µs
D) 340 µs

A 4 stage pipeline has stage delays as 150, CO2 A
120, 160, 140 ns respectively. There is a
latch delay of 5 ns each. What is the speedup
ratio?
A) 3.45
B) 4.25
C) 2.8
D) 4.5
A 4 stage pipeline has stage delays as 300, CO2 A
240, 320, 280 ns respectively. There is a
latch delay of 10 ns each. What is the
speedup ratio?
A) 3.45
B) 4.25
C) 2.8
D) 4.5
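
The pipeline timing questions in Q.No:2 can be checked with a short Python sketch. It assumes the usual textbook model, which the paper does not state explicitly: the clock period is the slowest stage delay plus the latch delay, n items on a k-stage pipeline need (k + n - 1) clock periods, and the non-pipelined time per item is the sum of the stage delays. The function name `pipeline_metrics` is only illustrative.

```python
def pipeline_metrics(stage_delays_ns, latch_delay_ns, n_items):
    """Return (clock_ns, total_time_ns, speedup) for a linear pipeline."""
    k = len(stage_delays_ns)
    # Assumption: the clock period is set by the slowest stage plus the latch delay.
    clock_ns = max(stage_delays_ns) + latch_delay_ns
    # The first item needs k cycles; every later item completes one cycle after the previous one.
    total_time_ns = (k + n_items - 1) * clock_ns
    # Assumption: a non-pipelined unit needs the sum of the stage delays per item.
    speedup = sum(stage_delays_ns) / clock_ns
    return clock_ns, total_time_ns, speedup


clock, total, speedup = pipeline_metrics([150, 120, 160, 140], 5, 1000)
print(clock, total / 1000, round(speedup, 2))  # 165 165.495 3.45
```

Under these assumptions the first variant gives about 165.5 µs and a speedup of about 3.45, which agrees with the listed keys (C and A).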

Q.No:3 Suppose a program takes 1 billion CO1 B
instructions to execute on a processor
running at 2 GHz. Suppose also that 60% of
the instructions execute in 3 clock cycles,
20% execute in 4 clock cycles, and 20%
execute in 5 clock cycles. What is the
execution time for the program or task?
A) 1.25 sec
B) 1.8 sec
C) 2.15 sec
D) 3.25 sec

Suppose a program takes 1 billion CO1 A
instructions to execute on a processor
running at 2 GHz. Suppose also that 50% of
the instructions execute in 4 clock cycles,
30% execute in 6 clock cycles, and 20%
execute in 8 clock cycles. What is the
execution time for the program or task?
A) 2.7 sec
B) 3.6 sec
C) 3.8 sec
D) 4.5 sec
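
The two execution-time questions above follow the standard relation: execution time = instruction count × average CPI / clock rate. A minimal Python sketch of that arithmetic (the helper name `execution_time_s` is illustrative, not from the paper):

```python
def execution_time_s(instruction_count, clock_hz, mix):
    """mix is a list of (fraction_of_instructions, cycles_per_instruction) pairs."""
    # Average CPI is the weighted sum over the instruction mix.
    average_cpi = sum(fraction * cycles for fraction, cycles in mix)
    return instruction_count * average_cpi / clock_hz


t1 = execution_time_s(1e9, 2e9, [(0.6, 3), (0.2, 4), (0.2, 5)])
t2 = execution_time_s(1e9, 2e9, [(0.5, 4), (0.3, 6), (0.2, 8)])
print(round(t1, 2), round(t2, 2))  # 1.8 2.7
```

The average CPI values are 3.6 and 5.4, giving 1.8 s and 2.7 s respectively.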

A 7-stage pipeline is driven by a 6 ns clock CO1 D
along with a latch delay of 1 ns. If the
non-pipelined clock also has the same duration
and the pipeline efficiency is 50%, then
calculate the speedup factor.
A) 2
B) 4
C) 6
D) 3
We have 2 designs D1 and D2 for a pipelined CO1 C
processor. D1 has a 6 stage pipeline with
execution times of 6 ns, 5 ns, 4 ns, 7 ns and 3
ns, while design D2 has 8 pipeline
stages, each with 5 ns execution time. How
much time can be saved using design D2
over design D1 for executing 100
instructions?
A) 300 ns
B) 270 ns
C) 200 ns
D) 340 ns
Q.No:4 A memory system has an L1 cache, an L2 cache CO4 C
and main memory. The hit rate of the L1
cache is 90% and the hit rate of L2 is 95%. Find
the average memory access time if it takes
10 cycles to access the L1 cache, 20 cycles to
access the L2 cache and 100 cycles to access
main memory.
A) 14.5
B) 13.5
C) 12.5
D) 11.5
A memory system has an L1 cache, an L2 cache CO4 A
and main memory. The hit rate of the L1
cache is 85% and the hit rate of L2 is 90%. Find
the average memory access time if it takes
5 cycles to access the L1 cache, 15 cycles to
access the L2 cache and 80 cycles to access
main memory.
A) 8.45
B) 10.25
C) 12.5
D) 13.75
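
For the two AMAT questions above, a hedged sketch of the usual two-level formula is: AMAT = L1 time + L1 miss rate × (L2 time + L2 miss rate × memory time). This assumes a hierarchical access model and that the L2 hit rate is a local rate (measured on L1 misses); the paper does not say so explicitly. The name `amat_two_level` is illustrative.

```python
def amat_two_level(l1_hit_rate, l2_hit_rate, l1_time, l2_time, memory_time):
    """Average memory access time, in cycles, for an L1/L2/main-memory hierarchy."""
    # Assumption: a miss at one level adds the access time of the next level.
    l1_miss = 1 - l1_hit_rate
    l2_miss = 1 - l2_hit_rate
    return l1_time + l1_miss * (l2_time + l2_miss * memory_time)


print(round(amat_two_level(0.90, 0.95, 10, 20, 100), 2))  # 12.5
print(round(amat_two_level(0.85, 0.90, 5, 15, 80), 2))    # 8.45
```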

The size of cache memory is 2 MB and main CO4 C
memory size is 512 MB X 32. Each block of
cache memory contains 64 words. A 2-way
set associative mapping is used. What are the
sizes of the TAG, Block and Word Under
Block address fields?
A) 8, 10, 11
B) 6, 12, 11
C) 8, 15, 6
D) 15, 6, 8
The size of cache memory is 4 MB and main CO4 D
memory size is 2 GB X 32. Each block of
cache memory contains 256 words. A 2-way
set associative mapping is used. What are the
sizes of the TAG, Block and Word Under
Block address fields?
A) 14, 9, 8
B) 12, 10, 9
C) 9, 10, 12
D) 9, 14, 8
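
The address-field questions above can be explored with the sketch below. It rests on two assumptions that are not stated in the paper: memory and cache capacities are counted in words (so "512 MB X 32" is read as 2^29 words), and the Block field indexes individual cache blocks. That is the breakdown that reproduces the listed keys. A set-indexed breakdown for a 2-way cache would instead move one bit from the Block field to the TAG field. The name `address_fields` is illustrative.

```python
def address_fields(memory_words, cache_words, block_words):
    """Return (tag_bits, block_bits, word_bits) for a word-addressed memory.

    All three arguments are assumed to be powers of two.
    """
    # For a power of two n, n.bit_length() - 1 equals log2(n).
    address_bits = memory_words.bit_length() - 1
    word_bits = block_words.bit_length() - 1
    # Assumption: the Block field indexes cache blocks, not sets.
    block_bits = (cache_words // block_words).bit_length() - 1
    tag_bits = address_bits - block_bits - word_bits
    return tag_bits, block_bits, word_bits


M = 2 ** 20
G = 2 ** 30
print(address_fields(512 * M, 2 * M, 64))  # (8, 15, 6)
print(address_fields(2 * G, 4 * M, 256))   # (9, 14, 8)
```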

Q.No:5 In which architecture does the compiler CO3 C
find parallelism?
A) Super scalar
B) Super pipeline
C) VLIW
D) CISC
A simpler VLIW processor does not CO3 D
require:
A) Complex logic circuit
B) SSD slot
C) Computational register
D) Scheduling hardware
In which stage of the Tomasulo approach is CO3 A
the output dependency resolved?
A) Issue Stage
B) Read operand Stage
C) Execution Stage
D) Write result stage
The structure that holds instruction results in CO3 C
instruction execution sequence is known as the
A) Data buffer
B) control buffer
C) reorder buffer
D) ordered buffer

Q.No:6 If the MAR size is 31 bits and the MDR size CO6 A
is 32 bits, what is the size of RAM?

A) 2GB X 32
B) 4GB X 64
C) 8GB X 64
D) 4GB X 32
If the MAR size is 32 bits and the MDR size CO6 B
is 64 bits, what is the size of RAM?
A) 2GB X 32
B) 4GB X 64
C) 8GB X 64
D) 4GB X 32
If the MAR size is 33 bits and the MDR size CO6 C
is 64 bits, what is the size of RAM?

A) 2GB X 32
B) 4GB X 64
C) 8GB X 64
D) 4GB X 32
If the MAR size is 34 bits and the MDR size CO6 A
is 64 bits, what is the size of RAM?
A) 16GB X 64
B) 8 GB X 32
C) 8 GB X 64
D) 16GB X 32
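
The four MAR/MDR questions in Q.No:6 use the same rule: the MAR width fixes the number of addressable locations (2^MAR) and the MDR width fixes the word size. A small Python sketch, written in the paper's "locations X width" notation with G standing for 2^30 (the helper name `ram_size` is illustrative):

```python
def ram_size(mar_bits, mdr_bits):
    """Return (locations_in_G, word_width_bits) for the given register widths."""
    locations = 2 ** mar_bits  # one addressable location per MAR value
    return locations // 2 ** 30, mdr_bits


for mar, mdr in [(31, 32), (32, 64), (33, 64), (34, 64)]:
    giga_locations, width = ram_size(mar, mdr)
    print(f"MAR={mar}, MDR={mdr}: {giga_locations}G X {width}")
```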
Q.No:7 UMA and NUMA are two different CO5 B
architectures, distinguished on the basis of:
A) How the Control Unit is used
B) How the Primary Memory is shared
C) The Connection Network
D) The ISA
Which of the following statements is true CO5 B
about distributed shared memory
architecture?
A) DSM multiprocessors are UMA
B) DSM multiprocessors are NUMA
C) The communication between PEs happens
through memory.
D) The communication cost is low in case
of DSM
A new topology that could reduce the number of CO5 B
switches through which packets must travel is
referred to as
A) Crossbar Switch
B) Hop Count
C) Multi layer Switch
D) Network
A linear array is formed in each dimension CO5 C
by all the nodes, in the
A) Bus Topology
B) Ring Topology
C) Mesh topology
D) Torus Topology
SECTION-B (Answer Any Three Questions. Each Question carries 12 Marks)

Time: 1 Hour and 30 Minutes (3×12=36 Marks)

Question No    Question    CO Mapping (Each question should be from the same CO(s))
Q.No:8 I) State and explain Amdahl’s law and discuss its CO1
various aspects.

II) Suppose in a graphics processor, the ‘Floating Point CO1
Square Root’ (FPSQR) is responsible for 40% of the total
execution time. One proposal is to enhance the FPSQR
hardware and speed up this operation by a factor of 30. The
other alternative is to make all floating point instructions
in the graphics processor run faster by a factor of 2.4;
floating point operations are responsible for 50% of the
execution time for the application. Compare the speedup
for both the design alternatives.
I) What is Instruction Set Architecture? CO1

II) What is its significance in designing a computer CO1
system?
III) What are the different classes of ISA? CO1
I) Derive the overall speedup gained by Amdahl’s law. CO1
II) Assume that we make an enhancement to a CO1
computer that improves some mode of execution by a
factor of 30. Enhanced mode is used 70% of the time,
measured as a percentage of the execution time when
the enhanced mode is in use. Recall that Amdahl’s law
depends on the fraction of the original unenhanced
execution time that could make use of enhanced mode.
What is the speedup we have obtained from the fast
mode? What percentage of the original
execution time has been converted to fast mode?
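
The two Amdahl's law numericals in Q.No:8 can be checked with the sketch below. It uses the standard form, overall speedup = 1 / ((1 - f) + f / s), where f is the fraction of the original time that is enhanced and s is the enhancement factor. For the second numerical it assumes that the 70% is a fraction of the new (enhanced) execution time, as the question wording indicates, so the original time is rebuilt first. The function names are illustrative, not from the paper.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the ORIGINAL time is sped up."""
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def speedup_from_enhanced_time(fraction_of_new_time, factor):
    """Return (overall_speedup, fraction_of_original_time_converted).

    Assumption: fraction_of_new_time is measured on the enhanced run.
    """
    new_time = 1.0
    # Time the enhanced part would have taken before the enhancement.
    original_enhanced_part = fraction_of_new_time * new_time * factor
    old_time = (1 - fraction_of_new_time) * new_time + original_enhanced_part
    return old_time / new_time, original_enhanced_part / old_time


print(round(amdahl_speedup(0.40, 30), 3))   # FPSQR proposal: 1.63
print(round(amdahl_speedup(0.50, 2.4), 3))  # all-FP proposal: 1.412
overall, converted = speedup_from_enhanced_time(0.70, 30)
print(round(overall, 1), round(converted, 4))  # 21.3 0.9859
```

Under these assumptions the FPSQR enhancement gives the larger overall speedup (about 1.63 against about 1.41), and in the second numerical about 98.6% of the original execution time is converted to fast mode.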
Q.No:9 I) Discuss Flynn’s classification of multiprocessor CO5
architecture.

II) Consider the following sequence of instructions: CO2

Add R1, R2, R3
Mul #3, R4, R5
Sub R3, R2, R5
Add #20, R2, R5

Show the content of different stages at different clock
pulses in a 4 stage pipelined processor, by considering
the Space Time diagram. Show the content of inter-stage
buffers from clock pulse 3 to 8. The contents of registers
R1, R2 and R4 are 50, 75 and 90 respectively.
I) Explain Superscalar and Super pipelined architecture CO3
with proper diagrammatic representation.

II) Find out the total number of clock cycles required to CO2
execute the following instructions without and with
operand forwarding.

LD R1, 10(R2)
LD R1, 10(R3)
DSUB R5, R1, #20
SD R5, 0(R2)
DADDIU R2, R3, #4
DSUB R5, R3, R2

I) What are the different types of hazards that occur in a CO2
pipeline? Explain how the data forwarding technique is
useful in reducing data hazards.

II) A five stage linear pipeline processor has the stages IF, ID, CO2
EXE, MEM, WB. The IF, ID, WB stages take 1 clock
cycle each for any instruction. The execution of
different instructions takes more than the ideal time, as
given: 2 clock cycles for an ADD instruction, 2 clock
cycles for a SUB instruction, 4 clock cycles for MUL
instructions, 6 clock cycles for DIV instructions and 2
clock cycles for LOAD and STORE instructions
respectively.
Consider the following instructions:

ADD R4, R1, R2
STORE R3, (04)R2
SUB R5, R3, R2
LOAD R8, (02)R5
DIV R8, R7, R4
MUL R5, R8, R2

Draw the time space diagram for the above sequence of
instructions and find out the total number of clock cycles
required to complete their execution.
Q.No:10 I) Discuss MIPS architecture. Explain the data path in MIPS CO5
architecture in general and represent the data path for a
branch instruction with a proper diagram in particular.

II) Explain loop unrolling and perform loop unrolling for CO5
the following code fragment represented in a high level
language:
for (i = 0; i <= 100; i++)
{
    x[i] = x[i] - constant;
}
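
As an illustration of the technique asked for above, here is a minimal Python sketch of the same loop unrolled by a factor of 4. The unroll factor and the function name are assumptions chosen for the example, not part of the question. The loop runs over 101 elements (i = 0 to 100), so 25 unrolled iterations cover 100 elements and a clean-up loop handles the remaining one.

```python
def subtract_constant_unrolled(x, constant):
    """Apply x[i] = x[i] - constant to every element, unrolled by 4."""
    n = len(x)
    i = 0
    # Unrolled body: four updates per loop test and index increment.
    while i + 3 < n:
        x[i] -= constant
        x[i + 1] -= constant
        x[i + 2] -= constant
        x[i + 3] -= constant
        i += 4
    # Clean-up loop for the iterations left over when n is not a multiple of 4.
    while i < n:
        x[i] -= constant
        i += 1
    return x


data = list(range(101))  # indices 0..100, matching i <= 100
subtract_constant_unrolled(data, 5)
print(data[:3], data[100])  # [-5, -4, -3] 95
```

The unrolled form performs the same updates with roughly a quarter of the loop-control overhead, which is the point of the transformation. On a pipelined machine it also exposes independent operations for the scheduler.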

I) Explain loop level parallelism, the 2 different CO3
methods to achieve it and their differences.

II) What is Dynamic Scheduling and what are the 4 CO4
stages used in a scoreboard to perform Dynamic
Scheduling?
Perform the Dynamic Scheduling of the following code
with the Scoreboard technique:

L.D F2, 5(R3)
MUL.D F0, F2, F3
ADD.D F5, F3, F4
DIV.D F10, F6, F3

With the given information:

Functional unit            No. of FU    EX cycles
Integer                    1            1
Floating Point Multiply    2            4
Floating Point Add         1            2
Floating Point Divide      2            10

I) Draw and explain the different operational steps to execute CO3
instructions with the Tomasulo approach to achieve higher ILP.
II) Compare the AMAT of the following two multi CO6
level cache architectures.
Split cache: 16 KB instructions and 16 KB data with
miss rates of 2.5% and 8.6% respectively.
Unified cache: 32 KB (instructions + data) with a miss
rate of 3.8%.
The miss penalty is 50 cycles. The hit time is 1 cycle in
case of the split cache and 2 in case of the unified cache
because of LOAD and STORE instructions. 80%
of the total memory accesses are for instructions and 20%
of the total memory accesses are for data.
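
A hedged sketch of the comparison asked for above, using AMAT = hit time + miss rate × miss penalty, weighted by the 80/20 instruction/data access mix. One point in the question is open to interpretation. The sketch assumes that every unified-cache access pays the 2-cycle hit time, as the wording literally states. The function names are illustrative.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles for a single cache."""
    return hit_time + miss_rate * miss_penalty


def weighted_amat(instr_fraction, instr_amat, data_amat):
    """Combine instruction and data AMAT by their share of all accesses."""
    return instr_fraction * instr_amat + (1 - instr_fraction) * data_amat


PENALTY = 50
split = weighted_amat(0.80, amat(1, 0.025, PENALTY), amat(1, 0.086, PENALTY))
# Assumption: all unified-cache accesses see the 2-cycle hit time.
unified = weighted_amat(0.80, amat(2, 0.038, PENALTY), amat(2, 0.038, PENALTY))
print(round(split, 2), round(unified, 2))  # 2.86 3.9
```

If instead only data accesses are charged the 2-cycle hit time (instruction fetches hitting in 1 cycle), the unified figure becomes about 3.1 cycles. In both readings the split cache has the lower AMAT despite its higher combined miss rate.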
Q.No:11 I) What is the necessity of cache memory in a computer CO4
system? Discuss multilevel cache with a proper
architectural diagram. Find out the average memory
access time.
II) What are the different cache optimization techniques? CO6
Discuss in detail.

I) Differentiate between message passing models and CO5
distributed shared memory architecture. Explain cache
coherence in a symmetric shared memory multiprocessor
with an example.
II) Draw and explain both static and dynamic CUBE CO6
Interconnection Networks by taking 8 nodes into
account.
I) Explain the architecture of VLIW processors with a CO5
proper diagram and discuss the advantages and
disadvantages.
II) What is perfect shuffle and inverse perfect shuffle? CO6
Draw and explain an 8×8 Omega Network in detail.
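
For the shuffle question above, a minimal Python sketch of the usual definitions: the perfect shuffle maps a node address to its one-bit left rotation, and the inverse perfect shuffle maps it to its one-bit right rotation. With 8 nodes the addresses are 3 bits wide. The function names and the default `bits=3` are assumptions made for this example.

```python
def perfect_shuffle(node, bits=3):
    """Rotate the node address left by one bit (8 nodes use 3 address bits)."""
    mask = (1 << bits) - 1
    return ((node << 1) | (node >> (bits - 1))) & mask


def inverse_perfect_shuffle(node, bits=3):
    """Rotate the node address right by one bit."""
    return (node >> 1) | ((node & 1) << (bits - 1))


print([perfect_shuffle(n) for n in range(8)])          # [0, 2, 4, 6, 1, 3, 5, 7]
print([inverse_perfect_shuffle(n) for n in range(8)])  # [0, 4, 1, 5, 2, 6, 3, 7]
```

In an 8×8 Omega network built from 2×2 switches, the perfect shuffle wiring is repeated before each of the three switch stages.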
