Test 3
9600 bps. The processor is fetching instructions at the
rate of 1 million instructions per second (1 MIPS). By
how much is the processor slowed down due to the
DMA activity?
(A) 0.0012% (B) 0.01%
(C) 0.001% (D) 0.12%
13. Match list-A with list-B and select the correct answer
using the code given below the list:
List–A              List–B
a. Cache            1. Printer
b. DMA I/O          2. Disk
c. Interrupt I/O    3. High speed RAM

whenever needed, LRU replacement policy can be
used. What is the percentage of cache utilization for
Direct mapped, Associative and 2-way set-Associative
respectively, if the processor accesses the following
elements?
M[0,0], M[0,1], M[0,2], M[0,3], M[0,4], M[0,5], M[0,6], M[0,7],
M[1,0], M[1,1], M[1,2], M[1,3], M[1,4], M[1,5], M[1,6], M[1,7].
(A) 50%, 100%, 50% (B) 50%, 50%, 50%
(C) 100%, 100%, 100% (D) 100%, 100%, 50%
18. Consider the given program structure, which is in main
memory:
START 20
Dth element. For example, for D = 1, the program
accesses every element, for D = 2, the program accesses
every second element and so on. Assuming a cache that
is initially empty and a program that makes one pass over
this array with a scale of D, what is the miss rate
generated for D = 8?
21. A computer has 32-bit instructions and 12-bit addresses.
If there are 240 two-address instructions, how many
one-address operations can be formulated?
(A) 4096 (B) 65536
(C) 8192 (D) 131072
22. Consider a pipelined processor with a 5-stage pipeline.
Assume that all instructions take 5 cycles. The dynamic
instruction count by type, as a percentage of the total, is
as follows:
10% store instructions
20% load instructions
30% branch instructions
40% ALU instructions
What is the ideal speed-up due to pipelining for this
processor?
(A) 2.5 (B) 5
(C) 10 (D) 50
23. For the data given in Q. No. 22, let stalls due to data
hazards occur only for two reasons.
A stall of two cycles occurs when a load instruction is
followed by an ALU instruction that uses the result of
the load. This scenario exists for 40% of the load
instructions.
A stall of three cycles occurs when a branch instruction
is preceded by an ALU operation whose result is used
as a branch condition. This scenario exists for 50% of
the branch instructions. What is the decrease in the
ideal speed-up of pipelining due to data hazards alone?
(A) 50% (B) 75.6%
(C) 18.9% (D) 37.8%
24. Given
x = (0100 0110 1101 1000 0000 0000 0000 0000)₂ and
y = (1011 1110 1110 0000 0000 0000 0000 0000)₂,
representing single precision IEEE 754 floating point
numbers. Then the respective values of x + y and x * y
in decimal (approximately) are:
(A) 27647.5625 and –12096
(B) 27647.5625 and –24192
(C) 13823.75 and 24192
(D) 13823.75 and –24192
25. A 5-stage pipeline has the following stages:
IF: Instruction fetch
ID: Instruction decode and register file read
EX: Execution or address calculation
MEM: Data memory access
WB: Write back
The following code is executed on this pipeline:
Instruction Operation
Use forwarding to resolve data hazards. Then the number
of stalls that will occur because of data hazards in the
given code is _____.
(A) 0 (B) 1
(C) 2 (D) 3
26. Consider a floating point representation c × r^e, where c
represents the coefficient register of size 10, in which the MSB
is used to represent sign, r represents the radix and e
represents the contents of the exponent register of size 5, in
which the MSB is used to represent sign. Then the contents
of the coefficient and exponent registers for the number
+1001.110 will be:
(A) 0100111000, 00100 (B) 0001001110, 00100
(C) 1001110000, 10100 (D) 1001110000, 10101
27. Consider a 32-bit microprocessor, with a 16-bit external
data bus, driven by an 8 MHz input clock. Assume
that this microprocessor has a bus cycle whose minimum
duration equals four input clock cycles. What is
the maximum data transfer rate across the bus that this
microprocessor can sustain, in bytes/sec?
(A) 2 MB/sec (B) 4 MB/sec
(C) 6 MB/sec (D) 8 MB/sec
28. Consider a bus structure in which a single internal bus
connects the ALU and all processor registers.
Which of the following represents the correct sequence
of micro-operations to add a number to the accumulator
when the number is an indirect address operand?
(A) t1: MAR ← (IR(address))
    t2: MBR ← memory
    t3: Y ← (MBR)
    t4: Z ← (AC) + (Y)
    t5: AC ← (Z)
(B) t1: MAR ← (IR(address))
    t2: MBR ← memory
    t3: Z ← (AC) + (MBR)
    t4: AC ← (Z)
(C) t1: MBR ← (IR(address))
    t2: Z ← (AC) + (MBR)
    t3: AC ← (Z)
(D) t1: MAR ← (IR(address))
    t2: MBR ← memory
    t3: MAR ← (MBR)
    t4: MBR ← memory
Computer Organization and Architecture Test 3 | 3.29
Answer Keys
1. B 2. B 3. C 4. C 5. A 6. C 7. C 8. D 9. B 10. C
11. A 12. D 13. C 14. C 15. A 16. A 17. C 18. D 19. B 20. B
21. B 22. B 23. D 24. A 25. B 26. A 27. B 28. D 29. D 30. C
31. D 32. C 33. D 34. D 35. A
In the given problem, the loop is taken 100 times, but
the 101st time the loop is not taken.
This exit will be incorrectly predicted by the 1-bit branch
history table, as the loop was taken the previous 100 times.
In the 1st iteration also the predictor gives an incorrect
branch prediction, as the bit was set to ‘not taken’ at the
exit of the last execution of the program.
∴ 2 wrong predictions, out of 101 predictions.
∴ Prediction accuracy percentage
= (99/101) × 100 = 98% Choice (C)
5. In Double precision floating point format,
Number of bits for exponent = 11
∴ Bias of the exponent = 2^(k–1) – 1,
where k is the number of bits used for exponent.
Here k = 11, hence bias
= 2^10 – 1 = 1023. Choice (A)
6. If we assume R2 as Base register, R3 as Index register,
then the mode will be Base with index mode.
Choice (C)
12. The DMA is transmitting at a rate of 9600 bits per
second, i.e., it is transmitting 9600/8 = 1200 characters
per second. The processor is processing at a rate of
1 million instructions per second, i.e., it will take
1/10^6 s = 1 µs to process a single instruction.
A single character will be processed by DMA in
1/1200 s ≈ 833 µs
∴ Slow down of processor due to DMA
= (1/833) × 100 = 0.12% Choice (D)
13. Cache is high speed RAM, DMA I/O is used with disk,
Interrupt I/O is used with printer. Choice (C)
14. Without pipelining, execution time = pq
With pipelining, execution time = p + q – 1
∴ speed up = pq/(p + q – 1) Choice (C)
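The arithmetic of solution 12 can be checked with a short Python sketch. It assumes, as the solution implicitly does, that each character transferred by the DMA costs the processor exactly one instruction time (cycle stealing); the function name and parameters are illustrative and not part of the original problem.

```python
def dma_slowdown_percent(line_rate_bps: int, bits_per_char: int,
                         instr_per_sec: int) -> float:
    """Percentage of instruction slots lost to DMA cycle stealing.

    Assumption: every transferred character steals one instruction time.
    """
    chars_per_sec = line_rate_bps / bits_per_char   # 9600 / 8 = 1200
    return chars_per_sec * 100 / instr_per_sec      # stolen slots as a percentage


slowdown = dma_slowdown_percent(9600, 8, 1_000_000)
print(slowdown)  # 0.12
```

Under that assumption the result agrees with Choice (D).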
7. Number of bits in opcode = 4
⇒ Number of operations possible = 2^4 = 16
Number of bits in src/dest. Registers = 8
∴ Number of registers = 2^8 = 256 Choice (C)
8. Each DRAM has 4M addresses of 16-bit words.
⇒ DRAM capacity = 4M × 16
Memory system capacity = 128M × 256
∴ Number of chips required = (128M × 256)/(4M × 16) = 512
Choice (D)
9. Instruction size = 64-bits
Opcode size = 2B = 16-bits
Operand/Address field = 64 – 16 = 48-bits.
The 48-bits can be used to specify a particular address.
∴ Maximum directly addressable memory = 2^48
Choice (B)
15. IEEE 32-bit floating point representation will be in the
form of:
Sign bit    Biased exponent    Fraction
1 bit       8 bits             23 bits
6 = (110)₂
For –6 sign bit is 1.
6 = (110)₂ = 1.10 × 2^(010)
Exponent = (010)₂ = 2
Biased exponent = 127 + 2 = 129 = 10000001
Fraction = 10000000000000000000000
Hence IEEE 32-bit floating point representation of
–6 is:
1 10000001 10000000000000000000000 Choice (A)
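The field-by-field construction in solution 15 can be cross-checked against Python's standard `struct` module. This is only a sketch: the helper name `encode_single` is illustrative, and the hand-built fields simply mirror the steps of the solution.

```python
import struct


def encode_single(value: float) -> str:
    """Return the IEEE 754 single-precision pattern as 'sign exponent fraction'."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    bits = f"{word:032b}"
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


# Fields built by hand, following the steps of solution 15.
sign = "1"                            # the number is negative
biased_exponent = f"{127 + 2:08b}"    # 6 = 1.10 x 2^2, bias 127
fraction = "1" + "0" * 22             # bits after the binary point of 1.10
by_hand = f"{sign} {biased_exponent} {fraction}"

print(by_hand)              # 1 10000001 10000000000000000000000
print(encode_single(-6.0))  # 1 10000001 10000000000000000000000
```

Both lines print the same pattern, which matches the representation given for Choice (A).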
10. Number of lines in cache = 32
There are 32 lines in cache. Each line will have 16 bytes.
∴ Total bytes of memory in cache
= 16 × 32 = 512 bytes Choice (C)
11. The DRAM has given a refresh cycle 64 times per ms.
Time required for one refresh operation = 150 ns
In 1 ms, the time required to refresh is 64 × 150 ns
= 9600 ns
∴ The fraction of time devoted to memory refresh is
(9600 × 10^–9)/(10^–3) = 0.0096
∴ Approximate percentage of the memory’s total
operating time given to refreshes is 1%.
Choice (A)
16. In given code stalls occur before the ADD instructions.
2 stalls are there in given code. To minimize stalls, we
reorder the code. In the reordered code place LOAD R4,
M[1004] before ADD R3, R1, R2.
The resultant code will be:
LOAD R1, M[1000]
LOAD R2, M[1002]
LOAD R4, M[1004]
ADD R3, R1, R2
ADD R5, R1, R4
STORE R3, M[1008]
STORE R5, M[1010]
No stalls in the reordered code. Choice (A)
17. The 2 × 8 array M is stored in column-major order in the
main memory, i.e.,
[Figure: main memory locations 2000 to 2015 hold M[0,0], M[1,0], M[0,1], M[1,1], ..., M[0,7], M[1,7], forming two-word blocks B0 to B7; the cache memory has lines L0 to L7 holding words w0 to w15.]
In direct mapping, B0 is placed in L0, B1 in L1, B2 in
L2, ..., B7 in L7 to access the elements M[0,0] – M[0,7].
The remaining elements are already in cache. All those
accesses will be hits.
As all the 8-blocks used, cache utilization is 100%.
In Associative mapping, each block of main memory
will be placed at anywhere in the cache lines. Also
8-accesses will be misses and the remaining will be
hits. No need of replacement and all lines of cache will
be used.
∴ Cache utilization = 100%
In 2-way set-Associative mapping, two blocks will
be treated as a single set. There will be 4-sets.
Cache memory
L0 L1    Set 0
L2 L3    Set 1
L4 L5    Set 2
L6 L7    Set 3
All the four sets are used. Set 0 consists B0, B4, set 1
contains B1, B5. Like this all sets are used. Hence cache
utilization is 100%. Choice (C)
18. Given main memory size = 64 K = 2^16 words
Block size = 128 = 2^7 words
Cache size = 1 K = 2^10 words
Word field size = 7
Number of lines = 2^10/2^7 = 2^3
∴ Line field size = 3
⇒ Tag = 16 – (7 + 3) = 6
The cache memory is shown below with main memory
block addresses.
Line    Main memory addresses
0       0–127, 1024–1151
1       128–255, 1152–1279
2       256–383, 1280–1407
3       384–511, 1408–1535
4       512–639
5       640–767
6       768–895
7       896–1023
[Figure: program structure in main memory, marked with the addresses 20 (start), 26, 168, 242, 1203 and 1503 (end).]
Hence the sequence of reads from the main memory
blocks into cache line is:
Line: 0, 1, 2, 3, 4, 5, 6, 7, 0, 1 (pass 1 of outer loop), 0, 1, 0, 1 (pass 2), ..., 0, 1, 0, 1, 2, 3 (pass 10)
i.e., in pass1 of outer loop the lines 0, 1, 2, 3, 4, 5, 6,
7, 0, 1 will be accessed. In pass 2 0, 1 are accessed for
(0 – 127), (128 – 255) and again 0, 1 are accessed for
(1024 – 1151), (1152 – 1279).
In last pass 0, 1, 0, 1, 2, 3 lines will be accessed.
∴ Total time for reading the blocks of main memory
into the cache
= (10 + 9 × 4 + 2) × 128 × 10
= 61440 n sec. Choice (D)
19. Let the number of instruction be 100.
Without optimization, time required to execute 100
Instructions = 100 + 10 × 2 = 100 + 20 = 120
With optimization time required
= (120 – 0.75 × 10 – 0.20 × 10) = 110.5
∴ Improvement using optimization
= 120/110.5 = 1.086. Choice (B)
20. Number of lines in cache = 128
As the cache is a 4-way set associative, each set
contains 4-blocks.
∴ Number of sets = 128/4 = 32
So there will be 32 misses.
But the program accesses every 8th word.
∴ Number of misses = 32/8 = 4
Hence miss rate = 1/4 Choice (B)
Exponent = 125 – 127 = –2
Mantissa = 1.110 0000 0000 0000 0000 0000
∴ y = –1.110 0000 0000 0000 0000 0000 × 2^–2
= –0.4375
x + y = 27648 – 0.4375 = 27647.5625
x * y = 27648 × (–0.4375) = –12096 Choice (A)
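The decoding used in solution 24 can be verified with a minimal Python sketch that reinterprets each bit pattern from the question as an IEEE 754 single-precision value. The helper name `bits_to_float` is illustrative only.

```python
import struct


def bits_to_float(bits: str) -> float:
    """Interpret a 32-bit pattern (spaces allowed) as an IEEE 754 single."""
    word = int(bits.replace(" ", ""), 2)
    return struct.unpack(">f", word.to_bytes(4, "big"))[0]


x = bits_to_float("0100 0110 1101 1000 0000 0000 0000 0000")
y = bits_to_float("1011 1110 1110 0000 0000 0000 0000 0000")

print(x, y)   # 27648.0 -0.4375
print(x + y)  # 27647.5625
print(x * y)  # -12096.0
```

The sum and product agree with Choice (A).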
25. Given code:
I1: ADD R1, R2, R3
I2: SUB R4, R1, R5
I3: LOAD R6, 200(R1)
I4: ADD R7, R1, R6
I2, I3, I4 are dependent on I1. I4 dependent on I3.
The execution chart is shown below:
      1    2    3    4    5      6    7    8    9    10
I1    IF   ID   EX   MEM  WB
I2         IF   ID   EX   MEM    WB
I3              IF   ID   EX     MEM  WB
I4                   IF   Stall  ID   EX   MEM  WB
21. Instruction size = 32-bits
Two address instruction format will be
opcode    address1    address2
8         12          12
There will be 2^8 possible combinations of operations.
Two address instructions = 240
Operations for single address instructions
= 256 – 240 = 16
Single address instruction format will be
opcode address