

CHAPTER

1
Basics of COA

1. [MCQ] [GATE-2023 : 2M]
Consider the IEEE-754 single precision floating point numbers P = 0xC1800000 and Q = 0x3F5C2EF4. Which one of the following corresponds to the product of these numbers (i.e., P × Q), represented in the IEEE-754 single precision format?
(a) 0x404C2EF4 (b) 0x405C2EF4
(c) 0xC15C2EF4 (d) 0xC14C2EF4

2. [MCQ] [GATE-2014 : 2M]
The value of a float type variable is represented using the single-precision 32-bit floating point format of the IEEE-754 standard that uses 1 bit for sign, 8 bits for biased exponent and 23 bits for mantissa. A float type variable X is assigned the decimal value of −14.25. The representation of X in hexadecimal notation is
(a) C1640000H (b) 416C0000H
(c) 41640000H (d) C16C0000H

3. [MCQ] [GATE-2012 : 1M]
The amount of ROM needed to implement a 4-bit multiplier is
(a) 64 bits (b) 128 bits
(c) 1 Kbits (d) 2 Kbits

Registers and its Types

4. [MCQ] [GATE-2010 : 2M]
The program below uses six temporary variables a, b, c, d, e, f.
a = 1
b = 10
c = 20
d = a + b
e = c + d
f = c + e
b = c + e
e = b + f
d = 5 + e
return d + f
Assuming that all operations take their operands from registers, what is the minimum number of registers needed to execute this program without spilling?
(a) 2 (b) 3
(c) 4 (d) 6

5. [MCQ] [GATE-2008 : 1M]
A processor that has carry, overflow and sign flag bits as part of its program status word (PSW) performs addition of the following two 2's complement numbers 01001101 and 11101001. After the execution of this addition operation, the status of the carry, overflow and sign flags, respectively, will be:
(a) 1, 1, 0 (b) 1, 0, 0
(c) 0, 1, 0 (d) 1, 0, 1

6. [MCQ] [GATE-2008 : 2M]
The use of multiple register windows with overlap causes a reduction in the number of memory accesses for:
1. function locals and parameters
2. register saves and restores
3. instruction fetches
(a) 1 only (b) 2 only
(c) 3 only (d) 1, 2 and 3
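
The two floating-point items in this set (Q1 and Q2) can be sanity-checked with Python's struct module. This is only a verification sketch added here, not part of the original book; the helper names to_hex and from_hex are our own.

import struct

def to_hex(x):
    # big-endian IEEE-754 single-precision bit pattern of x
    return struct.pack('>f', x).hex().upper()

def from_hex(h):
    return struct.unpack('>f', bytes.fromhex(h))[0]

p = from_hex('C1800000')     # -16.0
q = from_hex('3F5C2EF4')     # roughly 0.86
print(to_hex(p * q))         # C15C2EF4 -> Q1, option (c)
print(to_hex(-14.25))        # C1640000 -> Q2, option (a)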


Types of Buses

7. [MCQ] [GATE-2019 : 1M]
The chip select logic for a certain DRAM chip in a memory system design is shown below. Assume that the memory system has 16 address lines denoted by A15 to A0. What is the range of addresses (in hexadecimal) of the memory system that can get enabled by the chip select (CS) signal?
(a) C800 to CFFF (b) CA00 to CAFF
(c) C800 to C8FF (d) DA00 to DFFF

Instruction Set Architecture

8. [MCQ] [GATE-2023 : 2M]
Consider the given C-code and its corresponding assembly code, with a few operands U1–U4 being unknown. Some useful information as well as the semantics of each unique assembly instruction is annotated as inline comments in the code. The memory is byte-addressable.

//C-code              ;assembly-code (; indicates comments)
                      ;r1-r5 are 32-bit integer registers
                      ;initialize r1=0, r2=10
                      ;initialize r3, r4 with base address of a, b
int a[10],            L01: jeq r1, r2, end  ;if(r1==r2) goto end
b[10], i;             L02: lw r5, 0(r4)     ;r5 <- Memory[r4+0]
// int is             L03: shl r5, r5, U1   ;r5 <- r5 << U1
32-bit                L04: sw r5, 0(r3)     ;Memory[r3+0] <- r5
for (i=0;             L05: add r3, r3, U2   ;r3 <- r3+U2
i<10;i++)             L06: add r4, r4, U3
a[i] = b[i]           L07: add r1, r1, 1
* 8;                  L08: jmp U4           ;goto U4
                      L09: end

Which of the following options is a correct replacement for the operands in the positions (U1, U2, U3, U4) in the above assembly code?
(a) (8, 4, 1, L02) (b) (3, 4, 4, L01)
(c) (8, 1, 1, L02) (d) (3, 1, 1, L01)

9. [NAT] [GATE-2021 : 2M]
Consider the following instruction sequence where registers R1, R2 and R3 are general purpose and MEMORY[X] denotes the content at the memory location X.

Instruction       Semantics                         Instruction Size (bytes)
MOV R1, (5000)    R1 ← MEMORY[5000]                 4
MOV R2, (R3)      R2 ← MEMORY[R3]                   4
ADD R2, R1        R2 ← R1 + R2                      2
MOV (R3), R2      MEMORY[R3] ← R2                   4
INC R3            R3 ← R3 + 1                       2
DEC R1            R1 ← R1 – 1                       2
BNZ 1004          Branch if not zero to the given   2
                  absolute address
HALT              Stop                              1

Assume that the content of the memory location 5000 is 10, and the content of the register R3 is 3000. The content of each of the memory locations from 3000 to 3010 is 50. The instruction sequence starts from the memory location 1000. All the numbers are in decimal format. Assume that the memory is byte addressable.
After the execution of the program, the content of memory location 3010 is _____.

10. [NAT] [GATE-2020 : 2M]
A processor has 64 registers and uses a 16-bit instruction format. It has two types of instructions: I-type and R-type. Each I-type instruction contains an opcode, a register name, and a 4-bit immediate value. Each R-type instruction contains an opcode and two register names. If there are 8 distinct I-type opcodes, then the maximum number of distinct R-type opcodes is ______.


11. [NAT] [GATE-2018 : 2M]
A processor has 16 integer registers (R0, R1, …, R15) and 64 floating point registers (F0, F1, …, F63). It uses a 2-byte instruction format. There are four categories of instructions: Type-1, Type-2, Type-3 and Type-4. Type-1 category consists of four instructions, each with 3 integer register operands (3Rs). Type-2 category consists of eight instructions, each with 2 floating point register operands (2Fs). Type-3 category consists of fourteen instructions, each with one integer register operand and one floating point register operand (1R+1F). Type-4 category consists of N instructions, each with a floating-point register operand (1F).
The maximum value of N is ________.

12. [MCQ] [GATE-2018 : 1M]
The following are some events that occur after a device controller issues an interrupt while process L is under execution.
(P) The processor pushes the process status of L onto the control stack.
(Q) The processor finishes the execution of the current instruction.
(R) The processor executes the interrupt service routine.
(S) The processor pops the process status of L from the control stack.
(T) The processor loads the new PC value based on the interrupt.
Which one of the following is the correct order in which the events above occur?
(a) QPTRS (b) PTRSQ
(c) TRPQS (d) QTPRS

13. [NAT] [GATE-2016 : 2M]
Consider a processor with 64 registers and an instruction set of size twelve. Each instruction has five distinct fields, namely, opcode, two source register identifiers, one destination register identifier, and a twelve-bit immediate value. Each instruction must be stored in memory in a byte-aligned fashion. If a program has 100 instructions, the amount of memory (in bytes) consumed by the program text is _____.

14. [NAT] [GATE-2016 : 1M]
A processor has 40 distinct instructions and 24 general purpose registers. A 32-bit instruction word has an opcode, two register operands and an immediate operand. The number of bits available for the immediate operand field is _____.

15. [MCQ] [GATE-2015 : 1M]
For computers based on three-address instruction formats, each address field can be used to specify which of the following:
S1: A memory operand
S2: A processor register
S3: An implied accumulator register
(a) Either S1 or S2 (b) Either S2 or S3
(c) Only S2 and S3 (d) All of S1, S2 and S3

16. [MCQ] [GATE-2015 : 2M]
Consider a processor with byte-addressable memory. Assume that all registers, including Program Counter (PC) and Program Status Word (PSW), are of size 2 bytes. A stack in the main memory is implemented from memory location (0100)16 and it grows upward. The stack pointer (SP) points to the top element of the stack. The current value of SP is (016E)16. The CALL instruction is of two words, the first word is the op-code and the second word is the starting address of the subroutine (one word = 2 bytes). The CALL instruction is implemented as follows:
• Store the current value of PC in the stack.
• Store the value of PSW register in the stack.
• Load the starting address of the subroutine in PC.
The content of PC just before the fetch of a CALL instruction is (5FA0)16. After execution of the CALL instruction, the value of the stack pointer is
(a) (016A)16 (b) (016C)16
(c) (0170)16 (d) (0172)16

17. [NAT] [GATE-2014 : 2M]
A machine has a 32-bit architecture, with 1-word long instructions. It has 64 registers, each of which is 32 bits long. It needs to support 45 instructions,


which have an immediate operand in addition to two register operands. Assuming that the immediate operand is an unsigned integer, the maximum value of the immediate operand is ________.

18. [MCQ] [GATE-2011 : 2M]
Consider evaluating the following expression tree on a machine with load-store architecture in which memory can be accessed only through load and store instructions. The variables a, b, c, d and e are initially stored in memory. The binary operators used in this expression tree can be evaluated by the machine only when the operands are in registers. The instructions produce results only in a register. If no intermediate results can be stored in memory, what is the minimum number of registers needed to evaluate this expression?
(a) 2 (b) 9
(c) 5 (d) 3

19. [MCQ] [GATE-2009 : 1M]
A CPU generally handles an interrupt by executing an interrupt service routine
(a) As soon as an interrupt is raised.
(b) By checking the interrupt register at the end of fetch cycle.
(c) By checking the interrupt register after finishing the execution of the current instruction.
(d) By checking the interrupt register at fixed time intervals.

20. [MCQ] [GATE-2008 : 2M]
Which of the following must be true for the RFE (Return from Exception) instruction on a general-purpose processor?
1. It must be a trap instruction
2. It must be a privileged instruction
3. An exception cannot be allowed to occur during execution of an RFE instruction.
(a) 1 only
(b) 2 only
(c) 1 and 2 only
(d) 1, 2 and 3 only

Addressing Modes

21. [MCQ] [GATE-2017 : 1M]
Consider the C struct defined below:
struct data {
    int marks[100];
    char grade;
    int cnumber;
};
struct data student;
The base address of student is available in register R1. The field student.grade can be accessed efficiently using
(a) Post-increment addressing mode, (R1)+
(b) Pre-decrement addressing mode, –(R1)
(c) Register direct addressing mode, R1
(d) Index addressing mode, X(R1), where X is an offset represented in 2's complement 16-bit representation

22. [MCQ] [GATE-2011 : 1M]
Consider a hypothetical processor with an instruction of type LW R1, 20(R2), which during execution reads a 32-bit word from memory and stores it in a 32-bit register R1. The effective address of the memory location is obtained by the addition of a constant 20 and the contents of register R2. Which of the following best reflects the addressing mode implemented by this instruction for the operand in memory?
(a) Immediate Addressing
(b) Register Addressing
(c) Register Indirect Scaled Addressing
(d) Base Indexed Addressing


23. [MCQ] [GATE-2008 : 2M]
Which of the following is/are true of the auto-increment addressing mode?
1. It is useful in creating self-relocating code.
2. If it is included in an Instruction Set Architecture, then an additional ALU is required for effective address calculation.
3. The amount of increment depends on the size of the data item accessed.
(a) 1 only (b) 2 only
(c) 3 only (d) 2 and 3 only

24. [MCQ] [GATE-2008 : 1M]
Assume that EA = (X)+ is the effective address equal to the contents of location X, with X incremented by one word length after the effective address is calculated; EA = –(X) is the effective address equal to the contents of location X, with X decremented by one word length before the effective address is calculated; EA = (X)– is the effective address equal to the contents of location X, with X decremented by one word length after the effective address is calculated. The format of the instruction is (opcode, source, destination), which means (destination ← source op destination). Using X as a stack pointer, which of the following instructions can pop the top two elements from the stack, perform the addition operation and push the result back to the stack?
(a) ADD (X)–, (X)
(b) ADD (X), (X)–
(c) ADD –(X), (X)+
(d) ADD –(X), (X)

❑❑❑


1. (c) 2. (a) 3. (d) 4. (b)


5. (b) 6. (a) 7. (a) 8. (b)
9. (50 to 50) 10. (14 to 14) 11. (32 to 32) 12. (a)
13. (500 to 500) 14. (16 to 16) 15. (a) 16. (d)
17. (16383 to 16383) 18. (d) 19. (c) 20. (d)
21. (d) 22. (d) 23. (c) 24. (a)

1. (c) 0 011 1111 0 101 1100 0010 1110 1111 0100


P = Ox C1800000 Q = Ox 3F5C2EF4 Sign E(8bit) Mantissa (23bit)
1bit 8bit 23bit 1bit
S E M Sign = 0 (+ve) E = 126 bias = 127
Bias = 28−1 − 1 bias = 127 E = 01111110  E = 126 e = E – bias
P = Ox C1800000 BE or E = 126 126 – 127
M = 101 1100 0010 1110 1111 0100.
1 100 0001 1 000 0000 0000 0000 0000 0000
Bias = 127
Sign E(8bit) Mantissa (23bit)
E = e + bias
1bit
e = E – bias
S = 1(–ve)
(–1)s 1.M × 2e
E = 10000011 = 131 (–1)0 1.101 1100 0010 1110 1111 0100 ×2126 – 127
BE or E = 131 Q = (1.101 1100 0010 1110 1111 0100) × 2–1
M = 00000000 Sign = –ve.
S e
(–1) 1.M × 2 P × Q = exponent = (+4) + (–1) = +3.
(–1)1 1.00000000 × 2131 – 127 Mantissa
+4
P = –(1.00000000) × 2 = (1.0000) *(1.101 1100 0010 1110 1111 0100)
BE = AE + bias – (1.101 1100 0010 1110 1111 0100) × 2 + 3
Or Sign = 1(–ve)
E = e + bias e = +3
e = E – bias bias = 127
E = 131 E = e + bias
= 3 + 127
Bias = 127
E = 130
Q = 3F5C2EF4
E = 10000010


e = +3 bias = 127
E or BE = e + bias  3 + 127
= 130
E = 130
(C15C2EF4 H)
M = 10111 → 10000010
2nd Approach.
Alternate Approach. P = C1800000 S = 1 –ve

S = 1(–ve) (–1)S 1.M × 2e


(C15C2EF4)
E = 10000011 (–1)1 1.000000 × 2
BE or E = 131 – 1.000000 × 2+4
M = 00000000 – 10000.00
8–1
Bias = 2 –1 P = – 16
Bias = 127
2. (a)
E = e + bias
– 14.25
e = E – bias
 1110.01
Q = 3F5C2EF4
 –1110.01
 – 1.11001×2+3
sign = 1 [–ve]
S = 0 (+ve) Mantissa [m] = 11001
[AE]e = +3
E = 01111100
BE or E = 126 [BE] E = e + bias
M = 10111000010 … Q = 1.101 1100 × 2 – 1 BE = AE + bias
Bias = 129 = 0.1101 1100 E = 3+bias = 3 +127
E = e + bias Q = 0.8593
E = 130
e = E – bias
BE = 130
(–1)S 1.M × 2e
(–1)0 1.101 1100 0010 1110 × 2126–127 IEE E 754
Single Precision
P * Q = – 16 × .8593
= – (13.75)
P * Q = – 13.75
– 1101.11
 – 1.10111 × 2 + 3 bias = 28–1 –1 = 127


bias = 127 4. (b)

S=1 R1 = 1(a)
R2 = 10(b)
E = 130  10000010 R3 = 20(c)
M : 11001
I R1  R1 + R2 [R1[d] = a + b]
II R1  R3 + R1 [R1[e] = c + d]
III R2  R3 + R1 [R2[f] = c + e]
IV R2  R3 + R1 [R2[b] = c + e]
V R1  R2 + R2 [R1[e] = b + f]
[C1640000H] VI R3  R1 + 5 R3[d] = 5 + e
VII return R2 + R3
Minimum 3 register required
OR
a = 1, b = 10, c = 20
R1  R1 + R2 I. d = a + b
3. (d) R1  R3 + R1 II. e = c + d
When we multiply two 4 bit number then each result R2  R3 + R1 III. f = c + e
is 8 bit R3  R3 + R1 IV b = c + e
4 bit 4 bit R1  R2 + R3 V e=b+f
R3  R1 + 5 VI d = 5 + e
↓ ↓ return R2 + R3 VIII return d + f
Combination 24 24
Total ROM size = 24 × 24 × 8 bit
⇒ 28 × 8 bit
⇒ 28 × 23 bit ⇒ 211 bit
5. (b)
⇒ 2 × 210 bit
Cin
= 2k bits Cout Cin
01001101
+11101001
00110110
Overflow
C in + cout = 1
1+1=0
Carry = 1, Sign = 0
Overflow flag = 0
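
As a quick cross-check of the flag values above, the same 8-bit addition can be reproduced in a few lines of Python (a verification sketch, not part of the original solution):

a, b = 0b01001101, 0b11101001
total = a + b
carry = (total >> 8) & 1               # carry out of bit 7
result = total & 0xFF
sign = (result >> 7) & 1               # MSB of the 8-bit result
# signed overflow: both operands share a sign that differs from the result's sign
overflow = int(((a ^ result) & (b ^ result) & 0x80) != 0)
print(carry, overflow, sign)           # 1 0 0 -> option (b)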


6. (a) U2 = 4
1. Function, Local parameter, memory Access  U3 = 4
L08 Jmp U4  U4 = L01
Because at L01 condition checking.

7. (a)
A15, A14, A11 is enabled (1), A13 and A12 = 0

9. (50 to 50)

Instruction Semantics Instruction


C800 to CFFF Size
(bytes)
1000 MOV R1, R1  4 R1 = 10
–1003 (5000) I1 MEMORY
[5000]
8. (b) 1004 MOV R2, R2  4 R2 = 50
U1 = 3 multiply by 8. –1007 (R3) I2 MEMORY
U2 = 4 32 bit given [R3] R2
U3 = 4 memory is byte (8bit) Addressable m[3000]
U4 = L01 1008 ADD R2, R2  R1 + 2 R2 = 10 +
–1009 R1 I3 R2, R2 10 50 = R2 =
r1 = 0 r2 = 10
+ 50 60
r3 [a]
1010 MOV (R3), MEMORY 4 M [3000]
r4 [b] –1013 R2 I4 [R3]  R2 = 60
r5  m [b0 + 0] m[3000] 
U1 = 3 R2
M [a0 + 0] 1014 INC R3 I5 R3  R3 + 1, 2 R3 = 3000
U2 = 4
–1015 R3 = 3000+1 + 1  R3
= 3001
U3 = 4
1016 DEC R1 I6 R1  R1 – 1, 2 R1 = 10 –
L03 : Multiple by 8 then left shift by 3 bit –1017 R1 = 10–1 1  R1 =
U3 memory is Byte addressable but size is 32 bit 9
32 bit  4 Byte 1018 BNZ 1004 Branch if not 2 2
–1019 I7 zero to the
given
absolute
address
1020 HALT I8 Stop 1 Go to
1004 (I2)


M [5000] = 10 64 register ⇒ Register AF=6 bit


R3 = 3000
3000 50 60
3001 50 59
3002 50 58
3003 50 57
3004 50 56 Total number of operation in R type = 24 = 16
operatin
3005 50 55
3006 50 54 Assume R type instruciton = x
3007 50 53 Number of free opcode after allocating R type =
3008 50 52 (16 – x)
3009 50 51
3010 50

R2  m [3001] = R 2 = 50

R2  9 + 50 = R 2 = 59
M [R3]  R2 → m [3001] = 59
R3  3001 + 1  R3 = 3002
R1  9 – 1  R1 = 8
2nd Part Total number of operations in I type = free opcode
R2 m [3003] R2 = 50 × 2 increment bit in opcode
R2 R1 + R2  7 + 50  R 2 = 57 = (16 – x) × 26–4 ⇒ (16 – x) × 22
M [3003] = 57 I type (Given) = 8
R3 = 3003 + 1  R 3 = 3004 8 = (16 – x) × 4
2 = 16 – x
Simiarly execute
x = 16 – 2
R1 = 7  R1 = 6
x = 14
In the last R1 = 0 so, m [3010] will not change
[update]
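
The trace above can also be checked mechanically. Below is a small Python sketch of the loop in Q9 (register and memory names follow the question; this is an added check, not the book's own working):

mem = {5000: 10}
for addr in range(3000, 3011):      # M[3000..3010] = 50
    mem[addr] = 50

R1 = mem[5000]                      # MOV R1, (5000)
R3 = 3000
while True:
    R2 = mem[R3]                    # MOV R2, (R3)
    R2 = R1 + R2                    # ADD R2, R1
    mem[R3] = R2                    # MOV (R3), R2
    R3 += 1                         # INC R3
    R1 -= 1                         # DEC R1
    if R1 == 0:                     # BNZ 1004 falls through when R1 == 0
        break

print(mem[3010])                    # 50: only locations 3000..3009 were updated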

11. (32 to 32)


16 Integer Register (R0, R1 R2….. R15)
IR = 4 bit
10. (14 to 14) 64 floating point Register (F0, F1 ….. F63)
FR = 6 bit
TYPE-1: OP CODE IR IR IR
4 Instruction


Given: Type 3 (given) = 14 Instruction


TYPE-2: OP CODE FR FR Total number of free opcodes after allocating = 16
8 Instruction – 14
64 floating point Register (F0, F1, F2 ……F64) Type 3 instruction = 2 Free opcodes

FR= 6 bit
TYPE -4
Instruction size = 16 bit (2 byte)
TYPE: 3 OP CODE IR FR
Total number of operations in type 4 = Free
14 Instructions
opcode × 2Increment bit in opcode.
TYPE: 4 OP CODE FR (N)  2 × 210–6
N Instructions  2 × 24
 32
TYPE:- 1 N= 32

N bit can perform 2n operation


Total number of operation in type 1 = 24 = 16
operation.
Given = 4 Instruction
12. (a)
Total number of free after allocating type
When Interrupt occur, after completion of current
= 16 – 4 = 12 Instruction.
Interrupt will be serviced, It push the program
TYPE -2 current [PC] value into stack & control transfer to
ISR.
Q: Process finish the current instruction
Total number of operation in type 2: Free opcode execution
P: PUSH the PC value into stack
×2Increment bit in opcode
T: Interrupt → PC
Total number of operations = 12 × 24 – 4 =12×2º = R: Service the Interrupt
12 operation S: Pop the PC value
Type 2 (given) = 8 Instruction (operation)
Total number of free opcode after Allocating type
2 = 12 – 8 = 4 free code

TYPE -3
13. (500 to 500)
64 Register, Instruction set size = 12
Total number of operations in type 3 = free opcode
× 2 Increment bit in opcode 5 Fields = Opcode, Source Reg1, S Reg2, D Reg2, 12
bit Immediate field
 4 × 26 – 4 = 4 ×22
64 Register  Reg. AF = 6bit
= 16 operation/ Instruction


Inst. Set = 12  Op code = 4 bit 15. (a)

Memory operand
Or
Instruction size = 4 + 6 + 6 + 6 +12 Register operand
= 34 bits  5 Byte But Accumulator is a special purpose Register
Program having = 100 Inst.
Program size = 100 × 5B = 500 byte.

16. (d)
Given
Current SP value is = (016E)16
PC = 2 Byte, PSW = 2 byte
(1) Store the current value of PC into the stack.
Here PC is 2 byte so stack pointer to increased
by 2 bytes
(016E)2 + 2 = [0170]16

01 0110 1110 016E


+1
14. (16 to 16) 01 0110 1111 016F
40 Distinct Instruction/operation  OP code = 01 0111 0000 016F
log2 40 = 6 bit +1
OP code = 6 bit 017016

24 Register  Reg.AF= 5bit (2) Store the value of PSW in the stack
Here PSW is also 2 byte long. So SP increased
32 bits Instruction
by 2 byte
(0170)16 + 2 = (0172)16
SP = (0172)16
(3) Load the starting address of the subroutine in
Immediate field = 32 – (5 + 5 + 6) the PC.
= 32 – 16
PC  ISR
Immediate field = 16bit
SP = ( 0172 )16
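
The CALL bookkeeping above is easy to mirror in a couple of lines (a sketch only; the stack grows upward and each pushed register is 2 bytes, as stated in the question):

sp = 0x016E
sp += 2                     # push PC  (2 bytes)
sp += 2                     # push PSW (2 bytes)
print(format(sp, '04X'))    # 0172 -> option (d)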


17. (16383 to 16383) 19. (c)


Instruction size = 1 word = 32 bit CPU will check the interrupt after finishing of
45 Operation  OP code = 6 bit current instant execution.
64 Register  Register = 6 bit If interupts is present
Push PC value

Immediate field = 32 – (6 + 6 + 6)
= 32 – 18 = 14 bit
n bit unsigned Range = 0 to 2n –1
Immediate field Range = 0 to 214 –1 = 16,383
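
The same field arithmetic can be written out as a short check (a sketch; ceil/log2 mirror the "round up to whole bits" step used above):

from math import ceil, log2

opcode_bits = ceil(log2(45))                   # 45 instructions -> 6 bits
reg_bits = ceil(log2(64))                      # 64 registers    -> 6 bits
imm_bits = 32 - (opcode_bits + 2 * reg_bits)   # 14 bits left
print(imm_bits, 2 ** imm_bits - 1)             # 14 16383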

18. (d) 20. (d)


LOAD R1 c; The RFE (Return from Exception) instruction must
LOAD R2 d; be a trap instruction, a privilege instruction and an
ADD R1 R1 R2; R1  R1 + R2  R1 = c + d exception cannot be allowed to occur during
LOAD R2 e; execution of an RFE instruction.
SUB R2, R2, R1; R2  R2 – R1 R2 = e – (c + d)
LOAD R1 a;
LOAD R3 b;
SUB R1, R1, R3; R1  R1 – R3 ; R1 = (a – b)
ADD R3, R1, R2; R3 = (a – b) + [e – (c + d)]
OR
ADD R1 R1 R2; R1  R1 + R2 21. (d)
OR
ADD R2 R1 R2; R2  R1 + R2

X(R1)  M [X + R1]
Minimum 3 Registers required


22. (d) 4. –(X): Pre decrement


LW R1, 20(R2) First Decrement in the X, then Updated
R1  M[20 + R2] Location (After Decrement) X content
(operand) is Fetch.
Read 32 bit word from memory and store into 32
Assume start from 1000
bit Register R.
EA = M [20 +R2]
20: Index

(a) ADD (X)–, X


(X) – : Fetch then Decrement operand
X = 999 M[1000] then X–
23. (c) X = 999  M [999] = 100
In auto-increment addressing mode, the amount of 200 + 100 = 300.
increment depends on the size of the data item ADD
accessed. POP (Top Of Stack)
POP (Top Of Stack)
ALU operation (ADD) push into POP.
200 + 100
STACK: LIFO
24. (a)
(b) ADD X, (X)–
OPCODE source destination.
M [1000] = 200
Destination 
(S1) OPERATION
(S2 ) M [1000] =200
Source destination (c) ADD – (X), (X) +
1. (X)+: Post increment X= First Decrement X = 999
First Fetch the operand (Content) from the M [999] = 100
location X then Increment. X + Post Increment M [999] = 100
2. (X)– Post Decrement (d) ADD –(X), X
First Fetch the operand (Content) from the –X: M [999] = 100
location X. then Decrement in X. M [999] = 100
3. +(X) Pre Increment
First Increment in the X, then Updated Location
[After Increment X content (Operand) is Fetch

❑❑❑


CHAPTER

2
Micro-Operation and Micro Program

1. [MCQ] [GATE-2013 : 2M]
Consider the following sequence of micro-operations.
MBR ← PC
MAR ← X
PC ← Y
Memory ← MBR
Which one of the following is a possible operation performed by this sequence?
(a) Instruction fetch
(b) Operand fetch
(c) Conditional branch
(d) Initiation of interrupt service

ALU Data Path

2. [NAT] [GATE-2020 : 1M]
A multiplexer is placed between a group of 32 registers and an accumulator to regulate data movement such that at any given point in time the content of only one register will move to the accumulator. The minimum number of select lines needed for the multiplexer is _____.

3. [MCQ] [GATE-2020 : 1M]
Consider the following data path diagram.
Consider an instruction: R0 ← R1 + R2. The following steps are used to execute it over the given data path. Assume that PC is incremented appropriately. The subscripts r and w indicate read and write operations, respectively.
1. R2r, TEMP1r, ALUadd, TEMP2w
2. R1r, TEMP1w
3. PCr, MARw, MEMr
4. TEMP2r, R0w
5. MDRr, IRw
Which one of the following is the correct order of execution of the above steps?
(a) 3, 5, 1, 2, 4
(b) 2, 1, 4, 5, 3
(c) 3, 5, 2, 1, 4
(d) 1, 2, 4, 3, 5

Microprogrammed Control

4. [MCQ] [GATE-2008 : 2M]
Consider a CPU where all the instructions require 7 clock cycles to complete execution. There are 140 instructions in the instruction set. It is found that 125 control signals are needed to be generated by the control unit. While designing the horizontal micro-programmed control unit, a single address field format is used for branch control logic. What is the minimum size of the control word and control address register?
(a) 125, 7
(b) 125, 10
(c) 135, 9
(d) 135, 10


RISC and CISC

5. [MCQ] [GATE-2018 : 1M]
Consider the following processor design characteristics:
I. Register-to-register arithmetic operations only.
II. Fixed-length instruction format.
III. Hardwired control unit.
Which of the characteristics above are used in the design of a RISC processor?
(a) I and II only
(b) II and III only
(c) I and III only
(d) I, II and III

❑❑❑


1. (d) 2. (5 to 5) 3. (c) 4. (d)


5. (d)

1. (d)
MBR ← PC
MAR ← X
PC ← Y
Memory ← MBR
The given sequence saves the PC value to memory and then updates the PC.
(a) Instruction fetch: PC → MAR → Memory → MBR → IR (memory to CPU (IR)); the PC value is not written to memory.
(b) Operand fetch: IR (address field) → MAR → Memory → MBR → ALU.
(c) Conditional branch: does not store the PC value in memory.
(d) Initiation of interrupt service: matches the given sequence, so (d) is correct.

2. (5 to 5)
Multiplexer: 3 select lines address 2^3 = 8 input lines; in general, 2^m input lines need m = log2(2^m) select lines.
For 32 input lines:
number of select lines = log2(32) = 5

3. (c)
R0 ← R1 + R2
Fetch cycle (memory to CPU (IR)), then operand fetch and execution:
3. PCr, MARw, MEMr               ; PC → M[MAR] → MDR
5. MDRr, IRw                     ; MDR → IR
2. R1r, TEMP1w                   ; R1 → TEMP1
1. R2r, TEMP1r, ALUadd, TEMP2w   ; TEMP2 ← R2 + TEMP1
4. TEMP2r, R0w                   ; R0 ← TEMP2
So R0 ← R1 + R2 and the correct order is 3, 5, 2, 1, 4.
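
The select-line count generalises directly; a tiny check in Python (added sketch):

from math import ceil, log2

def select_lines(n_inputs):
    return ceil(log2(n_inputs))

print(select_lines(32))    # 5 select lines for 32 registers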


4. (d)
Total number of instructions = 140
Each instruction requires 7 cycles
Total number of micro-operations = 140 × 7 = 980, so the control memory holds 980 control words (CW).
Control address register (next-instruction address field): CAR = ⌈log2 980⌉ = 10 bits
Horizontal micro-programming: the 125 control signals need 125 bits.
Minimum control word size = 125 + 10 = 135 bits, and CAR = 10 bits.

5. (d)
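
The control-word sizing in solution 4 above can be reproduced in a few lines (a sketch of the same arithmetic):

from math import ceil, log2

micro_ops = 140 * 7                  # 980 control words
car_bits = ceil(log2(micro_ops))     # control address register -> 10 bits
control_word = 125 + car_bits        # 125 control signals + next-address field
print(control_word, car_bits)        # 135 10 -> option (d)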

❑❑❑


CHAPTER

3
Basics of Pipelining

1. [NAT] [GATE-2014 : 2M]
Consider two processors P1 and P2 executing the same instruction set. Assume that under identical conditions, for the same input, a program running on P2 takes 25% less time but incurs 20% more CPI (clock cycles per instruction) as compared to the program running on P1. If the clock frequency of P1 is 1 GHz, then the clock frequency of P2 (in GHz) is ________.

2. [MCQ] [GATE-2008 : 2M]
In an instruction execution pipeline, the earliest that the data TLB (Translation Lookaside Buffer) can be accessed is
(a) Before effective address calculation has started
(b) During effective address calculation
(c) After effective address calculation has completed
(d) After data cache lookup has completed

Performance Evaluation of Pipeline

3. [NAT] [GATE-2023 : 1M]
Consider a 3-stage pipelined processor having a delay of 10 ns (nanoseconds), 20 ns, and 14 ns, for the first, second, and the third stages, respectively. Assume that there is no other delay and the processor does not suffer from any pipeline hazards. Also assume that one instruction is fetched every cycle. The total execution time for executing 100 instructions on this processor is __________ ns.

4. [NAT] [GATE-2020 : 2M]
Consider a non-pipelined processor operating at 2.5 GHz. It takes 5 clock cycles to complete an instruction. You are going to make a 5-stage pipeline out of this processor. Overheads associated with pipelining force you to operate the pipelined processor at 2 GHz. In a given program, assume that 30% are memory instructions, 60% are ALU instructions and the rest are branch instructions. 5% of the memory instructions cause stalls of 50 clock cycles each due to cache misses and 50% of the branch instructions cause stalls of 2 cycles each. Assume that there are no stalls associated with the execution of ALU instructions. For this program, the speedup achieved by the pipelined processor over the non-pipelined processor (round off to 2 decimal places) is _________.

5. [NAT] [GATE-2018 : 2M]
The instruction pipeline of a RISC processor has the following stages: Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Writeback (WB). The IF, ID, OF and WB stages take 1 clock cycle each for every instruction. Consider a sequence of 100 instructions. In the PO stage, 40 instructions take 3 clock cycles each, 35 instructions take 2 clock cycles each, and the remaining 25 instructions take 1 clock cycle each. Assume that there are no data hazards and no control hazards.
The number of clock cycles required for completion of execution of the sequence of instructions is ______.

6. [NAT] [GATE-2017 : 2M]
Instruction execution in a processor is divided into 5 stages: Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Execute (EX) and Write Back (WB). These stages take 5, 4, 20, 10 and 3 nanoseconds (ns) respectively. A pipelined implementation of the processor requires buffering


between each pair of consecutive stages with a delay of 2 ns. Two pipelined implementations of the processor are contemplated:
(I) a naive pipeline implementation (NP) with 5 stages and (II) an efficient pipeline (EP) where the OF stage is divided into stages OF1 and OF2 with execution times of 12 ns and 8 ns respectively.
The speedup (correct to two decimal places) achieved by EP over NP in executing 20 independent instructions with no hazards is _____.

7. [MCQ] [GATE-2017 : 1M]
Consider the following processors (ns stands for nanoseconds). Assume that the pipeline registers have zero latency.
P1: Four-stage pipeline with stage latencies 1 ns, 2 ns, 2 ns, 1 ns.
P2: Four-stage pipeline with stage latencies 1 ns, 1.5 ns, 1.5 ns, 1.5 ns.
P3: Five-stage pipeline with stage latencies 0.5 ns, 1 ns, 1 ns, 0.6 ns, 1 ns.
P4: Five-stage pipeline with stage latencies 0.5 ns, 0.5 ns, 1 ns, 1 ns, 1.1 ns.
Which processor has the highest peak clock frequency?
(a) P1 (b) P2
(c) P3 (d) P4

8. [NAT] [GATE-2016 : 2M]
Consider a 3 GHz (gigahertz) processor with a three-stage pipeline and stage latencies τ1, τ2, and τ3 such that τ1 = 3τ2/4 = 2τ3. If the longest pipeline stage is split into two pipeline stages of equal latency, the new frequency is _____ GHz, ignoring delays in the pipeline registers.

9. [NAT] [GATE-2016 : 2M]
The stage delays in a 4-stage pipeline are 800, 500, 400 and 300 picoseconds. The first stage (with delay 800 picoseconds) is replaced with a functionally equivalent design involving two stages with respective delays 600 and 350 picoseconds. The throughput increase of the pipeline is ____ percent.

10. [NAT] [GATE-2015 : 2M]
Consider a non-pipelined processor with a clock rate of 2.5 gigahertz and average cycles per instruction of four. The same processor is upgraded to a pipelined processor with five stages; but due to the internal pipeline delay, the clock speed is reduced to 2 gigahertz. Assume that there are no stalls in the pipeline. The speedup achieved in this pipelined processor is ______.

11. [NAT] [GATE-2014 : 2M]
An instruction pipeline has five stages, namely, instruction fetch (IF), instruction decode and register fetch (ID/RF), instruction execution (EX), memory access (MEM), and register write back (WB) with stage latencies 1 ns, 2.2 ns, 2 ns, 1 ns, and 0.75 ns, respectively (ns stands for nanoseconds). To gain in terms of frequency, the designers have decided to split the ID/RF stage into three stages (ID, RF1, RF2) each of latency 2.2/3 ns. Also, the EX stage is split into two stages (EX1, EX2) each of latency 1 ns. The new design has a total of eight pipeline stages. A program has 20% branch instructions which execute in the EX stage and produce the next instruction pointer at the end of the EX stage in the old design and at the end of the EX2 stage in the new design. The IF stage stalls after fetching a branch instruction until the next instruction pointer is computed. All instructions other than the branch instruction have an average CPI of one in both the designs. The execution times of this program on the old and the new design are P and Q nanoseconds, respectively. The value of P/Q is ______.

12. [NAT] [GATE-2014 : 2M]
Consider a 6-stage instruction pipeline, where all stages are perfectly balanced. Assume that there is no cycle-time overhead of pipelining. When an application is executing on this 6-stage pipeline, the speedup achieved with respect to non-pipelined execution if 25% of the instructions incur 2 pipeline stall cycles is _______.
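
For questions like Q9 and Q10 above, the arithmetic is short enough to verify directly. This is an added Python sketch that assumes the usual model: the clock is set by the slowest stage and, once full, an ideal pipeline completes one instruction per cycle.

# Q9: throughput before vs after splitting the 800 ps stage
old_stages = [800, 500, 400, 300]            # picoseconds
new_stages = [600, 350, 500, 400, 300]
gain = (max(old_stages) / max(new_stages) - 1) * 100
print(round(gain, 2))                        # 33.33 percent

# Q10: non-pipelined 2.5 GHz with CPI 4 vs 5-stage pipeline at 2 GHz, CPI 1
t_non_pipe = 4 * (1 / 2.5)                   # ns per instruction
t_pipe = 1 * (1 / 2.0)                       # ns per instruction, no stalls
print(t_non_pipe / t_pipe)                   # 3.2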


13. [MCQ] [GATE-2013 : 2M]
Consider an instruction pipeline with five stages without any branch prediction: Fetch Instruction (FI), Decode Instruction (DI), Fetch Operand (FO), Execute Instruction (EI) and Write Operand (WO). The stage delays for FI, DI, FO, EI and WO are 5 ns, 7 ns, 10 ns, 8 ns and 6 ns, respectively. There are intermediate storage buffers after each stage and the delay of each buffer is 1 ns. A program consisting of 12 instructions I1, I2, I3, …, I12 is executed in this pipelined processor. Instruction I4 is the only branch instruction and its branch target is I9. If the branch is taken during the execution of this program, the time (in ns) needed to complete the program is
(a) 132 (b) 165
(c) 176 (d) 328

14. [MCQ] [GATE-2011 : 2M]
Consider an instruction pipeline with four stages (S1, S2, S3 and S4) each with combinational circuit only. The pipeline registers are required between each stage and at the end of the last stage. The delays for the stages S1, S2, S3 and S4 are 5 ns, 6 ns, 11 ns and 8 ns respectively, and each pipeline register has a delay of 1 ns.
What is the approximate speed up of the pipeline in steady state under ideal conditions when compared to the corresponding non-pipeline implementation?
(a) 4.0 (b) 2.5
(c) 1.1 (d) 3.0

15. [MCQ] [GATE-2009 : 2M]
Consider a 4-stage pipeline processor. The number of cycles needed by the four instructions I1, I2, I3, I4 in stages S1, S2, S3, S4 is shown below:

      S1   S2   S3   S4
I1    2    1    1    1
I2    1    3    2    2
I3    2    1    1    3
I4    1    2    2    2

What is the number of cycles needed to execute the following loop? For (i = 1 to 2) {I1; I2; I3; I4;}
(a) 16 (b) 23
(c) 28 (d) 30

16. [MCQ] [GATE-2008 : 2M]
A non-pipelined single cycle processor operating at 100 MHz is converted into a synchronous pipelined processor with five stages requiring 2.5 nsec, 1.5 nsec, 2 nsec, 1.5 nsec and 2.5 nsec, respectively. The delay of the latches is 0.5 nsec. The speedup of the pipeline processor for a large number of instructions is:
(a) 4.5 (b) 4.0
(c) 3.33 (d) 3.0

Pipelining Dependencies

17. [NAT] [GATE-2022 : 2M]
A processor X1 operating at 2 GHz has a standard 5-stage RISC instruction pipeline having a base CPI (cycles per instruction) of one without any pipeline hazards. For a given program P that has 30% branch instructions, control hazards incur 2 cycles stall for every branch. A new version of the processor X2 operating at the same clock frequency has an additional branch predictor unit (BPU) that completely eliminates stalls for correctly predicted branches. There is neither any savings nor any additional stalls for wrong predictions. There are no structural hazards and data hazards for X1 and X2. If the BPU has a prediction accuracy of 80%, the speed up (rounded off to two decimal places) obtained by X2 over X1 in executing P is ____________.

18. [NAT] [GATE-2021 : 2M]
A five-stage pipeline has stage delays of 150, 120, 150, 160 and 140 nanoseconds. The registers that are used between the pipeline stages have a delay of 5 nanoseconds each.
The total time to execute 100 independent instructions on this pipeline, assuming there are no pipeline stalls, is ______ nanoseconds.
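
Q18 above is a plain plug-in once the cycle time is fixed by the slowest stage plus the inter-stage register delay; a short added check:

stage_delays = [150, 120, 150, 160, 140]   # ns
reg_delay = 5                              # ns
cycle = max(stage_delays) + reg_delay      # 165 ns
k, n = len(stage_delays), 100
print((k + n - 1) * cycle)                 # 17160 ns, with no stalls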


19. [NAT] [GATE-2021 : 1M]
Consider a pipelined processor with 5 stages: Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory Access (MEM), and Write Back (WB). Each stage of the pipeline, except the EX stage, takes one cycle. Assume that the ID stage merely decodes the instruction and the register read is performed in the EX stage. The EX stage takes one cycle for ADD instructions and two cycles for MUL instructions. Ignore pipeline register latencies.
Consider the following sequence of 8 instructions:
ADD, MUL, ADD, MUL, ADD, MUL, ADD, MUL
Assume that every MUL instruction is data-dependent on the ADD instruction just before it and every ADD instruction (except the first ADD) is data-dependent on the MUL instruction just before it. The Speedup is defined as follows:
Speedup = Execution time without operand forwarding / Execution time with operand forwarding
The Speedup achieved in executing the given instruction sequence on the pipelined processor (rounded to 2 decimal places) is _____.

20. [NAT] [GATE-2017 : 2M]
Consider a RISC machine where each instruction is exactly 4 bytes long. Conditional and unconditional branch instructions use PC-relative addressing mode with Offset specified in bytes to the target location of the branch instruction. Further, the Offset is always with respect to the address of the next instruction in the program sequence. Consider the following instruction sequence:

Instr. No    Instruction
i            add R2, R3, R4
i+1          sub R5, R6, R7
i+2          cmp R1, R9, R10
i+3          beq R1, Offset

If the target of the branch instruction is i, then the decimal value of the Offset is ______.

21. [MCQ] [GATE-2015 : 2M]
Consider the following code sequence having five instructions I1 to I5. Each of these instructions has the following format.
OP Ri, Rj, Rk
where operation OP is performed on contents of registers Rj and Rk and the result is stored in register Ri.
I1: ADD R1, R2, R3    I2: MUL R7, R1, R3
I3: SUB R4, R1, R5    I4: ADD R3, R2, R4
I5: MUL R7, R8, R9
Consider the following three statements:
S1: There is an anti-dependence between instructions I2 and I5.
S2: There is an anti-dependence between instructions I2 and I4.
S3: Within an instruction pipeline an anti-dependence always creates one or more stalls.
Which one of the above statements is/are correct?
(a) Only S1 is true
(b) Only S2 is true
(c) Only S1 and S3 are true
(d) Only S2 and S3 are true

22. [NAT] [GATE-2015 : 2M]
Consider the sequence of machine instructions given below:
MUL R5, R0, R1
DIV R6, R2, R3
ADD R7, R5, R6
SUB R8, R7, R4
In the above sequence, R0 to R8 are general purpose registers. In the instructions shown, the first register stores the result of the operation performed on the second and the third registers. This sequence of instructions is to be executed in a pipelined instruction processor with the following 4 stages: (1) Instruction Fetch and Decode (IF), (2) Operand Fetch (OF), (3) Perform Operation (PO) and (4) Write back the Result (WB). The IF, OF and WB stages take 1 clock cycle each for any instruction. The PO stage takes 1 clock cycle for ADD or SUB instruction, 3 clock cycles for MUL instruction and 5 clock cycles for DIV instruction. The pipelined processor uses operand forwarding from the PO stage to the OF stage. The number of clock cycles taken for the execution of the above sequence of instructions is ____.
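
The offset asked for in Q20 above follows from the addressing rule stated in the question (offset measured from the address of the next instruction); a tiny added check:

instr_size = 4                       # bytes per instruction
branch_at, target = 3, 0             # beq is instruction i+3, target is i
offset = (target - (branch_at + 1)) * instr_size
print(offset)                        # -16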


23. [MCQ] [GATE-2012 : 1M]
Register renaming is done in pipelined processors
(a) As an alternative to register allocation at compile time
(b) For efficient access to function parameters and local variables
(c) To handle certain kinds of hazards
(d) As part of address translation

24. [MCQ] [GATE-2010 : 2M]
A 5-stage pipelined processor has Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Write Operand (WO) stages. The IF, ID, OF and WO stages take 1 clock cycle each for any instruction. The PO stage takes 1 clock cycle for ADD and SUB instructions, 3 clock cycles for MUL instruction, and 6 clock cycles for DIV instruction respectively. Operand forwarding is used in the pipeline. What is the number of clock cycles needed to execute the following sequence of instructions?

Instruction              Meaning of Instruction
I0: MUL R2, R0, R1       R2 ← R0 * R1
I1: DIV R5, R3, R4       R5 ← R3 / R4
I2: ADD R2, R5, R2       R2 ← R5 + R2
I3: SUB R5, R2, R6       R5 ← R2 - R6

(a) 13 (b) 15
(c) 17 (d) 19

Common Data for next two questions:
Delayed branching can help in the handling of control hazards.

25. [MCQ] [GATE-2008 : 2M]
The following code is to run on a pipelined processor with one branch delay slot:
I1: ADD R2 ← R7 + R8
I2: SUB R4 ← R5 - R6
I3: ADD R1 ← R2 + R3
I4: STORE Memory [R4] ← R1
BRANCH to Label if R1 == 0
Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any other program modification?
(a) I1 (b) I2
(c) I3 (d) I4

26. [MCQ] [GATE-2008 : 2M]
Delayed branching can help in the handling of control hazards.
For all delayed conditional branch instructions, irrespective of whether the condition evaluates to true or false,
(a) The instruction following the conditional branch instruction in memory is executed
(b) The first instruction in the fall through path is executed
(c) The first instruction in the taken path is executed
(d) The branch takes longer to execute than any other instruction

27. [MCQ] [GATE-2008 : 2M]
Which of the following are NOT true in a pipelined processor?
1. Bypassing can handle all RAW hazards.
2. Register renaming can eliminate all register carried WAR hazards.
3. Control hazard penalties can be eliminated by dynamic branch prediction.
(a) 1 and 2 only
(b) 1 and 3 only
(c) 2 and 3 only
(d) 1, 2 and 3

❑❑❑


1. (1.6 to 1.6) 2. (c) 3. (2040 to 2040) 4. (2.15 to 2.18)


5. (219 to 219) 6. (1.49 to 1.52) 7. (c) 8. (3.9 to 4.1)
9. (33 to 34) 10. (3.2 to 3.2) 11. (1.54 to 1.54) 12. (4 to 4)
13. (b) 14. (b) 15. (b) 16. (c)
17. (1.42 to 1.42) 18. (17160 to 17160) 19. (1.87 to 1.88) 20. (–16 to –16)
21. (b) 22. (13 to 13) 23. (c) 24. (b)
25. (d) 26. (a) 27. (b)

1. (1.6 to 1.6)
Total execution time = Number of Instruction × CPI
× cycle time
But in the Question number of Instruction are same
for processor P1 & P2 2. (c)
ETP1 = CPI × cycle time1 Effective adress (EA) →: Actual address of the object
= CPI × 1 nsec.

Clock frequency P1 = 1GHZ


1
Cycle time P1 = sec.  10–9 sec. = 1 nsec.
1G 3. (2040 to 2040)
ETP2 = 1.2 CPI × cycle time2 [20% more CPI] Number of stage (k) = 3.
Number of instruction [n] = 100.
Cycle timeP1 = 1nsec.

ETP2 = 0.75 ETP1 (25% less time that of P1)


0.75 × ETP1 = 1.2 CPI × cycle time2.
Tp = max (stage delay) = max (10, 20, 14)
0.75 × CPI × 1nsec. = 1.2CPI × cycle time2
0.75 CPI = 1.2 CPI × cycle time2 Tp = 20 nsec

0.75 ETpipe = [k + (n – 1)] tp


Cycle time2 = = 0.625 nsec. = [3 + (100 –1)]20 = 102 × 20
1.2
1 ETpipe = 2040 nsec
Clock frequency P2 =
cycle time P2
1
= = 1.6GHZ.
0.625  10 –9


4. (2.15 to 2.18) 5. (219 to 219)


Non pipeline processor: IF, ID, OF, PO, WB
Frequency = 2.5 GHZ Number of stages = 5
1 1 (Number of Inst.) n = 100
Cycle time =  sec.
frequency 2.5 G IF ID OF WB
1 ETPIPE = [k + (n–1)] cycle
 10 –9  0.4 n sec
2.5 ETPIPE without stalls = [5 + (100 –1)] cycle
without forwards
Cycle time = 0.4 nsec
= 104 cycle
ETNON PIPE = CPI × Cycle time = 5 × 0.4 n sec.
40 Inst. Takes 3 clocks cycle
ETPIPE = 2 nsec. 35 Inst. Takes 2 clocks cycle
25 Inst. Takes 1 clocks cycle
Number of stalls = 40 × 2 + 35 ×1 + 25 × 0
= 80 + 35 = 115 cycle
Total number of clock cycle take = 104 + 115
= 219 cycle
Alternate Approch.
WB I1
PO I1
OF I1
Number of Stalls / Inst.= .30 × 0.05 × 50 +.60 × 0 + ID I1
.10× .50 × 2 = .75 + .1
IF I1
n
Number of stalls/Inst = 0.85
IF ID OF PO WB
ETPIPE = (1 + Number of stalls/Inst.) × cycle time
pipe. For 1st Inst. 1+1+1+1+ Po stages
1 1 4
Cycle time pipeline = sec.   10 –9
2G 2 Total times = 1 + 1 + 1 + 1 + (40 × 3 + 35 × 2 + 25 ×
0.5 nsec 1) = 4 + 120 + 70 + 25 = 219
ETPIPE = (1 + 0.85) × 0.5 n sec. = 1.85 ×0.5
ETPIPE  0.925 ns

ET NONPIPE 2
S= = = 2.16
ETPIPE 0.925 6. (1.49 to 1.52)
Naïve pipeline [NP]
5 stages: IF, ID, OF, EX, WB.
K = 5 5ns, 4ns, 20ns, 10ns, 30ns
Buffer Delay = 2ns


Number of Instruction = 20 n = 20 1
Clock Frequency 
Max (stage Delay + Buffer Delay) cycle time
NP P3 having lowest time
tp = max (5, 4, 20, 10, 3) + Buffer Delay = 20 + 2  P3 having highest clock frequency

tp NP = 22 ns
ETNP = [k + n –1] tpNP
 [5 + (20 – 1)] × 22 ns  24 × 22 ns
8. (3.9 to 4.1)
ETNP = 528 nsec 3
1 = 2  2T3
Efficient pipe line [EP] 4
6 Stages IF, ID, OF1 OF2 EX WB 3τ 3τ
τ1 = 2 & 2 = 2τ3
k =6 (5, 4, 12, 8, 10, 3) n sec. 4 4
1 3 6 T2 8
Buffer Delay = 2 nsec. = or =
2 4 8 T3 3
EP
tp = max (stage Delay + Buffer Delay)  1:  2 :  3  6 : 8 : 3
 max (5, 4, 12, 8, 10, 3) + 2ns = 12 + 2 Let x is time
1 = 6x, 2 = 8x, 3 = 3x
tpEP = 14 nsec.
tp = max (6x, 8x, 3x)
n = 20 tp = 8x
EPEP =  k + ( n − 1) tp EP 1 1
Frequency = = Frequency 
 6 + ( 20 − 1)  14 = 25  14 tp 8x
1
ETNP = 350 nsec 3GHz =
8x
Perfomance of EP 1 / ETEP 1
Speed up factor = =  = 24 GHZ
Perfomance of NP 1 / ETNP x
ETEP 528 New Design
= =
ETNP 350

Speed up foctor = 1.508


tp New = 6x
1
FrequencyNew =
tpnew
1 1 1
FrequencyNew =  
6x 6 x
7. (c) 1
  24 GHZ
P1 = tp = max (1ns, 2ns, 2ns, 1ns ) = 2 nsec. 6
P2 = tp = max (1ns, 1.5ns, 1.5ns, 1.5ns ) = 1.5 nsec. FrequencyNew = 4 GHZ.
P3 = tp = max (0.5ns, 1ns, 1ns, 0.6ns, 1ns ) = 1 nsec.
P4 = tp = max (0.5ns, 0.5ns, 1ns, 1ns, 1.1ns ) =1.1
nsec.


9. (33 to 34) 1 1
Cycle time = sec.   10−9 sec.
OLD Design: 2 2
4 Stage, (800, 500, 400, 300) 1 Inst. ET in pipeline = 0.5 nsec.
tpOLD = max (800PS, 500PS, 400PS, 300 PS) Speed up factor =
Performance of pipeline 1 / ETpipeline
tpOLD Desgin = 800 PS =
Performance of Non-pipeline 1 / ETnon-pipeline
Instruction takes = 800PS
ET in non-pipeline tn 1.6
In 1 sec. how many # of Inst. = = =
ET in pipeline tp 0.5
1
tp OLD = = Speed up factor = 3.2
800
New Design:-
5 stages Delay = (600, 350, 500, 400, 300) PS
tp New = max. (600, 350, 500, 400, 300)
tp New = 600PS
Instruction takes = 600 PS 11. (1.54 to 1.54)
Ins.1 sec. how many number of instruction Old Design

1
trp new =
600
New – OLD
% of throughput increase in pipeline = tpOLD = max(1, 2.2, 2, 1, 0.75)
OLD
1 1 1 1 8−6 tpOLD = 2.2n sec. & OLD Design ‘Ex’ Stage
− − 28 2
 600 800  6 8  48 = = Branch Penalty = 3–1
1 1 1 48 6
B.P = 2
800 8 8
1 Branch Frequency = 20%
 = 33.33%
3 Number of stalls / Inst. = .20×2 = 0.4
Average Inst. ETOLD = (1+Number of stalls/Ins.)
tpOLD
 (1 + 0.4) × 2.2
1.4 ×2.2
ETOLD[p] = 3.08 n sec.
10. (3.2 to 3.2)
New Design
Non pipelined processer
I Instruction takes 4 cycle
1 1
Cycle time = sec.  × 10–9 sec.
2.5 G 2.5
Cycle time = 0.4 n sec.
 2.2 2.2 2.2 
In non-pipelined tp new = max 1, , , ,1,1,1,0.75  ns
I Instruction execution time = 4 × 0.4 sec. = 1.6 n sec.
 3 3 3 
Pipelined Processer: tpnew = 1 n sec.
5 stage
Branch Penalty = 6 – 1 = 5
1 Instn takes = 1 cycle

Branch frequency = 20%


Branch required = .20 × 5 = 1
Number of Instn[n] = 8
ETnew Design = (1 + number of stalls/ins.)  tpnew (Out of 12 only 8 Instn executing)
 (1 + 1) × 1 nsec. Number of stall(extra cycles) = 4 –1 = 3
Q = 2n sec. Number of (extra cycle) = 3
Total cycle = 12 + 3 cycle = 15 cycles
P 3.08 ETPIPE = 15 cycle
= =1.54
Q 2 Cycle time (tp) = 11 nsec. = 15 × 11 = 165 nsec.
Alternate approach : By timing Diagram
tp (cycle time ) =11 nsec.
5 stages

12. (4 to 4) Total = 15 cycle


Number of stalls/Inst. = Branch frequency × Branch  15 × 11 nsec.
Penalty  165 nsec.
= .25× 2 = 0.5
ETNP  1 
Speed up factor =  Preformance  
ETpipe  ET 
Pipeline Depath ( number stage )
=
(1 + number stalls/Inst.) 14. (b)
Perfactly balanced S1 = 5 nsec, S2 = 6 nsec, S3 = 11 nsec, S4 = 8 nsec
6 6 P.R/IR or Buffer Delay = 1 nsec.
= = =4
1 + 0.5 1.5 Speed up factor = ?
Execution time (ET) in non-pipeline = 5 + 6 + 11 + 8
ETNon-pipeline = 30 nsec.
Execution time in pipeline tp = max (stage Delay +
Buffer Delay)
13. (b) Max (5 + 1, 6 + 1, 11+1, 8+1)
Max (6, 7, 12, 9)
ETpipe = 12 nsec.
ETnon-pipe 30
Buffer Delay = 1 nsec. Speed up factor = 
ETpipe 12
tp = max (stage Delay + Buffer Delay) = 10 + 1
tp [cycle time] =11nsec. Speed up factor = 2.5

Without stalls
ETPIPE = [K + (n + 1)] cycle.
= [5 + 8 –1)] cycle. = 12 cycle
Total Instn = 12


15. (b) tn 10 nsec.


S= =
S4 I1 I2 I2 tp 3 nsec.
S3 I1 I2 I2 I3 I4
S = 3.33
S2 I1 I2 I2 I2 I3 I4 I4 I1
S1 I1 I1 I2 I3 I3 I4 I1 I1 I2 I3
1 2 3 4 5 6 7 8 9 10

S4 I3 I3 I3 I4 I4 I1 I2 I2 I3 I3
S3 I4 I1 I2 I2 I3 I4 I4 17. (1.42 to 1.42)
S2 I2 I2 I2 I3 I4 I4 X1 : 5 stage RISC pipeline, 30% branch instruction
S1 I3 I4 2GHz clock frequency
11 12 13 14 15 16 17 18 19 20
1
Cycle Time = sec Cycle Time = 0.5ns.
2G
S4 I3 I4 I4
Number stalls/instruction = Branch frequency ×
S3 Branch penalty.
S2 = .70 × 0 + .30 × 2
S1 The number stalls/Instruction = 0.6
21 22 23 24 25

23 cycles.

16. (c)
Non pipe line processor X1: Avg Instn ET = (1 + number of stells/instn) ×
Frequency = 100 MHZ Cycle time
1  (1 + 0.6) × 0.5ns  1.6 × 0.5 ns
Cycle time non-pipe = = 10–8 sec.
100  106
ETX1 = 0.8ns
10
=  10 –8 = 10  10 –9 sec.
10 New version X2 with Branch Prediction.
ETnon-pipe = 10 nsec. Cycle time = 0.5 nsec.

Cycle time  10 nsec.


PIPELINE 5 stages (k = 5)
Tp = 2.5, 1.5, 2, 1.5, 2 nsec.
Buffer/ Latch Delay = 0.5 nsec.
tp = max (stage Delay + Buffer Delay)
= 2.5 + 0.5
Tp = 3 nsec.
When very large Number of Instruction are executed


The number of stalls/instructions = .70 × 0 + .30 × .20 19. (1.87 to 1.88)


× 2 + .30 × .80 × 0
The number of stalls/Instruction = 0.12
X2: Avg instruction ET = (1 + number of
stalls/instruction) × cycle time Number of Instn = 8
= (1 + 0.12) × 0.5  1.12 × 0.5 Operand forwarding: No additional stalls (Extra
cycle) are required to fetch the operand from the
ETX2 = 0.56 previous Instn output.

performance of X 2 1 / ETX 2
SPEED UP = 
performance of X1 1 / ETX1 Or
ETX1 No extra cycle (stall) Due to Data Dependency
0.8
  Each instn takes total 2 cycles 4 MUL Instn takes
ETX2 0.56
1extra cycle
S = 1.42 Stall = 4 × 1 = 4 cycles
ETPIPE = [k + (n –1)] tp + stalls  [5 + (8 –1)] × 1+
4 = 16 cycles
With operand forwards ETPIPE = 16 cycle
Without opened forwarding
Extra cycle (stalls) due to data Dependency
18. (17160 to 17160) Or
Stage Delays = [150, 120, 150, 160, 140 ns] Additional cycle (stall) Required to fetch the opened
Buffer delay = 5ns from the previous Instruction.

n = 100
k = 5 stage

ETPIPE = ( K+ ( n –1) ) tp ‘2 stalls per Instruction’


Total additional stall Due to data dependency
tp = max. (Stage Delay + Buffer Delay (5ns)) = 7 × 2 = 14 cycle
max. [155, 125, 155, 165, 145] ns. Without operand forwarding:
ET pipeline = 16 + 14 = 30 cycle
ETPIPE = [5 + (100 –1)] × 165 nsec.
30
 104 × 165 ×109 sec. S= = 1.875
16
ETPIPE = 17160 ×10–9 sec. Alternate approach:
30
ETPIPE = 17160 nsec. Can do by. Time diagram S = = 1.875
16


20. (–16 to –16) 22. (13 to 13)


n
Each inst is 4 bytes long
Assume start at location 4000

Instr. No Instruction
4000 – 4003 i add R2, R3, R4
4004 – 4007 i +1 sub R5, R6, R7
IF
4008 – 4011 i+2 cmp R1, R9, R10
OF
4012 – 4015 i+3 beq R1, Offset
PO
4016 –
WB
PC = 4016
WB I1
Target address = i
PO I1 I1 I1 I2 I2 I2 I2 I2
PC  denotes the starting address of the next Instn
OF I1 I2 I3
In PC Relative Addressing (AM)  Target address =
Current PC value + OFFSET IF I1 I2 I3 I4
4000 = 4016 + OFFSET 1 2 3 4 5 6 7 8 9 10

OFFSET= –16
WB I2 I3 I4
PO I3 I4
OF I4
IF
21. (b)
OP Ri Rj Rk 11 12 13 14 15 16 17 18 19
Ri  Rj any operation Rk 13 clock cycle.
S1: Anti Dependency between I2 & I5 : Alternate approach:
I2 : MUL R7 R1 R3 : R7  R1 × R3 Stall (extra cycles)
I5 : MUL R7 R8 R9 : R7  R8 × R9 I1 MUL 3 →2
False because its output Dependency.
I2 DIV 5 →4
S2: Anti Dependency between I2 & I4 :
I3 ADD 1 → 6 extra cycle stalls
I2 : MUL R7 R2 R3 : R7  R1 × R3
I4 : ADD R3 R2 R4 : R3  R2 × R4 I4 SUB 1

True n = 4, k = 4
Anti Dependency & output Dependency ET = [K + (n –1)] tp + t stalls
Sol. Register Remaining = [4 + (4 – 1)] ×1 + 6 = 7 + 6= 13 cycle


23. (c) ET = [k + (n –1)]+P + (extra cycle +stalls due to


Register Renaming: Handles Hazards hazards)
 [5 + (4 –1)] ×1 + 7 = 8 + 7= 15 cycle

24. (b)
25. (d)
WB I0
PO I0 I0 I0 I1 I1 I1 I1 I1
OF I0 I1
ID I0 I1 I2 I3
Branch to level if R1 = = 0
IF I 0 I1 I2 I3
(a) I1 : R2  R7 + R8
1 2 3 4 5 6 7 8 9 10 11
Used in I3 result of I1 (R2) used as operand.
(b) I2 : ?
WB I1 I2 I3
R4  R5 – R6
PO I1 I2 I3 I4 is used as memory [R4] for memory store
OF I2 I3 purpose

ID (c) I3 : R1  R2 + R3
Used in Branch to total if R1 = 0
IF
Not proper working
12 13 14 15 16 17 18 19 20 21
(d) I4: memory [R4] store
15 clock cycle
PO
I0 3
I1 6
I2 1
I3 1
OR 26. (a)
Alternate apporach
PO stage Extra cycle
I0 MUL 3 → 2
I1 DIV 6 → 5
7 extra cycle (stalls)
I2 ADD 1
I3 SUB 1
n = 4, number of stage = 5


(a) False 27. (b)


(b) False: (Only branch) 1. By passing (operand forwarding): Handle
(When condition is True) All Raw Hazards: (False)
If I1 Load Instn & I2 Next Instn using the results
of I1 as a operand
2. WAR (ANTI Dep.)  Register Remaining 
True
3. Control Hazards: eliminate by Branch
Predication: (False)
I2 : Next sequential Instruction
Condition Branch Inst. JNZ.
I1 – I101 – I102
→→
Taken Path.

❑❑❑


CHAPTER

4
Cache Memory and Cache Organisation

1. [NAT] [GATE-2017 : 1M]
Consider a two-level cache hierarchy with L1 and L2 caches. An application incurs 1.4 memory accesses per instruction on average. For this application, the miss rate of L1 cache is 0.1; the L2 cache experiences, on average, 7 misses per 1000 instructions. The miss rate of L2 expressed correct to two decimal places is ______.

2. [MCQ] [GATE-2017 : 2M]
In a two-level cache system, the access times of L1 and L2 caches are 1 and 8 clock cycles, respectively. The miss penalty from the L2 cache to main memory is 18 clock cycles. The miss rate of L1 cache is twice that of L2. The average memory access time (AMAT) of this cache system is 2 cycles. The miss rates of L1 and L2 respectively are:
(a) 0.111 and 0.056 (b) 0.056 and 0.111
(c) 0.0892 and 0.1784 (d) 0.1784 and 0.0892

3. [MCQ] [GATE-2014 : 2M]
In designing a computer's cache system, the cache block (or cache line) size is an important parameter. Which one of the following statements is correct in this context?
(a) A smaller block size implies better spatial locality
(b) A smaller block size implies a smaller cache tag and hence lower cache tag overhead
(c) A smaller block size implies a larger cache tag and hence lower cache hit time
(d) A smaller block size incurs a lower cache miss penalty

Mapping Techniques

4. [NAT] [GATE-2023 : 2M]
An 8-way set associative cache of size 64 KB (1 KB = 1024 bytes) is used in a system with a 32-bit address. The address is sub-divided into TAG, INDEX, and BLOCK OFFSET. The number of bits in the TAG is ______________.

5. [NAT] [GATE-2021 : 1M]
Consider a computer system with a byte-addressable primary memory of size 2^32 bytes. Assume the computer system has a direct-mapped cache of size 32 KB (1 KB = 2^10 bytes), and each cache block is of size 64 bytes.
The size of the tag field is _______ bits.

6. [NAT] [GATE-2021 : 1M]
Consider a set-associative cache of size 2 KB (1 KB = 2^10 bytes) with cache block size of 64 bytes. Assume that the cache is byte-addressable and a 32-bit address is used for accessing the cache. If the width of the tag field is 22 bits, the associativity of the cache is _____.

7. [MCQ] [GATE-2020 : 2M]
A computer system with a word length of 32 bits has a 16 MB byte-addressable main memory and a 64 KB, 4-way set associative cache memory with a block size of 256 bytes. Consider the following four physical addresses represented in hexadecimal notation.
A1 = 0x42C8A4, A2 = 0x546888, A3 = 0x6A289C, A4 = 0x5E4880
Which one of the following is TRUE?


(a) A1 and A3 are mapped to the same cache set. 12. [MCQ] [GATE-2016 : 2M]
(b) A2 and A3 are mapped to the same cache set. The width of the physical address on a machine is 40
(c) A3 and A4 are mapped to the same cache set. bits. The width of the tag field in a 512 KB 8-way
(d) A1 and A4 are mapped to different cache sets. set associative cache is ______bits.
(a) 24 (b) 20
8. [MCQ] [GATE-2019 : 1M] (c) 30 (d) 40
A certain processor uses a fully associative cache of
size 16 kB. The cache block size is 16 bytes. Assume 13. [MCQ] [GATE-2015 : 1M]
that the main memory is byte addressable and uses a Consider a machine with a byte addressable main
32-bit address. How many bits are required for the memory of 220 bytes, block size of 16 bytes and a
Tag and the Index fields respectively in the direct mapped cache having 212 cache lines. Let the
addresses generated by the processor? addresses of two consecutive bytes in main memory
(a) 24-bits and 0-bits (b) 28-bits and 4-bits be (E201F)16 and (E2020)16. What are the tag and
(c) 24-bits and 4-bits (d) 28-bits and 0-bits cache line address (in hex) for main memory address
(E201F)16?
(a) E, 201 (b) F, 201
9. [MCQ] [GATE-2018 : 2M]
(c) E, E20 (d) 2, 01F
The size of the physical address space of a processor
is 2p bytes. The word length is 2wbytes. The capacity
14. [NAT] [GATE-2014 : 1M]
of cache memory is 2n bytes. The size of each cache
A 4-way set-associative cache memory unit with a
block is 2m words. For a K-way set-associative cache
capacity of 16 KB is built using a block size of 8
memory, the length (in number of bits) of the tag
words. The word length is 32 bits. The size of the
field is
physical address space is 4 GB. The number of bits
(a) P – N – log2 K
for the TAG field is _______.
(b) P – N + log2 K
(c) P – N – M – W – log2 K
15. [MCQ] [GATE-2014 : 2M]
(d) P – N – M – W + log2 K
If the associativity of a processor cache is doubled
while keeping the capacity and block size
10. [MCQ] [GATE-2017 : 2M] unchanged, which one of the following is guaranteed
Consider a machine with a byte addressable main to be NOT affected?
memory of 232 bytes divided into blocks of size 32 (a) Width of tag comparator
bytes. Assume that a direct mapped cache having (b) Width of set index decoder
512 cache lines is used with this machine. The size (c) Width of way selection multiplexer
of the tag field in bits is _____. (d) Width of processor to main memory data bus
(a) 12 (b) 16
(c) 18 (d) 24 16. [MCQ] [GATE-2013 : 1M]
In a k-way set associative cache, the cache is divided
11. [NAT] [GATE-2017 : 2M] into v sets, each of which consists of k lines. The
A cache memory unit with capacity of N words and lines of a set are placed in sequence one after
block size of B Words is to be designed. If it is another. The lines in set s are sequenced before the
designed as a direct mapped cache, the length of the lines in set (s + 1). The main memory blocks are
TAG field is 10 bits. If the cache unit is now numbered 0 onwards. The main memory block
designed as a 16-way set-associative cache, the numbered ‘j’ must be mapped to any one of the
length of the TAG field is _____ bits. cache lines from


(a) (j mod v) * k to (j mod v) * k + (k-1) 20. [MCQ] [GATE-2008 : 2M]


(b) (j mod v) * (j mod v) + (k-1) Which of the following array elements has the same
(c) (j mod k) to (j mod k) + (v-1) cache index as APR [0][0]?
(d) (j mod k) * v (j mod k) * v+ (v-1) (a) ARR [0][4] (b) ARR [4][0]
(c) ARR [0][5] (d) ARR [5][0]
Common Data for next two questions:
Consider a computer with a 4-ways set-associative mapped 21. [MCQ] [GATE-2008 : 2M]
cache of the following characteristics: a total of 1 MB of The total size of the tags in the cache directory is
main memory, a word size of 1 byte, a block size of 128 (a) 32 kbits (b) 34 kbits
words and a cache size of 8 KB. (c) 64 kbits (d) 68 kbits

17. [MCQ] [GATE-2008 : 2M] Cache Replacement Techniques


While accessing the memory location 0C795H by 22. [MSQ] [GATE-2022 : 2M]
the CPU, the contents of the TAG field of the Consider a system with 2 KB direct mapped data
corresponding cache line is
cache with a block size of 64 bytes. The system has
(a) 000011000 (b) 110001111
a physical address space of 64 KB and a word length
(c) 00011000 (d) 110010101
of 16 bits. During the execution of a program, four
data words P, Q, R, and S are accessed in that order
18. [MCQ] [GATE-2008 : 2M]
10 times (i.e., PQRSPQRS…). Hence, there are 40
The number of bits in the TAG, SET and WORD
accesses to data cache altogether. Assume that the
fields, respectively are:
data cache is initially empty and no other data words
(a) 7, 6, 7 (b) 8, 5, 7
are accessed by the program. The addresses of the
(c) 8, 6, 6 (d) 9, 4, 7
first bytes of P, Q, R, and S are 0xA248, 0xC28A,
0xCA8A, and 0xA262, respectively. For the
Common Data for next tree questions:
execution of the above program, which of the
Consider a machine with a 2-way set associative data cache
following statements is/are TRUE with respect to
of size 64Kbytes and block size 16 bytes. The cache is
managed using 32 bit virtual addresses and the page size is the data cache?
4 Kbytes. A program to be run on this machine begins as (a) Every access to S is a hit.
follows: (b) Once P is brought to the cache it is never
double ARR [1024] [1024] evicted.
Int i, j; (c) At the end of the execution only R and S reside
/ * Initalize array ARR to 0.0 */ in the cache.
for (i = 0; i < 1024; i ++) (d) Every access to R evicts Q from the cache.
for (j = 0; j < 1024; j++) 23. [NAT] [GATE-2017 : 2M]
ARR [i] [j] = 0.0; Consider a 2-way set associative cache with 256
The size of double 8 bytes. Array ARR is in memory blocks and uses LRU replacement. Initially the
starting at the beginning of virtual page 0×FF000 and cache is empty. Conflict misses are those misses
stored in row major order. The cache is initially empty and which occur due to contention of multiple blocks for
no pre-fetching is done. The only data memory references the same cache set. Compulsory misses occur due to
made by the program are those to array ARR. first time access to the block. The following
19. [MCQ] [GATE-2008 : 2M] sequence of accesses to memory blocks (0, 128, 256,
The cache hit ratio for this initialization loop is 128, 0, 128, 256, 128, 1, 129, 257, 129, 1, 129, 257,
(a) 0% (b) 25% 129) is repeated 10 times. The number of conflict
(c) 50% (d) 5% misses experienced by the cache is ______.


24. [MCQ] [GATE-2014 : 2M] 27. [NAT] [GATE-2022 : 1M]


An access sequence of cache block address of length A cache memory that has a hit rate of 0.8 has an access
N and contains n unique block addresses. The latency 10 ns and miss penalty 100 ns. An optimization
number of unique block addresses between two is done on the cache to reduce the miss rate.
consecutive accesses to the same block address is However, the optimization results in an increase of
bounded above by 𝓀-. What is the miss ratio if the cache access latency to 15 ns, whereas the miss
access sequence is passed through a cache of penalty is not affected. The minimum hit rate
associativity A ≥ 𝓀 exercising least-recently used (rounded off to two decimal places) needed after the
replacement policy? optimization such that it should not increase the
average memory access time is _____________.
n 1
(a) (b)
N N 28. [MCQ] [GATE-2021 : 2M]
1 k Assume a two-level inclusive cache hierarchy. L1
(c) (d) and L2, where L2 is the larger of the two. Consider
A n
the following statements.
S1: Read misses in a write through L1 cache do not
25. [MCQ] [GATE-2009 : 2M]
result in write backs of dirty lines to the L2.
Consider a 4-way set associative cache (initially
S2: Write allocate policy must be used in
empty) with total 16 cache blocks. The main
conjunction with write through caches and no-write
memory consists of 256 blocks and the request for
allocate policy is used with write back caches.
memory blocks is in the following order:
Which of the following statements is correct?
0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, (a) S1 is true and S2 is true.
73, 92, 155 (b) S1 is true and S2 is false.
Which one of the following memory block will NOT (c) S1 is false and S2 is true.
be in cache if LRU replacement policy is used? (d) S1 is false and S2 is false.
(a) 3 (b) 8
29. [NAT] [GATE-2020 : 1M]
(c) 129 (d) 216
A direct mapped cache memory of 1 MB has a block
size of 256 bytes. The cache has an access time of 3
Cache Updation Policy
ns and a hit rate of 94%. During a cache miss, it takes
20 ns to bring the first word of a block from the main
26. [MSQ] [GATE-2022: 1M]
memory, while each subsequent word takes 5 ns.
Let WB and WT be two set associative cache
The word size is 64 bits. The average memory access
organizations that use LRU algorithm for cache
time in ns (round off to 1 decimal place) is _____.
block replacement. WB is a write back cache and
30. [NAT] [GATE-2019 : 2M]
WT is a write through cache. Which of the following
A certain processor deploys a single-level cache.
statements is/are FALSE? The cache block size is 8 words and the word size is
(a) Each cache block in WB and WT has a dirty bit. 4 bytes. The memory system uses a 60-MHz clock.
(b) Every write hit in WB leads to a data transfer To service a cache miss, the memory controller first
from cache to main memory. takes 1 cycle to accept the starting address of the
(c) Eviction of a block from WT will not lead to block, it then takes 3 cycles to fetch all the eight
data transfer from cache to main memory. words of the block, and finally transmits the words
of the requested block at the rate of 1 word per cycle.
(d) A read miss in WB will never lead to eviction
The maximum bandwidth for the memory requested
of a dirty block from WB.


block at the rate of 1 word per cycle. The maximum The smallest cache size required to ensure an
bandwidth for the memory system when the average read latency of less than 6 ms is ______MB.
program running on the processor issues a series of
read operations is _______×106 bytes/sec. 33. [NAT] [GATE-2015 : 1M]
31. [NAT] [GATE-2017 : 2M] Assume that for a certain processor, a read request
The read access times and the hit ratios for different
takes 50 nanoseconds on a cache miss and 5
caches in a memory hierarchy are as given below:
nanoseconds on a cache hit. Suppose while running
cache Read access time (in Hit ratio a program, it was observed that 80% of the
nano seconds) processor’s read requests result in a cache hit. The
I-cache 2 0.8 average read access time in nanoseconds is _______.
D-cache 2 0.9
L2- 8 0.9
cache 34. [NAT] [GATE-2014 : 2M]
The memory access time is 1 nanosecond for a read
The read access time of main memory is 90 operation with a hit in cache, 5 nanoseconds for a
nanoseconds. Assume that the caches use the
read operation with a miss in cache, 2 nanoseconds
referred word-first read policy and the write back
for a write operation with a hit in cache and 10
policy. Assume that all the caches are direct mapped
nanoseconds for a write operation with a miss in
caches. Assume that the dirty bit is always 0 for all
cache. Execution of a sequence of instructions
the blocks in the caches. In execution of a program,
involves 100 instruction fetch operations, 60
60% of memory reads are for instruction fetch and
memory operand read operations and 40 memory
40% are for memory operand fetch. The average
read access time in nanoseconds (up to 2 decimal operand write operations. The cache hit-ratio is 0.9.
places) is ________. The average memory access time (in nanoseconds)
32. [NAT] [GATE-2016 : 2M] in executing the sequence of instructions is _____.
A file system uses an in-memory cache to cache disk
blocks. The miss rate of the cache is shown in the Common Data for next two questions:
figure. The latency to read a block from the cache is A Computer has a 256 Kbyte, 4-way set associative, write
1 ms and to read a block from the disk is 10 ms. back data cache with block size of 32 Bytes. The processor
Assume that the cost of checking whether a block sends 32-bit addresses to the cache controller. Each cache
exists in the cache is negligible. Available cache tag directory entry contains, in addition to address tag, 2
sizes are in multiples of 10 MB. valid bits, 1 modified bit and 1 replacement bit.

35. [MCQ] [GATE-2012 : 2M]


The number of bits in the tag field of an address is
(a) 11 (b) 14
(c) 16 (d) 27

36. [MCQ] [GATE-2012 : 2M]


The size of the cache tag directory is
(a) 160 Kbits
(b) 136 Kbits
(c) 40 Kbits
(d) 32 Kbits


37. [MCQ] [GATE-2011 : 2M] 40. [MCQ] [GATE-2008 : 2M]


An 8KB direct-mapped write back cache is For inclusion to hold between two cache level L1
organized as multiple blocks, each of size 32 bytes. and L2 in a multilevel cache hierarchy, which of the
The processor generates 32-bit addresses. The cache following are necessary?
controller maintains the tag information for each 1. L1 must be a write-through cache
cache block comprising of the following. 2. L2 must be write-through cache
1Valid bit 3. The associativity of L2 must be greater that of
1Modified bit L1
4. The L2 cache must be at least as large as the L1
As many bits as the minimum needed to identify the
cache
memory block mapped in the cache.
(a) 4 only (b) 1 and 4 only
What is the total size of memory needed at the cache
(c) 1,2 and 4 only (d) 1,2,3 and 4
controller to store meta-data (tags) for the cache?
(a) 4864 bits (b) 6144bits
Main Memory
(c) 6656 bits (d) 5376 bits
41. [MCQ] [GATE-2023 : 2M]
Common Data for next two questions: A 4 kilobyte (KB) byte-addressable memory is
A Computer system has an L1 and L2 cache, an L2 cache, realized using four 1 KB memory blocks. Two input
and a main memory unit connected as shown below. The address lines (IA4 and IA3) are connected to the chip
block size in L1 cache is 4 words. The block size in L2 select (CS) port of these memory blocks through a
cache is 16 words. The memory access times are 2 decoder as shown in the figure. The remaining ten
nanoseconds, 20 nanoseconds and 200 nanoseconds for L1 input address lines from IA11–IA0 are connected to
the address port of these blocks. The chip select (CS)
cache, L2 cache and main memory unit respectively.
is active high

38. [MCQ] [GATE-2010 : 2M]


When there is a miss in L1 cache and a hit in L2
cache, a block is transferred from L2 cache to L1
cache. What is the time taken for this transfer?
(a) 2 nanoseconds (b) 20 nanoseconds
(c) 22 nanosecond (d) 88 nanoseconds The input memory addresses (IA11–IA0), in
decimal, for the starting locations
39. [MCQ] [GATE-2010 : 2M] (Addr=0) of each block (indicated as X1, X2, X3, X4
When there is a miss in both L1 cache and L2 cache, in the figure) are among the
first a block is transferred from main memory to L2 options given below. Which one of the following
cache, and then a block is transferred from L2 cache options is CORRECT?
to L1 cache. What is the total time taken for these (a) (0, 1, 2, 3)
transfers? (b) (0, 1024, 2048, 3072)
(a) 222 nanoseconds (b) 888 nanoseconds (c) (0, 8, 16, 24)
(c) 902 nanoseconds (d) 968 nanoseconds (d) (0, 0, 0, 0)


42. [NAT] [GATE-2018 : 1M] 45. [MCQ] [GATE-2013 : 2M]


A 32-bit wide main memory unit with a capacity of A RAM chip has capacity of 1024 words of 8 bits
1 GB is built using 256 M ×4-bit DRAM chips. The each (1K × 8). The number of 2 × 4 decoders with
number of rows of memory cells in the DRAM chip enable line needed to construct a 16 K×16 RAM
is 214. The time taken to perform one refresh from 1K × 8 RAM is
operation is 50 nanoseconds. The refresh period is 2 (a) 4
milliseconds. The percentage (Rounded to the (b) 5
closest integer) of the time available for performing (c) 6
the memory read/write operations in the main (d) 7
memory unit is________.
46. [MCQ] [GATE-2010 : 2M]
43. [NAT] [GATE-2016 : 1M] A main memory unit with a capacity of 4 megabytes
A processor can support a maximum memory of 4 is built using 1M × 1-bit DRAM chips. Each DRAM
GB, where the memory is word-addressable (a word chip has 1K rows of cells with 1K cells in each row.
consists of two bytes). The size of the address bus of The time taken for a single refresh operation is 100
the processor is at least ______bits. nanoseconds. The time required to perform one
refresh operation on all the cells in the memory unit
44. [NAT] [GATE-2014 : 2M] is
Consider a main memory system that consists of 8 (a) 100 nanoseconds
memory modules attached to the system bus, which (b) 100*210 nanoseconds
is one word wide. When a write request is made, the (c) 100*220 nanoseconds
bus is occupied for 100 nanoseconds (ns) by the (d) 3200*220 nanoseconds
data, address, and control signals. During the same
100 ns, and for 500 ns thereafter, the addressed 47. [MCQ] [GATE-2009 : 1M]
memory module executes one cycle accepting and How many 32K × 1 RAM chips are needed to
storing the data. The (internal) Operation of different provide a memory capacity of 256 K-bytes?
memory modules may overlap in time, but only one (a) 8
request can be on the bus at any time. The maximum (b) 32
number of stores (of one word each) that can be (c) 64
initiated in 1 millisecond is _____. (d) 128

❑❑❑

1. (0.05 to 0.05) 2. (a) 3. (d) 4. (19 to 19)


5. (17 to 17) 6. (2 to 2) 7. (b) 8. (d)
9. (b) 10. (c) 11. (14 to 14) 12. (a)
13. (a) 14. (20 to 20) 15. (d) 16. (a)
17. (a) 18. (d) 19. (c) 20. (b)


21. (d) 22. (a, b, d) 23. (76 to 76) 24. (a)


25. (d) 26. (a, b, d) 27. (0.85 to 0.85) 28. (d)
29. (13.3 to 13.5) 30. (160 to 160) 31. (4.72 to 4.72) 32. (30 to 30)
33. (14 to 14) 34. (1.68 to 1.68) 35. (c) 36. (a)
37. (d) 38. (c) 39. (c) 40. (a)
41. (c) 42. (59 to 60) 43. (31 to 31) 44. (10000 to 10000)
45. (b) 46. (b) 47. (c)

1. (0.05 to 0.05)
1 instruction makes 1.4 memory accesses (references) on average.
1000 instructions make 1.4 × 1000 = 1400 memory references.
Total memory references = 1400; L1 miss rate = 0.1, so 1400 × 0.1 = 140 references miss in L1 and go to L2.
Of these, 7 miss in L2 (7 misses per 1000 instructions).
L2 miss rate = 7/140 = 1/20 = 0.05
Alternate approach:
Misses in L2 per memory reference = 7/1400 = 0.005
Miss rate of L2 = (misses in L2 per reference) / (miss rate of L1) = 0.005 / 0.1 = 0.05

2. (a)
L1 access time = 1 clock cycle, L2 access time = 8 clock cycles, main-memory access = 18 clock cycles, Tavg = 2 clock cycles.
L1 miss rate is twice that of L2. Assume L2 miss rate = x, so L1 miss rate = 2x.
Tavg = Hit time L1 + (miss rate L1) [Hit time L2 + (miss rate L2) (mm access time)]
2 = 1 + 2x [8 + 18x]
1 = 16x + 36x²  ⇒  36x² + 16x – 1 = 0, with a = 36, b = 16, c = –1
x = (–b ± √(b² – 4ac)) / 2a = (–16 ± √(256 + 144)) / 72 = (–16 ± √400) / 72
A miss rate cannot be negative, so
x = (–16 + 20) / 72 = 4/72 = 1/18 ≈ 0.0556

L2 miss rate (x) = 0.0556 ≈ 0.056
L1 miss rate = 2 × 0.056 = 0.1112 ≈ 0.111
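As a quick machine check of the quadratic above, the following C sketch (mine, not from the book; compile with -lm) solves 36x² + 16x – 1 = 0 and prints both miss rates.

#include <math.h>
#include <stdio.h>

int main(void) {
    /* Tavg = 1 + 2x(8 + 18x) = 2  =>  36x^2 + 16x - 1 = 0 */
    double a = 36.0, b = 16.0, c = -1.0;
    double x = (-b + sqrt(b * b - 4.0 * a * c)) / (2.0 * a);  /* positive root */
    printf("L2 miss rate = %.4f\n", x);       /* 0.0556 */
    printf("L1 miss rate = %.4f\n", 2 * x);   /* 0.1111 */
    return 0;
}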

3. (d)
(a) Incorrect: spatial locality means that words adjacent to the referenced word (data) are likely to be used soon. A larger block brings more adjacent words into the cache, so a smaller block size exploits spatial locality worse, not better.
(b) Incorrect. Example: physical address = 10 bits, cache = 32 bytes.
Block size = 2 bytes ⇒ number of lines = 32/2 = 16
Block size = 4 bytes ⇒ number of lines = 32/4 = 8
Block size = 16 bytes ⇒ number of lines = 32/16 = 2
A smaller block size gives a smaller word offset and more lines; the tag therefore does not shrink (in a direct-mapped cache it is unchanged, and in a fully associative cache — where the complete block is still transferred from main memory to the cache — it gets wider), and the total tag overhead (lines × tag bits) grows.
(c) Incorrect.
(d) Correct: a smaller block size means a lower miss penalty. If the block size is 2 words (or 2 bytes), only 2 words/bytes are brought from main memory to the cache on a miss; if it is 16 words (or 16 bytes), 16 must be brought, i.e., the miss penalty increases with the block size.
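The field widths used in the argument above can be tabulated with a short C sketch (illustrative only; the 10-bit address and 32-byte cache are the example values from this solution).

#include <stdio.h>

static int lg(unsigned x) { int n = 0; while (x >>= 1) n++; return n; }  /* log2 of a power of 2 */

int main(void) {
    int pa_bits = 10, cache_size = 32;            /* bytes */
    int block_sizes[] = {2, 4, 16};
    for (int i = 0; i < 3; i++) {
        int bs     = block_sizes[i];
        int offset = lg(bs);
        int lines  = cache_size / bs;
        int tag_dm = pa_bits - lg(lines) - offset; /* direct mapped     */
        int tag_fa = pa_bits - offset;             /* fully associative */
        printf("block=%2dB lines=%2d offset=%d tag(direct)=%d tag(fully assoc.)=%d\n",
               bs, lines, offset, tag_dm, tag_fa);
    }
    return 0;
}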


4. (19 to 19)
8-way set associative, cache size = 64 KB = 2^16 bytes, physical address = 32 bits.
Assume block size = 1 byte (the TAG width of this cache does not depend on the block size).
Number of lines = cache size / block size = 64 KB / 1 B = 64K = 2^16 lines.
If the cache were direct mapped: line offset = log2(2^16) = 16 bits, TAG = 32 – 16 = 16 bits.
But in the question it is 8-way set associative, and log2(8) = 3 bits:
Number of sets = number of lines / number of ways = 2^16 / 2^3 = 2^13 ⇒ INDEX (set offset) = 13 bits.
TAG = 32 – 13 = 19 bits.

5. (17 to 17)
Main memory = 2^32 bytes ⇒ physical address (P.A) = 32 bits.
Cache size = 32 KB = 2^15 bytes; block size = 64 bytes ⇒ word offset = log2 64 = 6 bits.
Direct mapped: number of lines = cache size / block size = 2^15 B / 2^6 B = 2^9 ⇒ line offset = 9 bits.
TAG = 32 – (9 + 6) = 32 – 15 = 17 bits.

6. (2 to 2)
Physical address (P.A) = 32 bits, cache size = 2 KB = 2^11 bytes, block size = 64 bytes = 2^6 bytes, tag field = 22 bits.
Word offset = log2(block size) = log2 64 = 6 bits.
Set offset = 32 – (22 + 6) = 4 bits ⇒ number of sets = 2^4 = 16.
Number of lines (cache blocks) = cache size / block size = 2^11 B / 2^6 B = 2^5 = 32 lines.
Number of sets = number of lines / N-way ⇒ 16 = 32 / N-way ⇒ associativity N-way = 32/16 = 2.

7. (b)
Main memory = 16 MB = 2^24 bytes ⇒ physical address = 24 bits.
Cache size = 64 KB = 2^16 bytes, block size = 256 B = 2^8 bytes, 4-way set associative.
Word offset = log2(block size) = log2 256 = 8 bits.
Number of lines = cache size / block size = 2^16 B / 2^8 B = 2^8 = 256.
Number of sets = number of lines / number of ways = 256/4 = 64 ⇒ set offset = 6 bits.
The set number is given by bits 13–8 of each address:
A1 = 0x42C8A4 ⇒ set 8, A2 = 0x546888 ⇒ set 40, A3 = 0x6A289C ⇒ set 40, A4 = 0x5E4880 ⇒ set 8.
A1 and A4 are mapped to the same cache set, and A2 and A3 are mapped to the same cache set ⇒ option (b).

8. (d)
Physical address = 32 bits, cache size = 16 kB, block size = 16 bytes.
Word offset = log2(block size) = log2 16 = 4 bits.
Fully associative cache ⇒ Index = 0 bits.
TAG = 32 – 4 = 28 bits.

9. (b)
P.A.S. = 2^p bytes, word length = 2^w bytes, cache = 2^n bytes, block size = 2^m words, K-way set associative.


Block size = 2m word (each word size = 2w byte)


⇒ 2m × 2w Byte
Block size=2m+w Byte

Word offset =m+w Bits TAG = 32 – (9 + 5)


CM size 2n TAG = 32 – 14
Number of LINEs = =
Block size 2m+ w TAG = 18 bit
N–M–W
Number of Lines 2
Number of Set = =
N − way K
 2 N–M–W 
Set offset ⇒ log2 (# SET) ⇒ log2(  
 K
 
⇒ Log2(2(N––M–W) – log2k
11. (14 to 14)
⇒ N – M – W – log 2 k
Direct mapped cache

16 ways set associative (assume log = n bit)


TAG : = P.A – (S.O + W.O)
N
⇒ P – ( N – M – W – log 2 k + M + W ) Number of Line  B 
Number of set = = 
N ways  16 
TAG  P – N + log 2 k
 
 N
 
 
Set offset = log 2  B 
 16 
 
 
10. (c) ⇒ log 2  N  – log 2 16 ⇒ log 2  N  – 4
MM size of 232 byte B B
N N
Physical address (P.A) = 32 bit 10 + log 2 + log 2 B = n + log 2   – 4 + log 2 B
B B
Block size = 32 bytes
10 = n – 4 ⇒ n = 14 bit
Word offset = [log232]
TAG = 14 bit
Word offset = 5 bit
Alternate Approach
Number of lines = 512; direct mapped cache Direct mapped cache:
Line offset (L.O) = [log2 512] TAG Line offset Word offset
Line Offset = 9 bit 10 bits
16 way set associate cache:
TAG (10 bit) Set offset Word offset


16-way set associate 1 set we can store 16 way (16


mm block), so in index now we require 4 less bits
compare to direct mapped
Now, TAG = 10 + 4
TAG = 14 bits 12. (a)
Proof. Physical address (P.A) = 40 bits
MM = 1 MB cache size – 16 kB, block size = 128 byte Cache size = 512 KB = 219
PA = 20 bit 8-way set associative
Direct method cache P.A = TAG + set offset + word offset
Word offset = [log2128] = 7 bit TAG + set offset + word offset = 40bits …..(1)
14
CM size 16 kB 2 Cache = Number of sets × block per set × block size
Number of LINES = = =
block size 128 B 27 512 KB = Number of Sets × 8 × block size
7
= 2 lines [216] KB = Number of sets × block size
L.O = 7 bit
set offset + word offset = 16 bit ……(2)

16-way set associate cache


TAG + set offset + word offset = 40 bit
Number of set = Number of line
N − ways TAG + 16 bits = 40 bits
27 TAG = 40 – 16
= = 23 set = 8 set
2 (16ways )
4
TAG = 24 bits
set offset = 3bit

13. (a)
Main memory = 220 byte
cache size = N word, block size = B word Block size = 16 bytes
Tag = 10 bits Number of lines = 212
Word offset = log2B
Mm size = 220 Byte
CM size N
Number of line = = P.A = 20 bit
Block size B
Word offset = [log216]
L.O = log2  N 
B Word offset = 4bit
# lines = 212

Line offset = 12bit


Direct mapped cache: CM Size


Number of lines =
Block size
16kB 214 B 9
= = = 2 lines
32B 25 B
Number of set = Number of LINE
N − way

29
= 2
= 27 = 128 set
2
Set offset = 7bit

Note: A : 10
B : 11
C : 12 TAG = 32 – (7 + 5)
D : 13 TAG = 20 bit
E : 14
F : 15
Main memory address = (E2021F)16

E 201 F 15. (d)


4 bits 12 bits 4 bits PA = 20 bit (mm = 1MB), cach size = 16 MB, Block
Tag = E size = 64 byte
Cache line number = 201 Direct mapping:

14. (20 to 20) CM size 14


32 Number of line = = 16kB = 2 = 28
Main (physical) memory = 4GB = 2 Byte Block size 64B 26
Cache size = 16 kB 2-way set associative;
Word size = 32 bits
Block size = 8 words
4-way set associative
Physical address = 32 bits
1 word size 32 bit= 4 byte
Block size = 8 words Lines 28
Number of set = = = 27
= 8 × 4 bytes N − way 2 1
= 32 bytes
Word offset = [log232] = 5 bit


Each set containt 8 lines


K=8
First line of any set
Set 0 = 0 × 8 = 0
4 way set associative
Set 1 = 1 × 8 = 8
8
Lines 2 Set 2 = 2 × 8 = 16
Number of set = = 2 = 26
N − way 2 Set number × k

0 to 7
0 × 8 to 0 × 8 + (8 – 1)
1 × 8 to 1 × 8 + (8 – 1)
8 to 15
16. (a)
Number of sets = each set consit ‘k’ lines
Set associative mapping
K mod S or mm request (mod) # set
(J mod V)
Find first Line of any set
Each set contiant 4 lines (k = 4)
Set 0 ⇒ 0 × 4 = 0
Set 1 = 1 × 4 = 4
Set 2 = 2 × 4 = 8
Set 3 = 3 × 4 = 12 (j mod v) × k to
(j mod v) × k + (k – 1)
For first Line, set number × k
0 × 4 to 0 × 4 + (4 – 1) = 0 to 3
1 × 4 to 1 × 4 + (4 – 1) = 4 to 7x × k to x × k + (k – 1)

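The mapping rule of Q.16 can be wrapped in a tiny helper. This C sketch is illustrative only (the function name and the example values are mine, not from the book); it prints the range of cache lines that main-memory block j may occupy in a k-way set-associative cache with v sets.

#include <stdio.h>

/* First and last cache line that main-memory block j can map to in a
   k-way set-associative cache with v sets (the lines of a set are stored
   consecutively, set s before set s + 1).                                 */
static void line_range(int j, int v, int k, int *first, int *last) {
    int set = j % v;
    *first = set * k;
    *last  = set * k + (k - 1);
}

int main(void) {
    int first, last;
    line_range(7, 4, 4, &first, &last);       /* example: v = 4 sets, k = 4 */
    printf("block 7 -> lines %d..%d\n", first, last);   /* 12..15          */
    return 0;
}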
17. (a)


18. (d)
Main memory size = 1 MB (220 B)
Cache size = 8 KB
1 word size = 1 byte
Block size = 128 words
Block size = 128 ×1 B = 128 bytes
Word offset = log2 128 = 7 bits
cm size 8 KB
Number of lines = = = 26
Block size 128 B
26 20. (b)
Number of sets = 2 = 24 = 16 sets
2 Page size = 4 KB
Set offset = log2 16 = 4 Page offset(d) = [log2 page size] = [log2 4KB]
d = 12 bit
Start address = (FF000)16
V.A = 32 bit

TAG SET OFFSET WORD OFFSET


Tag = 20 – (4 + 7) = 9 bits
17 bit 11 bit 4 bit

1413121110 9 8 7 6 5 4 3 21 0
11 bit

Paging V.A = 32 bit


19. (c)
Block size = 16 byte Page Number Page Offset (d)
Double = 8 byte 20 bit 12 bit
i.e each block cache [cache block] contain 2 element
APR[0][0] FF000
of the Array

Given
I element of the Array  Start from (FF000)16[Page
st
Row major = Accessing in serial order APR [0, 0]
Number] and Page offset (000)F
[0, 1] [0, 2] [0, 3]
(FF000 000)
For i = 0  j = 0 to 1023 (1024 times)
APR[0][0] =
For i = 1  j = 0 to 1023 (1024 times)
Frist element of array

Row Major


APR[0][4] = 5th element of the Ist row, i.e [0] [0], [0] Match APR [4] [0]
[1], [0, 2] [0, 3]
Ist row 5th element (i.e already 4 element passed)
4 × 8 = 32

21. (d)
2 way set associative, block size = 16 B, VA = 32 bit
Cache size = 64 KB
Block size = 16 Byte
Row Major Block offset = [log2 block size] = log216 = 4 bit
APR[4][0] = Ist element of the 5th row [Row No 0, 1,
cm size 64 KB
2, 3] Number of lines = =
block size 16B
And each row 1024 element
4 × 1024 × 8 216
= 4
= 212 lines
2
22 × 210 × 23= 215
OR Number of lines 212 11
Number of set = = = 2 set
4 × 1024 × 8 N- way 2
22 × 210 × 8 Set offset = 11 bit
2
21 × 8 = 4k × 8 VIVT
…. 217 216 215 214 213 212 211 210 29 28 27 26 25 24 23 22 Virtual Index and virtually Tag
21 20 32 bit
TAG Set offset Block 1 word offset
17 bit 11 bit 4 bit
2 way set associative
TAG = 32 – (11 + 4)
Row Major Tag memory size = number of lines × Tag bits
= 212 × 17 bit = 22 × 17 × 210 bit
APR[0][5] = First row, 6th element
= 68 k bits
[[0][0], 0.1, 0.2, 0.3, 0.4] OR
5 element Tag memory size = Number of set × Block per set ×
Tag bit
5 × 8 = 40 ( )16
= 211 × 2 × 17 bit = 2 × 2 × 17 × 210 bit
Tag memory size = 68 k bits

(in Hex virtual Set Index 22. (a, b, d)


address) Cache size = 2KB  211 B
APR [0] [0] FF 000 000 000 0000 0000
APR [0] [4] FF 000 020 000 0000 0010 Block size = 64 Byte = 26 B
APR [4] [0] FF 008 000 000 0000 0000 Word offset = log2(block size)  log2(26)
APR [0] [5] FF 000 028 000 0000 0010

4.50
GATE Wallah CS & IT Topic wise PYQs
Memory Hierarchy

Word offset (W.O)  6 bits


Physical memory (MM) = 64KB = 216Byte.
Physical address (P.A) = 16 bits

Cache size 211 B 5


Number of lines = = = 2 Lines
Block size 26 B
Line offset (L.O) = 5 bits.
Direct Mapping: Here P and S same main memory block and mapped
to the same cache line number.
⎯⎯⎯⎯
⎯ 16bit ⎯⎯⎯⎯⎯

TAG LINE(L.O) WORDOFFSET
OFFSET (W.O)
5bit 5bit 6bit
P:OxA248 Here Q and R are the different main memory block but

TAG(5bit) Line Word offset mapped to the same cache line number.
offset (6bit)

10100 01001 001000


Line No : 9
Q:OxC28A
TAG(5bit) Line Word offset
Offset (5bit) (6bit)

11000 01010 001010


Line No : 10
R:OxCA8A
TAG(5bit) Line Word offset
Every access to S is a hit.: True.
Offset (5bit) (6bit)
Once P is brought to the cache it is never evicted.
11001 01010 001010
: True.
Line No : 10
S:OxA262 At the end of the execution only R and S reside in

TAG(5bit) Line Word offset the cache.: False


Offset (5bit) (6bit) Every access to R evicts Q from the cache. : True.
10100 01001 100010
Line No : 9
Line Number

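The hit/miss behaviour claimed in Q.22 can also be replayed in code. The sketch below is only an illustration (the addresses and cache parameters come from the question); it models the 32-line, 64-byte-block direct-mapped data cache and counts hits over the 40 accesses P Q R S × 10.

#include <stdio.h>

#define LINES 32          /* 2 KB cache / 64-byte blocks */
#define OFFSET_BITS 6     /* 64-byte block               */

int main(void) {
    unsigned addr[4] = {0xA248, 0xC28A, 0xCA8A, 0xA262};   /* P, Q, R, S */
    unsigned tag[LINES];
    int valid[LINES] = {0};
    int hits = 0, misses = 0;

    for (int pass = 0; pass < 10; pass++) {
        for (int i = 0; i < 4; i++) {
            unsigned block = addr[i] >> OFFSET_BITS;   /* main-memory block number */
            unsigned line  = block % LINES;            /* direct-mapped line        */
            if (valid[line] && tag[line] == block) {
                hits++;
            } else {
                misses++;                              /* evict whatever was there  */
                valid[line] = 1;
                tag[line] = block;
            }
        }
    }
    printf("hits = %d, misses = %d of 40\n", hits, misses);   /* 19 hits, 21 misses */
    return 0;
}

P and S fall in the same memory block (line 9) and are never evicted, while Q and R keep displacing each other on line 10 — exactly the pattern used above to justify options (a), (b) and (d).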
23. (76 to 76)


Number of CM blocks = 256


(Number of Lines)
2-way set associative
256
Number of set = = 128
2

SET S = 128

K MODS = i

K MOD128 = i
K : mm block number
S : Number of cm set
i = cache set number
1st time

24. (a)
A≥k

1 2 3 1 2 5 1 4 6 ⇒ N=9

Unique block address (1, 2, 3, 4, 5, 6) ⇒ n =6


6
Min miss [cold/compulsory miss.]
9
OR

124156189178253⇒ N = 15

n = 9 Unique block address N=9


9
min miss = (compulsory)
2nd time 15
minimum number of misses = n. [compulsory miss]
for maximum misses
k concept

Worst case: 71 3 6 7
51 2 4 5
At most k [k =3]
Associativity A  k
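The claim of Q.24 — with LRU and associativity A ≥ 𝓀 only the first access to each block misses, so the miss ratio is n/N — can be sanity-checked with a tiny simulation. The sketch below is mine, not the book's; it models a single fully-associative LRU set whose size is chosen at least as large as the maximal reuse distance of the 15-access example sequence used above, and it reports n = 9 misses.

#include <stdio.h>

#define A 8                 /* associativity, chosen >= k for this trace */

int main(void) {
    int seq[] = {1,2,4,1,5,6,1,8,9,1,7,8,2,5,3};   /* N = 15, n = 9 unique */
    int n = sizeof seq / sizeof seq[0];
    int way[A];             /* way[0] is the most recently used block      */
    int used = 0, misses = 0;

    for (int i = 0; i < n; i++) {
        int b = seq[i], pos = -1;
        for (int j = 0; j < used; j++)
            if (way[j] == b) { pos = j; break; }
        if (pos < 0) {                      /* miss                         */
            misses++;
            if (used < A) used++;
            pos = used - 1;                 /* new slot, or the LRU victim  */
        }
        for (int j = pos; j > 0; j--)       /* move block b to the MRU slot */
            way[j] = way[j - 1];
        way[0] = b;
    }
    printf("accesses = %d, misses = %d\n", n, misses);   /* 15, 9 */
    return 0;
}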


Case I: 25. (d)


Maximum allocation in same cache set (Any) Mm = 256 block
Cache = 16 block
number of line = 16
4 way set Associative
number of set = Number of LINES = 16
N − ways 4

Number of SETS=4
K MOD S = i
Or
Cache set address = mm block mod number of cm set
k MOD 4 = i
Case II: If that unique address repeated

k=3
At most k  unique = 3 address
1, 2, 4
No other extra miss

Total number of misses = n


n
Miss ratio =
N
Alternate approach:
Total access = N so divide by ‘N’ so option (c) and (d)
wrong
Unique block address = n
1
So minimum not option (b) also wrong
N
n
N


26. (a, b, d) Cache Hit Rate = hnew.


Tavg = hnew × tcnew + (1 – hnew)(M.P + tcnew)
30 = 15 × hnew + (1 – hnew) (100 + 15)
30 = 15 hnew + 115 – 115 hnew
85 = 100 hnew
Write through (Both cm & mm update at a same time) hnew = 85/100
(simultaneously) so no need of dirty bit
hnew = 0.85
Write Back: (first cache is update & mm update later).
Method 2
Cache Hit Rate [h] = 0.8
Cache Access Time [tc] = 10 ns.
Miss Penalty (M.P) = 100 ns.
False So, its Hierarchical Access.
(a) Incorrect, WB has dirty bit. Tavg = cache time + (1 – h) miss penalty
(b) Incorrect/False: = 10 + (1 – 0.8) 100
(c) Corect because in WT both cache memory & main Tavg = 30 ns
memory update at the same time.
Memory with optimization (new)
(d) False/Incorrect, depends on mapping technique &
Wants to reduce the miss rate ie increase the Hit Rate.
replacement algorithm.
New miss penalty = 100 ns (not affected in new)
Tavgnew = 30 ns. (not affected in new)
Cache Access Time (tcnew) = 15 ns.
Cache Hit Rate = hnew
Tavgnew = Cache time + (1 – hnew) miss penalty
27. (0.85 to 0.85)
30 = 15 + (1 – hnew) 100
Range (0.85 to 0.85)
30 = 15 + 100 – 100 hnew
Cache Hit Rate [h] = 0.8
100 hnew = 85
Cache Access Time [tc] = 10 ns.
hnew = 85/100
Miss penalty (M.P) = 100 ns.
hnew = 0.85
Tavg = h × tc + (1 – h) (M.P + tc)
= 0.8 × 10 + (1 – 0.8) (100 + 10)
= 8 + 0.2(110)
Tavg = 30 ns.
Memory with optimization (new)
Wants to reduce the miss rate ie increase the Hit Rate. 28. (d)
Miss penalty = 100 ns (not affected in new) S1: Write through → simultaneous update in L1 and
Tavg = 30 ns. NO role of dirty bit because here, its simultaneous
update.
Cache Access Time(tcnew) = 15 ns
If miss occur directly replace the block

4.54
GATE Wallah CS & IT Topic wise PYQs
Memory Hierarchy

S2: Write through: No write allocate. 8


1 3
Write back: Write allocation + +  Transmit 1 word  = 12 cycle
( Accept ) ( Fetch )  
Incorrect  per cycle 

Band width = Total data = 32Byte


12 cycle  1 
12   6
 60  10 

⇒ 32Byte  560  106 sec


12
29. (13.3 to 13.5)
Block size =256 bytes = 160 × 106 bytes/sec
Word size = 64 bit = 8 bytes
Cache bit ratio = 94%
Cache access time [tc] = 3 nesec.
First word takes 20 and each subsequent word take
5 nsec. 31. (4.72 to 4.72)
256 B
Number of word = = 32
8B
Tavg = 0.94 × 3 + (1 – 0.94) (3 + 20 + 31×5)
⇒ 2.82 + 0.06 [178]
⇒ 2.82 + 10.68
⇒ 13.5 sec Tavg read instn fetch = H1T1 + (1 – H1)H2(T2 + T1)
+ (1 –H1) (1 – H2) H3 (Tm + T2 + T1)
Instn fetch = 0.8 × 2 + (1 – 0.8) 0.9 (8 + 2) + (1 – 0.8)
(1 – 0.9) [90 + 8 + 2]
⇒ 1.6 + (0.2)(0.9)10 + (0.2)(0.1)(100)

30. (160 to 160) Tavgread instn fetch = 5.4 nsec


Clock frequency = 60 MHz Tavgread operand fetch = H1T + (1 – H1)H2(T2 + T1)
1 1 +(1 – H1) (1 – H2) H3 (Tm + T2 + T1)
Cycle time = = sec
clock frequency 60  l06 = 0.9 × 2 + ( 1 – 0.9) 0.9 (8 + 2) (0.1)(0.1)
Cache block size = 8 Words (90 + 8 + 2)

1 word size = 4 Bytes = 1.8 + 0.9 + 1 = 3.7 nsec

Block size = 4 Bytes Tavgread acceesstime = frequency of instn fetch × Tavgread instn
fetch + frequency of operand fetch × Tavg operand fetch
Block size = 8 × 4 byte = 32 Byte
= 0.60 × 5.4 + – 0.40 × 3.7
Total time taken to
Tavgread = 4.72 n sec
Transfer cache block = access time

4.55
GATE Wallah CS & IT Topic wise PYQs
Computer Organization and Architecture

32. (30 to 30) 33. (14 to 14)


Cache access time [tc] = 1msec Hit ratio = 80%
Disk access time [td] = 10 msec Cache read hit takes = 5 nsec
Case I: When cache size is 10 MB that time miss Read miss = 50 nsec
rate 10.8. Tavg = 0.80 × 5 + (1 – 0.80) × 50
Tavg = h × tc + *(1 – h) (tD + tC) ⇒ 4 + (0.20) 50
⇒ 0.2 ×1 + (0.8) (10 +1) ⇒ 0.2 + 0.8 (11) ⇒ = 4 + 10

Tavg = 9 msec at 10 MB cache size Tavg = 14 n sec

Case II: When size is 20 MB that time miss rate


= 0.6
Tavg = 0.4 × 1 + 0.6 [10 + 1]
⇒ 0.4 + 0.6 [11] ⇒ 0.4 + 6.6 34. (1.68 to 1.68)
Tavg = 7msec at 20 MB 100 instruciton fetch operation + 60 memmory
operand read + 40 memory operand write
Case III: When cache size is 30 MB, that time miss
Total # Instn/operatin = 200
rate = 0.4
100 instruction fetch operation (memory read) and 60
Hit rate (h) = 1– 0.4 = 0.6
memory read
Tavg = h × tc + (1 – h) (tD + tc)
Total memory read operation = 100 + 60 = 160
⇒ 0.6 × 1 + (0.4) (10 + 1) ⇒ 0.6 + 0.4(11)
Read: tC = 1 ns
⇒ 0.6 + 4.4
Hit ratio = 0.9
Tavg = 5msec at 30 MB cache size
Treadmiss = 5 nsec
Case IV: When size 40 MB, that time miss rate 0.35 Tavgread = 0.9 × 1 + (0.1)(5)
Tavg = h × tc + (1 – h) (tD + tC) ⇒ 0.9 + 0.5 = 1.4 nsec
⇒ 0.65 (1) + (0.35) (10 + 1) ⇒ 0.65 + 0.35 (11) Total time required to perform read operation = 160 ×
⇒ 0.65 + 3.85 1.4 = 224 nsec
Tavg = 4.5msec at cache size 40 MB 40 memory write operation

Case V: when cache size is 50 MB that time miss Write:


rate = 0.3 tC = 2nsec
Tavg = h × tC + (1 – h) (tD + tC) Hit ratio = 0.9
⇒ 0.7 × 1 + (0.3) (10 + 1) Twirtemiss = 10 ns
Tavg = 4msec Tavgwrite = 0.9 × 2 + 0.1 (10)
Hence, the smallest cache size required to ensure an ⇒ 1.8 + 1
average read latency of less than 6 ms will be 30 MB. Tavgwrite = 2.8


Total time taken to perform write operation = 40 × 2.8 = 23 × 20 × 210 bits = 160 k bits
= 112 nsec
Total time taken for 200 instructions = 224 + 112 =
336 nsec
336
Average memory access time = = 1.68 n sec
200
37. (d)
Physical address = 32 bit 1 valid bit
Cache size = 8 kB 1 modified bit
Block size = 32 byte
35. (c) Word offset = [log2 Block size] ⇒ [log232]
Tag = 16 bit Word offset = 5 bit
Physical address = 32 bit CM size 8kB 213
# lines = = = = 28 lines
Cache size = 256 k byte Block size 32byte 2 5

Block size = 32 byte


#LINES = 256
4 way set associative [22]
Line offset [L.0] =[log2256]
Word offset = [ log2 blog size] ⇒ [log2 32]
Word offset = 5 bit L.O = 8 bit
18 (direct mapped cache)
CM size 256kB 2 B
Number lines = = = T
Block size 32B 2 B
= 213 Lines

Number Lines 213 11


Number set = = 2 = 2 set
N.way 2 TAG ⇒ 32 – (8 + 5)
S. O = 11 bit
TAG = 19 bit
Write back cache = Tag entry = 19 + 1 + 1 = 21 bits
Tag memory size = number of lines × Tag entry size
⇒ 256 × 21 bit

TAG = 16bits TAG memory size = 5376 bit

36. (a) 38. (c)


Tag directory size = Number of of lines × tag entry in
bits
= 213 × [16 + 2 + 1+1] bits

Time taken to transfer L2 to L1 = time to read word


from L2 + update or write/store in L1 41. (c)
= 20 + 2 = 22 nsec IA4 IA3 Connect to decodes.
IA4 IA3
0 0 : x1 enabled
0 1 : x2 enabled
1 0 : x3 enabled
1 1 : x4 enabled
39. (c)
16 8 4 2 1
Time taken to transfer data from mm to L2 = Time to
4 3 2 1
read word from mm + time to store or update write in 2 2 2 2 20
L2. IA6 IA5 IA4 IA3 IA2 IA1 IA0
4[200 + 20] ⇒ 4 × 220 = 880 nsec 0 0→ x1 enabled (0)
Time taken to transfer data from L2 to L1 = time to 0 1→ x2 enabled (8)
read word form L2 + time to store or update / write 1 0→ x3 enabled (16)
word in L1 1 1→ x4 enabled (24)
= 20 + 2 = 22 nsec.
Total time to transfer data
= 880 (mm to L2) + 22 (L2 to L1)
From mm to L2 and L2 to L1 = 902 nsec.

42. (59 to 60)


In chip total number of rows = 214
Time to per form one refresh operation = 50 nsecs
Total time taken to refresh operation in all rows
40. (a) = 214 × 50 nsecs
⇒ 24 × 210 × 50 = 819200nsec = 0.8192 msec.
Refresh period = 2 msec
= 0.8192 msec
1. L1 must be write through cache: not correct always
Percentage of time spend in refersh operation =
(we can use write back cache)
0.8192
2. L2: not correct = 40.96%
2
3. Not correct
Percentage of time available for read write operation
4. Correct : its necessary
=100 – 40.96 = 59.04%
L 2  Ll Alternate method
Total time taken for refresh operation = 0.8192 msec


Refresh period = 2 msec In every 100 ns, initiate 1 word data to the bus.
Time available for read/write operationi = 2 – 0.8192  bus→ initate one request (1 word request)
⇒ 1.1808 msec For 1 milli second number of words we can initate
%time available for read/write operation = 1 millisec 10 –3
1.1808 ⇒ = –9
= 104 = 10,000
= 59.04% 100 n sec 100  10
2

45. (b)
43. (31 to 31)
Main memory size = 4 GB [232 byte] RAM CHIP size = 1 k × 8 (1024 word, 8 bit)
1 word size = 2 Byte Wants to contruct RAM of size = 16 k × 16
Memory is word addressable
16 k  16
1 word = 2Byte Number of RAM chip requried =
1k ×8
4 G Byte → 4 G Byte Word → 2 G word ⇒ 16 × 2 RAM chip ⇒ 16 lines
2 Byte
Word addressable Decoder (4 × 16) required
Mm size = 2G word
⇒ 21230 word = 231 words

Address bus size = 31 Bit

44. (10000 to 10000)


Starting 0 to 100 ns ⇒ data is available in bus for Asking how many 2 × 4 decodes required
module 0 Total 2 × 4 decodes is = 4 + 1 = 5
0 to 100 ⇒ module 0 ⇒ 100 ns + 500 = 600 nsec
100 to 200 ⇒ module 1 ⇒ 200 ns + 500 = 700 nsec
200 to 300 ⇒ module 2 ⇒ 300 ns + 500 = 800 nsec
300 to 400 ⇒ module 3 ⇒ 400 ns + 500 = 900 nsec
400 to 500 ⇒ module 4 ⇒ 500 ns + 500 = 1000 nsec
500 to 600 ⇒ module 5 ⇒ 600 ns + 500 = 1100 nsec
600 to 700 ⇒ module 6 ⇒ 700 ns + 500 = 1200 nsec
700 to 800 ⇒ module 7 ⇒ 800 ns + 500 = 1300 nsec


1 refresh operation takes = 100 nsec.


1 Chip refresh time = 210 × 100 nsec
Note: all ram chip are refresh in parallel

46. (b) Total refersh time = 210 100nsec


Mm capacity = 4m byte
1DRAM size = 1 m × 1 bit

47. (c)
Memory capcity = 256 k Byte
Each ram chip size = 32 k × 1 bit
Numer of ram chip required = MM size
4 m  Byte 4 m  8bit RAM chip size
Number of RAM chip = =
1m  1bit 1m  1bit 8
256 k Byte 256 k × 88 Bi t
= = = 64
= 32 RAM chip 32k  1bit 32k  1 bit
Note: Dram chip one refresh ⇒ one row is refresh
(one row cells)
Total number of rows = 210
In 1 chip, total number of refresh required = 210 refresh
operations

❑❑❑


CHAPTER

5
Modes of Data Transfer 4. [NAT] [GATE-2016 : 2M]
The size of the data count register of a DMA
1. [MCQ] [GATE-2022: 1M] controller is 16 bits. The processor needs to transfer a
Which one of the following facilitates transfer of bulk file of 29, 154 kilobytes from disk to main memory.
data from hard disk to main memory with the highest The memory is byte addressable. The minimum
number of times the DMA Controller needs to get the
throughput?
control of the system bus from the processor to
(a) DMA based I/O transfer transfer the file from the disk to main memory
(b) Interrupt driven I/O transfer is_____.
(c) Polling based I/O transfer 5. [MCQ] [GATE-2011 : 2M]
(d) Programmed I/O transfer On a non-pipelined sequential processor, a program
2. [NAT] [GATE-2021 : 1M] segment, which is a part of the interrupt service
routine, is given to transfer 500 bytes from an I/O
Consider a computer system with DMA support. The
device to memory.
DMA module is transferring one 8-bit character in one
Initialize the address register
CPU cycle from a device to memory through cycle
Initialize the count to 500
stealing at regular intervals. Consider a 2 MHZ
processor. If 0.5% processor cycles are used for LOOP: Load a byte from device
DMA, the data transfer rate of the device is _____ bit Store in memory at address given by address register
per second. Increment the address register
Decrement the court
3. [MCQ] [GATE-2020 : 1M]
If count! = 0 go to LOOP
Consider the following statements:
Assume that each statement in this program is
I. Daisy chaining is used to assign priorities in equivalent to a machine instruction which takes one
attending interrupts. clock cycle to execute if it is a non-load/store
II. When a device raises a vectored interrupt, the instruction. The load-store instructions take two clock
CPU does polling to identify the source of cycles to execute.
interrupt. The designer of the system also has an alternate
III. In polling, the CPU periodically checks the status approach of using the DMA controller to implement
bits to know if any device needs its attention. the same transfer. The DM controller requires 20
clock cycles for initialization and other overhead.
IV. During DMA, both the CPU and DMA controller
Each DMA transfer cycle takes two clock cycles to
can be bus masters at the same time.
transfer one byte of data from interrupt driven
Which of the above statements is/are TRUE? program-based input-output?
(a) I, II only (b) I and IV only (a) 3.4 (b) 4.4
(c) I and III only (d) III only (c) 5.1 (d) 6.7


Secondary Memory 9. [MCQ] [GATE-2013 : 2M]


Consider a hard disk with 16 recording surfaces (0-
6. [NAT] [GATE-2023 : 1M]
15) having 16384 cylinders (0-16383) and each
A keyboard connected to a computer is used at a rate cylinder contains 64 sectors (0-63). Data storage
of 1 keystroke per second. The computer system polls capacity in each sector is 512 bytes. Data are
the keyboard every 10 ms (milli seconds) to check for organized cylinder-wise and addressing format is
a keystroke and consumes 100 μs (micro seconds) for <cylinder no., surface no., sector no>. A file of size
each poll. If it is determined after polling that a key 42797 KB is stored in the disk and the starting disk
has been pressed, the system consumes an additional location of the file is < 1200, 9, 40>. What is the
200 μs to process the keystroke. Let T1 denote the cylinder number of the last sector of the file, if it is
fraction of a second spent in polling and processing a stored in a contiguous manner?
keystroke. (a) 1281
In an alternative implementation, the system uses (b) 1282
(c) 1283
interrupts instead of polling. An interrupt is raised for
(d) 1284
every keystroke. It takes a total of 1 ms for servicing
an interrupt and processing a keystroke. Let T2 denote 10. [MCQ] [GATE-2011 : 2M]
the fraction of a second spent in servicing the interrupt An application loads 100 libraries at start-up. Loading
and processing a keystroke. each library requires exactly one disk access. The seek
T1 time of the disk to a random location is given as 10
The ratio is ___________. (Rounded off to one ms. Rotational speed of disk is 6000 rpm. If all 100
T2
decimal place) libraries are loaded from random locations on the
disk, how long does it take to load all libraries? (The
7. [NAT] [GATE-2015 : 2M]
time to transfer data from the disk block once the head
Consider a typical disk that rotates at 15000 rotations has been positioned at the start of the block may be
per minute (RPM) and has a transfer rate of 50 × 106 neglected.)
bytes/sec. If the average seek time of the disk is twice (a) 0.50s
the average rotational delay and the controller’s (b) 1.50s
transfer time is 10 times the disk transfer time, the (c) 1.25s
average time (in milliseconds) to read or write a 512- (d) 1.00s
byte sector of the disk is _____. Common Data for next two questions:
8. [NAT] [GATE-2015 : 2M] A hard disk has 63 sectors per track, 10 platters each with
Consider a disk pack with a seek time of 4 2 recording surfaces and 1000 cylinders. The address of a
milliseconds and rotational speed of 10000 rotations sector is given as a triple 〈c,h,s〉, where c is the cylinder
per minute (RPM). It has 600 sectors per track and number, h is the surface number and s is the sector number.
Thus, the 0th sector is addressed as〈0,0,0〉, the 1st sector as
each sector can store 512 bytes of data. Consider a file
〈0,0,1〉, and so on.
stored in the disk. The file contains 2000 sectors.
Assume that every sector access necessitates a seek, 11. [MCQ] [GATE-2009 : 2M]
And the average rotational latency for accessing each The address 〈400,16,29〉 corresponds to sector
sector is half of the time for one complete rotation. number:
The total time (in milliseconds) needed to read the (a) 505035 (b) 505036
entire file is ____. (c) 505037 (d) 505038


12. [MCQ] [GATE-2009 : 2M] 13. [MCQ] [GATE-2008 : 2M]


For a magnetic disk with concentric circular tracks,
The address of 1039th sector is
the latency is not linearly proportional to the seek
(a) 〈0,15,31〉 distance due to
(a) Non-uniform distribution of requests
(b) 〈0,16,30〉
(b) Arm starting and stopping inertia
(c) 〈0,16,31〉 (c) Higher capacity of tracks on the periphery of the
(d) 〈0,17,31〉 platter
(d) Use of unfair arm scheduling policies.

❑❑❑


1. (a) 2. (80000 to 80000) 3. (c) 4. (456 to 456)


5. (a) 6. (10.2 to 10.2) 7. (6.1 to 6.2) 8. (14020 to 14020)
9. (d) 10. (b) 11. (c) 12. (c)
13. (b)

1. (a) 0.5
 2  106 = 10000 cycle
In DMA bulk amount of data transferred from Hard 100
disk (secondary Memory) to main memory with the Total number of cycle taken by the DMA for data
highest through put. transfer = 10, 000 cycle
In 1 cycle – 8 bit data transfer
In programmed I/O transfer CPU Time is depend on
speed of I/O device, so utilization is very less & not In 10,000 cycle = 10000 × 8 = 80000
used for bulk amount of data transfer. Data transfer rate = 80000

Interrupt driven I/O transfer is used for small amount OR


of data transfer with the involvement of the CPU. CPU cycles can be completed in ½ * 106 seconds.
But in DMA (Direct Memory Access) based I/O Therefore, there will be 2 * 106 cycles in a second
transfer bulk amount of data transfer without the plus an additional 10,000 DMA cycles, or 0.5%, at
involvement of the CPU. & DMA has the highest which time 8 bits are transmitted.
priority so through put is very high. Thus, 80,000 bits will be sent in total in one second.

2. (80000 to 80000) 3. (c)


Clock frequency = 2 MHz I Daisy Channing: true
1 II false
Cycle time = sec = 0.5  sec
2MHz III true
1 −6
1 cycle time =  10 sec IV false
2
At a time either CPU or DMA can be master on bus.
In one second total number of CPU cycle = 2 × 106
cycle
0.5 % of the CPU cycle are used to data transfer by
the DMA = 0.5% × 2 × 106 =


4. (456 to 456) In DMA transfer Tim = 20 + 2 × 500 = 1020 Cycle


Data count register = 16 bit 3502
Speed up = = 3.4
Count value : How many number of byte/word 1020
transferred by the DMA from input output (I/O) to
memory in one cycle
Total count value = 216 – 1
In one time. Total # byte transferred
6. (10.2 to 10.2)
16
= 2 – 1 = 65,535 Byte 1 key stroke per second,
Total # byte to be transferred = 29, 154 Byte Each polling takes = 100μsec.
Number of time DMA needs the control on system After every 10 m sec polling is done.
bus
If key is press then additional = 200 μsec
 29,154  1024 Byte  In every 10 × 10–3 sec →1 poll.
=  = 456
 65,535 Byte  1
In 1 sec → −3
 10+2 = 100 poll in one second.
10  10
Each poll takes = 100μsec
Total 100 poll takes = 100 × 100 = 10000μsec = 10
msec.
5. (a)
If key stroke is pressed then addition = 200μsec
Initialize the address register → 1 cycle Total time for polling and processing pressing key
Initialize the count to 500 → 1 cycle stroke = 10.000 + 200 = 10,200μsec = 10.2msec
T1 in 1 sec — 10.2 msec.
LOOP7: Load a byte from device → 2 cycle
10.2
Store in memory at address given by address register T1 =
1000
→ 2 cycle
T2 Alternative approach: Interrupt
Increment the address register → 1 cycle
Total time taken to servicing interrupt = 1msec.
Decrement the court → 1 cycle and processing key stroke.
If count! = 0 go to LOOP → 1 cycle 1
T2 =
1000
In each one interation of loop takes
(In 1 sec = 1000 msec)
= 2+ 2 + 1 + 1 +1 = 7 Cycle
= 1000 × 10–3 = 1 sec
Loop execute = 500 times (Iteration) T1 10.2 1000
=  = 10.2
Total time taken in 500 interatin = 500 × 7 = 3500 T2 1000 1
Cycle
Total time in ISR = 3500 + 2 = 3502 Cycle
500 Byte transfer using DMA


7. (6.1 to 6.2) 600 sector /track each sector capcity = 512 Byte
Average disk access time or average read/write time 10000 rpm
= Average S.T + Average R.T + D.T.T + overhead (if 10000 rotation in 60 second
any given)
60
15000 RPM 1 rotation …..
10000
15000 rotation in 60 sec
2 ⇒ 6 × 10–3 sec
60 1 1000
In 1 rotation = = sec 
15000 500 250 1000 1 rotation takes = 6 msec

1 rotation time = 4 m sec 1


Average roational latency = 6
2
1
Average rotational Latency =  RT = 2msec Average Rotational latency = 3 msec
2
Average S.T = 2 × average R.T 1 Track capacity = 600 × 512 Byte
⇒ 2 × 2 = 4msec In 1 Rotational 1 complete track

Average S.T = 4msec 600 × 512 Byte takes _______6msec

Transfer rate = 50 × 106 Byte/second 6


1 Byte → msec
600  512
50 × 106 Byte …..1second
1 512  6 1
1 byte ……… sec 512 Byte (1 sector) → = msec
50  10 6 600  12 100
512 D.T.T = 0.01 msec
512 Byte _____  10 –6
50 Total time required for 1 sector
–6
⇒ 10.24 × 10 = 4 + 3 + 0.01 = 7.01 msec
0.01024 msec Total time requird for 2000 sector = 2000 × 7.01
D.T.T = 0.0102 msec = 14020 msec
Control time = 10 × 0.01024 msec.
= 0.1024 m sec.
D.A.T. = 4 + 2 + 0.01024 + 0.1024
D.A.T = 6.11 msec
6.1 m sec 9. (d)
<c, h, s > <1200, 9, 40 >
16 Recording surface, 64 sector
Starting sector = <c, h, s >
8. (14020 to 14020) Number of the file = <1200, 9, 40>
Time required to read one sector = seek time +  1200 × 16 × 64 + 9 × 64 + 40 = 1229416
average R.T + Data transfer time Starting (first) sector number of the file = 1229416
Seek time (S.T) = 4 msec 16384 cylinder


85594
Number of cylinders required to cross =
16  64
= 83 cylinder.
In 83 cylinder the number of sector = 83×16 ×64
= 84992
Number of sector remaining = 85594 – 84992
= 602 sector
602
Surface number = = 9 Surface.
64
(1 more cylinder required)
Number of cylinder = 83 + 1 = 84
Starting = 1200 + 84 = 1284
File size = 42,797 kB & each sector capacity = 512
byte.
Total number of sector needed (Required) to store the
42797 KB
file = = 85594 Sector
512 Byte 10. (b)
Range [o to 85593] Total 100 libraries
Starting (first) sector Address of the file = 1229416 Seek time = 10 msec
Last sector number of the file = 1229416 + 85594 – 1 D.T.T (neglected)
= 1315009
Rotational speed = 6000 rpm
Last Sector number = 1315009
<c, h, s > Average disk access time = S.T + Average R.T +
D.T.T
1315009
Cylinder number = = 1284.188 1
64  6 Average R.T = × rotational time
2
1284 cyclinder number
6000 rotation in 60 second
Number of sectors covered = 1284 × 16 × 64
60
= 1314816 sectors.  1 Rotation = = 10 msec
6000
Remaining sector = 1315009 – 1314816 = 193 sector
1
193 Average R.T = × 10 msec = 5 msec
Surface Number = =3 2
64
Total time 1 library access = S.T + R.L + D.T.T
64 sector per surface 16 Recording surface
⇒ 10 + 5 = 15 msec
 c, h , s 
Total tme taken for 100 libraries = 100 × 15 msec
 
= 100 × 5 × 10–3 m sec = 1.5 second
16 64
OR
Alternative Approach
Number of sector required = 85594
Need to cross = 85594 sector


11. (c) = 504000 + 1008 + 29 = 505037


Alternate approach:
By formula: <c, h, s> <400, 16, 29>
St: number of sector per track = 63
tc: number of track per cylinder = 2×10 = 20
63 Sector per track
10 platter: 2 recording surface Sector Numebr = s + st × h + st × tc × c
each  S + St (h + tc × c)
1000 cylinder  29 + 63 [16 + 20 × 400]
= 29 + 63 × 16 + 63 × 20 × 400 = 505037

12. (c)
63 sector per track
1 cylinder = 10 × 2 × 63 = 1260

(a) < 0, 15, 31>  15 × 63 + 31 = 945 + 31 = 976


(b) < 0, 16, 30>  16 × 63 + 30 = 1008 + 30 = 1038
(c) < 0, 16, 31>  16 × 63 + 31 = 1008 + 31 = 1039
(d) < 0, 17, 31>  17 × 63 + 31 = 1071 + 31 = 1102

13. (b)
Every time the head changes tracks, its speed and
direction change, which is just a change in motion or
the result of inertia.
< c, h, s> <400, 16, 29> Hence option (b) is correct.
400 cylinder = 400 [10 ×2 × 63]
16 surface = 16 × 63
29 sector = 29
Sector Number = 400 [10 ×2 × 63] + 16 × 63 + 29
= 400 × [1260] + 16 × 63 + 29
❑❑❑

1. Data Types and Operators ......................................................................................................... 5.1 – 5.3

2. Control Flow Statements ........................................................................................................... 5.4 – 5.16

3. Functions and Storage Classes .................................................................................................. 5.17 – 5.32

4. Pointers and Strings .................................................................................................................. 5.33 – 5.55

5. Arrays and Linked List ................................................................................................................ 5.56 – 5.67

6. Stacks and Queues .................................................................................................................... 5.68 – 5.78

7. Trees .......................................................................................................................................... 5.79 – 5.103

8. Hashing ......................................................................................................................................5.104– 5.110
