
26. What is the function of Direct Memory Access? Discuss the three main registers present in it.

Direct Memory Access (DMA) is a feature in computer systems that allows devices to transfer data
directly to and from the memory without involving the CPU. DMA helps in offloading data transfer
tasks from the CPU, freeing it up to perform other tasks and improving overall system performance.

The function of Direct Memory Access can be summarized as follows:

 Data Transfer: The primary function of DMA is to facilitate data transfer between peripheral
devices (such as disk drives, network interfaces, and sound cards) and the main memory
(RAM) without CPU intervention. This allows for faster and more efficient data movement,
especially for large data transfers or continuous streaming tasks.
 CPU Offloading: By allowing devices to transfer data directly to and from memory, DMA
reduces the burden on the CPU. Instead of the CPU managing each data transfer, DMA
controllers handle the data movement independently, allowing the CPU to focus on
executing instructions and performing other computational tasks.
 Improved System Performance: DMA helps in improving overall system performance by
reducing CPU overhead and allowing for concurrent data transfers. This is particularly
beneficial in systems with high-speed peripherals or when performing multiple I/O
operations simultaneously.

Now, let's discuss the three main registers present in a typical DMA controller:

 Address Register (AR): The Address Register holds the memory address where data will be
transferred to or from. It specifies the starting address of the data transfer operation. When
initiating a DMA transfer, the CPU sets the appropriate memory address in the Address
Register.
 Count Register (CR): The Count Register holds the number of data bytes to be transferred
during the DMA operation. It specifies the size of the data transfer. Once the DMA operation
is initiated, the DMA controller decrements the Count Register each time a data byte is
transferred until it reaches zero, indicating the completion of the transfer.
 Control Register (CTR): The Control Register contains various control bits that configure the
behavior of the DMA controller and specify the transfer mode (e.g., read from device, write
to device, auto-increment address, etc.). It controls aspects such as the direction of data
transfer, transfer mode, transfer width, and interrupt generation upon completion of the
transfer.

These registers work together to manage and control the DMA transfer operations efficiently,
allowing for seamless data movement between devices and memory without CPU intervention.
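To make the three registers concrete, here is a minimal sketch in Python of how a DMA-style controller could use them for a block transfer. The class, the register names as attributes, and the byte-at-a-time loop are illustrative assumptions, not a model of any particular DMA chip:

class DMAController:
    def __init__(self, memory):
        self.memory = memory          # shared main memory (a bytearray)
        self.address_register = 0     # AR: next memory address to use
        self.count_register = 0       # CR: bytes remaining to transfer
        self.control_register = {}    # CTR: direction, mode, interrupt flags

    def setup(self, address, count, control):
        """CPU programs the controller once, then returns to other work."""
        self.address_register = address
        self.count_register = count
        self.control_register = control

    def run(self, device_data):
        """Controller moves the data; the CPU is not involved per byte."""
        i = 0
        while self.count_register > 0:
            if self.control_register.get("direction") == "to_memory":
                self.memory[self.address_register] = device_data[i]
            self.address_register += 1    # auto-increment the address
            self.count_register -= 1      # count down to zero
            i += 1
        if self.control_register.get("interrupt_on_done"):
            print("DMA complete: raising interrupt to CPU")

memory = bytearray(64)
dma = DMAController(memory)
dma.setup(address=16, count=4, control={"direction": "to_memory",
                                        "interrupt_on_done": True})
dma.run(device_data=b"DATA")
print(memory[16:20])   # bytearray(b'DATA')

The CPU's only involvement is the one-time setup call; the while loop stands in for the controller's hardware stepping through addresses as the count register counts down to zero.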

25. A program runs in 10 seconds on computer A which has a 2GHz clock. A computer designer is
trying to build a computer B, which will run this program in 6 seconds. The designer has determined
that a substantial increase in the clock rate is possible but this increase will affect the rest of CPU
design, causing computer B to require 1.2 times as many clock cycles as computer A for this program.
What clock rate the designer has to target?

To solve this problem, let's first calculate the number of clock cycles required for the program to run
on computer A.

Computer A:

Clock frequency (fA) = 2 GHz = 2 * 10^9 Hz

Execution time (tA) = 10 seconds

Number of clock cycles (NA) = fA * tA

Now, let's calculate the number of clock cycles required for the program to run on computer B,
considering the given information:

Computer B:

Execution time (tB) = 6 seconds

Clock cycles per program (NB) = 1.2 * NA (since computer B requires 1.2 times as many clock cycles as
computer A)

From the formula for clock cycles, we have:

NB = fB * tB

We need to find the clock frequency (fB) for computer B.

Substituting the expressions for NB and tB:

1.2 * NA = fB * 6

Now, let's substitute the expression for NA:

1.2 * (fA * tA) = fB * 6

Now, plug in the given values:

1.2 * (2 * 10^9 Hz * 10 s) = fB * 6

Solving for fB:

fB = (1.2 * 2 * 10^9 Hz * 10 s) / 6

fB = (24 * 10^9) / 6 Hz

fB = 4 * 10^9 Hz

So, the designer has to target a clock rate of 4 GHz for computer B.
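The arithmetic can be checked with a few lines of Python:

f_a = 2e9          # computer A clock rate (Hz)
t_a = 10           # computer A execution time (s)
t_b = 6            # target execution time for computer B (s)

n_a = f_a * t_a            # clock cycles on A: 2e9 * 10 = 2e10
n_b = 1.2 * n_a            # B needs 1.2 times as many cycles
f_b = n_b / t_b            # required clock rate for B
print(f_b / 1e9, "GHz")    # 4.0 GHz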
24. Evaluate X=(A+B)*(C+D) using 3 address, 2 address, 1 address and 0 address instruction formats

Let's evaluate the expression X = (A + B) * (C + D) using the four instruction formats:

Given variables:

A, B, C, D are operands

+ and * are operators

3-Address Instruction Format:

In this format, each instruction specifies two source operands and one destination operand, so memory operands can be used directly:

ADD R1, A, B (R1 ← M[A] + M[B])

ADD R2, C, D (R2 ← M[C] + M[D])

MUL X, R1, R2 (M[X] ← R1 × R2)

Explanation:

Instruction 1 adds A and B and stores the sum in register R1.

Instruction 2 adds C and D and stores the sum in register R2.

Instruction 3 multiplies the two sums and stores the result in memory location X.

2-Address Instruction Format:

In this format, each instruction specifies two operands; one of them also serves as the destination, so the result overwrites that operand.

MOV R1, A (R1 ← M[A])

ADD R1, B (R1 ← R1 + M[B])

MOV R2, C (R2 ← M[C])

ADD R2, D (R2 ← R2 + M[D])

MUL R1, R2 (R1 ← R1 × R2)

MOV X, R1 (M[X] ← R1)

Explanation:

Instructions 1 and 2 compute A + B in register R1.

Instructions 3 and 4 compute C + D in register R2.

Instruction 5 multiplies the two sums, leaving the product in R1, and instruction 6 stores it in memory location X.

1-Address Instruction Format:

In this format, each instruction specifies one operand. The operation is performed between that operand and the accumulator (AC), and the result is stored back in the accumulator, so the first sum must be parked in a temporary memory location T:

LOAD A (AC ← M[A])

ADD B (AC ← AC + M[B])

STORE T (M[T] ← AC)

LOAD C (AC ← M[C])

ADD D (AC ← AC + M[D])

MUL T (AC ← AC × M[T])

STORE X (M[X] ← AC)

Explanation:

Instructions 1 and 2 compute A + B in the accumulator, and instruction 3 saves the sum in temporary location T.

Instructions 4 and 5 compute C + D in the accumulator.

Instruction 6 multiplies the accumulator by the saved sum in T, and instruction 7 stores the product in memory location X.

0-Address Instruction Format:

In this format, operations take their operands implicitly from the top of a stack, so ADD and MUL carry no addresses at all; only PUSH and POP reference memory:

PUSH A (push M[A] onto the stack)

PUSH B (push M[B])

ADD (pop B and A, push A + B)

PUSH C (push M[C])

PUSH D (push M[D])

ADD (pop D and C, push C + D)

MUL (pop the two sums, push (A + B) × (C + D))

POP X (pop the product into M[X])

Explanation:

Instructions 1 to 3 leave A + B on top of the stack.

Instructions 4 to 6 push C + D on top of it.

Instruction 7 pops both sums and pushes their product, which instruction 8 stores in memory location X.
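Since the 0-address format is exactly a stack machine, a tiny interpreter makes the evaluation order explicit. This is a sketch in Python, assuming memory is a simple name-to-value dictionary:

memory = {"A": 2, "B": 3, "C": 4, "D": 5}
stack = []

program = [("PUSH", "A"), ("PUSH", "B"), ("ADD", None),
           ("PUSH", "C"), ("PUSH", "D"), ("ADD", None),
           ("MUL", None), ("POP", "X")]

for op, operand in program:
    if op == "PUSH":                 # push M[operand] onto the stack
        stack.append(memory[operand])
    elif op == "POP":                # pop top of stack into M[operand]
        memory[operand] = stack.pop()
    elif op == "ADD":                # ADD and MUL carry no addresses:
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)          # operands are implied by the stack
    elif op == "MUL":
        b, a = stack.pop(), stack.pop()
        stack.append(a * b)

print(memory["X"])   # (2 + 3) * (4 + 5) = 45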

23. State the differences between I/O mapped I/O and Memory mapped I/O. What are the
advantages and disadvantages?

FEATURE: Addressing
Memory-mapped I/O: I/O devices are accessed like any other memory location.
I/O-mapped I/O: They cannot be accessed like any other memory location.

FEATURE: Address size
Memory-mapped I/O: Devices are assigned 16-bit address values.
I/O-mapped I/O: Devices are assigned 8-bit address values.

FEATURE: Instructions used
Memory-mapped I/O: The instructions used are LDA, STA, etc.
I/O-mapped I/O: The instructions used are IN and OUT.

FEATURE: Bus cycles
Memory-mapped I/O: The cycles involved during operation are memory read and memory write.
I/O-mapped I/O: The cycles involved during operation are I/O read and I/O write.

FEATURE: Registers communicating
Memory-mapped I/O: Any register can communicate with the I/O device.
I/O-mapped I/O: Only the accumulator can communicate with I/O devices.

FEATURE: Design complexity
Memory-mapped I/O: Simple to implement and design.
I/O-mapped I/O: More complex to implement and design.

FEATURE: Instruction set
Memory-mapped I/O: Uses the same instructions for accessing both memory and I/O devices.
I/O-mapped I/O: Special instructions are used for accessing I/O devices.

Memory-Mapped I/O:

Advantages:

 Simplicity: Memory-mapped I/O simplifies the design of the system by treating I/O devices as
memory locations. This allows programmers to access I/O devices using the same
instructions as they use for accessing memory, making programming simpler and more
uniform.
 Efficiency: Memory-mapped I/O can be more efficient than I/O-mapped I/O because it
eliminates the need for separate instructions or addressing modes for I/O operations. This
can lead to faster data transfer rates and reduced overhead.
 Direct Access: Memory-mapped I/O provides direct access to I/O devices without the need
for intermediate controllers or special instructions. This can result in faster response times
for I/O operations.
Disadvantages:

 Address Space Limitation: Memory-mapped I/O can limit the available address space for
memory if a large number of I/O devices are mapped to memory addresses. This can be a
problem in systems with a limited address space.
 Resource Conflicts: Since memory-mapped I/O shares the same address space as main
memory, conflicts may arise between memory access and I/O access. Careful management
and coordination are required to avoid conflicts and ensure proper operation.
 Security Risks: Memory-mapped I/O exposes I/O devices as memory-mapped regions,
potentially increasing security risks such as unauthorized access or buffer overflow attacks if
proper security measures are not implemented.

I/O-Mapped I/O:

Advantages:

 Isolation: I/O-mapped I/O isolates I/O operations from memory operations by using separate
address spaces for memory and I/O devices. This helps prevent conflicts between memory
access and I/O access, enhancing system stability and reliability.
 Flexibility: I/O-mapped I/O provides more flexibility in managing I/O resources by allowing
dedicated address spaces for I/O devices. This can simplify system configuration and
resource allocation, especially in complex systems with multiple I/O devices.
 Security: By isolating I/O operations from memory operations, I/O-mapped I/O can enhance
system security by reducing the risk of unauthorized access or security vulnerabilities
associated with memory-mapped I/O.

Disadvantages:

 Complexity: I/O-mapped I/O adds complexity to system design and programming by requiring separate instructions or addressing modes for I/O operations. This can increase development time and effort, as well as introduce potential for errors or compatibility issues.
 Overhead: I/O-mapped I/O may introduce additional overhead in terms of address decoding
and bus arbitration, especially in systems with a large number of I/O devices. This can affect
system performance and efficiency.
 Limited Uniformity: I/O-mapped I/O may lack uniformity in accessing I/O devices compared
to memory-mapped I/O, as it requires separate instructions or addressing modes specific to
I/O operations. This can make programming more complex and less intuitive for developers
accustomed to memory-mapped I/O.

22. Define virtual memory. What are its advantages? What is demand paging?

Virtual memory is a memory management technique used by modern computer operating systems
to provide the illusion of a larger and contiguous main memory (RAM) to programs than is physically
available. It abstracts the physical memory into a larger address space, allowing programs to use
more memory than what is actually installed on the system. Virtual memory makes use of both RAM
and secondary storage (usually a hard disk drive) to create this illusion.

Advantages of virtual memory include:


 Increased Program Capacity: Virtual memory allows programs to access more memory than
is physically available, thereby increasing the overall capacity of programs that can run
concurrently on a system.
 Simplified Memory Management: Virtual memory simplifies memory management for both
the operating system and programmers. Programs can be written assuming that they have
access to a large and contiguous memory space, without needing to worry about physical
memory constraints.
 Isolation: Each process has its own virtual address space, providing isolation between
processes. This ensures that processes cannot access each other's memory, improving
system stability and security.
 Memory Protection: Virtual memory provides memory protection mechanisms that prevent
processes from accessing memory locations that they do not have permission to access. This
helps prevent accidental or malicious access to sensitive data.
 Transparent Disk I/O: Virtual memory uses secondary storage (usually a hard disk) to store
portions of the memory that are not currently in use. This allows for transparent disk I/O
operations, where data is automatically swapped between RAM and disk as needed without
requiring intervention from the program or user.

Demand paging is a technique used in virtual memory systems to optimize memory usage by loading
only the portions of a program into memory that are needed at any given time. Instead of loading
the entire program into memory at once, demand paging loads pages (fixed-size blocks of memory)
into memory on an as-needed basis. When a program accesses a portion of memory that is not
currently in RAM, a page fault occurs, and the operating system loads the required page from disk
into memory.
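The page-fault path can be sketched in a few lines of Python; the page size, the dictionary-based "disk" and page table, and the toy-sized page contents are illustrative assumptions, not any particular OS's implementation:

PAGE_SIZE = 4096

disk = {0: b"code...", 1: b"data...", 7: b"stack..."}   # backing store
page_table = {}                                          # resident pages only

def read(virtual_address):
    page = virtual_address // PAGE_SIZE
    offset = virtual_address % PAGE_SIZE
    if page not in page_table:                  # page fault!
        print(f"page fault on page {page}, loading from disk")
        page_table[page] = disk[page]           # OS loads the page into RAM
    frame = page_table[page]
    return frame[offset % len(frame)]           # toy pages are short, so wrap the offset

read(7 * PAGE_SIZE + 2)   # faults, loads page 7, then returns a byte
read(7 * PAGE_SIZE + 3)   # already resident: no fault this time

Only the pages actually touched ever leave the backing store, which is exactly the memory saving that demand paging provides.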

21. Consider the following instructions-

I1: R1 = 100

I2: R1 = R2 + 4

I3: R2 = R4 + 25

I4: R4 = R1 + R3

I5: R1 = R1 + 30

calculate sum of WAR, RAW and WAW dependencies for the above instructions

To calculate the sum of the WAR (Write After Read), RAW (Read After Write), and WAW (Write After Write) dependencies, first list what each instruction reads and writes:

I1: R1 = 100 (writes R1)

I2: R1 = R2 + 4 (writes R1, reads R2)

I3: R2 = R4 + 25 (writes R2, reads R4)

I4: R4 = R1 + R3 (writes R4, reads R1 and R3)

I5: R1 = R1 + 30 (writes R1, reads R1)

RAW (true dependency): a later instruction reads a register whose latest writer is an earlier instruction.

I4 reads R1, which was last written by I2: RAW on R1 (I2 → I4)

I5 reads R1, which was last written by I2: RAW on R1 (I2 → I5)

RAW count = 2

WAR (anti-dependency): a later instruction overwrites a register that an earlier instruction reads.

I3 writes R2, which I2 reads: WAR on R2 (I2 → I3)

I4 writes R4, which I3 reads: WAR on R4 (I3 → I4)

I5 writes R1, which I4 reads: WAR on R1 (I4 → I5)

WAR count = 3

WAW (output dependency): a later instruction writes a register that an earlier instruction also writes.

I2 writes R1, which I1 also writes: WAW on R1 (I1 → I2)

I5 writes R1, which I2 also writes: WAW on R1 (I2 → I5)

WAW count = 2

Sum of WAR, RAW, and WAW dependencies = 3 + 2 + 2 = 7

(This counts only the nearest producer or consumer pair for each register, as is conventional.)
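The same nearest-pair counting can be automated; here is a short Python check of the tally above:

# Each entry: (registers written, registers read)
instrs = [
    ({"R1"}, set()),          # I1: R1 = 100
    ({"R1"}, {"R2"}),         # I2: R1 = R2 + 4
    ({"R2"}, {"R4"}),         # I3: R2 = R4 + 25
    ({"R4"}, {"R1", "R3"}),   # I4: R4 = R1 + R3
    ({"R1"}, {"R1"}),         # I5: R1 = R1 + 30
]

raw = war = waw = 0
for j, (writes_j, reads_j) in enumerate(instrs):
    for r in reads_j:    # RAW: j reads r written by some earlier i
        if any(r in instrs[i][0] for i in range(j)):
            raw += 1
    for r in writes_j:   # WAR: j overwrites r read by some earlier i
        if any(r in instrs[i][1] for i in range(j)):
            war += 1
        # WAW: j overwrites r written by some earlier i
        if any(r in instrs[i][0] for i in range(j)):
            waw += 1

print(raw, war, waw, "sum =", raw + war + waw)   # 2 3 2 sum = 7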

20. What are the 4 parameters of pipeline?

The four main parameters of a pipeline are:

1. Throughput: Throughput refers to the rate at which the pipeline completes instructions or data items. It is measured in completed instructions or data units per unit of time (e.g., instructions per second or MIPS). Higher throughput indicates better performance and efficiency of the pipeline.
2. Latency: Latency, also known as pipeline depth or pipeline delay, refers to the time taken for
an instruction or data to traverse through all stages of the pipeline from input to output. It
represents the delay experienced by each instruction or data unit as it progresses through
the pipeline. Lower latency indicates faster processing and reduced delay.
3. Pipeline Length: Pipeline length refers to the number of stages or segments in the pipeline.
Each stage corresponds to a specific operation or task performed on the input data or
instruction. Longer pipelines typically provide higher throughput but may also result in
higher latency and resource utilization.
4. Pipeline Hazards: Pipeline hazards are conditions or situations that can potentially disrupt
the smooth flow of instructions or data through the pipeline, leading to performance
degradation or incorrect results. The three main types of pipeline hazards are:
 Structural Hazards: Arise when multiple pipeline stages attempt to access the same hardware
resource simultaneously.
 Data Hazards: Arise when a dependent instruction requires data that is not yet available due
to preceding instructions still being processed in the pipeline.
 Control Hazards: Arise when the pipeline incorrectly predicts the outcome of branch
instructions, leading to wasted processing cycles.

19. What is the advantage of non-restoring division over restoring division?

Advantages of non-restoring division over restoring division:

 Simplicity: The non-restoring division algorithm is simpler to execute than the restoring division algorithm. Because it never performs a restoring addition after a negative partial remainder, it requires fewer steps and less computational overhead per iteration.
 Potentially Faster Execution: Non-restoring division algorithm can potentially be faster than
restoring division algorithm, especially for certain types of operands or input values. This is
because non-restoring division performs fewer iterations and has less overhead in each
iteration compared to restoring division.
 Reduced Hardware Complexity: Non-restoring division algorithm can be implemented with
simpler hardware components compared to restoring division algorithm. This can result in
lower hardware costs and complexity for implementing division functionality in computer
systems.
 Better Pipelining Efficiency: Non-restoring division algorithm may exhibit better pipelining
efficiency compared to restoring division algorithm. Its simpler structure and reduced
dependency on previous iterations make it easier to parallelize and pipeline, leading to
potential performance improvements in pipelined architectures.

18. which addressing mode makes use of Program counter instead of General Purpose Register?

The addressing mode that makes use of the Program Counter (PC) instead of a General Purpose
Register is called the "Relative Addressing Mode."

In Relative Addressing Mode, the operand's address is specified as a displacement relative to the
current value of the Program Counter (PC). The CPU calculates the effective address by adding the
displacement to the value stored in the PC.

Relative addressing mode is commonly used in branch instructions, where the program needs to
jump to a location that is a certain number of instructions away from the current instruction. The
displacement value specifies the offset or distance from the current instruction to the target
instruction.

17. A system has to store 8KB memory. What are the maximum and minimum addressing bits
needed to store this address? What are the word length corresponding to the maximum and
minimum addressing bit configurations?

The number of addressing bits needed depends on the word length (the size of one addressable unit):

2^(Addressing Bits) = Memory size / Word length

Given that the system has to store 8 KB of memory (where 1 KB = 1024 bytes):

8 KB = 8 × 1024 bytes = 2^13 bytes = 2^16 bits

Minimum addressing bits: the fewest addresses are needed when the addressable word is largest. With a word length of 1 byte, there are 2^13 = 8192 addressable units:

2^(Addressing Bits) = 2^13, so Addressing Bits = 13

So the minimum number of addressing bits is 13, and the corresponding word length is 1 byte (8 bits).

Maximum addressing bits: the most addresses are needed when the addressable word is smallest. If the memory is bit-addressable (word length = 1 bit), there are 8192 × 8 = 65536 = 2^16 addressable units:

2^(Addressing Bits) = 2^16, so Addressing Bits = 16

So the maximum number of addressing bits is 16, and the corresponding word length is 1 bit.

In summary: the minimum configuration is 13 address bits with a 1-byte word, and the maximum configuration is 16 address bits with a 1-bit word.
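A quick Python check of both configurations:

import math

memory_bits = 8 * 1024 * 8          # 8 KB expressed in bits

for word_bits, label in [(1, "1-bit"), (8, "1-byte")]:
    units = memory_bits // word_bits
    print(label, "words ->", int(math.log2(units)), "address bits")
# 1-bit  words -> 16 address bits  (maximum)
# 1-byte words -> 13 address bits  (minimum)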

16. What is cache coherency? Why is it important?

Cache coherency refers to the consistency of data stored in multiple caches that reference the same
main memory locations in a computer system. In a multiprocessor or multi-core system where each
processor or core has its own cache, cache coherency ensures that all caches have consistent and up-
to-date copies of shared data.

Cache coherency is important for several reasons:

 Data Consistency: Cache coherency ensures that all processors or cores see the most recent
and correct data when accessing shared memory locations. Without cache coherency,
different processors or cores may have different versions of the same data, leading to
inconsistencies and errors in program execution.
 Correctness of Parallel Execution: In a multiprocessor or multi-core system, multiple
processors or cores may execute instructions concurrently. Cache coherency ensures that the
results of concurrent execution are correct by maintaining data consistency across caches.
 Performance: Cache coherency helps improve system performance by minimizing the need
to access main memory for shared data. When one processor or core updates a shared
memory location, cache coherency protocols ensure that all other caches containing copies
of that data are invalidated or updated accordingly, reducing the frequency of cache misses
and memory accesses.
 Synchronization and Communication: Cache coherency facilitates synchronization and
communication between processors or cores in a multi-processing environment. By ensuring
that all caches have consistent views of shared data, cache coherency simplifies coordination
and communication between concurrent threads or processes.

15. An instruction is stored at location 300 with its address location 301. The address field has a value
of 250. The address R1 contains the number 200. Evaluate the effective address if addressing mode is
(a) direct (b) relative.

Instruction stored at location 300: Instruction Address: 300,

Operand Address: 301

Address field value: 250

Register R1 value: 200

(a) Direct Addressing Mode: In direct addressing mode, the effective address is simply the address
specified in the address field of the instruction.

Effective Address = Address field value = 250


(b) Relative Addressing Mode: In relative addressing mode, the effective address is calculated by adding the address field value to the contents of the Program Counter (PC). The instruction occupies locations 300 and 301, so after the instruction is fetched the PC points to the next instruction at address 302.

Effective Address = Address field value + Contents of PC

Effective Address = 250 + 302 = 552

So, the effective address for: (a) Direct Addressing Mode is 250 (b) Relative Addressing Mode is 552. (The contents of R1 = 200 would be used only in register-based modes such as register indirect or indexed addressing, which are not asked here.)

13. Analyze how we can make 10% of a program 90 times faster using Amdahl’s Law

Amdahl's Law is a principle used to analyze the potential speedup of a system when only a portion of
the system is improved. It states that the overall speedup of a system due to an improvement in one
part is limited by the fraction of time that the improved part is used. Amdahl's Law is given by the
formula:

Speedup=1/((1-P)+P/S)

Where:

 P is the proportion of the program that benefits from the improvement (expressed as a
decimal).

 S is the speedup factor of the improved part.

Given that we want to make 10% of a program 90 times faster, we can plug in the values into
Amdahl's Law:

 P=0.10 (10% of the program)

 S=90 (90 times faster)

Speedup=1/ ((1−0.10)+ 0.10/90)

Speedup=1/(0.90+0.10/90)

Speedup=1/(0.90+0.001111)

Speedup=1/0.901111

Speedup ≈ 1.11

So, according to Amdahl's Law, the maximum speedup that can be achieved by making 10% of the program 90 times faster is approximately 1.11 times: the untouched 90% of the program dominates the execution time.
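The formula is a one-liner in Python, which also shows the hard cap set by the unimproved fraction:

def amdahl_speedup(p, s):
    """Overall speedup when fraction p of the work is made s times faster."""
    return 1 / ((1 - p) + p / s)

print(round(amdahl_speedup(0.10, 90), 3))   # ~1.11
# Even an infinite speedup of that 10% caps the gain at 1/0.9:
print(round(1 / (1 - 0.10), 3))             # ~1.111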

12. If the greedy cycles are 3, 7 and (1,8), what will be the minimum average latency?

Given the greedy cycles: (3), (7), and (1, 8).

The average latency of a cycle is the sum of its latencies divided by the number of latencies in the cycle:

1. For the greedy cycle (3), the average latency is 3 / 1 = 3.

2. For the greedy cycle (7), the average latency is 7 / 1 = 7.

3. For the greedy cycle (1, 8), the average latency is (1 + 8) / 2 = 4.5.

The minimum average latency (MAL) is the smallest of these values:

MAL = min(3, 7, 4.5) = 3

So, the minimum average latency is 3 time units.
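The computation in Python, for any list of greedy cycles:

greedy_cycles = [(3,), (7,), (1, 8)]

# average latency of each cycle = sum of latencies / cycle length
averages = [sum(c) / len(c) for c in greedy_cycles]
print(averages)        # [3.0, 7.0, 4.5]
print(min(averages))   # 3.0 -> minimum average latency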

11. What is forbidden latency? Give examples.

In a pipelined functional unit, a forbidden latency is a number of clock cycles between two successive task initiations that causes a collision, i.e., two tasks attempting to use the same pipeline stage in the same cycle. Forbidden latencies are read off the reservation table: for each row (stage), the differences between the time steps at which that stage is used are forbidden.

Example: if the reservation table shows that a stage is used by one task at time steps 1 and 4, then the difference 4 − 1 = 3 is a forbidden latency. Initiating a new task 3 cycles after the previous one would make both tasks demand that stage simultaneously. The remaining (permissible) latencies are the ones from which greedy cycles are constructed.

10. A non-pipelined instruction takes 75 ns to process a task. The same task can be processed in a 6
segment pipeline with a clock cycle of 15 ns. Determine the speedup ratio of the pipeline for 100
tasks

To determine the speedup ratio of the pipeline compared to the non-pipelined approach, we calculate the total time taken to process 100 tasks using both methods and then compare them.

For the non-pipelined approach: Total time = Time per task × Number of tasks = 75 ns × 100 = 7500 ns

For the pipelined approach: with k = 6 segments and a clock cycle of tp = 15 ns, the first task takes k cycles to fill the pipeline, and each of the remaining n − 1 tasks completes one cycle later:

Total time = (k + n − 1) × tp = (6 + 100 − 1) × 15 ns = 105 × 15 ns = 1575 ns

Now, we can calculate the speedup ratio:

Speedup ratio = Total time for non-pipelined approach / Total time for pipelined approach = 7500 ns / 1575 ns ≈ 4.76

So, the speedup ratio of the pipeline for 100 tasks is approximately 4.76. As the number of tasks grows, the speedup approaches the ideal value 75 / 15 = 5.
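The numbers check out in Python:

t_seq, n = 75, 100        # non-pipelined time per task (ns), task count
k, t_p = 6, 15            # pipeline segments, clock cycle (ns)

t_nonpipe = t_seq * n             # 7500 ns
t_pipe = (k + n - 1) * t_p        # (6 + 100 - 1) * 15 = 1575 ns
print(t_nonpipe / t_pipe)         # ~4.76
print(t_seq / t_p)                # 5.0 = asymptotic maximum speedup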

9 . Define any 3 parallel processing techniques.

Here are three parallel processing techniques:

1. Task Parallelism: Task parallelism involves dividing a larger task into smaller subtasks that can
be executed concurrently. Each subtask is assigned to a separate processing unit or thread,
allowing multiple tasks to be executed simultaneously. Task parallelism is commonly used in
applications where independent tasks can be identified and executed in parallel, such as data
parallelism in image processing or parallelizing loops in numerical simulations.

2. Data Parallelism: Data parallelism involves distributing data across multiple processing units
or threads and performing the same operation on different portions of the data concurrently.
This technique is well-suited for applications that involve processing large datasets or arrays,
such as matrix multiplication, vector operations, or parallel sorting algorithms. Data
parallelism can be implemented using SIMD (Single Instruction, Multiple Data) instructions,
multi-core processors, or distributed computing frameworks like MapReduce.

3. Pipeline Parallelism: Pipeline parallelism involves breaking down a task into multiple stages
or phases and executing each stage concurrently in a pipeline fashion. Each stage of the
pipeline processes a different portion of the input data, and the output of one stage serves
as the input to the next stage. Pipeline parallelism is commonly used in applications with
sequential dependencies between stages, such as instruction execution in CPUs, video and
audio processing, or compiler optimizations. Efficient scheduling and balancing of workload
across pipeline stages are critical for maximizing performance in pipeline parallelism.

8. What scheme is used to prevent data loss between fast and slow memory interfaces

To prevent data loss between fast and slow memory interfaces, a buffering scheme based on a First-In-First-Out (FIFO) queue is commonly used.

A FIFO is a queue-based buffer where data is stored and retrieved in the order it was received, similar to standing in line at a grocery store. The first data element to enter the FIFO buffer is the first one to be processed and removed. The fast interface can write into the buffer at its own rate while the slow interface drains it, so no data is dropped as long as the buffer does not overflow.

7.What is the formula for data rate ?

The formula for data rate (also known as data transfer rate or throughput) is:

Data Rate=Amount of Data/Time

Where:

 Data Rate is the rate at which data is transferred, typically measured in bits per second (bps),
kilobits per second (kbps), megabits per second (Mbps), gigabits per second (Gbps), or bytes
per second (Bps).

 Amount of Data is the quantity of data transferred, typically measured in bits (b), kilobits
(kb), megabits (Mb), gigabits (Gb), or bytes (B).

 Time is the duration over which the data transfer occurs, typically measured in seconds (s),
milliseconds (ms), microseconds (μs), or minutes (min).

6.Consider a digital computer which supports only 2-address instructions each with 16 bits. If
address length is 5 bits, then maximum and minimum how many instructions the system
supports?

Given:

 2-address instructions, each with 16 bits.

 Address length is 5 bits.


The 16-bit instruction format can be divided into two parts: the opcode field and the address
field. Since it's a 2-address instruction, each address field will occupy 5 bits (since the address
length is 5 bits).

The remaining bits in the instruction format will be used for the opcode. Therefore, the opcode
field will be of size 16 - (5 + 5) = 6 bits.

Now, let's calculate the maximum and minimum number of instructions:

1. Maximum Instructions: The opcode field has 6 bits, so there can be at most 2^6 = 64 distinct opcodes. Therefore, the system supports a maximum of 64 instructions.

2. Minimum Instructions: At least one opcode must be defined for the instruction set to be usable, so the minimum number of instructions the system supports is 1.

5.A processor has 40 distinct instructions and 24 GPRs. A 32-bit instructions word has an opcode,2
register operands and an immediate operand. Find the number of bits available for the immediate
operand field

 The instruction word is 32 bits long.

 Each instruction word contains:

 An opcode.

 2 register operands.

 An immediate operand.

Let's calculate the number of bits used by the opcode and register operands first:

 The opcode field: Since there are 40 distinct instructions, the opcode field requires ⌈log2(40)⌉ bits.

 Each register operand: Since there are 24 general-purpose registers (GPRs), each register operand requires ⌈log2(24)⌉ bits.

Now, let's calculate the total number of bits used by the opcode and register operands:

 Opcode field: ⌈log2(40)⌉ bits.

 Register operands: 2 operands × ⌈log2(24)⌉ bits per operand.

Then, we subtract the total number of bits used by the opcode and register operands from the total
number of bits in the instruction word to find the number of bits available for the immediate
operand field:

Total bits available for immediate operand field=32−(bits for opcode+bits per register operand)

Let's calculate it:

Bits for opcode = ⌈log2(40)⌉ = 6 bits

Bits for register operands = 2 × ⌈log2(24)⌉ = 2 × 5 = 10 bits

Total bits available for the immediate operand field = 32 − (6 + 10) = 32 − 16 = 16 bits

So, there are 16 bits available for the immediate operand field.
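The field widths follow directly from ceil-log arithmetic, as a quick Python check shows:

import math

instr_bits = 32
opcode_bits = math.ceil(math.log2(40))        # 40 instructions -> 6 bits
reg_bits = math.ceil(math.log2(24))           # 24 GPRs -> 5 bits each

immediate_bits = instr_bits - opcode_bits - 2 * reg_bits
print(opcode_bits, reg_bits, immediate_bits)  # 6 5 16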

4.Difference between logical and physical address

Logical addresses are virtual addresses generated by the CPU, representing memory locations from a
program's perspective. Physical addresses represent actual memory locations in RAM. Logical
addresses are translated to physical addresses by the memory management unit (MMU) before
accessing memory, ensuring abstraction from hardware details.

3.Difference between page and segment table

Page table maps logical to physical memory addresses at the page level, dividing memory into fixed-
size pages. Segment table maps logical to physical memory addresses at the segment level, dividing
memory into variable-sized segments. Page tables are used in paging, while segment tables are used
in segmentation memory management schemes.

2. The main memory of a computer has 2^m blocks while the cache has 2^c blocks. The cache uses the set-associative mapping scheme with 2 blocks per set. How does block K of the main memory map to a set of cache memory?

In a set-associative mapping scheme, each block of main memory maps to a specific set in the cache memory. Since the cache has 2^c blocks and each set has 2 blocks, there are 2^c / 2 = 2^(c−1) sets in the cache.

To determine which set block K of main memory maps to, we can use the modulo operation:

Set number = Block number mod Number of sets

In this case:

Set number = K mod 2^(c−1)

Block K can then be placed in either of the 2 blocks within that set.
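A small Python sketch of the mapping, with a hypothetical 16-block cache (c = 4) as the example:

def cache_set(k, c):
    num_sets = (2 ** c) // 2      # 2^c blocks, 2 blocks per set
    return k % num_sets           # block K maps to set K mod 2^(c-1)

# 16-block cache -> 8 sets, so main memory block 29 maps to set 5:
print(cache_set(29, 4))   # 29 mod 8 = 5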
1. State the advantages and applications of DMA.

Direct Memory Access (DMA) allows data transfer between peripheral devices and memory without
CPU intervention, enhancing system performance by offloading data transfer tasks.

Advantages include faster data transfer rates, reduced CPU overhead, and efficient handling of large
data streams. DMA is commonly used in disk I/O, networking, and multimedia applications.

27. Difference between associative and set-associative cache mapping.

Direct Mapping:

 Needs only one comparison, because a direct formula gives the effective cache address.
 The main memory address is divided into 3 fields: TAG, BLOCK & WORD. The BLOCK & WORD fields together make the index; the least significant WORD bits identify a unique word within a block, the BLOCK bits specify one of the blocks, and the TAG bits are the most significant bits.
 There is only one possible location in the cache for each block from main memory, because of the fixed formula.
 If the processor needs to frequently access the same cache location from 2 different main memory pages, the cache hit ratio decreases.
 Search time is least, because there is only one possible location for each block.
 The index is given by the number of blocks in the cache.
 It has the least number of tag bits.
 Advantages: simplest type of mapping; fast, as only tag-field matching is required while searching for a word; comparatively less expensive than associative mapping.
 Disadvantages: gives lower performance, because each block has only one possible location, forcing frequent replacement.

Associative Mapping:

 Needs comparison with all tag bits: the cache control logic must examine every block's tag simultaneously to determine whether a block is in the cache or not.
 The main memory address is divided into 2 fields: TAG & WORD.
 A main memory block can be mapped to any cache block.
 Frequently accessing the same location from 2 different main memory pages has no effect on the cache hit ratio.
 Search time is more, as the cache control logic examines every block's tag.
 The index is zero for associative mapping.
 It has the greatest number of tag bits.
 Advantages: it is fast; easy to implement.
 Disadvantages: expensive, because the full address (tag) must be stored along with the data.

Set-Associative Mapping:

 Needs comparisons equal to the number of blocks per set, since a set can contain more than one block.
 The main memory address is divided into 3 fields: TAG, SET & WORD.
 A main memory block maps to one particular set, but can be placed in any block of that set.
 The hit-ratio degradation caused by frequently accessing two different main memory pages is reduced compared with direct mapping.
 Search time increases with the number of blocks per set.
 The index is given by the number of sets in the cache.
 It has fewer tag bits than associative mapping and more tag bits than direct mapping.
 Advantages: gives better performance than the direct and associative mapping techniques.
 Disadvantages: most expensive, as cost increases with the set size.

28. A 2-byte long assembly language instruction BR 09( branch instruction) stored at location
1000( all nos. are in HEX). What is the effective address that the PC holds?
The branch instruction "BR 09" is a relative branch: it specifies a displacement from the address of the next instruction.

1. In the instruction "BR 09", "09" represents the relative offset (displacement) in hexadecimal.

2. The PC holds the address of the next instruction to be executed. Since the branch instruction is 2 bytes long and is stored at location 1000, after it is fetched the PC holds 1000 + 2 = 1002.

3. The effective address is obtained by adding the relative offset to the PC:

Effective Address = PC + Relative Offset

Effective Address = 1002 + 09 = 100B

So, the effective address that the PC holds after executing the branch instruction "BR 09" is 100B (in hexadecimal).

29. If each register is specified by 3 bits and instructions ADD R1, R2, R3 is 2-byte long. Then
what is the length of the op-code field?
Given:
 Each register is specified by 3 bits.
 The instruction "ADD R1, R2, R3" is 2 bytes long.
In the instruction "ADD R1, R2, R3", we have:
 Opcode (operation code) specifying the operation (e.g., ADD)
 Register operands (R1, R2, R3)
Since each register is specified by 3 bits, the total number of bits required to specify all three
register operands is 3×3=9 bits.
The length of the opcode field can be calculated by subtracting the total number of bits
required for register operands from the total length of the instruction.
Total length of the instruction = 2 bytes = 2 × 8 bits = 16 bits
Length of the opcode field = Total length of the instruction − Total number of bits required for register operands = 16 bits − 9 bits = 7 bits.
So, the length of the opcode field is 7 bits.

30. Suppose processor takes 7 ns to read an instruction from the memory, 3ns to decode the
instruction, 5 ns to read operands from register files, 2ns to perform the computation of the
instruction and 4 ns to write the result into the register. What is the maximum clock rate of
the processor?
Given:

Time to read instruction from memory (IF): 7 ns


Time to decode instruction (ID): 3 ns
Time to read operands from register files (OF): 5 ns
Time to perform computation (EX): 2 ns
Time to write result into register (WB): 4 ns
The longest stage determines the minimum clock period. In this case, it is the time taken to read an instruction from memory (IF), which is 7 ns. Therefore:

Maximum clock rate = 1 / 7 ns ≈ 142.86 MHz

31. What is the max number of 0-address, 1-address and 2-address instructions if the
instruction size is 32 bits and 10 bit is used for an address field?
To determine the maximum number of instructions for each type (0-address, 1-address, and
2-address) given the instruction size and the size of the address field, we need to calculate
the number of bits available for the opcode in each case.
Given:
 Instruction size: 32 bits
 Address field size: 10 bits
For each type of instruction, the remaining bits after allocating space for the address field
will be used for the opcode.
1. 0-address instructions:
 Since there are no address operands, the entire instruction size is available for
the opcode.
 Opcode size = Instruction size - Address field size
 Opcode size = 32 bits - 0 bits (no address field)
 Opcode size = 32 bits
2. 1-address instructions:
 With one address operand, the remaining bits for the opcode will be the
instruction size minus the size of the address field.
 Opcode size = Instruction size - Address field size
 Opcode size = 32 bits - 10 bits (address field)
 Opcode size = 22 bits
3. 2-address instructions:
 With two address operands, half of the instruction size will be used for each
address field, leaving the remaining bits for the opcode.
 Available bits for opcode = Instruction size - (2 * Address field size)
 Available bits for opcode = 32 bits - (2 * 10 bits) = 12 bits
Now, let's calculate the maximum number of instructions for each type based on the
number of bits available for the opcode, using the formula 2^n, where n is the number of opcode bits:
 For 0-address instructions: 2^32
 For 1-address instructions: 2^22
 For 2-address instructions: 2^12
Therefore, the maximum number of 0-address instructions is 2^32, the maximum number of 1-address instructions is 2^22, and the maximum number of 2-address instructions is 2^12 = 4096.

32. There are 54 processor registers, 5 addressing modes and 8 K x 32 main memory. State
the instruction format and the size of each field if each instruction supports one register
operand and one address operand.
Since the main memory is 8 K × 32, a memory word, and hence a single-word instruction, is 32 bits long. Let's break down the components of the instruction format:
1. Register Operand Field: This field represents the register operand. Since there are 54 processor registers, it requires ⌈log2(54)⌉ = 6 bits.
2. Addressing Mode Field: This field specifies the addressing mode to be used. Since there are 5 addressing modes, it requires ⌈log2(5)⌉ = 3 bits.
3. Address Operand Field: This field represents the address operand. Since the main memory has 8 K = 2^13 addressable words, the address field needs 13 bits.
4. Opcode Field: The remaining bits of the 32-bit instruction word are available for the opcode:
Opcode bits = 32 − (6 + 3 + 13) = 10 bits
So the instruction format, 32 bits in total, is: Opcode (10 bits) | Addressing mode (3 bits) | Register operand (6 bits) | Address operand (13 bits).

33. Instruction execution in a processor is divided into 5 stages IF, ID, OF, EX, WB. These
stages take 5, 4, 20, 10 and 3 ns respectively. A pipelined implementation of the processor
requires buffering between each pair of consecutive stages with a delay of 2 ns.
Two pipelined implementations of the processor are contemplated.
A) Naive pipeline implementation (NP) with 5 stages
B) An efficient pipeline (EP) where the OF stage is divided into stages OF1 and
OF2 with execution times of 12ns and 8 ns respectively.
What is the speed up achieved by EP over NP in executing 20 independent
instructions with no hazards?
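A worked sketch, assuming the usual conventions that each pipeline's clock period equals its slowest stage time plus the 2 ns buffer delay, and that n instructions need (k + n − 1) cycles in a k-stage pipeline with no hazards:

NP (5 stages: 5, 4, 20, 10, 3 ns): clock period = 20 + 2 = 22 ns. Time for 20 instructions = (5 + 20 − 1) × 22 ns = 24 × 22 = 528 ns.

EP (6 stages: 5, 4, 12, 8, 10, 3 ns): clock period = 12 + 2 = 14 ns. Time for 20 instructions = (6 + 20 − 1) × 14 ns = 25 × 14 = 350 ns.

Speedup of EP over NP = 528 / 350 ≈ 1.51.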
