Coa 1
Text Books :
M. Morris Mano & Michael D. Ciletti, Digital Design With an Introduction to Verilog
Design, 5e, Pearson Education,
Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Computer Organization, 5th Edition,
Tata McGraw Hill.
Digital Systems & Computer Organization (IS34)
Introduction to Digital System: Introduction, The Map Method, Four-Variable Map, Don’t-Care
Conditions, NAND and NOR Implementation, Exclusive-OR Function.
Basic Structure of Computers: Functional Units, Basic Operational Concepts, Bus structure,
Performance – Processor Clock, Basic Performance Equation, Clock Rate, Performance
Measurement.
Machine Instructions and Programs: Memory Location and Addresses, Memory Operations,
Instructions and instruction Sequencing, Addressing Modes.
Input/Output Organization: Accessing I/O Devices, Interrupts – Interrupt Hardware, Enabling
and Disabling Interrupts, Handling Multiple Devices, Direct Memory Access: Bus Arbitration,
The Memory System: ROM, Speed, size and Cost, Cache Memories – Mapping Functions.
Basic Processing Unit: Some Fundamental Concepts: Register Transfers, Performing ALU
operations, fetching a word from Memory, Storing a word in memory. Execution of a Complete
Instruction. Pipelining: Basic concepts: Role of Cache memory, Pipeline Performance.
Unit II
Basic Structure of Computers:
• Functional Units
• Basic Operational Concepts
• Bus structure
• Performance – Processor Clock
• Basic Performance Equation
• Clock Rate
• Performance Measurement.
Computer Organization describes the function & design of the various units of
digital computers that store & process information. It also deals with the units of
the computer that receive information from external sources & send results to
external destinations.
Functional Units:
A computer consists of 5 functionally independent main parts:
1) Input Unit
2) Memory Unit
3) ALU
4) Output Unit
5) Control unit.
To execute an instruction, its operands must be brought into the ALU from
the memory.
• Operands are held in general-purpose registers available to the ALU.
• General-purpose registers have shorter access times than the cache.
The results of an operation are stored back in the memory or retained in the
processor for immediate use.
Output Unit:
Computers represent information in a specific binary form. Output units:
• Interface with output devices.
• Accept processed results provided by the computer in specific binary form.
• Convert the information in binary form to a form understood by an output device.
Control Unit: The operations of the input unit, memory, ALU and output unit are
coordinated by the control unit.
Instructions control “what” operations take place (e.g. data transfer, processing).
The control unit generates timing signals which determine “when” a particular
operation takes place.
Operation of a computer can be summarized as:
• Accepts information from the input units (Input unit).
• Stores the information (Memory).
• Processes the information (ALU).
• Provides processed results through the output units (Output unit).
Basic Operational Concepts:
Activity in computer is governed by instructions.
Add LOCA, R0
• Add the operand at memory location LOCA to the operand in a register R0 in the
processor.
• Place the sum into register R0.
• The original contents of LOCA are preserved.
• The original contents of R0 are overwritten.
• Instruction is fetched from the memory into the processor
• the operand at LOCA is fetched and added to the contents of R0 – the resulting sum
is stored in register R0.
The effect of the above instruction can also be realized by a two-instruction sequence:
Load LOCA,R1
Add R1,R0
The following are the steps to execute the instruction:
• Fetch the instruction from main-memory into the processor.
• Fetch the operand at location LOCA from main-memory into the register R1.
• Add the content of Register R1 and the contents of register R0.
• Store the result (sum) in R0
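The two-instruction sequence can be traced with a minimal Python sketch. The dictionaries `memory` and `registers`, the helper names `load` and `add`, and the initial values 25 and 10 are assumptions made only for illustration; they are not part of any real instruction set.

```python
# Hypothetical miniature machine: one dict for main memory, one for registers.
memory = {"LOCA": 25}             # assumed value stored at location LOCA
registers = {"R0": 10, "R1": 0}   # assumed initial register contents

def load(src, dst):
    """Load src,dst: copy the memory operand at src into register dst."""
    registers[dst] = memory[src]

def add(src, dst):
    """Add src,dst: add register src to register dst; the sum replaces dst."""
    registers[dst] = registers[src] + registers[dst]

load("LOCA", "R1")   # Load LOCA,R1
add("R1", "R0")      # Add R1,R0

print(registers["R0"])   # 35: the old contents of R0 are overwritten
print(memory["LOCA"])    # 25: the contents of LOCA are preserved
```

Note that, unlike the single instruction Add LOCA,R0, this sequence also overwrites register R1.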
Bus Structure:
A bus is a group of lines that serves as a connecting path for several devices.
• A bus may be lines or wires.
• The lines carry data or address or control signal.
There are 2 types of Bus structures:
1) Single Bus Structure
2) Multiple Bus Structure.
Single Bus Structure
• Because the bus can be used for only one transfer at a time, only 2 units can
actively use the bus at any given time.
• Bus control lines are used to arbitrate multiple requests for use of the bus.
Advantages:
1) Low cost &
2) Flexibility for attaching peripheral devices.
Multiple Bus Structure
• Systems that contain multiple buses achieve more concurrency in operations.
• Two or more transfers can be carried out at the same time.
• Advantage: Better performance.
Disadvantage: Increased cost.
Buffer Registers:
The devices connected to a bus vary widely in their speed of operation:
electromechanical devices such as keyboards and printers are slow compared to
magnetic or optical disks. To smooth out these differences in operating speed,
buffer registers can be used.
Buffer registers are included with the devices to hold the information during
transfers.
Ex: printing an encoded character. The processor sends the character over the bus
to the printer's buffer register, and the printer then prints it without further use
of the bus.
PERFORMANCE
The most important measure of performance of a computer is how quickly it can
execute programs.
The speed of a computer is affected by the design of
1) Instruction-set.
2) Hardware & the technology in which the hardware is implemented.
3) Software including the operating system
• Because programs are usually written in a high-level language (HLL), performance
is also affected by the compiler that translates programs into machine language.
• For best performance, it is necessary to design the compiler, machine instruction
set and hardware in a co-ordinated way.
Let us examine the flow of program instructions and data between the memory
& the processor.
• At the start of execution, all program instructions are stored in the main-memory.
• As execution proceeds, instructions are fetched into the processor, and a copy is
placed in the cache.
• Later, if the same instruction is needed a second time, it is read directly from the
cache.
• A program will be executed faster if movement of instruction/data between the
main-memory and the processor is minimized which is achieved by using the
cache.
PROCESSOR CLOCK
• Processor circuits are controlled by a timing signal called a Clock.
• The clock defines regular time intervals called Clock Cycles.
• To execute a machine instruction, the processor divides the action to be
performed into a sequence of basic steps such that each step can be completed in
one clock cycle.
• Let P = length of one clock cycle and
R = clock rate.
• The relation between P and R is given by
R = 1/P
• R is measured in cycles per second.
• Cycles per second is also called hertz (Hz) in electrical engineering terminology;
a million is denoted by the prefix Mega (M) and a billion by Giga (G).
Ex: 500 million cycles per second is 500 megahertz (500 MHz).
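As a quick check of the relation R = 1/P, the sketch below converts a clock period into a clock rate. The function name `clock_rate_hz` and the 2 ns period are assumptions chosen for illustration.

```python
def clock_rate_hz(period_seconds):
    """Return the clock rate R (cycles per second) for a clock period P."""
    return 1.0 / period_seconds

period = 2e-9                 # assumed clock period P of 2 nanoseconds
rate = clock_rate_hz(period)  # R = 1/P

print(f"{rate / 1e6:.0f} MHz")   # 500 MHz
```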
BASIC PERFORMANCE EQUATION
Let T = processor time required to execute a program,
N = actual number of instruction executions (an instruction inside a loop is counted each time it is executed),
S = average number of basic steps needed to execute one machine instruction,
R = clock rate in cycles per second.
The program execution time is given by
T = (N × S) / R
The above equation is referred to as the basic performance equation.
• To achieve high performance, the computer designer must reduce the value of T,
which means reducing N and S, and increasing R.
• The value of N is reduced if source program is compiled into fewer machine instructions.
• The value of S is reduced if instructions have a smaller number of basic steps to perform.
• The value of R can be increased by using a higher frequency clock.
Note: Care has to be taken when modifying these values, since a change in one parameter may
affect the others.
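The basic performance equation can be tried out numerically. This is only a sketch: the function name `execution_time` and the sample values of N, S and R are assumptions, not measurements of any real machine.

```python
def execution_time(n_instructions, steps_per_instruction, clock_rate_hz):
    """Basic performance equation: T = (N * S) / R, in seconds."""
    return (n_instructions * steps_per_instruction) / clock_rate_hz

# Assumed workload: 1 million instruction executions, 4 basic steps each.
base = execution_time(1_000_000, 4, 500e6)        # 500 MHz clock
faster_clock = execution_time(1_000_000, 4, 1e9)  # R doubled, N and S unchanged
fewer_steps = execution_time(1_000_000, 2, 500e6) # S halved, N and R unchanged

print(f"{base * 1e3:.1f} ms")          # 8.0 ms
print(f"{faster_clock * 1e3:.1f} ms")  # 4.0 ms
print(f"{fewer_steps * 1e3:.1f} ms")   # 4.0 ms
```

Doubling R or halving S gives the same T here only because the other parameters are held fixed, which, as the note above warns, is not guaranteed in a real design.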
CLOCK RATE
• There are 2 possibilities for increasing the clock rate R:
1) Improving the IC technology makes logic-circuits faster. (IC: integrated circuit.)
This reduces the time needed to complete a basic step.
This allows the clock period P to be reduced and the clock rate R to be increased.
2) Reducing the amount of processing done in one basic step also reduces the clock
period P.
In the presence of a cache, the percentage of accesses to the main-memory is
small. Hence, much of the performance gain expected from the use of faster
technology can be realized.
When R is increased through faster technology, the value of T is reduced by the
same factor, because S & N are not affected.
PERFORMANCE MEASUREMENT
To assess the performance of a computer:
• Benchmark refers to standard task used to measure how well a processor
operates.
• The Performance Measure is the time taken by a computer to execute a given
benchmark.
• SPEC selects & publishes the standard programs along with their test results for
different application domains. (SPEC -> Standard Performance Evaluation
Corporation).
• The SPEC rating is given by
SPEC rating = (running time on the reference computer) / (running time on the computer under test)
• SPEC rating = 50 means that the computer under test is 50 times as fast as the
reference computer.
• The test is repeated for all the programs in the SPEC suite. Then, the geometric
mean of the results is computed.
• Let SPECi = rating for program i in the suite and n = number of programs in the suite.
The overall SPEC rating for the computer is given by
SPEC rating = (SPEC1 × SPEC2 × … × SPECn)^(1/n)
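The rating calculation can be sketched in a few lines of Python. The running times used here are made-up numbers, and the function names `spec_rating` and `overall_spec_rating` are assumptions for illustration; this is not SPEC's own tooling.

```python
import math

def spec_rating(reference_time, test_time):
    """Rating for one program: reference running time / running time under test."""
    return reference_time / test_time

def overall_spec_rating(ratings):
    """Geometric mean of the per-program ratings."""
    n = len(ratings)
    return math.prod(ratings) ** (1.0 / n)

# Assumed running times in seconds for a two-program suite.
ratings = [spec_rating(100, 50), spec_rating(400, 50)]   # [2.0, 8.0]
overall = overall_spec_rating(ratings)

print(ratings)   # [2.0, 8.0]
print(overall)   # 4.0
```

The geometric mean (4.0) is lower than the arithmetic mean (5.0), so a single very high rating does not dominate the overall figure.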
Location type | Example names | Register transfer example | Description
Memory | LOC, PLACE, NUM | R1 <- [LOC] | Contents of memory-location LOC are transferred into register R1.
Processor | R0, R1, R2 | R3 <- [R1]+[R2] | Add the contents of registers R1 & R2 and place their sum into R3.
I/O Registers | DATAIN, DATAOUT | R1 <- [DATAIN] | Contents of I/O register DATAIN are transferred into register R1.
ASSEMBLY LANGUAGE NOTATION:
To represent machine instructions and programs, an assembly language format is
used.
Assembly Language Format | Description
Move LOC, R1 | Transfer data from memory-location LOC to register R1. The contents of LOC are unchanged by the execution of this instruction, but the old contents of register R1 are overwritten.
Add R1, R2, R3 | Add the contents of registers R1 and R2, and place their sum into register R3.
Instruction type | Syntax | Example | Description | Instructions to compute C <- [A]+[B]
Two Address | Opcode Source, Destination | Add A,B | Add the contents of memory-locations A & B. Then, place the result into location B, replacing the original contents of this location. Operand B is both a source and a destination. | Move B, C; Add A, C
One Address | Opcode Source/Destination | Load A; Add B | Load A: copy contents of memory location A into accumulator. Add B: add contents of memory-location B to contents of accumulator register & place sum back into accumulator. | Load A; Add B; Store C
Zero Address | Opcode [no Source/Destination] | Push; Pop | Locations of all operands are defined implicitly. The operands are stored in a pushdown stack. | Uses stack
INSTRUCTION EXECUTION & STRAIGHT LINE SEQUENCING:
The program is executed as follows:
1) Initially, the address of the first instruction is loaded into PC (Figure 2.8).
2) Then, the processor control circuits use the information in the PC to fetch and
execute instructions, one at a time, in the order of increasing addresses. This is
called Straight-Line sequencing.
3) During the execution of each instruction, PC is incremented by 4 to point to next
instruction.
There are 2 phases for Instruction Execution:
1) Fetch Phase: The instruction is fetched from the memory-location and placed in
the IR.
2) Execute Phase: The contents of the IR are examined to determine which operation is
to be performed. The specified operation is then performed by the processor.
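The fetch and execute phases with straight-line sequencing can be illustrated with a small Python sketch. The three-instruction program, the data values and the helper names `read` and `write` are assumptions for illustration; only the 4-byte PC increment comes from the text.

```python
# Hypothetical program: each instruction occupies 4 bytes, at addresses 0, 4, 8.
program = {
    0: ("Move", "A", "R0"),   # R0 <- [A]
    4: ("Add", "B", "R0"),    # R0 <- [B] + [R0]
    8: ("Move", "R0", "C"),   # C  <- [R0]
}
data = {"A": 7, "B": 5, "C": 0}   # assumed memory operands
regs = {"R0": 0}

def read(name):
    """Return the contents of a register or memory location."""
    return regs[name] if name in regs else data[name]

def write(name, value):
    """Store a value into a register or memory location."""
    if name in regs:
        regs[name] = value
    else:
        data[name] = value

pc = 0                      # address of the first instruction
while pc in program:
    ir = program[pc]        # fetch phase: instruction goes into the IR
    pc += 4                 # PC now points to the next instruction
    op, src, dst = ir       # execute phase: examine the IR
    if op == "Move":
        write(dst, read(src))
    elif op == "Add":
        write(dst, read(src) + read(dst))

print(data["C"])   # 12
```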
Program Explanation
• Consider the program for adding a list of n numbers (Figure 2.9).
• The addresses of the memory-locations containing the n numbers are symbolically
given as NUM1, NUM2, ..., NUMn.
• A separate Add instruction is used to add each number to the contents of register
R0.
• After all the numbers have been added, the result is placed in memory-location
SUM.
BRANCHING:
• Consider the task of adding a list of 'n' numbers (Figure 2.10).
• The number of entries in the list, 'n', is stored in memory-location N.
• Register R1 is used as a counter to determine the number of times the loop is
executed.
• The contents of location N are loaded into register R1 at the beginning of the program.
• The loop is a straight-line sequence of instructions executed as many times as
needed.
The loop starts at location LOOP and ends at the instruction Branch>0.
• During each pass,
→ address of the next list entry is determined and
→ that entry is fetched and added to R0.
• The instruction Decrement R1 reduces the contents of R1 by 1 each time through the
loop.
• A Branch instruction loads a new value into the program counter. As a result, the
processor fetches and executes the instruction at this new address, called the Branch
Target.
• A Conditional Branch Instruction causes a branch only if a specified condition is
satisfied. If the condition is not satisfied, the PC is incremented in the normal way,
and the next instruction in sequential address order is fetched and executed.
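The counting loop described above can be mimicked in Python. The list contents are assumed values, `index` stands in for "the address of the next list entry", and the sketch assumes n is at least 1, since the loop body runs before the counter is tested.

```python
numbers = [3, 8, 2, 5]   # assumed list contents NUM1..NUMn
N = len(numbers)         # memory-location N holds the number of entries

r0 = 0        # Clear R0: running sum
r1 = N        # Move N,R1: loop counter
index = 0     # stands in for the address of the next list entry
passes = 0    # bookkeeping only, not part of the program

while True:                  # LOOP:
    r0 += numbers[index]     #   fetch the next entry and add it to R0
    index += 1
    r1 -= 1                  #   Decrement R1
    passes += 1
    if r1 > 0:               #   Branch>0 LOOP: condition satisfied, go back
        continue
    break                    #   condition not satisfied: fall through

SUM = r0                     # Move R0,SUM
print(SUM, passes)           # 18 4
```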
CONDITION CODES
• The processor keeps track of information about the results of various operations.
This is accomplished by recording the required information in individual bits,
called Condition Code Flags.
• These flags are grouped together in a special processor-register called the
condition code register (or status register).
Four commonly used flags are:
1) N (negative) set to 1 if the result is negative, otherwise cleared to 0.
2) Z (zero) set to 1 if the result is 0; otherwise, cleared to 0.
3) V (overflow) set to 1 if arithmetic overflow occurs; otherwise, cleared to 0.
4) C (carry) set to 1 if a carry-out results from the operation; otherwise cleared to
0.
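A sketch of how the four flags could be derived for an addition is given here. The 8-bit word size and the function name `add_with_flags` are assumptions for illustration; real processors differ in exactly which instructions affect which flags.

```python
def add_with_flags(a, b, bits=8):
    """Add two unsigned bit patterns and return (result, flags)."""
    mask = (1 << bits) - 1          # e.g. 0xFF for 8 bits
    sign = 1 << (bits - 1)          # position of the sign bit
    a &= mask
    b &= mask
    raw = a + b                     # may exceed the word size
    result = raw & mask
    flags = {
        "N": int(bool(result & sign)),   # sign bit of the result is 1
        "Z": int(result == 0),           # result is zero
        "C": int(raw > mask),            # carry out of the top bit
        # Overflow: operands have the same sign, result has the other sign.
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),
    }
    return result, flags

print(add_with_flags(0x7F, 0x01))  # (128, {'N': 1, 'Z': 0, 'C': 0, 'V': 1})
print(add_with_flags(0xFF, 0x01))  # (0, {'N': 0, 'Z': 1, 'C': 1, 'V': 0})
```

The first call overflows as a signed addition (127 + 1) without producing a carry; the second produces a carry without signed overflow (-1 + 1 = 0).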
ADDRESSING MODES: The term addressing modes refers to the way in which the
operand of an instruction is specified. The addressing mode specifies a rule for
interpreting or modifying the address field of the instruction before the operand is
actually referenced. Table 2.1 lists the most important addressing modes found in
modern processors:
IMPLEMENTATION OF VARIABLE AND CONSTANTS:
• Variable is represented by allocating a memory-location to hold its value.
• Thus, the value can be changed as needed using appropriate instructions.
• There are 2 accessing modes to access the variables:
1) Register Mode
2) Absolute Mode
Register Mode:
• The operand is the contents of a register. The name (or address) of the register
is given in the instruction.
• Registers are used as temporary storage locations; the data held in a register
are accessed by naming that register.
For example, the instruction
Move R1, R2 ;Copy content of register R1 into register R2.
Absolute (Direct) Mode:
• The operand is in a memory-location.
• The address of memory-location is given explicitly in the instruction.
• The absolute mode can represent global variables in the program.
For example, the instruction
Move LOC, R2 ;Copy content of memory-location LOC into register R2.
Immediate Mode:
• The operand is given explicitly in the instruction.
• For example, the instruction
Move #200, R0 ;Place the value 200 in register R0.
• Clearly, the immediate mode is only used to specify the value of a source-
operand.
INDIRECTION AND POINTERS:
• Instruction does not give the operand or its address explicitly.
• Instead, the instruction provides information from which the memory address of the
operand can be determined.
• This address is called the Effective Address (EA) of the operand.
Indirect Mode:
• The EA of the operand is the contents of a register (or memory-location).
• The register (or memory-location) that contains the address of an operand is called a
Pointer.
• We denote indirection by placing, in parentheses,
→ the name of the register or
→ the memory address given in the instruction.
E.g: Add (R1),R0 ;The operand is in memory. Register R1 gives the effective-address (B)
of the operand. The data is read from location B and added to the contents of register R0.
Indirect Addressing:
• To execute the Add instruction in fig 2.11 (a), the processor uses the value which
is in register R1, as the EA of the operand.
• It requests a read operation from the memory to read the contents of location B.
The value read is the desired operand, which the processor adds to the contents
of register R0.
• Indirect addressing through a memory-location is also possible as shown in fig
2.11(b). In this case, the processor first reads the contents of memory-location
A, then requests a second read operation using the value B as an address to
obtain the operand.
Explanation:
• Register R2 is used as a pointer to the numbers in the list, and the operands are
accessed indirectly through R2.
• The initialization-section of the program loads the counter-value n from memory-
location N into R1 and uses the immediate addressing-mode to place the address
value NUM1, which is the address of the first number in the list, into R2. Then it
clears R0 to 0.
• The first two instructions in the loop implement the unspecified instruction block
starting at LOOP.
• The first time through the loop, the instruction Add (R2), R0 fetches the operand
at location NUM1 and adds it to R0.
• The second Add instruction adds 4 to the contents of the pointer R2, so that it
will contain the address value NUM2 when the above instruction is executed in
the second pass through the loop.
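The pointer-based loop explained above can be traced with the following Python sketch. The starting address 1000, the three list values and the dictionary `memory` are assumptions for illustration; the 4-byte step matches the Add #4,R2 instruction in the explanation.

```python
NUM1 = 1000                      # assumed address of the first number
values = [10, 20, 30]            # assumed list contents
memory = {NUM1 + 4 * i: v for i, v in enumerate(values)}   # 4-byte words
n_in_memory = len(values)        # contents of memory-location N

r1 = n_in_memory   # Move N,R1      counter
r2 = NUM1          # Move #NUM1,R2  pointer, loaded with immediate addressing
r0 = 0             # Clear R0

while True:                # LOOP:
    r0 += memory[r2]       #   Add (R2),R0   operand accessed indirectly via R2
    r2 += 4                #   Add #4,R2     pointer now holds the next address
    r1 -= 1                #   Decrement R1
    if not r1 > 0:         #   Branch>0 LOOP
        break

print(r0)   # 60
print(r2)   # 1012
```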
INDEXING AND ARRAYS
A different kind of flexibility for accessing operands is useful in dealing with lists
and arrays.
Index mode: The operand is indicated as X(Ri)
where X = a constant value which defines an offset (also called a
displacement), and
Ri = the name of the index register whose contents are added to X.
• The effective-address of the operand is given by EA=X+[Ri]
• The contents of the index-register are not changed in the process of generating
the effective address.
• The constant X may be given either
→ as an explicit number or
→ as a symbolic-name representing a numerical value.
Ex:
• Fig(a) and fig(b) illustrate two ways of using the Index mode. In fig(a), the index
register, R1, contains the address of a memory-location, and the value X defines an
offset (also called a displacement) from this address to the location where the
operand is found.
• To find the EA of the operand:
Eg: Add 20(R1), R2 with [R1] = 1000
EA = 1000 + 20 = 1020
• An alternative use is illustrated in fig(b). Here, the constant X corresponds to a
memory address, and the contents of the index register define the offset to the
operand. In either case, the effective-address is the sum of two values; one is
given explicitly in the instruction, and the other is stored in a register.
Base with Index Mode:
• Another version of the Index mode uses 2 registers which can be denoted as (Ri, Rj)
• Here, a second register may be used to contain the offset X.
• The second register is usually called the base register.
• The effective-address of the operand is given by EA=[Ri]+[Rj]
• This form of indexed addressing provides more flexibility in accessing operands
because both components of the effective-address can be changed.
Base with Index & Offset Mode
• Another version of the Index mode uses 2 registers plus a constant, which can be
denoted as X(Ri, Rj)
• The effective-address of the operand is given by EA=X+[Ri]+[Rj]
• This added flexibility is useful in accessing multiple components inside each item in
a record, where the beginning of an item is specified by the (Ri, Rj) part of the
addressing mode. In other words, the three components of the address (X, [Ri] and
[Rj]) can each be varied independently, much like the indices of a multi-dimensional array.
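The three effective-address formulas for the indexed modes can be compared side by side. The function names and the register contents used here are assumptions for illustration.

```python
def ea_index(x, ri):
    """Index mode X(Ri): EA = X + [Ri]."""
    return x + ri

def ea_base_index(ri, rj):
    """Base with index mode (Ri, Rj): EA = [Ri] + [Rj]."""
    return ri + rj

def ea_base_index_offset(x, ri, rj):
    """Base with index and offset mode X(Ri, Rj): EA = X + [Ri] + [Rj]."""
    return x + ri + rj

regs = {"R1": 1000, "R2": 8}   # assumed register contents

print(ea_index(20, regs["R1"]))                          # 1020
print(ea_base_index(regs["R1"], regs["R2"]))             # 1008
print(ea_base_index_offset(4, regs["R1"], regs["R2"]))   # 1012
print(regs)   # unchanged: computing an EA does not modify the registers
```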
RELATIVE MODE:
• This is similar to index-mode with one difference:
• The effective-address is determined using the PC in place of the general
purpose register Ri.
• The operation is indicated as X(PC).
• X(PC) denotes an effective-address of the operand which is X locations above or
below the current contents of PC.
• Since the addressed-location is identified "relative" to the PC, the name
Relative mode is associated with this type of addressing.
• This mode is used commonly in conditional branch instructions.
• An instruction such as
Branch>0 LOOP ;Causes program execution to go to the branch target location
identified by the name LOOP if the branch condition is satisfied.
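A short sketch of how the offset X in a relative-mode branch relates to the target address follows. The addresses 1000 and 1012, the 4-byte instruction length and the function names are assumptions for illustration; the key point is that the PC has already been incremented past the branch instruction when the offset is applied.

```python
def relative_offset(target_address, updated_pc):
    """Offset X such that EA = X + [PC] equals the branch target."""
    return target_address - updated_pc

def relative_ea(x, pc):
    """Relative mode X(PC): EA = X + [PC]."""
    return x + pc

LOOP = 1000                   # assumed address of the first loop instruction
branch_at = 1012              # assumed address of the Branch>0 instruction
updated_pc = branch_at + 4    # PC already points to the instruction after the branch

x = relative_offset(LOOP, updated_pc)

print(x)                            # -16: the target is below the current PC
print(relative_ea(x, updated_pc))   # 1000
```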
ADDITIONAL ADDRESSING MODES:
1) Auto Increment Mode
• The effective-address of the operand is the contents of a register specified in the instruction (Fig: 2.16).
• After accessing the operand, the contents of this register are automatically incremented to
point to the next item in a list.
• This mode is denoted as (Ri)+
• Written in this form, the increment amount is implicitly 1. In a byte-addressable memory,
however, the increment is 1 for byte-sized operands, 2 for 16-bit operands and 4 for 32-bit operands.
2) Auto Decrement Mode
• The contents of a register specified in the instruction are first automatically decremented
and are then used as the effective-address of the operand.
• This mode is denoted as
-(Ri)
These 2 modes can be used together to implement an important data structure called a
stack.
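The way these two modes combine into a stack can be sketched as follows. The register name SP, the starting address 2000, the 4-byte word size and the convention that the stack grows toward lower addresses are assumptions made for this illustration.

```python
memory = {}
regs = {"SP": 2000}   # assumed initial stack pointer
WORD = 4              # assumed operand size in bytes (32-bit operands)

def autodecrement(reg, step=WORD):
    """-(Ri): decrement the register first, then use it as the EA."""
    regs[reg] -= step
    return regs[reg]

def autoincrement(reg, step=WORD):
    """(Ri)+: use the register as the EA, then increment it."""
    ea = regs[reg]
    regs[reg] += step
    return ea

def push(value):
    """Move value,-(SP)"""
    memory[autodecrement("SP")] = value

def pop():
    """Move (SP)+,destination"""
    return memory[autoincrement("SP")]

push(11)
push(22)
first = pop()
second = pop()

print(first, second)   # 22 11: last in, first out
print(regs["SP"])      # 2000: back to the starting value
```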
UNIT IV
Input/Output Organization:
Accessing I/O Devices, Interrupts – Interrupt Hardware, Enabling and Disabling
Interrupts, Handling Multiple Devices
Direct Memory Access: Bus Arbitration, The Memory System: ROM, Speed, size
and Cost, Cache Memories – Mapping Functions.
Sequential Logic: Introduction, Flip-Flops. Verilog codes for Sequential logic
Circuits.
ACCESSING I/O-DEVICES
• A single bus-structure can be used for connecting I/O-devices to a computer
(Figure 7.1).
• Each I/O device is assigned a unique set of addresses.
• Bus consists of 3 sets of lines to carry address, data & control signals.
• When processor places an address on address-lines, the intended-device
responds to the command.
• The processor requests either a read or write-operation.
• The requested-data are transferred over the data-lines.
There are 2 ways to deal with I/O-devices: 1) Memory-mapped I/O & 2) I/O-mapped I/O.
1) Memory-Mapped I/O
• Memory and I/O-devices share a common address-space.
• Any data-transfer instruction (like Move, Load) can be used to exchange information.
For example,
Move DATAIN, R0 ;This instruction copies the contents of location DATAIN into register R0.
Here, DATAIN is the address of the input-buffer of the keyboard.
2) I/O-Mapped I/O
• Memory and I/O address-spaces are different.
• Special instructions named IN and OUT are used for data-transfer.
• Advantage of separate I/O space: I/O-devices deal with fewer address-lines.
I/O Interface for an Input Device
1) Address Decoder: enables the device to recognize its address when this address appears on the
address-lines (Figure 7.2).
2) Status Register: contains information relevant to operation of I/O-device.
3) Data Register: holds data being transferred to or from processor. There are 2 types:
i) DATAIN -> Input-buffer associated with keyboard.
ii) DATAOUT -> Output data buffer of a display/printer.
Simple example of I/O operations involving a Keyboard and a display in a computer system
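For an input device, the SIN status flag is used.
SIN = 1 -> when a character is entered at the keyboard.
SIN = 0 -> when the character is read by the processor.
For an output device, the SOUT status flag is used in the same way: SOUT = 1 when the
display is ready to receive a character.
Four registers used are
DATAIN
DATAOUT
STATUS
CONTROL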
A program that reads one line from the keyboard, stores it in a memory buffer, and
echoes it back to the display:
        Move      #LINE,R0       Initialize memory pointer
WAITK   TestBit   #0,STATUS      Test SIN
        Branch=0  WAITK          Wait for character to be entered
        Move      DATAIN,R1      Read character
WAITD   TestBit   #1,STATUS      Test SOUT
        Branch=0  WAITD          Wait for display to become ready
        Move      R1,DATAOUT     Send character to display
        Move      R1,(R0)+       Store character and advance pointer
        Compare   #$0D,R1        Check if Carriage Return
        Branch≠0  WAITK          If not, get another character
        Move      #$0A,DATAOUT   Otherwise, send Line Feed
        Call      PROCESS        Call a subroutine to process the input line
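The control flow of the program above can be imitated in Python. This is a simulation under stated assumptions, not device code: the keyboard is modelled as a queue of pending characters, the display is assumed to be always ready, and the names `keyboard`, `display`, `line`, `sin` and `sout` are hypothetical. The input must end with a Carriage Return, otherwise the wait loop would never finish.

```python
from collections import deque

keyboard = deque("hi\r")   # assumed keystrokes, ending with Carriage Return
display = []               # characters sent to DATAOUT
line = []                  # memory buffer starting at LINE

def sin():
    """SIN flag: 1 when a character is waiting in DATAIN."""
    return 1 if keyboard else 0

def sout():
    """SOUT flag: 1 when the display can accept a character (assumed always)."""
    return 1

while True:
    while sin() == 0:          # WAITK: poll until a character is entered
        pass
    ch = keyboard.popleft()    # Move DATAIN,R1 (reading clears SIN)
    while sout() == 0:         # WAITD: poll until the display is ready
        pass
    display.append(ch)         # Move R1,DATAOUT
    line.append(ch)            # Move R1,(R0)+
    if ch == "\r":             # Compare #$0D,R1
        break                  # Carriage Return ends the line
display.append("\n")           # Move #$0A,DATAOUT: send Line Feed

print(repr("".join(line)))      # 'hi\r'
print(repr("".join(display)))   # 'hi\r\n'
```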
MECHANISMS USED FOR INTERFACING or implementing I/O operations
1) Program Controlled I/O
• The processor repeatedly checks a status-flag to achieve the required synchronization
between the processor & the I/O device. (We say that the processor polls the device.)
Main drawback:
• The processor wastes time in checking status of device before actual data-transfer takes
place.
2) Interrupt I/O
• I/O-device initiates the action instead of the processor.
• I/O-device sends an INTR signal over bus whenever it is ready for a data-transfer operation.
• Like this, required synchronization is done between processor & I/O device.
3) Direct Memory Access (DMA)
• The device-interface transfers data directly to/from the memory without continuous
involvement by the processor.
• DMA is a technique used for high-speed I/O-devices.
INTERRUPTS:
• There are many situations where other tasks can be performed while waiting
for an I/O device to become ready.
• A hardware signal called an Interrupt will alert the processor when an I/O
device becomes ready.
• Interrupt-signal is sent on the interrupt-request line.
• The processor can be performing its own task without the need to continuously
check the I/O-device.
• The routine executed in response to an interrupt-request is called ISR.
• The processor must inform the device that its request has been recognized by
sending INTA signal. (INTR -> Interrupt Request, INTA -> Interrupt Acknowledge,
ISR -> Interrupt Service Routine)
• For example, consider COMPUTE and PRINT routines (Figure 3.6).
• The processor first completes the execution of instruction i.
• Then, processor loads the PC with the address of the first instruction of the ISR.
• After the execution of ISR, the processor has to come back to instruction i+1.
• Therefore, when an interrupt occurs, the current content of PC is put in temporary storage
location.
• A return at the end of ISR reloads the PC from that temporary storage location.
• This causes the execution to resume at instruction i+1.
• When processor is handling interrupts, it must inform device that its request has
been recognized.
• This may be accomplished by INTA signal.
• The task of saving and restoring the information can be done automatically by the
processor.
• Typically, the processor saves only the contents of the PC & the status register.
• Saving additional registers increases the Interrupt Latency.
• Interrupt Latency is the delay between
→ the time an interrupt-request is received and
→ the start of the execution of the ISR.
• Generally, a long interrupt latency is unacceptable.
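The sequence of events around an interrupt (finish instruction i, save the PC, run the ISR, resume at i+1) can be traced with a small Python sketch. The instruction labels, the two-step ISR and the point at which the request arrives are all assumptions for illustration.

```python
main_program = ["i-1", "i", "i+1", "i+2"]   # labels standing in for instructions
isr = ["isr-0", "isr-1"]                    # assumed interrupt-service routine
interrupt_arrives_during = 1                # request arrives while "i" executes

trace = []
pc = 0
while pc < len(main_program):
    trace.append(main_program[pc])          # the current instruction completes
    interrupted = (pc == interrupt_arrives_during)
    pc += 1                                 # PC points to the next instruction
    if interrupted:
        saved_pc = pc                       # save the return address (i+1)
        for step in isr:                    # PC is loaded with the ISR address
            trace.append(step)
        pc = saved_pc                       # return-from-interrupt reloads the PC

print(trace)   # ['i-1', 'i', 'isr-0', 'isr-1', 'i+1', 'i+2']
```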
INTERRUPT HARDWARE:
• Most computers are likely to have several I/O devices that can request an interrupt.
• A single interrupt request line may be used to serve n devices as depicted in figure.
• All devices are connected to the line via switches to ground. To request an interrupt, a device closes its
associated switch.
• Thus, if all interrupt-request signals INTR1 to INTRn are inactive, that is, if all switches are open, the
voltage on the interrupt-request line will be equal to Vdd.
• When a device requests an interrupt by closing its switch, the voltage on the line drops to 0, causing the
interrupt-request signal INTR received by the processor to go to 1.
• Since closing one or more switches causes the line voltage to drop to 0, the value of INTR is the logical
OR of the requests from the individual devices, that is,
INTR = INTR1 + INTR2 + … + INTRn
• Special gates known as open-collector (or open-drain) gates are used to drive the INTR line.
• The output of an open-collector gate is equivalent to a switch to ground that is
→ open when the gate's input is in the “0” state and
→ closed when the gate's input is in the “1” state.
• Resistor R is called a pull-up resistor because it pulls the line voltage up to the high-voltage state when the
switches are open.
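The behaviour of the shared interrupt-request line can be modelled in a few lines. The supply voltage of 5.0 V and the function names are assumptions for illustration; the model is idealized and ignores switch resistance.

```python
VDD = 5.0   # assumed supply voltage in volts

def line_voltage(switch_closed):
    """Pull-up resistor holds the line at Vdd unless some switch grounds it."""
    return 0.0 if any(switch_closed) else VDD

def intr(switch_closed):
    """INTR seen by the processor: 1 when the line has been pulled to 0 V."""
    return 1 if line_voltage(switch_closed) == 0.0 else 0

print(intr([False, False, False]))   # 0: all switches open, line at Vdd
print(intr([False, True, False]))    # 1: one device requests an interrupt
print(intr([True, True, False]))     # 1: several requests, still a single 1
```

For every combination of switches, `intr` gives the same value as the logical OR of the individual requests.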
Difference between Subroutine & ISR
Subroutine | ISR
A subroutine performs a function required by the program from which it is called. | An ISR may not have anything in common with the program being executed at the time INTR is received.
A subroutine is just a linkage of 2 or more functions related to each other. | An interrupt is a mechanism for coordinating I/O transfers.