Bcs302 Ddco Module 3
MODULE-3
Basic Structure of Computers: Functional Units, Basic Operational Concepts, Bus structure,
Performance – Processor Clock, Basic Performance Equation, Clock Rate, Performance
Measurement.
Machine Instructions and Programs: Memory Location and Addresses, Memory Operations,
Instruction and Instruction sequencing, Addressing Modes.
Textbook: Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Computer Organization, 5th Edition,
Tata McGraw Hill. Sections: 1.2, 1.3, 1.4, 1.6, 2.2, 2.3, 2.4, 2.5
The primary memory is also called random access memory (RAM). The time required to
access one word is called the memory access time. This time ranges from a few
nanoseconds (ns) to about 100 ns.
Hierarchy of RAM
• Cache: small, fast RAM units. These are tightly coupled to the processor and are often part of
the same processor chip.
• Main Memory: This is the largest and slowest unit.
Secondary Storage
Primary storage is expensive, so large amounts of data and programs are stored in nonvolatile
memory called secondary storage. For example: magnetic disks, tapes, optical disks (CD-ROMs).
3.1.3 Arithmetic and Logic Unit
Most computer operations are executed in the arithmetic and logic unit (ALU). Operations like
addition, subtraction, multiplication, division and comparison of numbers are performed in the ALU.
To perform these operations, the operands are placed in high-speed storage elements called registers,
and the output can be stored in memory or in a register.
When the above instruction is executed, the value must be fetched from memory. So the
address of “LOC” is stored in MAR, so that the value at this location can be read and put into register
R1.
• Memory Data Register (MDR): It holds the data to be written to or read from memory.
For example:
Move R1, LOC
The contents of R1 need to be moved to the memory location LOC. So, the contents of register R1 are first
moved to MDR, then the address of LOC is moved to MAR, and the move instruction is executed
so that the value stored in MDR is written to the “LOC” memory location.
Let’s look at the operation steps in detail:
Step 1: The first instruction needs to be executed, so the address where that instruction resides is
placed in MAR.
Step 2: A Read control signal is issued and the instruction (ex: Move LOC, R1) is copied into MDR.
Step 3: The contents of MDR are moved to IR. So, by Step 3 the instruction that needs to be
executed has been fetched and stored in IR; now this instruction needs to be executed.
Let “Add LOC, R1” be the instruction. Let’s see how this is executed.
Step 4: If the instruction involves memory access, the address of the operand (ex: address of LOC) is
stored in MAR.
Step 5: A Read signal is then issued and the contents of memory (i.e. the value stored in LOC) are copied
to MDR.
Step 6: The value in MDR is passed to the ALU to perform the operation (Add in the above example).
Step 7: If the result needs to be stored back in memory, the result is placed in MDR, the
address of the memory location is placed in MAR, and a Write signal is issued to write the
result to memory.
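The steps above can be sketched as a toy simulation. This is only an illustrative model, not the textbook's notation: the tuple encoding of instructions, the memory addresses 100 and 101, and the PC advancing by 1 per instruction are all assumptions made for the sketch.

```python
# Toy model of the fetch/execute steps. The instruction encoding and the
# addresses used here are hypothetical, chosen only for illustration.

def step(memory, regs):
    """Fetch and execute the single instruction addressed by regs['PC']."""
    # Steps 1-3: instruction fetch through MAR and MDR into IR.
    regs["MAR"] = regs["PC"]
    regs["MDR"] = memory[regs["MAR"]]        # Read signal
    regs["IR"] = regs["MDR"]
    regs["PC"] += 1                          # assumed: one slot per instruction

    opcode, src, dst = regs["IR"]
    if opcode == "Add":                      # Add LOC, R1 (memory operand to register)
        regs["MAR"] = src                    # Step 4: operand address to MAR
        regs["MDR"] = memory[regs["MAR"]]    # Step 5: Read operand into MDR
        regs[dst] += regs["MDR"]             # Step 6: ALU performs the addition
    elif opcode == "Move":                   # Move R1, LOC (register to memory)
        regs["MDR"] = regs[src]              # Step 7: data to MDR,
        regs["MAR"] = dst                    #         address to MAR,
        memory[regs["MAR"]] = regs["MDR"]    #         Write signal
    else:
        raise ValueError(f"unsupported opcode: {opcode}")


memory = {0: ("Add", 100, "R1"), 1: ("Move", "R1", 101), 100: 7, 101: 0}
regs = {"PC": 0, "MAR": 0, "MDR": 0, "IR": None, "R1": 5}
step(memory, regs)   # R1 becomes 5 + 7
step(memory, regs)   # location 101 receives the contents of R1
print(regs["R1"], memory[101])   # 12 12
```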
Interrupt:
There are scenarios in which normal execution must be interrupted to handle a situation that needs
immediate attention. In such a scenario an interrupt signal is generated and an appropriate routine,
called the interrupt service routine, is executed.
Each device operates at its own speed. Electromechanical devices such as keyboards and printers
operate at low speeds compared to magnetic and optical disks. Optical and magnetic disks are in turn
slower than the memory and the processor.
However, all of these operate over a common bus. Hence a control/synchronization mechanism
is required to coordinate devices that operate at varying speeds.
Generally, buffers are used to manage devices operating at different speeds. Ex: when
the processor needs to print a character on the printer, the character is first placed in the printer buffer. This
frees up the bus and the processor for other operations. The printer then reads the character from the
buffer and prints it.
3.4 Performance
The performance of a computer is how quickly it can execute programs. The speed with which a
computer executes programs is affected by the design of its hardware and machine language
instructions.
The total time required to execute the program is called the elapsed time. This measures the
performance of the entire computer system (i.e. the speed of the processor, disk and printer).
To assess the performance of the processor alone, we consider only the periods during which the
processor is active. The sum of these periods is the processor time needed to execute the program.
This processor time depends on the hardware involved in the execution of machine instructions. The
pertinent parts of the hardware that affect the processor time, including the cache memory as a part
of the processor unit, are shown in fig 3.4.
Hence, cache memory speeds up program execution. Since it is within the processor,
the access time to the cache is much shorter than the access time to main memory.
The length (duration) of the clock cycle plays a very important role in processor
performance.
The inverse of the clock cycle length P is the clock rate R: $R = \frac{1}{P}$. Clock rate is measured in cycles per second.
In modern processors the clock rate ranges from a few hundred million to over a billion cycles per second, i.e. Hertz (Hz).
For example:
500 million cycles per second = 500 MHz
1250 million cycles per second = 1.25 GHz
The clock periods for these two rates are 2 and 0.8 nanoseconds (ns), respectively.
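As a quick check of the two clock periods quoted above, the relation between clock rate and clock period can be evaluated directly. The function name `clock_period_ns` is an illustrative choice, not something from the text.

```python
def clock_period_ns(rate_hz):
    """Return the clock period P = 1/R, expressed in nanoseconds."""
    return 1e9 / rate_hz


print(clock_period_ns(500e6))     # 500 million cycles per second  -> 2.0
print(clock_period_ns(1250e6))    # 1250 million cycles per second -> 0.8
```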
3.4.2 Basic Performance Equation
Let
T = Processor time required to execute the complete program written in a high-level language.
N = Number of machine instructions executed to complete the execution of the program.
S = Average number of basic steps needed to execute one machine instruction.
R = Clock rate in cycles per second.
Hence, the program execution time is
$T = \frac{N \times S}{R}$
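A small numeric sketch of the basic performance equation follows. The workload figures (200 million instructions, 4 basic steps each, a 500 MHz clock) are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def execution_time(n_instructions, steps_per_instruction, clock_rate_hz):
    """Basic performance equation T = (N * S) / R, result in seconds."""
    return n_instructions * steps_per_instruction / clock_rate_hz


# Hypothetical workload: N = 200 million, S = 4, R = 500 MHz.
t = execution_time(200e6, 4, 500e6)
print(t)   # 1.6 (seconds)

# Halving S (for example through a better hardware design) halves T.
print(execution_time(200e6, 2, 500e6))   # 0.8
```

The equation shows the three levers available to a designer: reduce N, reduce S, or increase R.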
The programs range from game playing, compilers and database applications to numerically
intensive programs (example: astrophysics, quantum chemistry).
Each program is compiled for the target device and tested. The results are compared with those of a
reference device (ex: SPEC95 uses the SUN SPARCstation 10/40; SPEC2000 uses an UltraSPARC10
workstation with a 300 MHz UltraSPARC-IIi processor).
So a SPEC rating of 50 indicates that the test device is 50 times faster than the reference device.
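The tests are done for all programs in the suite (ex: gaming, database, numerically intensive programs
etc.) and the final rating is computed as the geometric mean
$\text{SPEC rating} = \left( \prod_{i=1}^{n} \text{SPEC}_i \right)^{1/n}$
where n is the number of programs in the suite.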
A SPEC rating is a measure of the combined effect of all factors affecting performance
including compiler, operating system, the processor and the memory of the computer being tested.
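The overall rating is the geometric mean of the per-program ratings. A minimal sketch, assuming the individual ratings are already available as a list of positive numbers (the values used below are made up for illustration):

```python
import math


def spec_rating(ratings):
    """Geometric mean of the per-program SPEC ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return math.prod(ratings) ** (1.0 / len(ratings))


# Hypothetical ratings for two programs; the geometric mean of 2 and 8 is 4.
overall = spec_rating([2.0, 8.0])
print(round(overall, 6))
```

A geometric mean is used rather than an arithmetic mean so that one unusually high rating does not dominate the overall figure.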
Dealing with each cell individually is cumbersome and unnecessary. For this reason, the memory is
organized so that a group of n bits can be stored and retrieved in a single basic operation.
Each group of n bits is referred to as a word of information, and n is called the word length. The
memory of a computer can be schematically represented as a collection of words, as shown in figure
3.5.
Figure 3.6 Examples of encoded information in a 32-bit word, panels (a) and (b)
Department of AI & ML, BIT 2023-2024 Page 8 of 21
DD & CO Module 3- Basic structure of computers & Machine Instructions
Accessing the memory to store or retrieve a single item of information, either a word or a
byte, requires a distinct name or address for each item location. With k-bit addresses, the $2^k$ addresses
constitute the address space of the computer, and the memory can have up to $2^k$ addressable locations.
So, we can address memory at the bit level, the byte level or the word level. The most practical
assignment is to have successive addresses refer to successive byte locations. This way of addressing
memory in terms of bytes is called byte-addressable memory.
Byte locations have addresses 0, 1, 2, 3, ... Since a word is a set of multiple bytes, in a 32-bit
machine successive words are located at addresses 0, 4, 8, ...
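The word addresses in a byte-addressable memory follow directly from the word length. A short sketch (the function name and the default of 32 bits are illustrative choices):

```python
def word_addresses(num_words, word_length_bits=32):
    """Byte addresses at which successive words start."""
    bytes_per_word = word_length_bits // 8
    return [i * bytes_per_word for i in range(num_words)]


print(word_addresses(4))        # [0, 4, 8, 12]  (32-bit words)
print(word_addresses(3, 16))    # [0, 2, 4]      (16-bit words)
```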
ii. A processor contains registers, each of which has a size equal to the word length.
Hence, when a load or store operation is done, the contents of these registers are operated
upon.
3.7 Instructions and Instruction Sequencing:
A computer generally performs four types of operations:
i. Transfer data between the memory and processor registers
ii. Arithmetic and Logic operations on data.
iii. Program Sequence & Control
iv. I/O transfers
To understand these, we first define some notations:
3.7.1 Register Transfer Notation (RTN)
Data can be transferred between memory locations, processor registers and special registers in the
I/O system.
• A memory location can be identified by a symbolic name like LOC, and a register can
be identified by names like R0, R5.
• I/O registers may be identified by names like DATAIN, OUTSTATUS.
• Ex: R1 ← [LOC]. This expression means the contents of location “LOC” are copied to R1.
• Ex: R3 ← [R1] + [R2]. This expression means add the contents of register R1 and the contents
of register R2 and store the result in R3.
• [R1] means the value stored in register R1.
3.7.2 Assembly language Notation
• To represent the machine instructions, we use another notation called “assembly language”.
• Generally, a “C” program is first converted to assembly language
RTN                    Assembly Language
R1 ← [LOC]             Move LOC, R1
R3 ← [R1] + [R2]       Add R1, R2, R3
3.7.3 Basic Instruction Types:
There are four types of instructions:
i. Three address instruction
ii. Two address instruction
iii. One address instruction
iv. Zero address instruction
We can perform the three-address operation Add A, B, C using multiple one-address instructions
as below:
Instruction   RTN                         Description
Load A        Raccum ← [A]                Copy A’s data to the accumulator register
Add B         Raccum ← [Raccum] + [B]     Add B’s data to the accumulator register and store the result in the accumulator
Store C       C ← [Raccum]                Store the accumulator register contents in C
In case of one address instruction, based on the operation being done, a memory location could be
a source location or destination location. For example:
Load A : A is a source location
Store C : C is a destination location
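The one-address sequence Load A, Add B, Store C can be sketched with a single accumulator variable. This is an illustrative model only: the dictionary standing in for memory and the values stored in A and B are assumptions.

```python
def run_one_address(program, memory):
    """Execute Load/Add/Store instructions that share one accumulator."""
    acc = 0
    for opcode, name in program:
        if opcode == "Load":        # Raccum <- [name]; name is a source
            acc = memory[name]
        elif opcode == "Add":       # Raccum <- [Raccum] + [name]
            acc = acc + memory[name]
        elif opcode == "Store":     # name <- [Raccum]; name is a destination
            memory[name] = acc
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return acc


memory = {"A": 10, "B": 32, "C": 0}    # hypothetical contents
run_one_address([("Load", "A"), ("Add", "B"), ("Store", "C")], memory)
print(memory["C"])   # 42
```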
Some processors have many registers. These registers (R1, R2, ..., Rn)
are closest to the processor, so an operation that involves only registers is very fast.
Ex: Add Ri, Rj (Rj ← [Ri] + [Rj]) will execute very fast because all operands are in registers.
Instructions that transfer content are of two types, “Move” and “Load”. If one of the operands is a
register, it is better to use “Move” rather than “Load”.
In some processors, operations can be performed only on registers. In such processors the
contents of memory are first moved to registers before the operation is performed.
Figure 3.9 A straight-line program for adding n numbers
Figure 3.10 Using a loop to add n numbers
• The variable N stores the number of times the add operation needs to be performed (i.e. n).
• Register R0 stores the intermediate sum as the numbers are added. The final result is stored in
SUM, as shown in figure 3.9.
• The looping block is shown in figure 3.10 by the brace “{”. In the looping block the next
“NUM” is fetched from memory and added to register R0. Register R1 is then
decremented by 1.
• A special instruction called “Branch>0 Loop” is then executed. This instruction means
that if the instruction above it (i.e. Decrement R1) left a value greater than zero in the
register, execution goes back to “Loop”; otherwise it continues with the next instruction, i.e. “Move
R0, SUM”.
• Execution of the loop is repeated as long as the result of the decrement operation is greater
than zero.
• This special instruction is called a “branch instruction”, and the place to which we branch,
i.e. “Loop”, is called the “branch target”. Since branching depends on the condition R1 > 0,
it is also called “conditional branching”; ex: Branch>0 Loop is a conditional
branch instruction.
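The loop structure described above can be mirrored in a short sketch. It assumes that R1 is loaded with the count n before the loop starts and that n is at least 1; the variable names simply echo the register names used in the text.

```python
def add_numbers(nums):
    """Add n numbers with a counter that is decremented and tested, as in the loop."""
    r1 = len(nums)            # assumed: counter R1 is loaded from N
    r0 = 0                    # R0 holds the running sum
    index = 0                 # position of the next NUM in memory
    while True:               # Loop:
        r0 += nums[index]     #   add the next NUM to R0
        index += 1
        r1 -= 1               #   Decrement R1
        if not r1 > 0:        #   Branch>0 Loop: repeat only while R1 > 0
            break
    return r0                 # Move R0, SUM


print(add_numbers([3, 5, 7]))   # 15
```

Because the test comes after the loop body, the body always runs at least once, which is why the sketch requires a non-empty list.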
3.7.6 Condition codes:
The processor has a special register called the “condition codes register” or “status register”. This
register is a set of flags. Each flag has a special meaning.
Four of these flags are:
Flag            Description
N (negative)    Set to 1 if the result is negative; otherwise cleared to 0.
Z (zero)        Set to 1 if the result is 0; otherwise cleared to 0.
V (overflow)    Set to 1 if arithmetic overflow occurs; otherwise cleared to 0.
C (carry)       Set to 1 if a carry-out results from the operation; otherwise cleared to 0.
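To make the four flags concrete, the sketch below computes them for an addition. It assumes 32-bit two's-complement operands; the function name and the dictionary used to return the flags are illustrative choices.

```python
def add_with_flags(a, b, bits=32):
    """Add two values as `bits`-wide two's-complement numbers and return (result, flags)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        "N": int(bool(result & sign)),   # sign bit of the result is 1
        "Z": int(result == 0),           # result is zero
        # Overflow: operands have the same sign, result has the opposite sign.
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),
        "C": int(total > mask),          # carry out of the most significant bit
    }
    return result, flags


print(add_with_flags(0x7FFFFFFF, 1))   # (2147483648, {'N': 1, 'Z': 0, 'V': 1, 'C': 0})
print(add_with_flags(0xFFFFFFFF, 1))   # (0, {'N': 0, 'Z': 1, 'V': 0, 'C': 1})
```

The first call adds 1 to the largest positive 32-bit number, so the signed result overflows (V = 1) and appears negative (N = 1). The second call adds 1 to the bit pattern for -1, giving zero with a carry-out.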
variety of ways. Programmers declare data as constants, local variables, global variables, arrays and
pointers.
Table 3.1 gives the different types of addressing modes.
Figure 3.11 Indirect Addressing, panels (a) and (b)
Note: Parentheses are used to denote operands that are accessed in indirect mode.
The register or memory location that contains the address of an operand is called a pointer.
Indirection and the use of pointers are important and powerful concepts in programming.
Example 1:
Let us see an example of adding n numbers using a pointer:
In the above program, Add (R2),R0 adds the contents of the memory location pointed to by register R2
to register R0. When the instruction “Add #4,R2” is then executed, R2 points to the next location,
which holds the next number to be added to R0 (provided the count in R1 has not reached 0).
Example 2: In a “C” program, the statement
A = *B
copies the value pointed to by pointer B into variable A.
The above can be done in assembly language in two ways:
Method 1:            Method 2:
Move B, R1           Move (B), A
Move (R1), A
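The two methods can be traced with a dictionary standing in for memory. The addresses chosen for A and B, the pointer value 2000 and the data value 99 are all hypothetical.

```python
A, B = 100, 104                        # hypothetical addresses of variables A and B
memory = {A: 0, B: 2000, 2000: 99}     # B holds a pointer to location 2000

# Method 1: indirection through a register.
r1 = memory[B]             # Move B, R1     (R1 now holds the pointer value)
memory[A] = memory[r1]     # Move (R1), A   (operand fetched via R1)
via_register = memory[A]

# Method 2: indirection through a memory location.
memory[A] = 0
memory[A] = memory[memory[B]]   # Move (B), A
via_memory = memory[A]

print(via_register, via_memory)   # 99 99
```

Both methods leave the same value in A. Method 2 needs an extra memory access inside a single instruction, while Method 1 leaves the pointer in R1 for later reuse.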
Hence, when the instruction Add 1000(R1),R2 is executed, the following steps happen:
Step 1: 1000 is added to the value in register R1 (20) to get the effective address 1020.
Step 2: The value stored in location 1020 is fetched.
Step 3: This value is added to the value stored in register R2.
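The three steps can be followed in a short sketch. The contents of location 1020 and the initial value of R2 are hypothetical; only the offset 1000 and the register value 20 come from the example above.

```python
def indexed_operand(memory, regs, offset, index_reg):
    """Return (effective address, operand) for an operand written as X(Ri)."""
    ea = offset + regs[index_reg]      # Step 1: EA = X + [Ri]
    return ea, memory[ea]              # Step 2: fetch the operand at EA


regs = {"R1": 20, "R2": 5}             # R2's initial value is assumed
memory = {1020: 37}                    # assumed contents of location 1020

ea, operand = indexed_operand(memory, regs, 1000, "R1")
regs["R2"] += operand                  # Step 3: Add 1000(R1), R2
print(ea, regs["R2"])                  # 1020 42
```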
Example 3: We have a student database with the following fields: Student ID, Test 1 marks,
Test 2 marks and Test 3 marks, as shown below:
Calculate the total of Test 1, the total of Test 2 and the total of Test 3.
Solution: The aim is to add up all the Test 1 scores, Test 2 scores and Test 3 scores separately.
If there are “n” students in the class, let us list the details of all students one after the other
in memory, as shown in figure 3.14.
Figure 3.15 Index addressing used in accessing test scores in the list in fig 3.14
To understand the above code, we can divide the code into 3 parts:
Part 1: Initialization
• Move #List,R0 makes R0 contain the base address of the student array.
• Clear R1, Clear R2, Clear R3 set R1, R2 and R3 to 0.
• R1, R2, R3 will be used to accumulate the totals.
• Move N,R4 makes R4 contain N (the number of students).
Part 2: Calculation
• Add 4(R0),R1: Add 4 to the base address to reach the Test 1 marks of student 1, then add
them to R1. Now R1 contains the 1st student’s Test 1 marks.
• Add 8(R0),R2: Add 8 to the base address to reach the Test 2 marks of student 1, then add the
marks to R2. Now R2 contains the 1st student’s Test 2 marks.
• Add 12(R0),R3: Add 12 to the base address to reach the Test 3 marks of student 1, then add the
marks to R3. Now R3 contains the 1st student’s Test 3 marks.
• Add #16,R0: R0 was pointing to the 1st student’s base address. When we add 16 to it, R0
moves to the 2nd student’s base address.
• Decrement R4 to indicate that we are done with the 1st student.
• Now repeat part 2. Since R0 now points to the 2nd student, we end up adding the
2nd student’s marks to those of the 1st student.
• When we execute Add #16,R0 again, we move to the 3rd student and add the 3rd student’s marks to
those of the 2nd and 1st.
• This goes on until “Decrement R4” results in 0, which means we are done with all students.
Part 3: At this point, R1 has the sum of Test 1 marks of all students, R2 has the sum of Test 2
marks of all students, R3 has the sum of Test 3 marks of all students.
Move R1,SUM1 // Copies the contents of R1 to SUM1
Move R2,SUM2 // Copies the contents of R2 to SUM2
Move R3,SUM3 // Copies the contents of R3 to SUM3.
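The three parts of the program can be followed in a sketch that mirrors the register usage described above. It assumes each student record occupies four 4-byte words (ID, Test 1, Test 2, Test 3), that the list starts at the hypothetical address 1000, and that there is at least one student; the marks are made up.

```python
def sum_test_marks(memory, list_base, n):
    """Total the Test 1, Test 2 and Test 3 marks of n student records."""
    # Part 1: initialization.
    r0 = list_base            # Move #List,R0
    r1 = r2 = r3 = 0          # Clear R1, Clear R2, Clear R3
    r4 = n                    # Move N,R4
    # Part 2: calculation, one pass per student.
    while True:
        r1 += memory[r0 + 4]      # Add 4(R0),R1
        r2 += memory[r0 + 8]      # Add 8(R0),R2
        r3 += memory[r0 + 12]     # Add 12(R0),R3
        r0 += 16                  # Add #16,R0   (next student record)
        r4 -= 1                   # Decrement R4
        if not r4 > 0:            # Branch>0 back to the start of part 2
            break
    # Part 3: the totals would be moved to SUM1, SUM2 and SUM3.
    return r1, r2, r3


# Two hypothetical student records: (ID, Test 1, Test 2, Test 3).
memory = {
    1000: 101, 1004: 10, 1008: 20, 1012: 30,
    1016: 102, 1020: 15, 1024: 25, 1028: 35,
}
print(sum_test_marks(memory, 1000, 2))   # (25, 45, 65)
```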
Note: In all the examples so far, a register held the base address and the index value was
specified directly as a constant. Example: Add 4(R0),R1
The constant can also be stored in a register, in which case the operand is written as (Ri,Rj).
Example: Add (R5,R0),R1
where R5 can store the value 4. This form is generally used for 2-D array manipulation.
Note: Another variation adds a constant value to the two-register notation.
This is represented as X(Ri,Rj).
3.8.4 Relative Addressing
In this mode the program counter is used instead of a general purpose register, i.e. X(PC).
vii. Relative Mode: The effective address is determined by the index mode using the program
counter in place of the general purpose register Ri.
Example: Branch>0 LOOP
This instruction causes program execution to go to the branch target location identified by the name
LOOP if the branch condition is satisfied. This location can be computed by specifying it as an
offset from the current value of the program counter.
Note: The relative mode is similar to the index mode discussed previously, but here the base address
is specified by the program counter (PC) instead of some register Ri.
Relative mode is written as X(PC), whereas index mode is written as X(Ri).
3.8.5 Additional modes
viii. Auto-increment mode: The effective address of the operand is the contents of a register
specified in the instruction. After accessing the operand, the contents of this register are
automatically incremented to point to the next item in a list.
This mode is written as (Ri)+.
If register Ri points to a 32-bit operand, then (Ri)+ moves the pointer by 4 bytes to
the next word.
ix. Auto-decrement mode: The contents of the register specified in the instruction are first
automatically decremented and then used as the effective address of the operand.
This mode is written as -(Ri).
It is similar to auto-increment except that
• the address is decremented before it is used to access the operand, and
• the operation is a decrement rather than an increment.
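The difference in ordering between the two modes can be seen in a short sketch. It assumes 32-bit operands (a step of 4 bytes) and a hypothetical list of three numbers starting at address 1000; the helper names are illustrative.

```python
def auto_increment(memory, regs, reg, step=4):
    """(Ri)+ : use [Ri] as the effective address, then increment Ri."""
    ea = regs[reg]
    regs[reg] += step
    return memory[ea]


def auto_decrement(memory, regs, reg, step=4):
    """-(Ri) : decrement Ri first, then use [Ri] as the effective address."""
    regs[reg] -= step
    return memory[regs[reg]]


memory = {1000: 3, 1004: 5, 1008: 7}   # hypothetical list of numbers
regs = {"R2": 1000}

total = 0
for _ in range(3):
    total += auto_increment(memory, regs, "R2")   # like Add (R2)+,R0
print(total, regs["R2"])      # 15 1012

last = auto_decrement(memory, regs, "R2")         # steps back to the last item
print(last, regs["R2"])       # 7 1008
```

Used together, the two modes give stack-like behaviour: auto-increment walks forward through a list, and auto-decrement revisits the most recently passed item.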
The program below adds “n” numbers using the auto-increment mode.