Cs6303comparchnotes PDF

This document contains a lesson plan for a class on computer architecture. It outlines the objectives, skills, and outcomes of the lesson, which is to teach students the eight foundational ideas of computer system design: (1) designing for Moore's law, (2) using abstraction, (3) making the common case fast, (4) performance via parallelism, (5) performance via pipelining, (6) performance via prediction, (7) hierarchy of memories, and (8) dependability via redundancy. The lesson plan allocates 40 minutes to cover these eight ideas through lecture notes and examples.

Uploaded by

Gowri Shankar
Copyright
© All Rights Reserved

SRI VIDYA COLLEGE OF ENGINEERING AND

TECHNOLOGY, VIRUDHUNAGAR
Department of Information Technology

Class: II Year (03 Semester)
Subject Code: CS6303
Subject: Computer Architecture
Prepared By: P. Kaviya
Lesson Plan for Introduction to Unit I – Overview & Instructions
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No. 1 / 12
1.CONTENT LIST:
Introduction to Unit I – Overview & Instructions

2. SKILLS ADDRESSED:
Listening

3. OBJECTIVE OF THIS LESSON PLAN:


To help students understand the basic concepts of a computer and its components.

4.OUTCOMES:
i. Explain the concept of components of computer system
ii. Know the instructions and addressing modes
iii.
5.LINK SHEET:
i. What are the components of Computer system?
ii. What are the topics covered in overview and instructions
iii.
6.EVOCATION: (5 Minutes)
7.Lecture Notes: (40 Minutes)

Embedded Computer: Performs a single function on a microprocessor
• Embedded within a product (e.g. microwave, car, cell phone)
• Objective: Low cost
• Increasingly designed in a hardware description language, like Verilog or VHDL
• Processor core allows application-specific hardware to be fabricated on a single chip.

Desktop Computer: Designed for individual use
• Also called personal computer, workstation

Server: Runs large, specialized program(s)
• Shared by many users: more memory, higher speed, better reliability
• Accessed via a network using a request-response (client-server) interface
• Example: File server, Database server, Web server

Supercomputer: Massive computing resources and memory
• Hundreds to thousands of processors within a single computer
• Terabytes of memory
• Program uses multiple processors simultaneously
• Rare due to extreme expense
• Applications: Weather forecasting, military simulations, etc.

What types of applications are concerned about:
• Memory?
• Processing speed?
• Usability?
• Maintainability?

How can the following impact performance?
• A selected algorithm?
• A programming language?
• A compiler?
• An operating system?
• A processor?
• I/O system/devices?

Computer Architect must balance speed and cost across the system
• System is measured against specification
• Benchmark programs measure performance of systems/subsystems
• Subsystems are designed to be in balance with each other

Usage of unit prefixes (kilo, mega, giga):
• Normal (power of 10): Data communications, time, clock frequencies
• Power of 2: Memory (often)
Memory units:
• Bit (b): 1 binary digit
• Nibble: 4 binary digits
• Byte (B): 8 binary digits
• Word: Commonly 32 binary digits (but may be 64).
• Half Word: Half the binary digits of a word
• Double Word: Double the binary digits of a word

Common Use:
• 10 Mbps = 10 Mb/s = 10 Megabits per second
• 10 MB = 10 Megabytes
• 10 MIPS = 10 Million Instructions Per Second

Moore's Law:
• Component density increase per year: 1.6
• Processor performance increase: 1.5, more recently 1.2 and < 1.2
• Memory capacity improvement: 4/3 = 1.33

Tradeoffs in Power versus Clock Rate
• Faster Clock Rate = Faster processing = More power
• More transistors = More complexity = More power
Example Problems:
A disk operates at 7200 Revolutions per minute (RPM). How long does it take to revolve once?
7200 revs / 60 seconds = 1 rev / x seconds
(7200/60) x = 1
120x = 1
x = 1/120 = 0.00833 second = 8.33 milliseconds or 8.33 ms

A disk holds 600 GB. How many bytes does it hold?
600 GB = 600 x 2^30 = 600 x 1,073,741,824 = 644,245,094,400 bytes

A LAN operates at 10 Mbps. How long will it take to transfer a packet of 1000 bytes?
(Optimistically assuming 100% efficiency)
Method 1 (time for one byte, then scale to 1000 bytes):
10 Mb / 1 sec = 8 bits / x sec
10,000,000x = 8
x = 8/10,000,000 = 0.000,000,8 sec = 800 ns
1000 x 800 ns = 800 us
Method 2 (whole packet at once):
10 Mb / 1 sec = 8000 bits / x sec
10,000,000x = 8000
x = 8000/10,000,000 = 8/10,000 = 0.0008 sec = 800 us
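The three example problems can be checked with a few lines of Python. This is only a sketch: the function names are illustrative and not part of the notes, and the disk capacity uses the power-of-2 convention (1 GB = 2^30 bytes) that the worked answer uses.

```python
# Sketch of the three example problems. Function names are illustrative.

def seconds_per_revolution(rpm: float) -> float:
    """Time for one revolution of a disk spinning at `rpm` revolutions per minute."""
    return 60.0 / rpm


def gigabytes_to_bytes(gb: int) -> int:
    """Power-of-2 convention used in the worked answer: 1 GB = 2**30 bytes."""
    return gb * 2**30


def transfer_time_seconds(num_bytes: int, bits_per_second: float) -> float:
    """Time to send `num_bytes` over a link, assuming 100% efficiency."""
    return num_bytes * 8 / bits_per_second


if __name__ == "__main__":
    print(f"{seconds_per_revolution(7200) * 1e3:.2f} ms per revolution")
    print(f"{gigabytes_to_bytes(600):,} bytes")
    print(f"{transfer_time_seconds(1000, 10e6) * 1e6:.0f} us per packet")
```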
8. Textbook:
• Carl Hamacher, Zvonko Vranesic and Safwat Zaky, "Computer Organization", Fifth Edition,
Tata McGraw Hill, 2002, pp. 3-9

9. Application
Processor, Embedded system
Lesson Plan for Eight ideas
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No. 2/ 12
1.CONTENT LIST:
Eight ideas
2. SKILLS ADDRESSED:
Learning
Understanding
3.OBJECTIVE OF THIS LESSON PLAN:
To introduce students to the eight great ideas of computer system design
4.OUTCOMES:
i. Learn the basics underlying the eight ideas
ii. Remember the basic concepts of the eight ideas
5.LINK SHEET:
i. List the eight ideas followed in designing a computer system.
ii. Explain the concept of the eight ideas
6.EVOCATION: (5 Minutes)
7. Lecture Notes (40 minutes)

• Design for Moore's Law

One constant for computer designers is rapid change, which is driven largely by Moore's
Law. It states that integrated circuit resources double every 18–24 months. Moore's Law resulted
from a 1965 prediction of such growth in IC capacity made by Gordon Moore, one of the
founders of Intel. As computer designs can take years, the resources available per chip can easily
double or quadruple between the start and finish of the project. Like a skeet shooter, computer
architects must anticipate where the technology will be when the design finishes rather than
design for where it starts. We use an "up and to the right" Moore's Law graph to represent
designing for rapid change.

• Use abstraction to simplify design

Both computer architects and programmers had to invent techniques to make themselves
more productive, for otherwise design time would lengthen as dramatically as resources
grew by Moore's Law. A major productivity technique for hardware and software is to
use abstractions to represent the design at different levels of representation; lower-level
details are hidden to offer a simpler model at higher levels. We'll use the abstract
painting icon to represent this second great idea.

• Make the common case fast

Making the common case fast will tend to enhance performance better than optimizing
the rare case. Ironically, the common case is often simpler than the rare case and hence is
often easier to enhance. This common sense advice implies that you know what the
common case is, which is only possible with careful experimentation and measurement.
We use a sports car as the icon for making the common case fast, as the most common
trip has one or two passengers, and it's surely easier to make a fast sports car than a fast
minivan.
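A small numeric sketch can make this idea concrete. The workload below is entirely made up for illustration (90% of operations take 1 time unit, 10% take 2 units), and the function name is hypothetical; it compares the average time per operation when either the common case or the rare case is made twice as fast.

```python
def average_time(cases):
    """Weighted average time per operation.

    `cases` is a list of (fraction_of_operations, time_per_operation) pairs.
    """
    return sum(fraction * time for fraction, time in cases)


# Hypothetical workload (assumed numbers, not from the notes):
# 90% of operations take 1 time unit, 10% take 2 time units.
baseline = average_time([(0.9, 1.0), (0.1, 2.0)])

# Option A: make the common case twice as fast.
common_faster = average_time([(0.9, 0.5), (0.1, 2.0)])

# Option B: make the rare case twice as fast.
rare_faster = average_time([(0.9, 1.0), (0.1, 1.0)])

print(f"baseline:           {baseline:.2f}")
print(f"common case faster: {common_faster:.2f}")
print(f"rare case faster:   {rare_faster:.2f}")
```

With these assumed numbers, speeding up the common case helps far more. If the rare case were much slower, it could dominate the total time instead, which is why the text stresses measurement before optimizing.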

• Performance via parallelism

Since the dawn of computing, computer architects have offered designs that get more
performance by performing operations in parallel. We'll see many examples of
parallelism in this book. We use multiple jet engines of a plane as our icon for parallel
performance.

• Performance via pipelining

A particular pattern of parallelism is so prevalent in computer architecture that it merits
its own name: pipelining. For example, before fire engines, a "bucket brigade" would
respond to a fire, which many cowboy movies show in response to a dastardly act by the
villain. The townsfolk form a human chain to carry a water source to the fire, as they could
much more quickly move buckets up the chain instead of individuals running back and
forth. Our pipeline icon is a sequence of pipes, with each section representing one stage
of the pipeline.

• Performance via prediction

Following the saying that it can be better to ask for forgiveness than to ask for
permission, the next great idea is prediction. In some cases it can be faster on average to
guess and start working rather than wait until you know for sure, assuming that the
mechanism to recover from a misprediction is not too expensive and your prediction is
relatively accurate. We use the fortune-teller's crystal ball as our prediction icon.

• Hierarchy of memories

Programmers want memory to be fast, large, and cheap, as memory speed often shapes
performance, capacity limits the size of problems that can be solved, and the cost of
memory today is often the majority of computer cost. Architects have found that they can
address these conflicting demands with a hierarchy of memories, with the fastest,
smallest, and most expensive memory per bit at the top of the hierarchy and the slowest,
largest, and cheapest per bit at the bottom. Caches give the programmer the illusion that
main memory is nearly as fast as the top of the hierarchy and nearly as big and cheap as
the bottom of the hierarchy. We use a layered triangle icon to represent the memory
hierarchy. The shape indicates speed, cost, and size: the closer to the top, the faster and
more expensive per bit the memory; the wider the base of the layer, the bigger the
memory.

• Dependability via redundancy

Computers not only need to be fast; they need to be dependable. Since any physical
device can fail, we make systems dependable by including redundant components that
can take over when a failure occurs and help detect failures. We use the tractor-trailer
as our icon, since the dual tires on each side of its rear axles allow the truck to continue
driving even when one tire fails. (Presumably, the truck driver heads immediately to a
repair facility so the flat tire can be fixed, thereby restoring redundancy!)

8. Textbook:
• Carl Hamacher, Zvonko Vranesic and Safwat Zaky, "Computer Organization", Fifth Edition,
Tata McGraw Hill, 2002, pp. 3-9
9. Application
Processor, Embedded system
Lesson Plan for Components of a computer system
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No. 3/ 12
1.CONTENT LIST:
Components of a computer system
2. SKILLS ADDRESSED:
Learning
Remembering
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students know the main components of computer system
4.OUTCOMES:
i. Learn the basics of computer components
ii. Remember operation of computer components
5.LINK SHEET:
i. List the components of computer system?
ii. Sketch the components of computer system with neat diagram.
iii. Explain the operation of all the components of computer system.
6.EVOCATION: (5 Minutes)
7.Lecture Notes: (40 Minutes)

Computer components include:
• Input: keyboard, mouse, network, disk
• Output: printer, video screen, network, disk
• Memory: DRAM, magnetic disk
• CPU: Intelligence: Includes Datapath and Control

Input/Output

Mouse:
• Electromechanical: Rolling ball indicates change in position as (x,y) coordinates.
• Optical: Camera samples 1500 times per second. Optical processor compares images and
determines distance moved.

Displays:
Raster Refresh Buffer: Holds the bitmap or matrix of pixel values.
• Matrix of Pixels: low resolution: 512 x 340 pixels to high resolution: 2560 x 1600 pixels
  Black & White: 1 bit per pixel
  Grayscale: 8 bits per pixel
  Color (one method): 8 bits each for red, green, blue = 24 bits
• Required: Refresh the screen periodically to avoid flickering
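The size of the raster refresh buffer follows directly from the resolution and the bits per pixel. The sketch below works this out for the resolutions and pixel depths listed above; the function name is illustrative, and it assumes the buffer holds exactly one frame with no padding.

```python
def frame_buffer_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes needed to hold one full frame, rounded up to a whole byte."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


if __name__ == "__main__":
    for label, w, h, bpp in [
        ("512 x 340, black & white", 512, 340, 1),
        ("512 x 340, grayscale", 512, 340, 8),
        ("2560 x 1600, 24-bit color", 2560, 1600, 24),
    ]:
        print(f"{label}: {frame_buffer_bytes(w, h, bpp):,} bytes")
```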

Two types of Displays:
Cathode Ray Tube (CRT): Pixel is source of light
• Scans one line at a time with a refresh rate of 30-75 times per second
Liquid Crystal Display (LCD): LCD pixel controls or bends the light for the display.
• Color active matrix LCD: Three transistor switches per pixel

Networking: Communications between computers
Local Area Network (LAN): A network which spans a small area, e.g. within a building
Wide Area Network (WAN): A network which extends hundreds of miles; typically managed
by a communications service provider
Memory Hierarchy:
CPU registers -> Cache -> Main Memory -> Magnetic Disk -> Tape
(fast, expensive, volatile on the left; slow, cheap, non-volatile on the right)


Secondary Memory: Nonvolatile memory used to store programs and data when not running
• Nonvolatile: Does not lose data when powered off
• Includes:
  Magnetic Disk: Access time: 5-15 ms
  Tape: Sometimes used for backup
  Optical Disk: CD or DVD
  FLASH: Removable memory cards attach via USB
  Floppy and Zip: Removable form of magnetic disk

Magnetic Disk: Movable arm moves to a concentric circle (track), then reads or writes
• Disk diameter: 1 to 3.5 inches
• Latency (seek): Moving head to 'cylinder' or concentric track
• Rotation Time: Rotating the disk to the correct location on the 'track'
• Transfer Time: Reading or writing to disk on 'track'
• Access time: 5-20 ms

Optical Disk: Laser uses spiral pattern to write bits as pits or flats.
• Compact Disc (CD): Stores music
• Digital Versatile Disc (DVD): Multi-gigabyte capacity required for films
• Read-write procedure similar to Magnetic Disk (but optical write, not magnetic)

Flash Memory: Nonvolatile semiconductor memory
• More expensive than disk, but also more rugged and with lower latency.
• Good for 100,000-1M writes.
• Common in cameras, portable music players, memory sticks

Primary or Main Memory: Programs are retained while they are running. Uses:
• Dynamic Random Access Memory (DRAM)
  Built as an integrated circuit, equal speed to any location in memory
  Access time: 50-70 ns.
• SIMM (Single In-line Memory Module): DRAM memory chips lined up in a row, often on
a daughter card
• DIMM (Dual In-line Memory Module): Two rows of memory chips
• ROM (Read Only Memory) or EPROM (Erasable Programmable ROM)

Cache: Buffer to the slower, larger main memory. Uses:
• Static Random Access Memory (SRAM)
• Faster, less dense and more expensive than DRAM
  Uses multiple transistors per bit instead of the single transistor for DRAM

Registers: Fastest memory within the CPU.

Central Processing Unit (CPU) or Processor: Intelligence
• Data Path: Performs arithmetic operations using registers
• Control: Management of flow of information through the data path

Bus: Connects the CPU, Memory, I/O Devices
• Bits are transmitted between the CPU, Memory, I/O Devices in a timeshared way
• Serial buses transmit one bit at a time.
• Parallel buses transmit many bits simultaneously: one bit per line

One bus system: Memory, CPU, I/O Subsystem on same bus

Two bus system:
• One bus: CPU <-> Memory
• One bus: CPU <-> I/O Subsystem

Example: Universal Serial Bus (USB 2.0)
• Hot-pluggable: can be plugged and unplugged without damage to the system
• Operates at 0.2, 1.5 or 60 MB/sec
• Can interface to printer or other slow devices

8. Textbook:
• Carl Hamacher, Zvonko Vranesic and Safwat Zaky, "Computer Organization", Fifth Edition,
Tata McGraw Hill, 2002, pp. 3-9
9. Application
Processor, Embedded system, Notebook Computers, Handheld Computers
Lesson Plan for Technology
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No.4/ 12
1.CONTENT LIST:
Technology
2. SKILLS ADDRESSED:
Listening
Understanding
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students know the technology involved in computer architecture
4.OUTCOMES:
i. Learn the technology involved in computer system
ii. Remember the different types of technology
5.LINK SHEET:
i. Explain the technology involved in computer architecture
ii. Discuss the types of technology in detail.
6.EVOCATION: (5 Minutes)

7.Lecture Notes: (40 Minutes)


Embedded Computer: Performs a single function on a microprocessor
• Embedded within a product (e.g. microwave, car, cell phone)
• Objective: Low cost
• Increasingly designed in a hardware description language, like Verilog or VHDL
• Processor core allows application-specific hardware to be fabricated on a single chip.

Desktop Computer: Designed for individual use
• Also called personal computer, workstation

Server: Runs large, specialized program(s)
• Shared by many users: more memory, higher speed, better reliability
• Accessed via a network using a request-response (client-server) interface
• Example: File server, Database server, Web server

Supercomputer: Massive computing resources and memory
• Hundreds to thousands of processors within a single computer
• Terabytes of memory
• Program uses multiple processors simultaneously
• Rare due to extreme expense
• Applications: Weather forecasting, military simulations, etc.

CPUs:
• Device density: 2x every 1.5 years (~60% per year)
• Latency: 2x every 5 years (~15% per year)
Memory (DRAM):
• Capacity: 4x every 3 years (~60% per year); 2x every two years lately
• Latency: 1.5x every 10 years
• Cost per bit: decreases about 25% per year
Hard drives:
• Capacity: 4x every 3 years (~60% per year)
• Bandwidth: 2.5x every 4 years
• Latency: 2x every 5 years
Boards:
• Wire density: 2x every 15 years
Cables:
• No change
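The "per year" percentages above come from converting a growth factor over several years into an equivalent compound annual rate. A minimal sketch of that conversion follows; the function name is illustrative.

```python
def annual_growth_rate(factor: float, years: float) -> float:
    """Per-year multiplier equivalent to growing by `factor` over `years`.

    Compound growth: (annual multiplier) ** years == factor.
    """
    return factor ** (1.0 / years)


if __name__ == "__main__":
    for label, factor, years in [
        ("CPU device density (2x / 1.5 years)", 2, 1.5),
        ("CPU latency (2x / 5 years)", 2, 5),
        ("DRAM capacity (4x / 3 years)", 4, 3),
        ("Board wire density (2x / 15 years)", 2, 15),
    ]:
        rate = annual_growth_rate(factor, years)
        print(f"{label}: about {(rate - 1) * 100:.0f}% per year")
```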
Physical Hardware

Semiconductor: Conducts electricity poorly
Silicon Ingot: Made of silicon, a substance found in sand
Wafer: Ingot is sliced into 0.1-inch blank wafers
Processing: Materials are added to wafers to form conductors, insulators, or transistors (on/off switches)
Diced: Wafers are cut into smaller components called dies or chips
Yield: Wafers/dies are tested, providing a % success rate. Failures are discarded
Bonding: The chip is connected to the input/output pins of a package

Integrated Circuit = chip: A device containing up to millions of transistors
Very Large Scale Integrated Circuit (VLSI): A device containing hundreds of thousands to
millions of transistors

Computer chassis vocabulary:
• Motherboard: Holds the processor, system bus, various interfaces and connectors
• Daughter card: Small printed circuit board, often contains multiple memory chips.
• Cage or Chassis: Holds multiple boards
• Backplane: Contains bus interface for boards to communicate
• 3D Packaging: Transistors interconnect above, beside, below (3D)

8. Textbook:
• Carl Hamacher, Zvonko Vranesic and Safwat Zaky, "Computer Organization", Fifth Edition,
Tata McGraw Hill, 2002, pp. 3-9
9. Application
Memory, Processing speed, Usability, Maintainability
Lesson Plan for Performance and Powerwall
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No.5/ 12
1.CONTENT LIST:
Performance and Powerwall
2. SKILLS ADDRESSED:
Learning
Understanding
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the performance and Powerwall in computer
architecture
4.OUTCOMES:
i. Learn the performance of computer architecture
ii. Understand the power consumption and Powerwall in computer architecture
5.LINK SHEET:
i. List the performance factors to design computer system
ii. Explain the performance involved in computer architecture
iii. Discuss the types of power wall in detail.
6.EVOCATION: (5 Minutes)
7.Lecture Notes: (40 Minutes)

Purchasing perspective
• given a collection of machines, which has the
o best performance?
o least cost?
o best cost/performance?
Design perspective
• faced with design options, which has the
o best performance improvement?
o least cost?
o best cost/performance?
Both require
• basis for comparison
• metric for evaluation
Our goal is to understand what factors in the architecture contribute to overall system
performance and the relative importance (and cost) of these factors
Which airplane has the best performance?
[Figure: bar chart of passenger capacity (axis 0 to 500) for the Boeing 777, Boeing 747, BAC/Sud Concorde, and Douglas DC-8-50]

Response Time and Throughput
• Response time
o How long it takes to do a task
• Throughput
o Total work done per unit time
  e.g., tasks/transactions/… per hour
• How are response time and throughput affected by
o Replacing the processor with a faster version?
o Adding more processors?
Relative Performance

Define Performance = 1/Execution Time

"X is n times faster than Y" means

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

• Example: time taken to run a program
o 10s on A, 15s on B
o Execution Time_B / Execution Time_A = 15s / 10s = 1.5
o So A is 1.5 times faster than B
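The definition above can be written as a short sketch. The function names are illustrative; the numbers are the 10 s and 15 s from the example.

```python
def performance(execution_time: float) -> float:
    """Performance is defined as 1 / execution time."""
    return 1.0 / execution_time


def times_faster(time_x: float, time_y: float) -> float:
    """How many times faster X is than Y: Performance_X / Performance_Y."""
    return performance(time_x) / performance(time_y)


if __name__ == "__main__":
    n = times_faster(10.0, 15.0)  # A takes 10 s, B takes 15 s
    print(f"A is {n:.1f} times faster than B")
```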

Measuring Execution Time
• Elapsed time
o Total response time, including all aspects
  Processing, I/O, OS overhead, idle time
o Determines system performance
• CPU time
o Time spent processing a given job
  Discounts I/O time, other jobs' shares
o Comprises user CPU time and system CPU time
o Different programs are affected differently by CPU and system performance
CPU Clocking
• Operation of digital hardware governed by a constant-rate clock
• Clock period: duration of a clock cycle
o e.g., 250 ps = 0.25 ns = 250×10^-12 s
• Clock frequency (rate): cycles per second
o e.g., 4.0 GHz = 4000 MHz = 4.0×10^9 Hz
CPU Time

$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$

• Performance improved by
o Reducing number of clock cycles
o Increasing clock rate
• Hardware designer must often trade off clock rate against cycle count

$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction (CPI)}$$

$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$

• Instruction Count for a program
o Determined by program, ISA and compiler
• Average cycles per instruction
o Determined by CPU hardware
o If different instructions have different CPI
  Average CPI affected by instruction mix
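The CPU time equation and the effect of instruction mix can be sketched as follows. The instruction mix and instruction count are hypothetical values chosen for illustration; the 4.0 GHz clock rate reuses the clock-frequency example above, and the function names are not from the notes.

```python
def average_cpi(mix):
    """Average CPI for an instruction mix.

    `mix` is a list of (fraction_of_instructions, cpi_of_that_class) pairs.
    """
    return sum(fraction * cpi for fraction, cpi in mix)


def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU Time = Instruction Count x CPI / Clock Rate (in seconds)."""
    return instruction_count * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical mix: 50% of instructions at CPI 1, 30% at CPI 2, 20% at CPI 3.
    mix = [(0.5, 1), (0.3, 2), (0.2, 3)]
    cpi = average_cpi(mix)
    # Hypothetical program of one billion instructions on a 4.0 GHz clock.
    seconds = cpu_time(1e9, cpi, 4.0e9)
    print(f"average CPI = {cpi:.2f}, CPU time = {seconds:.3f} s")
```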
8. Textbook:
• Carl Hamacher, Zvonko Vranesic and Safwat Zaky, "Computer Organization", Fifth Edition,
Tata McGraw Hill, 2002, pp. 3-9
9. Application
Processor, Embedded system
Lesson Plan for Uniprocessors to multiprocessors
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No.6/ 12
1.CONTENT LIST:
Uniprocessors to multiprocessors
2. SKILLS ADDRESSED:
Learning
Remembering
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students know the history of computer system
4.OUTCOMES:
i. Learn the history of computer architecture
ii. Understand the advantages of transformation from uniprocessors to
multiprocessors
5.LINK SHEET:
i. Explain the evolution of computer architecture.
ii. List the nature of uniprocessors.
iii. Discuss the transformation of uniprocessors to multiprocessors.
6.EVOCATION: (5 Minutes)
7.Lecture Notes: (40 Minutes)
If we open the box containing the computer, we see a fascinating board of thin plastic, covered
with dozens of small gray or black rectangles.
The motherboard is shown in the upper part of the photo. Two disk drives are in front: the hard
drive on the left and a DVD drive on the right. The hole in the middle is for the laptop battery.
The small rectangles on the motherboard contain the devices that drive our advancing
technology, called integrated circuits and nicknamed chips.
The board is composed of three pieces: the piece connecting to the I/O devices mentioned earlier,
the memory, and the processor.
The memory is where the programs are kept when they are running; it also contains the data
needed by the running programs. Figure 1.8 shows that memory is found on the two small
boards, and each small memory board contains eight integrated circuits.

The processor is the active part of the board, following the instructions of a program
to the letter. It adds numbers, tests numbers, signals I/O devices to activate, and so on.
Occasionally, people call the processor the CPU, for the more bureaucratic-sounding central
processor unit.
Descending even lower into the hardware, the processor logically comprises two main
components: datapath and control, the respective brawn and brain of the processor.
The datapath performs the arithmetic operations, and control tells the datapath, memory, and I/O
devices what to do according to the wishes of the instructions of the program. This explains the
datapath and control for a higher-performance design.
Descending into the depths of any component of the hardware reveals insights into the computer.
Inside the processor is another type of memory: cache memory.
Cache memory consists of a small, fast memory that acts as a buffer for the DRAM memory.
(The nontechnical definition of cache is a safe place for hiding things.)
Cache is built using a different memory technology, static random access memory (SRAM).
SRAM is faster but less dense, and hence more expensive, than DRAM.
You may have noticed a common theme in both the software and the hardware descriptions:
delving into the depths of hardware or software reveals more information or, conversely,
lower-level details are hidden to offer a simpler model at higher levels. The use of such layers,
or abstractions, is a principal technique for designing very sophisticated computer systems.

8. Textbook:
• Carl Hamacher, Zvonko Vranesic and Safwat Zaky, "Computer Organization", Fifth Edition,
Tata McGraw Hill, 2002, pp. 3-9
9. Application
Processor, Embedded system, DVD
Lesson Plan for Instructions – operations and operands
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No.7/ 12
1.CONTENT LIST:
Instructions – operations and operands
2. SKILLS ADDRESSED:
Learning
Remembering
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the instruction set and its operations
4.OUTCOMES:
i. Learn the set of instructions involved in computer architecture
ii. know the operand and operations of instructions
5.LINK SHEET:
i. List the instructions available in computer system
ii. Explain the operands and operations using instruction set
6.EVOCATION: (5 Minutes)
7.Lecture Notes: (40 Minutes)

INSTRUCTION AND INSTRUCTION SEQUENCING

A computer must have instructions capable of performing the following operations:
• Data transfer between memory and processor registers.
• Arithmetic and logical operations on data.
• Program sequencing and control.
• I/O transfer.

Register Transfer Notation:
The possible locations in which transfer of information occurs are:
• Memory locations
• Processor registers
• Registers in the I/O subsystem

Instruction Execution and Straight-line Sequencing:
Instruction Execution:
There are 2 phases for instruction execution:
• Instruction Fetch
• Instruction Execution

Instruction Fetch:
The instruction is fetched from the memory location whose address is in the PC (program counter).
It is placed in the IR (instruction register).
Instruction Execution:
The instruction in the IR is examined to determine which operation is to be performed.
Program Execution Steps:
To begin executing a program, the address of its first instruction must be placed in the PC.
The processor control circuits use the information in the PC to fetch and execute instructions one
at a time in order of increasing addresses.
This is called straight-line sequencing. During the execution of each instruction, the PC is
incremented by 4 to point to the address of the next instruction.
Branching:
The addresses of the memory locations containing the n numbers are symbolically given as
NUM1, NUM2, ..., NUMn.
A separate Add instruction is used to add each number to the contents of register R0.
After all the numbers have been added, the result is placed in memory location SUM.

Fig: Straight-line sequencing program for adding 'n' numbers

Using a loop to add 'n' numbers:
The number of entries in the list, 'n', is stored in memory location M. Register R1 is used as a
counter to determine the number of times the loop is executed.
The contents of location M are loaded into register R1 at the beginning of the program.
The loop starts at location Loop and ends at the instruction Branch>0. During each pass, the address of
the next list entry is determined, and the entry is fetched and added to R0.

The contents of R1 are reduced by 1 each time through the loop.

A conditional branch instruction causes a branch only if a specified condition is satisfied.

Fig: Using a loop to add 'n' numbers

Branch>0 Loop
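The loop program can be mimicked in Python to show the roles of R0, R1 and the Branch>0 test. This is a sketch only: the list is passed in directly rather than read from memory locations NUM1..NUMn, the variable `index` stands in for whatever register holds the address of the next entry, and the function name is illustrative. Like the assembly loop, it assumes n >= 1, because the branch test comes at the end of the loop body.

```python
def add_n_numbers(numbers):
    """Mimic the loop program: R1 counts down, R0 accumulates the sum."""
    if not numbers:
        raise ValueError("the loop program assumes n >= 1")

    r1 = len(numbers)  # load n (the contents of location M) into R1
    r0 = 0             # clear R0, the running sum
    index = 0          # stand-in for the address of the next list entry

    while True:                # Loop:
        r0 += numbers[index]   #   fetch the next entry and add it to R0
        index += 1             #   advance to the next entry
        r1 -= 1                #   reduce R1 by 1
        if not r1 > 0:         #   Branch>0 Loop: repeat only while R1 > 0
            break

    return r0                  # the result would be stored in location SUM


if __name__ == "__main__":
    print(add_n_numbers([3, 5, 7]))
```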
Condition Codes:
Information about the results of various operations, for use by subsequent conditional branch
instructions, is recorded in individual bits often called condition code flags.
Commonly used flags:
• N (Negative): set to 1 if the result is negative, otherwise cleared to 0.
• Z (Zero): set to 1 if the result is 0, otherwise cleared to 0.
• V (Overflow): set to 1 if arithmetic overflow occurs, otherwise cleared to 0.
• C (Carry): set to 1 if a carry-out results from the operation, otherwise cleared to 0.
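As an illustration, the four flags can be computed for an addition. The sketch assumes an 8-bit word and two's-complement interpretation for N and V; the word size and the function name are assumptions made for this example, not something the notes specify.

```python
def add_with_flags(a: int, b: int, bits: int = 8):
    """Add two `bits`-wide values and return (result, flags).

    Flags follow the definitions above, assuming two's-complement numbers.
    """
    mask = (1 << bits) - 1       # e.g. 0xFF for 8 bits
    sign = 1 << (bits - 1)       # position of the sign bit
    a &= mask
    b &= mask
    full = a + b                 # sum before truncation to the word size
    result = full & mask

    flags = {
        "N": int(bool(result & sign)),   # sign bit of the result is 1
        "Z": int(result == 0),           # result is zero
        # Overflow: operands have the same sign, result has the other sign.
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),
        "C": int(full > mask),           # carry out of the top bit
    }
    return result, flags


if __name__ == "__main__":
    for x, y in [(0x7F, 0x01), (0xFF, 0x01), (0x02, 0x03)]:
        result, flags = add_with_flags(x, y)
        print(f"{x:#04x} + {y:#04x} = {result:#04x}  {flags}")
```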

8. Textbook:
• Carl Hamacher, Zvonko Vranesic and Safwat Zaky, "Computer Organization", Fifth Edition,
Tata McGraw Hill, 2002, pp. 3-9

9. Application
Processor, Embedded system, Microchip, Intel cores
Lesson Plan for Instructions – Representing instructions
Time: 50 Minutes
Lesson. No Unit 1 – Lesson No.8/ 12
1.CONTENT LIST:

Representing instructions
2. SKILLS ADDRESSED:
Learning
understanding
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students know the representation of instructions
4.OUTCOMES:
i. Learn the set of representation involve in computer architecture
ii. know the way of representing the instructions
5.LINK SHEET:
i. Give the different types of representation involved in instruction set
ii. Discuss in detail the way to represent instruction set.
6.EVOCATION: (5 Minutes)
7.Lecture Notes: (40 Minutes)
INSTRUCTION AND INSTRUCTION SEQUENCING
A computer must have instruction capable of performing the following operations. They are,
Data transfer between memory and processor register.
Arithmetic and logical operations on data.
Program sequencing and control.
I/O transfer.

Register Transfer Notation:


The possible locations in which transfer of information occurs are,
Memory Location
Processor register
Registers in I/O sub-system.
Instruction Execution and Straight–line Sequencing:
Instruction Execution:
There are 2 phases for Instruction Execution. They are,
Instruction Fetch
Instruction Execution

Instruction Fetch:
The instruction is fetched from the memory location whose address is in the PC. This instruction is placed in the IR.
Instruction Execution:
The instruction in the IR is examined to determine which operation is to be performed.
Program execution Steps:
To begin executing a program, the address of the first instruction must be placed in the PC.
The processor control circuits use the information in the PC to fetch and execute instructions one
at a time, in order of increasing addresses.
This is called straight-line sequencing. During the execution of each instruction, the PC is
incremented by 4 to point to the address of the next instruction.
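
The fetch and execute phases described above can be sketched in a few lines of Python. This is only an illustration: the three-instruction program, its addresses, and the function name run_straight_line are assumptions made for this example, and "execution" is reduced to recording each instruction.

```python
# Hypothetical program: each instruction occupies 4 bytes, so addresses step by 4.
program = {
    0: "Move NUM1,R0",
    4: "Add NUM2,R0",
    8: "Move R0,SUM",
}

def run_straight_line(memory, start_address):
    """Fetch instructions in order of increasing addresses until none is left."""
    pc = start_address          # address of the first instruction is placed in the PC
    executed = []
    while pc in memory:
        ir = memory[pc]         # fetch: the instruction at address PC is placed in the IR
        pc += 4                 # the PC now points to the next instruction
        executed.append(ir)     # execute: here we only record the instruction
    return executed, pc

executed, final_pc = run_straight_line(program, 0)
print(executed, final_pc)
```

The PC ends at 12, the address just past the last instruction, because it is advanced by 4 once per fetched instruction.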

Branching:
The addresses of the memory locations containing the n numbers are symbolically given as
NUM1, NUM2, ..., NUMn.
A separate Add instruction is used to add each number to the contents of register R0.
After all the numbers have been added, the result is placed in memory location SUM.

Fig: Straight-line sequencing program for adding 'n' numbers

Using loop to add ‘n’ numbers:


The number of entries in the list, n, is stored in memory location M. Register R1 is used as a
counter to determine the number of times the loop is executed.
The contents of location M are loaded into register R1 at the beginning of the program.
The loop starts at location Loop and ends at the instruction Branch>0. During each pass, the address of
the next list entry is determined, and the entry is fetched and added to R0.

The contents of R1 are reduced by 1 each time through the loop.

A conditional branch instruction causes a branch only if a specified condition is satisfied.


Fig:Using loop to add ‘n’ numbers:

Branch >0 Loop
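
The loop program can be mimicked in Python as a rough sketch. The variable names r0, r1 and next_entry stand in for the registers and the list pointer and are assumptions of this example. Like the assembly loop, the body runs once before the counter is tested, so the list is assumed to hold at least one entry.

```python
def add_list_with_loop(memory_list):
    """Add the entries of a list using a counter and a conditional branch."""
    r1 = len(memory_list)   # counter: the contents of location M (the value n)
    r0 = 0                  # running sum
    next_entry = 0          # position of the next list entry
    while True:             # corresponds to the label Loop
        r0 += memory_list[next_entry]   # fetch the next entry and add it to R0
        next_entry += 1
        r1 -= 1                         # reduce the counter by 1
        if not r1 > 0:                  # Branch>0 Loop: go back only while R1 > 0
            break
    return r0               # the value that would be stored in SUM

print(add_list_with_loop([3, 5, 7]))
```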


Condition Codes:
The results of various operations are kept for use by subsequent conditional branch instructions. This is
accomplished by recording the required information in individual bits, often called condition
code flags.
Commonly used flags:

N (Negative): set to 1 if the result is negative, otherwise cleared to 0.
Z (Zero): set to 1 if the result is 0, otherwise cleared to 0.
V (Overflow): set to 1 if arithmetic overflow occurs, otherwise cleared to 0.
C (Carry): set to 1 if a carry-out results from the operation, otherwise cleared to 0.
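
As an illustration, the four flags can be computed for a two's-complement addition. This is a sketch under assumptions: the 8-bit default width and the function name add_with_flags are chosen for this example, and overflow is detected with the common rule that both operands share a sign that differs from the sign of the result.

```python
def add_with_flags(a, b, bits=8):
    """Add two bit patterns and return the result with the N, Z, V, C flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    full = a + b                  # sum before truncation to the word length
    result = full & mask
    flags = {
        "N": int(bool(result & sign)),                        # sign bit of the result
        "Z": int(result == 0),                                # result is zero
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),       # signed overflow
        "C": int(full > mask),                                # carry out of the top bit
    }
    return result, flags

print(add_with_flags(0x7F, 0x01))   # 127 + 1 overflows the signed 8-bit range
print(add_with_flags(0xFF, 0x01))   # 255 + 1 wraps to 0 with a carry-out
```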

8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth Edition,
Tata McGraw Hill, 2002, PP. 3-9

9.Application
Processor, Embedded system, intel core and microchip
Lesson Plan for Logical operations – control operations
Time: 45 Minutes
Lesson. No Unit 1 – Lesson No.9/ 12
1.CONTENT LIST:
Logical operations – control operations
2. SKILLS ADDRESSED:
Learning
Remembering
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students learn the logical and control operations
4.OUTCOMES:
i. Learn the logical operations of computer system
ii. Understand the control operations of computer system
5.LINK SHEET:
i. Explain the logical operations of computer.
ii. List the functions of control unit.
iii. Discuss in detail the operations of control unit.
6.EVOCATION: (5 Minutes)
7.Lecture Notes: (40 Minutes)

Logical operations
Although the first computers operated on full words, it soon became clear that it was useful to
operate on fields of bits within a word or even on individual bits. Examining characters within a
word, each of which is stored as 8 bits, is one example of such an operation (see Section 2.9).
It follows that operations were added to programming languages and instruction set architectures
to simplify, among other things, the packing and unpacking of bits into words. These instructions
are called logical operations. Figure 2.8 shows logical operations in C, Java, and MIPS.

The first class of such operations is called shifts. They move all the bits in a word to the left or
right, filling the emptied bits with 0s.
For example, if register $s0 contained
0000 0000 0000 0000 0000 0000 0000 1001 (base two) = 9 (base ten)
and the instruction to shift left by 4 was executed, the new value would be:
0000 0000 0000 0000 0000 0000 1001 0000 (base two) = 144 (base ten)
The dual of a shift left is a shift right. The two MIPS shift instructions are called
shift left logical (sll) and shift right logical (srl).
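
The shift example can be checked with a short Python sketch. The helper names sll and srl mirror the MIPS mnemonics, but the functions themselves are assumptions of this example; the 32-bit mask models the fixed register width, since Python integers are otherwise unbounded.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit word

def sll(value, shamt):
    """Shift left logical: vacated low-order bits are filled with 0s."""
    return (value << shamt) & MASK32

def srl(value, shamt):
    """Shift right logical: vacated high-order bits are filled with 0s."""
    return (value & MASK32) >> shamt

print(sll(9, 4))     # 9 shifted left by 4 bit positions
print(srl(144, 4))   # the dual operation restores the original value
```

Shifting left by i bits gives the same result as multiplying by 2 to the power i, as long as no 1 bits are shifted out of the word.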

To place a value into one of these seas of 0s, there is the dual to AND, called OR. It is a bit-by-
bit operation that places a 1 in the result if either operand bit is a 1.
To elaborate, if the registers $t1 and $t2 are unchanged from the preceding example, the result
of the MIPS operation reg $t0 = reg $t1 | reg $t2 is this value in register $t0:
0000 0000 0000 0000 0011 1101 1100 0000 (base two)
The final logical operation is a contrarian. NOT takes one operand and places a 1
in the result if the operand bit is a 0, and vice versa.
NOT: A logical bit-by-bit operation with one operand that inverts the bits; that is, it replaces
every 1 with a 0, and every 0 with a 1.
NOR: A logical bit-by-bit operation with two operands that calculates the NOT of the OR of the two
operands. That is, it calculates a 1 only if there is a 0 in both operands.
In keeping with the three-operand format, the designers of MIPS decided to include the
instruction NOR (NOT OR) instead of NOT. If one operand is zero, then it is equivalent to NOT:
A NOR 0 = NOT (A OR 0) = NOT (A).
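
A minimal Python sketch of OR, NOT and NOR on 32-bit words follows. The operand values 0x3C00 and 0x0DC0 are assumptions chosen so that their OR matches the bit pattern shown above; the function names are likewise made up for this illustration.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register

def or32(a, b):
    return (a | b) & MASK32

def not32(a):
    return ~a & MASK32

def nor32(a, b):
    return ~(a | b) & MASK32

t1 = 0x00003C00   # assumed contents of $t1
t2 = 0x00000DC0   # assumed contents of $t2
t0 = or32(t1, t2)
print(format(t0, "032b"))
print(nor32(t1, 0) == not32(t1))   # NOR with zero behaves as NOT
```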
Control Unit
Makes all the other parts work together.
Uses a finite state machine (FSM), like our Traffic FSM but much bigger (many inputs and
outputs) and more complicated.
Program Counter (PC)
•Tells the control unit which instruction to execute next (recall that a program is a sequence of
instructions)
•Holds the address of the next instruction (the program is in memory)
•Normally, the next PC is the current PC plus one instruction
Instruction Register (IR)
•Holds the instruction currently being executed
•Decoded to feed signals to other units and inputs to FSM

8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth Edition,
Tata McGraw Hill, 2002, PP. 3-9

9. Application
Processor, Embedded system, Digital logic gates.
Lesson Plan for Addressing and addressing modes
Time: 50 Minutes
Lesson. No Unit 1 – Lesson No.10, 11 / 12
1.CONTENT LIST:
Addressing and addressing modes
2. SKILLS ADDRESSED:
Learning
Understanding
3.OBJECTIVE OF THIS LESSON PLAN:
To make the students learn the addressing and addressing modes
4.OUTCOMES:
i. Learn the definition of addressing
ii. Understand the different modes of addressing
5.LINK SHEET:
i. Define Addressing.
ii. What are the modes of addressing?
iii. Discuss in detail any type of addressing mode.
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)

ADDRESSING MODES
The different ways in which the location of an operand is specified in an instruction are called
addressing modes.
Generic Addressing Modes:
• Immediate mode
• Register mode
• Absolute mode
• Indirect mode
• Index mode
• Base with index
• Base with index and offset
• Relative mode
• Auto-increment mode
• Auto-decrement mode

Implementation of Variables and Constants:

Variables:
The value of a variable can be changed as needed using the appropriate instructions.
There are 2 addressing modes used to access variables. They are
• Register Mode
• Absolute Mode

Register Mode:
The operand is the contents of a processor register.
The name (address) of the register is given in the instruction.
Absolute Mode (Direct Mode):
The operand is in a memory location.
The address of this location is given explicitly in the instruction.

Eg: MOVE LOC,R2


The above instruction uses the register and absolute modes.
The processor registers are used as temporary storage locations, where the data in a register are
accessed using the register mode.
The absolute mode can represent global variables in the program.
Mode            Assembler Syntax    Addressing Function
Register mode   Ri                  EA = Ri
Absolute mode   LOC                 EA = LOC
where EA is the effective address.
Constants:
Address and data constants can be represented in assembly language using the Immediate Mode.
Immediate Mode:
The operand is given explicitly in the instruction.
Eg: Move 200 immediate, R0
It places the value 200 in register R0. The immediate mode is used to specify the value of a source
operand.
In assembly language, writing "immediate" as a subscript is not practical, so the # symbol is used instead.
The instruction can be rewritten as
Move #200,R0
Mode        Assembler Syntax    Addressing Function
Immediate   #value              Operand = value
Indirection and Pointers:
The instruction does not give the operand or its address explicitly. Instead, it provides information
from which the memory address of the operand can be determined. This address is called the effective
address (EA) of the operand.
Indirect Mode:
The effective address of the operand is the contents of a register or memory location whose address
appears in the instruction.
We denote indirection by placing the name of the register or the memory address given in the
instruction in parentheses.
For example, if the address of an operand B is stored in register R1, we can access the operand through
register R1 (indirection).
The register or memory location that contains the address of an operand is called a pointer.
Mode       Assembler Syntax    Addressing Function
Indirect   (Ri)                EA = [Ri]
           (LOC)               EA = [LOC]
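
The modes covered so far can be compared side by side with a small Python model. Everything here is an assumption made for illustration: the register and memory contents, the mode names used as strings, and the function fetch_operand.

```python
# Hypothetical machine state: register R1 and memory location 3000 both act as pointers.
registers = {"R1": 2000, "R2": 7}
memory = {2000: 55, 3000: 2000}

def fetch_operand(mode, spec):
    """Return the operand selected by an addressing mode and its specifier."""
    if mode == "immediate":          # #value: the operand is in the instruction itself
        return spec
    if mode == "register":           # Ri: the operand is the contents of a register
        return registers[spec]
    if mode == "absolute":           # LOC: the operand is in memory at the given address
        return memory[spec]
    if mode == "indirect_register":  # (Ri): the register holds the operand's address
        return memory[registers[spec]]
    if mode == "indirect_memory":    # (LOC): the memory location holds the operand's address
        return memory[memory[spec]]
    raise ValueError("unknown addressing mode: " + mode)

print(fetch_operand("immediate", 200))
print(fetch_operand("indirect_register", "R1"))
```

Both indirect forms take one extra lookup compared with the register and absolute modes, because the pointer must be read before the operand itself.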

Indexing and Arrays:


Index Mode:
The effective address of an operand is generated by adding a constant value to the contents of a
register.
The register used may be either a special-purpose or a general-purpose register.
We indicate the index mode symbolically as X(Ri),
where X denotes the constant value contained in the instruction and
Ri is the name of the register involved.
The effective address of the operand is
EA = X + [Ri]

In one use of this mode, the index register R1 contains the address of a memory location, and the value
of X defines an offset (also called a displacement) from that address.
To find the operand:
First, read the contents of register R1, which is 1000.
Then add the offset 20 to the contents 1000 to get the effective address:
1000 + 20 = 1020

In the alternative use, the constant X refers to a memory address, and the contents of the index
register define the offset to the operand.
In either case, the effective address is the sum of two values: one is given explicitly in the
instruction and the other is stored in a register.

Eg: Add 20(R1),R2 gives EA = 1000 + 20 = 1020
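
A brief Python sketch of the index-mode calculation, using the numbers from the example above. The array walk at the end is an added illustration that assumes 4-byte elements starting at a made-up base address.

```python
def index_effective_address(x, register_contents):
    """Index mode: EA = X + [Ri]."""
    return x + register_contents

# Example from the text: R1 holds 1000 and the instruction carries the offset 20.
print(index_effective_address(20, 1000))

# Alternative use: X is a fixed base address and the register steps through an array.
base_address = 5000          # hypothetical address of the first array element
element_addresses = [index_effective_address(base_address, 4 * i) for i in range(3)]
print(element_addresses)
```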

Relative Addressing:
It is the same as index mode. The difference is that the program counter (PC) is used instead of a
general-purpose register.
Relative Mode:
The effective address is determined by the index mode, using the PC in place of the general-purpose
register.
This mode can be used to access data operands, but its most common use is to specify the target
address in branch instructions. Eg: Branch>0 Loop
If the branch condition is satisfied, this instruction causes program execution to go to the branch
target location identified by the name Loop.

Mode       Assembler Syntax    Addressing Function
Relative   X(PC)               EA = [PC] + X
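
The relative-mode calculation can be sketched as follows. The addresses are hypothetical, and the example assumes 4-byte instructions with the PC already advanced past the branch when the offset is applied, which is a common convention but not stated in these notes.

```python
def relative_effective_address(pc, x):
    """Relative mode: EA = [PC] + X."""
    return pc + x

def branch_offset(target_address, branch_address, instruction_size=4):
    """Offset X that reaches the target from the updated PC (assumed convention)."""
    updated_pc = branch_address + instruction_size
    return target_address - updated_pc

loop_address = 1000      # hypothetical address of the label Loop
branch_address = 1012    # hypothetical address of the Branch>0 instruction
x = branch_offset(loop_address, branch_address)
print(x, relative_effective_address(branch_address + 4, x))
```

The offset is negative here because the branch goes backward to the start of the loop.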
Additional Modes:
There are two additional modes. They are

• Auto-increment mode
• Auto-decrement mode

Auto-increment mode:
The effective address of the operand is the contents of a register specified in the instruction.
After the operand is accessed, the contents of this register are automatically incremented to point to
the next item in the list.

Mode             Assembler Syntax    Addressing Function
Auto-increment   (Ri)+               EA = [Ri]; Increment Ri

Auto-decrement mode:
The contents of a register specified in the instruction are first automatically decremented and are
then used as the effective address of the operand.

Mode             Assembler Syntax    Addressing Function
Auto-decrement   -(Ri)               Decrement Ri; EA = [Ri]
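
The two modes can be modelled in Python as a rough sketch. It assumes the usual convention that (Ri)+ uses the register before incrementing it while -(Ri) decrements the register before using it, and it assumes a step of 4 bytes per item; both the step and the function names are choices made for this example.

```python
def autoincrement(registers, name, step=4):
    """(Ri)+ : use the register as the address, then increment it."""
    effective_address = registers[name]
    registers[name] += step
    return effective_address

def autodecrement(registers, name, step=4):
    """-(Ri) : decrement the register first, then use it as the address."""
    registers[name] -= step
    return registers[name]

regs = {"R2": 1000}
first = autoincrement(regs, "R2")    # address of the current item
second = autoincrement(regs, "R2")   # address of the next item in the list
back = autodecrement(regs, "R2")     # steps back to the most recently used item
print(first, second, back, regs["R2"])
```

Used together, the two modes give push and pop behaviour for a stack kept in memory.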

8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth Edition,
Tata McGraw Hill, 2002, PP. 3-9

Application
Processor, Embedded system.
Lesson Plan for Introduction to Unit II – Arithmetic Operations
Time: 45 Minutes
Lesson. No Unit 2 – Lesson No. 1 / 9
1.CONTENT LIST:
Introduction to Unit II - Arithmetic Operations
2. SKILLS ADDRESSED:
Listening
3. OBJECTIVE OF THIS LESSON PLAN:
To help students understand the basic arithmetic operations.
4.OUTCOMES:
i. Explain the concept of arithmetic operations
ii. Listen to the major topics covered in Unit 2
5.LINK SHEET:
i. What are the arithmetic operations performed by ALU?
ii. What are the topics covered in Arithmetic operations
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)

ALU Design
In computing an arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and
logical operations. The ALU is a fundamental building block of the central processing unit
(CPU) of a computer, and even the simplest microprocessors contain one for purposes such as
maintaining timers. The processors found inside modern CPUs and graphics processing units
(GPUs) accommodate very powerful and very complex ALUs; a single component may contain
a number of ALUs.
Mathematician John von Neumann proposed the ALU concept in 1945, when he wrote a
report on the foundations for a new computer called the EDVAC. Research into ALUs
remains an important part of computer science, falling under Arithmetic and logic
structures in the ACM Computing Classification System

FIXED POINT NUMBER AND OPERATION


In computing, a fixed-point number representation is a real data type for a number that has a
fixed number of digits after (and sometimes also before) the radix point (e.g., after the decimal
point '.' in English decimal notation). Fixed-point number representation can be compared to the
more complicated (and more computationally demanding) floating point number representation.
Fixed-point numbers are useful for representing fractional values, usually in base 2 or
base 10, when the executing processor has no floating point unit (FPU) or if fixed-point
provides improved performance or accuracy for the application at hand. Most low-cost
embedded microprocessors and microcontrollers do not have an FPU.
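
As a sketch of how fixed-point arithmetic works on integer hardware, the example below stores values with 8 fractional bits. The choice of 8 bits and the helper names are assumptions for this illustration only.

```python
FRAC_BITS = 8               # assumed number of bits after the binary point
SCALE = 1 << FRAC_BITS      # 256

def to_fixed(x):
    """Convert a real value to its scaled integer representation."""
    return int(round(x * SCALE))

def to_real(q):
    """Convert a scaled integer back to a real value."""
    return q / SCALE

def fixed_add(a, b):
    return a + b                     # same scale, so plain integer addition

def fixed_mul(a, b):
    return (a * b) >> FRAC_BITS      # the product carries the scale twice; shift once

a, b = to_fixed(1.5), to_fixed(2.25)
print(a, b, to_real(fixed_add(a, b)), to_real(fixed_mul(a, b)))
```

Only integer addition, multiplication and shifting are used, which is why this representation suits processors without an FPU.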

FLOATING POINT NUMBERS & OPERATIONS


Floating point Representation:
To represent fractional binary numbers, it is necessary to consider the binary point. If
the binary point is assumed to be to the right of the sign bit, we can represent fractional binary
numbers as given below.

Direct implementation of dedicated units:
  always: 1 – 5
  in most cases: 6
  sometimes: 7, 8
Sequential implementation using simpler units and several clock cycles (decomposition):
  sometimes: 6
  in most cases: 7, 8, 9
Table lookup techniques using ROMs:
  universal: simple application to all operations
  efficient only for single-operand operations of high complexity (8 – 12) and small word length (note: ROM size
Approximation techniques using simpler units: 7 – 12
  • Taylor series expansion
  • polynomial and rational approximations
  • convergence of recursive equation systems
Binary adder
This is also called a ripple carry adder, because it is constructed from full adders
connected in cascade.
Carry Look Ahead Adder
The most widely used technique employs the principle of carry look-ahead to improve
the speed of the algorithm.
Binary subtractor
A binary adder can also perform subtraction under a mode input M: M = 1 selects the
subtractor and M = 0 selects the adder.
Binary multiplier
Usually there are more bits in the partial products, and it is necessary to use full adders to
produce the sum of the partial products.
For J multiplier bits and K multiplicand bits, we need (J × K) AND gates and (J − 1) K-
bit adders to produce a product of J + K bits.
For example, with K = 4 and J = 3, we need 12 AND gates and two 4-bit adders.
8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9
9. APPLICATIONS
• Arithmetic circuits present special design challenges, because there are simply too many inputs
to list all possible combinations in a truth table.
• In applying this method, bus-wide operations are broken into simpler bit-by-bit
operations that are more easily defined by truth tables and more tractable to
familiar design techniques.
Lesson Plan for ALU
Time: 45 Minutes
Lesson. No Unit 2 – Lesson No. 2 / 9
1.CONTENT LIST:
ALU
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To help students understand the basic arithmetic operations.
4.OUTCOMES:
i. Explain the concept of arithmetic operations
ii. Listen to the major topics covered in Unit 2
5.LINK SHEET:
i. What are the arithmetic operations performed by ALU?
ii. What are the topics covered in Arithmetic operations
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)
ALU stands for Arithmetic Logic Unit.
An ALU is a digital circuit that performs arithmetic (Add, Sub, ...) and logical (AND, OR,
NOT) operations.
John von Neumann proposed the ALU in 1945 when he was working on the EDVAC.
Typical Schematic Symbol of an ALU

1-Bit ALU
This is a one-bit ALU that can perform the logical AND and logical OR operations.
Result = a AND b when operation = 0
Result = a OR b when operation = 1
The operation line is the select input of a multiplexer (MUX).
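
The behaviour of this 1-bit ALU can be sketched in Python. The function name alu_1bit is an assumption; the code computes both gate outputs and lets the operation input pick one of them, as a multiplexer would.

```python
def alu_1bit(a, b, operation):
    """1-bit ALU: operation 0 selects AND, operation 1 selects OR."""
    and_out = a & b     # output of the AND gate
    or_out = a | b      # output of the OR gate
    return or_out if operation else and_out   # the MUX chooses one gate output

for a in (0, 1):
    for b in (0, 1):
        print(a, b, alu_1bit(a, b, 0), alu_1bit(a, b, 1))
```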

32-Bit ALU
8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9
9. APPLICATIONS
• Arithmetic circuits present special design challenges, because there are simply too many inputs
to list all possible combinations in a truth table.
• In applying this method, bus-wide operations are broken into simpler bit-by-bit
operations that are more easily defined by truth tables and more tractable to
familiar design techniques.
• Operations performed by a computer
Lesson Plan for Addition
Time: 45 Minutes
Lesson. No Unit 2 – Lesson No. 3 / 9
1.CONTENT LIST:
Addition
2. SKILLS ADDRESSED:
Learning
understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the basic operation of Addition.
4.OUTCOMES:
i. Learn the concept of Addition
ii. Understand the different types of adder circuit
5.LINK SHEET:
i. What is Addition?
ii. Design Fast adders
iii. Discuss in detail the operation of all the adders
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)
Half Adder:
A combinational circuit that performs the addition of two bits is called a half adder.
Full Adder:
One that performs the addition of three bits(two significant bits and a previous carry) is a
full adder.
Binary Adder:
This is also called a ripple carry adder, because it is constructed from full adders
connected in cascade.
Binary multiplier:
Usually there are more bits in the partial products, and it is necessary to use
full adders to produce the sum of the partial products.
Need for using arithmetic circuits in designing combinational circuits:
(i) reduce cost
   • reduce number of gates (for SSI circuits)
   • reduce IC packages (for complex circuits)
(ii) increase speed
(iii) design simplicity (reuse blocks where possible)
Half Adder
The truth table for the half adder is listed below.

Boolean Expression
C = xy
S = x’y + xy’
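
The two Boolean expressions can be evaluated directly to produce the half adder's truth table. This is a small Python sketch; the function name half_adder is an assumption, and 1 - x is used for the complement x' of a single bit.

```python
def half_adder(x, y):
    """Return (carry, sum) using C = xy and S = x'y + xy'."""
    carry = x & y
    total = ((1 - x) & y) | (x & (1 - y))
    return carry, total

print("x y | C S")
for x in (0, 1):
    for y in (0, 1):
        c, s = half_adder(x, y)
        print(x, y, "|", c, s)
```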
Implementation of Half Adder Circuit

Full Adder
One that performs the addition of three bits (two significant bits and a previous carry) is
a full adder.
Truth Table
Boolean expression using K map

S = x’y’z + x’yz’ + xy’z’ + xyz


C = xy + xz + yz
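
As a check on the two expressions, the sketch below evaluates them for all eight input combinations and compares the result with ordinary addition. The function name full_adder and the helper n (bit complement) are assumptions of this example.

```python
def full_adder(x, y, z):
    """Return (carry, sum) using the sum-of-products forms given above."""
    n = lambda bit: 1 - bit   # complement of a single bit
    total = (n(x) & n(y) & z) | (n(x) & y & n(z)) | (x & n(y) & n(z)) | (x & y & z)
    carry = (x & y) | (x & z) | (y & z)
    return carry, total

for x in (0, 1):
    for y in (0, 1):
        for z in (0, 1):
            print(x, y, z, full_adder(x, y, z))
```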
Implementation of Full adder circuit

Binary adder
This is also called a ripple carry adder, because it is constructed from full adders
connected in cascade.
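
The cascade can be sketched in Python by chaining one full adder per bit position, passing each carry-out to the next stage. The 4-bit default width and the function names are assumptions made for this illustration.

```python
def full_adder(x, y, z):
    """One full-adder stage: returns (carry_out, sum_bit)."""
    total = x ^ y ^ z
    carry = (x & y) | (x & z) | (y & z)
    return carry, total

def ripple_carry_add(a, b, bits=4, carry_in=0):
    """Add two unsigned numbers bit by bit, rippling the carry upward."""
    carry = carry_in
    result = 0
    for i in range(bits):                 # stage i waits for the carry from stage i - 1
        carry, s = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry                  # n-bit sum and the final carry-out

print(ripple_carry_add(11, 6))   # 11 + 6 = 17 does not fit in 4 bits
```

Because each stage depends on the previous carry, the delay grows with the word length, which is the motivation for the carry look-ahead adder.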
Truth Table

Implementation of Binary adder circuit


Carry Look Ahead Adder
The most widely used technique employs the principle of carry look-ahead to improve
the speed of the algorithm.
Boolean expression
Pi = Ai ⊕ Bi steady state value
Gi = AiBi steady state value
Output sum and carry
Si = Pi ⊕ Ci
Ci+1 = Gi + PiCi
Gi : carry generate Pi : carry propagate
C0 = input carry
C1 = G0 + P0C0
C2 = G1 + P1C1 = G1 + P1G0 + P1P0C0
C3 = G2 + P2C2 = G2 + P2G1 + P2P1G0 + P2P1P0C0
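
The carry equations can be exercised with a 4-bit Python sketch. C1 to C3 use the expanded forms listed above; the final carry C4 is obtained from the recurrence Ci+1 = Gi + PiCi, since its expanded form is not written out here. The function name and the 4-bit width are assumptions of this example.

```python
def carry_lookahead_add4(a, b, c0=0):
    """4-bit addition using generate (G) and propagate (P) signals."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    P = [A[i] ^ B[i] for i in range(4)]    # carry propagate
    G = [A[i] & B[i] for i in range(4)]    # carry generate
    C = [c0, 0, 0, 0, 0]
    C[1] = G[0] | (P[0] & C[0])
    C[2] = G[1] | (P[1] & G[0]) | (P[1] & P[0] & C[0])
    C[3] = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & C[0])
    C[4] = G[3] | (P[3] & C[3])            # carry-out from the recurrence
    total = sum((P[i] ^ C[i]) << i for i in range(4))   # Si = Pi xor Ci
    return total, C[4]

print(carry_lookahead_add4(9, 7))
```

In hardware, C1 to C3 depend only on the P, G and C0 signals, so they can all be formed at the same time instead of waiting for a carry to ripple through each stage.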
Implementation of Carry Look ahead adder circuit
8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9

9. APPLICATIONS
• Arithmetic circuits present special design challenges, because there are simply too many inputs
to list all possible combinations in a truth table.
• In applying this method, bus-wide operations are broken into simpler bit-by-bit
operations that are more easily defined by truth tables and more tractable to
familiar design techniques.
• Operations performed by a computer
Lesson Plan for Subtraction
Time: 45 Minutes
Lesson. No Unit 2 – Lesson No. 4 / 9
1.CONTENT LIST:
Subtraction
2. SKILLS ADDRESSED:
Learning
Analyzing
3. OBJECTIVE OF THIS LESSON PLAN:
To facilitate the students learn the basic operation of Subtraction.
4.OUTCOMES:
i. Learn the concept of Subtraction
ii. Understand the different types of subtractor circuit
5.LINK SHEET:
i. What is Subtraction?
ii. Design any one of the subtractor circuit
iii. Discuss in detail the operation of all the subtractors
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)
Binary subtractor:
M = 1 selects the subtractor; M = 0 selects the adder.
Overflow is a problem in digital computers because the number of bits that hold the
number is finite and a result that contains n+1 bits cannot be accommodated.
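
One common way to build such a circuit, assumed here because the figure is not reproduced, is to XOR every bit of B with M and to feed M in as the input carry. With M = 1 this adds the 2's complement of B. The Python sketch below follows that assumption; the function name and the 4-bit width are choices made for this example.

```python
def add_sub(a, b, m, bits=4):
    """M = 0: compute A + B.  M = 1: compute A - B as A + (not B) + 1."""
    mask = (1 << bits) - 1
    b_xor = (b ^ (mask if m else 0)) & mask   # each bit of B passes through an XOR with M
    total = (a & mask) + b_xor + m            # M also serves as the input carry
    return total & mask, (total >> bits) & 1  # n-bit result and the carry-out

print(add_sub(7, 5, 0))   # addition
print(add_sub(7, 5, 1))   # subtraction
print(add_sub(3, 5, 1))   # negative result appears in 2's complement form
```

In the last call the 4-bit pattern 1110 stands for -2, and the missing carry-out indicates that A was smaller than B when the operands are read as unsigned numbers.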
Implementation of Binary Subtractor circuit
8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9

9. APPLICATIONS
• Arithmetic circuits present special design challenges, because there are simply too many inputs
to list all possible combinations in a truth table.
• In applying this method, bus-wide operations are broken into simpler bit-by-bit
operations that are more easily defined by truth tables and more tractable to
familiar design techniques.
• Operations performed by a computer
Lesson Plan for Multiplication
Time: 45 Minutes
Lesson. No Unit 2 – Lesson No. 5 / 9
1.CONTENT LIST:
Multiplication
2. SKILLS ADDRESSED:
Remembering
Applying
3. OBJECTIVE OF THIS LESSON PLAN:
To facilitate the students learn the basic operation of Multiplication.
4.OUTCOMES:
i. Learn the concept of Multiplication
ii. Understand the different types of Multiplier circuit
5.LINK SHEET:
i. What is Multiplication?
ii. Design any one of the Multiplier circuit
iii. Discuss in detail the Booth’s algorithm
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)
Binary multiplier:
Usually there are more bits in the partial products, and it is necessary to use full adders to
produce the sum of the partial products.
For J multiplier bits and K multiplicand bits, we need (J × K) AND gates and (J − 1) K-
bit adders to produce a product of J + K bits.
For example, with K = 4 and J = 3, we need 12 AND gates and two 4-bit adders.
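
The component counts follow directly from the formulas, as this small Python sketch shows. The function name and the dictionary keys are assumptions made for this illustration.

```python
def array_multiplier_cost(j, k):
    """Component counts for a J-bit multiplier and a K-bit multiplicand."""
    return {
        "and_gates": j * k,     # one AND gate per partial-product bit
        "adders": j - 1,        # one K-bit adder per additional partial product
        "adder_width": k,
        "product_bits": j + k,
    }

print(array_multiplier_cost(3, 4))
```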
Implementation of 4-bit by 3-bit binary multiplier circuit

Multiplication Basics
Multiplies two n-bit operands [1, 2]
Product is a 2n-bit unsigned number or a (2n − 1)-bit signed number
Example : unsigned multiplication

Algorithm
1) Generation of partial products
2) Adding up partial products :
a) sequentially (sequential shift-and-add),
b) serially (combinational shift-and-add),
or
c) in parallel
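
The sequential variant of this algorithm can be sketched in Python for unsigned operands. The function name and the parameter n (the number of multiplier bits) are assumptions of this example.

```python
def shift_and_add_multiply(multiplicand, multiplier, n):
    """Unsigned multiplication by generating and accumulating partial products."""
    product = 0
    for i in range(n):
        if (multiplier >> i) & 1:           # multiplier bit i selects a partial product
            product += multiplicand << i    # the partial product is shifted by i places
    return product

print(shift_and_add_multiply(13, 11, 4))
```

Each 1 bit in the multiplier adds one shifted copy of the multiplicand, so reducing the number of non-zero partial products is one way to speed the operation up.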
Speedup techniques
• Reduce number of partial products
• Accelerate addition of partial products
Booth Recoding
Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9

9. APPLICATIONS
Applicable to sequential, array, and parallel multip
Lesson Plan for Division
Time: 45Minutes
Lesson. No Unit 2 – Lesson No. 6 / 9
1.CONTENT LIST:
Division
2. SKILLS ADDRESSED:
Understanding
Analyzing
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students learn the basic operation of Division.
4.OUTCOMES:
i. Learn the concept of Division
ii. Understand the different types of Divider circuit
5.LINK SHEET:
i. What is Division?
ii. Design any one of the Divider circuit
iii. Discuss in detail the operation of division
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)

Division Algorithms and Hardware Implementations


Two types of division operations
• Integer division: with integer operands and result
• Fractional division: operands and results are fractions
Any division algorithm can be carried out independent of
• Position of the decimal point
• Sign of operands
Restoring Division Algorithm
Put x in register A, d in register B, 0 in register P, and
perform n divide steps (n is the quotient word length).
Each step consists of
(i) Shift the register pair (P,A) one bit left
(ii) Subtract the contents of B from P, put the result back in P
(iii) If the result is negative, set the low-order bit of A to 0, otherwise to 1
(iv) If the result is negative, restore the old value of P by adding the contents of B back to P
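
The divide steps can be followed in a short Python sketch for unsigned operands. The variable names match the registers P, A and B from the description; the function name is an assumption, and Python's unbounded integers stand in for the signed register P.

```python
def restoring_divide(x, d, n):
    """Divide the n-bit dividend x by d; returns (quotient, remainder)."""
    P, A, B = 0, x, d
    mask = (1 << n) - 1
    for _ in range(n):
        # (i) shift the register pair (P, A) one bit left
        P = (P << 1) | ((A >> (n - 1)) & 1)
        A = (A << 1) & mask
        # (ii) subtract B from P
        P -= B
        if P < 0:
            # (iii) quotient bit stays 0, and (iv) restore P
            P += B
        else:
            # (iii) quotient bit is set to 1
            A |= 1
    return A, P

print(restoring_divide(14, 3, 4))
```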

Non-Restoring Division Algorithm


A variant that skips the restoring step and instead works with negative residuals
If P is negative
• (i-a) Shift the register pair (P,A) one bit left
• (ii-a) Add the contents of register B to P
If P is positive
• (i-b) Shift the register pair (P,A) one bit left
• (ii-b) Subtract the contents of register B from P
• (iii) If P is negative, set the low-order bit of A to 0, otherwise set it to 1
• After n cycles
• The quotient is in A
• If P is positive, it is the remainder, otherwise it has to be restored (add B to it) to get the
remainder
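
A matching Python sketch of the non-restoring variant is given below for unsigned operands. The function name is an assumption, P = 0 is treated as positive, and Python's signed integers play the role of a register P wide enough to hold negative residuals.

```python
def nonrestoring_divide(x, d, n):
    """Divide the n-bit dividend x by d without a restore in every step."""
    P, A, B = 0, x, d
    mask = (1 << n) - 1
    for _ in range(n):
        was_negative = P < 0
        # (i) shift the register pair (P, A) one bit left
        P = (P << 1) | ((A >> (n - 1)) & 1)
        A = (A << 1) & mask
        # (ii) add B if P was negative, otherwise subtract B
        if was_negative:
            P += B
        else:
            P -= B
        # (iii) the new quotient bit is 1 only if P is not negative
        if P >= 0:
            A |= 1
    if P < 0:
        P += B          # final correction to obtain the remainder
    return A, P

print(nonrestoring_divide(14, 3, 4))
```

Only one addition or subtraction is performed per step, compared with up to two in the restoring algorithm.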
8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9

9. APPLICATIONS
• Arithmetic circuits present special design challenges, because there are simply too many inputs
to list all possible combinations in a truth table.
• In applying this method, bus-wide operations are broken into simpler bit-by-bit
operations that are more easily defined by truth tables and more tractable to
familiar design techniques.
• Operations performed by a computer
Lesson Plan for Floating Point operations
Time: 45 Minutes
Lesson. No Unit 2 – Lesson No. 7/ 9
1.CONTENT LIST:
Floating Point operations
2. SKILLS ADDRESSED:
Understanding
Remembering
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the concept of floating point operations.
4.OUTCOMES:
i. Learn the concept of floating numbers
ii. Understand the different floating point operations
5.LINK SHEET:
i. What is floating point?
ii. Discuss in detail the operation of floating numbers
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)
Arithmetic operations on floating point numbers consist of addition, subtraction,
multiplication and division. The operations are done with algorithms similar to those used
on sign-magnitude integers (because of the similarity of representation). For example,
only add numbers of the same sign; if the numbers are of opposite sign, do a
subtraction instead.

ADDITION

example on decimal value given in scientific notation:

3.25 x 10 ** 3
+ 2.63 x 10 ** -1
-----------------

first step: align decimal points


second step: add

3.25 x 10 ** 3
+ 0.000263 x 10 ** 3
--------------------
3.250263 x 10 ** 3
(presumes use of infinite precision, without regard for accuracy)
third step: normalize the result (already normalized!)
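
The three steps can be reproduced with Python's decimal module, which keeps the decimal digits exact. This is a sketch that works on (mantissa, exponent) pairs in base 10 rather than on a binary floating point format; the function name fp_add is an assumption.

```python
from decimal import Decimal

def fp_add(m1, e1, m2, e2):
    """Add m1 x 10**e1 and m2 x 10**e2 by aligning, adding and normalizing."""
    # Step 1: align, rewriting the operand with the smaller exponent
    if e1 < e2:
        m1, e1, m2, e2 = m2, e2, m1, e1
    m2_aligned = m2 * Decimal(10) ** (e2 - e1)
    # Step 2: add the mantissas
    mantissa, exponent = m1 + m2_aligned, e1
    # Step 3: normalize so that 1 <= |mantissa| < 10
    while abs(mantissa) >= 10:
        mantissa /= 10
        exponent += 1
    while mantissa != 0 and abs(mantissa) < 1:
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent

print(fp_add(Decimal("3.25"), 3, Decimal("2.63"), -1))
```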

SUBTRACTION
like addition as far as alignment of radix points; then the algorithm for subtraction of
sign-magnitude numbers takes over.
before subtracting,
compare magnitudes (don't forget the hidden bit!)
change the sign bit if the order of operands is changed.
don't forget to normalize the number afterward.
MULTIPLICATION
example on decimal values given in scientific notation:

3.0 x 10 ** 1
x 0.5 x 10 ** 2
algorithm: multiply mantissas
add exponents

3.0 x 10 ** 1
x 0.5 x 10 ** 2
-----------------
1.50 x 10 ** 3
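
The same algorithm in a short Python sketch, again on decimal (mantissa, exponent) pairs rather than a binary format. The function name fp_mul is an assumption, and the normalization step is included for operands whose mantissa product falls outside the range from 1 up to 10.

```python
from decimal import Decimal

def fp_mul(m1, e1, m2, e2):
    """Multiply m1 x 10**e1 by m2 x 10**e2."""
    mantissa = m1 * m2          # multiply mantissas
    exponent = e1 + e2          # add exponents
    while abs(mantissa) >= 10:  # normalize if the mantissa is too large
        mantissa /= 10
        exponent += 1
    while mantissa != 0 and abs(mantissa) < 1:   # or too small
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent

print(fp_mul(Decimal("3.0"), 1, Decimal("0.5"), 2))
```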

example in binary: use a mantissa that is only 4 bits so that


I don't spend all day just doing the multiplication
part.

DIVISION

similar to multiplication.

true division:
do unsigned division on the mantissas (don't forget the hidden bit)
subtract TRUE exponents

The IEEE standard is very specific about how all this is done.
Unfortunately, the hardware to do all this is pretty slow.

Some comparisons of approximate times:


2's complement integer add 1 time unit
fl. pt add 4 time units
fl. pt multiply 6 time units
fl. pt. divide 13 time units

There is a faster way to do division. It's called division by reciprocal approximation.
It takes about the same time as a fl. pt. multiply. Unfortunately, the results are
not always the same as with true division.

Division by reciprocal approximation:

instead of doing a/b

they do a x 1/b.

figure out a reciprocal for b, and then use the fl. pt.
multiplication hardware.
example of a result that isn't the same as with true division.

true division: 3/3 = 1 (exactly)

reciprocal approx: 1/3 = .33333333

3 x .33333333 = .99999999, not 1

It is not always possible to get a perfectly accurate reciprocal.
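
The 3/3 example can be reproduced with Python's decimal module by limiting the precision to 8 significant digits. The 8-digit limit is an assumption chosen to match the digits shown above.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 8                                    # assume 8 significant digits
    true_quotient = Decimal(3) / Decimal(3)         # true division
    reciprocal = Decimal(1) / Decimal(3)            # rounded reciprocal of b
    approx_quotient = Decimal(3) * reciprocal       # a x (1/b)

print(true_quotient, reciprocal, approx_quotient)
```

The rounding error introduced when the reciprocal is formed carries through the multiplication, so the approximate quotient falls just short of 1.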

8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9
9. APPLICATIONS
• Arithmetic circuits present special design challenges, because there are simply too many inputs
to list all possible combinations in a truth table.
• In applying this method, bus-wide operations are broken into simpler bit-by-bit
operations that are more easily defined by truth tables and more tractable to
familiar design techniques.
• Operations performed by a computer
Lesson Plan for Subword parallelism
Time: 45 Minutes
Lesson. No Unit 2 – Lesson No. 8/ 9
1.CONTENT LIST:
Subword parallelism
2. SKILLS ADDRESSED:
Understanding
Analyzing
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the concept of subword parallelism
4.OUTCOMES:
i. Learn the concept of parallelism
ii. Understand the manipulation of subword parallelism
5.LINK SHEET:
i. What is parallelism?
ii. Discuss in detail the operation of subword parallelism
6.EVOCATION: (5 Minutes)
7. Lecture Notes: (40 Minutes)
Subword Parallelism
The term SIMD was originally defined in the 1960s as a category of multiprocessor with one
control unit and multiple processing elements, in which each instruction is executed by all
processing elements on different data streams, e.g., Illiac IV. Today the term is used to
describe partitionable ALUs in which multiple operands can fit in a fixed-width register
and are acted upon in parallel.
(Other terms include subword parallelism, microSIMD, short vector extensions, split-
ALU, SLP / superword-level parallelism, and SIGD / single-instruction-group[ed]-data.)
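
The idea of a partitionable ALU can be imitated in software. The Python sketch below adds four 8-bit values packed into one 32-bit word with a single pass of word-wide operations, blocking carries at the lane boundaries. The lane width, the masks and the function names are assumptions chosen for this illustration; real subword instructions perform this partitioning in hardware.

```python
HIGH_BITS = 0x80808080   # top bit of each 8-bit lane
LOW_BITS = 0x7F7F7F7F    # remaining 7 bits of each lane

def packed_add_u8(a, b):
    """Add four packed 8-bit lanes; each lane wraps around independently."""
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)        # carries cannot leave a lane
    return (low_sum ^ ((a ^ b) & HIGH_BITS)) & 0xFFFFFFFF

def lanewise_add_u8(a, b):
    """Reference version: unpack, add each lane separately, and repack."""
    result = 0
    for shift in (0, 8, 16, 24):
        lane = (((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)) & 0xFF
        result |= lane << shift
    return result

a, b = 0x01FF7F10, 0x01010120
print(hex(packed_add_u8(a, b)), hex(lanewise_add_u8(a, b)))
```

The lane holding 0xFF wraps to 0x00 without disturbing its neighbour, which is the behaviour a split carry chain provides.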

The structure of the arithmetic element can be altered under program control. Each
instruction specifies a particular form of machine in which to operate, ranging from a full
36-bit computer to four 9-bit computers with many variations.
Not only is such a scheme able to make more efficient use of the memory in storing data of
various word lengths, but it also can be expected to result in greater over-all machine
speed because of the increased parallelism of operation.

Peak operating rates must then be referred to particular configurations. For addition and
multiplication, these peak rates are given in the following table:

PEAK OPERATING SPEEDS OF TX-2

Word length    Additions     Multiplications
(in bits)      per second    per second
36             150,000       80,000
18             300,000       240,000
9              600,000       600,000
Univac 1107 (ca. 1962) - a 36-bit machine that included add/subtract halves instructions
(two 18-bit operations in parallel) and add/subtract thirds instructions (three 12-bit operations
in parallel)
Intel i860 (ca. 1989) added three packed graphics data types (e.g., eight 1-byte pixel values
per 64-bit word) and a special graphics function unit (e.g., z-buffer interpolation)
Motorola 88110 (ca. 1991) included six graphics data types and performed saturating
arithmetic
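
As a rough illustration of a partitionable ALU, the following Python sketch treats one 32-bit word as four independent 8-bit lanes and adds two such words so that no carry crosses a lane boundary. The lane width, the masks, and the helper names (pack, unpack, packed_add) are assumptions chosen for illustration; they are not taken from any of the machines listed above.

```python
LANES = 4            # assumed: four subwords per word
LANE_BITS = 8        # assumed: 8 bits per subword
HIGH = 0x80808080    # top bit of every lane
LOW = 0x7F7F7F7F     # low seven bits of every lane


def pack(values):
    """Pack four small integers into one 32-bit word, lane 0 in the low byte."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (i * LANE_BITS)) & 0xFF for i in range(LANES)]


def packed_add(a, b):
    """Add four 8-bit lanes at once, wrapping each lane modulo 256."""
    # Adding only the low 7 bits of each lane cannot carry into the next lane.
    low_sum = (a & LOW) + (b & LOW)
    # The top bit of each lane is then fixed up with XOR (addition without carry-out).
    return (low_sum ^ ((a ^ b) & HIGH)) & 0xFFFFFFFF


result = unpack(packed_add(pack([1, 2, 3, 250]), pack([10, 20, 30, 10])))
print(result)
```

Here the last lane wraps around (250 + 10 = 260, which is 4 modulo 256) without disturbing its neighbours. A saturating variant, such as the arithmetic mentioned for the Motorola 88110, would clamp that lane at its maximum value instead of wrapping.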
8. Textbook :
Carl Hamacher, Zvonko Vranesic and Safwat Zaky, “Computer Organization”, Fifth
Edition, Tata McGraw Hill, 2002, PP. 3-9
9. APPLICATIONS
MicroSIMD, superword-level parallelism, and SIGD (single-instruction-group[ed]-data)
SRI VIDYA COLLEGE OF ENGINEERING AND
TECHNOLOGY, VIRUDHUNAGAR
Department of Computer Science & Engineering

Class II Year (03Semester)


Subject Code CS6303

Subject Computer Architecture


Prepared By Kaviya.P
Lesson Plan for Introduction to Unit III – Processor and Control Unit
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 1 / 13
1. CONTENT LIST:
Introduction to Unit III - Processor and Control Unit
2. SKILLS ADDRESSED:
Listening
3. OBJECTIVE OF THIS LESSON PLAN:
To help students understand the basics of the processor and control unit.
4. OUTCOMES:
i. Explain the concept of processor and control unit
ii. Listen to the major topics covered in Unit 3
5. LINK SHEET:
i. Give the detailed concept of processor and control unit
ii. What are the topics covered in processor and control unit?
6. EVOCATION: (5 Minutes)
Lesson Plan for Basic MIPS implementation
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 2 ,3 / 13
1. CONTENT LIST:
Basic MIPS implementation
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To help students understand the basic MIPS implementation.
4. OUTCOMES:
i. Learn the concept of MIPS implementation
ii. Know the different types of formats used in MIPS
5. LINK SHEET:
i. List the formats in MIPS Implementation
ii. Explain the concept of MIPS Implementation
iii. Give the Features of MIPS Implementation
6. EVOCATION: (5 Minutes)
Lesson Plan for Building datapath
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 4 / 13
1. CONTENT LIST:
Building datapath
2. SKILLS ADDRESSED:
Learning
Analyzing
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students learn the building of datapath
4. OUTCOMES:
i. Understand the concept of datapath building
ii. Know the register format and other types of format in detail
5. LINK SHEET:
i. List the formats in datapath building
ii. Design datapath with various formats
iii. Explain any one of the formats in detail
6. EVOCATION: (5 Minutes)
Lesson Plan for Control Implementation scheme
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 5, 6 / 13
1. CONTENT LIST:
Control Implementation scheme
2. SKILLS ADDRESSED:
Learning
Applying
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the scheme of control implementation
4. OUTCOMES:
i. Learn the scheme to design datapath
ii. Understand the different cycles in implementing control scheme
5. LINK SHEET:
i. Give the formats in control implementation scheme
ii. Design the scheme to characterize the datapath
iii. Explain control implementation scheme in detail.
6. EVOCATION: (5 Minutes)
Lesson Plan for Pipelining
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 7 / 13
1. CONTENT LIST:
Pipelining
2. SKILLS ADDRESSED:
Remembering
Applying
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students remember the pipelining procedures
4. OUTCOMES:
i. Learn the sequence of steps in pipelining
ii. Understand the features of pipelining
5. LINK SHEET:
i. Give the need and role of cache memory in pipelining
ii. Explain in detail the basic concepts of pipelining
6. EVOCATION: (5 Minutes)
Lesson Plan for Pipelined datapath and control
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 8, 9 / 13
1. CONTENT LIST:
Pipelined datapath and control
2. SKILLS ADDRESSED:
Understanding
Analyzing
3. OBJECTIVE OF THIS LESSON PLAN:
To help the students understand the datapath and control considerations for pipelining
4. OUTCOMES:
i. Remember the important terms used in pipelining datapath and control concept
ii. Understand the operations performed in control and datapath pipelining
5. LINK SHEET:
i. Construct pipelined datapath and control architecture
ii. Discuss in detail the operation of data and control pipelining
iii. Draw the control and data pipelining architecture.
6. EVOCATION: (5 Minutes)
Lesson Plan for Handling Data hazards & Control hazards
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 10, 11 / 13
1. CONTENT LIST:
Handling Data hazards & Control hazards
2. SKILLS ADDRESSED:
Understanding
Applying
3. OBJECTIVE OF THIS LESSON PLAN:
To help the students understand data hazards and control hazards and how they are handled
4. OUTCOMES:
i. Learn the operation performed in data and control hazards.
ii. Understand the various structures used to handle data and control hazards
5. LINK SHEET:
i. What is a hazard?
ii. Discuss in detail the operation of data hazard.
iii. Explain in detail the features of control hazard.
6. EVOCATION: (5 Minutes)
Lesson Plan for Exceptions.
Time: 45 Minutes
Lesson. No Unit 3 – Lesson No. 12 / 13
1. CONTENT LIST:
Exceptions.
2. SKILLS ADDRESSED:
Learning
Analyzing
3. OBJECTIVE OF THIS LESSON PLAN:
To help the students understand how exceptions are handled
4. OUTCOMES:
i. Know the manipulation performed in exceptions.
ii. Understand the procedure to handle exceptions
5. LINK SHEET:
i. Construct exceptions with neat clock cycle
ii. Discuss in detail the manipulation of exception handling
iii. Write the features of exception handling.
6. EVOCATION: (5 Minutes)
Lesson Plan for Introduction to Unit IV – PARALLELISM
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 1 / 10
1. CONTENT LIST:
Introduction to Unit IV – Parallelism
2. SKILLS ADDRESSED:
Listening
3. OBJECTIVE OF THIS LESSON PLAN:
To help students understand the basics of parallelism
4. OUTCOMES:
i. Explain the concept of parallelism in computer architecture
ii. Demonstrate the major topics covered in Unit 4
5. LINK SHEET:
i. Define parallelism
ii. What are the topics covered in parallelism?
6. EVOCATION: (5 Minutes)
Lesson Plan for Instruction-level-parallelism
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 2 / 10
1. CONTENT LIST:
Instruction-level-parallelism
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the instruction level parallelism and its features
4. OUTCOMES:
i. Enumerate the features of instruction-level parallelism
ii. Explain the detailed concept of instruction-level parallelism
5. LINK SHEET:
i. Define instruction-level parallelism
ii. What is speculation?
iii. List the major factors which influence instruction set to undergo parallelism
6. EVOCATION: (5 Minutes)
Lesson Plan for Parallel processing challenges
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 3,4 / 10
1. CONTENT LIST:
Parallel processing challenges
2. SKILLS ADDRESSED:
Learning
Remembering
3. OBJECTIVE OF THIS LESSON PLAN:
To help the students understand the challenges of parallel processing
4. OUTCOMES:
i. Enumerate the characteristics of parallel processing
ii. Explain the detailed concept of challenges in parallel processing
5. LINK SHEET:
i. Define data dependencies in parallel processing
ii. What is a static multiple-issue processor?
iii. List the factors to implement a dynamic multiple-issue processor
6. EVOCATION: (5 Minutes)
Lesson Plan for Flynn's classification
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 5/ 10
1. CONTENT LIST:
Flynn's classification
2. SKILLS ADDRESSED:
Remembering
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the classifications given by Flynn
4. OUTCOMES:
i. Enumerate the basic classification of Flynn
ii. Explain the detailed concept of SIMD, MIMD and other classifications of Flynn
5. LINK SHEET:
i. Define SIMD
ii. What is Flynn's classification?
iii. Explain each classification in detail
6. EVOCATION: (5 Minutes)
Lesson Plan for Flynn's classification
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 6/ 10
1. CONTENT LIST:
Flynn's classification
2. SKILLS ADDRESSED:
Remembering
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the classifications given by Flynn
4. OUTCOMES:
i. Enumerate the basic classification of Flynn
ii. Explain the detailed concept of SIMD, MIMD and other classifications of Flynn
5. LINK SHEET:
i. Define SIMD
ii. What is Flynn's classification?
iii. Explain each classification in detail
6. EVOCATION: (5 Minutes)
Lesson Plan for Hardware multithreading
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 7/ 10
1. CONTENT LIST:
Hardware multithreading
2. SKILLS ADDRESSED:
Remembering
Learning
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the detailed concept of hardware multithreading
4. OUTCOMES:
i. Demonstrate hardware multithreading in detail
ii. Explain the detailed concept of hardware multithreading and its features
5. LINK SHEET:
i. Define multithreading
ii. List the types of multithreading.
iii. Discuss the factors influencing hardware multithreading in detail.
6. EVOCATION: (5 Minutes)
Lesson Plan for Multicore processors
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 8/ 10
1. CONTENT LIST:
Multicore processors
2. SKILLS ADDRESSED:
Remembering
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the detailed concept of multicore processors
4. OUTCOMES:
i. Demonstrate multicore processors in detail
ii. Sort the different types of multicore processors
5. LINK SHEET:
i. Define processor
ii. List the types of multicore processors.
iii. Discuss the various architectures underlying multicore processors.
6. EVOCATION: (5 Minutes)
Lesson Plan for Multicore processors
Time: 45 Minutes
Lesson. No Unit 4 – Lesson No. 9/ 10
1. CONTENT LIST:
Multicore processors
2. SKILLS ADDRESSED:
Remembering
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the detailed concept of multicore processors
4. OUTCOMES:
i. Demonstrate multicore processors in detail
ii. Sort the different types of multicore processors
5. LINK SHEET:
i. Define processor
ii. List the types of multicore processors.
iii. Discuss the various architectures underlying multicore processors.
6. EVOCATION: (5 Minutes)
Lesson Plan for Introduction to Unit V – MEMORY AND I/O SYSTEMS
Time: 45 Minutes
Lesson. No Unit 5– Lesson No. 1 / 13
1.CONTENT LIST:
Introduction to Unit V - Memory and I/O Systems
2. SKILLS ADDRESSED:
Listening
3. OBJECTIVE OF THIS LESSON PLAN:
To help students understand the basics of memory and input/output systems
4.OUTCOMES:
i. Explain the concept of memory units and its features
ii. Demonstrate the major topics covered in Unit 5
5.LINK SHEET:
i. Define memory
ii. What are the topics covered in memory and input / output devices?
6.EVOCATION: (5 Minutes)
7. Lecture Notes

From the CPU's perspective, an I/O device appears as a set of special-purpose registers, of three
general types:

 Status registers provide status information to the CPU about the I/O device. These
registers are often read-only, i.e. the CPU can only read their bits, and cannot change
them.
 Configuration/control registers are used by the CPU to configure and control the device.
Bits in these configuration registers may be write-only, so the CPU can alter them, but
not read them back. Most bits in control registers can be both read and written.
 Data registers are used to read data from or send data to the I/O device.

In some instances, a given register may fit more than one of the above categories, e.g. some bits
are used for configuration while other bits in the same register provide status information.

The logic circuit that contains these registers is called the device controller, and the software that
communicates with the controller is called a device driver.

                    +-------------------+           +-----------+
                    | Device controller |           |           |
+-------+           |                   |<--------->|  Device   |
|       |---------->| Control register  |           |           |
|  CPU  |<----------| Status register   |           |           |
|       |<--------->| Data register     |           |           |
+-------+           |                   |           |           |
                    +-------------------+           +-----------+

Simple devices such as keyboards and mice may be represented by only a few registers, while
more complex ones such as disk drives and graphics adapters may have dozens.
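
The division of labour between a device controller and a device driver can be sketched in Python as follows. This is a toy model under stated assumptions: the class ToyDeviceController, the READY_BIT layout, and the function driver_read_all are all hypothetical names invented for illustration and do not describe any real device. The controller exposes a read-only status register and a data register; the driver polls the status register and reads the data register while data is available.

```python
READY_BIT = 0x01  # assumed layout: bit 0 of the status register means "data available"


class ToyDeviceController:
    """Hypothetical controller with status, control, and data registers."""

    def __init__(self, incoming):
        self._incoming = list(incoming)  # bytes the simulated device will deliver
        self.control = 0                 # control register: readable and writable

    @property
    def status(self):
        # Status register: read-only from the CPU's point of view.
        return READY_BIT if self._incoming else 0

    def read_data(self):
        # Data register: reading it consumes one byte from the device.
        return self._incoming.pop(0)


def driver_read_all(controller):
    """Hypothetical driver routine: poll status, then read the data register."""
    received = []
    while controller.status & READY_BIT:
        received.append(controller.read_data())
    return received


print(driver_read_all(ToyDeviceController([72, 105])))
```

A real driver would usually rely on interrupts rather than a busy polling loop; polling is used here only because it keeps the register roles easy to see.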

Each of the I/O registers, like memory, must have an address so that the CPU can read or write
specific registers.

Some CPUs have a separate address space for I/O devices. This requires separate instructions to
perform I/O operations.

Other architectures, like the MIPS, use memory-mapped I/O. When using memory-mapped I/O,
the same address space is shared by memory and I/O devices. Some addresses represent memory
cells, while others represent registers in I/O devices. No separate I/O instructions are needed in a
CPU that uses memory-mapped I/O. Instead, we can perform I/O operations using any
instruction that can reference memory.

                 +---------------+
                 | Address space |
                 |   +-------+   |
                 |   |  ROM  |   |
                 |   +-------+   |
+-------+address |   |       |   |
|       |------->|   |  RAM  |   |
|  CPU  |        |   |       |   |
|       |<------>|   +-------+   |
+-------+  data  |   |       |   |
                 |   |  I/O  |   |
                 |   +-------+   |
                 +---------------+

On the MIPS, we would access ROM, RAM, and I/O devices using load and store instructions.
Which type of device we access depends only on the address used!

    lw   $t0, 0x00000004    # Read ROM
    sw   $t0, 0x00000004    # Write ROM (bus error!)

    lbu  $t0, 0x0000ffc1    # Read RAM
    sb   $t0, 0x0000ffc1    # Write RAM

    lbu  $t0, 0xffff0000    # Read an I/O device
    sb   $t0, 0xffff0004    # Write to an I/O device

The 32-bit MIPS architecture has a 32-bit address, and hence an address space of 4 gigabytes.
Addresses 0x00000000 through 0xfffeffff are used for memory, and addresses 0xffff0000 -
0xffffffff (the last 64 kilobytes) are reserved for I/O device registers. This is a very small fraction
of the total address space, and yet far more space than is needed for I/O devices on any one
computer.
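
The address split described above can be checked with a short Python sketch. The boundary 0xffff0000 and the 32-bit address width come from the text; the function name region and the constant names are assumptions made for this example.

```python
ADDRESS_SPACE = 1 << 32   # 32-bit addresses: 4 gigabytes
IO_BASE = 0xFFFF0000      # first address reserved for I/O device registers


def region(address):
    """Classify an address as 'memory' or 'io' using the split given in the text."""
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("address outside the 32-bit address space")
    return "io" if address >= IO_BASE else "memory"


io_bytes = ADDRESS_SPACE - IO_BASE        # size of the I/O region in bytes
io_fraction = io_bytes / ADDRESS_SPACE    # share of the whole address space

print(region(0x0000FFC1), region(0xFFFF0004), io_bytes, io_fraction)
```

The I/O region works out to 65,536 bytes (64 kilobytes), which is 1/65,536 of the 4-gigabyte address space.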

Each register within an I/O controller must be assigned a unique address within the address
space. This address may be fixed for certain devices, and auto-assigned for others. (PC plug-and-
play devices have auto-assigned I/O addresses, which are determined during boot-up.)

8. TEXT BOOKS:

1. V. Carl Hamacher, Zvonko G. Vranesic and Safwat G. Zaky, "Computer Organisation", VI
edition, McGraw-Hill Inc., 2012
2. David A. Patterson and John L. Hennessy, "Computer Organization and Design", Morgan
Kaufmann / Elsevier, Fifth edition, 2014

9. APPLICATIONS
Real-life application of memory concepts: memory is the ability to encode, store, and
retrieve a stimulus.
Lesson Plan for Memory hierarchy
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.2 / 13
1.CONTENT LIST:
Memory hierarchy
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students understand the hierarchy of memories
4.OUTCOMES:
i. Sort the hierarchies present in memory
ii. Demonstrate the major hierarchies with example
5.LINK SHEET:
i. Define hierarchies of memory
ii. Illustrate the hierarchies of memory with example
6.EVOCATION: (5 Minutes)
7. TEXT BOOKS:

V. Carl Hamacher, Zvonko G. Vranesic and Safwat G. Zaky, "Computer Organisation", VI
edition, McGraw-Hill Inc., 2012
David A. Patterson and John L. Hennessy, "Computer Organization and Design", Morgan
Kaufmann / Elsevier, Fifth edition, 2014
8. APPLICATIONS
Real-life application of memory concepts: memory is the ability to encode, store, and
retrieve a stimulus.
Lesson Plan for Memory technologies
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.3,4 / 13
1.CONTENT LIST:
Memory technologies
2. SKILLS ADDRESSED:
Learning
Remembering
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students remember the technologies present in memory device
4.OUTCOMES:
i. Sort the technologies present in memory
ii. Demonstrate the major technologies with example
5.LINK SHEET:
i. Define technologies of memory
ii. Illustrate the technologies of memory with example
6.EVOCATION: (5 Minutes)
7. Lecture Notes

MEMORY TECHNOLOGIES

Much of the success of computer technology stems from the tremendous progress in
storage technology.
Early computers had a few kilobytes of random-access memory. The earliest IBM PCs didn’t
even have a hard disk.
That changed with the introduction of the IBM PC-XT in 1982, with its 10-megabyte
disk. By the year 2010, typical machines had 150,000 times as much disk storage, and the
amount of storage was increasing by a factor of 2 every couple of years.

Random-Access Memory
Random-access memory (RAM) comes in two varieties—static and dynamic. Static RAM
(SRAM) is faster and significantly more expensive than Dynamic RAM (DRAM). SRAM is used
for cache memories, both on and off the CPU chip. DRAM is used for the main memory plus the
frame buffer of a graphics system. Typically, a desktop system will have no more than a few
megabytes of SRAM, but hundreds or thousands of megabytes of DRAM.

Static RAM
SRAM stores each bit in a bistable memory cell. Each cell is implemented with a six-transistor
circuit. This circuit has the property that it can stay indefinitely in either of two different voltage
configurations, or states. Any other state will be unstable: starting from there, the circuit will
quickly move toward one of the stable states.

Dynamic RAM

DRAM stores each bit as charge on a capacitor. This capacitor is very small, typically
around 30 femtofarads, that is, 30 × 10^-15 farads. Recall, however, that a farad is a very large
unit of measure. DRAM storage can be made very dense: each cell consists of a capacitor and a
single access transistor. Unlike SRAM, however, a DRAM memory cell is very sensitive to any
disturbance. When the capacitor voltage is disturbed, it will never recover. Exposure to light rays
will cause the capacitor voltages to change. In fact, the sensors in digital cameras and
camcorders are essentially arrays of DRAM cells.
Conventional DRAMs
The cells (bits) in a DRAM chip are partitioned into d supercells, each consisting of w
DRAM cells. A d × w DRAM stores a total of dw bits of information. The supercells are
organized as a rectangular array with r rows and c columns, where rc = d. Each supercell has an
address of the form (i, j), where i denotes the row, and j denotes the column.
For example, the figure below shows the organization of a 16 × 8 DRAM chip with d = 16 supercells,
w = 8 bits per supercell, r = 4 rows, and c = 4 columns. The shaded box denotes the
supercell at address (2, 1).
Information flows in and out of the chip via external connectors called pins. Each pin carries a
1-bit signal.
The figure shows two of these sets of pins: eight data pins that can transfer 1 byte in or out of the
chip, and two addr pins that carry two-bit row and column supercell addresses. Other pins that
carry control information are not shown.

Fig: Conventional DRAM

One reason circuit designers organize DRAMs as two-dimensional arrays instead of
linear arrays is to reduce the number of address pins on the chip. For example, if our example
128-bit DRAM were organized as a linear array of 16 supercells with addresses 0 to 15, then the
chip would need four address pins instead of two. The disadvantage of the two-dimensional array
organization is that addresses must be sent in two distinct steps, which increases the access time.
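
The pin-count argument can be reproduced with a small Python sketch. The function names are assumptions made for this example, and split_address assumes supercells are numbered row by row (row-major order), which the text does not state explicitly.

```python
def address_pins_linear(d):
    """Address pins needed to select one of d supercells in a linear array."""
    return (d - 1).bit_length()


def address_pins_2d(r, c):
    """Address pins needed when row and column addresses share the same pins."""
    return max((r - 1).bit_length(), (c - 1).bit_length())


def split_address(index, c):
    """Map a linear supercell index to (row, column), assuming row-major order."""
    return divmod(index, c)


d, w, r, c = 16, 8, 4, 4   # the 16 x 8 DRAM from the text
print(d * w)                       # total bits stored
print(address_pins_linear(d))      # pins for a linear array
print(address_pins_2d(r, c))       # pins for the 4 x 4 array
print(split_address(9, c))         # supercell 9 sits at row 2, column 1
```

For the 16 × 8 chip this gives four pins for the linear organization and two for the 4 × 4 organization, matching the text. The saving comes at the price of sending the row and column addresses in two separate steps.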
Enhanced DRAMs
There are many kinds of DRAM memories, and new kinds appear on the market with regularity
as manufacturers attempt to keep up with rapidly increasing processor speeds. Each is based on
the conventional DRAM cell, with optimizations that improve the speed with which the basic
DRAM cells can be accessed.

Accessing Main Memory

Data flows back and forth between the processor and the DRAM main memory over
shared electrical conduits called buses. Each transfer of data between the CPU and memory is
accomplished with a series of steps called a bus transaction. A read transaction transfers data
from the main memory to the CPU. A write transaction transfers data from the CPU to the main
memory.
A bus is a collection of parallel wires that carry address, data, and control signals.
Depending on the particular bus design, data and address signals can share the same set of wires,
or they can use different sets. Also, more than two devices can share the same bus. The control
wires carry signals that synchronize the transaction and identify what kind of transaction is
currently being performed.

Figure : Example bus structure that connects the CPU and main memory.
Disk Storage
Disks are workhorse storage devices that hold enormous amounts of data, on the order of
hundreds to thousands of gigabytes, as opposed to the hundreds or thousands of megabytes in a
RAM-based memory. However, it takes on the order of milliseconds to read information from a
disk, a hundred thousand times longer than from DRAM and a million times longer than from
SRAM.

Figure: Memory read transaction for a load operation

8. TEXT BOOKS:

1. V. Carl Hamacher, Zvonko G. Vranesic and Safwat G. Zaky, "Computer Organisation", VI
edition, McGraw-Hill Inc., 2012
2. David A. Patterson and John L. Hennessy, "Computer Organization and Design", Morgan
Kaufmann / Elsevier, Fifth edition, 2014

9. APPLICATIONS
Real-life application of memory concepts: memory is the ability to encode, store, and
retrieve a stimulus.
Lesson Plan for Cache basics – Measuring and improving cache performance
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.5 / 13
1.CONTENT LIST:
Cache basics – Measuring and improving cache performance
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of cache and its performance measurement
4.OUTCOMES:
i. Describe cache memory
ii. Analyze the performance of cache and measure it
5.LINK SHEET:
i. Define cache memory
ii. Explain the criteria to improve the performance of cache memory
6.EVOCATION: (5 Minutes)
7. Lecture Notes

Cache basics – Measuring and improving cache performance

Two techniques for improving cache performance are considered here. The first focuses on reducing the miss rate by reducing the probability that two different memory
blocks will contend for the same cache location. The second technique reduces the miss penalty
by adding an additional level to the hierarchy. This technique, called multilevel caching, first
appeared in high-end computers selling for more than $100,000 in 1990; since then it has
become common on desktop computers selling for less than $500!

CPU time can be divided into the clock cycles that the CPU spends executing the program
and the clock cycles that the CPU spends waiting for the memory system. Normally, we assume
that the costs of cache accesses that are hits are part of the normal CPU execution cycles. Thus,

CPU time = (CPU execution clock cycles + Memory-stall clock cycles) × Clock cycle time

The memory-stall clock cycles come primarily from cache misses, and we make that assumption
here. We also restrict the discussion to a simplified model of the memory system. In real
processors, the stalls generated by reads and writes can be quite complex, and accurate
performance prediction usually requires very detailed simulations of the processor and memory
system.
For reads,

Read-stall cycles = (Reads / Program) × Read miss rate × Read miss penalty

Writes are more complicated. For a write-through scheme, we have two sources of stalls:
write misses, which usually require that we fetch the block before continuing the write (see
the Elaboration on page 467 for more details on dealing with writes), and write buffer stalls,
which occur when the write buffer is full when a write occurs.

Calculating Cache Performance:

Assume the miss rate of an instruction cache is 2% and the miss rate of the data cache is 4%. If a
processor has a CPI of 2 without any memory stalls and the miss penalty is 100 cycles for all
misses, determine how much faster a processor would run with a perfect cache that never missed.
Assume the frequency of all loads and stores is 36%.
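
One way to work this exercise is sketched below in Python. It assumes, as the simplified model above does, that every instruction is fetched through the instruction cache and that only loads and stores access the data cache; the function and variable names are made up for this example.

```python
def cpi_with_stalls(base_cpi, inst_miss_rate, data_miss_rate,
                    mem_access_freq, miss_penalty):
    """Effective CPI once memory-stall cycles per instruction are added."""
    # Every instruction is fetched, so instruction misses apply to all of them.
    inst_stalls = inst_miss_rate * miss_penalty
    # Only loads and stores (mem_access_freq of all instructions) touch the data cache.
    data_stalls = mem_access_freq * data_miss_rate * miss_penalty
    return base_cpi + inst_stalls + data_stalls


base_cpi = 2.0
real_cpi = cpi_with_stalls(base_cpi, inst_miss_rate=0.02, data_miss_rate=0.04,
                           mem_access_freq=0.36, miss_penalty=100)
speedup_with_perfect_cache = real_cpi / base_cpi

print(round(real_cpi, 2), round(speedup_with_perfect_cache, 2))
```

Instruction misses add 0.02 × 100 = 2.00 stall cycles per instruction and data misses add 0.36 × 0.04 × 100 = 1.44, so the CPI with stalls is 2 + 3.44 = 5.44. Because the instruction count and clock rate are unchanged, the processor with a perfect cache is faster by 5.44 / 2 = 2.72.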

Reducing Cache Misses by More Flexible Placement of Blocks

So far, when we place a block in the cache, we have used a simple placement scheme: A block
can go in exactly one place in the cache. As mentioned earlier, it is called direct mapped because
there is a direct mapping from any block address in memory to a single location in the upper
level of the hierarchy. However, there is actually a whole range of schemes for placing blocks.
Direct mapped, where a block can be placed in exactly one location, is at one extreme. At the
other extreme is a scheme where a block can be placed in any location in the cache. Such a
scheme is called fully associative, because a block in memory may be associated with any entry
in the cache. To find a given block in a fully associative cache, all the entries in the cache must
be searched because a block can be placed in any one. To make the search practical, it is done in
parallel with a comparator associated with each cache entry. These comparators significantly
increase the hardware cost, effectively making fully associative placement practical only for
caches with small numbers of blocks.
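
The two extremes can be contrasted with a short Python sketch. It assumes the usual direct-mapped convention that a block goes to position (block address modulo number of cache blocks); the function names and the list-based cache are illustrative assumptions, and the loop stands in for comparators that hardware would operate in parallel.

```python
def direct_mapped_slot(block_address, num_blocks):
    """Direct mapped: each memory block has exactly one possible cache position."""
    return block_address % num_blocks


def fully_associative_lookup(entries, block_address):
    """Fully associative: the block may be anywhere, so every entry is compared.

    Hardware performs these comparisons in parallel, one comparator per entry;
    this sequential loop only models the result, not the timing.
    """
    for slot, stored in enumerate(entries):
        if stored == block_address:
            return slot
    return None  # miss


print(direct_mapped_slot(12, 8))
print(fully_associative_lookup([7, 12, 3, None], 12))
print(fully_associative_lookup([7, 12, 3, None], 5))
```

In the direct-mapped case, blocks 4, 12, 20, and so on all compete for position 4 of an eight-block cache, which is the contention that more flexible placement tries to reduce.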
Choosing Which Block to Replace :

When a miss occurs in a direct-mapped cache, the requested block can go in exactly one
position, and the block occupying that position must be replaced. In an associative cache, we
have a choice of where to place the requested block, and hence a choice of which block to
replace. In a fully associative cache, all blocks are candidates for replacement. In a set-
associative cache, we must choose among the blocks in the selected set. The most commonly
used scheme is least recently used (LRU), which we used in the previous example. In an LRU
scheme, the block replaced is the one that has been unused for the longest time. The set
associative example on page 482 uses LRU, which is why we replaced Memory(0) instead of
Memory(6). LRU replacement is implemented by keeping track of when each element in a set
was used relative to the other elements in the set. For a two-way set-associative cache, tracking
when the two elements were used can be implemented by keeping a single bit in each set and
setting the bit to indicate an element whenever that element is referenced. As associativity
increases, implementing LRU gets harder; in Section 5.5, we will see an alternative scheme for
replacement.
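
The single-bit LRU scheme for a two-way set can be sketched in Python as follows. The class name TwoWaySet, its fields, and the access sequence used below are assumptions made for illustration; the sketch models one set only and tracks tags rather than data.

```python
class TwoWaySet:
    """One set of a two-way set-associative cache with a single LRU bit."""

    def __init__(self):
        self.tags = [None, None]  # contents of way 0 and way 1
        self.mru = 0              # the single bit: which way was used most recently

    def access(self, tag):
        """Reference a block; return True on a hit and False on a miss."""
        if tag in self.tags:
            way = self.tags.index(tag)
            hit = True
        else:
            if None in self.tags:
                way = self.tags.index(None)   # fill an empty way first
            else:
                way = 1 - self.mru            # evict the least recently used way
            self.tags[way] = tag
            hit = False
        self.mru = way                        # record this way as most recently used
        return hit


cache_set = TwoWaySet()
hits = [cache_set.access(block) for block in [0, 8, 0, 6, 8]]
print(hits, cache_set.tags)
```

For the sequence 0, 8, 0, 6, 8 only the third access hits. Block 6 replaces block 8, because block 0 had just been used, and the final access to block 8 then replaces block 0 rather than block 6.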
8. TEXT BOOKS:

1. V. Carl Hamacher, Zvonko G. Vranesic and Safwat G. Zaky, "Computer Organisation", VI
edition, McGraw-Hill Inc., 2012
2. David A. Patterson and John L. Hennessy, "Computer Organization and Design", Morgan
Kaufmann / Elsevier, Fifth edition, 2014

9. APPLICATIONS
Real-life application of memory concepts: memory is the ability to encode, store, and
retrieve a stimulus.
Lesson Plan for Cache basics – Measuring and improving cache performance
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.6 / 13
1.CONTENT LIST:
Cache basics – Measuring and improving cache performance
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of cache and its performance measurement
4.OUTCOMES:
i. Describe cache memory
ii. Analyze the performance of cache and measure it
5.LINK SHEET:
i. Define cache memory
ii. Explain the criteria to improve the performance of cache memory
6.EVOCATION: (5 Minutes)
7. TEXT BOOKS:

V. Carl Hamacher, Zvonko G. Vranesic and Safwat G. Zaky, "Computer Organisation",
VI edition, McGraw-Hill Inc., 2012
David A. Patterson and John L. Hennessy, "Computer Organization and Design",
Morgan Kaufmann / Elsevier, Fifth edition, 2014

8. APPLICATIONS
Real-life application of memory concepts: memory is the ability to encode, store, and
retrieve a stimulus.
SRI VIDYA COLLEGE OF ENGINEERING AND
TECHNOLOGY, VIRUDHUNAGAR
Department of Computer Science & Engineering

Class II Year (03Semester)


Subject Code CS6303
Subject Computer Architecture
Prepared By Kaviya.P
Lesson Plan for Virtual memory
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.7/ 13
1.CONTENT LIST:
Virtual memory
2. SKILLS ADDRESSED:
Learning
Remembering
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of virtual memory
4. OUTCOMES:
i. Describe virtual memory
ii. Analyze the features of virtual memory
5.LINK SHEET:
i. Define virtual memory
ii. Explain the criteria to improve the performance of virtual memory
6.EVOCATION: (5 Minutes)
7. Lecture Notes
Virtual Memory

Similarly, the main memory can act as a "cache" for the secondary storage, usually implemented
with magnetic disks. This technique is called virtual memory. Historically, there were two major
motivations for virtual memory: to allow efficient and safe sharing of memory among multiple
programs, and to remove the programming burdens of a small, limited amount of main memory.
Four decades after its invention, it's the former reason that reigns today.
Consider a collection of programs running all at once on a computer. Of course,
to allow multiple programs to share the same memory, we must be able to protect the programs
from each other, ensuring that a program can only read and write the portions of main memory
that have been assigned to it. Main memory need contain only the active portions of the many
programs, just as a cache contains only the active portion of one program. Thus, the principle of
locality enables virtual memory as well as caches, and virtual memory allows us to efficiently
share the processor as well as the main memory.
The second motivation for virtual memory is to allow a single user program to exceed the size of
primary memory. Formerly, if a program became too large for memory, it was up to the
programmer to make it fit. Programmers divided programs into pieces and then identified the
pieces that were mutually exclusive. These overlays were loaded or unloaded under user program
control during execution, with the programmer ensuring that the program never tried to access an
overlay that was not loaded and that the overlays loaded never exceeded the total size of the
memory. Overlays were traditionally organized as modules, each containing both code and data.

In virtual memory, the address is broken into a virtual page number and a page offset. Figure
5.20 shows the translation of the virtual page number to a physical page number. The physical
page number constitutes the upper portion of the physical address, while the page offset, which is
not changed, constitutes the lower portion. The number of bits in the page offset field determines
the page size. The number of pages addressable with the virtual address need not match the
number of pages addressable with the physical address. Having a larger number of virtual pages
than physical pages is the basis for the illusion of an essentially unbounded amount of virtual
memory.
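The split into virtual page number and page offset can be illustrated with a short sketch. The 12-bit offset (4 KiB pages), the dictionary used as a page table, and its contents are all assumptions chosen for the example; a real page table is a hardware-defined structure maintained by the operating system.

```python
PAGE_OFFSET_BITS = 12                # assumed page size: 2**12 = 4096 bytes
PAGE_SIZE = 1 << PAGE_OFFSET_BITS


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address using a page table.

    page_table maps virtual page number -> physical page number.
    A missing entry models a page fault.
    """
    virtual_page = virtual_address >> PAGE_OFFSET_BITS  # upper bits
    offset = virtual_address & (PAGE_SIZE - 1)          # lower bits, unchanged
    if virtual_page not in page_table:
        raise LookupError(f"page fault on virtual page {virtual_page}")
    physical_page = page_table[virtual_page]
    return (physical_page << PAGE_OFFSET_BITS) | offset


# Hypothetical mapping: more virtual pages may exist than physical pages.
example_page_table = {0: 5, 1: 2, 7: 0}
```

Here translate(0x1ABC, example_page_table) keeps the offset 0xABC and swaps virtual page 1 for physical page 2.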

Lesson Plan for TLBs - Input/output system
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.8/ 13
1.CONTENT LIST:
TLBs - Input/output system
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of TLB
4.OUTCOMES:
i. Describe TLB
ii. Analyze the features of TLB
5.LINK SHEET:
i. Define TLB
ii. Describe TLB system in detail
6.EVOCATION: (5 Minutes)
Lesson Plan for TLBs - Input/output system
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.9/ 13
1.CONTENT LIST:
TLBs - Input/output system
2. SKILLS ADDRESSED:
Learning
Understanding
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of input/ output system
4.OUTCOMES:
i. Describe input and output system
ii. Analyze the features of input/ output system
5.LINK SHEET:
i. Define input and output system
ii. Design the hardware and software using advanced processors
6.EVOCATION: (5 Minutes)
7. Lecture Notes

Input/Output
The computer system’s I/O architecture is its interface to the outside world. This
architecture is designed to provide a systematic means of controlling interaction with the outside
world and to provide the operating system with the information it needs to manage I/O activity
effectively.

There are three principal I/O techniques: programmed I/O, in which I/O occurs under the
direct and continuous control of the program requesting the I/O operation; interrupt-driven I/O,
in which a program issues an I/O command and then continues to execute, until it is interrupted
by the I/O hardware to signal the end of the I/O operations; and direct memory access (DMA), in
which a specialized I/O processor takes over control of an I/O operation to move a large block of
data.

Two important examples of external I/O interfaces are FireWire and Infiniband.

Peripherals and the System Bus


• There is a wide variety of peripherals, each with its own method of operation
o Impractical for the processor to accommodate all of them
• Data transfer rates are often slower than those of the processor and/or memory
o Impractical to use the high-speed system bus to communicate directly
• Data transfer rates may be faster than those of the processor and/or memory
o This mismatch may lead to inefficiencies if improperly managed
• Peripherals often use different data formats and word lengths

Purpose of I/O Modules


• Interface to the processor and memory via the system bus or control switch
• Interface to one or more peripheral devices
External Devices:
External device categories
• Human readable: communicate with the computer user – CRT
• Machine readable: communicate with equipment – disk drive or tape drive
• Communication: communicate with remote devices – may be human readable or machine
readable

The External Device – I/O Module


• Control signals: determine the function that will be performed
• Data: set of bits to be sent or received
• Status signals: indicate the state of the device
• Control logic: controls the device’s operations
• Transducer: converts data from electrical to other forms of energy
• Buffer: temporarily holds data being transferred

Keyboard/Monitor
• Most common means of computer/user interaction
• Keyboard provides input that is transmitted to the computer
• Monitor displays data provided by the computer
• The character is the basic unit of exchange
• Each character is associated with a 7 or 8 bit code

Disk Drive
• Contains electronics for exchanging data, control, and status signals with an I/O module
• Contains electronics for controlling the disk read/write mechanism
• Fixed-head disk – transducer converts between magnetic patterns on the disk surface and bits in
the buffer
• Moving-head disk – must move the disk arm rapidly across the surface

I/O Modules
Module Function
• Control and timing
• Processor communication
• Device communication
• Data buffering
• Error detection

I/O control steps


• Processor checks I/O module for external device status
• I/O module returns status
• If device ready, processor gives I/O module command to request data transfer
• I/O module gets a unit of data from device
• Data transferred from the I/O module to the processor

Processor communication
• Command decoding: I/O module accepts commands from the processor, sent as signals on the
control bus
• Data: data are exchanged between the processor and the I/O module over the data bus
• Status reporting: common status signals BUSY and READY are used because peripherals are slow
• Address recognition: I/O module must recognize a unique address for each peripheral that it
controls

I/O module communication
• Device communication: commands, status information, and data
• Data buffering: data come from main memory in rapid bursts and must be buffered by the I/O
module and then sent to the device at the device's rate
• Error detection: responsible for reporting errors to the processor

Typical I/O Device Data Rates


I/O Module Structure: Block Diagram of an I/O Module

Module connects to the computer through a set of signal lines – system bus
• Data transferred to and from the module are buffered with data registers
• Status provided through status registers – may also act as control registers
• Module logic interacts with processor via a set of control signal lines
• Processor uses control signal lines to issue commands to the I/O module
• Module must recognize and generate addresses for devices it controls
• Module contains logic for device interfaces to the devices it controls
• I/O module functions allow the processor to view devices in a simple-minded way
• I/O module may hide device details from the processor so the processor only functions in terms
of simple read and write operations – timing, formats, etc…
• I/O module may leave much of the work of controlling a device visible to the processor –
rewind a tape, etc…

I/O channel or I/O processor


• I/O module that takes on most of the detailed processing burden
• Used on mainframe computers

I/O controller or device controller


• Primitive I/O module that requires detailed control
• Used on microcomputers
Lesson Plan for Programmed I/O
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.10/ 13
1.CONTENT LIST:
Programmed I/O
2. SKILLS ADDRESSED:
Learning
Remembering
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of programmed output and input controller
4.OUTCOMES:
i. Sort the applications of programmed input and output controller
ii. Analyze the features of programmed input and output controller
5.LINK SHEET:
i. Define programmed input and output controller
ii. Design the programmed input and output controller
6.EVOCATION: (5 Minutes)
7. Lecture Notes
Programmed I/O
Overview of Programmed I/O
• Processor executes an I/O instruction by issuing command to appropriate I/O
module
• I/O module performs the requested action and then sets the appropriate bits in
the I/O status register – I/O module takes no further action to alert the
processor – it does not interrupt the processor
• The processor periodically checks the status of the I/O module until it
determines that the operation is complete
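The polling behaviour described above can be sketched with a small simulation. Everything here is hypothetical: the IOModule class, its READY bit value, and the tick method (which stands in for time passing on the device) are assumptions for illustration, not a real device interface.

```python
class IOModule:
    """Hypothetical I/O module with a status register and a data register."""

    READY = 0x01  # assumed position of the READY bit in the status register

    def __init__(self, pending_data, polls_until_ready):
        self._pending_data = pending_data
        self._countdown = polls_until_ready
        self.status = 0
        self.data = None

    def command_read(self):
        """Processor issues a READ command; the operation is now in progress."""
        self.status = 0

    def tick(self):
        """Advance the simulated device by one poll interval."""
        if self._countdown > 0:
            self._countdown -= 1
        if self._countdown == 0 and not (self.status & self.READY):
            self.data = self._pending_data
            self.status |= self.READY  # module sets the bit and does nothing more


def programmed_read(module):
    """Issue a READ, then busy-wait on the status register until READY is set."""
    module.command_read()
    polls = 0
    while not (module.status & IOModule.READY):
        polls += 1      # the processor does no useful work here
        module.tick()
    return module.data, polls
```

The poll count returned by programmed_read makes the cost visible: every iteration is processor time spent only on checking status.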
I/O Commands
The processor issues an address, specifying I/O module and device, and an I/O
command. The commands are:
• Control: activate a peripheral and tell it what to do
• Test: test various status conditions associated with an I/O module and its
peripherals
• Read: causes the I/O module to obtain an item of data from the peripheral and
place it into an internal register
• Write: causes the I/O module to take a unit of data from the data bus and
transmit it to the peripheral

Three Techniques for Input of a Block of Data


I/O Instructions
Processor views I/O operations in a similar manner as memory operations. Each device is given a
unique identifier or address.
Processor issues commands containing device address – I/O module must check address lines to
see if the command is for itself.

I/O mapping
• Memory-mapped I/O
o Single address space for both memory and I/O devices (disadvantage: uses up valuable memory
address space)
o I/O module registers treated as memory addresses
o Same machine instructions used to access both memory and I/O devices (advantage: allows for
more efficient programming)
o Single read line and single write line needed
o Commonly used

• Isolated I/O
o Separate address spaces for memory and I/O devices
o Separate memory and I/O select lines needed
o Small number of I/O instructions
o Commonly used
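The difference between the two mappings can be sketched as two toy bus models. The class names, the IO_BASE boundary of 0xFF00, and the inp/out method names are assumptions made for this example only.

```python
class MemoryMappedBus:
    """Single address space: addresses at or above IO_BASE select I/O registers."""

    IO_BASE = 0xFF00  # assumed boundary between memory and I/O registers

    def __init__(self):
        self.memory = {}
        self.io_registers = {}

    def load(self, address):
        # The same load instruction reaches memory or an I/O register.
        if address >= self.IO_BASE:
            return self.io_registers.get(address - self.IO_BASE, 0)
        return self.memory.get(address, 0)

    def store(self, address, value):
        if address >= self.IO_BASE:
            self.io_registers[address - self.IO_BASE] = value
        else:
            self.memory[address] = value


class IsolatedBus:
    """Separate address spaces: dedicated I/O instructions select the I/O space."""

    def __init__(self):
        self.memory = {}
        self.ports = {}

    def load(self, address):
        return self.memory.get(address, 0)

    def store(self, address, value):
        self.memory[address] = value

    def inp(self, port):
        # Dedicated input instruction; port numbers do not collide with memory.
        return self.ports.get(port, 0)

    def out(self, port, value):
        self.ports[port] = value
```

In the memory-mapped model, the addresses from IO_BASE upward are lost to memory; in the isolated model, address 1 and port 1 are distinct locations.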

Lesson Plan for DMA and interrupts
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.11/ 13
1.CONTENT LIST:
DMA and interrupts
2. SKILLS ADDRESSED:
Learning
Remembering
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of DMA and interrupts
4.OUTCOMES:
i. Sort the applications of DMA and interrupts
ii. Analyze the features of DMA and interrupts
5.LINK SHEET:
i. Define DMA and interrupts
ii. Design the DMA and interrupts
6.EVOCATION: (5 Minutes)
7. Lecture Notes
Interrupt-Driven I/O
• Overcomes the processor having to wait long periods of time for I/O modules
• The processor does not have to repeatedly check the I/O module status

I/O module view point


• I/O module receives a READ command from the processor
• I/O module reads data from desired peripheral into data register
• I/O module interrupts the processor
• I/O module waits until data is requested by the processor
• I/O module places data on the data bus when requested

Processor view point


• The processor issues a READ command
• The processor performs some other useful work
• The processor checks for interrupts at the end of the instruction cycle
• The processor saves the current context when interrupted by the I/O module
• The processor reads the data from the I/O module and stores it in memory
• The processor then restores the saved context and resumes execution
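The processor-side steps can be sketched as a simple loop that checks for an interrupt at the end of each instruction cycle. The function name, its parameters, and the use of a dictionary holding only the program counter as the "context" are simplifying assumptions for illustration.

```python
def run_processor(program_length, interrupt_after, device_data):
    """Simulate a processor that issues a READ and keeps executing instructions.

    interrupt_after is the (assumed) number of instructions after which the
    I/O module raises its interrupt.
    """
    memory = []                 # where data read from the I/O module is stored
    log = ["issue READ"]        # the processor issues the command, then moves on
    pc = 0
    while pc < program_length:
        log.append(f"execute instruction {pc}")  # other useful work
        pc += 1
        # Interrupts are checked only at the end of the instruction cycle.
        interrupt_pending = (pc == interrupt_after)
        if interrupt_pending:
            saved_context = {"pc": pc}           # save the current context
            memory.append(device_data)           # read the data, store it in memory
            log.append("service interrupt")
            pc = saved_context["pc"]             # restore context and resume
    return memory, log
```

Unlike the polling case, no cycles are spent testing status; the only overhead is the context save and restore around the service step.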

Design Issues
• How does the processor determine which device issued the interrupt?
• How are multiple interrupts dealt with?

Device identification
• Multiple interrupt lines – each line may have multiple I/O modules
• Software poll – poll each I/O module
o Separate command line – TEST I/O
o Processor reads status register of I/O module
o Time consuming
• Daisy chain
o Hardware poll
o Common interrupt request line
o Processor sends interrupt acknowledge
o Requesting I/O module places a word of data on the data lines – a "vector" that uniquely
identifies the I/O module – vectored interrupt
• Bus arbitration
o I/O module first gains control of the bus
o I/O module sends interrupt request
o The processor acknowledges the interrupt request
o I/O module places its vector on the data lines

Multiple interrupts
• The techniques above not only identify the requesting I/O module but provide methods of
assigning priorities
• Multiple lines – processor picks line with highest priority
• Software polling – polling order determines priority
• Daisy chain – daisy chain order of the modules determines priority
• Bus arbitration – arbitration scheme determines priority
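How polling order doubles as a priority scheme can be shown in a few lines. The device names and the dictionary of pending flags are hypothetical; the same first-match logic also models a daisy chain, where the acknowledge signal reaches modules in chain order.

```python
def software_poll(pending, polling_order):
    """Return the first module, in polling order, that has an interrupt pending.

    pending maps module name -> True if that module is requesting service.
    Returns None when no module is requesting.
    """
    for name in polling_order:
        if pending[name]:
            return name     # earlier in the order means higher priority
    return None


# Hypothetical situation: two modules request service at the same time.
example_pending = {"disk": True, "keyboard": True, "printer": False}
```

With the order keyboard, disk, printer the keyboard is serviced first; reversing the order gives the disk priority instead, with no change to the hardware.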

Intel 82C59A Interrupt Controller


Intel 80386 provides
• Single Interrupt Request line – INTR
• Single Interrupt Acknowledge line – INTA
• Connects to an external interrupt arbiter, 82C59A, to handle multiple devices and priority
structures
• 8 external devices can be connected to the 82C59A – can be cascaded to handle up to 64 devices

82C59A operation – only manages interrupts
• Accepts interrupt requests
• Determines interrupt priority
• Signals the processor using INTR
• Processor acknowledges using INTA
• Places vector information on the data bus
• Processor processes the interrupt and communicates directly with the I/O module

82C59A interrupt modes


Fully nested – priority from 0 (IR0) to 7 (IR7)
Rotating – several devices have the same priority; the most recently serviced device gets the lowest priority
Special mask – processor can inhibit interrupts from selected devices
Intel 82C55A Programmable Peripheral Interface

• Single chip, general purpose I/O module
• Designed for use with the Intel 80386
• Can control a variety of simple peripheral devices

A, B, and C function as 8-bit I/O ports (C can be divided into two 4-bit I/O ports). The left side of
the diagram shows the interface to the 80386 bus.
Direct Memory Access
Drawback of Programmed and Interrupt-Driven I/O
• I/O transfer rate limited to speed that processor can test and service devices
• Processor tied up managing I/O transfers

DMA Function
• DMA module on system bus used to mimic the processor.
• DMA module only uses system bus when processor does not need it.
• DMA module may temporarily force processor to suspend operations – cycle stealing.

DMA Operation
• The processor issues a command to the DMA module
o Read or write
o I/O device address using data lines
o Starting memory address using data lines – stored in address register
o Number of words to be transferred using data lines – stored in data count register
• The processor then continues with other work
• DMA module transfers the entire block of data – one word at a time – directly to or from
memory without going through the processor
• DMA module sends an interrupt to the processor when complete
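A DMA read can be sketched as a module that holds an address register and a data count and moves one word per bus cycle. The DMAModule class, its method names, and the list used as main memory are assumptions for illustration, not a model of any specific controller.

```python
class DMAModule:
    """Hypothetical DMA module that copies words from a device into memory."""

    def __init__(self, memory):
        self.memory = memory            # shared main memory, modelled as a list
        self.address_register = 0
        self.data_count = 0
        self.device = None
        self.interrupt_pending = False

    def command_read(self, device_words, start_address, word_count):
        """Processor sets up the transfer, then is free to do other work."""
        self.device = iter(device_words)
        self.address_register = start_address
        self.data_count = word_count
        self.interrupt_pending = False

    def step(self):
        """Transfer one word during one bus cycle; return True while words remain."""
        if self.data_count == 0:
            return False
        self.memory[self.address_register] = next(self.device)
        self.address_register += 1
        self.data_count -= 1
        if self.data_count == 0:
            self.interrupt_pending = True   # signal the processor only at the end
        return self.data_count > 0
```

The processor is involved twice only: once in command_read to set up the block, and once when interrupt_pending is raised after the last word.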
DMA and Interrupt Breakpoints during Instruction Cycle
• The processor is suspended just before it needs to use the bus.
• The DMA module transfers one word and returns control to the processor.
• Since this is not an interrupt the processor does not have to save context.
• The processor executes more slowly, but this is still far more efficient than either programmed or
interrupt-driven I/O.

DMA Configurations

• Single bus – detached DMA module
o Each transfer uses bus twice – I/O to DMA, DMA to memory
o Processor suspended twice

• Single bus – integrated DMA module
o Module may support more than one device
o Each transfer uses bus once – DMA to memory
o Processor suspended once

• Separate I/O bus
o Bus supports all DMA enabled devices
o Each transfer uses bus once – DMA to memory
o Processor suspended once
Lesson Plan for I/O processors
Time: 45 Minutes
Lesson. No Unit 5– Lesson No.12/ 13
1.CONTENT LIST:
I/O processors
2. SKILLS ADDRESSED:
Learning
Remembering
3. OBJECTIVE OF THIS LESSON PLAN:
To make the students know the basics of I/O processors
4.OUTCOMES:
i. Sort the applications of I/O processors
ii. Analyze the features of I/O processors
5.LINK SHEET:
i. Define I/O processors
ii. Design the I/O processors
6.EVOCATION: (5 Minutes)
7. Lecture Notes
Input-Output Processor (IOP)
• Communicates directly with all I/O devices
• Fetches and executes its own instructions
• IOP instructions are specifically designed to facilitate I/O transfers

Command
• Instructions that are read from memory by an IOP
• Distinguished from instructions that are read by the CPU
• Commands are prepared by experienced programmers and are stored in memory
• Command words = IOP program

I/O Channels and Processors

The Evolution of the I/O Function


1. Processor directly controls peripheral device
2. Addition of a controller or I/O module – programmed I/O
3. Same as 2 – interrupts added
4. I/O module direct access to memory using DMA
5. I/O module enhanced to become processor like – I/O channel
6. I/O module has local memory of its own – computer like – I/O processor
• More and more the I/O function is performed without processor involvement.
• The processor is increasingly relieved of I/O related tasks – improved performance.

Characteristics of I/O Channels

• Extension of the DMA concept


• Ability to execute I/O instructions – special-purpose processor on I/O channel –
complete control over I/O operations
• Processor does not execute I/O instructions itself – processor initiates I/O
transfer by instructing the I/O channel to execute a program in memory
• Program specifies
 Device or devices
 Area or areas of memory
 Priority
 Error condition actions

Two types of I/O channels


• Selector channel
o Controls multiple high-speed devices
o Dedicated to the transfer of data with one of the devices
o Each device handled by a controller, or I/O module
o I/O channel controls these I/O controllers
• Multiplexor channel
o Can handle multiple devices at the same time
o Byte multiplexor – used for low-speed devices
o Block multiplexor – interleaves blocks of data from several devices
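The interleaving done by a multiplexor channel can be sketched as round-robin chunking of several device streams. The function name and the use of strings as device data are assumptions for the example; a unit of 1 models a byte multiplexor and a larger unit models a block multiplexor.

```python
from itertools import zip_longest


def multiplex(streams, unit):
    """Interleave unit-sized chunks from each device stream in round-robin order."""
    # Cut every stream into chunks of `unit` items.
    chunked = [
        [stream[i:i + unit] for i in range(0, len(stream), unit)]
        for stream in streams
    ]
    interleaved = []
    # Take one chunk from each device per round; shorter streams simply run out.
    for round_of_chunks in zip_longest(*chunked):
        for chunk in round_of_chunks:
            if chunk is not None:
                interleaved.append(chunk)
    return interleaved
```

A selector channel, by contrast, would send one device's whole stream before turning to the next.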

The External Interface: FireWire and Infiniband


Type of Interfaces
o Parallel interface – multiple bits transferred simultaneously
o Serial interface – bits transferred one at a time
I/O module dialog for a write operation
1. I/O module sends control signal – requesting permission to send data
2. Peripheral acknowledges the request
3. I/O module transfers data
4. Peripheral acknowledges receipt of data

FireWire Serial Bus – IEEE 1394


• Very high speed serial bus
• Low cost
• Easy to implement
• Used with digital cameras, VCRs, and televisions
FireWire Configurations
• Daisy chain
• 63 devices on a single port – 64 if you count the interface itself
• 1022 FireWire busses can be interconnected using bridges
• Hot plugging
• Automatic configuration
• No terminations
• Can be tree structured rather than strictly daisy chained

FireWire three layer stack

Physical layer
• Defines the transmission media that are permissible and the electrical and signaling
characteristics of each
• Data rates from 25 to 400 Mbps
• Converts binary data to electrical signals
• Provides arbitration services
o Based on tree structure
o Root acts as arbiter
o First come first served
o Natural priority controls simultaneous requests – nearest root
o Fair arbitration
o Urgent arbitration

Link layer
• Describes the transmission of data in the packets
• Asynchronous
o Variable amount of data and several bytes of transaction data transferred as a packet
o Uses an explicit address
o Acknowledgement returned
• Isochronous
o Variable amount of data in sequence of fixed sized packets at regular intervals
o Uses simplified addressing
o No acknowledgement

Transaction layer
• Defines a request-response protocol that hides the lower-layer detail of FireWire from
applications.
FireWire Protocol Stack

FireWire Subactions
InfiniBand
• Recent I/O specification aimed at high-end server market
• First version released early 2001
• Standard for data flow between processors and intelligent I/O devices
• Intended to replace PCI bus in servers
• Greater capacity, increased expandability, enhanced flexibility
• Connect servers, remote storage, network devices to central fabric of switches and links
• Greater server density
• Independent nodes added as required
• I/O distance from server up to
o 17 meters using copper
o 300 meters using multimode optical fiber
o 10 kilometers using single-mode optical fiber
• Transmission rates up to 30 Gbps
InfiniBand Operations
• 16 logical channels (virtual lanes) per physical link
• One lane for fabric management – all other lanes for data transport
• Data sent as a stream of packets
• Virtual lane temporarily dedicated to the transfer from one end node to another
• Switch maps traffic from incoming lane to outgoing lane
