
SEN 207

COMPUTER ORGANIZATION AND ARCHITECTURE


Computer Organization and Architecture are the two disciplines used to design computer systems.
Computer Architecture refers to those attributes of a system that are visible to the programmer, such as addressing techniques, instruction sets, and the number of bits used for data, which have a direct impact on the logical execution of a program. It defines the system in an abstract manner and deals with what the system does.
Computer Organization is the way in which a system is structured: its operational units and the interconnections between them that achieve the architectural specification. It is the realization of the abstract model and deals with how the system is implemented.

DIFFERENCE BETWEEN COMPUTER ARCHITECTURE AND COMPUTER ORGANIZATION

S.No | Computer Architecture | Computer Organization
1 | Architecture describes what the computer does. | Organization describes how the computer does it.
2 | Deals with the functional behaviour of computer systems. | Deals with the structural relationships between the components.
3 | Deals with high-level design issues. | Deals with low-level design issues.
4 | Architecture indicates the computer's hardware. | Organization indicates the computer's performance.
5 | A programmer views the architecture as a set of instructions, addressing modes, and registers. | The implementation of the architecture is called the organization.
6 | When designing a computer, the architecture is fixed first. | The organization is decided after the architecture.
7 | Computer Architecture is also called Instruction Set Architecture (ISA). | Computer Organization is frequently called microarchitecture.
8 | Comprises logical functions such as instruction sets, registers, data types, and addressing modes. | Consists of physical units such as circuit designs, peripherals, and adders.
9 | The architectural categories found in computer systems are: (1) Von Neumann architecture, (2) Harvard architecture, (3) Instruction Set Architecture, (4) microarchitecture, (5) system design. | CPU organization is classified into three categories based on the number of address fields: (1) single-accumulator organization, (2) general-register organization, (3) stack organization.
10 | It makes the computer's hardware visible. | It offers details on how well the computer performs.
11 | Architecture coordinates the hardware and software of the system. | Organization handles the segments of the system and their interconnection.
12 | The software developer is aware of it. | It is hidden from the software developer.
13 | Examples: Intel and AMD created the x86 processors, Sun Microsystems and others created the SPARC processor, and Apple, IBM, and Motorola created the PowerPC. | Organizational attributes include hardware elements invisible to the programmer, such as the interfacing of computer and peripherals, memory technologies, and control signals.

BASIC STRUCTURE OF COMPUTERS


A computer is an electronic device that accepts data, performs operations, displays results, and stores the data or results as needed. It is a combination of hardware and software resources that work together to provide various functionalities to the user. Hardware comprises the physical components of a computer, such as the processor, memory devices, monitor, and keyboard, while software is the set of programs or instructions that are required for the hardware resources to function properly.

Components of a Computer
There are basically three important components of a computer:
Input Unit
Central Processing Unit (CPU)
Output Unit
1. Input Unit:
The input unit consists of input devices that are attached to the computer. These devices take
input and convert it into binary language that the computer understands. Some of the common
input devices are keyboard, mouse, joystick, scanner etc.
The Input Unit is formed by attaching one or more input devices to a computer.
A user inputs data and instructions through input devices such as a keyboard, mouse, etc.
The input unit is used to provide data to the processor for further processing.
2. Central Processing Unit:
Once the information is entered into the computer by the input device, the processor processes
it. The CPU is called the brain of the computer because it is the control center of the computer.
It first fetches instructions from memory and then interprets them so as to know what is to be
done. If required, data is fetched from memory or input device. Thereafter CPU executes or
performs the required computation, and then either stores the output or displays it on the output
device. The CPU has three main components, each responsible for different functions: the Arithmetic Logic Unit (ALU), the Control Unit (CU), and memory registers.

A. Arithmetic and Logic Unit (ALU): The ALU, as its name suggests, performs mathematical
calculations and makes logical decisions. Arithmetic calculations include addition, subtraction,
multiplication, and division. Logical decisions involve the comparison of two data items to see
which one is larger or smaller or equal.
Arithmetic Logical Unit is the main component of the CPU
It is the fundamental building block of the CPU.
Arithmetic and Logical Unit is a digital circuit that is used to perform arithmetic and logical
operations.
B. Control Unit: The Control unit coordinates and controls the data flow in and out of the
CPU, and also controls all the operations of ALU, memory registers and also input/output units.
It is also responsible for carrying out all the instructions stored in the program. It decodes the
fetched instruction, interprets it and sends control signals to input/output devices until the
required operation is done properly by ALU and memory.
The Control Unit is a component of the central processing unit of a computer that directs the
operation of the processor.
It instructs the computer’s memory, arithmetic, and logic unit, and input and output devices on
how to respond to the processor’s instructions.
In order to execute the instructions, the components of a computer receive signals from the
control unit.
It is often called the central nervous system of the computer.
C. Memory Registers: A register is a temporary unit of memory in the CPU. These are used
to store data that is directly used by the processor. Registers can be of different sizes (16-
bit, 32-bit, 64-bit, and so on) and each register inside the CPU has a specific function, like
storing data, storing an instruction, storing the address of a location in memory, etc. The user
registers can be used by an assembly language programmer for storing operands, intermediate
results, etc. The accumulator (ACC) is the main register in the ALU and contains one of the
operands of an operation to be performed in the ALU.
Memory attached to the CPU is used for the storage of data and instructions and is called
internal memory.
The internal memory is divided into many storage locations, each of which can store data or
instructions. Each memory location is of the same size and has an address. With the help of the
address, the computer can read any memory location easily without having to search the entire
memory. When a program is executed, its data is copied to the internal memory and stored in
the memory till the end of the execution. The internal memory is also called the primary memory or main memory. Because the time to access data is independent of its location in memory, this memory is also known as Random Access Memory (RAM).

The Memory Unit is the primary storage of the computer. It stores both data and instructions.
Data and instructions are kept in this unit while they are being used so that they are available whenever required.
3. Output Unit:
The output unit consists of output devices that are attached to the computer. It converts the
binary data coming from the CPU to human understandable form. The common output devices
are monitors, printers, plotters, etc.
The output unit displays or prints the processed data in a user-friendly format.
The output unit is formed by attaching the output devices of a computer.
The output unit accepts the information from the CPU and displays it in a user-readable form.
PERFORMANCE OF COMPUTER IN COMPUTER ORGANIZATION
In computer organization, performance refers to the speed and efficiency at which a computer
system can execute tasks and process data. A high-performing computer system is one that can
perform tasks quickly and efficiently while minimizing the amount of time and resources
required to complete these tasks.
There are several factors that can impact the performance of a computer system, including:
Processor speed: The speed of the processor, measured in GHz (gigahertz), determines how
quickly the computer can execute instructions and process data.
Memory: The amount and speed of the memory, including RAM (random access memory)
and cache memory, can impact how quickly data can be accessed and processed by the
computer.
Storage: The speed and capacity of the storage devices, including hard drives and solid-state
drives (SSDs), can impact the speed at which data can be stored and retrieved.
I/O devices: The speed and efficiency of input/output devices, such as keyboards, mice, and
displays, can impact the overall performance of the system.
Software optimization: The efficiency of the software running on the system, including
operating systems and applications, can impact how quickly tasks can be completed.
Improving the performance of a computer system typically involves optimizing one or more of
these factors to reduce the time and resources required to complete tasks. This can involve
upgrading hardware components, optimizing software, and using specialized performance-
tuning tools to identify and address bottlenecks in the system.
Computer performance is the amount of work accomplished by a computer system. The word
performance in computer performance means “How well is the computer doing the work it is
supposed to do?”. It basically depends on the response time, throughput, and execution time of
a computer system. Response time is the time from the start to completion of a task. This also
includes:

• Operating system overhead
• Waiting for I/O and other processes
• Accessing disk and memory
• Time spent executing on the CPU (the execution time)
Throughput is the total amount of work done in a given time.
CPU execution time is the total time a CPU spends computing on a given task, excluding time spent waiting for I/O or running other programs. This is also referred to as simply CPU time.
Performance is determined by execution time as performance is inversely proportional to
execution time.
Performance = (1 / Execution time)
And,
(Performance of A / Performance of B)
= (Execution Time of B / Execution Time of A)
If given that Processor A is faster than Processor B that means the execution time of A is less
than that of the execution time of B. Therefore, the performance of A is greater than that of the
performance of B.
Example – Machine A runs a program in 100 seconds and Machine B runs the same program in 125 seconds.
(Performance of A / Performance of B) = (Execution Time of B / Execution Time of A) = 125 / 100 = 1.25
That means Machine A is 1.25 times faster than Machine B.
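In code, this comparison is a one-liner. A minimal Python sketch using the numbers from the example above (the variable names are illustrative):

```python
# Performance is the reciprocal of execution time, so the speedup of
# Machine A over Machine B is ExecutionTime_B / ExecutionTime_A.
exec_time_a = 100  # seconds taken by Machine A
exec_time_b = 125  # seconds taken by Machine B

speedup = exec_time_b / exec_time_a
print(f"Machine A is {speedup:.2f} times faster than Machine B")  # 1.25
```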

The time to execute a given program can be computed as:

Execution time (T) = CPU clock cycles x clock cycle time (t_cycle)

Since the clock cycle time t_cycle and the clock rate are reciprocals,

Execution time = CPU clock cycles / clock rate

The number of CPU clock cycles can be determined by:

CPU clock cycles = (Instructions / Program) x (Clock cycles / Instruction) = N_instr x CPI

which gives:

Execution time = Instruction Count x CPI x clock cycle time = (Instruction Count x CPI) / clock rate

T = N_instr x CPI x t_cycle

where:
T       = execution time
N_instr = number of instructions executed by the program
CPI     = average number of clock cycles per instruction
t_cycle = duration of one clock cycle

Example: Calculate the execution time (T) for a program that executes 5 million instructions, given the instruction mix in the table below, on a CPU with a clock frequency of 2 GHz.

Instruction type | Frequency_instr | CPI_instr
ALU    | 50% | 3
Load   | 20% | 5
Store  | 10% | 4
Branch | 20% | 3

Solution:

First calculate the average number of cycles per instruction:

CPI = Σ (Frequency_instr x CPI_instr)

CPI = (0.5 x 3) + (0.2 x 5) + (0.1 x 4) + (0.2 x 3)

CPI = 1.5 + 1 + 0.4 + 0.6 = 3.5

Clock frequency = 2 GHz = 2 x 10^9 Hz

T = (5 x 10^6 x 3.5) / (2 x 10^9) = 8.75 x 10^-3 s = 8.75 ms
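The same calculation is easy to script. A minimal Python sketch of the worked example above (the dictionary of instruction classes is just a convenient representation, not part of the original formula):

```python
# Execution time T = N_instr * CPI * t_cycle, where CPI is the
# frequency-weighted average of the per-class CPI values.
instruction_mix = {            # class: (frequency, CPI)
    "ALU":    (0.50, 3),
    "Load":   (0.20, 5),
    "Store":  (0.10, 4),
    "Branch": (0.20, 3),
}

n_instr = 5_000_000            # 5 million instructions
clock_rate = 2e9               # 2 GHz, so t_cycle = 1 / clock_rate

cpi = sum(freq * cycles for freq, cycles in instruction_mix.values())
t = n_instr * cpi / clock_rate

print(f"CPI = {cpi}")          # CPI = 3.5
print(f"T = {t * 1e3} ms")     # T = 8.75 ms
```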

How to Improve Performance?


To improve performance you can:
▪ Decrease the CPI (clock cycles per instruction) by using improved hardware.
▪ Decrease the clock cycle time (i.e., increase the clock rate) by reducing propagation delays or by using pipelining.
▪ Decrease the number of required cycles by improving the ISA or the compiler.
USES AND BENEFITS OF PERFORMANCE OF COMPUTER
Some of the key uses and benefits of a high-performing computer system include:
Increased productivity: A high-performing computer can help increase productivity by
reducing the time required to complete tasks, allowing users to complete more work in less
time.
Improved user experience: A fast and efficient computer system can provide a better user
experience, with smoother operation and fewer delays or interruptions.
Faster data processing: A high-performing computer can process data more quickly, enabling
faster access to critical information and insights.
Enhanced gaming and multimedia performance: High-performance computers are better
suited for gaming and multimedia applications, providing smoother and more immersive
experiences.
Better efficiency and cost savings: By optimizing the performance of a computer system, it
is possible to reduce the time and resources required to complete tasks, leading to better
efficiency and cost savings.

Amdahl’s Law
Amdahl's law is an expression used to find the maximum expected improvement to an overall system when only part of the system is improved. It is often used in parallel computing to predict the theoretical maximum speedup from using multiple processors.
Speedup is defined as the ratio of the performance for the entire task using the enhancement to the performance for the entire task without using the enhancement; equivalently,
speedup can be defined as the ratio of the execution time for the entire task without using the enhancement to the execution time for the entire task using the enhancement.

If P_e is the performance for the entire task using the enhancement when possible,

P_w is the performance for the entire task without using the enhancement,

E_w is the execution time for the entire task without using the enhancement, and

E_e is the execution time for the entire task using the enhancement when possible, then:

Speedup = P_e / P_w, or

Speedup = E_w / E_e

Amdahl’s law uses two factors to find speedup from some enhancement:
Fraction enhanced – The fraction of the computation time in the original computer that can be
converted to take advantage of the enhancement. For example- if 10 seconds of the execution
time of a program that takes 40 seconds in total can use an enhancement, the fraction is 10/40.
This obtained value is Fraction Enhanced. Fraction enhanced is always less than 1.
Speedup enhanced – The improvement gained by the enhanced execution mode; that is, how
much faster the task would run if the enhanced mode were used for the entire program. For
example – If the enhanced mode takes, say 3 seconds for a portion of the program, while it is
6 seconds in the original mode, the improvement is 6/3. This value is Speedup enhanced.
Speedup Enhanced is always greater than 1.
The overall speedup is the ratio of the execution times:

Overall Speedup (S_all) = Old Execution Time / New Execution Time

S_all = 1 / [ (1 − Fraction_enhanced) + (Fraction_enhanced / Speedup_enhanced) ]

As Speedup_enhanced grows without bound, the overall speedup approaches its maximum:

Overall Speedup (max) = 1 / (1 − Fraction_enhanced)

Of course, this maximum is theoretical (ideal) and is not achievable in real-life conditions. Likewise, in the case where Fraction_enhanced = 1, the overall speedup is simply Speedup_enhanced.
Amdahl’s law is a principle that states that the maximum potential improvement to the
performance of a system is limited by the portion of the system that cannot be improved. In
other words, the performance improvement of a system as a whole is limited by its bottlenecks.
The law is often used to predict the potential performance improvement of a system when
adding more processors or improving the speed of individual processors. It is named after Gene
Amdahl, who first proposed it in 1967.
The formula for Amdahl’s law is:
S = 1 / (1 – P + (P / N))
Where:
S is the speedup of the system,
P is the proportion of execution time that can be improved, and
N is the speedup factor of the improved portion (for example, the number of processors).
For example, if the portion of a program that can be parallelized occupies 20% of the total execution time and that portion is run on 4 processors, the speedup would be:
S = 1 / (1 – 0.2 + (0.2 / 4))
S = 1 / (0.8 + 0.05)
S = 1 / 0.85
S ≈ 1.176
This means that the overall performance of the system would improve by roughly 18% with the 4 processors.
It’s important to note that Amdahl’s law assumes that the rest of the system is able to fully
utilize the additional processors, which may not always be the case in practice.
Advantages of Amdahl’s law:
✓ Provides a way to quantify the maximum potential speedup that can be achieved by
parallelizing a program, which can help guide decisions about hardware and software
design.
✓ Helps to identify the portions of a program that are not easily parallelizable, which can
guide efforts to optimize those portions of the code.
✓ Provides a framework for understanding the trade-offs between parallelization and
other forms of optimization, such as code optimization and algorithmic improvements.
Disadvantages of Amdahl’s law:
❖ Assumes that the portion of the program that cannot be parallelized is fixed, which may
not be the case in practice. For example, it is possible to optimize code to reduce the
portion of the program that cannot be parallelized, making Amdahl’s law less accurate.
❖ Assumes that all processors have the same performance characteristics, which may not
be the case in practice. For example, in a heterogeneous computing environment, some
processors may be faster than others, which can affect the potential speedup that can be
achieved.
❖ Does not take into account other factors that can affect the performance of parallel
programs, such as communication overhead and load balancing. These factors can
impact the actual speedup that is achieved in practice, which may be lower than the
theoretical maximum predicted by Amdahl’s law.
What Kinds of Problems Do We Solve with Amdahl’s Law?
Recall how we defined the speedup of a system that has been enhanced:
Example 1: Assume a microprocessor that is widely used for scientific applications. It has both integer and floating-point instructions. The floating-point instructions are enhanced to be 3 times faster than before, while the integer instructions are unenhanced. If floating-point instructions account for 20% of the program's execution time, find the overall speedup.
Solution:
P = 20% = 0.2, N = 3, S = ?
S = 1 / ((1 − P) + (P / N))
S = 1 / ((1 − 0.2) + (0.2 / 3))
S = 1 / (0.8 + 0.0667)
S = 1 / 0.867
S ≈ 1.153
Example 2: Suppose that a task makes extensive use of floating-point operations, with 40% of the execution time consumed by them. If a new hardware design speeds up the floating-point module by a factor of 4, what is the overall speedup?
Solution:
P = 40% = 0.4, N = 4, S = ?
S = 1 / ((1 − P) + (P / N))
S = 1 / ((1 − 0.4) + (0.4 / 4))
S = 1 / (0.6 + 0.1)
S = 1 / 0.7
S ≈ 1.429
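All three worked examples apply the same formula, so a small helper makes them easy to check. A minimal Python sketch (the function name amdahl_speedup is invented for illustration):

```python
def amdahl_speedup(p: float, n: float) -> float:
    """Overall speedup when a fraction p of the execution time
    is sped up by a factor of n (Amdahl's law)."""
    return 1 / ((1 - p) + p / n)

# The worked examples above:
print(f"{amdahl_speedup(0.2, 4):.3f}")  # 1.176 - 20% of the time, 4 processors
print(f"{amdahl_speedup(0.2, 3):.3f}")  # 1.154 - Example 1
print(f"{amdahl_speedup(0.4, 4):.3f}")  # 1.429 - Example 2
```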

Assignment: In an enhancement of a CPU design, the speed of the floating-point unit has been increased by 20% and that of the fixed-point unit by 10%. What is the overall speedup achieved if the ratio of the number of floating-point operations to fixed-point operations is 2:3, and a floating-point operation took twice the time of a fixed-point operation in the original design?
COMPUTER ARCHITECTURE AND ORGANIZATION
Computer Architecture is the design of computers, including their instruction sets, hardware components, and system organization.
Computer Architecture deals with the functional behaviour of a computer system and the design of its various parts, while Computer Organization deals with the structural relationships and operational attributes that are linked together to realize the architectural specification.
There are two main types of computer architecture:
i. Von Neumann (Princeton) Architecture
ii. Harvard Architecture
Von Neumann Architecture is a digital computer architecture whose design is based on the stored-program concept, in which program instructions and data are stored in the same memory. This architecture was described by the famous mathematician and physicist John von Neumann in 1945.

[Figure: Von Neumann architecture — the CPU connects to a single memory holding both instruction data and program data.]

ADVANTAGES OF VON NEUMANN ARCHITECTURE


✓ Less physical space is required than for the Harvard architecture
✓ Handling just one memory block is simpler and easier to achieve
✓ Cheaper to build than the Harvard architecture
DISADVANTAGES OF VON NEUMANN ARCHITECTURE
❖ Shared memory - a defective program can overwrite another in memory, causing it to
crash
❖ Memory leaks - some defective programs fail to release memory when they are finished
with it, which could cause the computer to crash due to insufficient memory
❖ Data bus speed - the CPU is much faster than the data bus, meaning it often sits idle
(Von Neumann bottleneck)
❖ Fetch rate - data and instructions share the same data bus, even though the rate at which
each needs to be fetched is often very different.

Harvard Architecture is the digital computer architecture whose design is based on the
concept that there are separate storage and separate buses (signal paths) for instruction and
data. It was basically developed to overcome the bottleneck of Von Neumann Architecture. The
main advantage of having separate buses for instruction and data is that the CPU can access
instructions and read/write data at the same time.

Harvard Architecture

ADVANTAGES OF HARVARD ARCHITECTURE


✓ Simultaneous instruction and data access: utilizes separate memory and buses for
instructions and data.
✓ Reduced resource conflicts: The distinct memory units and buses for instructions and
data reduce the likelihood of pipeline stalls caused by resource conflicts.
✓ Independent Cache Memory optimization: enables independent caching of instructions
and data. This feature allows for more effective cache memory usage, as the likelihood
of cache misses is diminished, contributing to speed and performance improvements.
✓ Enhanced parallelism: With its separate memory units and buses, Harvard Architecture
promotes parallelism in processing instructions and data.
DISADVANTAGES OF HARVARD ARCHITECTURE
❖ Increased design complexity: The architecture necessitates separate memory units,
buses, and management mechanisms for instructions and data, increasing system
complexity and potentially leading to a larger chip size.
❖ Higher implementation cost: Due to the increased complexity of the design,
implementing the Harvard Architecture may entail higher manufacturing expenses
when compared to the von Neumann Architecture.
❖ Code and data sharing limitations: The separation of instructions and data memory can
create challenges when code and data need to be shared.
❖ While the Harvard Architecture offers superior performance and efficiency for specific
use cases, such as digital signal processing and embedded systems, it may not always
be the optimal choice for general-purpose computing applications.
Difference between Von Neumann and Harvard Architecture

VON NEUMANN ARCHITECTURE | HARVARD ARCHITECTURE
An older architecture based on the stored-program computer concept. | A more modern architecture based on the relay-based Harvard Mark I model.
The same physical memory is used for instructions and data. | Separate physical memories are used for instructions and data.
A common bus is used for data and instruction transfer. | Separate buses are used for transferring data and instructions.
Two clock cycles are typically required to execute a single instruction. | An instruction can be executed in a single cycle.
It is cheaper. | It is costlier than the Von Neumann architecture.
The CPU cannot access instructions and read/write data at the same time. | The CPU can access instructions and read/write data at the same time.
It is used in personal computers and small computers. | It is used in microcontrollers and signal processing.

Flynn’s taxonomy is a classification scheme for computer architectures proposed by Michael


Flynn in 1966. The taxonomy is based on the number of instruction streams and data streams
that can be processed simultaneously by a computer architecture.
i. Single Instruction Stream, Single Data Stream (SISD): In a SISD architecture, there
is a single processor that executes a single instruction stream and operates on a
single data stream. This is the simplest type of computer architecture and is used in
most traditional computers.

SISD
ii. Single Instruction Stream, Multiple Data Stream (SIMD): In a SIMD architecture,
there is a single processor that executes the same instruction on multiple data
streams in parallel. This type of architecture is used in applications such as image
and signal processing.
SIMD

iii. Multiple Instruction Stream, Single Data Stream (MISD): In a MISD architecture,
multiple processors execute different instructions on the same data stream. This
type of architecture is not commonly used in practice, as it is difficult to find
applications that can be decomposed into independent instruction streams.

MISD

iv. Multiple Instruction Stream, Multiple Data Stream (MIMD): In a MIMD


architecture, multiple processors execute different instructions on different data
streams. This type of architecture is used in distributed computing, parallel
processing, and other high-performance computing applications.

MIMD
CPU ORGANIZATION AND MICRO-ARCHITECTURAL LEVEL DESIGN
CPU Organization
What is a CPU?
A Central Processing Unit is the most important component of a computer system. The CPU is hardware that performs the data input/output, processing, and storage functions for a computer system. A CPU is installed into a CPU socket, and these sockets are generally located on the motherboard. The CPU can perform various data-processing operations and can store data, instructions, programs, and intermediate results.
History of CPU
Since 1823, when Baron Jons Jakob Berzelius discovered silicon (still the primary material used in the manufacture of CPUs today), the history of the CPU has passed through numerous significant turning points. The first transistor was created by John Bardeen, Walter Brattain, and William Shockley in December 1947. In 1958, the first working integrated circuit was built by Jack Kilby, with Robert Noyce producing the first practical monolithic IC soon afterwards.
The Intel 4004, developed with the assistance of Ted Hoff, was the company's first microprocessor, unveiled in 1971. Intel followed with the 8008 CPU in 1972, the 8086 in 1978, and the 8088 in June 1979. The Motorola 68000, a 16/32-bit processor, was also released in 1979. Sun unveiled the SPARC CPU in 1987, and AMD unveiled the AM386 CPU series in March 1991.
In January 1999, Intel introduced the Celeron 366 MHz and 400 MHz processors. AMD released its first dual-core processor in April 2005, and Intel introduced the Core 2 Duo processor in 2006. Intel released the first Core i5 desktop processor with four cores in September 2009. In January 2010, Intel released further processors, including the Core 2 Quad Q9500 and the first Core i3 and i5 mobile and desktop processors. In June 2017, Intel released the Core i9 desktop processor, and it introduced its first Core i9 mobile processor in April 2018.
What Does a CPU Do?
The main function of a computer processor is to execute instructions and produce an output. Fetch, Decode, and Execute are the fundamental operations of the CPU:
Fetch: first, the CPU gets the instruction, i.e., binary numbers that are passed from RAM to the CPU.
Decode: once the instruction has entered the CPU, it is decoded by the control unit's instruction decoder to determine what operation is required.
Execute: after the decode step, the instruction is carried out, with the ALU performing any arithmetic and logic involved.
Store: after the execute step, the result is written back to memory or a register.
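The cycle can be sketched as a simple interpreter loop. This is a toy illustration rather than any real instruction set: the opcodes, the (opcode, operand) format, and the five-slot memory are all invented for the example.

```python
# Toy fetch-decode-execute loop. Instructions are (opcode, operand)
# pairs stored in "memory"; the accumulator plays the role of ACC.
memory = [("LOAD", 7), ("ADD", 5), ("STORE", 4), ("HALT", None), 0]

pc, acc, running = 0, 0, True
while running:
    opcode, operand = memory[pc]   # Fetch: read the instruction at PC
    pc += 1                        # ...and advance the program counter
    # Decode + Execute: act according to the opcode
    if opcode == "LOAD":
        acc = operand              # load an immediate value into ACC
    elif opcode == "ADD":
        acc += operand             # the ALU performs the addition
    elif opcode == "STORE":
        memory[operand] = acc      # Store: write the result to memory
    elif opcode == "HALT":
        running = False

print(acc, memory[4])  # 12 12
```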
Types of CPU
By core count, we have three common types of CPU:
Single-Core CPU: The oldest type of computer CPU is the single-core CPU, used in the 1970s. These CPUs have only a single core, so they can process only one operation at a time; a single-core CPU is not suitable for multitasking.
Dual-Core CPU: Dual-core CPUs contain a single integrated circuit with two cores. Each core has its own cache and controller, and the two work together as a single unit, so dual-core CPUs can work faster than single-core processors.
Quad-Core CPU: Quad-core CPUs contain the equivalent of two dual-core processors within a single integrated circuit (IC) or chip, i.e., a chip with four independent cores. These cores read and execute instructions independently, which increases the overall speed of programs without even boosting the clock speed.
Different Parts of the CPU
The CPU consists of 3 major units, which are:

• Control Unit
• Memory or Storage Unit
• ALU (Arithmetic Logic Unit)
Control Unit
The Control Unit is the part of the computer’s central processing unit (CPU), which directs the
operation of the processor. It was included as part of the Von Neumann Architecture by John
von Neumann. It is the responsibility of the control unit to tell the computer’s memory,
arithmetic/logic unit, and input and output devices how to respond to the instructions that have
been sent to the processor. It fetches internal instructions of the programs from the main
memory to the processor instruction register, and based on this register contents, the control
unit generates a control signal that supervises the execution of these instructions. A control unit
works by receiving input information which it converts into control signals, which are then sent
to the central processor. The computer’s processor then tells the attached hardware what
operations to perform. The functions that a control unit performs are dependent on the type of
CPU because the architecture of the CPU varies from manufacturer to manufacturer.
Examples of devices that require a CU are:
• Central Processing Units (CPUs)
• Graphics Processing Units (GPUs)
Functions of the Control Unit
❖ It coordinates the sequence of data movements into, out of, and between a processor’s
many sub-units.
❖ It interprets instructions.
❖ It controls data flow inside the processor.
❖ It receives external instructions or commands which it converts to a sequence of control
signals.
❖ It controls many execution units (i.e. ALU, data buffers, and registers) contained within
a CPU.
❖ It also handles multiple tasks, such as fetching, decoding, execution handling, and
storing results.
Advantages of a Well-Designed Control Unit
✓ Efficient instruction execution: A well-designed control unit can execute instructions
more efficiently by optimizing the instruction pipeline and minimizing the number of
clock cycles required for each instruction.
✓ Improved performance: A well-designed control unit can improve the performance of
the CPU by increasing the clock speed, reducing the latency, and improving the
throughput.
✓ Support for complex instructions: A well-designed control unit can support complex
instructions that require multiple operations, reducing the number of instructions
required to execute a program.
✓ Improved reliability: A well-designed control unit can improve the reliability of the
CPU by detecting and correcting errors, such as memory errors and pipeline stalls.
✓ Lower power consumption: A well-designed control unit can reduce power
consumption by optimizing the use of resources, such as registers and memory, and
reducing the number of clock cycles required for each instruction.
✓ Better branch prediction: A well-designed control unit can improve branch prediction
accuracy, reducing the number of branch mispredictions and improving performance.
✓ Improved scalability: A well-designed control unit can improve the scalability of the
CPU, allowing it to handle larger and more complex workloads.
✓ Better support for parallelism: A well-designed control unit can better support
parallelism, allowing the CPU to execute multiple instructions simultaneously and
improve overall performance.
✓ Improved security: A well-designed control unit can improve the security of the CPU
by implementing security features such as address space layout randomization and data
execution prevention.
✓ Lower cost: A well-designed control unit can reduce the cost of the CPU by minimizing
the number of components required and improving manufacturing efficiency.
Disadvantages of a Poorly-Designed Control Unit
❖ Reduced performance: A poorly designed control unit can reduce the performance of
the CPU by introducing pipeline stalls, increasing the latency, and reducing the
throughput.
❖ Increased complexity: A poorly designed control unit can increase the complexity of
the CPU, making it harder to design, test, and maintain.
❖ Higher power consumption: A poorly designed control unit can increase power
consumption by inefficiently using resources, such as registers and memory, and
requiring more clock cycles for each instruction.
❖ Reduced reliability: A poorly designed control unit can reduce the reliability of the CPU
by introducing errors, such as memory errors and pipeline stalls.
❖ Limitations on instruction set: A poorly designed control unit may limit the instruction
set of the CPU, making it harder to execute complex instructions and limiting the
functionality of the CPU.
❖ Inefficient use of resources: A poorly designed control unit may inefficiently use
resources such as registers and memory, leading to wasted resources and reduced
performance.
❖ Limited scalability: A poorly designed control unit may limit the scalability of the CPU,
making it harder to handle larger and more complex workloads.
❖ Poor support for parallelism: A poorly designed control unit may limit the ability of the
CPU to support parallelism, reducing the overall performance of the system.
❖ Security vulnerabilities: A poorly designed control unit may introduce security
vulnerabilities, such as buffer overflows or code injection attacks.
❖ Higher cost: A poorly designed control unit may increase the cost of the CPU by
requiring additional components or increasing the manufacturing complexity.

Memory or Storage Unit


What is Computer Memory?
Computer memory is just like the human brain. It is used to store data/information and instructions. It is a data storage unit or device where the data to be processed, and the instructions required for processing, are stored. Both the input and the output can be stored here.
Characteristics of Primary Computer Memory
It is faster than secondary memory.
It is a semiconductor memory.
It is usually volatile and is the main memory of the computer.
A computer system cannot run without primary memory.
How Does Computer Memory Work?
When you open a program, it is loaded from secondary memory into primary memory; for example, a program may be moved from a solid-state drive (SSD) into RAM. Because primary storage is accessed more quickly, the opened software can communicate with the computer's processor more quickly.
Memory is volatile, which means that data is only kept in it temporarily; data saved in volatile memory is automatically destroyed when the computing device is turned off. When you save a file, it is sent to secondary memory for storage.
There are various kinds of memory available, and the operation depends on the type of primary memory used; normally, though, "memory" refers to semiconductor-based memory. Semiconductor memory is made up of integrated circuits (ICs) with silicon-based metal-oxide-semiconductor (MOS) transistors.
Types of Computer Memory
In general, computer memory is of three types:
1. Primary memory
2. Secondary memory
3. Cache memory
1. Primary Memory
It is also known as the main memory of the computer system. It is used to store data and
programs or instructions during computer operations. It uses semiconductor technology and
hence is commonly called semiconductor memory. Primary memory is of two types:
RAM (Random Access Memory): It is a volatile memory. Volatile memory stores information only while the power supply is on; if the power supply fails or is interrupted, all the data and information in this memory are lost. RAM is used for booting up or starting the computer, and it temporarily stores the programs and data that are to be executed by the processor. RAM is of two types:
SRAM (Static RAM): SRAM uses transistors, and its circuits are capable of retaining their state as long as power is applied. This memory consists of a number of flip-flops, with each flip-flop storing 1 bit. It has a shorter access time and hence is faster.
DRAM (Dynamic RAM): DRAM uses capacitors and transistors and stores the data as a charge on the capacitors. DRAM chips contain thousands of memory cells. The charge on each capacitor must be refreshed every few milliseconds, which makes this memory slower than SRAM.
ROM (Read-Only Memory): It is a non-volatile memory; non-volatile memory retains information even when the power supply fails or is interrupted. ROM is used to store the information that is required to operate the system. As its name, read-only memory, suggests, we can only read the programs and data stored on it. It contains electronic fuses that can be programmed for a piece of specific information, and the information is stored in the ROM in binary format. It is also known as permanent memory. ROM is of four types:
MROM (Masked ROM): Hard-wired devices with a pre-programmed collection of data or
instructions were the first ROMs. Masked ROMs are a type of low-cost ROM that works in
this way.
PROM (Programmable Read-Only Memory): This read-only memory can be modified once by the user. The user purchases a blank PROM and uses a PROM programmer to write the required contents into it. Its contents cannot be erased once written.
EPROM (Erasable Programmable Read Only Memory): EPROM is an extension to PROM
where you can erase the content of ROM by exposing it to Ultraviolet rays for nearly 40
minutes.
EEPROM (Electrically Erasable Programmable Read-Only Memory): Here the written contents can be erased electrically. An EEPROM can be erased and reprogrammed up to about 10,000 times, and erasing and programming take very little time, roughly 4-10 ms (milliseconds). Any area in an EEPROM can be wiped and reprogrammed selectively.
2. Secondary Memory
It is also known as auxiliary memory or backup memory. It is a non-volatile memory used to store large amounts of data or information. The data or information stored in secondary memory is permanent, but it is slower than primary memory. The CPU cannot access secondary memory directly: the data/information from the auxiliary memory is first transferred to the main memory, and then the CPU can access it.
Characteristics of Secondary Memory
It is a slow memory but reusable.
It is a reliable and non-volatile memory.
It is cheaper than primary memory.
The storage capacity of secondary memory is large.
A computer system can run without secondary memory.
In secondary memory, data is stored permanently even when the power is off.
Types of Secondary Memory
1. Magnetic Tapes: Magnetic tape is a long, narrow strip of plastic film with a thin magnetic coating, used for magnetic recording. Bits are recorded as magnetized spots, grouped into RECORDS, along several parallel tracks; typically 7 or 9 bits are recorded concurrently, one per track. Each track has one read/write head, which allows data to be recorded and read as a sequence of characters. The tape can be stopped, moved forwards or backwards, or rewound.

2. Magnetic Disks: A magnetic disk is a circular metal or plastic plate coated with magnetic material; both sides of the disc are used. Bits are stored on the magnetized surfaces in locations called tracks, which run in concentric rings. Tracks are typically broken into pieces called sectors.

Hard disks are magnetic disks that are permanently attached and cannot be removed by an ordinary user.

3. Optical Disks: An optical disk is a laser-based storage medium that can be written to and read. It is reasonably priced, has a long lifespan, and can be removed from the computer by ordinary users.
Types of Optical Disks
CD-ROM (Compact Disc – Read-Only Memory)
Information is written to the disc by using a controlled laser beam to burn pits into the disc surface.
It has a highly reflective surface, which is usually aluminium.
The diameter of the disc is 5.25 inches, and the track density is 16,000 tracks per inch.
The capacity of a CD-ROM is 600 MB, with each sector storing 2048 bytes of data.
The data transfer rate is about 4800 KB/sec, and the typical access time is around 80 milliseconds.
WORM (Write Once, Read Many)
A user can write data only once.
The information is written on the disc using a laser beam.
The written data can be read as many times as desired.
WORM discs keep lasting records of information, but access time is high.
Updated or new data can be written to another part of the disc, but data that has already been written cannot be changed.
Usual size: 5.25-inch or 3.5-inch diameter.
The usual capacity of a 5.25-inch disc ranges from 650 MB to 5.2 GB.
DVDs
The term DVD stands for Digital Versatile/Video Disc, and there are two sorts of writable DVDs:
DVD-R (recordable)
DVD-RW (re-writable)
DVD-ROMs (Digital Versatile Discs): These are read-only discs that can be used in a variety of ways. Compared with CD-ROMs, they can store far more data. A DVD has a thick polycarbonate plastic layer that serves as a foundation for the other layers and is read and written optically.
DVD-R: DVD-R is a writable optical disc that can be written just once; it is a recordable DVD, much like WORM. DVD-ROM capacities range from 4.7 to 17 GB, and the capacity of a 3.5-inch disc is 1.3 GB.
3. Cache Memory
It is a type of high-speed semiconductor memory that can help the CPU run faster. Between
the CPU and the main memory, it serves as a buffer. It is used to store the data and programs
that the CPU uses the most frequently.
Advantages of Cache Memory
It is faster than the main memory.
When compared to the main memory, it takes less time to access it.
It keeps the programs that can be run in a short amount of time.
It stores data that is in temporary use.
Disadvantages of Cache Memory
Because of the semiconductors used, it is very expensive.
The size of the cache (amount of data it can store) is usually small.
Arithmetic and Logical Unit (ALU)
An arithmetic unit, or ALU, enables computers to perform mathematical operations on binary
numbers. They can be found at the heart of every digital computer and are one of the most
important parts of a CPU (Central Processing Unit).
In its simplest form, an arithmetic unit can be thought of as a simple binary calculator -
performing binary addition or subtraction on two inputs (A & B) to output a result (to explore
more on how this works check out our note: Binary Addition with Full Adders).

As well as performing basic mathematical operations, the arithmetic unit may also output a
series of 'flags' that provide further information about the status of a result: if it is zero, if there
is a carryout, or if an overflow has occurred. This is important as it enables a computational
machine to perform more complex behaviors like conditional branching.
Modern computational machines, however, contain 'arithmetic units' which are far more
complex than the one described above. These units may perform additional basic mathematical
operations (multiply & divide) and bitwise operations (AND, OR, XOR et al). As such, they
are commonly referred to as an ALU (Arithmetic Logic Unit).
ALUs enable mathematical procedures to be performed in an optimized manner, and this can
significantly reduce the number of steps required to perform a particular calculation.
Today, most CPUs (Central Processing Units) contain ALUs that can perform operations on
32 or 64-bit binary numbers. However, AUs & ALUs which process much smaller numbers
also have their place in the history of computing.
COMPUTER REGISTER

Registers are a type of computer memory used to quickly accept, store, and transfer the data and instructions that are being used immediately by the CPU. The registers used by the CPU are often termed processor registers.

A processor register may hold an instruction, a storage address, or any data (such as bit
sequence or individual characters).
The computer needs processor registers for manipulating data and a register for holding a
memory address. The register holding the memory location is used to calculate the address of
the next instruction after the execution of the current instruction is completed.

Following is the list of some of the most common registers used in a basic computer:

Register Symbol Number of bits Function

Data register DR 16 Holds memory operand

Address register AR 12 Holds address for the memory

Accumulator AC 16 Processor register

Instruction register IR 16 Holds instruction code

Program counter PC 12 Holds address of the instruction

Temporary register TR 16 Holds temporary data

Input register INPR 8 Carries input character

Output register OUTR 8 Carries output character

[Figure: the common registers of a basic computer and their connection to memory.]

The following explains the various computer registers and their functions:
Accumulator Register (AC)
The Accumulator Register is a general-purpose Register. The initial data to be processed, the
intermediate result, and the final result of the processing operation are all stored in this register.
If no specific address for the result operation is specified, the result of arithmetic operations is
transferred to AC. The number of bits in the accumulator register equals the number of bits per
word.
Address Register (AR)
The Address Register holds the address of the memory location or register where data is stored or retrieved. The size of the Address Register equals the width of a memory address and is therefore directly related to the size of the memory: if the memory has a size of 2^n x m (2^n words of m bits each), an address is specified using n bits.
Data Register (DR)
The Data Register holds an operand fetched from memory. When a direct or indirect addressing operand is found, it is placed in the Data Register, and the processor then uses this value as data during its operation. The Data Register is the same size as a word in memory.
Instruction Register (IR)
The Instruction Register holds the instruction currently being executed. Because it holds instructions, the number of bits in the Instruction Register equals the number of bits in an instruction, which is n bits for an n-bit instruction word.
Input Register (INPR)
Input Register is a register that stores the data from an input device. The computer's
alphanumeric code determines the size of the input register.
Program Counter (PC)
The Program Counter serves as a pointer to the memory location where the next instruction is stored. The size of the PC equals the width of a memory address, so the number of bits in the PC is the same as the number of bits in a memory address (12 bits in the basic computer above).
Temporary Register (TR)
The Temporary Register is used to hold data while it is being processed. As Temporary
Register stores data, the number of bits it contains is the same as the number of bits in the data
word.
Output Register (OUTR)
The data that needs to be sent to an output device is stored in the Output Register. Its size is
determined by the alphanumeric code used by the computer.

COMPUTER BUS
A computer bus consists of a set of parallel conductors, which may be conventional wires, copper tracks on a printed circuit board, or microscopic aluminium trails on the surface of a silicon chip. Each wire carries just one bit, so the number of wires determines the largest data WORD the bus can transmit: a bus with eight wires can carry only 8-bit data words and hence defines the device as an 8-bit device.
The bus is a communication channel.
The defining characteristic of a bus is its shared transmission medium.
The limitation of a bus is that only one transmission can take place at a time.
A bus used to communicate between the major components of a computer is called a system bus.

The system bus contains three categories of lines that provide communication between the CPU, memory, and I/O:
1. Address lines (AL)
2. Data lines (DL)
3. Control lines (CL)
1. Address Lines:
Used to carry the address to memory and I/O.
Unidirectional.
Based on the width of the address bus we can determine the capacity of the main memory (n address lines can address 2^n locations).
2. Data Lines:
Used to carry binary data between the CPU, memory, and I/O.
Bidirectional.
Based on the width of the data bus we can determine the word length of the CPU, and based on the word length we can gauge the performance of the CPU.
3. Control Lines:
Used to carry the control signals and timing signals
Control signals indicate the type of operation.
Timing Signals are used to synchronize the memory and IO operations with a CPU clock.
Typical Control Lines may include Memory Read/Write, IO Read/Write, Bus Request/Grant,
etc.
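The two width rules above can be expressed directly in code. A minimal Python sketch (the 16 address lines and 8 data lines are arbitrary example values):

```python
# Address-bus width fixes how many locations main memory can have;
# data-bus width fixes the word length moved in one transfer.
address_lines = 16
data_lines = 8

locations = 2 ** address_lines               # 65,536 addressable locations
capacity_bytes = locations * data_lines // 8

print(f"{locations} locations of {data_lines} bits "
      f"= {capacity_bytes // 1024} KB of main memory")
```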
COMPUTER ARITHMETIC
BOOLEAN ALGEBRA
Boolean algebra is a type of algebra that operates on the binary number system. It was proposed in 1854 by the English mathematician George Boole. It is a variant of Aristotle's propositional logic that uses the symbols 0 and 1, or True and False. Boolean algebra is concerned with binary variables and logic operations.
Boolean Expression and Variables
A Boolean expression is an expression that produces a Boolean value when evaluated, that is, true or false, the only two Boolean values; Boolean variables are variables that store Boolean values. P + Q = R is a Boolean expression in which P, Q, and R are Boolean variables that can each store only two values: 0 and 1. The computer performs all operations using binary 0 and 1, since the computer understands only machine language (0/1). Boolean logic, named after George Boole, is a form of algebra in which all values are reduced to one of two possibilities: 1 or 0. To comprehend Boolean logic effectively, we must first understand its rules, as well as truth tables and logic gates.
Truth Tables
A truth table represents all the variety of combinations of input values and outputs in a tabular
manner. All the possibilities of the input and output are shown in it and hence the name truth
table is kept. In logic problems such as Boolean algebra and electronic circuits, truth tables are
commonly used. T or 1 denotes ‘True’ & F or 0 denotes ‘False’ in the truth table
A B X=A.B
T T T
T F F
F T F
F F F

Logic Gates
A logic gate is a virtual or physical device that performs a Boolean function. These are used to
make logic circuits. Logic gates are the main components of any digital system. This electrical
circuit can have only one output and 1 or more inputs. The relation between the input and the
output is governed by specific logic. AND, OR, NOT gate, etc are the examples of logic gates.
Types of Logic Gates
1. AND Gate (Product): A logic gate with two or more inputs and a single output is known as
an AND gate. The logic multiplication rules are used to operate an AND gate. An AND gate
can have any number of inputs, although the most common are two and three-input AND gates.
If any of the inputs are low (0), the output is also low in this gate. When all of the inputs are
high (1), the output will be high as well.
Truth table: For AND gate, the output X is true if and only if both the inputs P and Q are true.
So the truth table of AND gate is as follows:
P Q X=P.Q
T T T
T F F
F T F
F F F

2. OR Gate (Sum): A logic gate that performs a logical OR operation is known as an OR gate. The logical OR operation produces a high output (1) if one or both of the gate's inputs are high. If neither input is high, the result is a low output (0). Like an AND gate, an OR gate can have any number of inputs but only one output. A logical OR gate finds the maximum of two binary digits.

Truth table: For the OR gate, the output X is true if and only if any of the inputs P or Q is
true. So the truth table of OR gate is as follows:
P Q X=P+Q
T T T
T F T
F T T
F F F

3. NOT Gate (Complement): A NOT gate is an inverting device that takes only one input. Its output is ordinarily at logic level 1 and goes low, to logic level 0, when its single input is at logic level 1; in other words, it inverts its input signal. A NOT gate's output only returns high when its input is at logic level 0. The output ~P (~ denotes NOT) of a single-input NOT gate is true only when the input P is false, or, we can say, not true. It is also called an inverter gate, as its result is the negation of the input Boolean expression.

Truth table: For the NOT gate, the output X is true if and only if input P is false. So the truth
table of NOT gate is as follows:
P ~P
T F
F T
4. NAND Gate: A logic gate known as a NAND gate provides a low output (0) only if all of its inputs are true, and a high output (1) otherwise. As a result, the NAND gate is the inverse of an AND gate, and its circuit is created by joining an AND gate and a NOT gate. NAND means 'NOT of AND', and it yields false only when both inputs P and Q are true. NAND gates (together with NOR gates) are known as universal gates because they are a form of logic gate that can implement any Boolean function without the use of any other gate type.

Truth table:
For the NAND gate, the output X is false if and only if both the inputs (i.e., P and Q) are true.
So the truth table of the NAND gate is as follows:

P Q ~(P.Q)
T T F
T F T
F T T
F F T

5. NOR Gate: A logic gate known as a NOR gate provides a high output (1) only if all of its
inputs are false, and low output (0) otherwise. As a result, the NOR gate is the inverse of an
OR gate, and its circuit is created by joining OR gate and NOT gate. NOR means ‘Not of
OR’ Gate & it results in true only when both the inputs P and Q are false.

Truth table:
For the NOR gate, the output X is true if and only if both the inputs (i.e., P and Q) are false.
So the truth table of the NOR gate is as follows:

P Q ~(P+Q)
T T F
T F F
F T F
F F T

6. XOR Gate: An XOR gate (also known as an Exclusive OR gate) is a digital logic gate that conducts exclusive disjunction and has two or more inputs and one output. For a two-input XOR gate, exactly one input must be true for the output to be true: the output is false if both inputs are false, and also false if both inputs are true. XOR means 'Exclusive OR', and it yields true only when either of the two inputs P and Q is true, i.e., either P is true or Q is true, but not both.

Truth table:

P Q X=P⊕Q
T T F
T F T
F T T
F F F

7. XNOR Gate: An XNOR gate (also known as an Exclusive NOR gate) is a digital logic gate that is just the opposite of the XOR gate. It has two or more inputs and one output. When exactly one of its two inputs is true, it returns false. XNOR means 'Exclusive NOR', and its result is true only when its inputs P and Q are both true or both false.

Truth table:
P Q X=P XNOR Q
T T T
T F F
F T F
F F T

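To make the gate behaviour concrete, here is a minimal Python sketch (our own illustration, not part of the original note) that models each gate as a function on bits and prints its truth table; the function names are hypothetical choices:

# Basic logic gates modelled as functions on bits (0 or 1).
def AND(p, q): return p & q
def OR(p, q): return p | q
def NOT(p): return 1 - p
def NAND(p, q): return NOT(AND(p, q))
def NOR(p, q): return NOT(OR(p, q))
def XOR(p, q): return p ^ q
def XNOR(p, q): return NOT(XOR(p, q))

# Print the truth table of any two-input gate (1 = T, 0 = F).
def truth_table(gate):
    print("P Q", gate.__name__)
    for p in (1, 0):
        for q in (1, 0):
            print(p, q, gate(p, q))

truth_table(XNOR)   # reproduces the XNOR table above
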
Laws for Boolean Logic

The following are some laws of Boolean logic:


Law OR form AND form
Identity Law P+0=P P.1 = P
Idempotent Law P+P=P P.P = P
Commutative Law P+Q=Q+P P.Q = Q.P
Null Law 1+P=1 0.P = 0
Inverse Law P + (~P) = 1 P.(~P) = 0
Associative Law P + (Q + R) = (P + Q) + R P.(Q.R) = (P.Q).R
Distributive Law P + QR = (P + Q).(P + R) P.(Q + R) = P.Q + P.R
Absorption Law P + PQ = P P.(P + Q) = P
De Morgan’s Law ~(P + Q) = (~P).(~Q) ~(P.Q) = (~P) + (~Q)

De Morgan’s laws

De Morgan’s Law states that:


Statement 1: The Complement of the product (AND) of two Boolean variables (or
expressions) is equal to the sum(OR) of the complement of each Boolean variable (or
expression).
~(P.Q) = (~P) + (~Q)
Proof:
Statement: ~(P.Q) = (~P) + (~Q)
The truth table is:
P Q (~P) (~Q) ~(P.Q) (~P)+(~Q)
T T F F F F
T F F T T T
F T T F T T
F F T T T T
We can clearly see that truth values for ~(P.Q) are equal to truth values for (~P) + (~Q),
corresponding to the same input.
Statement 2: The Complement of sum (OR) of two Boolean variables (or expressions) is
equal to the product (AND) of the complement of each Boolean variable (or expression).
~(P + Q) = (~P).(~Q)
Proof
Statement: ~(P+Q) = (~P).(~Q)
The truth table is :
P Q (~P) (~Q) ~(P + Q) (~P).(~Q)
T T F F F F
T F F T F F
F T T F F F
F F T T T T
We can clearly see that truth values for ~(P + Q) are equal to truth values for (~P).(~Q),
corresponding to the same input.
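
As a quick check, the short Python sketch below (reusing the hypothetical gate functions defined in the earlier sketch) verifies both statements exhaustively over all inputs:

# Verify De Morgan's laws for every combination of P and Q.
for p in (0, 1):
    for q in (0, 1):
        assert NOT(AND(p, q)) == OR(NOT(p), NOT(q))  # ~(P.Q) = (~P) + (~Q)
        assert NOT(OR(p, q)) == AND(NOT(p), NOT(q))  # ~(P + Q) = (~P).(~Q)
print("De Morgan's laws hold for all inputs")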

Logic circuits

A logic circuit is an electric circuit to which we can give one or more binary inputs (assuming
two states, on or off) and from which we get a single binary output corresponding to the inputs
in a fashion that can be described as a function in symbolic logic. The AND, OR, and NOT
gates are basic logic circuits that perform the logical functions AND, OR, and NOT,
respectively. With circuits built from several gates, computers can do more complicated tasks
than they could with a single gate.
Example: A chain of two logic gates is the smallest circuit. Consider the following circuit:

This logic circuit is for the Boolean expression: (P + Q).R.


Here, the OR gate is used first: P and Q are its inputs, and P + Q is its output.
Then, the AND gate is used: (P + Q) and R are its inputs, and (P + Q).R is the output.
So the truth table is :
P Q R P+Q X = (P + Q).R
T T T T T
T T F T F
T F T T T
T F F T F
F T T T T
F T F T F
F F T F F
F F F F F

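The same idea extends to whole circuits. The sketch below (our own Python illustration, assuming the gate functions defined earlier) evaluates X = (P + Q).R for all eight input combinations and reproduces the table above:

# Evaluate the circuit X = (P + Q).R over all 8 input combinations.
for p in (1, 0):
    for q in (1, 0):
        for r in (1, 0):
            x = AND(OR(p, q), r)  # first the OR gate, then the AND gate
            print(p, q, r, x)
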
Questions

Question 1. Design the logical circuit for: A.B + B.C


Solution:

Question 2. What will be the Boolean expression for the following logic circuit:

Solution:
X = ~(P + Q).R + S
Question 3. Verify using truth table: P + P.Q = P
Solution:
The truth table for P + P.Q = P
P Q P.Q P + P.Q
T T T T
T F F T
F T F F
F F F F
In the truth table, we can see that the truth values for P + P.Q are exactly the same as those for P.

ADDRESSING MODES
The term addressing mode refers to how the operand of an instruction is specified. The
addressing mode specifies a rule for interpreting or modifying the address field of the
instruction before the operand is actually referenced.
An addressing mode can also be defined as one of the different ways of specifying the location
of an operand in an instruction.
DIFFERENT TYPES OF ADDRESSING MODE
• Implied Addressing Mode / Implicit Addressing Mode
• Immediate Addressing Mode.
• Register Direct Mode.
• Register Indirect Mode.
• Direct Addressing Mode.
• Indirect Addressing Mode.
• Stack Addressing Mode.
• Relative Addressing Mode.
• Indexed Addressing Mode.
• Base Register Addressing Mode.
• Auto-increment Addressing Mode.
• Auto-decrement Addressing Mode.
All computer architectures provide more than one of these addressing modes. The question
arises as to how the control unit can determine which addressing mode is being used in a
particular instruction. Several approaches are used.
Often, different opcodes will use different addressing modes. Alternatively, one or more bits in
the instruction format can serve as a mode field, whose value determines which addressing
mode is to be used and how the effective address is to be interpreted.
Effective Address
In a system without virtual memory, the effective address will be either the main memory
address or a register.
In a virtual memory system, the effective address is a virtual address or a register.
Implied Addressing Mode / Implicit Addressing Mode
The operand is specified implicitly in the definition of the instruction.
Example: the instruction “Complement Accumulator” is an implied mode instruction
Immediate Address Mode
The simplest form of addressing is immediate addressing mode, in which the operand is itself
present in the instruction
The address field which is part of the instruction is nothing but the operand itself. Immediate
addressing mode can be used to define and use constants or set initial values of variables.
Example: Consider the instruction LOAD 5. In this instruction, 5 itself is the operand.
Example: MOV AL, 35H (move the data 35H into AL register)
The advantage of the immediate addressing mode is that no memory reference other than the
instruction fetch is required to obtain the operand.
The disadvantage of immediate addressing mode is that the size of the number is restricted to
the size of the address field, which, in most instruction sets, is small compared with the word
length.
Register Direct Addressing Mode
The operand is stored in the register set; the address field of the instruction refers to the CPU
register that contains the operand.
No reference to memory is required to fetch the operand.
Register direct addressing is similar to direct addressing. The only difference is that the address
field refers to a register rather than the main memory address.

Example: Add R will increment the value stored in the accumulator by the content of register
R
AC ← AC + [R]
The advantages of registers direct addressing are that only a small address field is needed in
the instruction, and no memory reference is required.
The disadvantage of registers direct addressing is that the address space is very limited.

Register Indirect Addressing Mode


The address field of the instruction refers to a CPU register that contain the effective address
of the operand.
Register indirect addressing is similar to indirect addressing, except that the address field
refers to a register instead of a memory location. It requires only one memory reference
and no special calculation.
Register indirect addressing mode uses one less memory reference than indirect addressing.
Because the first information is available in a register, which is nothing but a memory address,
and at that memory address, the operand is stored.

Example: ADD R will increment the value stored on the accumulator by the content of the
memory location specified in register R
AC ← AC + [[R]]
Direct Addressing Mode
In direct addressing mode, the operand is stored in memory, not in a register. This is a very
simple form of addressing, in which the address field contains the effective address of the
operand.
This mode is also known as the absolute addressing mode. It requires only one memory
reference and no special calculation.
With direct addressing, the length of the address field is usually less than the word length, thus
limiting the address range
Example: Add x will increment the value stored in the accumulator by the value stored in the
memory location X
AC ← AC + [X]

Indirect Addressing Mode


The address field of the instruction specifies the address of memory location that contains the
effective address of the operand.
Two references to memory are required to fetch the operand
One solution is to have the address field refer to the address of a word in memory, which
contains a full-length address of the operand. This is known as the indirect addressing mode.

Example: ADD X will increment the value stored in the accumulator by the value stored at
memory location specified by X
AC ← AC + [[X]]
Here the effective address is itself read from memory: Effective Address = M[X].
Stack Addressing Mode
The operand is contained at the top of the stack.
Example: ADD
This instruction simply pops the two operands from the top of the stack, performs the addition,
and pushes the result back onto the top of the stack.

Relative Addressing Mode


Effective address of the operand is obtained by adding the content of Program Counter with
the address part of the instruction.
PC-relative addressing mode is used to implement intra-segment transfer of control. In this
mode, the effective address is obtained by adding a displacement to the PC.
EA= content of the Program Counter + Address part of the instruction.
The Program Counter always contains the address of the next instruction to be executed. After
an instruction is fetched, the value of the PC immediately increases. Note: the value
increases irrespective of whether the fetched instruction has completely executed or not.
Indexed Addressing Mode
The effective address of the operand is obtained by adding the content of the index register
with the address part of the instruction.
EA = content of index register + address part of the instruction.

Base-Register Addressing Mode


The reference register contains a memory address, and the address field contains a
displacement from that address. The register reference may be explicit or implicit.
Base register addressing mode is used to implement inter segment transfer of control. In this
mode effective address is obtained by adding base register value to address part of the
instruction.
EA= Content of Base register + Address part of the instruction.
In some implementations, a single segment/base register is employed and is used implicitly. In
others, the programmer may choose a register to hold the base address of a segment, and the
instruction must reference it explicitly.
Auto-increment Addressing Mode
This addressing mode is a special case of register indirect addressing mode where the effective
address of the operand is the contents of a register specified in the instruction. After accessing
the operand, the contents of this register are automatically incremented by a step size ‘d’. The
step size ‘d’ depends on the size of the operand accessed. Only one reference to memory
is required to fetch the operand.
Example: Assume operand size = 2 bytes.
Here, after fetching the operand 6B, the register R(Auto) will be automatically incremented
by 2. The updated value of R(Auto) will then be 3300 + 2 = 3302, and the next operand will
be found at memory address 3302.

Auto-decrement Addressing Mode


This addressing mode is again a special case of register indirect addressing mode where the
effective address of the operand is the contents of a register specified in the instruction minus
step size ‘d’. Before accessing the operand, the contents of this register are automatically
decremented to point to the previous consecutive memory location. -(R).
First, the content of the register is decremented by the step size ‘d’; the step size ‘d’ depends
on the size of the operand accessed. After decrementing, the operand is read.
Only one reference to memory is required to fetch the operand.
Example: Assume operand size = 2 bytes.
Here, the content of the register R(Auto), 3302, is first decremented by 2 to give 3300; the
operand 6C is then fetched from memory address 3300.

Auto-decrement mode works like auto-increment mode, but in the opposite direction. Together
they can be used to implement a stack’s push and pop operations; auto-increment and
auto-decrement modes are useful for implementing “Last-In-First-Out” data structures.

Addressing Mode Example


At memory address 200, a two-word instruction LOAD AC is stored. At location 201, the
address part of the instruction, 500, is stored. At location 202, the next instruction is stored.
The following numbers are stored at the different memory locations, as shown in this table.
Memory Location ( Address) Memory Content


399 450
400 700
500 800
600 900
702 325
800 300

Suppose the content of PC is 200, the content of register R1 is 400, and the XR (index) register
contains 100. Assuming all numbers and addresses are in decimal, determine the content of
AC and the effective address for the following addressing modes:
(a) Direct Addressing (b) Indirect Addressing (c) Relative Addressing (d) Indexed
Addressing (e) Register Indirect Addressing.
Solution :
(1) Direct Addressing: Since the address field of the instruction is 500, in direct mode this
value itself is the effective address.
So Effective Address = 500 and
Operand = M[500] = 800.
(2) Indirect Mode
Effective Address = M[500]=800
Operand = M[800]= 300
(3) Relative Addressing Mode:
Effective Address = PC + 500 = 202 + 500 =702
Operand =325
(4) Indexed Addressing Mode
Effective Address = Index Register (XR) value + 500
= 100 + 500 = 600
Operand = M[600] = 900
(5) Register Indirect Mode
Effective Address = [R1] = 400
Operand = M[400] = 700.

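This worked example lends itself to a small simulation. The Python sketch below (our own illustration; the dictionary M simply encodes the memory table above) computes the effective address and operand for each mode:

# Memory contents from the table above.
M = {201: 500, 399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
address_field = 500          # address part of the LOAD AC instruction
PC, R1, XR = 202, 400, 100   # PC after fetching the two-word instruction

modes = {
    "direct": address_field,         # EA = 500
    "indirect": M[address_field],    # EA = M[500] = 800
    "relative": PC + address_field,  # EA = 202 + 500 = 702
    "indexed": XR + address_field,   # EA = 100 + 500 = 600
    "register indirect": R1,         # EA = [R1] = 400
}
for name, ea in modes.items():
    print(name, "EA =", ea, "operand =", M[ea])
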
Note:
1. PC relative and based register both addressing modes are suitable for program
relocation at runtime.
2. Based register addressing mode is best suitable for writing position-independent codes.
Advantages of Addressing Modes
✓ To give programmers the facilities such as Pointers, counters for loop controls, indexing
of data and program relocation.
✓ To reduce the number of bits in the addressing field of the instruction.

Microprogramming
In any digital computer, the function of the control unit is to initiate sequences of
microoperations. The number of different types of microoperations that are available in a given
system is finite.
Generally, control unit of a digital computer may be designed using one of the following
techniques:
1. Hardwired Control Unit
In this type, the control signals are generated by hardware using conventional logic design
techniques.
2. Microprogrammed Control Unit
In this type, the control variables stored in memory at any given time can be represented by a
string of 1's and 0's called a control word. As such, control words can be programmed to
perform various operations on the components of the system. Each word in control memory
contains within it a microinstruction.
Generally, a microinstruction specifies one or more microoperations. A sequence of
microinstructions forms what is called a microprogram.
Generally, a computer that employs a microprogrammed control unit will have two separate
memories:
1. The main memory
This memory is available to the user for storing programs. The user's program in main memory
consists of machine instructions and data.
2. The control memory
This memory contains a fixed microprogram that cannot be altered by the occasional user. The
microprogram consists of microinstructions that specify various internal control signals for
execution of register microoperations.
Each instruction initiates a series of microinstructions in control memory. These
microinstructions generate the microoperations to:
1. Fetch the instruction from main memory.
2. Evaluate the effective address.
3. Execute the operation specified by the instruction.
4. Finally, return the control to the fetch phase in order to repeat the cycle for the next
instruction.

Block diagram of a microprogrammed control unit, in which the control memory is assumed to be a ROM.


The function of the control address register is to specify the address of the microinstruction,
while the function of the control data register is to hold the microinstruction read from memory.
The microinstruction contains a control word that specifies one or more microoperations for
the data processor. Once these operations are executed, the control must determine the next
address.
The location of the next microinstruction may be the one next in sequence, or it may be located
somewhere else in the control memory. For this reason, it is necessary to use some bits of the
present microinstruction to control the generation of the address of the next microinstruction.
The next address may also be a function of external input conditions. While the
microoperations are being executed, the next address is computed in the next address generator
circuit and then transferred into the control address register to read the next microinstruction.
Thus, a microinstruction contains bits for initiating microoperations in the data processor part
and bits that determine the address sequence for the control memory.
The next address generator is sometimes called a microprogram sequencer, as it determines the
address sequence that is read from control memory.
Depending on the sequencer inputs, the address of the next microinstruction can be specified
in several ways:
1. By incrementing the control address register by one.
2. By loading the control address register with an address from control memory (a branch address).
3. By transferring an external address.
4. By loading an initial address to start the control operations.
The control data register holds the present microinstruction while the next address is computed
and read from memory.
The data register is sometimes called a pipeline register. It allows the execution of the
microoperations specified by the control word simultaneously with the generation of the next
microinstruction. This configuration requires a two-phase clock, one clock applied to the
address register and the other to the data register.
The system can operate without the control data register by applying a single-phase clock to
the address register. The control word and next address information are taken directly from the
control memory.
It must be realized that a ROM operates as a combinational circuit, with the address value as
the input and the corresponding word as the output. The content of the specified word in ROM
remains in the output wires as long as its address value remains in the address register. No
read signal is needed as in a random-access memory. Each clock pulse will execute the
microoperations specified by the control word and also transfer a new address to the control
address register.
In the example that follows, we assume a single-phase clock and therefore do not use a
control data register. In this way, the address register is the only component in the control
system that receives clock pulses. The other two components, the sequencer and the control
memory, are combinational circuits and do not need a clock.
The main advantage of microprogrammed control is the fact that, once the hardware
configuration is established, there should be no need for further hardware or wiring changes.
If we want to establish a different control sequence for the system, all we need to do is specify
a different set of microinstructions for control memory (i.e., a different microprogram residing
in control memory).
Address Sequencing
Each computer has a set of instructions; each instruction has its own microprogram routine in
control memory which generates the microoperations that execute it. When the computer is
turned on, an initial address is loaded into the control address register. This address is usually
the address of the first microinstruction of the instruction fetch routine. The
fetch routine may be sequenced by incrementing the control address register through the rest
of its microinstructions. At the end of the fetch routine, the instruction is loaded in the
instruction register.
The next step is to determine the effective address of the operand. The effective address
computation routine in control memory can be reached through a branch microinstruction,
which is conditioned on the status of the mode bits of the instruction. When the effective
address computation routine is completed, the address of the operand is available in the
memory address register.
The following step is to generate the microoperations that execute the instruction fetched from
memory. Each instruction has its own microprogram routine stored in a given location of
control memory.
The process that transforms the instruction code bits into an address in control memory, where
the routine is located, is referred to as a mapping process. Once the required routine is reached,
the microinstructions that execute the instruction may be sequenced by incrementing the
control address register, but sometimes the sequence of microoperations will depend on values
of certain status bits in processor registers. Microprograms that employ subroutines will require
an external register to store the return address, since the return addresses cannot be stored in
ROM.
When the execution of the instruction is completed, control must return to the fetch routine.
This is accomplished by executing an unconditional branch microinstruction to the first address
of the fetch routine.
In summary, the address sequencing capabilities required in a control memory are:
1. Incrementing of the control address register.
2. Unconditional branch or conditional branch, depending on the status bit conditions.
3. A mapping process from the bits of the instruction to an address for control memory.
4. A facility for subroutine call and return.
Figure below shows a block diagram of a control memory and the associated hardware needed
for selecting the next microinstruction address.
Each microinstruction in control memory contains:
1. A set of bits to initiate microoperations in computer registers.
2. A set of bits to specify the method by which the next microinstruction address is obtained.

The diagram shows four different paths from which the control address register (CAR)
receives the address of the next microinstruction.
1. By incrementer, which increments the content of the control address register by one.
2. By branching; this is achieved by specifying the branch address in one of the fields of the
microinstruction. Conditional branching is obtained by using part of the microinstruction to
select a specific status bit in order to determine its condition.
3. By an external address transferred into the control address register via a mapping logic circuit.
4. The return address for a subroutine is stored in a special register whose value is then used
when the microprogram wishes to return from the subroutine.
Conditional Branching
The branch logic of the Fig. above provides decision-making capabilities in the control unit.
In every digital system, the status conditions are special bits that provide parameter information
such as:
1. The carry-out of an adder.
2. The sign bit of a number.
3. The mode bits of an instruction.
4. Input or output status conditions.
The field that specifies a branch address in the microinstruction, together with the status bits,
controls the conditional branch decisions generated in the branch logic.
The simplest way to implement the branch logic hardware is to test the specified condition and
branch to the indicated address if the condition is met; otherwise, the address register is
incremented.
This can be implemented using a multiplexer. For example, suppose that there are eight status
bit conditions in the system. Therefore, three bits in the microinstruction are used to specify
one of the eight status bit conditions. These three bits provide the selection variables for the
multiplexer. If the selected status bit is in the 1 state, the output of the multiplexer is 1;
otherwise, it is 0. A 1 output in the multiplexer generates a control signal to transfer the
branch address from the microinstruction into the control address register. A 0 output in the
multiplexer causes the address register to be incremented. In this configuration, the
microprogram follows one of two possible paths, depending on the value of the selected status
bit.
An unconditional branch microinstruction can be implemented by loading the branch address
from control memory into the control address register via the multiplexer. This can be
accomplished by fixing the value of one status bit at the input of the multiplexer, so it is always
equal to 1. A reference to this bit by the status bit select lines from control memory causes the
branch address to be loaded into the control address register unconditionally.
Mapping of Instruction
A special type of branch exists when a microinstruction specifies a branch to the first word in
control memory where a microprogram routine for an instruction is located. The status bits for
this type of branch are the bits in the operation code part of the instruction.
For example, suppose a computer has an instruction format as shown in the figure below, in
which four bits are specified for the operation code, allowing up to 16 distinct instructions.
Assume further that the control memory has 128 words, requiring an address of seven bits. As
mentioned before, the control memory includes a microprogram routine for each operation
code that executes the corresponding instruction.

One simple mapping process that converts the 4-bit operation code to a 7-bit address for control
memory is shown in the figure above. This mapping consists of placing a 0 in the most
significant bit of the address, transferring the four operation code bits, and clearing the two
least significant bits of the control address register.
This provides each computer instruction with a microprogram routine having a capacity of four
microinstructions. If a routine needs more than four microinstructions, it can use addresses
1000000 through 1111111. If it uses fewer than four microinstructions, the unused memory
locations would be available for other routines.
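The bit manipulation behind this simple mapping can be sketched in Python as below (our own illustration: a 0 in the MSB, the opcode in the middle, and the two low bits cleared):

# Map a 4-bit opcode to a 7-bit control-memory address: 0 xxxx 00
def map_opcode(opcode):
    assert 0 <= opcode < 16   # 4-bit operation code
    return opcode << 2        # MSB stays 0; two LSBs are cleared

for op in (0b0000, 0b0101, 0b1111):
    print(format(op, "04b"), "->", format(map_opcode(op), "07b"))
# 0101 -> 0010100: each routine gets the four words 0 xxxx 00 .. 0 xxxx 11
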
A more general mapping rule can be achieved by using a ROM to specify the mapping function.
In this configuration, the bits of the instruction specify the address of a mapping ROM. The
contents of the mapping ROM give the bits for the control address register.
The advantages of this mapping process are:
1. The microprogram routine that executes the instruction can be placed in any desired location
in control memory.
2. It provides flexibility for adding new subroutines for new instructions in the control memory
as the need arises.
Subroutines
Subroutines are programs that are used by other routines to accomplish a particular task.
An example of these subroutines is the subroutine needed to generate the effective address of
the operand for an instruction. This is common to all memory reference instructions.
A subroutine can be called from any point within the main body of the microprogram.
Microprogram Example
When the configuration and the microprogrammed control unit of the computer are established,
the designer's task is to generate the microcode for the control memory. This code generation
is called microprogramming.
Computer Configuration
To explain the microprogramming process, let us look at a simple digital computer.

Block diagram of a computer.


This computer consists of:
1. A main memory (2048 × 16) for storing instructions and data.
2. A control memory (128 × 20) for storing the microprogram.
3. Four registers associated with the processor unit;
These registers are:
a. The Accumulator Register (AC).
b. The Program Counter Register (PC).
c. The Address Register (AR).
d. The Data Register (DR).
4. Two registers associated with control unit:
a. The Subroutine Register (SBR).
b. The Control Address Register (CAR).
5. The Arithmetic Logic and Shift Unit.
6. Two multiplexers for transfer of information among the registers and main memory.
From the diagram, it's clear that DR can receive information from AC, PC, or main memory.
AR can receive information either from PC or DR. PC receives information from AR only.
The arithmetic, logic, and shift unit performs microoperations with data from AC and DR and
places the result in AC. The memory receives its address from AR. Data written to memory
comes from DR, while DR receives the data read from memory.
Instruction Format
Figure below presents the instruction format for the assumed computer.
The instruction format consists of three fields:
1. The 1-bit field specified for the addressing mode, symbolized by I.
2. The 4-bit field specified for the operation code (Opcode).
3. The 11-bit field specified for the address.

MICROPROCESSORS

A computer's Central Processing Unit (CPU) built on a single Integrated Circuit (IC) is called
a microprocessor. A digital computer with one microprocessor which acts as a CPU is called
a microcomputer.

It is a programmable, multipurpose, clock-driven, register-based electronic device that reads


binary instructions from a storage device called memory, accepts binary data as input and
processes data according to those instructions and provides results as output.

The microprocessor contains millions of tiny components like transistors, registers, and diodes
that work together.

Block Diagram of a Microcomputer


A microprocessor consists of an ALU, a control unit, and a register array. The ALU performs
arithmetic and logical operations on the data received from an input device or memory. The
control unit controls the instructions and the flow of data within the computer. The register
array consists of registers identified by letters like B, C, D, E, H, and L, and the accumulator.

Evolution of Microprocessors

We can categorize the microprocessor according to the generations or according to the size of
the microprocessor:

First Generation (4 - bit Microprocessors)

The first generation microprocessors were introduced in the year 1971-1972 by Intel
Corporation. It was named Intel 4004 since it was a 4-bit processor.

It was a processor on a single chip. It could perform simple arithmetic and logical operations
such as addition, subtraction, Boolean OR and Boolean AND.

It had a control unit capable of performing control functions like fetching an instruction from
storage memory, decoding it, and then generating control pulses to execute it.

Second Generation (8 - bit Microprocessor)

The second generation microprocessors were introduced in 1973, again by Intel. The Intel 8008
was the first 8-bit microprocessor, which could perform arithmetic and logic operations on
8-bit words; an improved version was the Intel 8080.

Third Generation (16 - bit Microprocessor)

The third generation microprocessors, introduced in 1978, were represented by Intel's 8086,
the Zilog Z8000, and the 80286, which were 16-bit processors with performance like minicomputers.

Fourth Generation (32 - bit Microprocessors)


Several different companies introduced the 32-bit microprocessors, but the most popular one
is the Intel 80386.

Fifth Generation (64 - bit Microprocessors)

From 1995 to now, we are in the fifth generation. After the 80486, Intel came out with a new
processor, namely the Pentium processor, followed by the Pentium Pro CPU, which allows
multiple CPUs in a single system to achieve multiprocessing.

Other improved processors include the Celeron and dual-, quad-, and octa-core processors.

Table: Important Intel Microprocessors

Microprocessor | Year of Invention | Word Length | Memory Addressing Capacity | Pins | Clock | Remarks
4004 | 1971 | 4-bit | 1 KB | 16 | 750 kHz | First microprocessor
8085 | 1976 | 8-bit | 64 KB | 40 | 3-6 MHz | Popular 8-bit microprocessor
8086 | 1978 | 16-bit | 1 MB | 40 | 5-8 MHz | Widely used in PC/XT
80286 | 1982 | 16-bit | 16 MB real, 4 GB virtual | 68 | 6-12.5 MHz | Widely used in PC/AT
80386 | 1985 | 32-bit | 4 GB real, 64 TB virtual | 132 (14x14 PGA) | 20-33 MHz | Contains MMU on chip
80486 | 1989 | 32-bit | 4 GB real, 64 TB virtual | 168 (17x17 PGA) | 25-100 MHz | Contains MMU, cache and FPU; 1.2 million transistors
Pentium | 1993 | 32-bit | 4 GB real; 32-bit address, 64-bit data bus | 237 (PGA) | 60-200 MHz | Contains 2 ALUs, 2 caches, FPU; 3.3 V; 3.3 million transistors
Pentium Pro | 1995 | 32-bit | 64 GB real; 36-bit address bus | 387 (PGA) | 150-200 MHz | Data-flow processor; also contains a second-level cache; 3.3 V
Pentium II | 1997 | 32-bit | - | - | 233-400 MHz | All features of the Pentium Pro plus MMX technology; 3.3 V; 7.5 million transistors
Pentium III | 1999 | 32-bit | 64 GB | 370 (PGA) | 600 MHz-1.3 GHz | Improved version of Pentium II; 70 new SIMD instructions
Pentium 4 | 2000 | 32-bit | 64 GB | 423 (PGA) | 600 MHz-1.3 GHz | Improved version of Pentium III
Itanium | 2001 | 64-bit | 64 address lines | 423 (PGA) | 733 MHz-1.3 GHz | 64-bit EPIC processor

Where,

o PGA - Pin Grid Array


o MMX - MultiMedia eXtensions
o EPIC - Explicitly Parallel Instruction Computing
o SIMD - Single Instruction Multiple Data
o ALU - Arithmetic and Logic Unit
o MMU - Memory Management Unit
o FPU - Floating Point Unit

Basic Terms used in Microprocessor

Here is a list of some basic terms used in microprocessor:

Instruction Set - The group of commands that the microprocessor can understand is called
Instruction set. It is an interface between hardware and software.

Bus - Set of conductors intended to transmit data, address or control information to different
elements in a microprocessor. A microprocessor will have three types of buses, i.e., data bus,
address bus, and control bus.

IPC (Instructions Per Cycle) - It is a measure of how many instructions a CPU is capable of
executing in a single clock cycle.

Clock Speed - It is the number of operations per second that the processor can perform. It can
be expressed in megahertz (MHz) or gigahertz (GHz). It is also called the clock rate.

Bandwidth - The number of bits processed in a single instruction is called bandwidth.

Word Length - The number of bits the processor can process at a time is called the word length
of the processor. An 8-bit microprocessor can process 8-bit data at a time. Word lengths range
from 4 bits to 64 bits, depending upon the type of microcomputer.

Data Types - The microprocessor supports multiple data type formats like binary, ASCII,
signed and unsigned numbers.

Working of Microprocessor

The microprocessor follows a sequence to execute the instruction: Fetch, Decode, and then
Execute.

Initially, the instructions are stored in the memory of the computer in sequential order. The
microprocessor fetches those instructions from memory, then decodes and executes them until
a STOP instruction is met. Then, it sends the result in binary form to the output port. During
these processes, registers store temporary data and the ALU (Arithmetic and Logic Unit)
performs the computing functions.

Features of Microprocessor

Low Cost - Due to integrated circuit technology, microprocessors are available at very low
cost. This reduces the cost of a computer system.

High Speed - Due to the technology involved in it, the microprocessor can work at very high
speed. It can execute millions of instructions per second.

Small Size - A microprocessor is fabricated in a very small footprint due to very-large-scale
and ultra-large-scale integration technology. Because of this, the size of the computer system
is reduced.

Versatile - The same chip can be used for several applications; therefore, microprocessors are
versatile.

Low Power Consumption - Microprocessors use metal-oxide-semiconductor technology,
which consumes less power.

Less Heat Generation - Microprocessors use semiconductor technology, which does not emit
much heat compared to vacuum tube devices.

Reliable - Since microprocessors use semiconductor technology, the failure rate is very low.
Hence they are very reliable.

Portable - Due to the small size and low power consumption microprocessors are portable.
INPUT/OUTPUT SUBSYSTEM

The I/O subsystem of a computer provides an efficient mode of communication between the
central system and the outside environment. It handles all the input-output operations of the
computer system.

Peripheral Devices

Input or output devices that are connected to the computer are called peripheral devices. These
devices are designed to read information into or out of the memory unit upon command from
the CPU and are considered to be part of the computer system. These devices are also called
peripherals.

For example: Keyboards, display units and printers are common peripheral devices.

There are three types of peripherals:

Input peripherals: Allow user input from the outside world to the computer. Example:
Keyboard, Mouse, etc.

Output peripherals: Allow information output from the computer to the outside world.
Example: Printer, Monitor, etc.

Input-Output peripherals: Allow both input (from the outside world to the computer) as well
as output (from the computer to the outside world). Example: Touch screen, etc.

INTERFACES

Interface is a shared boundary between two separate components of the computer system which
can be used to attach two or more components to the system for communication purposes.

There are two types of interfaces:

✓ CPU Interface
✓ I/O Interface

Input-Output Interface

Peripherals connected to a computer need special communication links for interfacing with
the CPU. In a computer system, there are special hardware components between the CPU and
peripherals that control or manage the input-output transfers. These components are called
input-output interface units because they provide communication links between the processor
bus and the peripherals. They provide a method for transferring information between the
internal system and input-output devices.

Input-Output Interface
An input-output interface is the mechanism that helps in transferring information between the
internal storage (memory) and external peripheral devices. A peripheral device is one that
provides input to or accepts output from the computer; such devices are also called
input-output devices. For example, a keyboard and mouse provide input to the computer and
are called input devices, while a monitor and printer provide output from the computer and are
called output devices. Just like external hard drives, some peripheral devices are able to
provide both input and output.

In a microcomputer-based system, peripheral devices require special communication links for
interfacing with the CPU, because there are significant differences between the way peripheral
devices and the CPU operate.

The nature of peripheral devices is electromagnetic and electro-mechanical. The nature of the
CPU is electronic. There is a lot of difference in the mode of operation of both peripheral
devices and CPU.

A synchronization mechanism is also needed because the data transfer rate of peripheral
devices is slower than that of the CPU.

In peripheral devices, data codes and formats differ from those in the CPU and memory.

The operating modes of peripheral devices are different, and each must be controlled so as not
to disturb the operation of the other peripheral devices connected to the CPU.

Additional hardware is therefore needed to resolve the differences between the CPU and
peripheral devices and to supervise and synchronize all input and output transfers.

Functions of Input-Output Interface:

It is used to synchronize the operating speed of CPU with respect to input-output devices.

It selects the input-output device which is appropriate for the interpretation of the input-output
signal.

It is capable of providing signals like control and timing signals.


It makes data buffering possible through the data bus.

It provides error detection.

It converts serial data into parallel data and vice-versa.

It also converts digital data into analog signals and vice-versa.


Modes of I/O Data Transfer

Data transfer between the central unit and I/O devices can be handled in generally three types
of modes which are given below:

✓ Programmed I/O
✓ Interrupt Initiated I/O
✓ Direct Memory Access

Programmed I/O

Programmed I/O operations are the result of I/O instructions written in the computer program.
Each data item transfer is initiated by an instruction in the program. Usually, the program
controls data transfer to and from the CPU and peripheral. Transferring data under programmed
I/O requires constant monitoring of the peripherals by the CPU.

Interrupt Initiated I/O

In the programmed I/O method the CPU stays in the program loop until the I/O unit indicates
that it is ready for data transfer. This is time consuming process because it keeps the processor
busy needlessly.

This problem can be overcome by using interrupt-initiated I/O. In this mode, when the interface
determines that the peripheral is ready for data transfer, it generates an interrupt. After receiving
the interrupt signal, the CPU stops the task it is processing, services the I/O transfer, and then
returns to its previous processing task.

Direct Memory Access

Removing the CPU from the path and letting the peripheral device manage the memory buses
directly would improve the speed of transfer. This technique is known as DMA.

In DMA, the interface transfers data to and from the memory through the memory bus. A DMA
controller manages the transfer of data between peripherals and the memory unit.

Many hardware systems use DMA such as disk drive controllers, graphic cards, network cards
and sound cards etc. It is also used for intra chip data transfer in multicore processors. In DMA,
CPU would initiate the transfer, do other operations while the transfer is in progress and receive
an interrupt from the DMA controller when the transfer has been completed.
Below figure shows block diagram of DMA

An input-output processor (IOP) is a processor with direct memory access capability. In this,
the computer system is divided into a memory unit and number of processors.

Each IOP controls and manages the input-output tasks. The IOP is similar to a CPU except that
it handles only the details of I/O processing. The IOP can fetch and execute its own instructions.
These IOP instructions are designed to manage I/O transfers only.

Block Diagram Of I/O Processor

Below is a block diagram of a computer along with various I/O Processors. The memory unit
occupies the central position and can communicate with each processor.

The CPU processes the data required for solving the computational tasks. The IOP provides a
path for transfer of data between peripherals and memory. The CPU assigns the task of
initiating the I/O program.

The IOP operates independently of the CPU and transfers data between peripherals and memory.

The communication between the IOP and the devices is similar to the program control method
of transfer. And the communication with the memory is similar to the direct memory access
method.
In large scale computers, each processor is independent of other processors and any processor
can initiate the operation.

The CPU acts as master and the IOP acts as slave processor. The CPU assigns the task of
initiating operations, but it is the IOP, not the CPU, that executes the instructions. CPU
instructions provide operations to start an I/O transfer. The IOP requests the CPU's attention
through an interrupt.

Instructions that are read from memory by an IOP are also called commands to distinguish
them from instructions that are read by CPU. Commands are prepared by programmers and are
stored in memory. Command words make the program for IOP. CPU informs the IOP where
to find the commands in memory.

INTERRUPTS

Data transfer between the CPU and the peripherals is initiated by the CPU. But the CPU cannot
start the transfer unless the peripheral is ready to communicate with the CPU. When a device
is ready to communicate with the CPU, it generates an interrupt signal. A number of input-
output devices are attached to the computer and each device is able to generate an interrupt
request.

The main job of the interrupt system is to identify the source of the interrupt. There is also a
possibility that several devices will request simultaneously for CPU communication. Then, the
interrupt system has to decide which device is to be serviced first.

Priority Interrupt

A priority interrupt is a system that decides the order in which various devices that generate
interrupt signals at the same time will be serviced by the CPU. The system has authority to
decide which conditions are allowed to interrupt the CPU while some other interrupt is being
serviced. Generally, devices with high-speed transfer, such as magnetic disks, are given high
priority, and slow devices, such as keyboards, are given low priority.

When two or more devices interrupt the computer simultaneously, the computer services the
device with the higher priority first.

Types of Interrupts:

Hardware Interrupts

When the signal for the processor comes from an external device or hardware, the interrupt is
known as a hardware interrupt.

Let us consider an example: when we press a key on the keyboard, the key press generates an
interrupt signal for the processor to perform a certain action. Such an interrupt can be of two
types:

✓ Maskable Interrupt
The hardware interrupts that can be delayed when a higher-priority interrupt has occurred at
the same time.

✓ Non Maskable Interrupt

The hardware interrupts which cannot be delayed and should be processed by the processor
immediately.

Software Interrupts

The interrupt that is caused by any internal system of the computer system is known as a
software interrupt. It can also be of two types:

✓ Normal Interrupt

The interrupts that are caused by software instructions are called normal software
interrupts.

✓ Exception

Unplanned interrupts which are produced during the execution of some program are called
exceptions, such as division by zero.

Daisy Chaining Priority

This way of deciding interrupt priority consists of a serial connection of all the devices that can
generate an interrupt signal. The device with the highest priority is placed in the first position,
followed by lower-priority devices, and the device with the lowest priority of all is placed
last in the chain.

In a daisy-chaining system, all the devices are connected in serial form, and the interrupt
request line is common to all devices. If any device asserts its interrupt signal (low level), the
interrupt line goes to the low-level state and enables the interrupt input of the CPU. When there
is no interrupt, the interrupt line stays in the high-level state. The CPU responds to an interrupt
by enabling the interrupt acknowledge line. This signal is received by device 1 at its PI
(priority in) input. The acknowledge signal passes to the next device through the PO (priority
out) output only if device 1 is not requesting an interrupt.

The following figure shows the block diagram for daisy chaining priority system.
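
The way the acknowledge signal propagates down the chain can be sketched as below (our own Python illustration; the list requests is ordered from highest to lowest priority):

# Daisy-chain priority: the acknowledge signal stops at the first
# (highest-priority) device whose interrupt request is active.
def daisy_chain_grant(requests):     # requests[0] has the highest priority
    for position, requesting in enumerate(requests):
        if requesting:               # device keeps the acknowledge (PI=1, PO=0)
            return position          # this device gets serviced
    return None                      # no interrupt pending

print(daisy_chain_grant([0, 1, 1]))  # prints 1: the second device in the chain wins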
PARALLEL COMPUTING

It is the use of multiple processing elements simultaneously for solving any problem. Problems
are broken down into instructions and are solved concurrently as each resource that has been
applied to work is working at the same time.

Advantages of Parallel Computing over Serial Computing are as follows:

➢ It saves time and money as many resources working together will reduce the time and
cut potential costs.
➢ It can be impractical to solve larger problems on Serial Computing.
➢ It can take advantage of non-local resources when the local resources are finite.
➢ Serial Computing ‘wastes’ the potential computing power, thus Parallel Computing
makes better work of the hardware.

Types of Parallelism:

Bit-level parallelism –

It is the form of parallel computing based on increasing the processor's word size. It reduces
the number of instructions that the system must execute in order to perform a task on large-
sized data.

Example: Consider a scenario where an 8-bit processor must compute the sum of two 16-bit
integers. It must first sum up the 8 lower-order bits, then add the 8 higher-order bits, thus
requiring two instructions to perform the operation. A 16-bit processor can perform the
operation with just one instruction.

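The sketch below illustrates this in Python by mimicking what an 8-bit processor must do: the 16-bit sum is computed in two 8-bit steps with an explicit carry (the masks and names are our own):

# Add two 16-bit integers using only 8-bit operations, as an 8-bit CPU would.
def add16_on_8bit(a, b):
    lo = (a & 0xFF) + (b & 0xFF)                         # step 1: add the low bytes
    carry = lo >> 8                                      # carry out of the low byte
    hi = ((a >> 8) & 0xFF) + ((b >> 8) & 0xFF) + carry   # step 2: high bytes + carry
    return ((hi & 0xFF) << 8) | (lo & 0xFF)

print(add16_on_8bit(40000, 20000) == (40000 + 20000) & 0xFFFF)  # True
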
Instruction-level parallelism –

Without instruction-level parallelism, a processor can only issue one instruction in each clock
cycle. Instructions can be re-ordered and grouped so that they are later executed concurrently
without affecting the result of the program. This is called instruction-level parallelism.

Task Parallelism –

Task parallelism employs the decomposition of a task into subtasks and then allocating each
of the subtasks for execution. The processors perform the execution of sub-tasks concurrently.

Data-level parallelism (DLP) –

Instructions from a single stream operate concurrently on several data items. DLP is limited
by non-regular data manipulation patterns and by memory bandwidth.

Why parallel computing?

• The real world runs in a dynamic nature, i.e., many things happen at the same time
but at different places concurrently. The resulting data is extremely large and hard to manage.
• Real-world data needs more dynamic simulation and modeling, and for achieving the
same, parallel computing is the key.
• Parallel computing provides concurrency and saves time and money.
• Complex, large datasets and their management can be organized only by using
parallel computing's approach.
• Ensures the effective utilization of the resources. The hardware is guaranteed to be used
effectively whereas in serial computation only some part of the hardware was used and
the rest rendered idle.
• Also, it is impractical to implement real-time systems using serial computing.

Applications of Parallel Computing:

o Databases and Data mining.


o Real-time simulation of systems.
o Science and Engineering.
o Advanced graphics, augmented reality, and virtual reality.

Limitations of Parallel Computing:

❖ It introduces issues such as communication and synchronization between multiple
sub-tasks and processes, which are difficult to achieve.
❖ The algorithms must be managed in such a way that they can be handled in a parallel
mechanism.
❖ The algorithms or programs must have low coupling and high cohesion. But it’s
difficult to create such programs.
❖ Only more technically skilled and expert programmers can code a parallelism-based
program well.

Future of Parallel Computing: The computational landscape has undergone a great transition
from serial computing to parallel computing. Tech giants such as Intel have already taken a
step towards parallel computing by employing multicore processors. Parallel computation will
revolutionize the way computers work in the future, for the better. With all the world
connecting to one another even more than before, parallel computing plays a better role in
helping us stay connected. With faster networks, distributed systems, and multi-processor
computers, it becomes even more necessary.

COMPUTER ARITHMETIC (Continuation)

The ALU is that part of the computer that actually performs arithmetic and logical operations
on data. All of the other elements of the computer system (control unit, registers, memory, I/O)
are there mainly to bring data into the ALU for it to process and then to take the results back
out. We have, in a sense, reached the core or essence of a computer when we consider the ALU.

An ALU and, indeed, all electronic components in the computer are based on the use of simple
digital logic devices that can store binary digits and perform simple Boolean logic operations.
Fixed point Number Representation

Fixed point numbers are also known as whole numbers or Integers. The number of bits used in
representing the integer also implies the maximum number that can be represented in the
system hardware. However, for efficiency of storage and operations, one may choose to
represent an integer with one byte, two bytes, four bytes, or more. This space allocation follows
from the definition used by the programmer when declaring a variable as a short or long
integer, and from the Instruction Set Architecture.

In addition to the bit length definition for integers, we also have a choice to represent them as
below:

• Unsigned Integer: A positive number, including zero, can be represented in this format.
All the allotted bits are utilized in defining the number. So if one is using 8 bits to
represent an unsigned integer, the number of values that can be represented is 2^8,
i.e., the range is "0" to "255". If 16 bits are used, the number of values is 2^16, i.e.,
the range is "0 to 65535".
• Signed Integer: In this format, negative numbers, zero, and positive numbers can be
represented. A sign bit indicates whether the magnitude is positive or negative. There
are three possible representations for signed integers: Sign Magnitude format, 1's
Complement format, and 2's Complement format.

Signed Integer – Sign Magnitude format: Most Significant Bit (MSB) is reserved for
indicating the direction of the magnitude (value). A "0" on MSB means a positive number and
a "1" on MSB means a negative number. If n bits are used for representation, n-1 bits indicate
the absolute value of the number.

Examples for n=8:


0010 1111 = + 47 Decimal (Positive number)
1010 1111 = - 47 Decimal (Negative Number)
0111 1110 = +126 (Positive number)
1111 1110 = -126 (Negative Number)
0000 0000 = + 0 (Positive Number)
1000 0000 = - 0 (Negative Number)
Although this method is easy to understand, Sign Magnitude representation has several
shortcomings like

• Zero can be represented in two ways causing redundancy and confusion.


• Although n bits are used, the magnitude is limited to 2^(n-1) - 1, so the representable
range is only -(2^(n-1) - 1) to +(2^(n-1) - 1).
• The separate sign bit makes addition and subtraction more complicated. Also,
comparing two numbers is not straightforward.

Signed Integer – 1’s Complement format: In this format too, the MSB is reserved as the sign
bit. The difference is in representing negative numbers: the magnitude part is inverted (every
bit is complemented), hence the name 1’s Complement form. Positive numbers are represented
as in plain binary. Let us see some examples to better our understanding.

Examples for n=8:


0010 1111 = + 47 Decimal (Positive number)
1101 0000 = - 47 Decimal (Negative Number)
0111 1110 = +126 (Positive number)
1000 0001 = -126 (Negative Number)
0000 0000 = + 0 (Positive Number)
1111 1111 = - 0 (Negative Number)
Converting a given binary number to its 2's complement form

Step 1: -x = x' + 1, where x' is the one's complement of x.

Step 2: To extend the data width of the number, fill up with sign extension, i.e., the MSB is
used to fill the new bits.

Example: -47 decimal over 8-bit representation

The binary equivalent of + 47 is 0010 1111


The binary equivalent of - 47 is 1010 1111 (Sign Magnitude Form)
1's complement equivalent is 1101 0000
2’s complement equivalent is 1101 0001

As you can see, in 2's complement form zero is not represented with redundancy; there is only
one way of representing zero. The complexity of arithmetic operations is also reduced in 2's
complement representation: subtraction is done as addition.

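The three signed formats can be compared with a short Python sketch for n = 8 bits (our own illustration; the helper names are hypothetical):

# Three ways of representing a signed integer in 8 bits.
def sign_magnitude(x):
    return (0x80 | abs(x)) if x < 0 else x    # MSB = sign, rest = magnitude

def ones_complement(x):
    return x if x >= 0 else (~abs(x)) & 0xFF  # invert all bits of the magnitude

def twos_complement(x):
    return x & 0xFF                           # -x = one's complement of x, plus 1

for x in (47, -47, -126):
    print(x, format(sign_magnitude(x), "08b"),
             format(ones_complement(x), "08b"),
             format(twos_complement(x), "08b"))
# -47 -> 10101111 (sign magnitude), 11010000 (1's), 11010001 (2's), as above
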
In a computer, the basic arithmetic operations are Addition and Subtraction. Multiplication and
Division can always be managed with successive addition or subtraction respectively.
However, hardware algorithms are implemented for Multiplication and Division.

It is to be recollected that computers deal with binary numbers unless special hardware is
implemented for dealing with other number systems. Although instructions may be available
for treating signed and unsigned operations, the programmer must deal with the numbers and
handling of the result. The hardware assists the programmer by way of appropriate instructions
and flags.

Addition

Adding two numbers is an addition. We may add signed or unsigned numbers. When we add
two numbers, say 8 and 5, the result is 13; i.e., while adding two single-digit numbers, we may
get a two-digit number in the result. A similar possibility exists in the binary system too. The
rules of binary addition are:

0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 10
Examples (a-e) of unsigned binary addition are given in the figure above.
Adder

The hardware circuit which executes this addition is called Adder. There are two types of
adders namely Half adder and Full adder. Basic adder circuit does 1-bit addition and is extended
for n-bit addition. The adder circuit characteristics are detailed by a circuit, a truth table,
Formula and a block symbol. The adder circuits are constructed from logic gates which satisfy
the formula as per truth table. These are also called combinational logic. A Combinational logic
output reflects the input without clocking.

The Half Adder (HA) has two inputs (A, B) and two outputs (Sum and Carry). The Sum is the
XOR of the inputs, while the Carry is the AND of the inputs. The half adder is detailed in the
figure above.

A Full Adder (FA) also performs 1-bit addition but takes 3 inputs (A, B, and Ci) and produces
two outputs (Sum and Carry). Like the HA, the FA generates a result consisting of Sum (S)
and Carry out (Cout). Cout is used as Ci+1 when cascading for multiple bits of a word. The
full adder is detailed in the figure below. A full adder can also be constructed using half adder
blocks, as in the figure below.
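
A minimal Python sketch of these equations (Sum = A XOR B, Carry = A AND B for the half adder; the full adder built from two half adders plus an OR of the carries), extended to a ripple-carry adder:

def half_adder(a, b):
    return a ^ b, a & b           # (Sum, Carry)

def full_adder(a, b, cin):
    s1, c1 = half_adder(a, b)     # first half adder
    s2, c2 = half_adder(s1, cin)  # second half adder adds the carry-in
    return s2, c1 | c2            # Sum, Cout = OR of the two carries

# Cascade full adders to build an n-bit ripple-carry adder (LSB-first bit lists).
def ripple_add(a_bits, b_bits):
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

print(ripple_add([1, 0, 1], [1, 1, 0]))  # 5 + 3 = 8 -> ([0, 0, 0], 1)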
Subtraction

Subtraction is finding the difference of B from A, i.e., A - B. The basis of binary subtraction is:

0 - 0 = 0, 0 - 1 = 1 (with a borrow of 1), 1 - 0 = 1, 1 - 1 = 0

Of course, the usual borrow logic from the adjacent digit applies, as in the case of decimal
numbers. Examples of signed binary subtraction are as below:
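
Since machines subtract by adding the 2's complement, here is a minimal C sketch (our own, assuming an 8-bit width) of A - B computed as A + (~B + 1):

#include <stdio.h>

int main(void)
{
    unsigned char a = 13, b = 5;

    /* A - B is computed as A + (~B + 1); the carry out of the MSB
       is simply discarded by the 8-bit register width.            */
    unsigned char diff = a + (unsigned char)(~b + 1);

    printf("13 - 5 = %u\n", diff);   /* prints 8 */
    return 0;
}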

Multiplication
Recall in detail how we do multiplication using pen and paper. It then becomes easier to
visualize how a hardware algorithm can implement it.

Multiplicand M = 12          1100
Multiplier   Q = 11        x 1011
                         --------
                             1100
                            1100
                           0000
                          1100
                         --------
Product P = 132          10000100

As you see, we start with the LSB of the multiplier Q, multiply the multiplicand by it, and
jot down the partial product. Then we use the next higher digit of Q to multiply the
multiplicand; this time the partial product is jotted down shifted left to match that Q-digit
position. This is repeated until all the digits of Q are used up, and then the partial
products are summed. Multiplying 12 x 11 this way gives 132, with both the working and the
product in binary. Binary multiplication is much simpler than decimal multiplication: since
each multiplier digit is either 0 or 1, every step is equivalent to either adding the
multiplicand in the proper shifted position or adding 0's. Essentially, then, binary
multiplication is done by a sequence of shifts and additions of the multiplicand.
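A sketch of this shift-and-add idea in C (our own illustration of the principle, not the exact hardware algorithm; the function name is invented):

#include <stdio.h>

/* Examine the multiplier bit by bit: when the current bit is 1, add
   the correspondingly shifted multiplicand to the running product. */
static unsigned shift_add_multiply(unsigned m, unsigned q)
{
    unsigned product = 0;
    while (q != 0) {
        if (q & 1u)      /* multiplier bit is 1: add multiplicand */
            product += m;
        m <<= 1;         /* shift multiplicand left one position  */
        q >>= 1;         /* move to the next multiplier bit       */
    }
    return product;
}

int main(void)
{
    printf("12 x 11 = %u\n", shift_add_multiply(12, 11));  /* prints 132 */
    return 0;
}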
Division
In a simple approach, division is successive subtraction. Arithmetic division is more
complex than multiplication in the sense that a divide instruction requires more time to
execute. Binary division is much simpler than decimal division because the quotient digits
are either 0 or 1, and there is no need to estimate how many times the divisor fits into the
dividend or partial remainder. The objective of a division is to divide the Dividend by the
Divisor and obtain the Quotient and Remainder. Let us see the example of dividing 213 by 5
in binary form. The workout is as below.
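The original workout figure is not reproduced here; the trace below is our own reconstruction of 213 / 5 in binary (213 = 1101 0101, divisor 5 = 101), following the steps listed after it:

Step   Bits in hand         Quotient bit   Partial remainder
1      110  (first 3 bits)   1             110 - 101 = 001
2      0011 (bring down 1)   0             011
3      0110 (bring down 0)   1             110 - 101 = 001
4      0011 (bring down 1)   0             011
5      0110 (bring down 0)   1             110 - 101 = 001
6      0011 (bring down 1)   0             011

Quotient = 101010 (42), Remainder = 011 (3)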

On dividing, we obtain 42 as quotient and 3 as remainder. Observe that:


• The number of bits taken at the first instance equals the number of bits in the divisor;
these bits are considered from the MSB onwards. In this case, it is 3 bits at a time.
• This set of bits from the dividend is compared with the divisor. If the dividend set is
greater than or equal to the divisor, a '1' is placed in the quotient; else a '0' is placed
in the quotient.
• The partial remainder at this stage is derived by subtracting the divisor from the
dividend set of bits.
• The next most significant bit from the dividend is appended to the partial remainder.
• The divisor is, in effect, shifted right relative to the dividend.
• Steps 2 to 5 are repeated until all the bits in the dividend are used up.
In the hardware implementation of division, the dividend or partial remainder is shifted to
the left rather than shifting the divisor to the right; the two numbers are thus left in the
required relative position. Subtraction is achieved by adding the 2's complement of the
divisor to the dividend. The information about the relative magnitudes, i.e. whether the
dividend set is greater than the divisor, is available from the end-carry.
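
A C sketch of this left-shift, subtract-and-test scheme (a simplified restoring division for 8-bit operands; register widths and names are our own):

#include <stdio.h>

/* Restoring division: shift the next dividend bit into the partial
   remainder, then test whether the divisor can be subtracted. In
   hardware the test is done by adding the 2's complement of the
   divisor and inspecting the end-carry; here we simply compare.   */
static void restoring_divide(unsigned char dividend, unsigned char divisor,
                             unsigned char *quotient, unsigned char *remainder)
{
    unsigned rem = 0, q = 0;

    for (int i = 7; i >= 0; i--) {
        rem = (rem << 1) | ((dividend >> i) & 1u); /* shift in next bit */
        if (rem >= divisor) {
            rem -= divisor;                        /* subtract divisor  */
            q = (q << 1) | 1u;                     /* quotient bit = 1  */
        } else {
            q = (q << 1);                          /* quotient bit = 0  */
        }
    }
    *quotient  = (unsigned char)q;
    *remainder = (unsigned char)rem;
}

int main(void)
{
    unsigned char q, r;
    restoring_divide(213, 5, &q, &r);
    printf("213 / 5 = %u remainder %u\n", q, r);   /* 42 remainder 3 */
    return 0;
}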

Floating Point Number system

The maximum value representable as an n-bit whole number is about 2^n. In the scientific
world, we come across numbers like the mass of an electron, 9.10939 x 10^-31 kg, or the
velocity of light, 2.99792458 x 10^8 m/s. Imagine writing such a number on a piece of paper
without an exponent and converting it into binary for computer representation! It makes no
sense to write a number in a non-readable or non-processible form. Hence we write such very
large or very small numbers using an exponent and a mantissa. This is said to be Floating
Point representation or real number representation. The real number system has infinitely
many values between 0 and 1.

Representation in computer

Unlike the two's complement representation for integer numbers, a Floating Point number uses
sign and magnitude representation for the mantissa, while the exponent (itself a signed
quantity) is stored in a biased form, as we shall see. In the number 9.10939 x 10^-31, in
decimal form, -31 is the Exponent and 9.10939 is the Fraction. Mantissa, Significand, and
Fraction are synonymously used terms. In the computer, the representation is binary and the
binary point is not fixed. For example, a number, say, 23.345 can be written as
2.3345 x 10^1, or 0.23345 x 10^2, or 2334.5 x 10^-2. The representation 2.3345 x 10^1 is
said to be in normalized form.

Floating-point numbers usually occupy one or more full words in memory, as we need to allot
a sign bit, a few bits for the exponent, and many bits for the mantissa. There are standards
for such allocation, which we will see shortly.

IEEE 754 Floating Point Representation

We have two standards from IEEE, known as Single Precision and Double Precision. These
standards enable portability among different computers. The picture below shows Single
Precision. Single Precision uses a 32-bit format while Double Precision uses a 64-bit word
length. As the name suggests, Double Precision can represent fractions with larger accuracy.
In both cases, the MSB is the sign bit for the mantissa part, followed by the exponent and
then the mantissa. The exponent does not carry a separate sign bit; it is stored in biased
(excess) form.
Single Precision Floating Point Representation Standard

It is to be noted that in Single Precision, with 8 exponent bits (excess-127), we can
represent exponents in the range of -126 to +127. It is possible that, as a result of
arithmetic operations, the resulting exponent does not fit in this range. This situation is
called overflow in the case of a too-large positive exponent and underflow in the case of a
too-small negative exponent. The Double Precision format has 11 bits for the exponent,
meaning exponents from -1022 to +1023 can be represented. The programmer has to choose
between Single Precision and Double Precision declarations using his knowledge of the data
being handled.
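
To see the single-precision layout concretely, the C sketch below (our own illustration) unpacks the sign, biased exponent, and fraction fields of a float:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    float f = -6.5f;                    /* -1.101b x 2^2                  */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);     /* reinterpret the 32-bit pattern */

    uint32_t sign     = bits >> 31;            /* 1 bit          */
    uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8 bits, biased */
    uint32_t fraction = bits & 0x7FFFFFu;      /* 23 bits        */

    /* The stored exponent is biased by 127 (excess-127). */
    printf("sign = %u, exponent = %d, fraction = 0x%06X\n",
           sign, (int)exponent - 127, fraction);
    return 0;
}

For -6.5 this prints sign = 1, exponent = 2, fraction = 0x500000 (the bits of .101 followed by zeros).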

Double Precision Floating Point Representation Standard

Floating Point operations on a regular CPU are very slow. Generally, a special-purpose
processor known as a Co-processor is used; this Co-processor works in tandem with the main
CPU. The programmer should use the float declaration only if the data is genuinely in
real-number form; float declarations are not to be used generously.

Decimal Numbers Representation

Decimal numbers (radix 10) are represented and processed in the system with the support of
additional hardware. We deal with numbers in decimal format in everyday life. Some machines
implement decimal arithmetic hardware too, just as some implement floating-point arithmetic
hardware. In such a case, the CPU uses decimal numbers in BCD (binary coded decimal) form
and performs BCD arithmetic operations. BCD operates on radix 10, and this hardware operates
without conversion to pure binary. A nibble (4 bits) is used to represent each decimal digit
in packed BCD form. BCD operations require not only special hardware but also a decimal
instruction set.
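
A small C sketch of packed BCD (our own illustration): each decimal digit occupies one nibble, and BCD addition must correct any nibble that exceeds 9 by adding 6:

#include <stdio.h>

/* Add two 2-digit packed-BCD bytes with the classic +6 decimal adjust. */
static unsigned char bcd_add(unsigned char a, unsigned char b)
{
    unsigned sum = a + b;
    if ((sum & 0x0F) > 9 || (a & 0x0F) + (b & 0x0F) > 9)
        sum += 0x06;                   /* correct the low digit  */
    if ((sum & 0xF0) > 0x90)
        sum += 0x60;                   /* correct the high digit */
    return (unsigned char)sum;
}

int main(void)
{
    unsigned char a = 0x38, b = 0x25;  /* packed BCD for 38 and 25 */
    printf("38 + 25 = %02X (packed BCD)\n", bcd_add(a, b));  /* prints 63 */
    return 0;
}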
