
Republic of Cameroon
Peace – Work – Fatherland
Ministry of Higher Education
Experiential Higher Institute of Science and Technology (EXHIST) – Yaoundé
Department of Computer Engineering
School Year: 2022/2023

TOPIC: COMPUTER SYSTEM ARCHITECTURE

Class: HND 1 (One)    By: MVOGO BILEGUE Edouard

Just as buildings do, every computer has a visible structure, referred to as its architecture. The
architecture of a building can be examined at various levels of detail: the number of stories,
the size of the rooms, the placement of doors and windows, and so on. One can likewise look
at a computer's architecture at similar levels of detail of its basic hardware elements, which in
turn depend on the type of computer (personal computer, supercomputer, etc.) required.
Computer architecture is defined as the science of selecting and interconnecting hardware
components to create computers that meet functional, performance and cost goals. It can also
be described as the logical structure of the computer system. The computer architecture forms
the backbone for building successful computer systems and largely determines a computer
system's quality attributes, such as performance and reliability.

Learning objectives
After studying this chapter, students should be able to:
 Define architecture and state the difference between computer architecture and
computer organization.
 State and describe the abstraction layers of a computer architecture
 Name some examples of architecture and describe the functioning of the Von
Neumann architecture
 State the difference between RISC and CISC instruction sets
 State and explain some examples of modern processors

Contents

I. INTRODUCTION TO COMPUTER SYSTEM
II. WHAT IS COMPUTER ARCHITECTURE?
III. ABSTRACTION LAYERS OF A COMPUTER ARCHITECTURE
IV. THE VON NEUMANN ARCHITECTURE
V. THE INSTRUCTION FORMAT
VI. INSTRUCTION SET
VII. INTERRUPT AND POLLING
VIII. OTHER TYPES OF ARCHITECTURE
IX. MODERN MICROPROCESSOR ARCHITECTURE


I. INTRODUCTION TO COMPUTER SYSTEM

A computer can be viewed as a system, which consists of a number of interrelated
components that work together with the aim of converting data into information. In a
computer system, processing is carried out electronically, usually with little or no
intervention from the user. The physical parts that make up a computer (the central
processing unit, input, output and storage unit) are known as hardware. Any hardware device
connected to the computer or any part of the computer outside the CPU and the working
memory is known as a peripheral; for example, keyboard, mouse and monitor.

There are four components required for the implementation of a computerized input-process-
output model:

1. The computer hardware, which provides the physical mechanisms to input and
output data, to manipulate and process data, and to electronically control the various
input, output, and storage components.
2. The software, both application and system, which provides instructions that tell the
hardware exactly what tasks are to be performed and in what order.
3. The data that is being manipulated and processed. This data may be numeric, it may
be alphanumeric, it may be graphic, or it may take some other form, but in all cases it
must be representable in a form that the computer can manipulate.
4. The communication component, which consists of hardware and software that
transport programs and data between interconnected computer systems.

II. WHAT IS COMPUTER ARCHITECTURE?

Computer architecture is a specification detailing how a set of software and hardware
technology standards interact to form a computer system or platform. In short, computer
architecture refers to how a computer system is designed and what technologies it is
compatible with.

There are two main categories of computer architecture:

1. Instruction Set Architecture (ISA): This is the embedded programming language of
the central processing unit. It defines the CPU's functions and capabilities based on
what programming it can perform or process. This includes the word size, processor
register types, memory addressing modes, data formats and the instruction set that
programmers use.

2. Microarchitecture: Otherwise known as computer organization, this type of
architecture defines the data paths, data processing and storage elements, as well as
how they should be implemented to realize the ISA. The computer organization is the
set of resources that realize the architecture, which include the CPU, the memory and
the I/O controllers. These are digital systems with registers, buses, ALUs, sequencers, etc.

Computer Architecture = Instruction Set Architecture + Computer Organization. Instruction
Set Architecture (ISA) is what the computer does (logical view) and Computer Organization is
how the ISA is implemented (physical view).

III. ABSTRACTION LAYERS OF A COMPUTER ARCHITECTURE

Computer systems span many levels of detail, which in computer science we call levels of
abstraction. Abstractions help us express intangible concepts in visible representations
that can be manipulated. In layered architecture, complex problems can be segmented
into smaller and more manageable forms. Each layer is specialized for a specific function.
Team development is possible because of this logical segmentation: a team of programmers
can build the system, with the work sub-divided along clear boundaries.

Figure 1 illustrates another view of a computer system, which is comprised of different levels
of language and means of translating these languages into lower-level languages. Finally, the
microprogram is loaded onto the hardware.

Figure 1. A view of levels of abstraction in computer systems


III.1 Application Layer:

This layer indicates the set of applications intended for the computer. Ideally, all applications
can be run on a computer. However, in practice the computer is designed to “efficiently” run
a subset of them. For example, a computer runs scientific applications, a different computer
runs business applications, etc.

III.2 Computational Methods Layer:

This layer is highly theoretical and abstract. The computational method (i) determines
characteristics of items (data and other) and work (operations), (ii) describes how operations
initiate each other during execution, i.e. which operation is followed by which, thereby
determining the order of performing operations, and (iii) implicitly determines the amount of
parallelism among the operations.

III.3 Algorithm Layer:


The algorithm for an application specifies major steps to generate the output. The algorithm
follows the computational method chosen. An algorithm is abstract and short. It is
independent of high-level languages.

III.4 High-Level Language Layer:


The algorithm developed for an application is coded in a high-level language, such as
Fortran, C, C++, Java, etc.

III.5 Operating Systems Layer:


This layer interfaces with hardware. That is, it hides hardware details from the programmer
and provides security, stability and fairness in the computing system. Thus, this layer adds
more code to run on behalf of the application. The layer also handles interrupts and
input/output operations, and facilitates the design of functionality independent of the hardware.

III.6 Architecture Layer:


The architecture layer is the hardware/software interface. Its elements include the machine
language instruction set, register sets, the memory and Input/Output structures among others.

III.7 Microarchitecture Layer:

This layer consists of digital systems. A computer, which is a digital system, consists of at
least three smaller digital systems: the processor (CPU), the memory and the Input/Output
controller. A digital system consists of registers, buses, ALUs, sequencers, etc. Other names
used for this layer are organization and register transfer level (RTL).

III.8 Logic Layer:


This layer consists of digital circuits. Digital circuits form the digital systems of the
microarchitecture level. Digital circuits use two types of components: gates and flip-flops. A
gate outputs 1 or 0 depending on its current input values, i.e. the output now is a function of
the inputs now. The most common gates used are AND, OR, NOT, NAND and NOR gates.
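
To make the idea that "the output now is a function of the inputs now" concrete, here is a
minimal sketch in C that tabulates two such gates; the choice of AND and NAND is
illustrative only:

#include <stdio.h>

/* Truth tables for two combinational gates: the output depends only on
   the current inputs, with no stored state (unlike a flip-flop). */
int main(void)
{
    printf(" a b | AND NAND\n");
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            printf(" %d %d |  %d    %d\n", a, b, a & b, !(a & b));
    return 0;
}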

III.9 Transistor Layer:

This layer consists of digital electronic circuits. Digital electronic circuits are used to build
digital circuits. That is, digital electronic circuits implement gates (also flip-flops). Digital
electronic circuits consist of transistors, resistors, capacitors, diodes, etc. Transistors are the
main component and so this level is often called the transistor level. Transistors in these
circuits are used as on-off switches.

IV. THE VON NEUMANN ARCHITECTURE

A very good example of computer architecture is Von Neumann architecture, which is still
used by most types of computers today. This was proposed by the mathematician John Von
Neumann in 1945. It is comprised of the five classical components (input, output,
memory, datapath, and control). The processor is divided into an arithmetic logic unit
(ALU) and a control unit, a method of organization that persists to the present. Within the
ALU, an accumulator supports efficient addition or incrementation of values corresponding
to variables such as loop indices.

Fig 2: Schematic diagram of von Neumann architecture

The Von Neumann Architecture comprises the following components: Central Processing
Unit (CPU), Input Unit, Output Unit, and Storage Unit. The diagram below shows the
logical functioning of all those components.


Fig 3: Functioning of the Von Neumann Architecture

The von Neumann architecture has a significant disadvantage: its speed is dependent on the
bandwidth, or throughput, of the datapath between the processor and memory. This is called
the von Neumann bottleneck.

IV.1 Central Processing Unit

The central processing unit, also known as the processor, is the brain of the computer system:
it processes data (input) and converts it into meaningful information (output). It is referred
to as the administrative section of the computer system, since it interprets the data and
instructions, coordinates the operations and supervises their execution. The CPU works with
data in discrete form, that is, either 1 or 0. Some of the basic functions of a CPU are as follows:

 It issues commands to all parts of the computer system.
 It controls the sequence of operations as per the stored instructions.
 It stores data as well as programs (instructions).
 It performs the data-processing operation and sends the results to the output unit.

The CPU consists of three main subsystems: the arithmetic/logic unit (ALU), the control unit
(CU) and the registers. These three subsystems work together to provide operational
capabilities to the computer.

IV.1.1 Arithmetic Logic Unit (ALU):

This unit performs the arithmetic (add and subtract) and logical operations (and, or) on the
available data. Whenever an arithmetic or logical operation is to be performed, the required
data are transferred from the memory unit to the ALU, the operation is performed, and the
result is returned to the memory unit. Before the completion of the processing, data may need
to be transferred back and forth several times between these two sections. The ALU comprises two
units: an arithmetic unit and a logic unit.

 Arithmetic Unit: The arithmetic unit contains the circuitry that is responsible for
performing the actual computing and carrying out the arithmetic calculations such as
addition, subtraction, multiplication and division. It can perform these operations at a
very high speed.
 Logic Unit: The logic unit enables the CPU to perform logical operations based on the
instructions provided. These operations are logical comparisons between data items.
The logic unit can compare numbers, letters or special characters and can then take
action based on the result of the comparison.

IV.1.2 Control Unit:

The control unit can be thought of as the heart of the CPU. It checks the correctness of the
sequence of operations. It fetches the program instructions from the memory unit, interprets
them and ensures correct execution of the program. It also controls the input/output devices
and directs the overall functioning of the other units of the computer.

Figure 5 illustrates how the control unit instructs the other parts of the CPU (i.e. the ALU and
registers) and the I/O devices on what to do and when to do it. In addition, it determines what
data are needed, where they are stored and where to store the results of the operation, and
sends the control signals to the devices involved in the execution of the instructions. It
administers the movement of the large amounts of instructions and data used by the computer.
To maintain the proper sequence of events required for any processing task, the control unit
uses clock inputs. Thus, the control unit repeats a set of four basic operations: fetching,
decoding, executing and storing.


Figure 5: The control unit

The four basic operations are explained as follows:

1. Fetching: It is the process of obtaining a program instruction or data item from the
memory.
2. Decoding: It is the process of translating the instruction into commands the computer
can execute.
3. Executing: It is the process of carrying out the commands.
4. Storing: It is the process of writing the results to the memory.

IV.1.3 Registers:

These are the special-purpose, high-speed temporary memory units that can hold varied
information such as data, instructions, addresses and intermediate results of calculations.
Essentially, they hold the information that the CPU is currently working on. Registers can be
considered the CPU's working memory, an additional storage location that provides the
advantage of speed. Registers work under the direction of the control unit to accept, hold and
transfer instructions or data and perform arithmetic or logical comparisons at high speed. The
control unit uses a data storage register in a similar way as a store owner uses a cash register:
as a temporary, convenient place to store transactions. As soon as a particular instruction or
piece of data is processed, the next instruction immediately replaces it, and the information
that results from the processing is returned to main memory. Figure 6 shows the various types
of registers present inside a CPU.

Fig 6: Registers in CPU


Program Counter (PC): keeps track of the next instruction to be executed.
Instruction Register (IR): holds the instruction to be decoded by the control unit.
Memory Address Register (MAR): holds the address of the next location in memory to be
accessed.
Memory Buffer Register (MBR): stores data either coming to the CPU or being transferred
by the CPU.
Accumulator (ACC): a general-purpose register used for storing temporary results and
results produced by the arithmetic logic unit.
Data Register (DR): stores the operands and other data.

Besides these, a processor can have many other registers; those discussed above are the most
basic and essential registers for any CPU.

The size or length of each register is determined by its function. For example, the memory
address register, which holds the address of the next location in memory to be accessed, must
have the same number of bits as the memory address. The instruction register holds the next
instruction to be executed and, therefore, should have the same number of bits as the
instruction. (NB: The number and sizes of registers vary from processor to processor.)

IV.2 Main Memory Unit


Memory is the part of the computer that holds data and instructions for processing.
Logically, it is an integral component of the CPU, but physically it is a separate part placed
on the computer's motherboard. Memory stores program instructions or data for only as long
as the program they pertain to is in operation. The primary memory is of two types: random
access memory (RAM) and read only memory (ROM).

IV.2.1 Random Access Memory

Random access memory (RAM) directly provides the required information to the processor.
RAM can be defined as a block of sequential memory locations, each of which has a unique
address determining its location and each of which contains a data element. It stores
programs and data that are in active use. It is volatile in nature, which means the information
stored in it remains as long as the power is switched ON. RAM can be further classified into
two categories:

• Dynamic Random Access Memory (DRAM): This type of RAM holds data in a
dynamic manner (it keeps on refreshing) with the help of refresh circuitry. Every second
or less, the content of each memory cell is read, and the reading action refreshes the
contents of the memory. DRAMs are made from transistors and capacitors. The
capacitor holds an electrical charge if the bit contains 1, and no charge if the bit is 0.
The transistor reads the contents of the capacitor. The charge is held only for a short
period and then fades away, which is where the refresh circuitry comes in.


• Static Random Access Memory (SRAM): SRAM, along with DRAM, is essential for
a system to run optimally, because it is very fast compared to DRAM. It is effective
because most programs access the same data repeatedly, and keeping all this
information in the fast SRAM allows the computer to avoid accessing the slower
DRAM. Data are first written to SRAM on the assumption that they will be used again
soon. SRAM is generally included in a computer system under the name of cache.

IV.2.2 Read-only Memory

As the name suggests, read-only memory (ROM) can only be read, not written. In other
words, the CPU can only read from any location in the ROM but cannot write. The ROM
stores the initial start-up instructions and routines in the BIOS (basic input/output system).
The contents of ROM are not lost even in case of a sudden power failure, thus making it non-
volatile in nature. The instructions in the ROM are built into the electronic circuits of the
chip, which is called firmware. The ROM is also random access in nature. Various types of
ROM, namely, programmable read-only memory (PROM), erasable programmable read-
only memory (EPROM) and electrically erasable programmable read-only memory
(EEPROM) are in existence.

IV.3 Interconnection of Units


Now, let us discuss the interconnection between the CPU (CU, ALU and registers), the
memory unit, and the I/O devices, which constitute the entire computer system.

IV.3.1 System Bus

The bus is how the functional units are interconnected to enable data transport (e.g. writing
CPU register content to a certain address in memory). It is a set of connections between two
or more components/devices, designed to transfer several/all bits of a word from
source to destination. A bus consists of multiple paths, also termed lines, and
each line is capable of transferring one bit at a time. A bus can be unidirectional
(transmission of data can be only in one direction) or bidirectional (transmission of data can
be in both directions). A bus that connects to all the three components (CPU, memory and I/O
devices) is called a system bus.

Fig 7: System bus

a) Data lines: Data lines provide a path for moving data between the system modules.
These are collectively known as the data bus. Normally, a data bus consists of 8, 16 or
32 separate lines. The number of lines present in a data bus is called the width of the
data bus. The data bus width limits the maximum number of bits that can be transferred
simultaneously between two modules. The width of the data bus helps determine the
overall performance of a computer system.
b) Address Lines: Address lines are used to designate the source or destination of the
data on the data bus. The address lines are collectively called the address bus. Thus,
the width of the address bus specifies the maximum possible memory supported by a
system. For example, if a system has a 16-bit wide address bus, it can support a
memory size of 2^16 (or 65,536) bytes; a short sketch after this list works through
the arithmetic.
c) Control Lines: Control lines are used to control access to the data and address buses.
This is required because the bus is a shared medium. The control lines are collectively
called the control bus. These lines are used for the transmission of commands and
timing signals (which validate data and address) between the system modules. Timing
signals indicate whether data and address information is valid, whereas command
signals specify which operations are to be performed. Some of the control lines of the
bus are required for providing clock signals to synchronize operations and for reset
signals to initialize the modules. The control lines are also required for
reading/writing to I/O devices or memory.
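
As a rough check of the 2^16 arithmetic mentioned in point (b), the following minimal C
sketch prints the number of addressable bytes for a few illustrative address bus widths (byte
addressing is assumed):

#include <stdio.h>

/* Addressable memory = 2^(address bus width), assuming byte addressing. */
int main(void)
{
    for (int width = 16; width <= 32; width += 8)
        printf("%2d address lines -> %llu addressable bytes\n",
               width, 1ULL << width);
    return 0;
}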

The bus bottleneck

The bus architecture comes with a serious disadvantage: An electronic bus can transfer only
one item at a time (e.g., one data word, one address). The bus transmission speed thus poses a
limit on the overall performance of the system (this phenomenon is known as the bus
bottleneck).

– Bus transmission speed is limited by physical characteristics (the capacitance and
inductance of the bus wires)
– Note that advances in CPU speed do not help here: the faster the CPU operates
internally, the slower the bus transfers appear to be
– While the CPU waits for a bus transfer to complete, the CPU is stalled

IV.3.2 Cache

A cache is a piece of very fast memory, made from high-speed static RAM that reduces the
time of accessing data. It is very expensive and generally incorporated in the processor,
where valuable data and program segments are kept. This enables the processor to access data
quickly whenever it is needed. The major reason for incorporating a cache in the system is
that the CPU is much faster than the DRAM and needs a place to store information that can
be accessed rapidly. The cache helps the system catch up with the processor's speed. The
cache fetches frequently used data from the DRAM and buffers (stores) it for further
processor usage. Cache can be further categorized into three levels:

• Level 1 Cache (L1): Level 1 cache, also known as primary cache, is built into the
processor chip. It is a small fast memory area that works together with the Level 2
cache to provide the processor much faster access to important and often used data.
• Level 2 Cache (L2): Level 2 cache, also known as secondary cache, is a collection of
static RAM chips that are built onto the motherboard. It is a little larger and slower than
L1, but is faster than the main memory. L1 and L2 caches are used together for optimal
use of the processor.
• Level 3 Cache (L3): L3 cache memory is an enhanced form of memory present on the
motherboard of the computer. It is an extra cache built into the motherboard between
the processor and the main memory to speed up the processing operations. It reduces
the time gap between the request and the retrieval of the data and instructions, thereby
accessing data much more quickly than the main memory.

Fig 8: L1, L2 and L3 caches
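
The hit-or-miss behaviour that makes these cache levels effective can be sketched with a toy
direct-mapped cache in C. This is a minimal illustration under invented parameters (8 lines,
16-byte blocks), not the design of any real CPU:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LINES 8
#define BLOCK 16

/* One cache line: a valid bit plus the tag identifying which memory
   block currently occupies the line. */
struct line { bool valid; uint32_t tag; };
static struct line cache[LINES];

static bool access_addr(uint32_t addr)
{
    uint32_t index = (addr / BLOCK) % LINES;  /* which line the block maps to */
    uint32_t tag   = addr / (BLOCK * LINES);  /* identifies the block itself  */
    if (cache[index].valid && cache[index].tag == tag)
        return true;                          /* hit: served from fast SRAM    */
    cache[index].valid = true;                /* miss: fill the line from DRAM */
    cache[index].tag   = tag;
    return false;
}

int main(void)
{
    uint32_t trace[] = { 0x100, 0x104, 0x100, 0x200, 0x104 };
    for (int i = 0; i < 5; i++)
        printf("0x%03X -> %s\n", (unsigned)trace[i],
               access_addr(trace[i]) ? "hit" : "miss");
    return 0;
}

Repeated accesses to the same block hit in the fast array, while an access that maps to an
occupied line evicts the previous block; this is why frequently reused data tends to be served
from the cache rather than from the slower DRAM.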

V. THE INSTRUCTION FORMAT

A computer understands instructions only in terms of 0s and 1s, which is called machine
language. To accomplish significant tasks, the processor must have two inputs: instructions
and data. The instructions tell the processor what actions need to be performed on the data.
Each machine language instruction is composed of two parts: the op-code and the operand.
The bit pattern appearing in the op-code field indicates which operation (e.g. STORE, ADD,
SUB and so on) is requested. The bit pattern of the operand field provides further details
about the operation specified by the op-code.

Figure 9 illustrates the format of an instruction for the processor. The first three bits represent
the op-code and the final six bits represent the operand. The bit between them indicates
whether the operand is a memory address or a number: when the bit is set to 1, the operand
represents a number.


Figure 9. An Instruction Format
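
A hypothetical encoding of this 10-bit format (3-bit op-code, one mode bit, 6-bit operand)
can be sketched in C with shifts and masks; the op-code value used in the example is
invented:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical 10-bit instruction: bits [9:7] op-code, bit [6] mode,
   bits [5:0] operand. Mode = 1 means the operand is a number; mode = 0
   means it is a memory address, mirroring Figure 9. */
#define OPCODE(i)  (((i) >> 7) & 0x7)
#define MODE(i)    (((i) >> 6) & 0x1)
#define OPERAND(i) ((i) & 0x3F)

static uint16_t encode(unsigned op, unsigned mode, unsigned operand)
{
    return (uint16_t)((op << 7) | (mode << 6) | (operand & 0x3F));
}

int main(void)
{
    /* e.g. "op-code 3, operate on the literal number 5" (values invented) */
    uint16_t instr = encode(3, 1, 5);
    printf("op=%u mode=%u operand=%u\n",
           (unsigned)OPCODE(instr), (unsigned)MODE(instr),
           (unsigned)OPERAND(instr));
    return 0;
}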

Instructions are usually divided into the following types:

• Data Transfer Instructions: These are used to transfer or copy data from one location
to another either in the registers or in the external main memory.
• Arithmetic Instructions: These instructions are used to perform operations on
numerical data.
• Logical Instructions: These are used to perform Boolean operations on non-
numerical data.
• Program Control Instructions: These are used to change the sequence of a program
execution.
• Input–output Instructions: These are used to transfer data from and to I/O devices.

Now, let us discuss a few very basic instructions in the assembly language. These instructions
tell the processor to carry out various operations.

Instructions Functions
ADD Perform addition
SUB Perform subtraction
MUL Perform multiplication
MOV Move the contents from one location to another
DIV Perform division
LDA Load the contents of variable
JMP Jump to the instruction
ABS Calculate absolute value

Example:
ADD R1, R2 will add the content of register R1 and R2.
MOV R1, R2 will move the content of register R2 to R1.
LDA Var1 will load the contents of 'Var1' into the accumulator.

V.1 The Instruction Cycle


The processor of the system performs the execution work. The instruction cycle details the
sequence of events that takes place as an instruction is read from the memory and executed.
A simple instruction cycle consists of the following steps:

1. Fetch Cycle: Fetching the instruction from the memory.
2. Decode Cycle: Decoding the instruction.
3. Execute Cycle: Executing the instruction.
4. Store Cycle: Storing the results back to the memory.


Figure 10. The Instruction Cycle

V.1.1 The Fetch Cycle

During this cycle, the instruction, which is to be executed next, is fetched from the memory
to the processor. The steps performed during the fetch cycle are as follows:

1. The program counter (PC) keeps track of the memory location of the next instruction.
2. This address is transferred from PC to MAR.
3. The instruction is read from the memory.
4. Then, the PC is incremented by 1 (PC = PC + 1) and the instruction so obtained is
transferred to the IR.
5. In the IR, the unique bit patterns that make up machine language are extracted and
sent to the decoder.

V.1.2 The Decode Cycle

The decode cycle is responsible for recognizing the operation that the bit pattern represents
and activating the correct circuitry to perform that operation. The steps performed during the
decode cycle are as follows:

1. The operation code (op-code) of the instruction is first read and then interpreted
according to the machine language.
2. The data required by the instruction (operand) are then transferred to the data register
(DR).

V.1.3 The Execute Cycle

Once the instruction has been decoded, the operation specified by the op-code is performed
on the user-provided data in the ALU. The execution cycle involves the following steps:

1. The data is fetched into the ALU from the memory location pointed to by the memory
address register.
2. The operation specified by the decoded op-code is performed on the data in ALU.

V.1.4 The Store Cycle


After the fetch, decode and execute cycles have completed, the results are ready to be stored.
The steps involved in the store cycle are as follows:

1. The results from the execution cycle are stored in the memory buffer register.
2. Then, the results from the memory buffer register are stored back in the memory.
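
The four cycles can be seen working together in a toy accumulator machine written in C.
This is only a sketch: the op-codes, the 16-bit instruction layout and the memory size are
invented for illustration and do not correspond to any real ISA:

#include <stdint.h>
#include <stdio.h>

enum { HALT = 0, LDA = 1, ADD = 2, STA = 3 };   /* invented op-codes */

int main(void)
{
    /* Each instruction: high byte = op-code, low byte = address. */
    uint16_t mem[16] = {
        (LDA << 8) | 10,    /* ACC <- mem[10]       */
        (ADD << 8) | 11,    /* ACC <- ACC + mem[11] */
        (STA << 8) | 12,    /* mem[12] <- ACC       */
        (HALT << 8)
    };
    mem[10] = 7; mem[11] = 35;

    uint16_t pc = 0, acc = 0;
    for (;;) {
        uint16_t ir = mem[pc++];                 /* fetch; PC = PC + 1 */
        uint16_t op = ir >> 8, addr = ir & 0xFF; /* decode             */
        switch (op) {                            /* execute and store  */
        case LDA:  acc = mem[addr];  break;
        case ADD:  acc += mem[addr]; break;
        case STA:  mem[addr] = acc;  break;
        case HALT: printf("mem[12] = %u\n", (unsigned)mem[12]); return 0;
        }
    }
}

Running the sketch stores 42 in mem[12]; each trip around the loop is one complete
fetch-decode-execute-store cycle, with the PC incremented at fetch time exactly as in the
steps above.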

V.2 Fixed and Variable length Instructions


Instructions are translated to machine code. In some architectures, all machine code
instructions are the same length, i.e. fixed length. In other architectures, different instructions
may be translated into machine code of varying lengths.

This is the situation with 8086 instructions, which range from one byte to a maximum of six
bytes in length. Such instructions are called variable length instructions and are commonly
used on CISC machines. The advantage of using such instructions is that each instruction
can use exactly the amount of space it requires, so variable length instructions reduce the
amount of memory space required for a program.

On the other hand, it is possible to have fixed length instructions, where, as the name
suggests, each instruction has the same length. Fixed length instructions are commonly used
with RISC processors such as the PowerPC and Alpha processors. Since each instruction
occupies the same amount of space, every instruction must be long enough to specify a
memory operand, even if the instruction does not use one. Hence, some memory space is
wasted by this form of instruction.

VI. INSTRUCTION SET

The processors are built with the ability to execute a limited set of basic operations. The
collections of these operations are known as the processor's instruction set. An instruction
set is necessary so that a user can create machine language programs to perform any logical
and/or mathematical operations. The instruction set is hardwired (embedded) in the
processor, and it determines the machine language for the processor. Generally, the more
complicated the instruction set, the more decoding work the processor must do for each
instruction, which tends to slow it down.

Processors differ from one another by their instruction sets. If the same program can run on
two different processors, they are said to be compatible. For example, programs written for
IBM computers may not run on Apple computers because these two architectures (different
processors) are not compatible. Since each processor has its unique instruction set, machine
language programs written for one processor will normally not run on a different processor.
Based upon the instruction sets, there are two common types of architectures: complex
instruction set computer (CISC) and reduced instruction set computer (RISC).

VI.1 CISC Architecture


To make compiler development easier, the complex instruction set computer (CISC) was
developed. The motive of manufacturers of CISC-based processors was to build processors
with a more extensive and complex instruction set. This shifted most of the burden of
generating machine instructions to the processor. For example, instead of making a compiler
write a long sequence of machine instructions to calculate a square root, a CISC processor
would incorporate hardwired circuitry for performing the square root in a single step. Writing
instructions for a CISC processor is comparatively easy because a single instruction is
sufficient to utilize the built-in ability. Most PCs today include a CISC processor.

VI.2 RISC Architecture


Reduced instruction set computer (RISC) is a processor architecture that utilizes a small,
highly optimized set of instructions. The concept behind RISC architecture is that a small
number of instructions are faster in execution as compared to a single long instruction. To
implement this, RISC architecture simplifies the instruction set of the processor, which helps
in reducing the execution time. Optimization of each instruction in the processor is done
through a technique known as pipelining. Pipelining allows the processor to work on
different steps of the instruction at the same time; using this technique, more instructions can
be executed in a shorter period.

As each instruction is executed directly via the processor, no hardwired circuitry (used for
complex instructions) is required. This allows RISC processors to be smaller, consume less
power and run cooler than CISC processors. Due to these advantages, RISC processors are
ideal for embedded applications such as mobile phones, PDAs and digital cameras. In
addition, the simple design of a RISC processor reduces its development time compared
to that of a CISC processor.

VI.3 Comparing CISC and RISC Architectures


Basis               CISC                                RISC
Instruction Set     Complex instructions                Simple instructions
Program Code Size   Smaller                             Lengthier
Processor Size      Increased hardwired circuitry       Reduced hardwired circuitry
                    leads to increased processor size   leads to reduced processor size
Memory Usage        Less memory intensive               More memory intensive
Power Consumption   More power                          Less power
Heating             More heat                           Less heat
Instruction Length  Variable length instructions        Fixed length instructions

VII. INTERRUPT AND POLLING

VII.1 What is an Interrupt?

An interrupt is an event external to the currently executing process that causes a
change in the normal flow of instruction execution; it is usually generated by hardware
devices.


Hardware interrupts are used by devices to communicate that they require attention from the
operating system. Some common examples are a hard disk signaling that it has read a series
of data blocks, or a network device signaling that it has processed a buffer containing
network packets. Interrupts are also used for asynchronous events, such as the arrival of new
data from an external network. Hardware interrupts are delivered directly to the CPU using a
small network of interrupt management and routing devices. This section describes the
different types of interrupt and how they are processed by the hardware and by the operating
system.

Hardware interrupts are referenced by an interrupt number. These numbers are mapped back
to the piece of hardware that created the interrupt. This enables the system to monitor which
device created the interrupt and when it occurred.

In most computer systems, interrupts are handled as quickly as possible. When an interrupt is
received, any current activity is stopped and an interrupt handler is executed. The handler
will preempt any other running programs and system activities, which can slow the entire
system down and create latencies.
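
Hardware interrupt handling happens below the level that an ordinary program can observe
directly, but POSIX signals give a rough user-space analogy, sketched in C below. Here
SIGALRM stands in for a device raising an interrupt line; the analogy, not the mechanism, is
the point:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* The handler preempts the normal flow of execution, much as an
   interrupt handler preempts running code. */
static volatile sig_atomic_t got_interrupt = 0;

static void handler(int sig)
{
    (void)sig;
    got_interrupt = 1;          /* keep handlers short, like real ISRs */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    alarm(1);                   /* the "device" interrupts us in ~1 second  */
    while (!got_interrupt)
        ;                       /* the normal flow of instruction execution */
    puts("normal flow resumed after the handler ran");
    return 0;
}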

VII.2 Non-Maskable Interrupt (NMI)

There exist two types of interrupt: maskable and non-maskable interrupts.

An interrupt is said to be masked when it has been disabled, or when the CPU has been
instructed to ignore it. A non-maskable interrupt (NMI) cannot be ignored, and is generally
used only for critical hardware errors.

NMIs are normally delivered over a separate interrupt line. When an NMI is received by the
CPU, it indicates that a critical error has occurred, and that the system is probably about to
crash. The NMI is generally the best indication of what might have caused the problem.

Because NMIs cannot be ignored, they are also used by some systems as a hardware
monitor. The device sends a stream of NMIs, which are checked by an NMI handler in the
processor. If certain conditions are met (such as an interrupt not being triggered after a
specified length of time), the NMI handler can produce a warning and debugging information
about the problem. This helps to identify and prevent system lockups.

Fig 11: Maskable and non-maskable interrupts


VII.3 System Management Interrupts (SMI)

System management interrupts (SMIs) are used to offer extended functionality, such as
legacy hardware device emulation. They can also be used for system management tasks.
SMIs are similar to NMIs in that they use a special electrical signalling line directly into the
CPU and generally cannot be masked.
When an SMI is received, the CPU enters System Management Mode (SMM). In this
mode, a very low-level handler routine is run to handle the SMI. The SMM handler is
typically provided directly by the system management firmware, often the BIOS or the EFI.

VII.4 Advanced Programmable Interrupt Controller (APIC)

The advanced programmable interrupt controller (APIC) was developed by Intel® to
provide the ability to handle large numbers of interrupts, to allow each of these to be
programmatically routed to a specific set of available CPUs (and for this to be changed
accordingly), to support inter-CPU communication, and to remove the need for a large
number of devices to share a single interrupt line.
APIC represents a series of devices and technologies that work together to generate, route,
and handle a large number of hardware interrupts in a scalable and manageable way. It uses a
combination of a local APIC built into each system CPU, and a number of Input/Output
APICs (IO-APICs) that are connected directly to hardware devices. When a hardware device
generates an interrupt, it is detected by the IO-APIC it is connected to, and then routed across
the system APIC bus to a particular CPU. The operating system knows which IO-APIC is
connected to which device, and to which particular interrupt line within that device, from a
combination of information sources.

VII.5 Polling
Polling, or polled operation, in computer science, refers to actively sampling the status of an
external device by a client program as a synchronous activity. Polling is most often used in
terms of input/output (I/O), and is also referred to as polled I/O or software-driven I/O.
Polling is sometimes used synonymously with busy-wait polling (busy waiting). In this
situation, when an I/O operation is required, the computer does nothing other than check the
status of the I/O device until it is ready, at which point the device is accessed. Polling has the
disadvantage that if there are too many devices to check, the time required to poll them can
exceed the time available to service the I/O device.
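
Busy-wait polling can be sketched in a few lines of C. The device_ready() function below is
a stub standing in for a read of a device status register; in real polled I/O the loop would read
actual hardware state:

#include <stdbool.h>
#include <stdio.h>

static int countdown = 1000;    /* pretend the device needs 1000 checks */

static bool device_ready(void)  /* stand-in for a status register read */
{
    return --countdown == 0;
}

int main(void)
{
    long polls = 0;
    while (!device_ready())     /* the CPU does nothing useful here */
        polls++;
    printf("device serviced after %ld wasted polls\n", polls);
    return 0;
}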

VIII. OTHER TYPES OF ARCHITECTURE

VIII.1 The System Bus Model


There are many animals (e.g. cat or dog) whose internal organs are hung from a backbone
that conveniently happens to be horizontal. Bus-based computers are structured like that -
processors and memory are connected to a backbone bus that acts as a "superhighway" for
data or instructions to move between processors and memory. In practice, the bus architecture
has the same components as the von Neumann architecture, but they are arranged along a
bus, as shown in Figure 12.


Figure 12. Schematic diagram of a system bus architecture

In principle, the bus computer solves the von Neumann bottleneck problem by using a fast
bus. In practice, the bus is rarely fast enough to support I/O for the common case (90 percent
of practical applications), and bus throughput can be significantly reduced under large
amounts of data.

VIII.2 Multiprocessor or Parallel Architecture

Recall the old saying, "Many hands make less work." In computers, the use of many
processors together reduces the amount of time required to perform the work of solving a
given problem. Due to I/O and routing overhead, this efficiency is sublinear in the number of
processors. That is, if W(N) [or T(N)] denotes the work [or time to perform the work]
associated with N processors, then the following relationships hold in practice:

W(N) < N · W(1) and T(N) > T(1)/N .

The first equation means that the work performed by N processors working on a task, where
each processor performs work W(1) [the work of one processor in a sequential computation
paradigm], will be slightly less than N times W(1). Note that we use "<" instead of "="
because of the overhead required to

- divide up the problem,


- assign the parts of the problem to N processors,
- collect the partial results from the N processors, and
- combine the partial results to achieve a whole, coherent result.

The second equation means essentially the same thing as the first equation, but the work is
replaced by time. Here, we are saying that if one processor takes time T(1) to solve a
problem, then that same problem solved on an N-processor architecture will take time slightly
greater than T(1)/N, assuming all the processors work together at the same time. As in the
preceding paragraph, this discrepancy is due to the previously-described overhead.
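
The relation T(N) > T(1)/N can be illustrated numerically. The sketch below assumes a
simple linear overhead term, invented purely for illustration, added on top of the ideal
T(1)/N:

#include <stdio.h>

/* T(N) = T(1)/N + overhead(N): the overhead model is assumed, standing
   in for dividing the problem, communication and result collection. */
int main(void)
{
    double t1 = 100.0;                    /* sequential time, arbitrary units */
    for (int n = 1; n <= 8; n *= 2) {
        double overhead = 0.5 * (n - 1);  /* invented linear overhead */
        double tn = t1 / n + overhead;
        printf("N=%d  T(N)=%6.2f  ideal T(1)/N=%6.2f  speedup=%5.2f\n",
               n, tn, t1 / n, t1 / tn);
    }
    return 0;
}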


Figure 13. Schematic diagram of a multiprocessor architecture

This is a simple architecture that is useful for solving selected types of compute-intensive
problems. However, if you try to solve data-intensive problems on such an architecture, you
encounter the von Neumann bottleneck trying to read and write the large amount of data from
and to the shared memory.

Figure 14. Schematic diagram of a multiprocessor architecture with shared memory, where
each CPU also has its own fast, local memory

To help solve the problem of bus contention inherent in shared-memory multiprocessors,
computer scientists developed the mixed model of parallel processing, in which the CPUs
have small, very fast local memories that communicate with each CPU via a very fast, short
bus. Because the bus is short, there are fewer impedance problems and the bus bandwidth can
be increased. This migrates the von Neumann bottleneck closer to the CPU, and alleviates
some of the demands on shared memory. Local memory architectures are useful for problems
that have data locality, where:

- each CPU can solve part of a problem with part of the problem's data, and
- there is little need for data interchange between processors.

A special type of multiprocessor is called a distributed processor, in which the individual
CPUs (called compute servers) are connected to storage servers by a network. You can make
the multiprocessor diagrammed schematically in Figure 14 into a distributed computer by
replacing the bus that connects the CPUs to each other and to the shared memory with a
network.

IX. MODERN MICROPROCESSOR ARCHITECTURE

IX.1 Pipelining & Instruction-Level Parallelism


Consider how an instruction is executed – first it is fetched, then decoded, then executed by
the appropriate functional unit, and finally the result is written into place. With this scheme, a
simple processor might take 4 cycles per instruction (CPI = 4)...


Figure 15 – The instruction flow of a sequential processor.

IX.1.1 Simple pipelining

Modern processors overlap these stages in a pipeline, like an assembly line. While one
instruction is executing, the next instruction is being decoded, and the one after that is being
fetched... Pipelining is an implementation technique where multiple instructions are
overlapped in execution.

Figure 16 – The instruction flow of a pipelined processor

Now the processor is completing 1 instruction every cycle (CPI = 1). This is a four-fold
speedup without changing the clock speed at all.

The pipeline designer's goal is to balance the length of each pipeline stage. If the stages are
perfectly balanced, then the time per instruction on the pipelined machine is equal to:

    Time per instruction (unpipelined) / Number of pipe stages
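
As a quick numerical check of this formula (assuming perfectly balanced stages, which real
pipelines only approximate):

#include <stdio.h>

/* Time per instruction shrinks in proportion to the number of balanced
   pipeline stages. */
int main(void)
{
    double unpipelined_ps = 800.0;        /* e.g. the lw example in Fig 17 */
    for (int stages = 1; stages <= 4; stages *= 2)
        printf("%d stage(s) -> %6.1f ps per instruction\n",
               stages, unpipelined_ps / stages);
    return 0;
}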

Fig 17: Comparison of non-pipelined and pipelined execution of three lw instructions
(stages: Instruction Fetch, REG RD, ALU, MEM, REG WR). Non-pipelined, one instruction
completes every 800 ps; pipelined, a new instruction completes every 200 ps.

Limits of pipelining


Hazards prevent the next instruction from executing during its designated clock cycle:

– Structural hazards: two different instructions use the same hardware in the same cycle
– Data hazards: an instruction depends on the result of a prior instruction still in the pipeline
– Control hazards: pipelining of branches and other instructions that change the PC

IX.1.2 Deeper Pipelines – Superpipelining

Since the clock speed is limited by (among other things) the length of the longest stage in the
pipeline, the logic gates that make up each stage can be subdivided, especially the longer
ones, converting the pipeline into a deeper super-pipeline with a larger number of shorter
stages. Then the whole processor can be run at a higher clock speed! Of course, each
instruction will now take more cycles to complete (latency), but the processor will still be
completing 1 instruction per cycle (throughput), and there will be more cycles per second, so
the processor will complete more instructions per second (actual performance)...

Figure 18 – The instruction flow of a superpipelined processor

IX.1.3 Multiple Issue – Superscalar

Since the execute stage of the pipeline is really a bunch of different functional units, each
doing its own task, it seems tempting to try to execute multiple instructions in parallel, each
in its own functional unit. To do this, the fetch and decode/dispatch stages must be enhanced
so that they can decode multiple instructions in parallel and send them out to the "execution
resources"... A superscalar CPU architecture implements a form of parallelism called
instruction level parallelism within a single processor. It therefore allows faster CPU
throughput than would otherwise be possible at a given clock rate.

Figure 19 – The instruction flow of a superscalar processor.


IX.2 Multiprocessor system


A computer system which includes only one processor is called a single-processor system.
Computer systems that include more than one processor are called multiprocessor systems or
parallel systems.

In a multiprocessing system, two or more independent processors are linked together in a
coordinated manner. In such systems, instructions from different and independent programs
can be processed at the same time, by different processors. Independent processors in such a
system are connected by a high-speed system bus, with each processor having its own cache
to improve the performance. Multiprocessor systems have the advantage of redundancy
(fault-tolerance); if one processor fails, then the system keeps on functioning, though at a
slower speed.

IX.3 Parallel vs Distributed Processing

Most computers have just one CPU, but some models have several. There are even computers
with thousands of CPUs. With single-CPU computers, it is possible to perform parallel
processing by connecting the computers in a network. However, this type of parallel
processing requires very sophisticated software called distributed processing software.

Note that parallel processing differs from multitasking, in which a single CPU executes
several programs at once. Parallel processing is also called parallel computing.

IX.4 Examples of modern processors

Intel has been making class-leading processors for computers for a long time now. They
overcame a period when AMD reigned as king by releasing their lineup of Core 2 processors
in 2006. Now Intel has a line of processors called the Core i series. The i3, the i5 and the i7
are the new kids on the block. This guide will help you understand which one is right for
you.

CORE i3 is the basic, entry-level processor type of the new generation launched by Intel.
All Core i3s are dual-core processors, with clock speeds ranging from 2.93 to 3.06 GHz and
3 MB of cache. Although only dual core, the i3 supports Hyper-Threading, so it can actually
serve two threads per core, i.e. four threads in total. Note that the integrated graphics
processor of the i3 is restricted to a maximum clock speed of 1100 MHz. Core i3s are built
on 32 nm silicon (less heat and energy) and are the cheapest of the lot.

CORE i5 processors come in two categories, dual core and quad core. Let's talk about both of
them in a nutshell.

– The i5 dual core has a 32 nm fabrication and 4 MB of cache. Clock speeds range from
3.2 to 3.6 GHz for the dual cores. Just like the Core i3, it has Hyper-Threading support
and an integrated graphics processor. The i5 also supports the remarkable Turbo Boost
technology.
– The i5 quad core has clock speeds of 2.4 and 2.66 GHz. Turbo Boost technology is
supported, but these chips do not support Hyper-Threading and do not have an
integrated graphics processor. They have 6 MB to 8 MB of cache.

CORE i7 is the high-end processor line. These are also the fastest and the most expensive of
the lot. Four cores are present, as they are quad core. Clock speeds range from 1.06 GHz to
3.20 GHz, and 8 MB of cache is provided. Turbo Boost technology is supported, and Hyper-
Threading support allows a total of eight threads to run simultaneously. The IGP (integrated
graphics) on Core i7 processors can also reach a higher maximum clock speed of 1350 MHz.
They are built on 32-45 nm silicon (less heat and energy).
