Computer Architecture Midterm
Computer architecture refers to those attributes of a system visible to a programmer or, put another
way, those attributes that have a direct impact on the logical execution of a program. Computer
organization refers to the operational units and their interconnections that realize the architectural
specifications. Examples of architectural attributes include the instruction set, the number of bits used
to represent various data types (e.g., numbers, characters), I/O mechanisms, and techniques for
addressing memory. Organizational attributes include those hardware details transparent to the
programmer, such as control signals; interfaces between the computer and peripherals; and the
memory technology used. For example, it is an architectural design issue whether a computer will
have a multiply instruction. It is an organizational issue whether that instruction will be implemented
by a special multiply unit or by a mechanism that makes repeated use of the add unit of the system.
The organizational decision may be based on the anticipated frequency of use of the multiply
instruction, the relative speed of the two approaches, and the cost and physical size of a special
multiply unit. Historically, and still today, the distinction between architecture and organization has
been an important one. Many computer manufacturers offer a family of computer models, all with
the same architecture but with differences in organization. Consequently, the different models in the
family have different price and performance characteristics. Furthermore, a particular architecture
may span many years and encompass a number of different computer models, its organization
changing with changing technology. A prominent example of both these phenomena is the IBM
System/370 architecture. This architecture was first introduced in 1970 and included a number of
models. The customer with modest requirements could buy a cheaper, slower model and, if demand
increased, later upgrade to a more expensive, faster model without having to abandon software that
had already been developed. Over the years, IBM has introduced many new models with improved
technology to replace older models, offering the customer greater speed, lower cost, or both. These
newer models retained the same architecture so that the customer’s software investment was
protected. Remarkably, the System/370 architecture, with a few enhancements, has survived to this
day as the architecture of IBM’s mainframe product line.
REFERENCE
http://home.ustc.edu.cn/~louwenqi/reference_books_tools/Computer-Organization-and-Architecture-9th-Edition-William-Stallings2012.pdf
HISTORY OF COMPUTERS
First Computers
The first substantial computer was the giant ENIAC machine built by John W. Mauchly and
J. Presper Eckert at the University of Pennsylvania. ENIAC (Electronic Numerical
Integrator and Computer) used a word of 10 decimal digits instead of the binary words of
some earlier automated calculators/computers. ENIAC was also the first machine to use
more than 2,000 vacuum tubes; it used nearly 18,000 of them. Housing all those
vacuum tubes and the machinery required to keep them cool took up over 167 square
meters (1,800 square feet) of floor space. The machine had punched-card input and
output and arithmetically had 1 multiplier, 1 divider-square rooter, and 20 adders
employing decimal "ring counters," which served as adders and also as quick-access
(0.0002 seconds) read-write register storage.
The executable instructions composing a program were embodied in the separate units
of ENIAC, which were plugged together to form a route through the machine for the
flow of computations. These connections had to be redone for each different problem,
together with presetting function tables and switches. This "wire-your-own"
instruction technique was inconvenient, and only with some license could ENIAC be
considered programmable; it was, however, efficient in handling the particular
programs for which it had been designed. ENIAC is generally acknowledged to be the
first successful high-speed electronic digital computer (EDC) and was productively
used from 1946 to 1955. A controversy developed in 1971, however, over the
patentability of ENIAC's basic digital concepts, the claim being made that another U.S.
physicist, John V. Atanasoff, had already used the same ideas in a simpler vacuum-
tube device he built in the 1930s while at Iowa State College. In 1973, the court ruled
in favor of the company that had invoked Atanasoff's work, invalidating the ENIAC
patent, and Atanasoff received the acclaim he rightly deserved.
Progression of Hardware
In the 1950s two devices would be invented that would improve the computer field
and set in motion the beginning of the computer revolution. The first of these two
devices was the transistor. Invented in 1947 by William Shockley, John Bardeen, and
Walter Brattain of Bell Labs, the transistor was fated to replace the vacuum tube
in computers, radios, and other electronics.
The vacuum tube, used up to this time in almost all the
computers and calculating machines, had been invented
by American physicist Lee De Forest in 1906. The
vacuum tube, which is about the size of a human thumb,
worked by using large amounts of electricity to heat a
filament inside the tube until it was cherry red. One
result of heating this filament was the release of
electrons into the tube, which could be controlled by
other elements within the tube. De Forest's original device was a triode, which could
control the flow of electrons to a positively charged plate inside the tube. A zero could
then be represented by the absence of an electron current to the plate; the presence of a
small but detectable current to the plate represented a one.
Although the transistor was a vast improvement, transistors still had to be wired
together by hand to build circuits. In 1958, this problem was solved by Jack St. Clair
Kilby of Texas Instruments, who manufactured the first integrated circuit, or chip. A
chip is really a collection of tiny transistors which are connected together when the
chip is manufactured. Thus, the need for soldering together large numbers of
transistors was practically eliminated; now connections were needed only to other
electronic components. In addition to saving space, the speed of the machine was
increased, since the electrons had a shorter distance to travel.
Mainframes to PCs
The 1960s saw large mainframe computers become much more common in large
industries and with the US military and space program. IBM became the unquestioned
market leader in selling these large, expensive, error-prone, and very hard to use
machines.
A veritable explosion of personal computers occurred in the late 1970s, starting with
Steve Jobs and Steve Wozniak exhibiting the first Apple II at the First West Coast
Computer Faire in San Francisco in 1977. The Apple II boasted a built-in BASIC
programming language, color graphics, and a 4,100-character memory (about 4 KB) for
only $1,298. Programs and data could be stored on an everyday audio-cassette recorder.
Before the end of the fair, Wozniak and Jobs had secured 300 orders for the Apple II,
and from there Apple just took off.
Also introduced in 1977 was the TRS-80. This was a home computer manufactured by
Tandy Radio Shack. Its second incarnation, the TRS-80 Model II, came complete
with 64,000 characters of memory and a disk drive to store programs and data on. At
this time, only Apple and TRS had machines with disk drives. With the introduction of
the disk drive, personal computer applications took off as a floppy disk was a most
convenient publishing medium for distribution of software.
IBM, which up to this time had been producing mainframes and minicomputers for
medium to large-sized businesses, decided that it had to get into the act and started
working on the Acorn, which would later be called the IBM PC. The PC was the first
computer designed for the home market which would feature modular design so that
pieces could easily be added to the architecture. Most of the components, surprisingly,
came from outside of IBM, since building it with IBM parts would have cost too much
for the home computer market. When it was introduced, the PC came with a 16,000-
character memory, a keyboard from an IBM electric typewriter, and a connection for a
tape cassette player, for $1,265.
By 1984, Apple and IBM had come out with new models. Apple released the first-
generation Macintosh, which was the first widely sold personal computer to come with
a graphical user interface (GUI) and a mouse. The GUI made the machine much more
attractive to home computer users because it was easy to use. Sales of the Macintosh
soared like nothing ever seen before. IBM was hot on Apple's tail and released the
286-AT, which, with applications like the Lotus 1-2-3 spreadsheet and Microsoft Word,
quickly became the favourite of business concerns.
That brings us up to about ten years ago. Now people have their own personal
graphics workstations and powerful home computers. The average computer a person
might have in their home is more powerful by several orders of magnitude than a
machine like ENIAC. The computer revolution has been the fastest growing
technology in man's history.
Introduction
The worlds of electrical and electronics engineering are quite closely related, yet
sometimes far apart in terms of size and magnitude. Whereas electrical equipment
tends toward bulky items such as electrical motors of various types, electronic items
are normally much smaller. In this article we will study one such device, the
microprocessor, which, apart from several other functions, also serves as the heart
of the computer on which you are probably reading this article.
The first microprocessor was introduced in the year 1971. It was introduced by Intel
and was named Intel 4004.
The Intel 4004 was a 4-bit microprocessor and not a powerful one: it could perform
addition and subtraction on only 4 bits at a time.
A major limitation of these early processors is that they do not contain floating-point
instructions. Here floating point refers to the radix point or decimal point; for
example, 123.456 is a floating-point representation. Processors such as the 8085 and
8086 do not support such representations and instructions.
Intel later introduced the 8087, which was the first math co-processor, and later the
8088 processor, which was incorporated into IBM personal computers.
As the years progressed, a long line of processors followed: the 8088, 80286, 80386,
and 80486, then the Pentium II, Pentium III, and Pentium 4, and more recently the
Core 2 Duo, dual-core, and quad-core processors.
Basic Microprocessor
A basic microprocessor consists of the following components:
· Registers
· Arithmetic and Logic Unit (ALU)
· Control Logic
· Instruction register
· Program counter
· Bus.
Arithmetic and Logic Unit:
It is the computational unit of the microprocessor. It performs arithmetic and logical
operations on data: whenever an operation needs to be performed on some data, the
data is sent to the ALU, which carries out the necessary function.
Registers:
Registers may be thought of as the processor's internal storage. Input data, output
data, and various other binary data are stored in these units for further processing.
Control Unit:
The control unit, as the name implies, controls the flow of data and signals in the
microprocessor. It generates the necessary control signals for the various data that are
fed to the microprocessor.
Advantages of a Microprocessor
Low Cost
Microprocessors are available at low cost due to integrated circuit technology,
which in turn reduces the cost of a computer system.
High Speed
Microprocessor chips can work at very high speeds due to the technology
involved; a microprocessor is capable of executing millions of instructions per second.
Small Size
Due to very-large-scale and ultra-large-scale integration technology, a
microprocessor is fabricated in a very small footprint. This reduces the size of
the entire computer system.
Versatile
Microprocessors are very versatile; the same chip can be used for a number of
applications by simply changing the program (the instructions stored in
memory).
Here are some common terms that we will use in the microprocessor field.
Bus
A bus is a group of conducting lines that carries data, addresses, or control signals
between the various components of a computer system.
Instruction Set
The instruction set is the collection of machine-language instructions that a processor
is designed to understand and execute.
Word Length
Word Length is the number of bits in the internal data bus of a processor, or the
number of bits a processor can process at a time. For example, an 8-bit processor will
have an 8-bit data bus and 8-bit registers, and will do 8-bit processing at a time. For
operations on wider values (16-bit, 32-bit), it will split the work into a series of 8-bit
operations.
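As a rough illustration of this splitting (a generic Python sketch, not the microcode of any real 8-bit processor), a 16-bit addition can be carried out as two 8-bit additions joined by a carry:

    # Sketch: a 16-bit addition performed as two 8-bit additions with a carry,
    # the way an 8-bit processor would split a wider operation.
    def add16_as_two_8bit_ops(a, b):
        a_lo, a_hi = a & 0xFF, (a >> 8) & 0xFF   # split operands into bytes
        b_lo, b_hi = b & 0xFF, (b >> 8) & 0xFF
        lo_sum = a_lo + b_lo                     # first 8-bit add
        carry = lo_sum >> 8                      # carry out of the low byte
        hi_sum = (a_hi + b_hi + carry) & 0xFF    # second 8-bit add, carry in
        return (hi_sum << 8) | (lo_sum & 0xFF)

    assert add16_as_two_8bit_ops(0x12FF, 0x0001) == 0x1300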
Cache Memory
Cache memory is a random access memory that is integrated into the processor. So the
processor can access data in the cache memory more quickly than from a regular RAM.
It is also known as CPU Memory. Cache memory is used to store data or instructions
that are frequently referenced by the software or program during the operation. So it
will increase the overall speed of the operation.
Clock Speed
Microprocessors use a clock signal to control the rate at which instructions are
executed, to synchronize other internal components, and to control the data transfer
between them. Clock speed, then, refers to the rate at which a microprocessor executes
instructions.
Classification of Microprocessors
Hope you read about word length above. So based on the word length of a processor
we can have 8-bit, 16-bit, 32-bit and 64-bit processors.
RISC (Reduced Instruction Set Computer) is a type of microprocessor architecture
which uses a small, general-purpose, and highly optimized instruction set rather than
the more specialized set of instructions found in others. RISC offers high performance
relative to its opposing architecture, CISC (see below). In a processor, the execution of
each instruction requires special circuitry to load and process the data; by reducing the
instruction set, the processor can use simpler circuits and operate faster.
CISC (Complex Instruction Set Computer) is the opposing microprocessor architecture
to RISC. It is designed to reduce the number of instructions per program, at the
expense of the number of cycles per instruction. Complex instructions are implemented
directly in hardware, making the processor more complex and slower in operation.
This architecture was originally designed to reduce the cost of memory by reducing
the program length.
There are also processors designed to handle specific functions.
IMPLEMENTATION OF INTERRUPTS
TYPES
Interrupt signals may be issued in response to hardware or software events. These are
classified as hardware interrupts or software interrupts, respectively. For any
particular processor, the number of hardware interrupts is limited by the number of
interrupt request (IRQ) signals to the processor, whereas the number of software
interrupts is determined by the processor design.
Hardware interrupts
A hardware interrupt request (IRQ) is an electronic signal issued by an external (to the
processor) hardware device, to communicate that it needs attention from the operating
system (OS) or, if there is no OS, from the "bare-metal" program running on the CPU.
Such external devices may be part of the computer (e.g., disk controller) or they may
be external peripherals. For example, pressing a keyboard key or moving
the mouse triggers hardware interrupts that cause the processor to read the keystroke
or mouse position.
Unlike software interrupts, hardware interrupts can arrive asynchronously with
respect to the processor clock, and at any time during instruction execution.
Consequently, all hardware interrupt signals are conditioned by synchronizing them
to the processor clock, and acted upon only at instruction execution boundaries.
In many systems, each device is associated with a particular IRQ signal. This makes it
possible to quickly determine which hardware device is requesting service, and to
expedite servicing of that device.
Masking
Processors typically have an internal interrupt mask register which allows selective
enabling and disabling of hardware interrupts. Each interrupt signal is associated with
a bit in the mask register; the interrupt is enabled when the bit is set and disabled
when the bit is clear, or vice versa. When the interrupt is disabled, the associated
interrupt signal will be ignored by the processor. Signals which are affected by the
mask are called maskable interrupts.
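The mask register just described can be modelled with ordinary bit operations. The Python sketch below assumes the convention that a set bit enables the interrupt (some processors invert this), and the IRQ numbers are made up for illustration:

    IRQ_TIMER, IRQ_KEYBOARD, IRQ_DISK = 0, 1, 2   # hypothetical IRQ numbers

    def enable_irq(mask, irq):
        return mask | (1 << irq)       # set the bit: interrupt enabled

    def disable_irq(mask, irq):
        return mask & ~(1 << irq)      # clear the bit: interrupt ignored

    def serviced(mask, pending):
        # Maskable requests get through only where the mask bit is set.
        return pending & mask

    mask = 0                           # all interrupts disabled initially
    mask = enable_irq(mask, IRQ_TIMER)
    mask = enable_irq(mask, IRQ_DISK)
    mask = disable_irq(mask, IRQ_DISK)
    # Timer, keyboard, and disk requests all arrive, but only the timer
    # interrupt is still enabled:
    assert serviced(mask, 0b111) == 1 << IRQ_TIMER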
Some interrupt signals are not affected by the interrupt mask and therefore cannot be
disabled; these are called non-maskable interrupts (NMI). NMIs indicate high priority
events which cannot be ignored under any circumstances, such as the timeout signal
from a watchdog timer.
Spurious interrupts
A spurious interrupt is the occurrence of a false interrupt request signal. These are
typically short-lived, invalid signal levels which are generated by electrical
interference or malfunctioning devices.
Software interrupts
A software interrupt is requested by the processor itself upon executing particular
instructions or when certain conditions are met. Every software interrupt signal is
associated with a particular interrupt handler.
TRIGGERING METHODS
Each interrupt signal input is designed to be triggered by either a logic signal level or a
particular signal edge (level transition). Level-sensitive inputs continuously request
processor service so long as a particular (high or low) logic level is applied to the input.
Edge-sensitive inputs react to signal edges: a particular (rising or falling) edge will
cause a service request to be latched; the processor resets the latch when the interrupt
handler executes.
Level-triggered
Level-triggered inputs allow multiple devices to share a common interrupt signal via
wired-OR connections. The processor polls to determine which devices are requesting
service. After servicing a device, the processor may again poll and, if necessary, service
other devices before exiting the ISR.
Edge-triggered
An edge-triggered interrupt is an interrupt signaled by a level transition on the interrupt
line, either a falling edge (high to low) or a rising edge (low to high). A device wishing
to signal an interrupt drives a pulse onto the line and then releases the line to its
inactive state. If the pulse is too short to be detected by polled I/O then special
hardware may be required to detect it.
PROCESSOR RESPONSE
The processor samples the interrupt trigger signal during each instruction cycle, and
will respond to the trigger only if the signal is asserted when sampling occurs.
Regardless of the triggering method, the processor will begin interrupt processing at
the next instruction boundary following a detected trigger.
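A minimal behavioural sketch of that rule, in Python: the fetch-execute loop below samples a pending-request flag only between instructions, so a request raised mid-instruction is serviced once the current instruction completes. All names here are illustrative, not from any real processor:

    pending_irq = False                  # models the latched interrupt request

    def run(program):
        global pending_irq
        for instruction in program:
            instruction()                # the current instruction always completes
            if pending_irq:              # sampled at the instruction boundary
                pending_irq = False      # acknowledge the request
                print("servicing interrupt handler")

    def work():
        print("executing instruction")

    def trigger():                       # e.g. a device asserts its IRQ line
        global pending_irq
        pending_irq = True

    run([work, trigger, work])           # handler runs right after trigger() returns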
REFERENCES:
https://electrosome.com/microprocessor/
https://en.m.wikipedia.org/wiki/Interrupt
• Memory: Typically, a memory module will consist of N words of equal length. Each
word is assigned a unique numerical address (0, 1, . . . , N – 1). A word of data can be
read from or written into the memory. The nature of the operation is indicated by read
and write control signals. The location for the operation is specified by an address.
• I/O module: From an internal (to the computer system) point of view, I/O
is functionally similar to memory. There are two operations, read and write.
Further, an I/O module may control more than one external device. We can refer to
each of the interfaces to an external device as a port and give each a unique address
(e.g., 0, 1,…, M – 1). In addition, there are external data paths for the input and output
of data with an external device. Finally, an I/O module may be able to send interrupt
signals to the processor.
• Processor: The processor reads in instructions and data, writes out data
after processing, and uses control signals to control the overall operation of the
system. It also receives interrupt signals. The preceding list defines the data to be
exchanged. The interconnection structure must support the following types of
transfers: memory to processor, processor to memory, I/O to processor, processor to
I/O, and I/O to or from memory (the last exchanged directly, without going through
the processor, using direct memory access).
BUS INTERCONNECTION
BUS STRUCTURE
A system bus consists of a number of separate lines, each assigned a particular
meaning or function. The lines can be classified into three functional groups:
1. Data lines
2. Address lines
3. Control lines
1. DATA LINES
The data lines provide a path for moving data among the system modules. Collectively
they are called the data bus; the number of lines determines how many bits can be
transferred at a time.
2. ADDRESS LINES
The address lines are used to designate the source or destination of the data on the
data bus; the number of address lines determines the maximum memory capacity of
the system.
3. CONTROL LINES
The control lines are used to control the access to and the use of the data and address
lines. Since the data and address lines are shared by all the components, there must be
a means of controlling their use. Control signals transmit both command and timing
information between the modules.
Typical control lines include
1. Memory write
2. Memory read
3. I/O write
4. I/O read
5. Clock
6. Reset
7. Bus request
8. Bus grant
9. Interrupt request
10. Interrupt ACK
11. Transfer ACK
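To see how a few of these lines cooperate, here is a hedged sketch of one memory-read bus cycle; the signal names follow the list above, but the "bus" is just a Python dictionary and the timing is greatly simplified:

    memory = {0x10: 0xAB}                 # a tiny main memory

    def memory_read(bus, address):
        bus["address"] = address          # master places the address on the address lines
        bus["memory_read"] = True         # asserts the Memory read control line
        bus["data"] = memory[address]     # memory decodes the address, drives the data lines
        bus["transfer_ack"] = True        # Transfer ACK: the data lines are now valid
        value = bus["data"]               # master latches the data
        bus["memory_read"] = False        # lines are released for the next transaction
        bus["transfer_ack"] = False
        return value

    bus = {}
    assert memory_read(bus, 0x10) == 0xAB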
[Figure: a PCI-X Gigabit Ethernet expansion card with both 5 V and 3.3 V support
notches, side B toward the camera.]
Typical PCI cards have either one or two key notches, depending on their signaling
voltage. Cards requiring 3.3 volts have a notch 56.21 mm from the card backplate;
those requiring 5 volts have a notch 104.47 mm from the backplate. This allows cards
to be fitted only into slots with a voltage they support. "Universal cards" accepting
either voltage have both key notches.
Connector pinout
The PCI connector is defined as having 62 contacts on each side of the edge connector,
but two or four of them are replaced by key notches, so a card has 60 or 58 contacts on
each side. Side A refers to the 'solder side' and side B refers to the 'component side'.
The pinout below lists, for each pin number, the side B and side A signals.
Pin   Side B      Side A
 1    −12 V       TRST#
 2    TCK         +12 V
      (JTAG port pins, optional)
 3    Ground      TMS
 4    TDO         TDI
 5    +5 V        +5 V
 6    +5 V        INTA#
 8    INTD#       +5 V
12    Ground      Ground
      (Key notch for 3.3 V-capable cards)
13    Ground      Ground
20    AD[31]      AD[30]
21    AD[29]      +3.3 V
22    Ground      AD[28]
23    AD[27]      AD[26]
24    AD[25]      Ground
25    +3.3 V      AD[24]
      (Address/data bus, upper half)
26    C/BE[3]#    IDSEL
27    AD[23]      +3.3 V
28    Ground      AD[22]
29    AD[21]      AD[20]
30    AD[19]      Ground
31    +3.3 V      AD[18]
32    AD[17]      AD[16]
33    C/BE[2]#    +3.3 V
44    C/BE[1]#    AD[15]
45    AD[14]      +3.3 V
46    Ground      AD[13]
      (Address/data bus, higher half)
47    AD[12]      AD[11]
48    AD[10]      Ground
50    Ground      Ground
      (Key notch for 5 V-capable cards)
51    Ground      Ground
52    AD[08]      C/BE[0]#
54    +3.3 V      AD[06]
55    AD[05]      AD[04]
56    AD[03]      Ground
57    Ground      AD[02]
58    AD[01]      AD[00]
59    IOPWR       IOPWR
61    +5 V        +5 V
62    +5 V        +5 V
64-bit PCI extends this by an additional 32 contacts on each side which provide
AD[63:32], C/BE[7:4]#, the PAR64 parity signal, and a number of power and ground
pins.
Most lines are connected to each slot in parallel. The exceptions are:
Each slot has its own REQ# output to, and GNT# input from, the motherboard
arbiter.
Each slot has its own IDSEL line, usually connected to a specific AD line.
TDO is daisy-chained to the following slot's TDI. Cards without JTAG support must
connect TDI to TDO so as not to break the chain.
PRSNT1# and PRSNT2# for each slot have their own pull-up resistors on the
motherboard. The motherboard may (but does not have to) sense these pins to
determine the presence of PCI cards and their power requirements.
REQ64# and ACK64# are individually pulled up on 32-bit only slots.
The interrupt lines INTA# through INTD# are connected to all slots in different
orders. (INTA# on one slot is INTB# on the next and INTC# on the one after that.)
Notes:
IOPWR is +3.3 V or +5 V, depending on the backplane. The slots also have a ridge in
one of two places which prevents insertion of cards that do not have the
corresponding key notch, indicating support for that voltage standard. Universal
cards have both key notches and use IOPWR to determine their I/O signal levels.
The PCI SIG strongly encourages 3.3 V PCI signaling, requiring support for it since
standard revision 2.3, but most PC motherboards use the 5 V variant. Thus, while
many currently available PCI cards support both, and have two key notches to
indicate that, there are still a large number of 5 V-only cards on the market.
The M66EN pin is an additional ground on 5 V PCI buses found in most PC
motherboards. Cards and motherboards that do not support 66 MHz operation also
ground this pin. If all participants support 66 MHz operation, a pull-up resistor on
the motherboard raises this signal high and 66 MHz operation is enabled. The pin is
still connected to ground via coupling capacitors on each card to preserve
its AC shielding function.
The PCIXCAP pin is an additional ground on conventional PCI buses and cards. If
all cards and the motherboard support the PCI-X protocol, a pull-up resistor on the
motherboard raises this signal high and PCI-X operation is enabled. The pin is still
connected to ground via coupling capacitors on each card to preserve its AC
shielding function.
At least one of PRSNT1# and PRSNT2# must be grounded by the card. The
combination chosen indicates the total power requirements of the card (25 W, 15 W,
or 7.5 W).
SBO# and SDONE are signals from a cache controller to the current target; they are
target inputs, not initiator outputs.
[Figure: a semi-inserted PCI-X card in a 32-bit PCI slot, illustrating the necessity of the
rightmost notch and the extra room on the motherboard in order to remain backwards
compatible.]
Most 32-bit PCI cards will function properly in 64-bit PCI-X slots, but the bus clock rate
will be limited to the clock frequency of the slowest card, an inherent limitation of
PCI's shared bus topology. For example, when a PCI 2.3, 66-MHz peripheral is
installed into a PCI-X bus capable of 133 MHz, the entire bus backplane will be limited
to 66 MHz. To get around this limitation, many motherboards have two or more
PCI/PCI-X buses, with one bus intended for use with high-speed PCI-X peripherals,
and the other bus intended for general-purpose peripherals.
Many 64-bit PCI-X cards are designed to work in 32-bit mode if inserted in shorter 32-
bit connectors, with some loss of performance. An example of this is the Adaptec 29160
64-bit SCSI interface card. However, some 64-bit PCI-X cards do not work in standard
32-bit PCI slots.
Installing a 64-bit PCI-X card in a 32-bit slot will leave the 64-bit portion of the card
edge connector not connected and overhanging. This requires that there be no
motherboard components positioned so as to mechanically obstruct the overhanging
portion of the card edge connector.
http://tutorialbyte.com/2016/01/28/interconnection-structures/
http://casem3.blogspot.com/2016/08/bus-interconnection.html
https://techterms.com/definition/pci
https://en.wikipedia.org/wiki/Conventional_PCI#PCI_bus_transactions
Objectives:
At the end of the lesson the learner will be able to:
Define computer memory system
Identify the different types of memory
Identify the types of cache memory
Understand the design elements and principles of cache design
The computer memory is one of the most important elements in a computer system. It
stores data and instructions required during the processing of data and output results.
Storage may be required for a limited period of time, instantly or for an extended
period of time. Computer memory refers to the electronic holding place for
instructions and data where the processor can read quickly.
Memory Hierarchy
The memory is characterised on the basis of two key factors: capacity and access
time. The lower the access time, the faster the memory.
Parameters of Memory:
The following terms are most commonly used for identifying comparative behaviour
of various memory devices and technologies.
Storage Capacity: It is representative of the size of the memory. The capacity of internal
memory and main memory can be expressed in terms of the number of words or bytes.
Access Modes: A memory is comprised of various memory locations. The information
from these memory locations can be accessed randomly, sequentially, or directly.
Access Time: The access time is the time elapsed between the start of a read or write
operation and the moment the data is made available or written at the desired location.
Physical Characteristics: In this respect, the devices can be categorised into four main
categories: electronic, magnetic, mechanical, and optical.
Permanence of Storage: The degree to which stored data is retained for future use; it is
high for magnetic materials.
The memory unit that communicates directly with the CPU is called main memory.
The primary memory allows the computer to store data for immediate manipulation
and to keep track of what is currently being processed. It is volatile in nature, meaning
that when the power is turned off, the contents of the primary memory are lost forever.
It is also known as read/write memory, that allows CPU to read as well as write data
and instructions into it.
RAM is used for the temporary storage of input data, output data and intermediate
results. RAM is a microchip implemented using semiconductors.
(i) Dynamic RAM (DRAM) It stores each bit of data as a charge on a small capacitor,
which leaks and must therefore be 'refreshed' periodically. DRAM cells are small and
inexpensive, which is why DRAM is commonly used for main memory.
(ii) Static RAM (SRAM) It retains the data as long as power is provided to the memory
chip and need not be 'refreshed' periodically. SRAM uses multiple transistors for each
memory cell and does not use a capacitor. SRAM is often used as cache memory due to
its high speed; it is more expensive than DRAM.
Extended Data Output Dynamic RAM (EDO DRAM) It is a type of RAM chip. It is
used to improve the time to read content from memory and enhance the method of
access.
The data and instructions that are required during the processing of data are brought
from the secondary storage devices and stored in the RAM. For processing it is
required that the data and instructions are accessed from the RAM and stored in the
registers.
Cache memory is a very high speed memory placed in between RAM and CPU. Cache
memory increases the speed of processing.
It is also known as non-volatile memory or permanent storage. It does not lose its
content when the power is switched off. ROM has only read capability, no write
capability. ROM can have data and instructions written to it only one time. Once a
ROM chip is programmed at the time of manufacturing, it cannot be reprogrammed or
rewritten.
(i) Programmable ROM (PROM) It can be written (programmed) by the user once,
after manufacture, using a special device; once programmed, its contents cannot be
changed.
(ii) Erasable Programmable ROM (EPROM) It is similar to PROM, but it can be erased
by exposure to strong ultraviolet light, then rewritten. So it is also known as Ultraviolet
Erasable Programmable ROM (UV EPROM). EPROM was invented by Dov Frohman
of Intel in 1971.
The secondary memory stores much larger amounts of data and information for
extended periods of time. Data in secondary memory cannot be processed directly by
the CPU; it must first be copied into primary storage, i.e., RAM.
Secondary storage is used to store data and programs when they are not being
processed. It is also non-volatile in nature; due to this, the data remains in secondary
storage even when the power is switched off.
Optical Disk:
CD
DVD
Blu-ray Disk
Pen/Flash Drive
Hard Disk:
It is a non-volatile, random-access digital data storage device used for storing and
retrieving digital information using rotating disks (platters) coated with magnetic
material. All programs of a computer are installed on the hard disk.
It consists of a spindle that hold non-magnetic flat circular disks, called platters, which
hold the recorded data. Each platter requires two read/write heads, that is used to
write and read the information from a platter. All the read/ write heads are attached to
a single access arm so that they cannot move independently.
The information is recorded in bands; each band of information is called a track. Each
platter has the same number of tracks and a track location that cuts across all platters is
called a cylinder. The tracks are divided into pie- shaped sections known as sectors.
Floppy Disk:
It is used to store data, but it can store only a small amount of data and is slower to
access than a hard disk. It is round in shape: a thin plastic disk coated with iron oxide.
Data is retrieved or recorded on the surface of the disk through a slot on the envelope.
Floppy disks are removable from the drive. Floppy disks are available in three sizes:
8 inch, 5 1/4 inch, and 3 1/2 inch.
CD:
It is the most popular and least expensive type of optical disk. A CD is capable of being
used as a data storage device along with storing digital audio. Files are stored on
contiguous sectors of the disc.
DVD:
DVDs offer higher storage capacity than compact discs while having the same
dimensions. Depending upon the disk type, a DVD can store several gigabytes of data
(4.7 GB to 17.08 GB). DVDs are primarily used to store music or movies and can be
played back on your television or on the computer. Standard pressed DVDs are not
rewritable media.
Blu-ray Disk:
Blu-ray disk (official abbreviation BD) is an optical disk storage medium designed to
supersede the DVD format. Blu-ray disks hold 25 GB (23.31 GiB) per layer.
The name Blu-ray refers to the blue laser used to read the disk, which allows
information to be stored at a greater density than with the longer-wavelength red laser
used in DVDs.
Blu-ray can hold almost 5 times more data than a single layer DVD.
Pen/Thumb Drive:
A pen drive is also known as a flash drive: a data storage device that consists of flash
memory with a portable USB (Universal Serial Bus) interface. USB flash drives are
typically removable, rewritable, and much smaller than a floppy disk. A USB flash
drive is about the size of a thumb and plugs into a USB port on the computer.
Today, flash drives are available in various storage capacities, such as 256 MB, 512 MB,
1 GB, 4 GB, and 16 GB, up to 64 GB. They are widely used as an easy and small
medium for transferring and storing information.
Memory Stick:
A family of flash memory cards from Sony designed for digital storage in cameras,
camcorders, and other handheld devices. The capacity of a memory stick varies from
4 MB to 256 GB.
Magnetic Tape
Magnetic tapes are made of a plastic film-type material coated with magnetic materials
to store data permanently. Data can be read as well as recorded. A tape is usually
12.5 mm to 25 mm wide and 500 m to 1200 m long. Tapes store data in a sequential
manner. The data stored in magnetic tape is in the form of tiny segments of magnetised
and demagnetised portions on the surface of the material. Magnetic tapes are durable,
and can be written, erased, and rewritten. Magnetic tapes hold very large amounts of
data, which can be accessed sequentially.
There are mainly two types of magnetic tape: tape reel and tape cassette. Each type has
its own requirements. Older systems designed for networks use reel-to-reel tapes;
newer systems use cassettes that hold more data than the huge reels.
Tit-Bits:
The rate at which data is written to disk or read from disk is called the data transfer
rate.
Track: records data bits as tiny magnetic spots.
Sector: holds a block of data that is read or written at one time.
Root directory: the main folder of a disk; it contains information about all folders
on the disk.
Hard disk: a fixed disk, i.e., one that cannot be removed from the drive.
Secondary Memory Devices and their Storage Method and Capacity

S.No   Secondary Memory Device   Storage Method   Capacity
5      CD-ROM                    Optical          640 MB to 680 MB
Memory Measurement:
When you use RAM, ROM, a floppy disk, or a hard disk, the data is measured using
some unit. In computer terminology these units are called the bit, nibble, byte,
kilobyte, megabyte, gigabyte, and so on.
Bit: A bit (binary digit) is the smallest unit of data, a single 0 or 1.
Byte (B): A byte is approximately one character (the letter 'a', the number '1', the
symbol '?', etc.). A group of 8 bits is called a byte.
Nibble: 4 bits make one nibble.
Kilobyte (KB): A kilobyte, exactly 2^10 bytes (1,024 bytes), is approximately a
thousand (10^3) bytes.
Megabyte (MB): A megabyte, exactly 2^20 bytes (1,024 KB), is approximately a
million (10^6) bytes.
Gigabyte (GB): A gigabyte, exactly 2^30 bytes (1,024 MB), is approximately a
billion (10^9) bytes.
Terabyte (TB): A terabyte, exactly 2^40 bytes (1,024 GB), is approximately a
trillion (10^12) bytes.
CACHE MEMORY
Before getting on with the main topic let's have a quick refresher of how memory
systems work - skip to "Waiting for RAM" if you already know about addresses, data
and control buses.
Back in the early days of computing things were simple. There was the processor and
the memory. The processor stored data in the memory when it wanted to and read the
data back when it wanted to.
Let’s just consider RAM, because the only difference between it and ROM is that the
processor cannot write data to ROM. The processor has a connection to the memory
that allows it to indicate which location it wants to store data in or retrieve data from;
this consists of a wire for each bit of the address, making an address “bus”.
You will also know that the number of memory locations that the processor can
address depends on the number of address lines in the bus - each additional address
line doubles the amount of memory that can be used.
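As a quick worked example (plain arithmetic, not tied to any particular machine), n address lines can select 2^n locations, so every extra line doubles the total:

    # Each additional address line doubles the number of addressable locations.
    for n in (16, 17, 20, 32):
        print(n, "address lines ->", 2 ** n, "addressable locations")
    # 16 -> 65536, 17 -> 131072, 20 -> 1048576, 32 -> 4294967296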
The processor also has a data bus, which it uses to send and retrieve data to and from
the RAM. Again the number of wires in the data bus is the number of bits that a single
memory location stores and the number of bits transferred in a single memory
operation.
So far, so good, and in fact nothing much more than was explained about general
memory systems in How Memory Works. However, as well as the address and data
bus there also has to be a control or system bus.
This passes signals to the memory that control exactly what is happening. For example,
there is usually a Read/Write (R/W) line, which indicates the direction in which the
data bus is operating and whether the memory should read the data on the data bus or
use the stored data to set the state of the data bus.
There is also usually a control signal that tells the processor that the data on the data
bus is valid and so on. The exact arrangement of the control bus varies from processor
to processor but you can see what sort of things it has to deal with.
Now we come to the interesting part – computers are made of components that work
at different speeds. One of the control lines usually carries the processor clock because
all operations in the machine are linked to this clock.
The fastest anything happens within the machine is within one clock pulse. So when
the processor wants to write to the memory or read from the memory it takes one clock
pulse.
The problem is that processor chips are made to high-cost, high-speed designs, while
memory components are usually made to lower-cost, slower designs.
Why?
Simply because there is only one processor to be bought and paid for but lots and lots
of memory chips. What this means in practice is that for quite a long time processors
have been able to work much faster than memory.
There was a brief period back in the early days when processor chips ran at a clock rate
of 1MHz and the memory chips could keep up. As soon as the PC and the second
generation processor chips appeared things became more complicated.
The memory used, DRAM, needed more than one processor clock pulse time to store
and retrieve data and the “wait state” was born. The processor would put the address
onto the address bus and the data on the data bus, signal to the memory that it was
ready to write data and then it would sit there for one, two, possibly more, clock pulses
doing nothing at all until the memory had enough time to store the data. It could then
move on to the next instruction.
As you can imagine there was a time when wait states were a big selling point, or
rather a non-selling point. The fewer the wait states a machine needed the faster it
would run your program but the more money it would cost.
Dell, it is rumoured, was so worried about this that it built a 386SX 16MHz machine
using expensive static RAM that was fast enough to keep up. Very expensive and, as
will be explained, quite unnecessary - but an interesting experiment.
Processor clock speeds rocketed from 1MHz, through 4MHz, hit 16MHz and carried
on up to today’s maximum of around 4GHz. There is absolutely no way that memory
chips could keep up with this sort of amazing speed and be cheap enough to supply
the ever increasing amounts of storage needed.
There are a number of ingenious intermediate solutions that boost memory throughput,
but there is only one really great new idea that solves most of the problems: cache
memory.
Cache Addresses
Almost all non-embedded processors, and many embedded processors, support virtual
memory. In essence, virtual memory is a facility that allows programs to address
memory from a logical point of view, without regard to the amount of main memory
physically available. When virtual memory is used, the address fields of machine
instructions contain virtual addresses. For reads to and writes from main memory, a
hardware memory management unit (MMU) translates each virtual address into a
physical address in main memory.
When virtual addresses are used, the system designer may choose to place the cache
between the processor and the MMU or between the MMU and main memory. A
logical cache, also known as a virtual cache, stores data using virtual addresses. The
processor accesses the cache directly, without going through the MMU. A physical
cache stores data using main memory physical addresses.
One obvious advantage of the logical cache is that cache access speed is faster than for
a physical cache, because the cache can respond before the MMU performs an address
translation. The disadvantage has to do with the fact that most virtual memory
systems supply each application with the same virtual memory address space. That is,
each application sees a virtual memory that starts at address 0. Thus, the same virtual
address in two different applications refers to two different physical addresses. The
cache memory must therefore be completely flushed with each application context
switch, or extra bits must be added to each line of the cache to identify which virtual
address space this address refers to.
Cache Size
The size of the cache should be small enough so that the overall average cost per bit is
close to that of main memory alone and large enough so that the overall average
access time is close to that of the cache alone. There are several other motivations for
minimizing cache size. The larger the cache, the larger the number of gates involved in
addressing the cache. The result is that large caches tend to be slightly slower than
small ones, even when built with the same integrated circuit technology and put in
the same place on chip and circuit board. The available chip and board area also
limits cache size. Because the performance of the cache is very sensitive to the nature
of the workload, it is impossible to arrive at a single “optimum” cache size.
Mapping Function
Because there are fewer cache lines than main memory blocks, an algorithm is needed
for mapping main memory blocks into cache lines. Further, a means is needed for
determining which main memory block currently occupies a cache line. The choice of
the mapping function dictates how the cache is organized. Three techniques can be
used: direct, associative, and set associative.
Direct Mapping. The simplest technique, known as direct mapping, maps each block
of main memory into only one possible cache line. The mapping is expressed as

i = j modulo m

where
i = cache line number,
j = main memory block number, and
m = number of lines in the cache.

Figure below shows the mapping for the first m blocks of main memory.
The mapping function is easily implemented using the main memory address. Figure
below illustrates the general mechanism.
For purposes of cache access, each main memory address can be viewed as consisting
of three fields. The least significant w bits identify a unique word or byte within a
block of main memory; in most contemporary machines, the address is at the byte
level. The remaining s bits specify one of the 2^s blocks of main memory. The cache
logic interprets these s bits as a tag of s – r bits (most significant portion) and a line
field of r bits. This latter field identifies one of the m = 2^r lines of the cache. To
summarize:

Address length = (s + w) bits
Number of addressable units = 2^(s + w) words or bytes
Block size = line size = 2^w words or bytes
Number of blocks in main memory = 2^s
Number of lines in cache = m = 2^r
Size of tag = (s – r) bits
The effect of this mapping is that blocks of main memory are assigned to lines of the
cache as follows: cache line 0 holds memory blocks 0, m, 2m, . . .; cache line 1 holds
blocks 1, m + 1, 2m + 1, . . .; and, in general, cache line i holds every block j for which
j modulo m = i.
Thus, the use of a portion of the address as a line number provides a unique mapping
of each block of main memory into the cache. When a block is actually read into its
assigned line, it is necessary to tag the data to distinguish it from other blocks that can
fit into that line. The most significant s – r bits serve this purpose.
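The field splitting just described is easy to express in code. In the Python sketch below the geometry (24-bit addresses, 4-byte blocks, a cache of 2^14 lines) is an assumption chosen purely for illustration:

    # Direct mapping: split an address into tag, line, and word fields.
    # Assumed geometry: w = 2 (4-byte blocks), r = 14 (16384 cache lines),
    # s = 22 block-number bits, so addresses are s + w = 24 bits wide.
    W, R = 2, 14

    def split_address(addr):
        word = addr & ((1 << W) - 1)           # least significant w bits
        line = (addr >> W) & ((1 << R) - 1)    # next r bits: the cache line
        tag = addr >> (W + R)                  # most significant s - r bits
        return tag, line, word

    addr = 0x16339C
    tag, line, word = split_address(addr)
    block = addr >> W                          # main memory block number j
    assert line == block % (1 << R)            # i = j modulo m
    print("tag", hex(tag), "line", hex(line), "word", word)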
The direct mapping technique is simple and inexpensive to implement. Its main
disadvantage is that there is a fixed cache location for any given block. Thus, if a
program happens to reference words repeatedly from two different blocks that map
into the same line, then the blocks will be continually swapped in the cache, and the hit
ratio will be low (a phenomenon known as thrashing).
One approach to lower the miss penalty is to remember what was discarded in case it
is needed again. Since the discarded data has already been fetched, it can be used again
at a small cost. Such recycling is possible using a victim cache: a small, fully associative
cache, typically of 4 to 16 lines, that holds recently discarded blocks and is checked on
a miss.
Associative Mapping. Associative mapping overcomes the disadvantage of direct
mapping by permitting each main memory block to be loaded into any line of the
cache. In this case, the cache control logic interprets a memory address simply as a Tag
and a Word field. The Tag field uniquely identifies a block of main memory. To
determine whether a block is in the cache, the cache control logic must simultaneously
examine every line’s tag for a match. Figure below illustrates the logic.
Note that no field in the address corresponds to the line number, so the number of
lines in the cache is not determined by the address format. To summarize: the address
length is still (s + w) bits, but the tag now occupies the full s bits, and the number of
lines in the cache is independent of the address format.
With associative mapping, there is flexibility as to which block to replace when a new
block is read into the cache. The principal disadvantage of associative mapping is the
complex circuitry required to examine the tags of all cache lines in parallel.
Set-Associative Mapping. Set-associative mapping is a compromise that exhibits the
strengths of both the direct and associative approaches while reducing their
disadvantages. In this case, the cache consists of v sets, each of which contains k lines.
The relationships are

m = v * k
i = j modulo v

where
i = cache set number,
j = main memory block number,
m = number of lines in the cache,
v = number of sets, and
k = number of lines in each set.

This is referred to as k-way set-associative mapping.
Each direct-mapped cache is referred to as a way, consisting of v lines. The first v lines
of main memory are direct mapped into the v lines of each way; the next group of v
lines of main memory are similarly mapped, and so on. The direct-mapped
implementation is typically used for small degrees of associativity (small values of k)
while the associative-mapped implementation is typically used for higher degrees of
associativity.
For set-associative mapping, the cache control logic interprets a memory address as
three fields: Tag, Set, and Word. The d set bits specify one of v = 2^d sets. The s bits of
the Tag and Set fields specify one of the 2^s blocks of main memory. Figure below
illustrates the cache control logic.
With fully associative mapping, the tag in a memory address is quite large and must be
compared to the tag of every line in the cache. With k-way set-associative mapping, the
tag in a memory address is much smaller and is only compared to the k tags within a
single set. To summarize: the Set field occupies d bits, the Tag field occupies (s – d)
bits, and the total number of lines is m = k * v = k * 2^d.
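The corresponding decomposition for k-way set-associative mapping, again with an assumed geometry; the middle field now selects a set of k lines rather than a single line:

    # Set-associative mapping: the address is split into Tag, Set, and Word.
    # Assumed geometry: w = 2 (4-byte blocks) and d = 13 (v = 2^13 sets);
    # with k = 2 ways this gives m = k * v = 2^14 lines in total.
    W, D = 2, 13

    def split_set_associative(addr):
        word = addr & ((1 << W) - 1)
        set_index = (addr >> W) & ((1 << D) - 1)   # d bits select one of v sets
        tag = addr >> (W + D)                      # compared against only k tags
        return tag, set_index, word

    # Block j maps to set i = j modulo v; on a lookup the tag is compared
    # with the k tags stored in that one set, not with every line in the cache.
    addr = 0x16339C
    tag, set_index, word = split_set_associative(addr)
    assert set_index == (addr >> W) % (1 << D)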
Replacement Algorithms
Once the cache has been filled, when a new block is brought into the cache, one of the
existing blocks must be replaced. For direct mapping, there is only one possible line for
any particular block, and no choice is possible. For the associative and set-associative
techniques, a replacement algorithm is needed. To achieve high speed, such an
algorithm must be implemented in hardware.
Probably the most effective is least recently used (LRU): Replace that block in the set
that has been in the cache longest with no reference to it. For two-way set associative,
this is easily implemented. Each line includes a USE bit. When a line is referenced, its
USE bit is set to 1 and the USE bit of the other line in that set is set to 0. When a block is
to be read into the set, the line whose USE bit is 0 is used. Because we are assuming
that more recently used memory locations are more likely to be referenced, LRU
should give the best hit ratio. LRU is also relatively easy to implement for a fully
associative cache.
The cache mechanism maintains a separate list of indexes to all the lines in the cache.
When a line is referenced, it moves to the front of the list. For replacement, the line at
the back of the list is used. Because of its simplicity of implementation, LRU is the most
popular replacement algorithm.
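The list-based LRU mechanism described above can be sketched directly. This is a behavioural model written for clarity, not the hardware implementation:

    # Behavioural sketch of LRU replacement for a fully associative cache:
    # referenced blocks move to the front of a list; the victim comes from the back.
    class LRUCache:
        def __init__(self, num_lines):
            self.num_lines = num_lines
            self.lines = []                    # front = most recently used

        def access(self, block):
            if block in self.lines:            # hit: move to the front
                self.lines.remove(block)
                self.lines.insert(0, block)
                return True
            if len(self.lines) == self.num_lines:
                self.lines.pop()               # miss: evict the least recently used
            self.lines.insert(0, block)        # bring the new block in
            return False

    cache = LRUCache(2)
    for block in (1, 2, 1, 3):                 # block 2 is least recently used when 3 arrives
        cache.access(block)
    assert cache.lines == [3, 1]               # block 2 was replaced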
Another possibility is first-in-first-out (FIFO): Replace that block in the set that has
been in the cache longest. FIFO is easily implemented as a round-robin or
circular buffer technique. Still another possibility is least frequently used (LFU):
Replace that block in the set that has experienced the fewest references. LFU could be
implemented by associating a counter with each line. A technique not based on usage
(i.e., not LRU, LFU, FIFO, or some variant) is to pick a line at random from among
the candidate lines. Simulation studies have shown that random replacement
provides only slightly inferior performance to an algorithm based on usage.
Write Policy
When a block that is resident in the cache is to be replaced, there are two cases
to consider.
1) If the old block in the cache has not been altered, then it may be overwritten with a
new block without first writing out the old block.
2) If at least one write operation has been performed on a word in that line of the
cache, then main memory must be updated by writing the line of cache out to the block
of memory before bringing in the new block.
There are two problems to contend with.
1) More than one device may have access to main memory. For example, an I/O
module may be able to read-write directly to memory. If a word has been altered only
in the cache, then the corresponding memory word is invalid. Further, if the I/O device
has altered main memory, then the cache word is invalid.
2) A more complex problem occurs when multiple processors are attached to the same
bus and each processor has its own local cache. Then, if a word is altered in one cache,
it could conceivably invalidate a word in other caches.
The simplest technique is called write through. Using this technique, all
write operations are made to main memory as well as to the cache, ensuring that
main memory is always valid. Any other processor–cache module can monitor traffic
to main memory to maintain consistency within its own cache. The main
disadvantage of this technique is that it generates substantial memory traffic and may
create a bottleneck.
In a bus organization in which more than one device (typically a processor) has a cache
and main memory is shared, a new problem is introduced. If data in one cache are
altered, this invalidates not only the corresponding word in main memory, but also
that same word in other caches (if any other cache happens to have that same word).
Even if a write-through policy is used, the other caches may contain invalid data. A
system that prevents this problem is said to maintain cache coherency. Possible
approaches to cache coherency include the following:
1) Bus watching with write through: Each cache controller monitors the address lines
to detect write operations to memory by other bus masters. If another master writes to
a location in shared memory that also resides in the cache memory, the cache controller
invalidates that cache entry. This strategy depends on the use of a write-through policy
by all cache controllers.
2) Hardware transparency: Additional hardware is used to ensure that all updates to
main memory via cache are reflected in all caches. Thus, if one processor modifies a
word in its cache, this update is written to main memory. In addition, any matching
words in other caches are similarly updated.
3) Non-cacheable memory: Only a portion of main memory is shared by more than
one processor, and this is designated as noncacheable. In such a system, all accesses to
shared memory are cache misses, because the shared memory is never copied into the
cache. The noncacheable memory can be identified using chip-select logic or high-
address bits.
Line Size
Another design element is the line size. When a block of data is retrieved and placed in
the cache, not only the desired word but also some number of adjacent words
are retrieved. As the block size increases from very small to larger sizes, the hit ratio
will
at first increase because of the principle of locality, which states that data in the vicinity
of a referenced word are likely to be referenced in the near future. As the block size
increases, more useful data are brought into the cache. The hit ratio will begin to
decrease, however, as the block becomes even bigger and the probability of using the
newly fetched information becomes less than the probability of reusing the
information that has to be replaced. Two specific effects come into play:
1) Larger blocks reduce the number of blocks that fit into a cache. Because each block
fetch overwrites older cache contents, a small number of blocks results in data being
overwritten shortly after they are fetched.
2) As a block becomes larger, each additional word is farther from the requested word
and therefore less likely to be needed in the near future.
The relationship between block size and hit ratio is complex, depending on the locality
characteristics of a particular program, and no definitive optimum value has been
found.
Number of Caches
When caches were originally introduced, the typical system had a single cache.
More recently, the use of multiple caches has become the norm. Two aspects of this
design issue concern the number of levels of caches and the use of unified versus split
caches.
Multilevel Caches. As logic density has increased, it has become possible to have a
cache on the same chip as the processor: the on-chip cache. Compared with a cache
reachable via an external bus, the on-chip cache reduces the processor’s external bus
activity and therefore speeds up execution times and increases overall system
performance. When the requested instruction or data is found in the on-chip cache, the
bus access is eliminated. Because of the short data paths internal to the processor,
compared with bus lengths, on-chip cache accesses will complete appreciably faster
than would even zero-wait state bus cycles. Furthermore, during this period the bus is
free to support other transfers.
The inclusion of an on-chip cache leaves open the question of whether an off-chip, or
external, cache is still desirable. Typically, the answer is yes, and most contemporary
designs include both on-chip and external caches. The simplest such organization is
known as a two-level cache, with the internal cache designated as level 1 (L1) and the
external cache designated as level 2 (L2). The reason for including an L2 cache is the
following: If there is no L2 cache and the processor makes an access request for a
memory location not in the L1 cache, then the processor must access DRAM or ROM
memory across the bus. Due to the typically slow bus speed and slow memory access
time, this results in poor performance. On the other hand, if an L2 SRAM (static RAM)
cache is used, then frequently the missing information can be quickly retrieved. If the
SRAM is fast enough to match the bus speed, then the data can be accessed using a
zero-wait state transaction, the fastest type of bus transfer.
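A rough way to quantify that benefit is the effective access time of the hierarchy, weighted by how often each level satisfies a request. The hit ratios and latencies below are made-up assumptions for illustration only:

    # Rough model of effective memory access time with L1 and L2 caches.
    t_l1, t_l2, t_mem = 1.0, 10.0, 100.0   # assumed access times in nanoseconds
    h1, h2 = 0.90, 0.08                    # fraction of accesses served by L1, by L2

    effective = h1 * t_l1 + h2 * t_l2 + (1 - h1 - h2) * t_mem
    print("effective access time:", effective, "ns")   # 0.9 + 0.8 + 2.0 = 3.7 ns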
REFERENCES:
https://www.i-programmer.info/babbages-bag/375-cache-memory.html
https://www.informationq.com/computer-memory-overview/
https://quickcse.wordpress.com/2018/08/12/elements-of-cache-design/
SEMICONDUCTOR MEMORIES:
There is also volatile memory. This is memory that loses its data once power is cut off,
while non-volatile memory retains data even without power.
The table below compares the main semiconductor memory types (the RAM types are
volatile; the ROM family and FeRAM are non-volatile):

Type      Volatility     Data Storage Method      No. of Read Operations     No. of Rewrites                  Memory Cell
SRAM      Volatile       Voltage bias             ∞                          ∞                                Data stored in a flip-flop circuit
DRAM      Volatile       Voltage bias + refresh   ∞                          ∞                                Maintains charge in the capacitor
FeRAM     Non-volatile   Unnecessary              10 billion to 1 trillion   10 billion to 1 trillion times   Polarization of the ferroelectric material
Mask ROM  Non-volatile   Unnecessary              ∞                          0 times                          Ions implanted in a transistor
EPROM     Non-volatile   Unnecessary              ∞                          100 times                        Maintains charge in the floating gate
EEPROM    Non-volatile   Unnecessary              ∞                          100,000 to 1 million times       Maintains charge in the floating gate
FLASH     Non-volatile   Unnecessary              ∞                          10,000 to 100,000 times          Maintains charge in the floating gate
Random-access memory (RAM /ræm/) is a form of computer memory that can be read
and changed in any order, typically used to store working data and machine
code. A random-access memory device allows data items to be read or written in
almost the same amount of time irrespective of the physical location of data inside the
memory. In contrast, with other direct-access data storage media such as hard
disks, CD-RWs, DVD-RWs and the older magnetic tapes and drum memory, the time
required to read and write data items varies significantly depending on their physical
locations on the recording medium, due to mechanical limitations such as media
rotation speeds and arm movement.
RAM contains multiplexing and demultiplexing circuitry, to connect the data lines to
the addressed storage for reading or writing the entry. Usually more than one bit of
storage is accessed by the same address, and RAM devices often have multiple data
lines and are said to be "8-bit" or "16-bit", etc. devices.
In today's technology, random-access memory takes the form of integrated circuit (IC)
chips with MOS (metal-oxide-semiconductor) memory cells. RAM is normally
associated with volatile types of memory (such as DRAM modules), where stored
information is lost if power is removed, although non-volatile RAM has also been
developed. Other types of non-volatile memories exist that allow random access for
read operations, but either do not allow write operations or have other kinds of
limitations on them. These include most types of ROM and a type of flash
memory called NOR-Flash.
The two main types of volatile random-access semiconductor memory are static
random-access memory (SRAM) and dynamic random-access memory (DRAM).
Commercial uses of semiconductor RAM date back to 1965, when IBM introduced the
SP95 SRAM chip for their System/360 Model 95 computer, and Toshiba used DRAM
memory cells for its Toscal BC-1411 electronic calculator, both based on bipolar
transistors. Commercial MOS memory, based on MOS transistors, was developed in
the late 1960s, and has since been the basis for all commercial semiconductor memory.
The first commercial DRAM IC chip, the Intel 1103, was introduced in October
1970. Synchronous dynamic random-access memory (SDRAM) later debuted with
the Samsung KM48SL2000 chip in 1992.
The two widely used forms of modern RAM are static RAM (SRAM) and dynamic
RAM (DRAM). In SRAM, a bit of data is stored using the state of a six-
transistor memory cell. This form of RAM is more expensive to produce, but is
generally faster and requires less dynamic power than DRAM. In modern computers,
SRAM is often used as cache memory for the CPU. DRAM stores a bit of data using a
transistor and capacitor pair, which together comprise a DRAM cell. The capacitor
holds a high or low charge (1 or 0, respectively), and the transistor acts as a switch that
lets the control circuitry on the chip read the capacitor's state of charge or change it. As
this form of memory is less expensive to produce than static RAM, it is the
predominant form of computer memory used in modern computers.
Both static and dynamic RAM are considered volatile, as their state is lost or reset when
power is removed from the system. By contrast, read-only memory (ROM) stores data
by permanently enabling or disabling selected transistors, such that the memory
cannot be altered. Writeable variants of ROM (such as EEPROM and flash memory)
share properties of both ROM and RAM, enabling data to persist without power and to
be updated without requiring special equipment. These persistent forms of
semiconductor ROM include USB flash drives, memory cards for cameras and portable
devices, and solid-state drives. ECC memory (which can be either SRAM or DRAM)
includes special circuitry to detect and/or correct random faults (memory errors) in the
stored data, using parity bits or error correction codes.
In general, the term RAM refers solely to solid-state memory devices (either DRAM or
SRAM), and more specifically the main memory in most computers. In optical storage,
the term DVD-RAM is somewhat of a misnomer since, unlike CD-RW or DVD-RW, it
does not need to be erased before reuse. Nevertheless, a DVD-RAM behaves much like
a hard disk drive, if somewhat slower.
The memory cell is the fundamental building block of computer memory. The memory
cell is an electronic circuit that stores one bit of binary information and it must be set to
store a logic 1 (high voltage level) and reset to store a logic 0 (low voltage level). Its
value is maintained/stored until it is changed by the set/reset process. The value in the
memory cell can be accessed by reading it.
In SRAM, the memory cell is a type of flip-flop circuit, usually implemented
using FETs. This means that SRAM requires very low power when not being accessed,
but it is expensive and has low storage density.
A second type, DRAM, is based around a capacitor. Charging and discharging this
capacitor can store a "1" or a "0" in the cell. However, the charge in this capacitor
slowly leaks away, and must be refreshed periodically. Because of this refresh process,
DRAM uses more power, but it can achieve greater storage densities and lower unit
costs compared to SRAM.
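To see why the refresh process is needed, the toy model below (Python) tracks a DRAM cell's capacitor charge as it leaks and is periodically restored; the decay rate, sense threshold and refresh interval are invented for illustration and do not describe any real device.

    # Toy model of DRAM charge leakage and refresh. The decay rate,
    # sense threshold and refresh interval are invented values.
    LEAK_PER_MS = 0.05      # fraction of remaining charge lost per ms (assumed)
    SENSE_THRESHOLD = 0.5   # below this, a stored 1 is misread as 0 (assumed)
    REFRESH_INTERVAL = 8    # refresh the cell every 8 ms (assumed)

    def read_bit(charge):
        # The sense circuitry reads the cell as 1 if enough charge remains.
        return 1 if charge >= SENSE_THRESHOLD else 0

    charge = 1.0   # a logic 1 stored as a fully charged capacitor
    for ms in range(1, 25):
        charge *= (1.0 - LEAK_PER_MS)         # the charge slowly leaks away
        if ms % REFRESH_INTERVAL == 0:
            charge = float(read_bit(charge))  # refresh: read the bit, rewrite it fully
        print(ms, round(charge, 3), read_bit(charge))

In this toy model an unrefreshed 1 would drop below the threshold after about 14 ms; refreshing every 8 ms keeps the bit readable indefinitely, at the cost of the extra power noted above.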
To be useful, memory cells must be readable and writeable. Within the RAM device,
multiplexing and demultiplexing circuitry is used to select memory cells. Typically, a
RAM device has a set of address lines A0... An, and for each combination of bits that
may be applied to these lines, a set of memory cells are activated. Due to this
addressing, RAM devices virtually always have a memory capacity that is a power of
two.
Usually several memory cells share the same address. For example, a 4-bit 'wide' RAM
chip has 4 memory cells for each address. Often the width of the memory and that of
the microprocessor are different; for a 32-bit microprocessor, eight 4-bit RAM chips
would be needed.
Often more addresses are needed than can be provided by a single device. In that case,
multiplexors external to the device are used to activate the correct device being
accessed.
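A short sketch (Python) of the arithmetic in the last two paragraphs; the chip widths and address-line counts are assumed for illustration.

    # Width expansion: chips side by side supply a full processor word.
    CPU_WIDTH = 32    # data-bus width of the microprocessor, in bits (assumed)
    CHIP_WIDTH = 4    # width of one RAM chip, in bits (assumed)
    print(CPU_WIDTH // CHIP_WIDTH)   # 8 chips give a 32-bit word

    # Depth expansion: if one device has 10 address lines (1024 addresses)
    # but 4096 addresses are needed, the two high-order address bits drive
    # an external decoder that activates the correct device.
    CHIP_ADDR_BITS = 10   # address lines per device (assumed)

    def decode(address):
        # Split an address into (device select, address within the device).
        chip_select = address >> CHIP_ADDR_BITS          # high-order bits
        offset = address & ((1 << CHIP_ADDR_BITS) - 1)   # low-order bits
        return chip_select, offset

    print(decode(0))      # (0, 0):    first device, first word
    print(decode(3000))   # (2, 952):  third device
    print(decode(4095))   # (3, 1023): last device, last word

Because every combination of the address-line bits selects some set of cells, capacities come out as powers of two, as noted above.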
MEMORY HIERARCHY
One can read and over-write data in RAM. Many computer systems have a memory
hierarchy consisting of processor registers, on-die SRAM caches,
external caches, DRAM, paging systems and virtual memory or swap space on a hard
drive. This entire pool of memory may be referred to as "RAM" by many developers,
even though the various subsystems can have very different access times, violating the
original concept behind the random access term in RAM. Even within a hierarchy level
such as DRAM, the specific row, column, bank, rank, channel,
or interleave organization of the components make the access time variable, although
not to the extent that access time to rotating storage media or a tape is variable. The
overall goal of using a memory hierarchy is to obtain the highest possible average
access performance while minimizing the total cost of the entire memory system
(generally, the memory hierarchy follows the access time with the fast CPU registers at
the top and the slow hard drive at the bottom).
READ-ONLY MEMORY
Read-only memory (ROM) is a type of storage medium that permanently stores data
on personal computers (PCs) and other electronic devices. It contains the
programming needed to start a PC, which is essential for boot-up; it performs major
input/output tasks and holds programs or software instructions.
There are numerous ROM chips located on the motherboard and a few on expansion
boards. The chips are essential for the basic input/output system (BIOS), boot up,
reading and writing to peripheral devices, basic data management and the software for
basic processes for certain utilities.
Other types of non-volatile memory include erasable programmable ROM (EPROM),
electrically erasable programmable ROM (EEPROM) and flash memory.
Because ROM cannot be changed and is read-only, it is mainly used for firmware.
Firmware is software programs or sets of instructions that are embedded into a
hardware device. It supplies the needed instructions on how a device communicates
with various hardware components. Firmware is referred to as semi-permanent
because it does not change unless it is updated. Firmware includes BIOS, erasable
programmable ROM (EPROM) and the ROM configurations for software.
REFERENCES:
https://en.wikipedia.org/wiki/Semiconductor_memory
https://www.rohm.com/electronics-basics/memory/what-is-semiconductor-memory
https://en.wikipedia.org/wiki/Random-access_memory
https://www.techopedia.com/definition/2804/read-only-memory-rom
Objectives:
At the end of the lesson the learner will be able to:
Identify error detection and correction techniques in semiconductor memories
Define DRAM
Identify the new enhancements in DRAM architectures
Error detection codes − are used to detect the error(s) present in the received data
(bit stream). These codes contain some bit(s), which are included (appended) to the
original bit stream. These codes detect the error if it occurred during transmission of
the original data (bit stream). Example − parity code, Hamming code.
Error correction codes − are used to correct the error(s) present in the received data
(bit stream) so that we get the original data back. Error correction codes use a
similar strategy to error detection codes. Example − Hamming code.
Therefore, to detect and correct errors, additional bit(s) are appended to the data
bits at the time of transmission.
Parity Code
It is easy to include (append) one parity bit either to the left of the MSB or to the right
of the LSB of the original bit stream. There are two types of parity codes, namely even
parity code and odd parity code, based on the type of parity chosen.
The value of the even parity bit should be zero if an even number of ones is present in
the binary code; otherwise, it should be one, so that an even number of ones is present
in the even parity code. An even parity code contains the data bits and the even parity
bit.
The following table shows the even parity codes corresponding to each 3-bit binary
code. Here, the even parity bit is included to the right of LSB of binary code.
Binary Code    Even Parity Bit    Even Parity Code
000            0                  0000
001            1                  0011
010            1                  0101
011            0                  0110
100            1                  1001
101            0                  1010
110            0                  1100
111            1                  1111
Here, the number of bits present in the even parity codes is 4, so the possible even
numbers of ones in these even parity codes are 0, 2 and 4.
If the other system receives one of these even parity codes, then there is no error in the
received data. The bits other than the even parity bit are the same as those of the
binary code.
If the other system receives a code other than these even parity codes, then there is an
error in the received data. In this case, we can't predict the original binary code
because we don't know the bit position(s) of the error.
Therefore, the even parity bit is useful only for detection of an error in the received
parity code; it is not sufficient to correct the error.
The value of the odd parity bit should be zero if an odd number of ones is present in
the binary code; otherwise, it should be one, so that an odd number of ones is present
in the odd parity code. An odd parity code contains the data bits and the odd parity
bit.
The following table shows the odd parity codes corresponding to each 3-bit binary
code. Here, the odd parity bit is included to the right of LSB of binary code.
Binary Code    Odd Parity Bit    Odd Parity Code
000            1                 0001
001            0                 0010
010            0                 0100
011            1                 0111
100            0                 1000
101            1                 1011
110            1                 1101
111            0                 1110
Here, the number of bits present in the odd parity codes is 4, so the possible odd
numbers of ones in these odd parity codes are 1 and 3.
If the other system receives one of these odd parity codes, then there is no error in the
received data. The bits other than the odd parity bit are the same as those of the
binary code.
If the other system receives a code other than these odd parity codes, then there is an
error in the received data. In this case, we can't predict the original binary code
because we don't know the bit position(s) of the error.
Therefore, the odd parity bit is useful only for detection of an error in the received
parity code; it is not sufficient to correct the error.
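Before moving on to Hamming codes, here is a minimal sketch (Python) of parity generation and checking as described above; the function names are chosen for illustration only.

    # Minimal sketch of parity generation and checking; function names
    # are illustrative, not from any particular library.
    def add_parity(bits, odd=False):
        # Append a parity bit to the right of the LSB of a '0'/'1' string.
        ones = bits.count("1")
        if odd:
            parity = "0" if ones % 2 == 1 else "1"   # total count of ones becomes odd
        else:
            parity = "0" if ones % 2 == 0 else "1"   # total count of ones becomes even
        return bits + parity

    def check_parity(code, odd=False):
        # True means the expected parity holds, i.e. no error was detected.
        return code.count("1") % 2 == (1 if odd else 0)

    print(add_parity("001"))              # '0011', as in the even parity table
    print(add_parity("110", odd=True))    # '1101', as in the odd parity table
    print(check_parity("0011"))           # True: no error detected
    print(check_parity("0111"))           # False: a single-bit error is detected

Flipping any two bits leaves the parity unchanged, which is why a single parity bit detects only an odd number of bit errors and, as stated above, cannot locate the error.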
Hamming Code
Hamming code is useful for both detection and correction of error present in the
received data. This code uses multiple parity bits and we have to place these parity
bits in the positions of powers of 2.
The minimum value of 'k' for which the following relation is correct (valid) is nothing
but the required number of parity bits.
$$2^k\geq n+k+1$$
Where, 'n' is the number of bits in the binary code (data bits) and 'k' is the number of
parity bits.
Based on the requirement, we can use either even parity or odd parity while forming a
Hamming code. But the same parity technique should be used at the receiver in order
to find whether any error is present in the received data. The parity bits are computed
as follows:
Find the value of p1, based on the number of ones present in bit positions b3, b5,
b7 and so on. All these bit positions (suffixes) in their equivalent binary have '1'
in the place value of 2^0.
Find the value of p2, based on the number of ones present in bit positions b3, b6,
b7 and so on. All these bit positions (suffixes) in their equivalent binary have '1'
in the place value of 2^1.
Find the value of p3, based on the number of ones present in bit positions b5, b6,
b7 and so on. All these bit positions (suffixes) in their equivalent binary have '1'
in the place value of 2^2.
At the receiver, the check bits are computed in the same way from the received code.
Find the value of c1, based on the number of ones present in bit positions b1, b3,
b5, b7 and so on. All these bit positions (suffixes) in their equivalent binary have
'1' in the place value of 2^0.
Find the value of c2, based on the number of ones present in bit positions b2, b3,
b6, b7 and so on. All these bit positions (suffixes) in their equivalent binary have
'1' in the place value of 2^1.
Find the value of c3, based on the number of ones present in bit positions b4, b5,
b6, b7 and so on. All these bit positions (suffixes) in their equivalent binary have
'1' in the place value of 2^2.
The decimal equivalent of the check bits in the received data gives the value of bit
position, where the error is present. Just complement the value present in that bit
position. Therefore, we will get the original binary code after removing parity bits.
Example 1
Let us find the Hamming code for binary code, d4d3d2d1 = 1000. Consider even parity
bits.
We can find the required number of parity bits by using the following mathematical
relation.
$$2^k\geq n+k+1$$
The minimum value of k that satisfies the above relation is 3, since 2^3 = 8 ≥ 4 + 3 + 1.
Hence, we require 3 parity bits p1, p2, and p3, and the Hamming code will have 7 bits,
since there are 4 bits in the binary code and 3 parity bits. Placing the parity bits at the
power-of-2 positions gives b7 b6 b5 b4 b3 b2 b1 = d4 d3 d2 p3 d1 p2 p1 =
1 0 0 p3 0 p2 p1. With even parity, p1 covers b3, b5, b7 = 0, 0, 1, so p1 = 1; p2 covers
b3, b6, b7 = 0, 0, 1, so p2 = 1; and p3 covers b5, b6, b7 = 0, 0, 1, so p3 = 1. The
Hamming code is therefore 1001011.
Example 2
Suppose the check bits computed from a received Hamming code are c3c2c1 = 011.
The decimal value of the check bits gives the position of the error in the received
Hamming code: 011 is 3, so the error is present in the third bit (b3). Just complement
the value present in that bit position and remove the parity bits in order to get the
original binary code.
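The following sketch (Python) carries out the (7,4) Hamming procedure described above with even parity, reproducing Example 1 and then correcting a single-bit error as in Example 2; the helper names are chosen for illustration.

    # Sketch of (7,4) even-parity Hamming encoding and single-error
    # correction, following the bit-position rules above.
    def hamming_encode(d4, d3, d2, d1):
        # Build the code b7 b6 b5 b4 b3 b2 b1 = d4 d3 d2 p3 d1 p2 p1.
        b3, b5, b6, b7 = d1, d2, d3, d4
        p1 = (b3 + b5 + b7) % 2   # positions with '1' in place value 2^0
        p2 = (b3 + b6 + b7) % 2   # positions with '1' in place value 2^1
        p3 = (b5 + b6 + b7) % 2   # positions with '1' in place value 2^2
        return [b7, b6, b5, p3, b3, p2, p1]

    def hamming_check(code):
        # Return the decimal value of c3 c2 c1 (0 means no error detected).
        b7, b6, b5, b4, b3, b2, b1 = code
        c1 = (b1 + b3 + b5 + b7) % 2
        c2 = (b2 + b3 + b6 + b7) % 2
        c3 = (b4 + b5 + b6 + b7) % 2
        return c3 * 4 + c2 * 2 + c1

    code = hamming_encode(1, 0, 0, 0)
    print(code)            # [1, 0, 0, 1, 0, 1, 1] -> 1001011, as in Example 1

    code[4] ^= 1           # corrupt b3 (the list stores b7 first, so index 4)
    pos = hamming_check(code)
    print(pos)             # 3: the error is in bit b3
    code[7 - pos] ^= 1     # complement the erroneous bit to correct it

The syndrome c3c2c1 is zero only when every parity group checks out, so a single-bit error in any of the seven positions, parity bits included, can be located and corrected.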
The traditional DRAM is constrained both by its internal architecture and by its
interface to the processor's memory bus. The most common enhancements to the
DRAM architecture are:
1. Synchronous DRAM (SDRAM)
2. Rambus DRAM (RDRAM)
3. Double Data Rate DRAM (DDR DRAM)
4. Cache DRAM (CDRAM)
Synchronous DRAM (SDRAM)
SDRAM exchanges data with the processor synchronized to an external clock signal,
running at the full speed of the processor/memory bus without imposing wait states.
One word of data is transmitted per clock cycle (single data rate); all control, address,
and data signals are valid (and latched) only on a clock edge. Typical clock frequencies
are 100 and 133 MHz.
SDRAM has a multiple-bank internal architecture that improves opportunities for on-
chip parallelism. Generally it uses dual data banks internally: it starts an access in one
bank, then in the next, and then receives data from the first bank, then from the
second, as the sketch below illustrates.
SDRAM performs best when it is transferring large blocks of data serially, as in
applications such as word processing, spreadsheets and multimedia.
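As a rough illustration of the dual-bank arrangement, the toy calculation below (Python) compares back-to-back accesses with overlapped (interleaved) accesses; the cycle counts are assumed, illustrative numbers.

    # Toy comparison of sequential vs. interleaved access to two internal
    # SDRAM banks. All cycle counts are assumed, illustrative values.
    ACCESS_CYCLES = 4   # cycles for a bank to perform an internal access (assumed)
    BURST_WORDS = 4     # words transferred per access, one per clock (assumed)
    N_BURSTS = 4        # bursts to read, alternating between the two banks

    # One bank alone: each access must finish before its burst is sent.
    sequential = N_BURSTS * (ACCESS_CYCLES + BURST_WORDS)

    # Two banks interleaved: while one bank transfers its burst, the other
    # performs its access, so only the first access time is exposed.
    interleaved = ACCESS_CYCLES + N_BURSTS * BURST_WORDS

    print(sequential)    # 32 cycles
    print(interleaved)   # 20 cycles

The overlap works here because the assumed access time equals the burst time, so the bus stays busy every cycle after the first access; this is the sense in which multiple banks improve on-chip parallelism.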
Rambus DRAM (RDRAM)
RDRAM chips are vertical packages with all pins on one side. The chip exchanges data
with the processor over 28 wires no more than 12 cm long, and the bus can address up
to 320 RDRAM chips. Entire data blocks are accessed and transferred out over a high-
speed bus-like interface (500 Mbps to 1.6 Gbps).
The system-level design is tricky, because the bus itself defines impedance, clocking
and signaling very precisely. RDRAM chips are also more expensive than conventional
DRAM. Concurrent RDRAMs have been used in video games, while Direct RDRAMs
have been used in computers.
Double Data Rate DRAM (DDR DRAM)
DDR DRAM uses both the rising (positive) edge and the falling (negative) edge of the
clock for data transfer. That is, DDR SDRAM can send data twice per clock cycle: once
on the rising edge of the clock pulse and once on the falling edge.
DDR DRAM technology has continued to improve. The later generations (DDR2 and
DDR3) increase the data rate by raising the operational frequency of the RAM chip
and by enlarging the prefetch buffer (2 bits per access in DDR, 4 bits in DDR2 and 8
bits in DDR3).
DDR can transfer data at an effective rate in the range of 200 MHz to 600 MHz, DDR2
in the range of 400 MHz to 1066 MHz, and DDR3 in the range of 800 MHz to
1600 MHz.
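These transfer rates translate into peak bandwidth once the bus width is taken into account. The arithmetic is sketched below (Python); the 64-bit module width is the common case but is assumed here for illustration.

    # Peak bandwidth = transfers per second x bytes per transfer.
    # A 64-bit (8-byte) module width is assumed.
    BUS_WIDTH_BYTES = 8   # 64-bit module (assumed)

    def peak_bandwidth_mb_s(mega_transfers_per_s):
        # Peak bandwidth in MB/s for a transfer rate given in MT/s.
        return mega_transfers_per_s * BUS_WIDTH_BYTES

    print(peak_bandwidth_mb_s(400))    # DDR-400:   3200 MB/s (sold as PC-3200)
    print(peak_bandwidth_mb_s(800))    # DDR2-800:  6400 MB/s (PC2-6400)
    print(peak_bandwidth_mb_s(1600))   # DDR3-1600: 12800 MB/s (PC3-12800)

Because data moves on both clock edges, DDR-400 actually runs its bus clock at 200 MHz; the rates quoted above already count both edges.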
REFERENCES
https://www.tutorialspoint.com/digital_circuits/digital_circuits_error_detection_correction_codes.htm
https://searchstorage.techtarget.com/definition/DRAM
http://abhaycopi.blogspot.com/2014/04/advanced-dram-organization.html