Computer System and Network
(COMP 23)
ENCODED BY: DONDON LEDAMA
LESSON I:
Introduction
Computer System Components
Recent advances in microelectronic technology have made computers an integral
part of our society. Each step in our everyday lives may be influenced by computer
technology: we awake to a digital alarm clock's beaming of preselected music at the right
time, drive to work in a digital-processor-controlled automobile, work in an extensively
automated office, shop for computer-coded grocery items, and return to rest in the
computer-regulated heating and cooling environment of our homes. It may not be
necessary to understand the detailed operating principles of a jet plane or an automobile
in order to use and enjoy the benefits of these technical marvels. But a fair
understanding of the operating principles, capabilities, and limitations of digital computers
is necessary if we are to use them efficiently. This book is designed to give
such an understanding of the operating principles of digital computers. This chapter will
begin by describing the organization of a general-purpose digital computer system and
then will briefly trace the evolution of computers.
The diagram shows a general view of how desktop and workstation computers are
organized. Different systems have different details, but in general all computers consist of
components (processor, memory, controllers, video) connected together with a bus.
Physically, a bus consists of many parallel wires, usually printed (in copper) on the main
circuit board of the computer. Data signals, clock signals, and control signals are sent on
the bus back and forth between components. A particular type of bus follows a carefully
written standard that describes the signals that are carried on the wires and what the
signals mean. The PCI standard (for example) describes the PCI bus used on most current
PCs.
The way in which devices connected to a bus cooperate is another part of a bus
standard.
Input/output controllers receive input and output requests from the central
processor, and then send device-specific control signals to the device they control. They
also manage the data flow to and from the device. This frees the central processor from
involvement with the details of controlling each device. I/O controllers are needed only
for those I/O devices that are part of the system.
Often the I/O controllers are part of the electronics on the main circuit board (the
motherboard) of the computer. Sometimes an uncommon device requires its own
controller, which must be plugged into a connector (an expansion slot) on the
motherboard.
Main Memory
Main memory (also called main storage or just memory) holds the bit patterns of
machine instructions and the bit patterns of data. Memory chips and the electronics that
controls them are concerned only with saving bit patterns and returning them when
requested. No distinction is made between bit patterns that are intended as instructions
and bit patterns that are intended as data.
In practice, data and instructions are often placed in different sections of memory,
but this is a matter of software organization, not a hardware requirement. Also, most
computers have special sections of memory that permanently hold programs (firmware
stored in ROM), and other sections that are permanently used for special purposes.
The amount of memory on a system is often described in terms of:
Kilobyte: 2^10 = 1024 bytes
Megabyte: 2^20 = 1024 kilobytes
Gigabyte: 2^30 = 1024 megabytes
Terabyte: 2^40 = 1024 gigabytes
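The relationships above are just powers of two; a minimal C sketch (names are illustrative) makes the arithmetic concrete:

#include <stdio.h>

int main(void)
{
    unsigned long long kilobyte = 1ULL << 10;   /* 2^10 bytes */
    unsigned long long megabyte = 1ULL << 20;   /* 2^20 bytes */
    unsigned long long gigabyte = 1ULL << 30;   /* 2^30 bytes */
    unsigned long long terabyte = 1ULL << 40;   /* 2^40 bytes */

    printf("1 KB = %llu bytes\n", kilobyte);
    printf("1 MB = %llu KB\n", megabyte / kilobyte);  /* prints 1024 */
    printf("1 GB = %llu MB\n", gigabyte / megabyte);  /* prints 1024 */
    printf("1 TB = %llu GB\n", terabyte / gigabyte);  /* prints 1024 */
    return 0;
}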
These days (Winter 2005) the amount of main memory in a new desktop computer
ranges from 256 megabytes to 1 gigabyte. Hard disks and other secondary storage
devices are tens or hundreds of gigabytes. Backup storage comes in sizes as large as
several terabytes.
Addresses
The assembly language of this course is for the MIPS32 chip, so we will use 32-bit
addresses. The assembly language of the 64-bit MIPS chips is similar.
The MIPS has an address space of 2^32 bytes. A gigabyte is 2^30 bytes, so the MIPS has
4 gigabytes of address space. Ideally, all of these memory locations would be
implemented using memory chips (usually called RAM). RAM costs about $200 per
gigabyte. Installing the maximum amount of memory as RAM would cost about $800.
This might be more than you want to spend. Hard disk storage costs much less per
gigabyte. Hard disks cost about $50 per gigabyte (winter, 2005).
On modern computers, the full address space is present no matter how much RAM
has been installed. This is done by keeping some parts of the full address space on disk
and some parts in RAM. The RAM, the hard disk, some special electronics, and the
operating system work together to provide the full 32-bit address space. To a user or an
applications programmer it looks as if all 2^32 bytes of main memory are present.
This method of providing the full address space by using a combination of RAM
and the hard disk is called virtual memory. The word virtual means "appearing to
exist, but not really there." Some computer geeks have a virtual social life.
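The core mechanism behind virtual memory is translating each virtual address to a physical location one page at a time. The sketch below is a simplified, hypothetical illustration (the 4 KB page size, the single-level table, and all names are assumptions made for this example, not how any particular MIPS operating system lays things out):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)          /* 4096-byte pages */
#define NUM_PAGES  (1u << (32 - PAGE_SHIFT))   /* 2^20 pages cover 2^32 bytes */
#define ON_DISK    0xFFFFFFFFu                 /* marker: page not in RAM */

/* Hypothetical single-level page table: entry = physical frame number. */
static uint32_t page_table[NUM_PAGES];

uint32_t translate(uint32_t vaddr)
{
    uint32_t page   = vaddr >> PAGE_SHIFT;       /* which page */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);   /* where within the page */
    uint32_t frame  = page_table[page];

    if (frame == ON_DISK) {
        /* Page fault: here the OS would fetch the page from disk
           into a free RAM frame and update the table. */
    }
    return (frame << PAGE_SHIFT) | offset;       /* physical address */
}

int main(void)
{
    page_table[3] = 7;   /* pretend page 3 is resident in frame 7 */
    printf("0x%08x\n", (unsigned)translate(0x00003010));  /* prints 0x00007010 */
    return 0;
}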
Cache Memory
Computer systems also have cache memory. Cache memory is very fast RAM that is
inside (or close to) the processor. It duplicates sections of main storage that are heavily
used by the currently running programs. The processor does not have to use the
system bus to get or store data in cache memory. Access to cache memory is much
faster than to normal main memory.
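Because the cache duplicates recently used blocks of main storage, programs run faster when they touch nearby addresses in sequence. The short experiment below (array size chosen arbitrarily) sums the same data in two orders; the first loop walks consecutive addresses and reuses cached blocks, while the second jumps across memory and tends to miss:

#include <stdio.h>
#define N 1024

static int a[N][N];

int main(void)
{
    long sum = 0;
    int i, j;

    /* Row-major traversal: consecutive addresses, cache-friendly. */
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            sum += a[i][j];

    /* Column-major traversal: each access jumps N*sizeof(int) bytes,
       so most touches land outside the blocks already cached. */
    for (j = 0; j < N; j++)
        for (i = 0; i < N; i++)
            sum += a[i][j];

    printf("%ld\n", sum);
    return 0;
}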
Contents of Memory
The memory system merely stores bit patterns. That some of these patterns
represent integers, that some represent characters, and that some represent
instructions (and so on) is of no concern to the electronics. How these patterns are used
depends on the programs that use them. A word processor program, for example, is
written to process patterns that represent characters. A spreadsheet program processes
patterns that represent numbers.
Of course, most programs process several types of data, and must keep track of how
each is used. Often programs keep the various uses of memory in separate sections,
but that is a programming convention, not a requirement of electronics.
Any byte in main storage can contain any 8-bit pattern. No byte of main storage can
contain anything but an 8-bit pattern. There is nothing in the memory system of a
computer that says what a pattern represents.
Computer System Organization
Before we look at the C language, let us look at the overall organization of computing
systems. Figure 1.1 shows a block diagram of a typical computer system. Notice it is
divided into two major sections: hardware and software.
Computer Hardware
The physical machine, consisting of electronic circuits, is called the hardware. It consists of several major units: the Central Processing Unit (CPU), Main Memory, Secondary Memory and Peripherals.
The CPU is the major component of a computer; the "electronic brain" of the machine.
It consists of the electronic circuits needed to perform operations on the data. Main
Memory is where programs that are currently being executed as well as their data are
stored. The CPU fetches program instructions in sequence, together with the required
data, from Main Memory and then performs the operation specified by the instruction.
Information may be both read from and written to any location in Main Memory so the
devices used to implement this block are called random access memory chips (RAM).
The contents of Main Memory (often simply called memory) are both temporary (the
programs and data reside there only when they are needed) and volatile (the contents
are lost when power to the machine is turned off).
The Secondary Memory provides more long term and stable storage for both programs
and data. In modern computing systems this Secondary Memory is most often
implemented using rotating magnetic storage devices, more commonly called disks
(though magnetic tape may also be used); therefore, Secondary Memory is often
referred to as the disk. The physical devices making up Secondary Memory, the disk
drives, are also known as mass storage devices because relatively large amounts of
data and many programs may be stored on them.
The disk drives making up Secondary Memory are one form of Input/Output (I/O)
device since they provide a means for information to be brought into (input) and taken
out of (output) the CPU and its memory. Other forms of I/O devices which transfer
information between humans and the computer are represented by the Peripherals box
in Figure 1.1. These Peripherals include devices such as terminals -- a keyboard (and
optional mouse) for input and a video screen for output, high-speed printers, and
possibly floppy disk drives and tape drives for permanent, removable storage of data
and programs. Other I/O devices may include high-speed optical scanners, plotters,
multi-user and graphics terminals, networking hardware, etc. In general, these devices
provide the physical interface between the computer and its environment by allowing
humans or even other machines to communicate with the computer.
The remaining blocks in Figure 1.1 are typical software layers provided on most
computing systems. This software may be thought of as having a hierarchical, layered
structure, where each layer uses the facilities of layers below it. The four major blocks
shown in the figure are the Operating System, Utilities, User Programs and
Applications.
Mainframe machines normally use proprietary operating systems, such as VM and
CMS (IBM) and VAX VMS and TOPS 20 (DEC). More recently, there is a move towards a
standardized operating system and most workstations and desktops typically use UNIX
(AT&T and other versions). A widely used operating system for IBM PC and compatible
personal computers is DOS (Microsoft). Apple Macintosh machines are distinguished by
an easy to use proprietary operating system with graphical icons.
Utility Programs
The layer above the OS is labeled Utilities and consists of several programs which are
primarily responsible for the logical interface with the user, i.e. the "view" the user
has when interacting with the computer. (Sometimes this layer and the OS layer below
are considered together as the operating system). Typical utilities include such
programs as shells, text editors, compilers, and (sometimes) the file system.
A shell is a program which serves as the primary interface between the user and the
operating system. The shell is a "command interpreter", i.e. it prompts the user to
enter commands for tasks which the user wants done, reads and interprets what the
user enters, and directs the OS to perform the requested task. Such commands may
call for the execution of another utility (such as a text editor or compiler) or a user
program or application, the manipulation of the file system, or some system operation
such as logging in or out. There are many variations on the types of shells available,
from relatively simple command line interpreters (DOS) or more powerful command line
interpreters (the Bourne Shell, sh, or C Shell, csh in the Unix environment), to more
complex, but easy to use graphical user interfaces (the Macintosh or Windows). You
should become familiar with the particular shell(s) available on the computer you are
using, as it will be your primary means of access to the facilities of the machine.
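As a rough illustration of the "command interpreter" idea (a toy sketch, not how any real shell is implemented), the loop below prompts, reads a line, and asks the OS to run it, repeating until the user types exit:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    char line[256];

    printf("> ");
    while (fgets(line, sizeof line, stdin) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
        if (strcmp(line, "exit") == 0)
            break;                          /* user asked to leave */
        if (line[0] != '\0')
            system(line);                   /* hand the command to the OS */
        printf("> ");
    }
    return 0;
}

Real shells parse the command themselves and create processes directly; system() is used here only to keep the sketch short.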
A text editor (as opposed to a word processor) is a program for entering programs
and data and storing them in the computer. This information is organized as a unit
called a file, similar to a file in an office filing cabinet, only in this case it is stored on the
disk. (Word processors are more complex than text editors in that they may
automatically format the text, and are more properly considered applications than
utilities). There are many text editors available (for example vi and emacs on Unix
systems) and you should familiarize yourself with those available on your system.
A compiler is a utility that translates programs written in a high-level language (such
as C) into machine instructions (some compilers come packaged with integrated
editors or debugging features). Your system manuals can describe the features
available on your system.
Finally, another important utility (or task of the operating system) is to manage the file
system for users. A file system is a collection of files in which a user keeps programs,
data, text material, graphical images, etc. The file system provides a means for the
user to organize files, giving them names and gathering them into directories (or
folders), and to manage their file storage. Typical operations which may be done with
files include creating, destroying, renaming, and copying files.
COMPUTER EVOLUTION
500 B.C.
The 2/5 Abacus is invented by the Chinese. Find out
more about the Abacus at "Abacus: The Art of Calculating
with Beads" by Luis Fernandes.
1 A.D.
The Antikythera Device, a mechanism that mimicked the actual movements of
the sun, moon, and planets (past, present, and future), is built. This technology
was then lost for millennia.
1623
Wilhelm Schickard builds the first "automatic calculator", the
"Calculating Clock", which was used for computing astronomical
tables.
Wilhelm Schickard (born 1592 in Herrenberg, died 1635 in
Tübingen) built the first automatic calculator in 1623.
Contemporaries called this machine the Calculating Clock. It precedes the less versatile
Pascaline of Blaise Pascal by about twenty years and the calculator of Gottfried Leibniz by about fifty.
Schickard's letters to Johannes Kepler show how to use the machine for calculating
astronomical tables. The machine could add and subtract six-digit numbers, and
indicated an overflow of this capacity by ringing a bell; to aid more complex
calculations, a set of Napier's bones were mounted on it. The designs were lost until
the twentieth century; a working replica was finally constructed in 1960.
1642
Blaise Pascal, a French religious philosopher and
mathematician, builds the first practical mechanical
calculating machine, thereby etching his name in history to
be resurrected later as the name of a now-arcane
programming language.
1830
The "Analytical Engine" is designed by Charles Babbage.
1850
The Japanese refined the abacus into the 1/5 design, with one bead on the top deck and
five on the bottom deck.
1890
The U.S. Census Bureau adopts the Hollerith Punch Card, Tabulating
Machine and Sorter to compile results of the 1890 census, reducing an
almost 10-year process to 2 ½ years, saving the government a
whopping $5 million. Inventor Herman Hollerith, a Census Bureau
statistician, forms the Tabulating Machine Company in 1896. The TMC
eventually evolved into IBM.
1930
The abacus is again changed, to the 1/4 design.
1939
The first semi-electronic digital computing device is constructed by John Atanasoff.
The "Mark I" Automatic Sequence Controlled Calculator, the first fully automatic
calculator, is begun at Harvard by mathematician Howard Aiken. Its designed purpose
was to generate ballistic tables for Navy artillery.
1941
German inventor Konrad Zuse produces the Z3 for use in aircraft and missile design but
the German government misses the boat and does not support him. There is some
debate as to whether the Mark I or the Z3 came first.
1943
English mathematician Alan Turing (bio by Andrew Hodges) begins operation of his secret
computer for the British military. It was used by cryptographers to break secret German
military codes. It was the first vacuum tube computer but its existence was not made
public until decades later.
1946
ENIAC (Electronic Numerical Integrator and Calculator), the first credited all-electronic
computer, is completed at the University of Pennsylvania. It used thousands of vacuum
tubes.
1951
Seymour Cray gets his master's degree in Applied Mathematics, soon after joins
Engineering Research Associates, and starts working on the 1100 series computers for
what ended up being Univac.
1957
Bill Norris and friends start Control Data Corporation (CDC), bring Seymour Cray on board,
and begin building large-scale scientific computers.
1958
The first "integrated circuit" is designed by American Jack Kilby. It included resistors,
capacitors and transistors on a single wafer chip.
1960
Digital Equipment delivers PDP-1, an interactive computer with CRT and keyboard. Its big
screen inspires MIT students to write the world's first computer game.
1963
Sketchpad, first WYSIWYG interactive drawing tool, is published by Ivan Sutherland as his
MIT doctoral thesis.
1965
Sutherland demonstrates first VR head-mounted 3-D display.
Ted Nelson coins the terms hypertext and hypermedia in a paper at the Association for
Computing Machinery's 20th national conference.
1968
Doug Engelbart demonstrates the first mouse.
1969
First four nodes are established on Arpanet, precursor of the Internet and World Wide
Web.
1971
IBM introduces the 3270 mainframe terminal; its character-based interface becomes the
standard for business applications.
The first "microprocessor" is produced by American engineer Marcian E. Hoff.
1972
First GUI appears as part of Xerox PARC's Smalltalk programming environment.
Seymour Cray incorporates Cray Research.
1974
Xerox PARC researchers create the Alto, the first computer to use the WIMP interface.
The Altair 8800 microcomputer, based on Intel's 8080 processor, appears; its interface
uses toggle switches and LEDs.
1975
Bill Gates and Paul Allen create and license the first microcomputer version of BASIC, for
the Altair; it loads via paper tape.
1977
Tandy (Radio Shack) produces the first practical personal computer, using a cassette tape
drive for programs and storage.
Apple ships Apple II, with integrated keyboard, 16-color graphics, and command-line disk
operating system.
1978
At Apple Computer, Steve Jobs proposes a "next generation"
business machine with graphical user interface. It becomes the Lisa
project.
Dan Bricklin and Bob Frankston's VisiCalc, with its text-based
spreadsheet interface, becomes the personal computer's first killer
app; it runs on the Apple II.
1981
IBM releases the PC, with a 4.77 MHz processor, MS-DOS, a command-line interface,
and monochrome block graphics.
1984
Apple ships the Macintosh, the first mass-market computer with a monochrome desktop
GUI, plug and play, and a suite of GUI productivity applications.
1985
Microsoft ships Windows 1.0, its first graphical environment.
1990
Microsoft announces Windows 3.0; adds 3-D look and feel, Program
Manager and File Manager.
1992
Apple announces Newton PDA with pen-based user interface.
1993
Early Web Browsers: ECP Web browser for Macintosh released. NCSA releases Marc
Andreessen's Mosaic Web browser for X Window.
1995
Microsoft introduces Bob, the industry's first "Social User Interface,"
featuring animated "assistants." Bob bombs. (See "Remembering the
Bob" at Tech TV.)
Microsoft ships Windows 95, regarded by many as the release that
offers features comparable with Apple's Mac. It's the fastest-selling
operating system ever shipped.
1997
Microsoft Active Desktop integrates the Web with Windows.
Netscape Communicator and Constellation combine Web and desktop GUI.
Microsoft invests $150,000,000 in Apple Computer.
1998
Windows 98 released.
A good portion of the world is still using the abacus; maybe 2 people
are using the TRS-80.
Introduction
Over the past 10 years many practitioners and researchers have sought to define
software architecture. At the SEI, we use the following definition: the software
architecture of a program or computing system is the structure or structures of the
system, which comprise software elements, the externally visible properties of those
elements, and the relationships among them.
Confusion also stems from the use of the same specification language for both
architectural and design specifications. For example, UML is often used as an architectural
description language. In fact, UML has become the industry de facto standard for
describing architectures, although it was specifically designed to manifest detailed design
decisions (and this is still its most common use). This merely contributes to the confusion,
since a designer using UML has no way (within UML) of distinguishing architectural
information from other types of information.
Confusion also exists with respect to the artifacts of design and implementation. UML class
diagrams, for instance, are a prototypical artifact of the design phase. Nonetheless, class
diagrams may accumulate enough detail to allow code generation of very detailed
programs, an approach that is promoted by CASE tools such as Rational Rose and System
Architect. Using the same specification language further blurs the distinction between
artifacts of the design (class diagrams) and artifacts of the implementation (source
code). Having a unified specification language is, in many ways, a good thing. But a user
of this unified language is given little help in knowing if a proposed change is
“architectural” or not.
Seeking to separate architectural design from other design activities, definers of software
architecture in the past have stressed the following:
In suggesting typical “architectures” and “architectural styles,” existing definitions consist
of examples and offer anecdotes rather than providing clear and unambiguous notions. In
practice, the terms “architecture,” “design,” and “implementation” appear to connote
varying degrees of abstraction in the continuum between complete details
(“implementation”), few details (“design”), and the highest form of abstraction
(“architecture”). But the amount of detail alone is insufficient to characterize the
differences, because architecture and design documents often contain detail that is not
explicit in the implementation (e.g., design constraints, standards, performance goals).
Thus, we would expect a distinction between these terms to be qualitative and not merely
quantitative.
The ontology that we provide below can serve as a reference point for these discussions.
1. Intensional (vs. extensional) design specifications are “abstract” in the sense that
they can be formally characterized by the use of logic variables that range over an
unbounded domain. For example, a layered architectural pattern does not restrict
the architect to a specific number of layers; it applies equally well to 2 layers or 12
layers.
2. Non-local (vs. local) specifications are “abstract” in the sense that they apply to all
parts of the system (as opposed to being limited to some part thereof).
Both of these interpretations contribute to the distinction among architecture, design, and
implementation, summarized as the "intension/locality thesis": architectural
specifications are intensional and non-local; design specifications are intensional but
local; implementations are both extensional and local.
Implications
What are the implications of such definitions? They give us a firm basis for determining
what is architectural (and hence crucial for the achievement of a system’s quality attribute
requirements) and what is not.
Consider the concept of a strictly layered architecture (an architecture in which each layer
is allowed to use only the layer immediately below it). How do we know that the
architectural style “layered” is really architectural? To answer that we need to determine
whether this style is intensional and whether it is local or non-local. First of all, are there
an unbounded number of implementations that qualify as layered? Clearly there are.
Secondly, is the layered style local or non-local? To answer that, we need only consider a
violation of the style, where a layer depends on a layer above it, or several layers below
it. Since this would be a violation wherever it occurred, the notion of a layered
architecture must be non-local.
What about a design pattern, such as the factory pattern? This is intensional, because
there may be an unbounded number of realizations of a factory design pattern within a
system. But is it local or non-local? One may use a design pattern in some corner of the
system and not use it (or even violate it) in a different portion of the same system. So
design patterns are local.
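As a concrete, hypothetical illustration of a factory in C terms (all names here are invented for the example): one creation function hides which concrete variant gets built, and that decision is confined to a single corner of the program, which is exactly what makes the pattern local rather than architectural:

#include <stdio.h>
#include <string.h>

/* A minimal "product" interface: one function pointer for behavior. */
struct shape {
    void (*draw)(void);
};

static void draw_circle(void) { puts("circle"); }
static void draw_square(void) { puts("square"); }

/* The factory: callers say what they want; the choice of concrete
   implementation is a local decision inside this one function. */
struct shape make_shape(const char *kind)
{
    struct shape s;
    if (strcmp(kind, "circle") == 0)
        s.draw = draw_circle;
    else
        s.draw = draw_square;
    return s;
}

int main(void)
{
    struct shape s = make_shape("circle");
    s.draw();   /* prints "circle" without knowing which variant it got */
    return 0;
}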
Similarly, it is simple to show that the term “implementation” refers only to artifacts that
are extensional and local.
Conclusions
Since the inception of architecture as a distinct field of study, there has been much
confusion about what the term “architecture” means. Similarly, the distinction between
architecture and other forms of design artifacts has never been clear. The
intension/locality thesis provides a foundation for determining the meaning of the terms
architecture, design, and implementation that accords not only with intuition but also with
best industrial practices. A more formal and complete treatment of this topic can be found
in our paper, “Architecture, Design, Implementation.” But what are the consequences of
precisely knowing the differences among these terms? Is this an exercise in definition for
definition’s sake? We think not. Among others, these distinctions facilitate
2. determining what information goes into architecture documents and what goes into
design documents
3. determining what to examine and what not to examine in an architectural
evaluation or a design walkthrough
4. understanding the distinction between local and non-local rules (i.e., between the
design rules that are enforced throughout a project versus those that are of a more
limited domain, because the architectural rules define the fabric of the system and
how it will meet its quality attribute requirements, and the violation of architectural
rules typically has more far-reaching consequences than the violation of a local
rule)
Furthermore, in the industrial practice of software architecture, many statements that are
said to be “architectural” are in fact local (e.g., both tasks A and B execute on the same
node, or task A controls B). Instead, a truly architectural statement would be, for
instance: for each pair of tasks A, B that satisfy some property X, A and B will execute on
the same node and the property Control(A,B) holds.
· I'd say that architecture is a view of software that's at a higher level than design,
i.e. more abstract and less connected with the actual implementation. The
architecture gives structure to the design elements, while the design elements give
structure to the implemented code.
· I would add that architecture is design, but not all design is architecture.
LESSON II: Combinational Logic
Introduction
Digital electronics is classified into combinational logic and sequential logic. Combinational
logic output depends only on the input levels, whereas sequential logic output depends on
stored levels as well as the input levels.
The memory elements are devices capable of storing binary info. The binary info stored in
the memory elements at any given time defines the state of the sequential circuit. The
inputs and the present state of the memory elements determine the output. The memory
elements' next state is also a function of external inputs and present state. A sequential
circuit is specified by a time sequence of inputs, outputs, and internal states.
There are two types of sequential circuits, classified by the timing of their signals:
synchronous circuits, whose behavior is defined by the signal values at discrete instants
of time determined by a clock, and asynchronous circuits, whose behavior depends
directly on the order in which the inputs change.
A clock signal is a periodic square wave that indefinitely switches from 0 to 1 and from
1 to 0 at fixed intervals. The clock cycle time or clock period is the time interval between
two consecutive rising (or falling) edges of the clock.
Clock frequency = 1 / clock cycle time (measured in cycles per second, or Hz). For
example, a clock period of 1 ns corresponds to a clock frequency of 1 GHz.
The basic idea of having the feedback is to store or hold a value, but in the above
circuit the output keeps toggling. We can overcome this problem with the circuit below,
which is basically two cascaded inverters, so that the feedback is in phase and toggling
is avoided. The equivalent circuit is the same as having a buffer with its output
connected to its input.
But there is a problem here too: each gate's output value is stable, but what will it be?
In other words, the buffer's output cannot be known; there is no way to tell. If we could
know or set the value, we would have a simple 1-bit storage/memory element.
The circuit below is the same as the inverters connected back to back, with provision to
set the state of each gate (a NOR gate with both inputs shorted is like an inverter). I am
not going to explain the operation, as it is clear from the truth table. S is called Set
and R is called Reset.
S R Q Q+
0 0 0 0
0 0 1 1
0 1 X 0
1 0 X 1
1 1 X 0
There is still a problem with the above configuration: we cannot control when the input
should be sampled; in other words, there is no enable signal to control when the input
is sampled. Enable signals are normally of two types.
Level Sensitive: The circuit below is a modification of the above one to have a level-sensitive
enable input. Enable, when LOW, masks the inputs S and R; when HIGH, it
presents S and R to the sequential logic input (the two NOR gates of the above circuit).
Thus Enable, when HIGH, transfers inputs S and R to the sequential cell transparently,
so this kind of sequential circuit is called a transparent latch. The memory element we
get is an RS latch with active-high Enable.
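A small C model may make the gating behavior concrete (the state encoding and function names are my own; this is a sketch rather than a gate-accurate simulation): when Enable is low the stored bit is returned unchanged, and when Enable is high, S and R drive it.

#include <stdio.h>

/* q is the currently stored bit; returns the new stored value. */
int rs_latch(int q, int enable, int s, int r)
{
    if (!enable) return q;   /* Enable low: S and R masked, state held */
    if (s && !r) return 1;   /* set */
    if (r && !s) return 0;   /* reset */
    if (!s && !r) return q;  /* hold */
    return 0;                /* S = R = 1: invalid for the NOR latch;
                                both real outputs are driven LOW */
}

int main(void)
{
    int q = 0;
    q = rs_latch(q, 1, 1, 0);   /* set while enabled    -> q = 1 */
    q = rs_latch(q, 0, 0, 1);   /* reset while DISabled -> q still 1 */
    printf("q = %d\n", q);      /* prints q = 1 */
    return 0;
}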
Edge Sensitive: The circuit below is a cascade of two level-sensitive memory
elements, with a phase shift in the enable input between the first memory element and
the second. The first RS latch (i.e. the first memory element) will be enabled when the
CLK input is HIGH and the second RS latch will be enabled when CLK is LOW. The net
effect is that the input RS is moved to Q and Q' when CLK changes state from HIGH to
LOW; this HIGH-to-LOW transition is called the falling edge. So the edge-sensitive
element we get is called a negative-edge RS flip-flop.
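Extending the sketch above (again with invented names), an edge-triggered element can be modeled as two gated latches in cascade: the master follows S and R while CLK is high, and the slave copies the master when CLK goes low, so the visible output Q changes only on the falling edge:

#include <stdio.h>

static int master, slave;   /* the two internal latch states; Q = slave */

void ff_tick(int clk, int prev_clk, int s, int r)
{
    if (clk) {                    /* CLK high: master follows S/R */
        if (s && !r) master = 1;
        if (r && !s) master = 0;
    } else if (prev_clk) {        /* falling edge: slave copies master */
        slave = master;
    }
}

int main(void)
{
    int clk_wave[] = { 1, 1, 0, 1, 0 };
    int prev = 0, i;
    for (i = 0; i < 5; i++) {
        ff_tick(clk_wave[i], prev, 1, 0);   /* S held at 1, R at 0 */
        prev = clk_wave[i];
        printf("clk=%d Q=%d\n", clk_wave[i], slave);
    }
    return 0;   /* Q becomes 1 only when CLK first falls */
}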
Now that we know the basics of sequential circuits, let's look at each of them in detail,
in accordance with what is taught in colleges. You are always welcome to suggest if this
can be written better in any way.
· Asynchronous Circuits.
· Synchronous Circuits.
As seen in the last section, latches and flip-flops are one and the same with a slight
variation: latches have a level-sensitive control signal input and flip-flops have an edge-sensitive
control signal input. Flip-flops and latches which use these control signals are
called synchronous circuits; if they don't use clock inputs, they are called
asynchronous circuits.
RS Latch
The RS latch has two inputs, S and R. S is called Set and R is called Reset. The S input is
used to produce HIGH on Q (i.e. store binary 1 in the flip-flop). The R input is used to
produce LOW on Q (i.e. store binary 0 in the flip-flop). Q' is Q's complementary output,
so it always holds the opposite value of Q. The output of the S-R latch depends on
current as well as previous inputs or state, and its state (value stored) can change as
soon as its inputs change. The circuit and the truth table of the RS latch are shown below.
(This circuit is as we saw on the last page, but arranged to look beautiful :-).)
S R Q Q+
0 0 0 0
0 0 1 1
0 1 X 0
1 0 X 1
1 1 X 0
The operation has to be analyzed with the 4 input combinations together with the 2
possible previous states.
· S = 1, R = 1: No matter what state Q and Q' are in, application of 1 at the input of a
NOR gate always results in 0 at its output, which results in both Q and Q' set to LOW
(i.e. Q = Q'). LOW on both outputs is inconsistent, so this case is invalid.
The waveform below shows the operation of NOR gates based RS Latch.
It is possible to construct the RS latch using NAND gates (of course, as seen in the Logic
Gates section). The only difference is that NAND is the NOR gate's dual form (did I say
that in the Logic Gates section?). So in this case the R = 0 and S = 0 case becomes the
invalid case. The circuit and truth table of the RS latch using NAND are shown below.
S R Q Q+
1 1 0 0
1 1 1 1
0 1 X 0
1 0 X 1
0 0 X 1
If you look closely, there is no control signal (i.e. no clock and no enable), so these
kinds of latches or flip-flops are called asynchronous logic elements. Since all the
sequential circuits are built around the RS latch, we will concentrate on synchronous
circuits and not on asynchronous circuits.
RS Latch with Clock
We have seen this circuit earlier with two possible input configurations: one with level
sensitive input and one with edge sensitive input. The circuit below shows the level
sensitive RS latch. The control signal "Enable" E is used to gate the inputs S and R to the
RS latch. When Enable E is HIGH, both AND gates act as buffers, so R and S appear at
the RS latch inputs and it functions like a normal RS latch. When Enable E is LOW, it
drives LOW to both inputs of the RS latch. As we saw on the previous page, when both
inputs of a NOR latch are low, values are retained (i.e. the output does not change).
Setup and Hold Time
For synchronous flip-flops, we have special requirements for the inputs with respect to
clock signal input. They are
· Setup Time: Minimum time period during which data must be stable before the
clock makes a valid transition. For example, for a posedge triggered flip-flop, with a
setup time of 2 ns, Input Data (i.e. R and S in the case of RS flip-flop) should be
stable for at least 2 ns before clock makes transition from 0 to 1.
· Hold Time: Minimum time period during which data must be stable after the clock
has made a valid transition. For example, for a posedge triggered flip-flop with a
hold time of 1 ns, input data (i.e. R and S in the case of an RS flip-flop) should be
stable for at least 1 ns after the clock has made its transition from 0 to 1.
If data makes a transition within the setup window or the hold window, then the
flip-flop output is not predictable, and the flip-flop enters what is known as a metastable
state: the output may oscillate between 0 and 1, and it takes some time for the
flip-flop to settle down. The whole phenomenon is called metastability. You can refer to
the tidbits section for more information on this topic.
The waveform below shows input S (R is not shown), and CLK and output Q (Q' is not
shown) for a SR posedge flip-flop.
D Latch
The RS latch seen earlier contains an ambiguous state; to eliminate this condition we can
ensure that S and R are never equal. This is done by connecting S and R together with an
inverter. Thus we have the D latch: the same as the RS latch, with the only difference that
there is only one input, instead of two (R and S). This input is called D, or Data input. The
D latch is called a D transparent latch for the reasons explained earlier. Delay flip-flop or
delay latch is another name used. Below is the truth table and circuit of the D latch.
D Q Q+
1 X 1
0 X 0
Below is the D latch waveform, which is similar to the RS latch one, but with R removed.
JK Latch
The ambiguous state output in the RS latch was eliminated in the D latch by joining the
inputs with an inverter, but the D latch has only a single input. The JK latch is similar to
the RS latch in that it has 2 inputs, J and K, as shown in the figure below. The ambiguous
state has been eliminated here: when both inputs are high, the output toggles. The only
difference we see here is the output feedback to the inputs, which is not there in the RS latch.
J K Q Q+
1 1 0 1
1 1 1 0
1 0 X 1
0 1 X 0
T Latch
When the two inputs of the JK latch are shorted, a T latch is formed. It is called a T latch
because, when the input is held HIGH, the output toggles.
T Q Q+
1 0 1
1 1 0
0 1 1
0 0 0
All the sequential circuits that we have seen in the last few pages have a problem (all
level-sensitive sequential circuits have this problem): before the enable input changes
state from HIGH to LOW (assuming HIGH is the ON state and LOW is the OFF state), if
the inputs change, another state transition occurs for the same enable pulse. This sort
of multiple-transition problem is called racing.
If we make the sequential element sensitive to edges, instead of levels, we can overcome
this problem, as the input is evaluated only during enable/clock edges.
In the figure above there are two latches; the first latch, on the left, is called the master
latch and the one on the right is called the slave latch. The master latch is positively
clocked and the slave latch is negatively clocked.
We saw in the combinational circuits section how to design a combinational circuit from
the given problem. We convert the problem into a truth table, then draw K-map for the
truth table, and then finally draw the gate level circuit for the problem. Similarly we have
a flow for the sequential circuit design. The steps are given below.
The sequential circuit design flow looks very much the same as that for combinational
circuits.
State Diagram
The state diagram is constructed using all the states of the sequential circuit in question.
It builds up the relationship between various states and also shows how inputs affect the
states.
To make the tutorial easier to follow, let's consider designing a 2-bit up counter (a binary
counter is one which counts through a binary sequence) using T flip-flops.
State Table
The state table combines the state transitions with the excitation table of the flip-flop,
i.e. what inputs need to be applied to get the required output. In other words, this table
gives the inputs required to produce the specific outputs.
Q1 Q0 Q1+ Q0+ T1 T0
0 0 0 1 0 1
0 1 1 0 1 1
1 0 1 1 0 1
1 1 0 0 1 1
K-map
The K-map is the same as the combinational circuits K-map. The only difference: we draw
the K-map for the flip-flop inputs, i.e. T1 and T0 in the above table. From the table we deduce that we
don't need to draw a K-map for T0, as it is high for all the state combinations. But for T1 we
need to draw the K-map as shown below, using SOP; the map yields T1 = Q0 (and T0 = 1).
Circuit
There is nothing special in drawing the circuit; it is the same as any circuit drawn from
K-map output. Below is the circuit of the 2-bit up counter using T flip-flops.
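To double-check the design, a quick simulation sketch (the flip-flop model and names are mine) implements T1 = Q0 and T0 = 1 and prints the expected count sequence:

#include <stdio.h>

int main(void)
{
    int q1 = 0, q0 = 0;   /* counter state, initially 00 */
    int cycle, t1, t0;

    for (cycle = 0; cycle < 8; cycle++) {
        printf("%d%d\n", q1, q0);   /* prints 00 01 10 11 00 01 10 11 */
        t1 = q0;                    /* from the K-map: T1 = Q0 */
        t0 = 1;                     /* T0 is always 1 */
        if (t1) q1 = !q1;           /* a T flip-flop toggles when T = 1 */
        if (t0) q0 = !q0;
    }
    return 0;
}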
OPERATING MANUAL FOR
ELECTRONIC INDICATORS
1. Batteries
A set of two Manganese Dioxide Lithium batteries will operate this electronic indicator for
approximately 250 hours of normal usage. Because milliampere hour ratings vary widely
with manufacturers, normal usage time is very hard to predict. The lithium battery used in
this indicator is an IEC standard, type CR2450. The indicators are shipped with the
batteries not installed; the batteries should not be installed until battery operation is desired.
NOTE: This indicator has an "AUTO-OFF" feature to conserve battery life. After 10 minutes
of no activity (no key presses or spindle movement), the gage will turn itself off. This
feature may be disabled if continuous operation is desired; see the "AUTO-OFF On/Off"
instructions in this book.
Installing Batteries
Using a narrow screwdriver, gently pry under the tab on the left side of the plastic bezel
and slide out the battery tray as you turn the indicator face side down. Insert two
batteries, "+" side up, into the tray cavities, then slide the tray back into its bezel slot,
taking care that the batteries stay in proper position.
AC Adapter
AC adapters (providing 9 VDC at 30 mA maximum to the indicator from a 115 or 230
VAC, 50/60 Hz line source) may be purchased from CDI. Although other 9V AC adapters
with a 3/32" (2.5 mm) mini-plug (center +) may be used, CDI adapters are recommended
because they include current limiting to prevent damage from line fluctuations. For 115 V
(USA) operation, order CDI Part #G11-0012; for 230 V (Europe) operation, order CDI
Part #G11-0014. First insert the mini-plug into the socket on the lower left side of the
bezel (see drawing on page 2), then plug the adapter into a wall outlet. After turning the
indicator "ON", disable the "AUTO-OFF" feature; see "AUTO-OFF On/Off".
Button Functions
NOTE: Most functions are active on
release of button(s).
Display-Operating Prompts & Conditions
Operating Instructions
To Turn AUTO OFF On/Off
- Press and hold "2ND" until 2ND appears at the bottom of the display, then release.
- Press and release "OFF" within 3 seconds.
To Verify DATA I/O FORMAT
To view the current output format:
- Press and release "2ND" until 2ND appears in the display, then "ON/CLR" and "2ND" in
sequence. Format information is displayed for about 3 seconds, then the indicator
automatically returns to normal operation. Format information is displayed as:
RS232 = rS232
MTI compatible = SEr
CDI mux BCD = Cdi
Bypass = bP
To Use HOLD
To select the type of HOLD (Freeze, Minimum or Maximum):
- Press and hold "HOLD" until the cursor moves under the desired type of hold (FRZ, MIN
or MAX), then release.
To turn HOLD On/Off:
- Press and release "HOLD".
- MAX HOLD: holds and displays the highest reading.
- MIN HOLD: holds and displays the lowest reading.
- FREEZE HOLD: freezes the display when the "HOLD" button is pressed.
To Change INCH/MILLIMETER
To change from one to the other:
- Press and hold "MOVE/2ND" until 2ND appears at the bottom of the display, then release.
- Press and release "TOL" within 3 seconds.
NOTE: MM or IN will appear at the bottom of the display.
To Turn INDICATOR ON
To Reset to DEFAULT
A total reset clears all user settings and returns to factory-set defaults.
1. Press and hold "2ND" until 2ND appears at the bottom of the display, then release.
2. Press and release "ON/CLR" within 3 seconds.
3. Press and release "CHNG" within 3 seconds.
NOTE: Cannot be done if the Lock feature is on.
To Change RESOLUTION
- Press and hold "2ND" until 2ND appears at the bottom of the display, then release.
- Press and release "ON/CLR" within 3 seconds.
- Press and release "HOLD" within 3 seconds.
Use the "CHNG" key to step through the available resolution selections:
1 = .00005" (.001 mm)
2 = .0001" (.002 mm)
3 = .00025" (.005 mm)
4 = .0005" (.01 mm)
5 = .001" (.02 mm)
Press and release "CHNG" and "2ND" simultaneously to save.
Note: Only resolutions coarser than the indicator's resolution as purchased are available.
To Enter TEST MODE
Press and hold (for more than 5 seconds) "ON/CLR" to enter 'display and key' test mode.
To Exit
To Exit
TEST MODE
Press and hold (for more than 5 seconds) "ON/CLR" to exit 'display and key' test mode.
To Change TRAVEL DIRECTION
- Press and hold "2ND" until 2ND appears at the bottom of the display, then release.
- Press and release "HOLD" within 3 seconds.
Note: The arrow in the upper right corner will show the positive direction of spindle travel.
NOTE: Most functions are active on release of key(s).
Internal Memory
"LOGIC" Series indicators and remote displays include internal non-volatile memory to
store all factory default and user settings. When the indicator is turned on, user settings
and preset numbers will be the same as when the indicator was turned off.
NOTE: Many of the user settings are stored when the indicator is turned "Off" by using the
"OFF" key, or when the indicator turns itself off (AUTO OFF). However, if the indicator is
turned off by removing power (by disconnecting the AC adapter or cutting power through
the Data I/O connector), some or all of the user settings and/or changes may be lost!
Operating Precautions
1. Do not use the bottom of the spindle stroke as a base of measurement reference, as it
is protected with a rubber shock absorber to prevent shock to the internal mechanism.
The spindle should be offset .005"-.010" (.12-.25 mm) from the bottom of travel.
2. Use of CDI type MS-10 or similar sturdy stands or fixtures for indicator mounting,
where the base plate and indicator are mounted to a common post, is highly
recommended for accurate and repeatable readings. The indicator must be mounted with
the spindle perpendicular to the reference or base plate. If the indicator is stem-mounted,
protect the indicator from attempted rotation, and from being struck or bumped, to
prevent stem/case mechanical alignment damage. Do not over-tighten the mounting
mechanism, and use clamp mounting rather than set screws if at all possible, to prevent
damage to the stem.
3. The bezel face can be rotated from its normal horizontal position for convenient
viewing. Rotation is limited to 270 degrees and attempts to force it past its internal stop
may damage the indicator.
4. Frequently clean the spindle to prevent sluggish or sticky movement. Dry wiping with a
lint-free cloth usually will suffice, but isopropyl alcohol may be used to remove gummy
deposits. Do not apply any type of lubricant to the spindle. Spindle dust caps and spindle
boots are available for operation in dirty or abrasive environments. 1" spindle dust cap:
order CDI Part #A21.0131. 1" spindle boot: order CDI Part #CD170-1. Use a soft cloth
dampened with a mild detergent to clean the bezel and front face of the indicator. Do not
use aromatic solvents as they may cause damage.
5. Extremely high electrical transients (from nearby arc welders, SCR motor/lighting
controls, radio transmitters, etc.) may cause malfunctions of the indicator's internal
circuitry or 'ERROR 1' indications, even though the electronic design was created to
minimize such problems. If at all possible, do not operate the indicator in plant areas
subject to these transients. Turning the indicator ’OFF’ for a few seconds, then back ’ON’
from time-to-time may eliminate any problems. Also, use of an isolated AC line (for AC
adapter operated indicators and AC powered remote displays), or an AC line filter - plus
solid grounding of stands and fixtures - is recommended in these conditions.
FLASHING DIGIT or +/- sign - Digit or sign affected by the "CHNG" key when setting or
changing preset numbers.
FLASHING READING, with HIGH or LOW displayed - Reading is out of tolerance, to
the high or low side.
ERROR 1 - Spindle speed too fast, high electrical noise, etc.
ERROR 2 - Counter overflow, i.e. counter number (spindle + preset number) out of
counter range.
ERROR 3 - Improper tolerance combination, i.e. both "HIGH" and "LOW" set to '0' or the
same number, or "LOW" set to a higher number than "HIGH". Occurs only when 'TOL' is
on.
ERROR 4 - Display overflow, i.e. number too large to be properly displayed. Moving
spindle to acceptable range eliminates error message.
Data Output
’LOGIC’ Series indicators and remote displays provide users with multiple data output
formats. The cable attached to the indicator when it is turned on determines the output
format in use. Cables for each format can be purchased from CDI. These cables also
provide remote control of ’ON/CLR’ and ’HOLD’ functions, plus +5v regulated power input.
For special applications, an ERROR FLAG output and/or custom cables also can be
provided; contact CDI for information.
CAUTION: Use of cables other than those provided or approved by CDI can cause
irreparable damage to the indicator or data output port, and such damage is not covered
by the CDI Limited Warranty.
Standard RS232 Format - Communications protocol is 1200 baud, no parity, 8 data
bits, and 1 stop bit. RS232 can be read by any IBM PC-compatible computer, RS232 serial
printer or other device, provided the device can be set to this protocol. A DB25 pin
adapter may be necessary for non-standard devices. "WINDOWS" terminal and other
communications software, "WEDGE" software, etc., may be used with this format. Cables
required:
CDI #GO3-0018 - for IBM-compatible PC (CDI indicator to DB25F)
CDI #GO3-0021 - for CDI serial printer types G19-0001/G19-0002 & G19-0003
(CDI indicator to DB25M)
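As an illustration of reading this output on a POSIX machine (a sketch only: the device path is a placeholder and the indicator's exact message layout is not documented here), the serial port would be configured for 1200 baud, 8 data bits, no parity, 1 stop bit:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <termios.h>

int main(void)
{
    /* "/dev/ttyS0" is a placeholder; use whichever port the cable is on. */
    int fd = open("/dev/ttyS0", O_RDONLY | O_NOCTTY);
    if (fd < 0) { perror("open"); return 1; }

    struct termios tio;
    tcgetattr(fd, &tio);
    cfsetispeed(&tio, B1200);                     /* 1200 baud */
    tio.c_cflag &= ~PARENB;                       /* no parity */
    tio.c_cflag &= ~CSTOPB;                       /* 1 stop bit */
    tio.c_cflag = (tio.c_cflag & ~CSIZE) | CS8;   /* 8 data bits */
    tio.c_cflag |= CREAD | CLOCAL;                /* enable receiver */
    tio.c_lflag = 0;                              /* raw (non-canonical) input */
    tcsetattr(fd, TCSANOW, &tio);

    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf - 1);    /* read one chunk of data */
    if (n > 0) { buf[n] = '\0'; printf("%s\n", buf); }
    close(fd);
    return 0;
}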
BYPASS FORMAT - Permits indicator to be used as a probe for the CDI remote display:
bypasses ’raw’ unprocessed signals from the detector system directly to the data output
connector. In this operation mode, power for the indicator is supplied by the remote
display. Cable required:
CDI #G13-0022 - CDI indicator to 6-pin DIN
IMPORTANT - The indicator and remote display must be of the same base resolution. If
the two are of different base resolutions, you will experience compatibility problems.
Limited Warranty
"PLUS SERIES" INDICATORS ARE WARRANTED FOR A PERIOD OF ONE YEAR AGAINST
DEFECTIVE MATERIALS OR WORKMANSHIP. THIS WARRANTY DOES NOT APPLY TO
PRODUCTS THAT ARE MISHANDLED, MISUSED, ETCHED, STAMPED, OR OTHERWISE
MARKED OR DAMAGED, NOR DOES IT APPLY TO DAMAGE OR ERRONEOUS OPERATION
CAUSED
BY USER TAMPERING OR ATTEMPTS TO MODIFY THE INDICATOR. UNITS FOUND TO BE
DEFECTIVE WITHIN THE WARRANTY PERIOD WILL BE REPAIRED OR REPLACED FREE OF
CHARGE AT THE OPTION OF CDI. A NOMINAL CHARGE WILL BE MADE FOR
NON-WARRANTY REPAIRS, PROVIDED THE UNIT IS NOT DAMAGED BEYOND REPAIR.
Boolean algebra
For a basic introduction to sets, Boolean operations, Venn diagrams, truth tables, and
Boolean applications, see Boolean logic.
For an alternative perspective see Boolean algebras canonically defined.
In abstract algebra, a Boolean algebra is an algebraic structure (a collection of
elements and operations on them obeying defining axioms) that captures essential
properties of both set operations and logic operations. Specifically, it deals with the set
operations of intersection, union, complement; and the logic operations of AND, OR,
NOT.
For example, the logical assertion that a statement a and its negation ¬a cannot both
be true, ¬(a ∧ ¬a), parallels the set-theory assertion that a subset A and its complement
A^C have empty intersection, A ∩ A^C = ∅.
Because truth values can be represented as binary numbers or as voltage levels in logic
circuits, the parallel extends to these as well. Thus the theory of Boolean algebras has
many practical applications in electrical engineering and computer science, as well as in
mathematical logic.
A Boolean algebra is also called a Boolean lattice. The connection to lattices (special
partially ordered sets) is suggested by the parallel between set inclusion, A ⊆ B, and
ordering, a ≤ b. Consider the lattice of all subsets of {x,y,z}, ordered by set inclusion.
This Boolean lattice is a partially ordered set in which, say, {x} ≤ {x,y}. Any two
lattice elements, say p = {x,y} and q = {y,z}, have a least upper bound, here {x,y,z},
and a greatest lower bound, here {y}. Suggestively, the least upper bound (or join or
supremum) is denoted by the same symbol as logical OR, p∨q; and the greatest lower
bound (or meet or infimum) is denoted by the same symbol as logical AND, p∧q.
The lattice interpretation helps in generalizing to Heyting algebras, which are Boolean
algebras freed from the restriction that either a statement or its negation must be true.
Heyting algebras correspond to intuitionist (constructivist) logic just as Boolean
algebras correspond to classical logic.
Formal definition
A Boolean algebra is a set A, supplied with two binary operations ∧ (called AND) and
∨ (called OR), a unary operation ¬ (called NOT) and two distinct elements 0 (called
zero) and 1 (called one), such that, for all elements a, b and c of set A, the following
axioms hold:

associativity: a ∨ (b ∨ c) = (a ∨ b) ∨ c and a ∧ (b ∧ c) = (a ∧ b) ∧ c
commutativity: a ∨ b = b ∨ a and a ∧ b = b ∧ a
absorption: a ∨ (a ∧ b) = a and a ∧ (a ∨ b) = a
distributivity: a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c) and a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c)
complements: a ∨ ¬a = 1 and a ∧ ¬a = 0

The first three pairs of axioms above (associativity, commutativity and absorption)
mean that (A, ∧, ∨) is a lattice. If A is a lattice and one of the above distributivity laws
holds, then the second distributivity law can be proven. Thus, a Boolean algebra can
also be equivalently defined as a distributive complemented lattice.
From these axioms, one can show that the smallest element 0, the largest element 1,
and the complement ¬a of any element a are uniquely determined. For all a and b in A,
the following identities also follow:

idempotency: a ∨ a = a and a ∧ a = a
boundedness: a ∨ 0 = a, a ∧ 1 = a, a ∨ 1 = 1, a ∧ 0 = 0
De Morgan's laws: ¬(a ∨ b) = ¬a ∧ ¬b and ¬(a ∧ b) = ¬a ∨ ¬b
involution: ¬¬a = a
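As a quick illustration of how such identities follow from the axioms, idempotency can be derived from the two absorption laws alone (a sketch of the standard argument):

a ∨ a = a ∨ (a ∧ (a ∨ a))    since a = a ∧ (a ∨ a), by absorption
      = a                    by the other absorption law

The dual derivation gives a ∧ a = a.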
Examples
· The simplest Boolean algebra has only two elements, 0 and 1, and is defined by the
rules:
∧ | 0 1
0 | 0 0
1 | 0 1

∨ | 0 1
0 | 0 1
1 | 1 1

a  | 0 1
¬a | 1 0
· The two-element Boolean algebra is also used for circuit design in electrical
engineering; here 0 and 1 represent the two different states of one bit in a
digital circuit, typically high and low voltage. Circuits are described by
expressions containing variables, and two such expressions are equal for all
values of the variables if and only if the corresponding circuits have the same
input-output behavior. Furthermore, every possible input-output behavior
can be modeled by a suitable Boolean expression.
· Every law of the two-element Boolean algebra can be verified by simply checking both
possible values of each variable; for example, the consensus laws (verified
mechanically in the sketch after this list):
(a ∨ b) ∧ (¬a ∨ c) ∧ (b ∨ c) ≡ (a ∨ b) ∧ (¬a ∨ c)
(a ∧ b) ∨ (¬a ∧ c) ∨ (b ∧ c) ≡ (a ∧ b) ∨ (¬a ∧ c)
· Starting with the propositional calculus with κ sentence symbols, form the
Lindenbaum algebra (that is, the set of sentences in the propositional calculus
modulo tautology). This construction yields a Boolean algebra. It is in fact the free
Boolean algebra on κ generators. A truth assignment in propositional calculus is
then a Boolean algebra homomorphism from this algebra to {0,1}.
· The power set (set of all subsets) of any given nonempty set S forms a Boolean
algebra with the two operations ∨ := ∪ (union) and ∧ := ∩ (intersection). The
smallest element 0 is the empty set and the largest element 1 is the set S itself.
· The set of all subsets of S that are either finite or cofinite is a Boolean algebra.
· For any natural number n, the set of all positive divisors of n forms a distributive
lattice if we write a ≤ b for a | b. This lattice is a Boolean algebra if and only if n is
square-free. The smallest element 0 of this Boolean algebra is the natural number
1; the largest element 1 of this Boolean algebra is the natural number n.
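Referring back to the consensus laws listed above: because the two-element algebra is finite, any such law can be checked by brute force over all assignments, as this small C program does:

#include <assert.h>
#include <stdio.h>

int main(void)
{
    int a, b, c, lhs, rhs;

    for (a = 0; a <= 1; a++)
        for (b = 0; b <= 1; b++)
            for (c = 0; c <= 1; c++) {
                /* (a∨b) ∧ (¬a∨c) ∧ (b∨c) ≡ (a∨b) ∧ (¬a∨c) */
                lhs = (a | b) & (!a | c) & (b | c);
                rhs = (a | b) & (!a | c);
                assert(lhs == rhs);

                /* dual: (a∧b) ∨ (¬a∧c) ∨ (b∧c) ≡ (a∧b) ∨ (¬a∧c) */
                lhs = (a & b) | (!a & c) | (b & c);
                rhs = (a & b) | (!a & c);
                assert(lhs == rhs);
            }
    puts("consensus laws hold for all 8 assignments");
    return 0;
}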
In fact one can also define a Boolean algebra to be a distributive lattice (A, ≤)
(considered as a partially ordered set) with least element 0 and greatest element 1,
within which every element x has a complement ¬x such that
x ∧ ¬x = 0 and x ∨ ¬x = 1
Here ∧ and ∨ are used to denote the infimum (meet) and supremum (join) of two
elements. Again, if complements in the above sense exist, then they are uniquely
determined.
The algebraic and the order-theoretic perspectives can usually be used interchangeably,
and both are of great use for importing results and concepts from universal algebra
and order theory. In many practical examples an ordering relation, conjunction,
disjunction, and negation are all naturally available, so that it is straightforward to
exploit this relationship.
Principle of duality
One can also apply general insights from duality in order theory to Boolean algebras.
Especially, the order dual of every Boolean algebra, or, equivalently, the algebra
obtained by exchanging ∧ and ∨, is also a Boolean algebra. In general, any law valid for
Boolean algebras can be transformed into another valid, dual law by exchanging 0 with
1, ∧ with ∨, and ≤ with ≥.
Other notation
The operators of Boolean algebra may be represented in various ways. Often they are
simply written as AND, OR and NOT. In describing circuits, NAND (NOT AND), NOR (NOT
OR) and XOR (exclusive OR) may also be used. Mathematicians, engineers, and
programmers often use + for OR and · for AND (since in some ways those operations
are analogous to addition and multiplication in other algebraic structures and this
notation makes it very easy to get sum of products form for people who are familiar
with normal algebra) and represent NOT by a line drawn above the expression being
negated. Sometimes, the symbol ~ or ! is used for NOT.
Here we use another common notation with "meet" for AND, "join" for OR, and ¬ for
NOT.
An ideal of the Boolean algebra A is a subset I such that for all x, y in I we have x ∨ y in
I and for all a in A we have a ∧ x in I. This notion of ideal coincides with the notion of
ring ideal in the Boolean ring A. An ideal I of A is called prime if I ≠ A and if a ∧ b in I
always implies a in I or b in I. An ideal I of A is called maximal if I ≠ A and if the only
ideal properly containing I is A itself. These notions coincide with the ring-theoretic ones
of prime ideal and maximal ideal in the Boolean ring A.
The dual of an ideal is a filter. A filter of the Boolean algebra A is a subset p such that
for all x, y in p we have x ∧ y in p and for all a in A, if a ∨ x = a then a in p.
Stone's celebrated representation theorem for Boolean algebras states that every
Boolean algebra A is isomorphic to the Boolean algebra of all closed-open sets in some
(compact totally disconnected Hausdorff) topological space.
In 1933, Edward Huntington showed that Boolean algebra can be axiomatized by just
three equations (writing x + y for join and n(x) for complement):
1. Commutativity: x + y = y + x.
2. Associativity: (x + y) + z = x + (y + z).
3. Huntington equation: n(n(x) + y) + n(n(x) + n(y)) = x.
Herbert Robbins immediately asked: If the Huntington equation is replaced with its
dual, to wit:
4. Robbins equation: n(n(x + y) + n(x + n(y))) = x,
do (1), (2), and (4) form a basis for Boolean algebra? Calling an algebra satisfying (1),
(2), and (4) a Robbins algebra, the question then becomes: Is every Robbins algebra a
Boolean algebra? This question remained open for decades and became a favorite
question of Alfred Tarski and his students; it was finally answered in the affirmative in
1996, when William McCune proved the conjecture with the automated theorem prover EQP.
Boolean algebra
1. Commonly, and especially in computer science and digital electronics, this term is
used to mean two-valued logic.
2. This is in stark contrast with the definition used by pure mathematicians who in the
1960s introduced "Boolean-valued models" into logic precisely because a
"Boolean-valued model" is an interpretation of a theory that allows more than two
possible truth values!
Strangely, a Boolean algebra (in the mathematical sense) is not strictly an algebra, but
is in fact a lattice. A Boolean algebra is sometimes defined as a "complemented
distributive lattice".
Boole's work which inspired the mathematical definition concerned algebras of sets,
involving the operations of intersection, union and complement on sets. Such algebras
obey the following identities where the operators ^, V, - and constants 1 and 0 can be
thought of either as set intersection, union, complement, universal, empty; or as
two-valued logic AND, OR, NOT, TRUE, FALSE; or any other conforming system.
a ^ b = b ^ a                      a V b = b V a                      (commutative laws)
(a ^ b) ^ c = a ^ (b ^ c)          (a V b) V c = a V (b V c)          (associative laws)
a ^ (b V c) = (a ^ b) V (a ^ c)    a V (b ^ c) = (a V b) ^ (a V c)    (distributive laws)
a ^ a = a                          a V a = a                          (idempotence laws)
--a = a                                                               (involution law)
-(a ^ b) = (-a) V (-b)             -(a V b) = (-a) ^ (-b)             (de Morgan's laws)
a ^ -a = 0                         a V -a = 1                         (complement laws)
a ^ 1 = a                          a V 0 = a                          (identity laws)
a ^ 0 = 0                          a V 1 = 1                          (bound laws)
-1 = 0                             -0 = 1
There are several common alternative notations for the "-" or logical complement
operator.
If a and b are elements of a Boolean algebra, we define a <= b to mean that a ^ b = a,
or equivalently a V b = b. Thus, for example, if ^, V and - denote set intersection, union
and complement then <= is the inclusive subset relation. The relation <= is a partial
ordering, though it is not necessarily a linear ordering since some Boolean algebras
contain incomparable values.
Note that these laws only refer explicitly to the two distinguished constants 1 and 0
(sometimes written as LaTeX \top and \bot), and in two-valued logic there are no
others, but according to the more general mathematical definition, in some systems
variables a, b and c may take on other values as well.
History
The term "Boolean algebra" honors George Boole (1815–1864), a self-educated English
mathematician. The algebraic system of logic he formulated in his 1854 monograph The
Laws of Thought differs from that described above in some important respects. For
example, conjunction and disjunction in Boole were not a dual pair of operations.
Boolean algebra emerged in the 1860s, in papers written by William Jevons and Charles
Peirce. To the 1890 Vorlesungen of Ernst Schröder we owe the first systematic
presentation of Boolean algebra and distributive lattices. The first extensive treatment
of Boolean algebra in English is A. N. Whitehead's 1898 Universal Algebra. Boolean
algebra as an axiomatic algebraic structure in the modern axiomatic sense begins with
a 1904 paper by Edward Vermilye Huntington. Boolean algebra came of age as serious
mathematics with the work of Marshall Stone in the 1930s, and with Garrett Birkhoff's
1940 Lattice Theory. In the 1960s, Paul Cohen, Dana Scott, and others found deep new
results in mathematical logic and axiomatic set theory using offshoots of Boolean
algebra, namely forcing and Boolean-valued models.
The recommended way to do so on a breadboard would be to arrange the resistors in
approximately the same pattern as in the schematic, for ease of relating the circuit to
the schematic.
If 24 volts is required and we only have 6-volt
batteries available, four may be connected in
series to achieve the same effect:
Building a circuit with components secured to a terminal strip isn't as easy as plugging
components into a breadboard, principally because the components cannot be physically
arranged to resemble the schematic layout. Instead, the builder must understand how
to "bend" the schematic's representation into the real-world layout of the strip. Consider
one example of how the same four-resistor circuit could be built on a terminal strip:
Another terminal strip layout, simpler to understand and relate to the schematic,
involves anchoring parallel resistors (R1//R2 and R3//R4) to the same two terminal
points on the strip like this:
Building more complex
circuits on a terminal strip involves the same spatial-reasoning skills, but of course
requires greater care and planning. Take for instance this complex circuit, represented in
schematic form:
Next, begin connecting components together wire by wire as shown in the schematic.
Over-draw connecting lines in the schematic to indicate completion in the real circuit.
Watch this sequence of illustrations as each individual wire is identified in the schematic,
then added to the real circuit:
Although
there are minor variations possible with this terminal strip circuit, the choice of
connections shown in this example sequence is both electrically accurate (electrically
identical to the schematic diagram) and carries the additional benefit of not burdening
any one screw terminal on the strip with more than two wire ends, a good practice in any
terminal strip circuit.
An example of a "variant" wire connection might be the very last wire added (step 11),
which I placed between the left terminal of R2 and the left terminal of R3. This last wire
completed the parallel connection between R2 and R3 in the circuit. However, I could
have placed this wire instead between the left terminal of R2 and the right terminal of
R1, since the right terminal of R1 is already connected to the left terminal of R3 (having
been placed there in step 9) and so is electrically common with that one point. Doing
this, though, would have resulted in three wires secured to the right terminal of R1
instead of two, which is a faux pas in terminal strip etiquette. Would the circuit have
worked this way? Certainly! It's just that more than two wires secured at a single
terminal makes for a "messy" connection: one that is aesthetically unpleasing and may
place undue stress on the screw terminal.
Integrated Circuits are usually called ICs or chips. They are complex circuits
which have been etched onto tiny chips of semiconductor (silicon). The chip is packaged
in a plastic holder with pins spaced on a 0.1" (2.54mm) grid which will fit the holes on
stripboard and breadboards. Very fine wires inside the package link the chip to the
pins.
Chip holders
Chip holders are only needed when soldering, so they are not used on breadboards.
Commercially produced circuit boards often have chips soldered directly to the board
without a chip holder; this is usually done by a machine which is able to work very
quickly. Don't attempt to do this yourself, because you are likely to destroy the
chip and it will be difficult to remove without damage by de-soldering.
Static precautions
Many ICs are static sensitive and can be damaged when you touch them because your
body may have become charged with static electricity, from your clothes for example.
Static sensitive ICs will be supplied in antistatic packaging with a warning label and they
should be left in this packaging until you are ready to use them.
It is usually adequate to earth your hands by touching a metal water pipe or window
frame before handling the IC but for the more sensitive (and expensive!) ICs special
equipment is available, including earthed wrist straps and earthed work surfaces. You can
make an earthed work surface with a sheet of aluminum kitchen foil, using a crocodile
clip to connect the foil to a metal water pipe or window frame with a 10k resistor in
series.
Datasheets
Datasheets are available for most ICs giving detailed information about their ratings
and functions. In some cases example circuits are shown. The large amount of
information with symbols and abbreviations can make datasheets seem overwhelming to
a beginner, but they are worth reading as you become more confident because they
contain a great deal of useful information for more experienced users designing and
testing circuits.
The maximum sinking and sourcing currents for a chip output are usually the
same but there are some exceptions, for example 74LS TTL logic chips can sink up to
16mA but only source 2mA.
Using diodes to combine outputs
The outputs of chips (ICs) must never be directly
connected together. However, diodes can be used to
combine two or more digital (high/low) outputs from a
chip such as a counter. This can be a useful way of
producing simple logic functions without using logic gates!
Low power versions of the 555 are made, such as the ICM7555, but these should
only be used when specified (to increase battery life) because their maximum output
current of about 20mA (with 9V supply) is too low for many standard 555 circuits. The
ICM7555 has the same pin arrangement as a standard 555.
For most new projects the 74HC family is the best choice. The older 4000 series is
the only family which works with a supply voltage of more than 6V. The 74LS and 74HCT
families require a 5V supply so they are not convenient for battery operation.
The following summarizes the important properties of the most popular logic families:

Technology
  4000 Series: CMOS
  74HC: High-speed CMOS
  74HCT: High-speed CMOS, TTL compatible
  74LS: Low-power Schottky TTL

Power Supply
  4000 Series: 3 to 15V
  74HC: 2 to 6V
  74HCT: 5V ±0.5V
  74LS: 5V ±0.25V

Inputs
  4000 Series and 74HC: Very high impedance. Unused inputs must be connected to +Vs
  or 0V. Inputs cannot be reliably driven by 74LS outputs unless a 'pull-up' resistor
  is used (see below).
  74HCT: Very high impedance. Unused inputs must be connected to +Vs or 0V. Compatible
  with 74LS (TTL) outputs.
  74LS: 'Float' high to logic 1 if unconnected. 1mA must be drawn out to hold them at
  logic 0.

Outputs
  4000 Series: Can sink and source about 5mA (10mA with 9V supply), enough to light an
  LED. To switch larger currents use a transistor.
  74HC and 74HCT: Can sink and source about 20mA, enough to light an LED. To switch
  larger currents use a transistor.
  74LS: Can sink up to 16mA (enough to light an LED), but source only about 2mA. To
  switch larger currents use a transistor.

Fan-out
  4000 Series: One output can drive up to 50 CMOS, 74HC or 74HCT inputs, but only one
  74LS input.
  74HC and 74HCT: One output can drive up to 50 CMOS, 74HC or 74HCT inputs, but only
  10 74LS inputs.
  74LS: One output can drive up to 10 74LS inputs or 50 74HCT inputs.

Maximum Frequency
  4000 Series: about 1MHz
  74HC: about 25MHz
  74HCT: about 25MHz
  74LS: about 35MHz

Power consumption of the IC itself
  4000 Series: a few µW
  74HC: a few µW
  74HCT: a few µW
  74LS: a few mW
It is best to build a circuit using just one logic
family, but if necessary the different families may be
mixed, provided the power supply is suitable for all of
them. For example, mixing 4000 and 74HC requires the
power supply to be in the range 3 to 6V. A circuit
which includes 74LS or 74HCT ICs must have a 5V
supply.
A 74LS output cannot reliably drive a 4000 or
74HC input unless a 'pull-up' resistor of 2.2k is
connected between the +5V supply and the input to correct the slightly different voltage
ranges used for logic 0.
Driving 4000 or 74HC inputs from a 74LS output.
The 74HC family has High-speed CMOS circuitry, combining the speed of TTL
with the very low power consumption of the 4000 series. They are CMOS ICs with the
same pin arrangements as the older 74LS family. Note that 74HC inputs cannot be
reliably driven by 74LS outputs because the voltage ranges used for logic 0 are not
quite compatible; use 74HCT instead.
The 74HCT family is a special version of 74HC with 74LS TTL-compatible inputs
so 74HCT can be safely mixed with 74LS in the same system. In fact 74HCT can be
used as low-power direct replacements for the older 74LS ICs in most circuits. The
minor disadvantage of 74HCT is a lower immunity to noise, but this is unlikely to be a
problem in most situations.
Beware that the 74 series is often still called the 'TTL series' even though the latest ICs
do not use TTL!
The CMOS circuitry used in the 74HC and 74HCT series ICs means that they are
static sensitive. Touching a pin while charged with static electricity (from your clothes
for example) may damage the IC. In fact most ICs in regular use are quite tolerant and
earthing your hands by touching a metal water pipe or window frame before handling
them will be adequate. ICs should be left in their protective packaging until you are
ready to use them.
PIC microcontrollers
A PIC is a Programmable Integrated Circuit microcontroller, a
'computer-on-a-chip'. They have a processor and memory to run a program responding
to inputs and controlling outputs, so they can easily achieve complex functions which
would require several conventional ICs.
If you think PICs are not for you because you have never written a computer
program, please look at the PICAXE system! It is very easy to get started using a few
simple BASIC commands and there are a number of projects available as kits which are
ideal for beginners.
High-performance integrated circuits have traditionally been characterized by the clock
frequency at which they operate. Gauging the ability of a circuit to operate at the
specified speed requires an ability to measure, during the design process, its delay at
numerous steps. Moreover, delay calculation must be incorporated into the inner loop of
timing optimizers at various phases of design, such as logic synthesis, layout
(placement and routing), and in in-place optimizations performed late in the design
cycle. While such timing measurements can theoretically be performed using a rigorous
circuit simulation, such an approach is liable to be too slow to be practical. Static timing
analysis plays a vital role in facilitating the fast and reasonably accurate measurement
of circuit timing. The speedup comes from the use of simplified delay models and from
the fact that its ability to consider the effects of logical interactions between
signals is limited. Nevertheless, it has become a mainstay of design over the last few
decades; one of the earliest descriptions of a static timing approach was published in
the 1970s.
Purpose
In a synchronous digital system, data is supposed to move in lockstep, advancing one
stage on each tick of the clock signal. This is enforced by synchronizing elements such
as flip-flops or latches, which copy their input to their output when instructed to do so
by the clock. To first order, only two kinds of timing errors are possible in such a
system:
· A hold time violation, when a signal arrives too early, and advances one clock
cycle before it should
· A setup time violation, when a signal arrives too late, and misses the time when
it should advance.
The time when a signal arrives can vary due to many reasons - the input data may
vary, the circuit may perform different operations, the temperature and voltage may
change, and there are manufacturing differences in the exact construction of each part.
The main goal of static timing analysis is to verify that despite these possible variations,
all signals will arrive neither too early nor too late, and hence proper circuit operation
can be assured.
Also, since STA is capable of verifying every path, apart from helping locate setup and
hold time violations, it can detect other serious problems like glitches, slow paths and
clock skew.
Definitions
· The critical path is defined as the path between an input and an output with the
maximum delay. Once the circuit timing has been computed by one of the
techniques below, the critical path can easily be found by using a traceback method.
· The arrival time of a signal is the time elapsed for a signal to arrive at a certain
point. The reference, or time 0.0, is often taken as the arrival time of a clock signal.
To calculate the arrival time, delay calculation of all the components of the path will
be required. Arrival times, and indeed almost all times in timing analysis, are
normally kept as a pair of values - the earliest possible time at which a signal can
change, and the latest.
· Another useful concept is required time. This is the latest time at which a signal
can arrive without making the clock cycle longer than desired. The computation of
the required time proceeds as follows. At each primary output, the required times
for rise/fall are set according to the specifications provided to the circuit. Next, a
backward topological traversal is carried out, processing each gate when the
required times at all of its fanouts are known.
· The slack associated with each connection is the difference between the required
time and the arrival time. A positive slack s at a node implies that the arrival time
at that node may be increased by s without affecting the overall delay of the circuit.
Conversely, negative slack implies that a path is too slow, and the path must be sped
up (or the reference signal delayed) if the whole circuit is to work at the desired
speed. (A small sketch of these computations follows this list.)
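To illustrate how arrival time, required time and slack fit together, here is a small Python sketch (entirely our own; the node names, delays and the 7.0 time-unit constraint are invented). Arrival times are computed by a forward topological traversal, required times by a backward traversal, and slack is their difference.

    from collections import defaultdict

    # Toy timing graph: (source, sink) -> delay. All values are made up.
    edges = {("in", "g1"): 2.0, ("in", "g2"): 1.0, ("g1", "g3"): 3.0,
             ("g2", "g3"): 2.0, ("g3", "out"): 1.0}
    order = ["in", "g1", "g2", "g3", "out"]     # a topological order

    # Forward pass: latest arrival time at each node (time 0.0 at the input).
    arrival = defaultdict(float)
    for (u, v), d in sorted(edges.items(), key=lambda e: order.index(e[0][0])):
        arrival[v] = max(arrival[v], arrival[u] + d)

    # Backward pass: required times, starting from the output specification.
    required = {n: float("inf") for n in order}
    required["out"] = 7.0                       # invented clock constraint
    for (u, v), d in sorted(edges.items(), key=lambda e: -order.index(e[0][0])):
        required[u] = min(required[u], required[v] - d)

    for n in order:
        print(n, "slack =", required[n] - arrival[n])  # negative => too slow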
The use of corners in static timing analysis has several limitations. It may be overly
optimistic, since it assumes perfect tracking - if one gate is fast, all gates are assumed
fast, or if the voltage is low for one gate, it's also low for all others. Corners may also be
overly pessimistic, for the worst case corner may seldom occur. In an IC, for example, it
may not be rare to have one metal layer at the thin or thick end of its allowed range,
but it would be very rare for all 10 layers to be at the same limit, since they are
manufactured independently. Statistical STA, which replaces delays with distributions,
and tracking with correlation, is a more sophisticated approach to the same problem.
While the CPM-based methods are the dominant ones in use today, other methods for
traversing circuit graphs, such as depth-first search, have been used by various timing
analyzers.
LESSON III
Flip-flops are synchronous bistable devices. The term synchronous means the output
changes state only when the clock input is triggered. That is, changes in the output
occur in synchronization with the clock.
1. Monostable multivibrator (also called one-shot) has only one stable state. It
produces a single pulse in response to a triggering input.
2. Bistable multivibrator exhibits two stable states. It is able to retain the two SET
and RESET states indefinitely. It is commonly used as a basic building block for
counters, registers and memories.
3. Astable multivibrator has no stable state at all. It is used primarily as an oscillator
to generate periodic pulse waveforms for timing purposes.
Edge-Triggered Flip-flops
An edge-triggered flip-flop changes states either at the positive edge (rising edge) or at
the negative edge (falling edge) of the clock pulse on the control input. The three basic
types are introduced here: S-R, J-K and D.
The S-R, J-K and D inputs are called synchronous inputs because data on these
inputs are transferred to the flip-flop's output only on the triggering edge of the clock
pulse. On the other hand, the direct set (SET) and clear (CLR) inputs are called
asynchronous inputs, as they are inputs that affect the state of the flip-flop independent
of the clock. For the synchronous operations to work properly, these asynchronous inputs
must both be kept LOW.
Edge-triggered D flip-flop
The operation of a D flip-flop is much simpler. It has only one input in addition to the
clock. It is very useful when a single data bit (0 or 1) is to be stored. If there is a HIGH
on the D input when a clock pulse is applied, the flip-flop SETs and stores a 1. If there is
a LOW on the D input when a clock pulse is applied, the flip-flop RESETs and stores a 0.
The truth table below summarizes the operation of the positive edge-triggered D flip-flop.
As before, the negative edge-triggered flip-flop works the same except that the falling
edge of the clock pulse is the triggering edge.
Pulse-Triggered (Master-Slave) Flip-flops
The term pulse-triggered means that data are entered into the flip-flop on the rising
edge of the clock pulse, but the output does not reflect the input state until the falling
edge of the clock pulse. As this kind of flip-flop is sensitive to any change of the input
levels while the clock pulse is still HIGH, the inputs must be set up prior to the clock
pulse's rising edge and must not be changed before the falling edge. Otherwise,
ambiguous results will occur.
The three basic types of pulse-triggered flip-flops are S-R, J-K and D. Their logic symbols
are shown below. Notice that they do not have the dynamic input indicator at the clock
input but have postponed output symbols at the outputs.
The truth tables for the above pulse-triggered flip-flops are all the same as those for
the edge-triggered flip-flops, except for the way they are clocked. These flip-flops are
also called Master-Slave flip-flops simply because their internal construction is divided
into two sections. The slave section is basically the same as the master section except
that it is clocked on the inverted clock pulse and is controlled by the outputs of the master
section rather than by the external inputs. The logic diagram for a basic master-slave S-R
flip-flop is shown below.
Data Lock-Out Flip-flops
The data lock-out flip-flop is similar to the pulse-triggered (master-slave) flip-flop
except it has a dynamic clock input. The dynamic clock disables (locks out) the data
inputs after the rising edge of the clock pulse. Therefore, the inputs do not need to be
held constant while the clock pulse is HIGH.
The master section of this flip-flop is like an edge-triggered device. The slave section
becomes a pulse-triggered device to produce a postponed output on the falling edge of
the clock pulse.
The logic symbols of S-R, J-K and D data lock-out flip-flops are shown below. Notice they
all have the dynamic input indicator as well as the postponed output symbol.
Again, the above data lock-out flip-flops have the same truth tables as those for the
edge-triggered flip-flops, except for the way they are clocked.
Operating Characteristics
The operating characteristics mentioned here apply to all flip-flops regardless of the
particular form of the circuit. They are typically found in data sheets for integrated
circuits. They specify the performance, operating requirements, and operating limitations
of the circuit.
Propagation Delay Time - is the interval of time required after an input signal has been
applied for the resulting output change to occur.
Set-Up Time - is the minimum interval required for the logic levels to be maintained
constantly on the inputs (J and K, or S and R, or D) prior to the triggering edge of the
clock pulse in order for the levels to be reliably clocked into the flip-flop.
Hold Time - is the minimum interval required for the logic levels to remain on the
inputs after the triggering edge of the clock pulse in order for the levels to be reliably
clocked into the flip-flop.
Maximum Clock Frequency - is the highest rate at which a flip-flop can be reliably
triggered.
Pulse Widths - are the minimum pulse widths specified by the manufacturer for the
Clock, SET and CLEAR inputs.
Frequency Division
When a pulse waveform is applied to the clock input of a J-K flip-flop that is connected to
toggle, the Q output is a square wave with half the frequency of the clock input. If more
flip-flops are connected together as shown in the figure below, further division of the clock
frequency can be achieved.
The Q output of the second flip-flop is one-fourth the frequency of the original clock
input. This is because the frequency of the clock is divided by 2 by the first flip-flop, then
divided by 2 again by the second flip-flop. If more flip-flops are connected this way, the
frequency division would be 2 to the power n, where n is the number of flip-flops.
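A short simulation makes the division visible. The sketch below (our own illustration, not from the text) models a chain of three toggling flip-flops, each clocked by the falling edge of the previous stage, so stage k runs at 1/2^(k+1) of the input clock; read together, the outputs also form the binary count used in the next section.

    # Ripple chain of toggle flip-flops: each stage toggles on the falling
    # edge of the stage before it, halving the frequency at every step.
    n = 3
    q = [0] * n
    for pulse in range(1, 9):          # eight input clock pulses
        falling = True                 # each input pulse ends in a falling edge
        for k in range(n):
            if falling:
                q[k] ^= 1              # this stage toggles...
                falling = (q[k] == 0)  # ...and clocks the next stage if it fell
            else:
                falling = False        # later stages hold their state
        print("pulse", pulse, " Q2 Q1 Q0 =", q[2], q[1], q[0])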
Counting
Another very important application of flip-flops is
in digital counters, which are covered in detail in
the next chapter.
A counter that counts from 0 to 3 is illustrated
in the timing diagram on the right. The two-bit
binary sequence repeats every four clock
pulses. When it counts to 3, it recycles back to
0 to begin the sequence again.
Flip-flop (electronics)
Flip-flops can be either simple or clocked. Simple flip-flops consist of two cross-coupled
inverting elements – transistors, or NAND, or NOR-gates – perhaps augmented by some
enable/disable (gating) mechanism. Clocked devices are specially designed for
synchronous (time-discrete) systems and therefore ignore their inputs except at the
transition of a dedicated clock signal (known as clocking, pulsing, or strobing). This
causes the flip-flop to either change or retain its output signal based upon the values of
the input signals at the transition. Some flip-flops change output on the rising edge of
the clock, others on the falling edge.
Clocked flip-flops are typically implemented as master-slave devices* where two basic
flip-flops (plus some additional logic) collaborate to make the device insensitive to spikes
and noise between the short clock transitions; they nevertheless also often include
asynchronous clear or set inputs which may be used to change the current output
independent of the clock.
Flip-flops can be further divided into types that have found common applicability in both
asynchronous and clocked sequential systems: the SR ("set-reset"), D ("data"), T
("toggle"), and JK types are the common ones; all of which may be synthetisized from
(most) other types by a few logic gates. The behavior of a particular type can be
described by what is termed the characteristic equation, which derives the "next" (i.e.,
after the next clock pulse) output, Qnext, in terms of the input signal(s) and/or the
current output, Q.
The first electronic flip-flop was invented in 1919 by William Eccles and F. W. Jordan
[1]. It was initially called the Eccles-Jordan trigger circuit and consisted of two
active elements (radio tubes). The name flip-flop was later derived from the sound
produced on a speaker connected to the output of one of the back-coupled amplifiers
during the trigger process within the circuit.
* Early master-slave devices actually remained (half) open between the first and
second edge of a clocking pulse; today most flip-flops are designed so they may be
clocked by a single edge as this gives large benefits regarding noise immunity, without
any significant downsides.
A circuit symbol for a T-type flip-flop, where > is the clock input, T is the toggle input and
Q is the stored data output.
If the T input is high, the T flip-flop changes state ("toggles") whenever the clock input
is strobed. If the T input is low, the flip-flop holds the previous value. This behavior is
described by the characteristic equation
Qnext = T ⊕ Q
(or, without benefit of the XOR operator, the equivalent Qnext = T*Q' + T'*Q)
and can be described in a truth table:
T   Q   Qnext   Comment
0   0   0       hold state
0   1   1       hold state
1   0   1       toggle
1   1   0       toggle
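The characteristic equation can be checked against this truth table with a one-line model (our own sketch):

    # T flip-flop: Qnext = T XOR Q (equivalently T*Q' + T'*Q).
    def t_next(t, q):
        return t ^ q

    for t in (0, 1):
        for q in (0, 1):
            print("T =", t, " Q =", q, " Qnext =", t_next(t, q))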
JK flip-flop
A circuit symbol for a JK flip-flop, where > is the clock input, J and K are data inputs, Q is
the stored data output, and Q' is the inverse of Q.
The characteristic equation of the JK flip-flop is Qnext = J*Q' + K'*Q
and the corresponding truth table is:
J   K   Qnext   Comment
0   0   Q       hold state
0   1   0       reset
1   0   1       set
1   1   Q'      toggle
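The same check works for the JK equation (again a sketch of ours): J=K=0 holds, J=0 and K=1 resets, J=1 and K=0 sets, and J=K=1 toggles.

    # JK flip-flop: Qnext = J*Q' + K'*Q.
    def jk_next(j, k, q):
        return (j & (1 - q)) | ((1 - k) & q)

    for j in (0, 1):
        for k in (0, 1):
            print("J =", j, " K =", k,
                  " Qnext(Q=0) =", jk_next(j, k, 0),
                  " Qnext(Q=1) =", jk_next(j, k, 1))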
The origin of the name for the JK flip-flop is detailed by P. L. Lindley, a JPL engineer, in
a letter to EDN, an electronics design magazine. The letter is dated June 13, 1968, and
was published in the August edition of the newsletter. In the letter, Mr. Lindley explains
that he heard the story of the JK flip-flop from Dr. Eldred Nelson, who is responsible for
coining the term while working at Hughes Aircraft.
Flip-flops in use at Hughes at the time were all of the type that came to be known as
J-K. In designing a logical system, Dr. Nelson assigned letters to flip-flop inputs as
follows: #1: A & B, #2: C & D, #3: E & F, #4: G & H, #5: J & K. Given the size of the
system that he was working on, Dr. Nelson realized that he was going to run out of
letters, so he decided to use J and K as the set and reset input of each flip-flop in his
system (using subscripts or some such to distinguish the flip-flops), since J and K were
"nice, innocuous letters."
Dr. Montgomery Phister, Jr., an engineer under Dr. Nelson at Hughes, in his book
"Logical Design of Digital Computers" (Wiley, 1958) picked up the idea that J and K were
the set and reset input for a "Hughes type" of flip-flop, which he then termed "J-K
flip-flops." He also defined R-S, T, D, and R-S-T flip-flops, and showed how one could
use Boolean Algebra to specify their interconnections so as to carry out complex
functions.
D flip-flop
The D flip-flop can be interpreted as a primitive delay line or zero-order hold, since the
data is posted at the output one clock cycle after it arrives at the input. It is called a
delay flip-flop since the output takes on the value present at the D (data) input.
D   Q   >        Qnext
0   X   Rising   0
1   X   Rising   1
These flip flops are very useful, as they form the basis for shift registers, which are an
essential part of many electronic devices.
The advantage of this circuit over the D-type latch is that it "captures" the signal at the
moment the clock goes high, and subsequent changes of the data line do not matter,
even if the signal line has not yet gone low again.
Master-slave D flip-flop
A master-slave D flip-flop is created by connecting two gated D latches in series and
inverting the enable input to one of them. It is called master-slave because the second
latch in the series only changes in response to a change in the first (master) latch.
A master-slave D flip-flop. It responds on the negative edge of the enable input (usually a
clock).
For a positive-edge triggered master-slave D flip-flop, when the clock signal is low
(logical 0) the “enable” seen by the first or “master” D latch (the inverted clock signal)
is high (logical 1). This allows the “master” latch to store the input value when the clock
signal transitions from low to high. As the clock signal goes high (0 to 1) the inverted
“enable” of the first latch goes low (1 to 0) and the value seen at the input to the
master latch is “locked”. Nearly simultaneously, the twice inverted “enable” of the
second or “slave” D latch transitions from low to high (0 to 1) with the clock signal. This
allows the signal captured at the rising edge of the clock by the now “locked” master
latch to pass through the “slave” latch. When the clock signal returns to low (1 to 0),
the output of the “slave” latch is "locked", and the value seen at the last rising edge of
the clock is held while the “master” latch begins to accept new values in preparation for
the next rising clock edge.
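The two-latch structure just described can be modeled directly. The sketch below (our own, simplified model) builds a positive-edge-triggered master-slave D flip-flop from two gated D latches with complementary enables; swapping the two enables would give the negative-edge version summarized in the table that follows.

    # Master-slave D flip-flop from two gated D latches.
    class GatedDLatch:
        def __init__(self):
            self.q = 0
        def step(self, d, enable):
            if enable:          # transparent while enabled, holds otherwise
                self.q = d
            return self.q

    master, slave = GatedDLatch(), GatedDLatch()

    def ms_dff(d, clk):
        m = master.step(d, enable=(clk == 0))    # master follows D while clk is low
        return slave.step(m, enable=(clk == 1))  # slave copies master while clk is high

    # One full clock cycle: Q takes the value of D at the rising edge and holds it.
    for clk, d in [(0, 1), (1, 1), (0, 0), (1, 0)]:
        print("clk =", clk, " D =", d, " Q =", ms_dff(d, clk))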
D   Q   >         Qnext
0   X   Falling   0
1   X   Falling   1
Most D-type flip-flops in ICs have the capability to be set and reset, much like an SR
flip-flop. Usually, the illegal S = R = 1 condition is resolved in D-type flip-flops.
Inputs            Outputs
S   R   D   >     Q   Q'
0   1   X   X     0   1
1   0   X   X     1   0
1   1   X   X     1   1
Edge-triggered D flip-flop
A more efficient way to make a D flip-flop is not as easy to understand, but it works the
same way. While the master-slave D flip-flop is also triggered on the edge of a clock, its
components are each triggered by clock levels. The "edge-triggered D flip-flop" does not
have the master-slave properties.
A positive-edge-triggered D flip-flop.
Uses
· A single flip-flop can be used to store one bit, or binary digit, of data.
· Static RAM, which is the primary type of memory used in registers to store
numbers in computers and in many caches, is built out of flip-flops.
· Any one of the flip-flop types can be used to build any of the others. The data
contained in several such flip-flops may represent the state of a sequencer, the
value of a counter, an ASCII character in a computer's memory or any other piece
of information.
· One use is to build finite state machines from electronic logic. The flip-flops
remember the machine's previous state, and digital logic uses that state to
calculate the next state.
· The T flip-flop is useful for constructing various types of counters. Repeated signals
to the clock input will cause the flip-flop to change state once per high-to-low
transition of the clock input, if its T input is "1". The output from one flip-flop can be
fed to the clock input of a second and so on. The final output of the circuit,
considered as the array of outputs of all the individual flip-flops, is a count, in
binary, of the number of cycles of the first clock input, up to a maximum of 2^n - 1,
where n is the number of flip-flops used. See: Counters
· One of the problems with such a counter (called a ripple counter) is that the output
is briefly invalid as the changes ripple through the logic. There are two solutions to
this problem. The first is to sample the output only when it is known to be valid.
The second, more widely used, is to use a different type of circuit called a
synchronous counter. This uses more complex logic to ensure that the outputs of
the counter all change at the same, predictable time. See: Counters
Clocked flip-flops are prone to a problem called metastability, which happens when a
data or control input is changing at the instant of the clock pulse. The result is that the
output may behave unpredictably, taking many times longer than normal to settle to its
correct state, or even oscillating several times before settling. Theoretically it can take
infinite time to settle down. In a computer system this can cause corruption of data or a
program crash.
In many cases, metastability in flip-flops can be avoided by ensuring that the data and
control inputs are held constant for specified periods before and after the clock pulse,
called the setup time (tsu) and the hold time (th) respectively. These times are
specified in the data sheet for the device, and are typically between a few nanoseconds
and a few hundred picoseconds for modern devices.
Unfortunately, it is not always possible to meet the setup and hold criteria, because the
flip-flop may be connected to a real-time signal that could change at any time, outside
the control of the designer. In this case, the best the designer can do is to reduce the
probability of error to a certain level, depending on the required reliability of the circuit.
One technique for suppressing metastability is to connect two or more flip-flops in a
chain, so that the output of each one feeds the data input of the next, and all devices
share a common clock. With this method, the probability of a metastable event can be
reduced to a negligible value, but never to zero. The probability of metastability gets
closer and closer to zero as the number of flip-flops connected in series is increased.
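As a rough first-order illustration (our own model, with an invented per-stage probability), each extra flip-flop gives a metastable signal one more clock period to resolve, so the overall failure probability falls roughly geometrically with the number of stages:

    # Assume each stage independently fails to resolve within one clock
    # period with probability p; n chained stages then fail together with
    # probability about p**n. The value of p here is invented.
    p = 1e-4
    for n in range(1, 5):
        print(n, "stage synchronizer: failure probability ~", p ** n)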
Another important timing value for a flip-flop is the clock-to-output delay (common
symbol in data sheets: tCO) or propagation delay (tP), which is the time the flip-flop
takes to change its output after the clock edge. The time for a high-to-low transition
(tPHL) is sometimes different from the time for a low-to-high transition (tPLH).
When connecting flip-flops in a chain, it is important to ensure that the tCO of the first
flip-flop is longer than the hold time (tH) of the second flip-flop, otherwise the second
flip-flop will not receive the data reliably. The relationship between tCO and tH is
normally guaranteed if both flip-flops are of the same type.
The behavior of a sequential circuit is determined from the inputs, the outputs and the
states of its flip-flops. Both the output and the next state are a function of the inputs
and the present state.
The suggested analysis procedure of a sequential circuit is set out in Figure 6 below.
Derive the state table and state diagram for the sequential circuit shown in Figure 7.
Figure 7. Logic schematic of a sequential circuit.
SOLUTION:
STEP 1: First we derive the Boolean expressions for the inputs of each flip-flop in the
schematic, in terms of the external input Cnt and the flip-flop outputs Q1 and Q0. Since there are
two D flip-flops in this example, we derive two expressions for D1 and D0:
D0 = Cnt ⊕ Q0 = Cnt'*Q0 + Cnt*Q0'
D1 = Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'
These Boolean expressions are called excitation equations since they represent the inputs to
the flip-flops of the sequential circuit in the next clock cycle.
STEP 2: Derive the next-state equations by converting these excitation equations into
flip-flop characteristic equations. In the case of D flip-flops, Q(next) = D. Therefore the
next-state equations equal the excitation equations.
Q0(next) = D0 = Cnt'*Q0 + Cnt*Q0'
Q1(next) = D1 = Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'
STEP 3: Now convert these next-state equations into tabular form called the next-state
table.
Present State   Next State
Q1 Q0           Cnt = 0   Cnt = 1
0 0             0 0       0 1
0 1             0 1       1 0
1 0             1 0       1 1
1 1             1 1       0 0
Each row corresponds to a state of the sequential circuit and each column represents one
set of input values. Since we have two flip-flops, the number of possible states is four - that is,
Q1Q0 can be equal to 00, 01, 10, or 11. These are present states as shown in the table.
For the next state part of the table, each entry defines the value of the sequential circuit in the
next clock cycle after the rising edge of the Clk. Since this value depends on the present state
and the value of the input signals, the next state table will contain one column for each
assignment of binary values to the input signals. In this example, since there is only one input
signal, Cnt, the next-state table shown has only two columns, corresponding to Cnt = 0 and
Cnt = 1.
Note that each entry in the next-state table indicates the values of the flip-flops in the next
state if their value in the present state is in the row header and the input values in the column
header.
Each of these next-state values has been computed from the next-state equations in STEP 2.
STEP 4: The state diagram is generated directly from the next-state table, shown in Figure
8.
Figure 8. State diagram.
Each arc is labelled with the values of the input signals that cause the transition from the
present state (the source of the arc) to the next state (the destination of the arc).
In general, the number of states in a next-state table or a state diagram will equal 2^m, where
m is the number of flip-flops. Similarly, the number of arcs will equal 2^m x 2^k, where k is the
number of binary input signals. Therefore, in the state diagram, there must be four states and
eight transitions. Following these transition arcs, we can see that as long as Cnt = 1, the
sequential circuit goes through the states in the following sequence: 0, 1, 2, 3, 0, 1, 2, ....
On the other hand, when Cnt = 0, the circuit stays in its present state until Cnt changes to 1,
at which point the counting continues.
Since this sequence is characteristic of modulo-4 counting, we can conclude that the sequential
circuit in Figure 7 is a modulo-4 counter with one control signal, Cnt, which enables counting
when Cnt = 1 and disables it when Cnt = 0.
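This conclusion is easy to verify by simulating the next-state equations from STEP 2 directly; the following Python sketch (ours, not part of the original example) steps the counter through six cycles with Cnt = 1.

    # Next-state equations of the circuit in Figure 7.
    def step(q1, q0, cnt):
        q0n = (1 - cnt) & q0 | cnt & (1 - q0)
        q1n = (1 - cnt) & q1 | cnt & (1 - q1) & q0 | cnt & q1 & (1 - q0)
        return q1n, q0n

    q1 = q0 = 0
    for _ in range(6):                     # six clock cycles with Cnt = 1
        q1, q0 = step(q1, q0, cnt=1)
        print("Q1 Q0 =", q1, q0, " (state", 2 * q1 + q0, ")")
    # Prints states 1, 2, 3, 0, 1, 2 - modulo-4 counting, as claimed.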
Below, we show a timing diagram, representing four clock cycles, which enables us to observe
the behavior of the counter in greater detail.
Figure 9. Timing diagram.
In this timing diagram we have assumed that Cnt is asserted in clock cycle 0 at t0 and is
deasserted in clock cycle 3 at time t4. We have also assumed that the counter is in state
Q1Q0 = 00 in clock cycle 0. Note that on the clock's rising edge, at t1, the counter will go
to state Q1Q0 = 01 with a slight propagation delay; in cycle 2, after t2, to Q1Q0 = 10; and in
cycle 3, after t3, to Q1Q0 = 11. Since Cnt becomes 0 at t4, we know that the counter will stay
in state Q1Q0 = 11 in the next clock cycle.
In Example 1.1 we demonstrated the analysis of a sequential circuit that has no outputs by
developing a next-state table and state diagram which describes only the states and the
transitions from one state to the next. In the next example we complicate our analysis by
adding output signals, which means that we have to upgrade the next-state table and the
state diagram to identify the value of output signals in each state.
Example 1.2
Derive the next state, the output table and the state diagram for the sequential circuit
shown in Figure 10.
Figure 10. Logic schematic of a sequential circuit.
SOLUTION:
The input combinational logic in Figure 10 is the same as in Example 1.1, so the
excitation and the next-state equations will be the same as in Example 1.1.
Excitation equations:
D0 = Cnt ⊕ Q0 = Cnt'*Q0 + Cnt*Q0'
D1 = Cnt'*Q1 + Cnt*Q1'*Q0 + Cnt*Q1*Q0'
Output equation:
Y = Q1*Q0
As this equation shows, the output Y will equal 1 when the counter is in state Q1Q0 =
11, and it will stay 1 as long as the counter stays in that state.
Present State   Next State            Output
Q1 Q0           Cnt = 0   Cnt = 1     Y
0 0             0 0       0 1         0
0 1             0 1       1 0         0
1 0             1 0       1 1         0
1 1             1 1       0 0         1
State diagram:
Timing diagram:
Figure 12. Timing diagram of the sequential circuit in Figure 10.
Note that the counter will reach the state Q1Q0 = 11 only in the third clock cycle, so the
output Y will equal 1 after Q0 changes to 1. Since counting is disabled in the third clock
cycle, the counter will stay in the state Q1Q0 = 11 and Y will stay asserted in all
succeeding clock cycles until counting is enabled again.
The design of a synchronous sequential circuit starts from a set of specifications and
culminates in a logic diagram or a list of Boolean functions from which a logic diagram
can be obtained. In contrast to combinational logic, which is fully specified by a truth
table, a sequential circuit requires a state table for its specification. The first step in the
design of sequential circuits is to obtain a state table or an equivalence representation,
such as a state diagram.
The recommended steps for the design of sequential circuits are set out below.
Design of Sequential Circuits
Figure 13.
State diagram
From the state diagram, we can generate the state table shown in Table 9. Note that
there is no output section for this circuit. Two flip-flops are needed to represent the four
states and are designated Q0Q1. The input variable is labeled x.
Present State   Next State
Q0 Q1           x = 0   x = 1
0 0             0 0     0 1
0 1             1 0     0 1
1 0             1 0     1 1
1 1             1 1     0 0
Table 9. State table.
We shall now derive the excitation table and the combinational structure. The table is
now arranged in a different form, shown in Table 11, where the present state and input
variables are arranged in the form of a truth table. Remember, the excitation table for
the JK flip-flop was derived in Table 10.
Table 10. Excitation table for the JK flip-flop.
Q → Q(next)   J   K
0 → 0         0   X
0 → 1         1   X
1 → 0         X   1
1 → 1         X   0
Present State   Next State   Input   Flip-flop Inputs
Q0 Q1           Q0 Q1        x       J0 K0   J1 K1
0 0             0 0          0       0  X    0  X
0 0             0 1          1       0  X    1  X
0 1             1 0          0       1  X    X  1
0 1             0 1          1       0  X    X  0
1 0             1 0          0       X  0    0  X
1 0             1 1          1       X  0    1  X
1 1             1 1          0       X  0    X  0
1 1             0 0          1       X  1    X  1
Table 11. Excitation table of the circuit.
In the first row of Table 11, we have a transition for flip-flop Q0 from 0 in the present
state to 0 in the next state. In Table 10 we find that a transition of states from 0 to 0
requires that input J = 0 and input K = X. So 0 and X are copied in the first row under
J0 and K0 respectively. Since the first row also shows a transition for the flip-flop Q1
from 0 in the present state to 0 in the next state, 0 and X are copied in the first row
under J1 and K1. This process is continued for each row of the table and for each
flip-flop, with the input conditions as specified in Table 10.
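This row-by-row lookup can be mechanized; the helper below (our own illustration, using the string "X" for a don't-care) returns the J and K values that Table 10 prescribes for a given transition.

    # JK excitation: the (J, K) pair needed to take a flip-flop from q to q_next,
    # exactly as listed in Table 10 ("X" = don't care).
    def jk_excitation(q, q_next):
        table = {(0, 0): ("0", "X"), (0, 1): ("1", "X"),
                 (1, 0): ("X", "1"), (1, 1): ("X", "0")}
        return table[(q, q_next)]

    # First row of Table 11: both Q0 and Q1 go from 0 to 0.
    print(jk_excitation(0, 0))   # ('0', 'X') -> J0 = 0, K0 = X (and likewise J1, K1)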
The simplified Boolean functions for the combinational circuit can now be derived. The
input variables are Q0, Q1, and x; the outputs are the variables J0, K0, J1 and K1. The
information from the truth table is plotted on the Karnaugh maps shown in Figure 14.
Figure 15. Logic diagram of the sequential circuit.
Example 1.4 Design a sequential circuit whose state tables are specified in Table 12,
using D flip-flops.
Present State   Next State        Output
Q0 Q1           x = 0   x = 1     x = 0   x = 1
0 0             0 0     0 1       0       0
0 1             0 0     1 0       0       0
1 0             1 1     1 0       0       0
1 1             0 0     0 1       0       1
Table 12. State table.
Table 13. Excitation table for a D flip-flop.
Q → Q(next)   D
0 → 0         0
0 → 1         1
1 → 0         0
1 → 1         1
The next step is to derive the excitation table for the circuit being designed, which is
shown in Table 14. The output of the circuit is labeled Z.
Present State   Next State   Input   Flip-flop Inputs   Output
Q0 Q1           Q0 Q1        x       D0 D1              Z
0 0             0 0          0       0  0               0
0 0             0 1          1       0  1               0
0 1             0 0          0       0  0               0
0 1             1 0          1       1  0               0
1 0             1 1          0       1  1               0
1 0             1 0          1       1  0               0
1 1             0 0          0       0  0               0
1 1             0 1          1       0  1               1
Table 14. Excitation table.
Now plot the flip-flop inputs and output functions on the Karnaugh maps to derive the
Boolean expressions, as shown in Figure 16.
Figure 16. Karnaugh maps
RTL (Register Transfer Language) is used to represent the code being generated, in a
form closer to assembly language than to the high-level languages which GCC compiles.
RTL is generated from the GCC Abstract Syntax Tree representation, transformed by various passes in the
GCC 'middle-end', and then converted to assembly language. GCC currently uses the
RTL form to do a part of its optimization work.
The RTL generated for a program is different when GCC generates code for different
processors. However, the meaning of the RTL is more-or-less independent of the
target: it would usually be possible to read and understand a piece of RTL without
knowing what processor it was generated for. Similarly, the meaning of the RTL doesn't
usually depend on the original high-level language of the program.
LESSON IV
Types of memory
Many types of memory devices are available for use in modern computer
systems. As an embedded software engineer, you must be aware of the differences
between them and understand how to use each type effectively. In our discussion, we
will approach these devices from the software developer's perspective. Keep in mind
that the development of these devices took several decades and that their underlying
hardware differs significantly. The names of the memory types frequently reflect the
historical nature of the development process and are often more confusing than
insightful. Figure 1 classifies the memory devices we'll discuss as RAM, ROM, or a hybrid
of the two.
Types of RAM
The RAM family includes two important memory devices: static RAM (SRAM) and
dynamic RAM (DRAM). The primary difference between them is the lifetime of the data
they store. SRAM retains its contents as long as electrical power is applied to the chip. If
the power is turned off or lost temporarily, its contents will be lost forever. DRAM, on
the other hand, has an extremely short data lifetime, typically about four milliseconds.
This is true even when power is applied constantly.
In short, SRAM has all the properties of the memory you think of when you hear the
word RAM. Compared to that, DRAM seems kind of useless. By itself, it is. However, a
simple piece of hardware called a DRAM controller can be used to make DRAM behave
more like SRAM. The job of the DRAM controller is to periodically refresh the data stored
in the DRAM. By refreshing the data before it expires, the contents of memory can be
kept alive for as long as they are needed. So DRAM is as useful as SRAM after all.
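To make the refresh idea concrete, here is a toy model (entirely our own, with made-up numbers): a cell forgets its bit a fixed number of ticks after the last write unless the controller rewrites it first.

    # Toy DRAM cell: the stored bit decays LIFETIME ticks after the last write.
    LIFETIME = 4                        # invented decay time, in ticks

    class DramCell:
        def __init__(self, bit):
            self.bit, self.age = bit, 0
        def tick(self):
            self.age += 1
            if self.age >= LIFETIME:    # charge has leaked away
                self.bit = None
        def refresh(self):              # controller reads and rewrites the bit
            if self.bit is not None:
                self.age = 0

    cell = DramCell(1)
    for t in range(12):
        if t % 3 == 0:                  # refresh period shorter than LIFETIME
            cell.refresh()
        cell.tick()
    print("bit survived:", cell.bit)    # 1; without the refresh it would be None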
When deciding which type of RAM to use, a system designer must consider access time
and cost. SRAM devices offer extremely fast access times (approximately four times
faster than DRAM) but are much more expensive to produce. Generally, SRAM is used
only where access speed is extremely important. A lower cost-per-byte makes DRAM
attractive whenever large amounts of RAM are required. Many embedded systems
include both types: a small block of SRAM (a few kilobytes) along a critical data path
and a much larger block of DRAM (perhaps even megabytes) for everything else.
Types of ROM
Memories in the ROM family are distinguished by the methods used to write new data to
them (usually called programming), and the number of times they can be rewritten.
This classification reflects the evolution of ROM devices from hardwired to
programmable to erasable-and-programmable. A common feature of all these devices is
their ability to retain data and programs forever, even during a power failure.
The very first ROMs were hardwired devices that contained a preprogrammed set of
data or instructions. The contents of the ROM had to be specified before chip
production, so the actual data could be used to arrange the transistors inside the chip.
Hardwired memories are still used, though they are now called masked ROMs to
distinguish them from other types of ROM. The primary advantage of a masked ROM is
its low production cost. Unfortunately, the cost is low only when large quantities of the
same ROM are required.
One step up from the masked ROM is the PROM (programmable ROM), which is
purchased in an unprogrammed state. If you were to look at the contents of an
unprogrammed PROM, you would see that the data is made up entirely of 1's. The
process of writing your data to the PROM involves a special piece of equipment called a
device programmer. The device programmer writes data to the device one word at a
time by applying an electrical charge to the input pins of the chip. Once a PROM has
been programmed in this way, its contents can never be changed. If the code or data
stored in the PROM must be changed, the current device must be discarded. As a result,
PROMs are also known as one-time programmable (OTP) devices.
An EPROM (erasable-and-programmable ROM) is programmed in exactly the same
manner as a PROM. However, EPROMs can be erased and reprogrammed repeatedly.
To erase an EPROM, you simply expose the device to a strong source of ultraviolet light.
(A window in the top of the device allows the light to reach the silicon.) By doing this,
you essentially reset the entire chip to its initial, unprogrammed state. Though more
expensive than PROMs, their ability to be reprogrammed makes EPROMs an essential
part of the software development and testing process.
Hybrids
As memory technology has matured in recent years, the line between RAM and ROM
has blurred. Now, several types of memory combine features of both. These devices do
not belong to either group and can be collectively referred to as hybrid memory devices.
Hybrid memories can be read and written as desired, like RAM, but maintain their
contents without electrical power, just like ROM. Two of the hybrid devices, EEPROM
and flash, are descendants of ROM devices. These are typically used to store code. The
third hybrid, NVRAM, is a modified version of SRAM. NVRAM usually holds persistent
data.
Flash memory combines the best features of the memory devices described thus far.
Flash memory devices are high density, low cost, nonvolatile, fast (to read, but not to
write), and electrically reprogrammable. These advantages are overwhelming and, as a
direct result, the use of flash memory has increased dramatically in embedded systems.
From a software viewpoint, flash and EEPROM technologies are very similar. The major
difference is that flash devices can only be erased one sector at a time, not
byte-by-byte. Typical sector sizes are in the range 256 bytes to 16KB. Despite this
disadvantage, flash is much more popular than EEPROM and is rapidly displacing many
of the ROM devices as well.
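To make the sector-erase limitation concrete, here is a minimal C sketch of how a driver might update a single byte of flash; flash_read, flash_erase_sector and flash_write are hypothetical board-support routines, and the 4 KB sector size is an assumption (consult the datasheet for a real part):

#include <stdint.h>

#define SECTOR_SIZE 4096u   /* assumed sector size; consult the datasheet */

/* Hypothetical low-level driver routines supplied by board-support code. */
void flash_read(uint32_t addr, uint8_t *buf, uint32_t len);
void flash_erase_sector(uint32_t sector_base);
void flash_write(uint32_t addr, const uint8_t *buf, uint32_t len);

/* Update one byte inside a flash sector: read the whole sector into RAM,
   patch the byte, erase the sector (resetting it to all 1s), then write
   the patched copy back. An EEPROM could simply rewrite the one byte. */
void flash_update_byte(uint32_t addr, uint8_t value)
{
    static uint8_t shadow[SECTOR_SIZE];          /* RAM copy of the sector */
    uint32_t base = addr & ~(SECTOR_SIZE - 1u);  /* sector-aligned base    */

    flash_read(base, shadow, SECTOR_SIZE);
    shadow[addr - base] = value;
    flash_erase_sector(base);
    flash_write(base, shadow, SECTOR_SIZE);
}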
The third member of the hybrid memory class is NVRAM (non-volatile RAM). Nonvolatility is also a characteristic of the ROM and hybrid memories discussed previously.
However, an NVRAM is physically very different from those devices. An NVRAM is
usually just an SRAM with a battery backup. When the power is turned on, the NVRAM
operates just like any other SRAM. When the power is turned off, the NVRAM draws just
enough power from the battery to retain its data. NVRAM is fairly common in embedded
systems. However, it is expensive (even more expensive than SRAM, because of the battery), so its applications are typically limited to the storage of a few hundred bytes of system-critical information that can't be stored in any better way.
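Because NVRAM reads and writes like ordinary SRAM, that system-critical information is often kept in a small struct at a fixed memory-mapped address, guarded by a checksum so corruption can be detected at boot. The address, fields and checksum below are illustrative assumptions, not a real device map:

#include <stdint.h>

#define NVRAM_BASE 0x08000000u   /* assumed memory-mapped NVRAM address */

struct settings {
    uint32_t boot_count;
    uint32_t baud_rate;
    uint32_t checksum;   /* integrity check over the fields above */
};

static uint32_t settings_checksum(const struct settings *s)
{
    return s->boot_count ^ s->baud_rate ^ 0xA5A5A5A5u;  /* toy checksum */
}

/* Returns 1 if NVRAM held valid settings, 0 if the battery failed or the
   contents were never initialized. NVRAM reads like ordinary SRAM. */
int settings_load(struct settings *out)
{
    *out = *(const struct settings *)NVRAM_BASE;
    return out->checksum == settings_checksum(out);
}

void settings_store(const struct settings *in)
{
    struct settings *nv = (struct settings *)NVRAM_BASE;
    *nv = *in;
    nv->checksum = settings_checksum(in);
}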
Table 1 summarizes the features of each type of memory discussed here, but keep in
mind that different memory types serve different purposes. Each memory type has its
strengths and weaknesses. Side-by-side comparisons are not always effective.
Type        Volatile?  Writeable?                      Erase Size   Max Erase Cycles             Cost (per Byte)             Speed
SRAM        Yes        Yes                             Byte         Unlimited                    Expensive                   Fast
DRAM        Yes        Yes                             Byte         Unlimited                    Moderate                    Moderate
Masked ROM  No         No                              n/a          n/a                          Inexpensive                 Fast
PROM        No         Once, with a device programmer  n/a          n/a                          Moderate                    Fast
EPROM       No         Yes, with a device programmer   Entire chip  Limited (consult datasheet)  Moderate                    Fast
EEPROM      No         Yes                             Byte         Limited (consult datasheet)  Expensive                   Fast to read, slow to erase/write
Flash       No         Yes                             Sector       Limited (consult datasheet)  Moderate                    Fast to read, slow to erase/write
NVRAM       No         Yes                             Byte         Unlimited                    Expensive (SRAM + battery)  Fast
Table 1. Characteristics of the various memory types
Computer storage (memory), coupled with a central processing unit (CPU), implements the basic von Neumann computer model used since the 1940s.
In contemporary usage, memory usually refers to a form of solid state storage known
as random access memory (RAM) and sometimes other forms of fast but temporary
storage. Similarly, storage more commonly refers to mass storage - optical discs,
forms of magnetic storage like hard disks, and other types of storage which are slower
than RAM, but of a more permanent nature. These contemporary distinctions are
helpful, because they are also fundamental to the architecture of computers in general.
They also reflect an important technical difference between memory
and mass storage devices, which has been blurred by the historical usage of the terms
"main storage" (and sometimes "primary storage") for random access memory, and
"secondary storage" for mass storage devices. This is explained in the following
sections, in which the traditional "storage" terms are used as sub-headings for
convenience.
Purposes of storage
The fundamental components of a general-purpose computer are the arithmetic and logic unit, control circuitry, storage space, and input/output devices. If storage were removed, the device would be a simple digital signal processing device (e.g. a calculator or a media player) instead of a computer. The ability to store instructions that form a
computer program, and the information that the instructions manipulate is what makes
stored program architecture computers versatile.
A digital computer represents information using the binary numeral system. Text,
numbers, pictures, audio, and nearly any other form of information can be converted
into a string of bits, or binary digits, each of which has a value of 1 or 0. The most
common unit of storage is the byte, equal to 8 bits. A piece of information can be
manipulated by any computer whose storage space is large enough to accommodate
the corresponding data, or the binary representation of the piece of information. For
example, a computer with a storage space of eight million bits, or one megabyte, could
be used to edit a small novel.
Various forms of storage, based on various natural phenomena, have been invented. So far, no practical universal storage medium exists, and all forms of storage have some drawbacks. Therefore a computer system usually contains several kinds of storage, each with an individual purpose, as shown in the diagram.
[Figure: Various forms of storage, divided according to their distance from the central processing unit, with the common technology and capacity found in home computers of 2005.]
Primary storage
Primary storage is directly connected to the central processing unit of the computer. It must be present for the CPU to function correctly, just as in a biological analogy the lungs must be present (for oxygen storage) for the heart to function (to pump and oxygenate the blood). As shown in the diagram, primary storage typically consists of three kinds of storage:
· Processor registers are located inside the processor; they are the fastest of all forms of computer storage, but hold only a small amount of data at a time.
· Cache memory is a special type of internal memory used by many central processing units to increase their performance or "throughput". Some of the information in the main memory is duplicated in the cache memory, which is slightly slower but of much greater capacity than the processor registers, and faster but much smaller than main memory. Multi-level cache memory is also commonly used: "primary cache" being smallest, fastest and closest to the processing device, and "secondary cache" being larger and slower, but still faster and much smaller than main memory. (A short sketch of the practical effect of caching follows this list.)
· Main memory contains the programs that are currently being run and the data the
programs are operating on. In modern computers, the main memory is the
electronic solid-state random access memory. It is directly connected to the CPU
via a "memory bus" (shown in the diagram) and a "data bus". The arithmetic and
logic unit can very quickly transfer information between a processor register and
locations in main storage, also known as "memory addresses". The memory bus
is also called an address bus or front side bus and both busses are high-speed
digital "superhighways". Access methods and speed are two of the fundamental
technical differences between memory and mass storage devices. (Note that all
memory sizes and storage capacities shown in the diagram will inevitably be
exceeded with advances in technology over time.)
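The practical effect of this hierarchy is visible from ordinary software: scanning an array in the order it is laid out in main memory reuses each fetched cache line, while striding across it defeats the cache. A minimal C sketch (array size and timing method chosen arbitrarily for illustration):

#include <stdio.h>
#include <time.h>

#define N 2048
static int a[N][N];   /* 16 MB: larger than the caches, smaller than main memory */

int main(void)
{
    long sum = 0;
    int i, j;
    clock_t t0;

    t0 = clock();
    for (i = 0; i < N; i++)       /* row-major order: consecutive addresses,   */
        for (j = 0; j < N; j++)   /* so each fetched cache line is fully used  */
            sum += a[i][j];
    printf("row-major:    %.3f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

    t0 = clock();
    for (j = 0; j < N; j++)       /* column-major order: each access jumps     */
        for (i = 0; i < N; i++)   /* N*4 bytes, landing in a new cache line    */
            sum += a[i][j];
    printf("column-major: %.3f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

    return (int)(sum & 1);   /* use sum so the loops are not optimized away */
}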
Off-line storage
Off-line storage is a system where the storage medium can be easily removed from
the storage device. Off-line storage is used for data transfer and archival purposes. In
modern computers, compact discs, DVDs, memory cards, flash memory devices
including "USB drives", floppy disks, Zip disks and magnetic tapes are commonly used
for off-line mass storage purposes. "Hot-pluggable" USB hard disks are also available.
Off-line storage devices used in the past include punched cards, microforms, and
removable Winchester disk drums.
Network storage
Network storage is any type of computer storage that involves accessing information
over a computer network. Network storage arguably allows information management to be centralized in an organization and reduces the duplication of information. Network storage includes:
· Network-attached storage (NAS), storage attached to a computer that other computers can access at the file level over a local area network
· Storage area networks (SANs), dedicated networks that provide other computers with block-level access to consolidated storage
Confusingly, these terms are sometimes used differently. Primary storage can be
used to refer to local random-access disk storage, which should properly be called
secondary storage. If this type of storage is called primary storage, then the term
secondary storage would refer to offline, sequential-access storage like tape media.
Characteristics of storage
The division into primary, secondary, tertiary and off-line storage is based on the memory
hierarchy, or distance from the central processing unit. There are also other ways to
characterize various types of storage.
Volatility of information
Volatile memory requires constant power to maintain the stored information. Volatile
memory is typically used only for primary storage. (Primary storage is not necessarily
volatile, even though today's most cost-effective primary storage technologies are.
Non-volatile technologies have been widely used for primary storage in the past and may
again be in the future.)
Non-volatile memory will retain the stored information even if it is not constantly
supplied with electric power. It is suitable for long-term storage of information, and
therefore used for secondary, tertiary, and off-line storage.
Dynamic memory is volatile memory that also requires the stored information to be periodically refreshed, or read and rewritten without modification.
· Read only storage retains the information stored at the time of manufacture, and
write once storage (WORM) allows the information to be written only once at
some point after manufacture. These are called immutable storage. Immutable
storage is used for tertiary and off-line storage. Examples include CD-R.
· Slow write, fast read storage is read/write storage which allows information to
be overwritten multiple times, but with the write operation being much slower than
the read operation. Examples include CD-RW.
Addressability of information
· In location-addressable storage, each individually accessible unit of information
in storage is selected with its numerical memory address. In modern computers,
location-addressable storage is usually limited to primary storage, accessed internally
by computer programs, since location-addressability is very efficient, but
burdensome for humans.
· In file system storage, information is divided into files of variable length, and a
particular file is selected with human-readable directory and file names. The
underlying device is still location-addressable, but the operating system of a
computer provides the file system abstraction to make the operation more
understandable. In modern computers, secondary, tertiary and off-line storage use
file systems.
· Latency is the time it takes to access a particular location in storage. The relevant
unit of measurement is typically nanoseconds for primary storage, milliseconds for secondary storage, and seconds for tertiary storage. It may make sense to separate read latency and write latency, and, in the case of sequential access storage, minimum, maximum and average latency.
· Throughput is the rate at which information can be read from or written to the storage. In computer storage, throughput is usually expressed in terms of
megabytes per second or MB/s, though bit rate may also be used. As with latency,
read rate and write rate may need to be differentiated.
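Read throughput of secondary storage can be estimated with a few lines of C: read a large file sequentially and divide the bytes moved by the elapsed wall-clock time. The file name is a placeholder, the one-second timer resolution is coarse, and operating-system caching will inflate the figure on a second run:

#include <stdio.h>
#include <time.h>

int main(void)
{
    static char buf[1 << 16];                /* 64 KB read buffer     */
    FILE *f = fopen("testfile.bin", "rb");   /* placeholder file name */
    size_t n, total = 0;
    time_t t0 = time(NULL);
    double secs;

    if (!f) { perror("fopen"); return 1; }
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += n;                          /* count the bytes moved */
    fclose(f);

    secs = difftime(time(NULL), t0);
    if (secs < 1.0) secs = 1.0;              /* guard the coarse timer */
    printf("%lu bytes in %.0f s = %.1f MB/s\n",
           (unsigned long)total, secs, (double)total / (secs * 1e6));
    return 0;
}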
Technologies, devices and media
Magnetic storage
Magnetic storage uses different patterns of magnetization on a magnetically coated
surface to store information. Magnetic storage is non-volatile. The information is
accessed using one or more read/write heads. Since the read/write head only covers a
part of the surface, magnetic storage is sequential access and must seek, cycle or both.
In modern computers, the magnetic surface takes these forms:
· Magnetic disk: the floppy disk, used for off-line storage, and the hard disk, used for secondary storage
· Magnetic tape, used today for tertiary and off-line storage
In early computers, magnetic storage was also used for primary storage, in the form of magnetic drums, core memory, core rope memory, thin film memory, twistor memory or bubble memory. Also unlike today, magnetic tape was often used for secondary storage.
Semiconductor storage
Semiconductor memory uses semiconductor-based integrated circuits to store
information. A semiconductor memory chip may contain millions of tiny transistors or
capacitors. Both volatile and non-volatile forms of semiconductor memory exist. In
modern computers, primary storage almost exclusively consists of dynamic volatile
semiconductor memory or dynamic random access memory. Since the turn of the
century, a type of non-volatile semiconductor memory known as flash memory has
steadily gained share as off-line storage for home computers. Non-volatile
semiconductor memory is also used for secondary storage in various advanced
electronic devices and specialized computers.
Optical storage
Optical disc storage uses a laser to read and write information on the surface of a disc:
· CD, CD-ROM, DVD: Read only storage, used for mass distribution of digital information (music, video, computer programs)
· CD-R, DVD-R, DVD+R: Write once storage, used for tertiary and off-line storage
· CD-RW, DVD-RW, DVD+RW, DVD-RAM: Slow write, fast read storage, used for
tertiary and off-line storage
Newer optical disc formats include:
· Blu-ray
· HD DVD
· HVD
· Phase-change Dual
Current generations of UDO (Ultra Density Optical) store up to 30 GB, but 60 GB and 120 GB versions of UDO
are in development and are expected to arrive sometime in 2007 and beyond, though
up to 500 GB has been speculated as a possibility for UDO. [1]
The Williams tube used a cathode ray tube, and the Selectron tube used a large vacuum tube, to store information. These primary storage devices were short-lived in the market, since the Williams tube was unreliable and the Selectron tube was expensive.
Delay line memory used sound waves in a substance such as mercury to store
information. Delay line memory was dynamic volatile, cycle sequential read/write
storage, and was used for primary storage.
Molecular memory stores information in polymers that can store electric charge.
Molecular memory might be especially suited for primary storage.
LESSON V
A computer program is a collection of instructions that describe a task, or set of
tasks, to be carried out by a computer.
The term computer program may refer to source code, written in a programming
language, or to the executable form of this code. Computer programs are also known as
software, applications programs, system software or simply programs.
The source code of most computer programs consists of a list of instructions that
explicitly implement an algorithm (known as an imperative programming style); in
another form (known as declarative programming) the characteristics of the required
information are specified and the method used to obtain the results, if any, is left to the
platform.
Computer programs are often written by people known as computer programmers, but
may also be generated by other programs.
Terminology
Commercial computer programs aimed at end-users are commonly referred to as
application software by the computer industry, as these programs are focused on the
functionality of what the computer is being used for (its application), as opposed to
being focused on system-level functionality (for example, as the Windows operating
system software is). In practice, colloquially, both application software and system
software may correctly be referred to as programs, as may be the more esoteric
firmware—software firmly built into an embedded system. Programs that execute on the hardware are a set of instructions in a format understandable by the instruction set of the computer's main processor; each instruction causes other instructions to execute or performs a simple computation such as an addition. A computer processes millions of such instructions per second, and the program is the sequence of instructions strung together so that, when executed, they do something useful, usually repeatably and reliably.
For differences in the usage of the spellings program and programme, see American and British English spelling differences.
Program execution
A computer program is loaded into memory (usually by the operating system) and then
executed ("run"), instruction by instruction, until termination, either with success or
through software or hardware error.
Before a computer can execute any sort of program (including the operating system,
itself a program) the computer hardware must be initialized. This initialization is done in
modern PCs by a piece of software stored on programmable memory chips installed by
the manufacturer, called the BIOS. The BIOS will attempt to initialize the boot
sequence, making the computer ready for higher-level program execution.
Programming
Main article: Computer programming
A program is likely to contain a variety of data structures and a variety of different
algorithms to operate on them.
Creating a computer program is the iterative process of writing new source code or
modifying existing source code, followed by testing, analyzing and refining this code. A
person who practices this skill is referred to as a computer programmer or software
developer. The sometimes lengthy process of computer programming is now referred to as "software development" or software engineering; the latter term is becoming more popular due to the increasing maturity of the discipline. (See Debate over who is a software engineer.)
Two other modern approaches are team programming, in which each member of the group has an equal say in the development process except for one person who guides the group through discrepancies (such groups tend to be kept to around 10 people to stay manageable), and "peer programming", more commonly known as pair programming.
See Process and methodology for the different aspects of modern day computer
programming.
Trivia
The world's shortest useful program is usually agreed upon to be the utility cont/rerun
used on the old operating system CP/M. It was 2 bytes long (JMP 100), jumping to the
start position of the program that had previously been run and so restarting the
program, in memory, without loading it from the much slower disks of the 1980s.
A zero-byte entry once won a contest for the world's shortest program; the "program" was qualified as such only due to a flaw in the language of the contest rules, which were soon after modified to require the program to be greater than zero bytes.
Ada Lovelace wrote a set of notes specifying in complete detail a method for calculating
Bernoulli numbers with the Analytical Engine described by Charles Babbage. This is
recognized as the world's first computer program and she is recognized as the world's
first computer programmer by historians.
Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to certain tasks. For example, B-trees are particularly well-suited for the implementation of databases, while networks of machines rely on routing tables to function.
In the design of many types of programs, the choice of data structures is a primary
design consideration, as experience in building large systems has shown that the
difficulty of implementation and the quality and performance of the final result depends
heavily on choosing the best data structure. After the data structures are chosen, the
algorithms to be used often become relatively obvious. Sometimes things work in the
opposite direction - data structures are chosen because certain key tasks have
algorithms that work best with particular data structures. In either case, the choice of
appropriate data structures is crucial.
This insight has given rise to many formalized design methods and programming
languages in which data structures, rather than algorithms, are the key organizing
factor. Most languages feature some sort of module system, allowing data structures to
be safely reused in different applications by hiding their verified implementation details
behind controlled interfaces. Object-oriented programming languages such as C++ and
Java in particular use classes for this purpose.
Since data structures are so crucial to professional programs, many of them enjoy
extensive support in standard libraries of modern programming languages and
environments, such as C++'s Standard Template Library, the Java API, and the
Microsoft .NET Framework.
The fundamental building blocks of most data structures are arrays, records,
discriminated unions, and references. For example, the nullable reference, a reference
which can be null, is a combination of references and discriminated unions, and the
simplest linked data structure, the linked list, is built from records and nullable
references.
Commonly used data structures built from these elements include the following (a short linked-list sketch in C follows the list):
· queues
· linked lists
· trees
· graphs
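To illustrate the last point above, here is a minimal C linked list built from exactly those two ingredients, a record and a nullable reference:

#include <stdio.h>
#include <stdlib.h>

/* A record plus a nullable reference: the building blocks of a linked list. */
struct node {
    int value;
    struct node *next;   /* NULL marks the end of the list */
};

/* Push a new node onto the front of the list. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (!n) return head;   /* allocation failed; leave the list unchanged */
    n->value = value;
    n->next = head;
    return n;
}

int main(void)
{
    struct node *head = NULL, *p;
    int i;
    for (i = 1; i <= 3; i++)
        head = push(head, i);
    for (p = head; p != NULL; p = p->next)   /* walk until the null reference */
        printf("%d ", p->value);             /* prints: 3 2 1 */
    printf("\n");
    return 0;
}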
The arithmetic logic unit (ALU) is a digital circuit that performs arithmetic operations (such as addition and subtraction) and logic operations (such as exclusive OR) between two numbers. The ALU is a fundamental building block of the central
processing unit of a computer.
Many types of electronic circuits need to perform some type of arithmetic operation; even the circuit inside a digital watch has a tiny ALU that keeps adding 1 to the current time and keeps checking whether it should beep the timer.
By far, the most complex electronic circuits are those that are built inside the chip of
modern microprocessors like the Pentium. Therefore, these processors have inside them
a powerful and very complex ALU. In fact, a modern microprocessor (or mainframe)
may have multiple cores, each core with multiple execution units, each with multiple
ALUs.
Many other circuits may contain ALUs inside: GPUs like the ones in NVidia and ATI
graphic cards, FPUs like the old 80387 co-processor, and digital signal processors like the
ones found in Sound Blaster sound cards, CD players and High-Definition TVs. All of
these have several powerful and complex ALUs inside.
[Figure: A typical schematic symbol for an ALU. A and B are operands; R is the output; F is the input from the control unit; D is an output status.]
Von Neumann stated that an ALU is a necessity for a computer because it is guaranteed
that a computer will have to compute basic mathematical operations, including addition,
subtraction, multiplication, and division.[1] He therefore believed it was "reasonable
that [the computer] should contain specialized organs for these operations."[2]
Numerical Systems
An ALU must process numbers using the same format as the rest of the digital circuit.
For modern processors, that almost always is the two's complement binary number
representation. Early computers used a wide variety of number systems, including one's
complement, sign-magnitude format, and even true decimal systems, with ten tubes
per digit.
ALUs for each of these numeric systems had different designs, and that influenced the current preference for two's complement, the representation that makes it easiest for ALUs to calculate additions and subtractions.
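A short C illustration of why two's complement is convenient: negation is "invert the bits and add one", and the same unsigned adder then produces correct signed results (this sketch assumes the usual 32-bit integers):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t a = 5;
    int32_t neg = ~a + 1;                 /* invert and add one: -5        */
    uint32_t u7 = 7, um5 = (uint32_t)neg; /* -5 reinterpreted as unsigned  */

    printf("-5 as bits: 0x%08X\n", (uint32_t)neg);  /* prints 0xFFFFFFFB   */
    /* The adder needs no sign logic: unsigned wrap-around addition of the
       two bit patterns yields the correct signed answer, 7 + (-5) = 2.    */
    printf("7 + (-5) = %u\n", u7 + um5);            /* prints 2            */
    return 0;
}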
Practical overview
Simple Operations
Most ALUs can perform the following operations:
· Integer arithmetic operations (addition and subtraction, and sometimes multiplication and division, though at greater cost)
· Bitwise logic operations (AND, NOT, OR, XOR)
· Bit-shifting operations (shifting or rotating a word by a specified number of bits to the left or right), which can be interpreted as multiplications and divisions by a power of two
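A software model of such an ALU is just a switch over an operation code that produces a result plus status flags, much as the hardware produces condition codes. The operation names and flag set below are invented for this sketch:

#include <stdint.h>

enum alu_op { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_XOR, ALU_NOT, ALU_SHL, ALU_SHR };

struct alu_result {
    uint32_t value;
    int zero, negative, carry;   /* condition codes, as in a status register */
};

struct alu_result alu(enum alu_op op, uint32_t a, uint32_t b)
{
    struct alu_result r = { 0, 0, 0, 0 };
    uint64_t wide;

    switch (op) {
    case ALU_ADD: wide = (uint64_t)a + b;   /* keep the 33rd bit */
                  r.value = (uint32_t)wide;
                  r.carry = (int)(wide >> 32);
                  break;
    case ALU_SUB: r.value = a - b;
                  r.carry = (a < b);        /* borrow            */
                  break;
    case ALU_AND: r.value = a & b; break;
    case ALU_OR:  r.value = a | b; break;
    case ALU_XOR: r.value = a ^ b; break;
    case ALU_NOT: r.value = ~a;    break;   /* b is ignored      */
    case ALU_SHL: r.value = a << (b & 31); break;
    case ALU_SHR: r.value = a >> (b & 31); break;
    }
    r.zero = (r.value == 0);
    r.negative = (int)(r.value >> 31);      /* two's complement sign bit */
    return r;
}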
Complex Operations
An engineer can design an ALU to calculate any operation, however complicated; the problem is that the more complex the operation, the more expensive the ALU is, the more space it uses in the processor, and the more power it dissipates.
Therefore, engineers always calculate a compromise, to provide for the processor (or
other circuits) an ALU powerful enough to make the processor fast, but yet not so
complex as to become prohibitive. Imagine that you need to calculate, say the square
root of a number; the digital engineer will examine the following options to implement
this operation:
1. Design a very complex ALU that calculates the square root of any number in a
single step. This is called calculation in a single clock.
2. Design a complex ALU that calculates the square root through several steps. This is called iterative calculation, and usually relies on control from a complex control unit with built-in microcode.
3. Design a simple ALU in the processor, and sell a separate specialized and costly processor that the customer can install just beside this one, and which implements one of the options above. This is called a co-processor.
4. Emulate the existence of the co-processor, that is, whenever a program attempts
to perform the square root calculation, make the processor check if there is a
co-processor present and use it if there is one; if there isn't one, interrupt the
processing of the program and invoke the operating system to perform the square
root calculation through some software algorithm. This is called software
emulation.
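The "software algorithm" of option #4 can be as simple as Newton's method, which computes a square root using only the multiplication, division and addition the processor already has. A C sketch (the iteration count is an arbitrary safe choice, and n is assumed non-negative):

#include <stdio.h>

/* Newton's method: x' = (x + n/x) / 2 converges rapidly to sqrt(n), n >= 0. */
double soft_sqrt(double n)
{
    double x = n > 1.0 ? n : 1.0;   /* crude but safe initial guess     */
    int i;
    for (i = 0; i < 40; i++)        /* a few dozen iterations is plenty */
        x = 0.5 * (x + n / x);
    return x;
}

int main(void)
{
    printf("soft_sqrt(2) = %.15f\n", soft_sqrt(2.0));   /* ~1.414213562373095 */
    return 0;
}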
The options above go from the fastest and most expensive one to the slowest and least
expensive one. Therefore, while even the simplest computer can calculate the most complicated formula, the simplest computers will usually take a long time doing so, because several of the steps for calculating the formula will involve options #3 and #4 above.
Powerful processors like the Pentium 4 and AMD64 implement option #1 above for most of the complex operations and the slower option #2 for the extremely complex operations. This is possible because very complex ALUs can be built into these processors.
Inputs and outputs
The inputs to the ALU are the data to be operated on (called operands) and a code from
the control unit indicating which operation to perform. Its output is the result of the
computation.
In many designs the ALU also takes or generates as inputs or outputs a set of condition
codes from or to a status register. These codes are used to indicate cases such as
carry-in or carry-out, overflow, divide-by-zero, etc.[4]
Usually engineers call an ALU the circuit that performs arithmetic operations in integer
formats (like two's complement and BCD), while the circuits that calculate on more complex formats like floating point, complex numbers, etc., usually receive a more illustrious name.
In computing, input/output (I/O) refers to the communication between an information processing system (such as a computer) and the outside world.
In computer architecture, the combination of the CPU and main memory (i.e. memory
that the CPU can read and write to directly, with individual instructions) is considered
the heart of a computer, and any movement of information from or to that complex,
for example to or from a disk drive, is considered I/O. The CPU and its supporting
circuitry provide I/O methods that are used in low-level computer programming in the
implementation of device drivers.
An alternative to special primitive functions is the I/O monad, which permits programs to just describe I/O, with the actions carried out outside the program. This is notable because the I/O functions would otherwise introduce side effects to a programming language; with the I/O monad, purely functional programming becomes practical.
The control unit is the part of a CPU or other device that directs its operation. The outputs
of the unit control the activity of the rest of the device. A control unit can be thought of
as a finite state machine.
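The finite-state-machine view can be sketched directly in C: each state asserts a set of control signals and selects a successor state on every clock tick. The states and signal names below are invented for illustration:

#include <stdio.h>

enum state { FETCH, DECODE, EXECUTE, WRITEBACK, NSTATES };

/* The control signals asserted by the control unit in each state. */
struct signals { int mem_read, ir_load, alu_enable, reg_write; };

static const struct signals control[NSTATES] = {
    /* FETCH     */ { 1, 1, 0, 0 },   /* read memory, load instruction reg */
    /* DECODE    */ { 0, 0, 0, 0 },   /* internal decoding only            */
    /* EXECUTE   */ { 0, 0, 1, 0 },   /* drive the ALU                     */
    /* WRITEBACK */ { 0, 0, 0, 1 },   /* write the result register         */
};

int main(void)
{
    enum state s = FETCH;
    int tick;
    for (tick = 0; tick < 8; tick++) {
        struct signals out = control[s];
        printf("tick %d: state=%d mem_read=%d ir_load=%d alu=%d reg_write=%d\n",
               tick, s, out.mem_read, out.ir_load, out.alu_enable, out.reg_write);
        s = (enum state)((s + 1) % NSTATES);   /* fixed successor state */
    }
    return 0;
}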
At one time control units for CPUs were ad-hoc logic, and they were difficult to design.
Now they are often implemented as a microprogram that is stored in a control store.
Words of the microprogram are selected by a microsequencer and the bits from those
words directly control the different parts of the device, including the registers,
arithmetic and logic units, instruction registers, buses, and off-chip input/output. In
modern computers, each of these subsystems may have its own subsidiary controller,
with the control unit acting as a supervisor. (See also CPU design and computer
architecture.)
There are two basic types of control unit:
1. Microprogrammed control units. In a microprogrammed control unit, the control signals come from a microprogram stored in a control store, as described above.
2. Hardware control units. In a hardware control unit, a digital circuit generates the control signals directly.
The system console, root console or simply console is the text entry and display
device for system administration messages, particularly those from the BIOS or boot
loader, the kernel, from the init system and from the system logger.
On traditional minicomputers, the console was a serial console, an RS-232 serial link
to a terminal such as a DEC VT100. This terminal was usually kept in a secured room
since it could be used for certain privileged functions such as halting the system or
selecting which media to boot from. Large midrange systems, e.g. those from Sun
Microsystems, Hewlett-Packard and IBM, still use serial consoles. In larger installations,
the console ports are attached to multiplexers or network-connected multiport serial
servers that let an operator connect a terminal to any of the attached servers.
On PCs, the computer's attached keyboard and monitor have the equivalent function.
Since the monitor cable carries video signals, it cannot be extended very far. Often,
installations with many servers therefore use keyboard/video multiplexers (KVM
switches) and possibly video amplifiers to centralize console access. In recent years,
KVM/IP devices have become available that allow a remote computer to view the video
output and send keyboard input via any TCP/IP network and therefore the Internet.
Some PC BIOSes, especially in servers, also support serial consoles, giving access to the
BIOS through a serial port so that the simpler and cheaper serial console infrastructure
can be used. Even where BIOS support is lacking, some operating systems, e.g.
FreeBSD and Linux, can be configured for serial console operation either during boot up,
or after startup.
[Figure: Knoppix system console showing the boot process]
A microprogram implements a CPU instruction set. Just as a single high level language
statement is compiled to a series of machine instructions (load, store, shift, etc), in a
CPU using microcode, each machine instruction is in turn implemented by a series of
microinstructions, sometimes called a microprogram. Microprograms are often referred
to as microcode.
The elements composing a microprogram exist on a lower conceptual level than the
more familiar assembler instructions. Each element is differentiated by the "micro"
prefix to avoid confusion: microprogram, microcode, microinstruction, microassembler,
etc.
Microprograms are carefully designed and optimized for the fastest possible execution,
since a slow microprogram would yield a slow machine instruction which would in turn
cause all programs using that instruction to be slow. The microprogrammer must have
extensive low-level hardware knowledge of the computer circuitry, as the microcode
controls this. The microcode is written by the CPU engineer during the design phase.
On most computers using microcode, the microcode doesn't reside in the main system
memory, but exists in a special high speed memory, called the control store. This
memory might be read-only memory, or it might be read-write memory, in which case
the microcode would be loaded into the control store from some other storage medium
as part of the initialization of the CPU. If the microcode is read-write memory, it can be
altered to correct bugs in the instruction set, or to implement new machine instructions.
Microcode can also allow one computer microarchitecture to emulate another, usually
more-complex architecture.
A single microinstruction typically drives many parts of the CPU at once; among other things, it might, for example:
· Update the "condition codes" with the ALU status flags ("Negative", "Zero", "Overflow", and "Carry")
To simultaneously control all of these features, the microinstruction is often very wide,
for example, 56 bits or more.
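One way to picture such a wide control word is as a packed record with one field per controlled resource, plus the sequencer fields that pick the next word. The field names and widths in this C sketch are invented, not any real machine's format:

#include <stdio.h>

/* One horizontal microinstruction: a field for every controlled resource. */
struct microword {
    unsigned src_a     : 4;    /* register source A                */
    unsigned src_b     : 4;    /* register source B                */
    unsigned dest      : 4;    /* destination register             */
    unsigned alu_op    : 4;    /* ALU operation                    */
    unsigned jump_type : 2;    /* 0 = next word, 1 = jump          */
    unsigned jump_addr : 10;   /* target word in the control store */
};

/* The microsequencer: mostly a counter, with a jump escape hatch. */
static unsigned next_address(unsigned pc, struct microword mw)
{
    return mw.jump_type == 0 ? pc + 1 : mw.jump_addr;
}

int main(void)
{
    /* A two-word control store: a copy-like step, then an add that jumps back. */
    struct microword rom[2] = {
        { 1, 0, 2, 0, 0, 0 },
        { 2, 3, 2, 1, 1, 0 },
    };
    unsigned pc = 0;
    int tick;
    for (tick = 0; tick < 4; tick++) {
        printf("word %u: alu_op=%u dest=r%u\n", pc, rom[pc].alu_op, rom[pc].dest);
        pc = next_address(pc, rom[pc]);
    }
    return 0;
}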
Microprogramming also helped alleviate the memory bandwidth problem. During the
1970s, CPU speeds grew more quickly than memory speeds. Numerous acceleration
techniques such as memory block transfer, memory pre-fetch and multi-level caches
helped reduce this. However, high-level machine instructions (made possible by microcode) helped further: fewer, more complex machine instructions require less memory bandwidth. For example, complete operations on character strings could be done as a single machine instruction, thus avoiding multiple instruction fetches.
Architectures using this approach included the IBM System/360 and Digital Equipment
Corporation VAX, the instruction sets of which were implemented by complex
microprograms. The approach of using increasingly complex microcode-implemented
instruction sets was later called CISC.
Other benefits
A processor's microprograms operate on a more primitive, totally different and much
more hardware-oriented architecture than the assembly instructions visible to normal
programmers. In coordination with the hardware, the microcode implements the
programmer-visible architecture. The underlying hardware need not have a fixed
relationship to the visible architecture. This makes it possible to implement a given
instruction set architecture on a wide variety of underlying hardware
micro-architectures.
The IBM System/360 has a 32-bit architecture with 16 general-purpose registers, but
most of the System/360 implementations actually used hardware implementing a much
simpler underlying microarchitecture.
In this way, microprogramming enabled IBM to design many System/360 models with
substantially different hardware and spanning a wide range of cost and performance,
while making them all architecturally compatible. This dramatically reduced the amount
of unique system software that had to be written for each model.
A similar approach was used by Digital Equipment Corporation in their VAX family of
computers. Initially a 32-bit TTL processor in conjunction with supporting microcode
implemented the programmer-visible architecture. Later VAX versions used different microarchitectures, yet the programmer-visible architecture did not change.
Microprogramming also reduced the cost of field changes to correct defects (bugs) in
the processor; a bug could often be fixed by replacing a portion of the microprogram
rather than by changes being made to hardware logic and wiring.
History
In 1947, the design of the MIT Whirlwind introduced the concept of a control store as a
way to simplify computer design and move beyond ad hoc methods. The control store
was a two-dimensional lattice: one dimension accepted "control time pulses" from the
CPU's internal clock, and the other connected to control signals on gates and other
circuits. A "pulse distributor" would take the pulses generated by the CPU clock and
break them up into eight separate time pulses, each of which would activate a different
row of the lattice. When the row was activated, it would activate the control signals
connected to it.
Described another way, the signals transmitted by the control store are being played
much like a player piano roll. That is, they are controlled by a sequence of very wide
words constructed of bits, and they are "played" sequentially. In a control store,
however, the "song" is short and repeated continuously.
· The Digital Equipment Corporation PDP-11 processors, with the exception of the PDP-11/20, were microprogrammed (Siewiorek, Bell, and Newell 1982).
· Most models of the IBM System/360 series were microprogrammed:
· The Model 25 was unique among System/360 models in using the top 16k
bytes of core storage to hold the control storage for the microprogram. The
2025 used a 16-bit microarchitecture with seven control words (or
microinstructions).
· The Model 30, the slowest model in the line, used an 8-bit microarchitecture
with only a few hardware registers; everything that the programmer saw
was emulated by the microprogram.
· The Model 40 used 56-bit control words. The 2040 box implements both the
System/360 main processor and the multiplex channel (the I/O processor).
· The Model 50 had two internal data paths which operated in parallel: a 32-bit
data path used for arithmetic operations, and an 8-bit data path used in
some logical operations. The control store used 90-bit microinstructions.
· The Model 85 had separate instruction fetch (I-unit) and execution (E-unit) to
provide high performance. The I-unit is hardware controlled. The E-unit is
microprogrammed with 108-bit control words.
Implementation
Each microinstruction in a microprogram provides the bits which control the functional
elements that internally comprise a CPU. The advantage over a hard-wired CPU is that
internal CPU control becomes a specialized form of a computer program. Microcode thus
transforms a complex electronic design challenge (the control of a CPU) into a
less-complex programming challenge.
A microsequencer picks the next word of the control store. A sequencer is mostly a
counter, but usually also has some way to jump to a different part of the control store
depending on some data, usually data from the instruction register and always some
part of the control store. The simplest sequencer is just a register loaded from a few
bits of the control store.
A register set is a fast memory containing the data of the central processing unit. It
may include the program counter, stack pointer, and other numbers that are not easily
accessible to the application programmer. Often the register set is triple-ported, that is,
two registers can be read, and a third written at the same time.
An arithmetic and logic unit performs calculations, usually addition, logical negation, a
right shift, and logical AND. It often performs other functions, as well.
There may also be a memory address register and a memory data register, used to
access the main computer storage.
Together, these elements form an "execution unit." Most modern CPUs have several
execution units. Even simple computers usually have one unit to read and write
memory, and another to execute user code.
These elements could often be bought together as a single chip. This chip came in a fixed width that would form a 'slice' through the execution unit, so these were known as 'bit slice' chips. The AMD Am2900 is the best known example of a bit slice processor.
The parts of the execution units, and the execution units themselves are interconnected
by a bundle of wires called a bus.
After the microprogram is finalized, and extensively tested, it is sometimes used as the
input to a computer program that constructs logic to produce the same data. This
program is similar to those used to optimize a programmable logic array. No known
computer program can produce optimal logic, but even pretty good logic can vastly
reduce the number of transistors from the number required for a ROM control store.
This reduces the cost and power used by a CPU.
Horizontal microcode
A typical horizontal microprogram control word has a field, a range of bits, to control
each piece of electronics in the CPU. For example, one simple arrangement might be:
| register source A | register source B | destination register | arithmetic and logic unit
operation | type of jump | jump address |
For this type of micro machine to implement a jump instruction with the address
following the jump op-code, the micro assembly would look something like:
# Any line starting with a number-sign is a comment.
# This is just a label, the ordinary way assemblers symbolically represent a
# memory address.
Instruction JUMP:
# To prepare for the next instruction, the instruction-decode microcode has
# already moved the program counter to the memory address register. This
# instruction fetches the target address of the jump instruction from the
# memory word following the jump opcode, by copying from the memory data
# register to the memory address register. This gives the memory system two
# clock ticks to fetch the next instruction to the memory data register for
# use by the instruction decode. The sequencer instruction "next" means just
# add 1 to the control word address.
MDR, NONE, MAR, COPY, NEXT, NONE
# This places the address of the next instruction into the PC. This gives
# the memory system a clock tick to finish the fetch started on the previous
# microinstruction. The sequencer instruction is to jump to the start of the
# instruction decode.
MAR, 1, PC, ADD, JMP, Instruction Decode
# The instruction decode is not shown, because it is usually a mess, very
# particular to the exact processor being emulated. Even this example is
# simplified. Many CPUs have several ways to calculate the address, rather
# than just fetching it from the word following the op-code. Therefore,
# rather than just one jump instruction, those CPUs have a family of related
# jump instructions.
Horizontal microcode is microcode that sets all the bits of the CPU's controls on each
tick of the clock that drives the sequencer.
Note how many of the bit fields in a horizontal microinstruction are often set to do nothing on a given clock tick.
Vertical microcode
In vertical microcode, each microinstruction is encoded -- that is, the bit fields may pass
through intermediate combinatory logic which in turn generates the actual control
signals for internal CPU elements (ALU, registers, etc.) By contrast, with horizontal
microcode the bit fields themselves directly produce the control signals. Consequently, vertical microcode requires smaller instruction lengths and less storage, but requires more time to decode, resulting in a slower CPU clock.
Some vertical microcodes are just the assembly language of a simple conventional
computer that is emulating a more complex computer. This technique was popular in
the time of the PDP-8. Another form of vertical microcode has two fields:
| field select | field value |
The "field select" selects which part of the CPU will be controlled by this word of the
control store. The "field value" actually controls that part of the CPU. With this type of
microcode, a designer explicitly chooses to make a slower CPU to save money by
reducing the unused bits in the control store; however, the reduced complexity may
increase the CPU's clock frequency, which lessens the effect of an increased number of
cycles per instruction.
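The two-field scheme can be sketched in C as a tiny decoder: the field select chooses which part of the CPU the field value drives, so several short vertical words do the work of one wide horizontal word. Encodings are invented for illustration:

#include <stdio.h>

/* A vertical microinstruction: a 3-bit field select plus an 8-bit field value. */
struct vword { unsigned select : 3; unsigned value : 8; };

enum { SEL_ALU_OP, SEL_SRC_REG, SEL_DEST_REG, SEL_JUMP };

/* The decode step that horizontal microcode avoids: the field select is run
   through decoding logic before any control signal is produced. */
static void decode(struct vword w)
{
    switch (w.select) {
    case SEL_ALU_OP:   printf("ALU operation <- %u\n", w.value); break;
    case SEL_SRC_REG:  printf("source reg    <- %u\n", w.value); break;
    case SEL_DEST_REG: printf("dest reg      <- %u\n", w.value); break;
    case SEL_JUMP:     printf("jump to word  %u\n",   w.value); break;
    }
}

int main(void)
{
    /* Three short words do what one wide horizontal word would do at once. */
    struct vword prog[3] = {
        { SEL_SRC_REG, 1 }, { SEL_ALU_OP, 4 }, { SEL_DEST_REG, 2 }
    };
    int i;
    for (i = 0; i < 3; i++)
        decode(prog[i]);
    return 0;
}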
A CPU that uses microcode generally takes several clock cycles to execute a single
instruction, one clock cycle for each step in the microprogram for that instruction. Some
CISC processors include instructions that can take a very long time to execute. Such
variations in instruction length interfere with pipelining and interrupt latency.
Several observations eventually argued against complex, microcoded instruction sets:
· Analysis shows complex instructions are rarely used, hence the machine resources devoted to them are largely wasted.
· Programming has largely moved away from assembly level, so it's no longer
worthwhile to provide complex instructions for productivity reasons.
· Complex microcoded instructions requiring many, varying clock cycles are difficult
to pipeline for increased performance.
Many RISC and VLIW processors are designed to execute every instruction (as long as it
is in the cache) in a single cycle. This is very similar to the way CPUs with microcode
execute one microinstruction per cycle. VLIW processors have instructions that behave
like very wide horizontal microcode, although typically VLIW instructions do not have as
fine-grained control over hardware as microcode. RISC processors can have instructions
that look like narrow vertical microcode.
Modern implementations of CISC instruction sets such as the x86 instruction set
implement the simpler instructions in hardware rather than microcode, using microcode
only to implement the more complex instructions.
LESSON VI
Input/Output
Each row of the input-output matrix reports the monetary value of an industry's inputs
and each column represents the value of an industry's outputs. Suppose there are three
industries. Row 1 reports the value of inputs to Industry 1 from Industries 1, 2, and 3.
Rows 2 and 3 do the same for those industries. Column 1 reports the value of outputs
from Industry 1 to Industries 1, 2, and 3. Columns 2 and 3 do the same for the other
industries.
While the input-output matrix reports only the intermediate goods and services that are
exchanged among industries, row vectors on the bottom record the disposition of
finished goods and services to consumers, government, and foreign buyers. Similarly,
column vectors on the right record non-industrial inputs like labor and purchases from
foreign suppliers.
Usefulness
Input-output models are widely used in economic forecasting to predict flows between sectors. They are also used in local urban economics.
Irving Hock at the Chicago Area Transportation Study did detailed forecasting by
industry sectors using input-output techniques. At the time, Hock’s work was quite an undertaking; the only other work done at the urban level was for Stockholm, and it was not widely known. Input-output was one of the few techniques
developed at the CATS not adopted in later studies. Later studies used economic base
analysis techniques.
Key Ideas
The inimitable book by Leontief himself remains the best exposition of input-output
analysis. See bibliography.
Input-output concepts are simple. Consider the production of the i-th sector. We may isolate (1) the quantity of that production that goes to final demand, c_i, (2) total output, x_i, and (3) the flows x_ij from that industry to other industries. We may write a transactions tableau:
                Agriculture  Manufacturing  Transportation  Final demand  Total output
Agriculture          5             15              2              68            90
Manufacturing       10             20             10              40            80
Transportation      10             15              5               0            30
Labor               25             30              5               0            60

or, reading along row i,

    x_i = \sum_j x_{ij} + c_i.

Note that in the example given we have no input flows from the industries to 'Labor'. We know very little about production functions, because all we have are numbers representing transactions in a particular instance (single points on the production functions):

    Q = f(K, L, \ldots),

where Q = quantity, K = capital and L = labor. Now, defining the technical coefficients

    a_{ij} = x_{ij} / x_j

gives

    x_i = \sum_j a_{ij} x_j + c_i.

Introducing matrix notation, we can see how a solution may be obtained. Let x, c, I and A denote the total output vector, the final demand vector, the unit matrix and the input-output matrix, respectively. Then:

    x = Ax + c, so (I - A)x = c and x = (I - A)^{-1} c.
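As a worked check using only the numbers in the tableau above (no new data), the technical coefficients a_{ij} = x_{ij} / x_j come out as

    A = \begin{pmatrix} 5/90 & 15/80 & 2/30 \\ 10/90 & 20/80 & 10/30 \\ 10/90 & 15/80 & 5/30 \end{pmatrix}, \qquad
    c = \begin{pmatrix} 68 \\ 40 \\ 0 \end{pmatrix}, \qquad
    x = \begin{pmatrix} 90 \\ 80 \\ 30 \end{pmatrix},

and x = Ax + c can be verified row by row; for agriculture, (5/90)(90) + (15/80)(80) + (2/30)(30) + 68 = 5 + 15 + 2 + 68 = 90.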
There are many interesting aspects of the Leontief system, and there is an extensive
literature. There is the Hawkins-Simon Condition on producibility. There has been
interest in disaggregation to clustered inter-industry flows, and the study of
constellations of industries. A great deal of empirical work has been done to identify
coefficients, and data have been published for the national economy as well as for
regions. This has been a healthy, exciting area for work by economists because the
Leontief system can be extended to a model of general equilibrium; it offers a method
of decomposing work done at a macro level.
Walter Isard and his student, Leon Moses, were quick to see the spatial economy and
transportation implications of input-output, and began work in this area in the 1950s
developing a concept of interregional input-output. Take a one region versus the world
case. We wish to know something about interregional commodity flows, so introduce a
column into the table headed “exports” and we introduce an “input” row for imports.
A more satisfactory way to proceed would be to tie regions together at the industry
level. That is, we identify both within region inter-industry transactions and among
region inter-industry transactions. A not-so-small problem here is that the table gets
very large very quickly.
[Table residue: an interregional transactions table with regions North, East and West, each divided into Mfg and Ag sectors.]
As we see from the use of the economic base study, urban transportation planning studies are demand-driven. The question we want to answer is, “What transportation
need results from some economic development: what’s the feedback from development
to transportation?” For that question, input-output is helpful. That’s the question Hock
posed. There is an increase in the final demand vector, changed inter-industry relations
result, and there is an impact on transportation requirements.
Rappoport et al. (1979) started with consumption projections. These drove solutions of
a national I-O model for projections of GNP and transportation requirements as per the
transportation vector in the I-O matrix. Submodels were then used to investigate modal
split and energy consumption in the transportation sector.
Another question asked is: What is the impact of the transportation construction activity
on an area? One of the first studies made of the impact of the interstate highway
system used the national I/O model to forecast impacts measured in increased steel
production, cement, employment, etc.
The Maritime Administration (MARAD) has produced the Port Impact Kit for a number of
years. This software illustrates the use of I/O models. Simply written, it makes the
technique widely available. It shows how to calculate direct effects from the initial round
of spending that’s worked out by the vessel/cargo combinations. The direct
expenditures are entered into the I/O table, and indirect effects are calculated. These
are the inter-industry-relations derived activities from the purchases of supplies, labor, etc. An I/O table is supplied to aid that calculation. Then, using the
I/O table, induced effects are calculated. These are effects from household purchases of
goods and services made possible from the wages generated from direct and indirect
effects. The Corps of Engineers has a similar capability that has been used to examine
the impacts of construction or base closing. The US Department of Commerce Bureau of
Economic Analysis (BEA) (1997) model discusses how to use their state level I/O
models (RIMS II). The ready availability of BEA and MARAD-like tables and calculation
tools suggests that we will see more and more feedback impact analysis. The information is
meaningful for many purposes.
Feed forward calculations seem to be much more interesting for planning. The question
is, “If an investment is made in transportation, what will be its development effects?”
An investment in transportation might lower transport costs, increase quality of service,
or a mixture of these. What would be the effect on trade flows, output, earnings, etc.?
The first problem we know of worked on from this point of view was in Japan in the
1950s. The situation was the building of a bridge to connect two islands, and the core
question was of the mixing of the two island economies.
A first consideration is the impact of changed transportation attributes, say, lower cost,
on industry location, and/or agricultural or other resource-based extractive activity,
and/or on markets. A spatial price equilibrium model (linear programming) is the tool of
choice for that. Input-output then permits tracing changed inter-industry relations,
impacts on wages, etc.
Britton Harris (1974) uses that analysis strategy. He begins with industry location forecasting equations; treats equilibrium of locations, markets, and prices; and pays
much attention to transport costs. An interesting thing about this and other models is
that input-output considerations are no more than an accounting add-on; they hardly
enter Harris’ study. The interesting problems are the location and flow problems.
I/O devices
This topic discusses the different types of I/O devices used on your managed system,
and how the I/O devices are added to logical partitions.
I/O devices allow your managed system to gather, store, and transmit data. I/O devices
are found in the server unit itself and in expansion units and towers that are attached to
the server. I/O devices can be embedded into the unit, or they can be installed into
physical slots.
Not all types of I/O devices are supported for all operating systems or on all server
models. For example, I/O processors (IOPs) are supported only on i5/OS® logical
partitions. Also, Switch Network Interface (SNI) adapters are supported only on certain
server models, and are not supported for i5/OS logical partitions.
I/O pools for i5/OS logical partitions
This information discusses how I/O pools must be used to switch I/O adapters (IOAs)
between i5/OS® logical partitions that support switchable independent auxiliary storage
pools (IASPs).
An I/O pool is a group of I/O adapters that form an IASP. Other names for IASPs
include I/O failover pool and switchable independent disk pool. The IASP can be
switched from a failed server to a backup server within the same cluster without the
active intervention of the HMC. The I/O adapters within the IASP can be used by only
one logical partition at a time, but any of the other logical partitions in the group can
take over and use the I/O adapters within the IASP. The current owning partition must
power off the adapters before another partition can take ownership.
IASPs are not suitable for sharing I/O devices between different logical partitions. If you
want to share an I/O device between different logical partitions, use the HMC to move
the I/O device dynamically between the logical partitions.
IOPs for i5/OS logical partitions
This information discusses the purpose of IOPs and how you can switch IOPs and IOAs
dynamically between i5/OS® logical partitions.
i5/OS logical partitions require an I/O processor (IOP) attached to the system I/O bus,
together with one or more I/O adapters (IOAs). The IOP processes instructions from the
server and works with the IOAs to control the I/O devices. The combined-function IOP
(CFIOP) can connect to a variety of different IOAs. For instance, a CFIOP could support
disk units, a console, and communications hardware.
Note: A server with i5/OS logical partitions must have the correct IOP feature codes for
the load source disk unit and alternate restart devices. Without the correct hardware, the
logical partitions will not function correctly.
A logical partition controls all devices connected to an IOP. You cannot switch one I/O
device to another logical partition without moving the ownership of the IOP. Any
resources (IOAs and devices) that are attached to the IOP cannot be in use when you
move an IOP from one logical partition to another.
IOAs for i5/OS logical partitions
This information discusses some of the types of IOAs that are used to control devices in
i5/OS® logical partitions and the placement rules that you must follow when installing
these devices in your servers and expansion units.
Load source for i5/OS logical partitions
This topic discusses the purpose of a load source for i5/OS® logical partitions and the
placement rules that you must follow when installing the load source.
Each i5/OS logical partition must have one disk unit designated as the load source. The
server uses the load source to start the logical partition. The server always identifies
this disk unit as unit number 1.
You must follow placement rules when placing a load source disk unit in your managed
system. Before adding a load source to your managed system or moving a load source
within your managed system, validate the revised system hardware configuration with
the System Planning Tool (SPT), back up the data on the disks attached to the IOA, and
move the hardware according to the SPT output.
Alternate restart device and removable media devices for i5/OS logical
partitions
This topic discusses the purpose of tape and optical devices in i5/OS® logical partitions
and the placement rules that you must follow when installing these devices.
A removable media device reads and writes to media (tape, CD-ROM, or DVD). Every
i5/OS logical partition must have either a tape or an optical device (CD-ROM or DVD)
available to use. The server uses the tape or optical devices as the alternate restart
device and alternate installation device. The media in the device is what the system
uses to start from when you perform a D-mode initial program load (IPL). The alternate
restart device loads the Licensed Internal Code contained on the removable media
instead of the code on the load source disk unit. It can also be used to install the
system.
Depending on your hardware setup, you might decide that your logical partitions will
share these devices. If you decide to share these devices, remember that only one
logical partition can use the device at any time. To switch devices between logical
partitions, you must move the IOP controlling the shared device to the desired logical
partition.
Disk units for i5/OS logical partitions
Disk units store data for i5/OS™ logical partitions. You can configure disk units into
auxiliary storage pools (ASPs).
Disk units store data for i5/OS logical partitions. The server can use and reuse this data
at any time. This method of storing data is more permanent than memory (RAM);
however, you can still erase any data on a disk unit.
Disk units can be configured into auxiliary storage pools (ASPs) on any logical partition.
All of the disk units you assign to an ASP must be from the same logical partition. You
cannot create a cross-partition ASP.
Overview
Hardware interrupts were introduced as a way to avoid wasting the processor's valuable
time in polling loops, waiting for external events.
Interrupts can be categorized into the following types: software interrupt, maskable
interrupt, non-maskable interrupt (NMI), interprocessor interrupt (IPI), and spurious
interrupt.
An interrupt is said to be precise if it leaves the machine in a well-defined state:
- All instructions before the one pointed to by the PC have fully executed.
- No instruction beyond the one pointed to by the PC has been executed. (There is no
prohibition on starting instructions beyond the one in the PC, it is just that any changes
they make to registers or memory must be undone before the interrupt happens.)
Processors typically have an internal interrupt mask which allows software to ignore all
external hardware interrupts while it is set. This mask may offer faster access than
accessing an IMR in a PIC, or disabling interrupts in the device itself. In some cases,
such as the x86 architecture, disabling and enabling interrupts on the processor itself
acts as a memory barrier, in which case it may actually be slower.
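As a rough illustration of how software uses such a mask, here is a minimal C sketch. The
helpers irq_save() and irq_restore() are hypothetical stand-ins for the processor-specific
disable/enable instructions discussed above; this toy version models the mask with a flag.

static volatile int irq_mask_set;               /* models the CPU's internal interrupt mask */

static unsigned irq_save(void)                  /* "disable interrupts", return old state */
{
    unsigned was_masked = irq_mask_set;
    irq_mask_set = 1;
    return was_masked;
}

static void irq_restore(unsigned was_masked)    /* put the previous mask state back */
{
    irq_mask_set = was_masked;
}

static volatile long shared_counter;            /* also updated by an interrupt handler */

void increment_safely(void)
{
    unsigned flags = irq_save();                /* handlers are now held off */
    shared_counter++;                           /* short critical section */
    irq_restore(flags);                         /* re-enable only if previously enabled */
}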
Level-triggered
A level-triggered interrupt is a class of interrupts where the presence of an
unserviced interrupt is indicated by a high level (1), or low level (0), of the interrupt
request line. A device wishing to signal an interrupt drives the line to its active level,
and then holds it at that level until serviced. It ceases asserting the line when the CPU
commands it to or otherwise handles the condition that caused it to signal the interrupt.
Typically, the processor samples the interrupt input at predefined times during each bus
cycle such as state T2 for the Z80 microprocessor. If the interrupt isn't active when the
processor samples it, the CPU doesn't see it. One possible use for this type of interrupt
is to minimize spurious signals from a noisy interrupt line: a spurious pulse will often be
so short that it is not noticed.
Multiple devices may share a level-triggered interrupt line if they are designed to. The
interrupt line must have a pull-down or pull-up resistor so that when not actively driven
it settles to its inactive state. Devices actively assert the line to indicate an outstanding
interrupt, but let the line float (do not actively drive it) when not signaling an interrupt.
The line is then in its asserted state when any (one or more than one) of the sharing
devices is signaling an outstanding interrupt.
This class of interrupts is favored by some because of a convenient behavior when the
line is shared. Upon detecting assertion of the interrupt line, the CPU must search
through the devices sharing it until one requiring service is detected. After servicing this
one, the CPU may recheck the interrupt line status to determine whether any other
devices also need service. If the line is now disserted then the CPU avoids the need to
check all the remaining devices on the line. Where some devices interrupt much more
than others, or where some devices are particularly expensive to check for interrupt
status, a careful ordering of device checks brings some efficiency gain.
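A minimal C sketch of that servicing loop, with each device's status modeled as a simple
array entry and the wired-OR interrupt line derived from it (the names and the toy model
are ours, not any particular kernel's API):

#include <stdbool.h>

#define NDEVICES 4

static bool pending[NDEVICES];          /* stands in for each device's status register */

static bool irq_line_asserted(void)     /* wired-OR: active while any device waits */
{
    for (int i = 0; i < NDEVICES; i++)
        if (pending[i])
            return true;
    return false;
}

void shared_level_irq(void)
{
    while (irq_line_asserted()) {
        for (int dev = 0; dev < NDEVICES; dev++) {
            if (pending[dev]) {
                pending[dev] = false;   /* "service" the device */
                break;                  /* recheck the line; if it has deasserted,
                                           the remaining devices need not be checked */
            }
        }
    }
}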
There are also serious problems with sharing level-triggered interrupts. As long as any
device on the line has an outstanding request for service the line remains asserted, so it
is not possible to detect a change in the status of any other device. Deferring servicing
a low-priority device is not an option, because this would prevent detection of service
requests from higher-priority devices. If there is a device on the line that the CPU does
not know how to service, then any interrupt from that device permanently blocks all
interrupts from the other devices.
The original PCI standard mandated shareable level-triggered interrupts. The rationale
for this was the efficiency gain discussed above. (Newer versions of PCI allow, and PCI
Express requires, the use of message-signaled interrupts.)
Edge-triggered
An edge-triggered interrupt is a class of interrupts that are signaled by a level
transition on the interrupt line, either a falling edge (1 to 0) or (usually) a rising edge (0
to 1). A device wishing to signal an interrupt drives a pulse onto the line and then
returns the line to its quiescent state. If the pulse is too short to be detected by polled
I/O, special hardware may be required to detect the edge.
Multiple devices may share an edge-triggered interrupt line if they are designed to. The
interrupt line must have a pull-down or pull-up resistor so that when not actively driven
it settles to one particular state. Devices signal an interrupt by briefly driving the line to
its non-default state, and let the line float (do not actively drive it) when not signaling
an interrupt. The line then carries all the pulses generated by all the devices. However,
interrupt pulses from different devices may merge if they occur close in time. To avoid
losing interrupts the CPU must trigger on the trailing edge of the pulse (e.g., the rising
edge if the line is pulled up and driven low). After detecting an interrupt the CPU must
check all the devices for service requirements.
The elderly ISA bus uses edge-triggered interrupts, but does not mandate that devices
be able to share them. The parallel port also uses edge-triggered interrupts. Many older
devices assume that they have exclusive use of their interrupt line, making it electrically
unsafe to share them. However, ISA motherboards include pull-up resistors on the IRQ
lines, so well-behaved devices share ISA interrupts just fine.
Hybrid
Some systems use a hybrid of level-triggered and edge-triggered signaling. The
hardware not only looks for an edge, but it also verifies that the interrupt signal stays
active for a certain period of time. A common hybrid interrupt is the NMI (non-maskable
interrupt) input. Because NMIs generally signal major, or even catastrophic, system
events, a good implementation of this signal tries to ensure that the interrupt is valid by
verifying that it remains active for a period of time. This two-step approach helps to
eliminate false interrupts from affecting the system.
Message-signalled
A message-signalled interrupt does not use a physical interrupt line. Instead, a device
signals its request for service by sending a short message over some communications
medium, typically a computer bus. The message might be of a type reserved for
interrupts, or it might be of some pre-existing type such as a memory write.
Message-signalled interrupt vectors can be shared, to the extent that the underlying
communication medium can be shared. No additional effort is required.
Because the identity of the interrupt is indicated by a pattern of data bits, not requiring
a separate physical conductor, many more distinct interrupts can be efficiently handled.
This reduces the need for sharing. Interrupt messages can also be passed over a serial
bus, not requiring any additional lines.
Some devices with a poorly designed programming interface have no way to determine
whether they have requested service, and thus may misbehave if serviced when they do
not want it. Such devices cannot tolerate spurious
interrupts, and so also cannot tolerate sharing an interrupt line. ISA cards, due to often
cheap design and construction, are notorious for this problem. Such devices are
becoming much rarer, as hardware logic becomes cheaper and new system
architectures mandate shareable interrupts.
Typical uses
Typical interrupt uses include the following: system timers, disk I/O, power-off signals,
and traps. Other interrupts exist to transfer data bytes using UARTs or Ethernet; sense
key-presses; control motors; or anything else the equipment must do.
A classic system timer generates interrupts periodically from a counter or the power line.
The interrupt handler counts the interrupts to keep time. The timer interrupt may also
be used by the OS's task scheduler to reschedule the priorities of running processes.
Counters are popular, but some older computers used the power line frequency instead,
because power companies in most Western countries control the power-line frequency
with an atomic clock.
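A minimal sketch of such a tick-counting handler in C, assuming a 100 Hz timer (the rate
and all names here are illustrative):

#define TICKS_PER_SECOND 100

static volatile unsigned long ticks;        /* incremented once per timer interrupt */

void timer_interrupt_handler(void)
{
    ticks++;
    /* an OS would typically also invoke its scheduler here from time to time */
}

unsigned long uptime_seconds(void)
{
    return ticks / TICKS_PER_SECOND;        /* 100 ticks equal one second at 100 Hz */
}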
A disk interrupt signals the completion of a data transfer from or to the disk peripheral.
A process waiting to read or write a file starts up again.
Interrupts are also used in type ahead features for buffering events like keystrokes.
Direct memory access (DMA)
Without DMA, using programmed input/output (PIO) mode, the CPU typically has to be
occupied for the entire time it's performing a transfer. With DMA, the CPU would initiate
the transfer, do other operations while the transfer is in progress, and receive an
interrupt from the DMA controller once the operation has been done. This is especially
useful in real-time computing applications where not stalling behind concurrent
operations is critical.
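The contrast can be sketched in C as follows; dma_start() is a hypothetical routine
standing in for whatever controller programming the platform actually requires:

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

/* Programmed I/O: the CPU itself copies every word, and is busy throughout. */
void pio_read(volatile uint32_t *data_port, uint32_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = *data_port;
}

/* DMA: the CPU only starts the transfer and is interrupted when it is done. */
static volatile bool dma_done;

void dma_completion_handler(void)       /* called from the DMA-completion interrupt */
{
    dma_done = true;
}

void dma_read(uint32_t *buf, size_t n)
{
    dma_done = false;
    /* dma_start(buf, n);  hypothetical: program address, count, and direction */
    (void)buf; (void)n;                 /* consumed by the real controller setup */
    while (!dma_done) {
        /* the CPU is free here to run other work until the interrupt arrives */
    }
}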
Principle
DMA is an essential feature of all modern computers, as it allows devices to transfer
data without subjecting the CPU to a heavy overhead. Otherwise, the CPU would have
to copy each piece of data from the source to the destination. This is typically slower
than copying normal blocks of memory since access to I/O devices over a peripheral
bus is generally slower than normal system RAM. During this time the CPU would be
unavailable for any other tasks involving CPU bus access, although it could continue
doing any work which did not require bus access.
A DMA transfer essentially copies a block of memory from one device to another. While
the CPU initiates the transfer, it does not execute it. For so-called "third party" DMA, as
is normally used with the ISA bus, the transfer is performed by a DMA controller which
is typically part of the motherboard chipset. More advanced bus designs such as PCI
typically use bus mastering DMA, where the device takes control of the bus and
performs the transfer itself.
A typical usage of DMA is copying a block of memory from system RAM to or from a
buffer on the device. Such an operation does not stall the processor, which as a result
can be scheduled to perform other tasks. DMA transfers are essential to high
performance embedded systems. It is also essential in providing so-called zero-copy
implementations of peripheral device drivers as well as functionalities such as network
packet routing, audio playback and streaming video.
DMA engines
In addition to hardware interaction, DMA can also be used to offload expensive memory
operations, such as large copies or scatter-gather operations, from the CPU to a
dedicated DMA engine. While normal memory copies are typically too small to be
worthwhile to offload on today's desktop computers, they are frequently offloaded on
embedded devices due to more limited resources.[1]
Newer Intel Xeon processors also include a DMA engine technology called I/OAT, meant
to improve network performance on high-throughput network interfaces, such as
gigabit Ethernet, in particular.[2] However, benchmarks with this approach on Linux
indicate no more than 10% improvement in CPU utilization.[3]
Examples
ISA
For example, a PC's ISA DMA controller has 8 DMA channels, of which 7 are available
for use by devices. Each DMA channel has associated with it a 16-bit address
register and a 16-bit count register. To initiate a data transfer the device driver sets up
the DMA channel's address and count registers together with the direction of the data
transfer, read or write. It then instructs the DMA hardware to begin the transfer. When
the transfer is complete, the device interrupts the CPU.
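As a hedged sketch of that sequence, here is the conventional way channel 2 of the
classic 8237 ISA DMA controller is programmed. The port numbers are the standard PC
values, and outb() writes a byte to an I/O port (on x86 Linux it comes from <sys/io.h>
and requires ioperm()); this illustrates the address/count/direction/go steps described
above and is not meant for a modern driver.

#include <stdint.h>
#include <sys/io.h>

void isa_dma_ch2_start(uint32_t phys_addr, uint16_t count)
{
    outb(0x06, 0x0A);                        /* mask (disable) channel 2 */
    outb(0x00, 0x0C);                        /* reset the byte flip-flop */
    outb(0x46, 0x0B);                        /* mode: single transfer, device-to-memory, ch 2 */
    outb(phys_addr & 0xFF, 0x04);            /* address bits 0-7 */
    outb((phys_addr >> 8) & 0xFF, 0x04);     /* address bits 8-15 */
    outb((phys_addr >> 16) & 0xFF, 0x81);    /* page register: address bits 16-23 */
    outb((count - 1) & 0xFF, 0x05);          /* count, low byte (controller counts N-1) */
    outb(((count - 1) >> 8) & 0xFF, 0x05);   /* count, high byte */
    outb(0x02, 0x0A);                        /* unmask channel 2: the transfer can begin */
}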
"Scatter-gather" DMA allows the transfer of data to and from multiple memory areas in
a single DMA transaction. It is equivalent to the chaining together of multiple simple
DMA requests. Again, the motivation is to off-load multiple input/output interrupt and
data copy tasks from the CPU.
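The idea can be sketched as a descriptor chain in C; the field names below are
illustrative, since every real controller defines its own descriptor layout:

#include <stddef.h>
#include <stdint.h>

struct sg_descriptor {
    uint64_t phys_addr;                 /* start of one memory region */
    uint32_t length;                    /* bytes to move to or from it */
    struct sg_descriptor *next;         /* next region, or NULL at end of chain */
};

/* The whole chain describes one logical transfer; its size is the sum of the parts. */
uint64_t sg_total_length(const struct sg_descriptor *d)
{
    uint64_t total = 0;
    for (; d != NULL; d = d->next)
        total += d->length;
    return total;
}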
DRQ stands for DMA request; DACK for DMA acknowledge. These symbols are generally
seen on hardware schematics of computer systems with DMA functionality. They
represent electronic signaling lines between the CPU and DMA controller.
Recent advances in silicon densities now allow for the integration of numerous functions
onto a single silicon chip. With this increased density, peripherals formerly attached to the
processor at the card level are integrated onto the same die as the processor. As a result,
chip designers must now address issues traditionally handled by the system designer. In
particular, the on-chip buses used in such system-on-a-chip designs must be sufficiently
flexible and robust in order to support a wide variety of embedded system needs. The IBM
Blue Logic™ cores program provides the framework to efficiently realize complex
system-on-a-chip (SOC) designs. Typically, an SOC contains numerous functional blocks
representing a very large number of logic gates. Designs such as these are best realized
through a macro-based approach. Macro-based design provides numerous benefits during
logic entry and verification, but the ability to reuse intellectual property is often the most
significant. From generic serial ports to complex memory controllers and processor cores,
each SOC generally requires the use of common macros. Many single chip solutions used
in applications today are designed as custom chips, each with its own internal
architecture. Logical units within such a chip are often difficult to extract and re-use in
different applications. As a result, many times the same function is redesigned from one
application to another. Promoting reuse by ensuring macro interconnectivity is
accomplished by using common buses for intermacro communications. To that end, the
IBM CoreConnect architecture provides three buses for interconnecting cores, library
macros, and custom logic:
· Processor Local Bus (PLB)
· On-Chip Peripheral Bus (OPB)
· Device Control Register (DCR) Bus
Figure 1 illustrates how the CoreConnect architecture can be used to interconnect macros
in a PowerPC 440 based SOC. High performance, high bandwidth blocks such as the
PowerPC 440 CPU core, PCI-X Bridge and PC133/DDR133 SDRAM Controller reside on the
PLB, while the OPB hosts lower data rate peripherals. The daisy-chained DCR bus provides
a relatively low-speed data path for passing configuration and status information between
the PowerPC 440 CPU core and other on-chip macros.
The CoreConnect architecture shares many similarities with the Advanced Microcontroller
Bus
Architecture (AMBA™) from ARM Ltd. As shown in Table 1, the recently announced AMBA
2.0 includes the specification of many high performance features that have been
available in the CoreConnect architecture for over three years. Both architectures support
data bus widths of 32-bits and higher, utilize separate read and write data paths and
allow multiple masters. CoreConnect and AMBA 2.0 now both provide high performance
features including pipelining, split transactions and burst transfers. Many custom designs
utilizing the high performance features of the CoreConnect architecture are available in
the marketplace today. Open specifications for the CoreConnect architecture are available
on the IBM Microelectronics web site. In addition, IBM offers a no-fee, royalty-free
CoreConnect architectural license. Licensees receive the PLB arbiter, OPB arbiter and
PLB/OPB Bridge designs along with bus model toolkits and bus functional compilers for the
PLB, OPB and DCR buses. In the future, IBM intends to include compliance test suites for
each of the three buses.
Processor Local Bus
The PLB and OPB buses provide the primary means of data flow among macro elements.
Because these two buses have different structures and control signals, individual macros
are designed to interface to either the PLB or the OPB. Usually the PLB interconnects
high-bandwidth devices such as processor cores, external memory interfaces and DMA
controllers. The PLB addresses the high performance, low latency and design flexibility
issues needed in a highly integrated SOC through:
· Decoupled address, read data, and write data buses with split transaction capability
· Concurrent read and write transfers yielding a maximum bus utilization of two data
transfers per clock
· Address pipelining that reduces bus latency by overlapping a new write request with an
ongoing write transfer and up to three read requests with an ongoing read transfer.
· Ability to overlap the bus request/grant protocol with an ongoing transfer
In addition to providing a high bandwidth data path, the PLB offers designers flexibility
through the following features:
· Support for both multiple masters and slaves
· Four priority levels for master requests allowing PLB implementations with various
arbitration schemes
· Deadlock avoidance through slave forced PLB rearbitration
· Master driven atomic operations through a bus arbitration locking mechanism
· Byte-enable capability, supporting unaligned transfers
· A sequential burst protocol allowing byte, half-word, word and double-word burst
transfers
· Support for 16-, 32- and 64-byte line data transfers
· Read word address capability, allowing slaves to return line data either sequentially or
target word first
· DMA support for buffered, fly-by, peripheral-to-memory, memory-to-peripheral, and
memory-to-memory transfers
· Guarded or unguarded memory transfers, allowing slaves to individually enable or
disable prefetching of instructions or data
· Slave error reporting
· Architecture extendable to 256-bit data buses
· Fully synchronous
The PLB specification describes system architecture along with a detailed description of
the signals and transactions. PLB-based custom logic systems require the use of a PLB
macro to interconnect the various master and slave macros.
Figure 2 illustrates the connection of multiple masters and slaves through the PLB macro.
Each PLB master is attached to the PLB macro via separate address, read data and write
data buses and a plurality of transfer qualifier signals. PLB slaves are attached to the PLB
macro via shared, but decoupled, address, read data and write data buses along with
transfer control and status signals for each data bus.
As with any shared bus, larger systems tend to have increased bus wire loading and a
longer delay in arbitrating among multiple masters and slaves.
The PLB macro consists of a bus arbitration control unit and the control logic required
to manage the address and data flow through the PLB. The separate address and data
buses from the masters allow simultaneous transfer requests. The PLB macro arbitrates
among these requests and directs the address, data and control signals from the granted
master to the slave bus. The slave response is then routed from the slave bus back to the
appropriate master.
A PLB address tenure has three phases: request, transfer, and address acknowledge. As
shown in Figure 3, the PLB specification supports implementations where these three
phases can require only a single PLB clock cycle. This occurs when the requesting master
is immediately granted access to the slave bus and the slave acknowledges the address in
the same cycle. If a master issues a request that cannot be immediately forwarded to the
slave bus, the request phase lasts one or more cycles.
Each data beat in the data tenure has two phases: transfer and acknowledge. During the
transfer phase the master drives the write data bus for a write transfer or samples the
read data bus for a read transfer.
As shown in Figure 3, the first (or only) data beat of a write transfer coincides with the
address transfer phase. Data acknowledge cycles are required during the data
acknowledge phase for each data beat in a data cycle. In the case of a single-beat
transfer, the data acknowledge signals also indicate the end of the data transfer. For line
or burst transfers, the data acknowledge signals apply to each individual beat and indicate
the end of the data cycle only after the final beat. The highest data throughput occurs
when data is transferred between master and slave in a single PLB clock cycle. In this
case the data transfer and data acknowledge phases are coincident. During multi-cycle
accesses there is a wait-state either before or between the data transfer and data
acknowledge phases.
The PLB address, read data, and write data buses are decoupled from one another,
allowing for address cycles to be overlapped with read or write data cycles, and for read
data cycles to be overlapped with write data cycles. The PLB split bus transaction
capability allows the address and data buses to have different masters at the same time.
Additionally, a second master may request ownership of the PLB, via address pipelining, in
parallel with the data cycle of another master's bus transfer. This is shown in Figure 3.
Overlapped read and write data transfers and split-bus transactions allow the PLB to
operate at a very high bandwidth by fully utilizing the read and write data buses. Allowing
PLB devices to move data using long burst transfers can further enhance bus throughput.
However, to control the maximum latency in a particular application, master latency
timers are required. All masters able to issue burst operations must contain a latency
timer that increments at the PLB clock rate and a latency count register. The latency
count register is an example of a configuration register that is accessed via the DCR bus.
During a burst operation, the latency timer begins counting after an address acknowledge
is received from a slave. When the latency timer exceeds the value programmed into the
latency count register, the master can either immediately terminate its burst; continue
until another master requests the bus; or continue until another master requests the bus
with a higher priority.
OPB Bridge
PLB masters gain access to the peripherals on the OPB bus through the OPB bridge
macro. The OPB bridge acts as a slave device on the PLB and a master on the OPB. It
supports word (32-bit), half-word (16-bit) and byte read and write transfers on the 32-bit
OPB data bus, supports bursts, and can perform target-word-first line read
accesses. The OPB bridge performs dynamic bus sizing, allowing devices with different
data widths to efficiently communicate. When the OPB bridge master performs an
operation wider than the selected OPB slave the bridge splits the operation into two or
more smaller transfers.
OPB Implementation
The OPB supports multiple masters and slaves by implementing the address and data
buses as a distributed multiplexer. This type of structure is suitable for the less data
intensive OPB bus and allows adding peripherals to a custom core logic design without
changing the I/O on either the OPB arbiter or existing peripherals. Figure 5 shows one
method of structuring the OPB address and data buses. Observe that both masters and
slaves provide enable control signals for their outbound buses. By requiring that each
macro provide this signal, the associated bus combining logic can be strategically
Channels
(1) A high-speed metal or optical fiber subsystem that provides a path between the
computer and the control units of the peripheral devices. Used in mainframes and
high-end servers, each channel is an independent unit that transfers data concurrently
with other channels and the CPU. For example, in a 32-channel computer, 32 streams of
data are transferred simultaneously. In contrast, the PCI bus in a desktop computer is a
shared channel between all devices plugged into it.
(2) The physical connecting medium in a network, which could be twisted wire pairs,
coaxial cable or optical fiber between clients, servers and other devices.
(3) A subchannel within a communications channel. Multiple channels are transmitted via
different carrier frequencies or by interleaving bits and bytes. This usage of the term can
refer to both wired and wireless transmission. See FDM and TDM.
(4) See Webcast, push client and push technology.
(5) The distributor/dealer sales channel. Vendors that sell in the channel rely on the sales
ability of their dealers and the customer relationships they have built up over the years.
Such vendors may also compete with the channel by selling direct to the customer via
catalogs and the Web.
Channel controller
A channel controller is a simple CPU used to handle the task of moving data to
and from the memory of a computer. Depending on the sophistication of the design, they
can also be referred to as peripheral processors, I/O processors, I/O controllers or
DMA controllers.
Most input/output tasks can be fairly complex and require logic to be applied to the
data to convert formats and other similar duties. In these situations the computer's CPU
would normally be asked to handle the logic, but due to the fact that the I/O devices are
very slow, the CPU would end up spending a huge amount of time (in computer terms)
sitting idle waiting for the data from the device.
A channel controller avoids this problem by using a low-cost CPU with enough logic
and memory onboard to handle these sorts of tasks. They are typically not powerful or
flexible enough to be used on their own, and are actually a form of co-processor. The CPU
sends small programs to the controller to handle an I/O job, which the channel controller
can then complete without any help from the CPU. When it is complete, or there is an
error, the channel controller communicates with the CPU using a selection of interrupts.
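Loosely, such a "small program" is a list of simple commands handed to the controller.
The C sketch below is modeled on the flavor of the classic IBM channel command word,
but every field name and opcode here is invented purely for illustration:

#include <stdint.h>

enum chan_op { CHAN_SEEK, CHAN_READ, CHAN_WRITE, CHAN_END };

struct chan_cmd {
    enum chan_op op;        /* what the controller should do */
    uint32_t     addr;      /* main-memory address for the data */
    uint16_t     count;     /* number of bytes to move */
};

/* "Seek, then read 512 bytes into the buffer at 0x8000, then stop." The CPU
   would hand this array to the controller and wait for a completion interrupt. */
static const struct chan_cmd channel_program[] = {
    { CHAN_SEEK,  0,      0   },
    { CHAN_READ,  0x8000, 512 },
    { CHAN_END,   0,      0   },
};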
Since the channel controller has direct access to the main memory of the computer,
they are also often referred to as DMA Controllers (where DMA means direct memory
access), but that term is somewhat more loose in definition and is often applied to
non-programmable devices as well.
The first use of channel controllers was in the famed CDC 6600 supercomputer,
which used 12 dedicated computers, referred to as peripheral processors (PPs), for
this role. The PPs were quite powerful, being basically a cut-down version of CDC's first
computer, the CDC 1604. Since the 1960s channel controllers have been a standard part
of almost all mainframe designs, and a primary reason why anyone buys one. CDC's
PPs are at one end of the spectrum of power; most mainframe systems tasked the CPU
with more and the channel controllers with less of the overall I/O task.
Channel controllers have also been made as small as single-chip designs with
multiple channels on them, used in the NeXT computers for instance. However, with the
rapid speed increases in computers today, combined with operating systems that don't
"block" when waiting for data, the channel controller has become somewhat redundant
and is not commonly found on smaller machines.
Channel controllers can be said to be making a comeback in the form of "bus
mastering" peripheral devices, such as SCSI adaptors and network cards. The rationale
for these devices is the same as for the original channel controllers, namely off-loading
interrupts and context switching from the main CPU.
A serial number is a unique number that is one of a series assigned for identification
which varies from its successor or predecessor by a fixed discrete integer value.
Common usage has expanded the term to refer to any unique alphanumeric identifier
for one of a large set of objects. In data processing and allied fields of computer
science, however, not every numerical identifier is a serial number; identifying numbers
that are not serial numbers are sometimes called nominal numbers.
Sequence numbers are almost always non-negative, and typically start at zero or one.
Many computer programs come with serial numbers, often called "CD keys," and the
installers often require the user to enter a valid serial number to continue. These
numbers are verified using a certain algorithm to avoid usage of counterfeit keys.
Serial numbers also help track down counterfeit currency, because in some countries
each banknote has a unique serial number.
The ISSN or International Standard Serial Number seen on magazines and other
periodicals, an equivalent to the ISBN applied to books, is serially assigned but takes its
name from the library science use of serial to mean a periodical.
Certificates and Certificate Authorities (CAs) are necessary for widespread use of
cryptography. These depend on applying mathematically rigorous serial numbers and
serial number arithmetic.
The term "serial number" is also used in military formations as an alternative to the
expression "service number".
If there are items whose serial numbers are part of a sequence of consecutive numbers
and you take n random samples of the items' serial numbers, you can then
estimate the population of items "in the wild" using a maximum likelihood method
derived using Bayesian reasoning.
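As a quick worked example (using one common estimator for this serial-number sampling
problem): if the largest serial number seen in a random sample of n items is m, the
population size can be estimated as roughly m + m/n − 1. Seeing a maximum of 60 in a
sample of 4 items would suggest about 60 + 60/4 − 1 = 74 items in the wild.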
Serial numbers are often used in network protocols. However, most sequence numbers
in computer protocols are limited to a fixed number of bits, and will wrap around after a
sufficiently many numbers have been allocated. Thus, recently-allocated serial numbers
may duplicate very old serial numbers, but not other recently-allocated serial numbers.
To avoid ambiguity with these non-unique numbers, RFC 1982, "Serial Number
Arithmetic", defines special rules for calculations involving these kinds of serial numbers.
Lollipop sequence number spaces are a more recent and sophisticated scheme for
dealing with finite-sized sequence numbers in protocols.
An information processing system can be described in terms of four parts:
· input
· processor
· storage
· output
First, information in the form of gravitational force from the earth serves as input to the
system we call a rock. At a particular instant the rock is a specific distance from the
surface of the earth traveling at a specific speed. Both the current distance and speed
properties are also forms of information which for that instant only may be considered
"stored" in the rock.
In the next instant, the distance of the rock from the earth has changed due to its
motion under the influence of the earth's gravity. Any time the properties of an object
change, a process has occurred, meaning that a processor of some kind is at work. In
addition, the rock's new position and increased speed are observed by us as it falls. These
changing properties of the rock are its "output."
It could be argued that in this example both the rock and the earth are the information
processing system being observed since both objects are changing the properties of
each other over time. If information were not being processed, no change would occur at
all.
Lesson VII
Arithmetic logic units vary in terms of number of bits, supply voltage, operating current,
propagation delay, power dissipation, and operating temperature. The number of bits
equals the width of the two input words on which the ALU performs arithmetic and
logical operations. Common configurations include 2-bit, 4-bit, 8-bit, 16-bit, 32-bit and
64-bit ALUs. Supply voltages range from -5 V to 5 V and include intermediate voltages
such as -4.5 V, -3.3 V, -3 V, 1.2 V, 1.5 V, 1.8 V, 2.5 V, 3 V, 3.3 V, and 3.6 V. The
operating current is the minimum current needed for active operation. The propagation
delay is the time interval between the application of an input signal and the occurrence of
the corresponding output. Power dissipation, the total power consumption of the device, is
generally expressed in watts (W) or milliwatts (mW). Operating temperature is specified
as a full required operating range.
Arithmetic logic units are available in a variety of integrated circuit (IC) package types and
with different numbers of pins. Basic IC package types for ALUs include ball grid array
(BGA), quad flat package (QFP), single in-line package (SIP), and dual in-line package
(DIP). Many packaging variants are available. For example, BGA variants include
plastic-ball grid array (PBGA) and tape-ball grid array (TBGA). QFP variants include
low-profile quad flat package (LQFP) and thin quad flat package (TQFP). DIPs are
available in either ceramic (CDIP) or plastic (PDIP). Other IC package types include small
outline package (SOP), thin small outline package (TSOP), and shrink small outline
package (SSOP).
Decimal Arithmetic
The 80x86 CPUs use the binary numbering system for their native internal representation.
The binary numbering system is, by far, the most common numbering system in use in
computer systems today. In days long since past, however, there were computer systems
that were based on the decimal (base 10) numbering system rather than the binary
numbering system. Consequently, their arithmetic system was decimal based rather than
binary. Such computer systems were very popular in systems targeted for
business/commercial applications. Although systems designers have discovered that binary
arithmetic is almost always better than decimal arithmetic for general calculations, the
myth still persists that decimal arithmetic is better for money calculations than binary
arithmetic. Therefore, many software systems still specify the use of decimal arithmetic in
their calculations (not to mention that there is lots of legacy code out there whose
algorithms are only stable if they use decimal arithmetic). Therefore, despite the fact that
decimal arithmetic is generally inferior to binary arithmetic, the need for decimal
arithmetic still persists.
Of course, the 80x86 is not a decimal computer; therefore we have to play tricks in order
to represent decimal numbers using the native binary format. The most common
technique, even employed by most so-called decimal computers, is to use the binary
coded decimal, or BCD representation. The BCD representation (see "Nibbles" on page 56)
uses four bits to represent the 10 possible decimal digits. The binary value of those four
bits is equal to the corresponding decimal value in the range 0..9. Of course, with four bits
we can actually represent 16 different values. The BCD format ignores the remaining six
bit combinations.
Bits    BCD value
0000    0
0001    1
0010    2
0011    3
0100    4
0101    5
0110    6
0111    7
1000    8
1001    9
1010    Illegal
1011    Illegal
1100    Illegal
1101    Illegal
1110    Illegal
1111    Illegal
Since each BCD digit requires four bits, we can represent a two-digit BCD value with a single byte. This means
that we can represent the decimal values in the range 0..99 using a single byte (versus 0..255 if we treat the value as
an unsigned binary number). Clearly it takes a bit more memory to represent the same value in BCD as it does to
represent the same value in binary. For example, with a 32-bit value you can represent BCD values in the range
0..99,999,999 (eight significant digits) but you can represent values in the range 0..4,294,967,295 (better than nine
significant digits) using the binary representation.
Not only does the BCD format waste memory on a binary computer (since it uses more
bits to represent a given integer value), but decimal arithmetic is slower. For these
reasons, you should avoid the use of decimal arithmetic unless it is absolutely mandated
for a given application.
Binary coded decimal representation does offer one big advantage over binary
representation: it is fairly trivial to convert between the string representation of a decimal
number and the BCD representation. This feature is particularly beneficial when working
with fractional values since fixed and floating point binary representations cannot exactly
represent many commonly used values between zero and one (e.g., 1/10). Therefore,
BCD operations can be efficient when reading from a BCD device, doing a simple
arithmetic operation (e.g., a single addition) and then writing the BCD value to some
other device.
The important thing to keep in mind is that you must not use HLA literal decimal constants
for BCD values. That is, "mov( 95, al );" does not load the BCD representation for
ninety-five into the AL register. Instead, it loads $5F into AL and that's an illegal BCD
value. Any computations you attempt with illegal BCD values will produce garbage results.
Always remember that, even though it seems counter-intuitive, you use hexadecimal
literal constants to represent literal BCD values.
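A small C illustration of the same point: packing two decimal digits per byte means that
decimal 95 is the byte 0x95, not 0x5F.

#include <stdint.h>
#include <stdio.h>

uint8_t to_packed_bcd(unsigned v)            /* valid for v in 0..99 */
{
    return (uint8_t)(((v / 10) << 4) | (v % 10));
}

unsigned from_packed_bcd(uint8_t b)
{
    return (unsigned)(b >> 4) * 10u + (b & 0x0Fu);
}

int main(void)
{
    printf("95 packs to 0x%02X\n", to_packed_bcd(95));     /* prints 0x95 */
    printf("0x95 unpacks to %u\n", from_packed_bcd(0x95)); /* prints 95 */
    return 0;
}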
How Pipelining Works
Pipelining, a standard feature in RISC processors, is much like an assembly line. Because
the processor works on different steps of the instruction at the same time, more
instructions can be executed in a shorter period of time.
A useful method of demonstrating this is the laundry analogy. Let's say that there are
four loads of dirty laundry that need to be washed, dried, and folded. We could put the
first load in the washer for 30 minutes, dry it for 40 minutes, and then take 20 minutes
to fold the clothes. Then pick up the second load and wash, dry, and fold, and repeat
for the third and fourth loads. Supposing we started at 6 PM and worked as efficiently
as possible, we would still be doing laundry until midnight.
However, a smarter approach to the problem would be to put the second load of dirty
laundry into the washer after the first was already clean and whirling happily in the
dryer. Then, while the first load was being folded, the second load would dry, and a
third load could be added to the pipeline of laundry. Using this method, the laundry
would be finished by 9:30.
RISC Pipelines
A RISC processor pipeline operates in much the same way, although the stages in the
pipeline are different. While different processors have different numbers of steps, they
are basically variations of these five, used in the MIPS R3000 processor:
1. Fetch the instruction from memory
2. Read registers and decode the instruction
3. Execute the instruction or calculate an address
4. Access an operand in data memory
5. Write the result into a register
If you glance back at the diagram of the laundry pipeline, you'll notice that although the
washer finishes in half an hour, the dryer takes an extra ten minutes, and thus the wet
clothes must wait ten minutes for the dryer to free up. Thus, the length of the pipeline
is dependent on the length of the longest step. Because RISC instructions are simpler
than those used in pre-RISC processors (now called CISC, or Complex Instruction Set
Computer), they are more conducive to pipelining. While CISC instructions varied in
length, RISC instructions are all the same length and can be fetched in a single
operation. Ideally, each of the stages in a RISC processor pipeline should take 1 clock
cycle so that the processor finishes an instruction each clock cycle and averages one
cycle per instruction (CPI).
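As a quick worked example: if each of the five stages takes 2 ns, any one instruction
still needs 5 × 2 = 10 ns from fetch to write-back, but once the pipeline is full one
instruction completes every 2 ns, a fivefold throughput gain over an unpipelined 10 ns
design.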
Pipeline Problems
In practice, however, RISC processors operate at more than one cycle per instruction.
The processor might occasionally stall as a result of data dependencies and branch
instructions.
For example:
add $r3, $r2, $r1
add $r5, $r4, $r3
more instructions that are independent of the first two
In this example, the first instruction tells the processor to add the contents of registers
r1 and r2 and store the result in register r3. The second instructs it to add r3 and r4
and store the sum in r5. We place this set of instructions in a pipeline. When the second
instruction is in the second stage, the processor will be attempting to read r3 and r4
from the registers. Remember, though, that the first instruction is just one step ahead
of the second, so the contents of r1 and r2 are being added, but the result has not yet
been written into register r3. The second instruction therefore cannot read from the
register r3 because it hasn't been written yet and must wait until the data it needs is
stored. Consequently, the pipeline is stalled and a number of empty instructions (known
as bubbles) go into the pipeline. Data dependency affects long pipelines more than
shorter ones since it takes a longer period of time for an instruction to reach the final
register-writing stage of a long pipeline.
MIPS' solution to this problem is code reordering. If, as in the example above, the
following instructions have nothing to do with the first two, the code could be
rearranged so that those instructions are executed in between the two dependent
instructions and the pipeline could flow efficiently. The task of code reordering is
generally left to the compiler, which recognizes data dependencies and attempts to
minimize performance stalls.
Branch instructions are those that tell the processor to make a decision about what the
next instruction to be executed should be based on the results of another instruction.
Branch instructions can be troublesome in a pipeline if a branch is conditional on the
results of an instruction which has not yet finished its path through the pipeline.
For example:
Loop: add $r3, $r2, $r1
sub $r6, $r5, $r4
beq $r3, $r6, Loop
The example above instructs the processor to add r1 and r2 and put the result in r3,
then subtract r4 from r5, storing the difference in r6. In the third instruction, beq
stands for branch if equal. If the contents of r3 and r6 are equal, the processor should
execute the instruction labeled "Loop." Otherwise, it should continue to the next
instruction. In this example, the processor cannot make a decision about which branch
to take because neither the value of r3 nor that of r6 has been written into the registers
yet.
The processor could stall, but a more sophisticated method of dealing with branch
instructions is branch prediction. The processor makes a guess about which path to take
- if the guess is wrong, anything written into the registers must be cleared, and the
pipeline must be started again with the correct instruction. Some methods of branch
prediction depend on stereotypical behavior. Branches pointing backward are taken
about 90% of the time since backward-pointing branches are often found at the bottom
of loops. On the other hand, branches pointing forward are only taken approximately
50% of the time. Thus, it would be logical for processors to always follow the branch
when it points backward, but not when it points forward. Other methods of branch
prediction are less static: processors that use dynamic prediction keep a history for
each branch and use it to predict future branches. These processors are correct in their
predictions 90% of the time.
Still other processors forgo the entire branch prediction ordeal. The RISC System/6000
fetches and starts decoding instructions from both sides of the branch. When it
determines which branch should be followed, it then sends the correct instructions down
the pipeline to be executed.
Pipelining Developments
In order to make processors even faster, various methods of optimizing pipelines have
been devised.
Super pipelining refers to dividing the pipeline into more steps. The more pipe stages
there are, the faster the pipeline is because each stage is then shorter. Ideally, a
pipeline with five stages should be five times faster than a non-pipelined processor (or
rather, a pipeline with one stage). The instructions are executed at the speed at which
each stage is completed, and each stage takes one fifth of the amount of time that the
non-pipelined instruction takes. Thus, a processor with an 8-step pipeline (the MIPS
R4000) will be even faster than its 5-step counterpart. The MIPS R4000 chops its
pipeline into more pieces by dividing some steps into two. Instruction fetching, for
example, is now done in two stages rather than one. The stages are as shown:
1. Instruction Fetch (First Half)
2. Instruction Fetch (Second Half)
3. Register Fetch
4. Instruction Execute
5. Data Cache Access (First Half)
6. Data Cache Access (Second Half)
7. Tag Check
8. Write Back
Dynamic pipelines have the capability to schedule around stalls. A dynamic pipeline is
divided into three units: the instruction fetch and decode unit, five to ten execute or
functional units, and a commit unit. Each execute unit has reservation stations, which
act as buffers and hold the operands and operations.
While the functional units have the freedom to execute out of order, the instruction
fetch/decode and commit units must operate in-order to maintain simple pipeline
behavior. When the instruction is executed and the result is calculated, the commit unit
decides when it is safe to store the result. If a stall occurs, the processor can schedule
other instructions to be executed until the stall is resolved. This, coupled with the
efficiency of multiple units executing instructions simultaneously, makes a dynamic
pipeline an attractive alternative.
Lesson VIII
The line between computer system classes can be extremely vague. A powerful entry-level
system can double as a low-end business system, or a gaming system can be identical
to a low-end workstation. In fact, some equipment manufacturers may refer to their
business systems as workstations. Some components on a computer in any class can be
installed on all systems. For example, a manufacturer may use the same RAM on the
entry-level system and the gaming system. You will want to pay particular attention to a
system's CPU and video. Sometimes it is the amount of hard disk space or the addition of
a better
video adapter that moves a system from one class to another. Please keep in mind that
the tables below show the minimum configurations.
The number of operands of an operator is called its arity. Based on arity, operators are
classified as unary, binary, ternary, etc. For example, negation is unary, addition is
binary, and C's conditional operator (?:) is ternary.
Introduction
• At the ISA level, a variety of different data types are used to represent data.
• A key issue is whether or not there is hardware support for a particular data type.
• Hardware support means that one or more instructions expect data in a particular
format, and the user is not free to pick a different format.
• Another issue is precision – what if we wanted to total the transactions on Bill Gates'
deposit account?
• Using 32-bit arithmetic would not work here because the numbers involved are larger
than 2^32 (about 4 billion).
• We could use two 32-bit integers to represent each number, giving 64 bits in all (see
the sketch after this list).
• However, if the machine does not support this kind of double precision number, all
arithmetic on them will have to be done in software, and then no particular hardware
representation is required.
• Today, we will look at data types that are supported by the hardware, and thus for
which specific formats are required.
• Data types can be divided into two categories: numeric and nonnumeric.
• Chief among the numeric data types are the integers, which come in many lengths,
typically 8, 16, 32, and 64 bits.
• Most modern computers store integers in two’s complement binary notation.
• Some computers support unsigned integers as well as signed integers.
• For an unsigned integer, there is no sign bit and all the bits contain data – thus the
range of a 32-bit word is 0 to 2^32 − 1, inclusive.
• In contrast, a two's complement signed 32-bit integer can only handle numbers up to
2^31 − 1, but it can also handle negative numbers.
• For numbers that cannot be expressed as an integer, floating-point numbers are used.
• They have lengths of 32, 64, or sometimes 128 bits.
• Most computers have instructions for doing floating-point arithmetic.
• Many computers have separate registers for holding integer operands and for holding
floating-point operands.
• Modern computers are often used for nonnumerical applications, such as word
processing or database management.
• Thus, characters are clearly important here, although not every computer provides
hardware support for them.
• The most common character codes are ASCII and UNICODE.
• These support 7-bit characters and 16-bit characters, respectively.
• It is not uncommon for the ISA level to have special instructions that are intended for
handling character strings.
• The instructions can perform copy, search, edit and other functions on the strings.
• Boolean values are also important.
• Two values: TRUE or FALSE.
• In theory, a single bit can represent a Boolean, with 0 as false and 1 as true (or vice
versa).
• In practice, a byte or word is used per Boolean value because individual bits in a byte
do not have their own addresses and thus are hard to access.
• A common system uses the convention that 0 means false and everything else means
true.
• Our last data type is the pointer, which is just a machine address.
• We have already seen pointers.
• When we discussed stacks we came across pointers SP and LV.
• Accessing a variable at a fixed distance from a pointer, which is the way ILOAD works,
is extremely common on all machines.
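Here is the sketch promised above: a minimal C illustration of doing "double precision"
arithmetic in software when the hardware only handles 32-bit words. The two-word
format and the names are ours, purely for illustration.

#include <stdint.h>

typedef struct { uint32_t hi, lo; } sw64;    /* a 64-bit value as two 32-bit words */

sw64 sw64_add(sw64 a, sw64 b)
{
    sw64 r;
    r.lo = a.lo + b.lo;                      /* low words add modulo 2^32 */
    r.hi = a.hi + b.hi + (r.lo < a.lo);      /* carry in if the low sum wrapped */
    return r;
}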
Instruction Formats
• An instruction consists of an opcode, usually along with some additional information,
such as where operands come from and where results go to.
• The general subject of specifying where the operands are (i.e., their addresses) is
called addressing.
• Instructions always have an opcode to tell what the instruction does.
• There can be zero, one, two, or three addresses present.
Instruction Formats
• On some machines, all instructions have the same length; on others there may be many
different lengths.
• Instructions may be shorter than, the same length as, or longer than the word length.
• Having all the instructions be the same length is simpler and makes decoding easier but
often wastes space, since all instructions then have to be as long as the longest one.
Addressing
• Instructions generally have one, two or three operands.
• The operands are addressed using one of the following modes:
– Immediate
– Direct
– Register
– Indexed
– Other mode
• Some machines have a large number of complex addressing modes.
• We will consider a few addressing modes here.
Immediate Addressing
• The simplest way for an instruction to specify an operand is for the address part of the
instruction actually to contain the operand itself rather than an address or other
information describing where the operand is.
• Such an operand is called an immediate operand because it is automatically fetched
from memory at the same time the instruction itself is fetched.
• Example:
MOV R1,4
• Advantage – no extra memory reference to fetch the operand.
• Disadvantage – only a constant can be supplied this way.
Direct Addressing
• A method for specifying an operand in memory is just to give its full address.
• This mode is called direct addressing.
• Like immediate addressing, direct addressing is restricted in its use: the instruction will
always access exactly the same memory location.
• So while the value can change, the location cannot.
• Thus direct addressing can only be used to access global variables whose address is
known at compile time.
Register Addressing
• As an example of register indirect addressing, consider a loop that steps through an
integer array to compute the sum of the elements in register R1.
• We will use register indirect addressing through R2 to access the elements of the array.
• Here is the assembly program:
Indexed Addressing
Based-Indexed Addressing
• Some machines have an addressing mode in which the memory address is computed
by adding up two registers plus an (optional) offset.
• Sometimes this mode is called based-indexed addressing.
• One of the registers is the base and the other is the index.
• Such a mode would have been useful in our example here.
• Outside the loop we could have put the address of A in R5 and the address of B in R6.
• Then we could have replaced the instruction at LOOP and its successor with:
LOOP: MOV R4,(R2+R5)
      AND R4,(R2+R6)
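As a rough illustration (a Python sketch of ours, not the machine code itself), the
based-indexed loop above amounts to stepping one index across two arrays whose base
addresses are held in registers:

# Sketch of the based-indexed loop: R5 and R6 play the role of the base
# addresses of arrays A and B, the loop index plays the role of R2, and each
# iteration performs R4 = A[i] AND B[i], as in MOV R4,(R2+R5) / AND R4,(R2+R6).
A = [0b1100, 0b1010, 0b1111]
B = [0b1010, 0b0110, 0b0001]
results = []
for i in range(len(A)):          # R2 steps through the elements
    r4 = A[i]                    # MOV R4,(R2+R5)
    r4 &= B[i]                   # AND R4,(R2+R6)
    results.append(r4)
print(results)                   # [8, 2, 1]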
An instruction set is (a list of) all instructions, and all their variations, that a processor
can execute.
Instructions include arithmetic operations (add, subtract), logic operations (and, or, not),
data-movement operations (move, load, store), and control-flow operations (jump,
branch, call).
Instruction set architecture is distinguished from the microarchitecture, which is the set
of processor design techniques used to implement the instruction set. Computers with
different microarchitectures can share a common instruction set. For example, the Intel
Pentium and the AMD Athlon implement nearly identical versions of the x86 instruction
set, but have radically different internal designs.
· All early computer designers, and some designers of the simpler later RISC
computers, hard-wired the instruction set.
· Many CPU designers compiled the instruction set to a microcode ROM inside the
CPU (such as the Western Digital MCP-1600).
· Some CPU designers compiled the instruction set to a writable RAM or flash inside
the CPU (such as the Rekursiv processor and the Imsys Cjip)[1], or to an FPGA
(reconfigurable computing).
Some instruction set designers reserve one or more opcodes for some kind of software
interrupt. For example, MOS Technology 6502 uses 0x00 (all zeroes), Zilog Z80 uses
0xFF (all ones),[1] and Motorola 68000 has instructions 0xA000 through 0xAFFF.
Fast virtual machines are much easier to implement if an instruction set meets the
Popek and Goldberg virtualization requirements.
Code density
In early computers, program memory was expensive and limited, and minimizing the
size of a program in memory was important. Thus the code density -- the combined size
of the instructions needed for a particular task -- was an important characteristic of an
instruction set. Instruction sets with high code density employ powerful instructions that
can implicitly perform several functions at once. Typical complex instruction-set
computers (CISC) have instructions that combine one or two basic operations (such as
"add", "multiply", or "call subroutine") with implicit instructions for accessing memory,
incrementing registers upon use, or dereferencing locations stored in memory or
registers. Some software-implemented instruction sets have even more complex and
powerful instructions.
By contrast, reduced instruction-set computers (RISC) trade code density for simplicity:
each instruction performs only a single operation, such as an "add" of two registers or the
"load" of a memory location into a register.
Minimal instruction set computers (MISC) are a form of stack machine, where there are
few separate instructions (16-64), so that multiple instructions can fit into a single
machine word. Cores of this type often take little silicon to implement, so they can be
easily realized in an FPGA or in multi-core form. Code density is similar to RISC; the
increased instruction density is offset by requiring more of the primitive instructions to
do a task.
Instruction sets may be categorized by the number of operands in their most complex
instructions. (In the examples that follow, a, b, and c refer to memory addresses, and
reg1 and so on refer to machine registers.)
· 0-operand ("zero address machines") -- these are also called stack machines, and
all operations take place using the top one or two positions on the stack. Adding
two numbers here can be done with four instructions: push a, push b, add, pop
c;
· 1-operand -- this model was common in early computers, and each instruction
performs its operation using a single operand and places its result in a single
accumulator register: load a, add b, store c;
· 2-operand -- most RISC machines fall into this category, though many CISC
machines fall here as well. For a RISC machine (requiring explicit memory
loads), the instructions would be: load a,reg1, load b,reg2, add reg1,reg2, store
reg2;
· 3-operand -- some CISC machines, and a few RISC machines, fall into this category.
The above example might be performed in a single instruction in a machine
with memory operands: add a,b,c, or more typically (most machines permit a
maximum of two memory operations even in three-operand instructions): move
a,reg1, add reg1,b,c. In three-operand RISC machines, all three operands are
typically registers, so explicit load/store instructions are needed. An instruction set
with 32 registers requires 15 bits to encode three register operands, so this scheme
is typically limited to instruction sets with 32-bit instructions or longer;
· more operands -- some CISC machines permit a variety of addressing modes that
allow more than 3 register-based operands for memory accesses.
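To make the 0-operand (stack machine) style concrete, here is a minimal Python sketch
of ours -- an illustration, not any real machine -- that evaluates c = a + b with the
sequence push a, push b, add, pop c:

# Minimal stack-machine sketch: evaluates c = a + b using
# the 0-operand sequence "push a, push b, add, pop c".
def run(program, memory):
    stack = []
    for op, *args in program:
        if op == "push":             # push the value at a memory address
            stack.append(memory[args[0]])
        elif op == "add":            # replace the top two values with their sum
            stack.append(stack.pop() + stack.pop())
        elif op == "pop":            # store the top value to a memory address
            memory[args[0]] = stack.pop()
    return memory

memory = {"a": 2, "b": 3, "c": 0}
program = [("push", "a"), ("push", "b"), ("add",), ("pop", "c")]
print(run(program, memory))          # {'a': 2, 'b': 3, 'c': 5}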
There has been research into executable compression as a mechanism for improving
code density. The mathematics of Kolmogorov complexity describes the challenges and
limits of this.
Machine language
Machine language is built up from discrete statements or instructions. Depending on the
processing architecture, a given instruction may specify registers, memory locations, or
literal values as its operands.
More complex operations are built up by combining these simple instructions, which (in
a von Neumann machine) are executed sequentially, or as otherwise directed by control
flow instructions. Typical operations include:
· moving
· move data from a memory location to a register, or vice versa. This is done
to obtain the data to perform a computation on it later, or to store the result
of a computation.
· computing
· add, subtract, multiply, or divide the values of two registers, placing the
result in a register
· compare two values in registers (for example, to see if one is less, or if they
are equal)
· control flow
· jump to another location in the program and execute instructions there
· jump to another location, but save the location of the next instruction as a
point to return to (a call)
Some computers include "complex" instructions in their instruction set. A single
"complex" instruction does something that may take many instructions on other
computers. Such instructions are typified by instructions that take multiple steps,
control multiple functional units, or otherwise appear on a larger scale than the bulk of
simple instructions implemented by the given processor. Some examples of "complex"
instructions include:
· instructions that combine an ALU operation with an operand from memory rather than a register
A complex instruction type that has become particularly popular recently is the SIMD
(Single Instruction, Multiple Data) or vector instruction: an operation that performs the
same arithmetic operation on multiple pieces of data at the same time. SIMD
instructions can manipulate large vectors and matrices in minimal time, and they allow
easy parallelization of algorithms commonly involved in sound, image, and video
processing. Various SIMD implementations have been brought to market under trade
names such as MMX, 3DNow!, and AltiVec.
The design of instruction sets is a complex issue. There have been two broad stages in
the history of the microprocessor. The first used CISC (complex instruction set
computer) designs, in which many instructions were implemented. In the 1970s,
researchers at places like IBM found that many of these instructions could be
eliminated. The result was the RISC (reduced instruction set computer) architecture,
which uses a smaller set of instructions. A simpler instruction set may offer the potential
for higher speeds, reduced processor size, and reduced power consumption; a more
complex one may optimize common operations, improve memory/cache efficiency, or
simplify programming.
List of ISAs
This list is far from comprehensive as old architectures are abandoned and new ones
invented on a continual basis. There are many commercially available microprocessors
and microcontrollers implementing ISAs in all shapes and sizes. Customised ISAs are
also quite common in some applications, e.g. ARC International, application-specific
integrated circuit, FPGA, and reconfigurable computing. Also see history of computing
hardware.
ISAs commonly implemented in hardware
· Alpha AXP (DEC Alpha)
· ARM (Acorn RISC Machine) (Advanced RISC Machine now ARM Ltd)
· IA-64 (Itanium)
· MIPS
· Motorola 68k
· IBM POWER
· PowerPC
· SPARC
· SuperH
· System/360
· Tricore (Infineon)
· Transputer (STMicroelectronics)
· EISC (AE32K)
· FORTH
ISAs never implemented in hardware
· ALGOL object code
Categories of ISA
· application-specific integrated circuit (ASIC) fully custom ISA
· CISC
· MISC
· reconfigurable computing
· RISC
· vector processor
· VLIW
· microcontroller
· microprocessor
Processor Register
Processor registers are the top of the memory hierarchy, and provide the fastest way
for the system to access data. The term is often used to refer only to the group of
registers that can be directly indexed for input or output of an instruction, as defined by
the instruction set. More properly, these are called the "architectural registers". For
instance, the x86 instruction set defines a set of eight 32-bit registers, but a CPU that
implements the x86 instruction set will contain many more registers than just these
eight.
Putting frequently used variables into registers is critical to a program's performance.
This action, namely register allocation, is usually done by a compiler in the code
generation phase.
Categories of registers
Registers are normally measured by the number of bits they can hold, for example, an
"8-bit register" or a "32-bit register". Registers are now usually implemented as a
register file, but they have also been implemented using individual flip-flops, high speed
core memory, thin film memory, and other ways in various machines.
· Data registers are used to store integer numbers (see also Floating Point
Registers, below). In some older and simpler CPUs, a special data register is
the accumulator, used implicitly for many operations.
· Address registers hold memory addresses and are used to access memory. In
some CPUs, a special address register is an index register, although often these
hold numbers used to modify addresses rather than holding addresses.
· Conditional registers hold truth values often used to determine whether some
instruction should or should not be executed.
· General purpose registers (GPRs) can store both data and addresses, i.e., they
are combined Data/Address registers.
· Floating point registers (FPRs) are used to store floating point numbers in many
architectures.
· Constant registers hold read-only values (e.g., zero, one, pi, ...).
· Vector registers hold data for vector processing done by SIMD instructions
(Single Instruction, Multiple Data).
· Special purpose registers hold program state; they usually include the program
counter (aka instruction pointer), stack pointer, and status register (aka processor
status word).
· Index registers are used for modifying operand addresses during the run of
a program.
Some examples
The table below shows the number of registers of several mainstream processors:
Processor        Integer registers   Double FP registers
Pentium 4        8                   8
Athlon MP        8                   8
Opteron 240      16                  16
Itanium 2        128                 128
UltraSPARC IIIi  32                  32
Power 3          32                  32
Addressing modes, a concept from computer science, are an aspect of the instruction
set architecture in most central processing unit (CPU) designs. The various addressing
modes that are defined in a given instruction set architecture define how machine
language instructions in that architecture identify the operand (or operands) of each
instruction. An addressing mode specifies how to calculate the effective memory
address of an operand by using information held in registers and/or constants contained
within a machine instruction or elsewhere.
Caveats
Note that there is no generally accepted way of naming the various addressing modes.
In particular, different authors and/or computer manufacturers may give different
names to the same addressing mode, or the same names to different addressing
modes. Furthermore, an addressing mode which, in one given architecture, is treated
as a single addressing mode may represent functionality that, in another architecture, is
covered by two or more addressing modes. For example, some complex instruction set
computer (CISC) computer architectures, such as the Digital Equipment Corporation
(DEC) VAX, treat registers and literal/immediate constants as just another addressing
mode. Others, such as the IBM System/390 and most reduced instruction set computer
(RISC) designs, encode this information within the instruction code. Thus, the latter
machines have three distinct instruction codes for copying one register to another,
copying a literal constant into a register, and copying the contents of a memory location
into a register, while the VAX has only a single "MOV" instruction.
The addressing modes listed below are divided into code addressing and data
addressing. Most computer architectures maintain this distinction, but there are, or
have been, some architectures which allow (almost) all addressing modes to be used in
any context.
The instructions shown below are purely representative in order to illustrate the
addressing modes, and do not necessarily apply to any particular computer.
Useful side effect
Some computers have a Load effective address instruction. This performs a
calculation of the effective operand address, but instead of acting on that memory
location, it loads the address that would have been accessed into a register. This can be
useful when passing the address of an array element to a subroutine. It may also be a
slightly sneaky way of doing more calculation than normal in one instruction; for
example, use with the addressing mode 'base+index+offset' allows one to add two
registers and a constant together in one instruction.
Most RISC machines have only about five simple addressing modes, while CISC
machines such as the DEC VAX supermini have over a dozen addressing modes, some
of which are quite complicated. The IBM System/360 mainframe had only three
addressing modes; a few more have been added for the System/390.
When there are only a few addressing modes, the particular addressing mode required
is usually encoded within the instruction code (e.g. IBM System/390, most RISC). But
when there are lots of addressing modes, a specific field is often set aside in the
instruction to specify the addressing mode. The DEC VAX allowed multiple memory
operands for almost all instructions and so reserved the first few bits of each operand
specifier to indicate the addressing mode for that particular operand.
Absolute
+----+------------------------------+
|jump| address |
+----+------------------------------+
Effective address = address as given in instruction
Program relative
+------+-----+-----+----------------+
|jumpEQ| reg1| reg2| offset | jump relative if reg1=reg2
+------+-----+-----+----------------+
Effective address = offset plus address of next instruction.
This is particularly useful in connection with conditional jumps, because you usually only
want to jump to some nearby instruction (in a high-level language most if or while
statements are reasonably short). Measurements of actual programs suggest that an 8
or 10 bit offset is large enough for some 90% of conditional jumps.
Register indirect
+-------+-----+
|jumpVia| reg |
+-------+-----+
Effective address = contents of specified register.
The effect is to transfer control to the instruction whose address is in the specified
register. Such an instruction is often used for returning from a subroutine call, since the
actual call would usually have placed the return address in a register.
Register
+------+-----+-----+-----+
| mul | reg1| reg2| reg3| reg1 := reg2 * reg3;
+------+-----+-----+-----+
This 'addressing mode' does not have an effective address, and is not considered to be
an addressing mode on some computers.
In this example, all the operands are in registers, and the result is placed in a register.
Base plus offset
+------+-----+-----+----------------+
| load | reg | base|     offset     |
+------+-----+-----+----------------+
Effective address = offset plus contents of specified base register.
The offset is usually a signed 16-bit value (the 80386 famously expanded it to 32 bits;
x64 did not).
If the offset is zero, this becomes an example of register indirect addressing; the
effective address is just that in the base register.
On many RISC machines, register 0 is fixed with the value 0. If register 0 is used as the
base register, this becomes an example of absolute addressing. However, only a small
portion of memory can be accessed (the first 32 Kbytes and possibly the last 32 Kbytes).
The 16-bit offset may seem very small in relation to the size of current computer
memories (it could be worse: IBM System/360 mainframes only have a positive 12-bit
offset of 0 to 4095). However, the principle of locality of reference applies: over a short
time span, most of the data items you wish to access are fairly close to each other.
Example 1: Within a subroutine you will mainly be interested in the parameters and the
local variables, which will rarely exceed 64 Kbytes, for which one base register suffices.
If this routine is a class method in an object-oriented language, you will need a second
base register pointing at the attributes for the current object (this or self in some high
level languages).
Example 2: If the base register contains the address of a record or structure, the offset
can be used to select a field from that record (most records/structures are less than 32
Kbytes in size).
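As a loose software analogy of Example 2 (a Python sketch of ours, using the struct
module; the record layout is invented for illustration), a field is selected by adding a
fixed offset to the record's base:

import struct

# A "record" packed into raw bytes: a 4-byte int id followed by a 4-byte int age.
record = struct.pack("<ii", 1234, 42)

BASE = 0          # plays the role of the base register (start of the record)
AGE_OFFSET = 4    # fixed offset of the 'age' field, known at compile time

(age,) = struct.unpack_from("<i", record, BASE + AGE_OFFSET)
print(age)        # 42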
Immediate/literal
+------+-----+-----+----------------+
| add | reg1| reg2| constant | reg1 := reg2 + constant;
+------+-----+-----+----------------+
This 'addressing mode' does not have an effective address, and is not considered to be
an addressing mode on some computers.
Instead of using an operand from memory, the value of the operand is held within the
instruction itself. On the DEC VAX machine, the literal operand sizes could be 6, 8, 16,
or 32 bits long.
Other addressing modes for code and/or data
Absolute/Direct
+------+-----+--------------------------------------+
| load | reg | address |
+------+-----+--------------------------------------+
Effective address = address as given in instruction.
This requires space in an instruction for quite a large address. It is often available on
CISC machines which have variable length instructions.
Some RISC machines have a special Load Upper Literal instruction which places a 16-bit
constant in the top half of a register. An OR literal instruction can be used to insert a
16-bit constant in the lower half of that register, so that a full 32-bit address can then
be used via the register-indirect addressing mode, which itself is provided as
'base-plus-offset' with an offset of 0.
Indexed absolute
+------+-----+-----+--------------------------------+
| load | reg |index| 32-bit address |
+------+-----+-----+--------------------------------+
Effective address = address plus contents of specified index register.
This also requires space in an instruction for quite a large address. The address could be
the start of an array or vector, and the index could select the particular array element
required. The index register may need to have been scaled to allow for the size of each
array element.
Note that this is more or less the same as base-plus-offset addressing mode, except
that the offset in this case is large enough to address any memory location.
Base plus index
+------+-----+-----+-----+
| load | reg | base|index|
+------+-----+-----+-----+
Effective address = contents of specified base register plus contents of specified index
register.
The base register could contain the start address of an array or vector, and the index
could select the particular array element required. The index register may need to have
been scaled to allow for the size of each array element. This could be used for accessing
elements of an array passed as a parameter.
Base plus index plus offset
+------+-----+-----+-----+----------------+
| load | reg | base|index| 16-bit offset |
+------+-----+-----+-----+----------------+
Effective address = offset plus contents of specified base register plus contents of
specified index register.
The base register could contain the start address of an array or vector of records, the
index could select the particular record required, and the offset could select a field
within that record. The index register may need to have been scaled to allow for the
size of each record.
Scaled
+------+-----+-----+-----+
| load | reg | base|index|
+------+-----+-----+-----+
Effective address = contents of specified base register plus scaled contents of specified
index register.
The base register could contain the start address of an array or vector, and the index
could contain the number of the particular array element required.
This addressing mode dynamically scales the value in the index register to allow for the
size of each array element, e.g. if the array elements are double precision floating-point
numbers occupying 8 bytes each then the value in the index register is multiplied by 8
before being used in the effective address calculation. The scale factor is normally
restricted to being a power of two so that shifting rather than multiplication can be used
(shifting is usually faster than multiplication).
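The arithmetic behind these modes is simple. Here is a small Python sketch of ours (an
illustration, not any particular machine's definition) of the effective-address
calculations described above:

# Effective-address calculations for several addressing modes.
# 'regs' maps register names to their contents; all values are plain integers.
def ea_absolute(address):
    return address                              # address given in the instruction

def ea_base_plus_offset(regs, base, offset):
    return regs[base] + offset                  # register indirect when offset == 0

def ea_base_index_offset(regs, base, index, offset):
    return regs[base] + regs[index] + offset

def ea_scaled(regs, base, index, size):
    return regs[base] + regs[index] * size      # size is a power of two, e.g. 8

regs = {"R5": 0x1000, "R2": 3}
print(hex(ea_base_plus_offset(regs, "R5", 16)))   # 0x1010
print(hex(ea_scaled(regs, "R5", "R2", 8)))        # 0x1018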
Register indirect
+------+-----+-----+
| load | reg | base|
+------+-----+-----+
Effective address = contents of base register.
A few computers have this as a distinct addressing mode. Many computers just use
base plus offset with an offset value of 0.
Register autoincrement indirect
+------+-----+-----+
| load | reg | base|
+------+-----+-----+
Effective address = contents of base register.
After determining the effective address, the value in the base register is incremented by
the size of the data item that is to be accessed.
Within a loop, this addressing mode can be used to step through all the elements of an
array or vector. A stack can be implemented by using this in conjunction with the next
addressing mode (autodecrement).
In high-level languages it is often thought to be a good idea that functions which return a
result should not have side effects (lack of side effects makes program understanding and
validation much easier). This instruction has a side effect in that the base register is
altered. If the subsequent memory access causes a page fault then restarting the
instruction becomes much more problematical.
Register autodecrement indirect
+------+-----+-----+
| load | reg | base|
+------+-----+-----+
Before determining the effective address, the value in the base register is decremented
by the size of the data item that is to be accessed.
Within a loop, this addressing mode can be used to step backwards through all the
elements of an array or vector. A stack can be implemented by using this in conjunction
with the previous addressing mode (autoincrement).
See also the discussion on side-effects under the autoincrement addressing mode.
Memory indirect
Any of the addressing modes mentioned in this article could have an extra bit to
indicate indirect addressing, i.e. the address calculated by using some addressing mode
is the address of a location (often 32 bits or a complete word) which contains the actual
effective address.
Indirect addressing may be used for code and/or data. It can make implementation of
pointers or references very much easier, and can also make it easier to call subroutines
which are not otherwise addressable. There is a performance penalty due to the extra
memory access involved.
Some early minicomputers (e.g. DEC PDP8, Data General Nova) had only a few
registers and only a limited addressing range (8 bits). Hence the use of memory indirect
addressing was almost the only way of referring to any significant amount of memory.
PC-based addressing
The x86-64 architecture supports RIP-based addressing, which uses the 64-bit program
counter (instruction pointer) RIP as a base register. This allows for position-independent
code.
The DEC PDP-10 computer with 18-bit addresses and 36-bit words allowed multi-level
indirect addressing with the possibility of using an index register at each stage as well.
Memory-mapped registers
On some computers the registers were regarded as occupying the first 8 or 16 words of
memory (e.g. ICL 1900, DEC PDP-10). This meant that there was no need for a
separate 'Add register to register' instruction - you could just use the 'Add memory to
register' instruction.
In the case of early models of the PDP-10, which did not have any cache memory, you
could actually load a tight inner loop into the first few words of memory (the fast
registers in fact), and have it run much faster than if it was in magnetic core memory.
Later models of the DEC PDP-11 series mapped the registers onto addresses in the
input/output area, but this was primarily intended to allow remote diagnostics.
Confusingly, the 16-bit registers were mapped onto consecutive 8-bit byte addresses.
Zero page
In the MOS Technology 6502 the first 256 bytes of memory could be accessed very
rapidly. The reason was that the 6502 had very few registers other than special-function
registers. Zero-page access used an 8-bit address, saving one clock cycle compared
with a 16-bit address. An operating system would use much of zero page, so it was not
as useful as it might have seemed.
Another variation uses vector descriptors to hold the bounds; this makes it easy to
implement dynamically allocated arrays and still have full bounds checking.
Instructions existed to load and store bytes via this descriptor, and to increment the
descriptor to point at the next byte (bytes were not split across word boundaries). Much
DEC software used five 7-bit bytes per word (plain ASCII characters), with 1 bit unused
per word. Implementations of C had to use four 9-bit bytes per word, since C assumes
that you can access every bit of memory by accessing consecutive bytes.
Lesson IX
Main memory is as vital as the processor chip to a computer system. Fast systems have
both a fast processor and a large memory. Here is a list of some characteristics of
computer memory. Some characteristics are true for both kinds of memory; others are
true for just one.
Here is a table that summarizes the characteristics of the two types of computer
memory:

Characteristic   Main memory   Secondary memory
Fast access          X
Slow access                          X

Bit
In both main and secondary memory, information is stored as patterns of bits. Recall
from chapter two what a bit is:
A bit is a single "on"/"off" value. Only these two values are possible.
The two values may go by different names, such as "true"/"false", or "1"/"0". There are
many ways in which a bit can be implemented. For example a bit could be implemented
as:
· Voltage on a wire.
Copied Information
Information stored in binary form does not change when it is copied from one medium
(storage method) to another. And an unlimited number of such copies can be made
(remember the advantages of binary.) This is a very powerful combination. You may be
so accustomed to this that it seems commonplace. But when you (say) download an
image from the Internet, the data has been copied many dozens of times, using a
variety of storage and transmission methods.
It is likely, for example, that the data starts out on magnetic disk and is then copied to
main storage of the web site's computer (involving a voltage signal in between.) From
main storage it is copied (again with a voltage signal in between) to a network interface
card, which temporarily holds it in many transistors. From there it is sent as an
electrical signal down a cable. Along the route to your computer, there may be dozens
of computers that transform data from an electrical signal, into main memory transistor
form, and then back to an electrical signal on another cable. Your data may even be
transformed into a radio signal, sent to a satellite (with its own computers), and sent
back to earth as another radio signal. Eventually the data ends up in your video card
(transistors), which transforms it into a picture on your monitor.
Byte
Name       Number of Bytes        Power of 2
byte       1                      2^0
kilobyte   1,024                  2^10
megabyte   1,048,576              2^20
gigabyte   1,073,741,824          2^30
terabyte   1,099,511,627,776      2^40
One bit of information is so little that usually computer memory is organized into groups
of eight bits. Each eight bit group is called a byte. When more than eight bits are
required for some data, a whole number of bytes are used. One byte is about enough
memory to hold a single character.
Often very much more than eight bits are required for data, and thousands, millions, or
even billions of bytes are needed. These amounts have names, as seen in the table. If
you expect computers to be your career, it would be a good idea to become very
familiar with this table. (Except that the only number you should remember from the
middle column is that a kilobyte is 1024 bytes.) Often a kilobyte is called a "K" and a
megabyte is called a "Meg."
The previous table listed the number of bytes, not bits. So one K of memory is 1024
bytes, or 1024*8 == 8,192 bits. Usually one is not particularly interested in the exact
number of bits. It will be very useful in your future career to be sure you know how to
multiply powers of two.
2^M * 2^N = 2^(M+N)
For example, 2^10 * 2^10 = 2^20, so 1024 kilobytes make exactly one megabyte.
Locations in a digital image are specified by a row number and a column number (both
of them integers). A particular digital image is 1024 rows by 1024 columns, and each
location holds one byte. How many megabytes are in that image?
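A worked answer (a quick Python check of the arithmetic):

# 1024 rows x 1024 columns, one byte per location:
rows, cols = 1024, 1024          # 2**10 each
size_bytes = rows * cols         # 2**10 * 2**10 = 2**20 bytes
print(size_bytes)                # 1048576
print(size_bytes / 2**20)        # 1.0 -- exactly one megabyte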
Each byte contains a pattern of eight bits. When the computer's power is on, every byte
contains some pattern or other, even those bytes not being used for anything.
(Remember the nature of binary: when a binary device is working it is either "on" or
"off", never in between.)
The address of a byte is not part of its contents. When the processor needs to access
the byte at a particular address, the electronics of the computer "knows how" to find
that byte in memory.
Main memory (like all computer memory) stores bit patterns. That is, each memory
location consists of eight bits, and each bit is either "0" or "1". For example, the picture
shows the first few bytes of memory.
The processor has written a byte of data at location 7. The old contents of that location
are lost. Main memory now looks like the picture.
When a program is running, it has a section of memory for the data it is using.
Locations in that section can be changed as many times as the program needs. For
example, if a program is adding up a list of numbers, the sum will be kept in main
memory (probably using several bytes.) As new numbers are added to the sum, it will
change and main memory will have to be changed, too.
Other sections of main memory might not change at all while a program is running. For
example, the instructions that make up a program do not (usually) change as a
program is running. The instructions of a running program are located in main memory,
so those locations will not change.
When you write a program in Java (or most other languages) you do not need to keep
track of memory locations and their contents. Part of the purpose of a programming
language is to do these things automatically.
Files are kept in secondary storage (usually the computer system's hard disk).
Hard Disks
Usually the component called the "hard disk" of a computer system contains many
individual disks and read/write heads like the above. The disks are coated with
magnetic material on both sides (so each disk gets two read/write heads) and the disks
are all attached to one spindle. All the disks and heads are sealed into a dust-free metal
can. Since the operation of a hard disk involves mechanical motion (which is much
slower than electronic processes), reading and writing data is much slower than with
main memory.
Files
Hard disks (and other secondary memory devices) are used for long-term storage of
large blocks of information, such as programs and data sets. Usually disk memory is
organized into files.
A file is a collection of information that has been given a name and is stored in secondary
memory. The information can be a program or can be data.
The form of the information in a file is the same as with any digital information---it
consists of bits, usually grouped into eight bit bytes. Files are frequently quite large;
their size is measured in kilobytes or megabytes.
If you have never worked with files on a computer before you should study the
documentation that came with your operating system, or look at a book such as
Windows NT for Dummies (or whatever is appropriate for your computer.)
One of the jobs of a computer's operating system is to keep track of file names and
where they are on its hard disk. For example, in DOS the user can ask to run the
program DOOM like this:
C:\> DOOM.EXE
The "C:\>" is a prompt; the user typed in "DOOM.EXE". The operating system now has
to find the file called DOOM.EXE somewhere on its hard disk. The program will be
copied into main storage and will start running. As the program runs it asks for
information stored as additional files on the hard disk, which the operating system has
to find and copy into main memory. A program's own data is usually kept in a file in
secondary storage as well. If the file does not already exist, the program will ask the
operating system to create it.
Usually all collections of data outside of main storage are organized into files. The job of
keeping all this information straight is the job of the operating system. If the computer
system is part of a network, keeping straight all the files on all the computers can be
quite a task, and is the collective job of all the operating systems involved.
Application programs (including programs that you might write) do not directly read,
write, create, or delete files. Since the operating system has to keep track of
everything, all other programs ask it to do file manipulation tasks. For example, say
that a program has just calculated a set of numbers and needs to save them. The
following might be how it does this:
1. Program: asks the operating system to create a file with a name RESULTS.DAT
2. Operating System: gets the request; finds an unused section of the disk and
creates an empty file. The program is told when this has been completed.
3. Program: asks the operating system to save the numbers in the file.
4. Operating System: gets the numbers from the program's main memory, writes
them to the file. The program is told when this has been completed.
Modern programs are written to ask the operating system for services in this way, and
to have alternatives ready when a request is refused. Older programs were not
written this way, and do not run well on modern computers.
In modern computer systems, only the operating system can directly do anything with
disk files. How does this help?
· Program creation is easier because the work of dealing with files is done by
the operating system.
Types of Files
As far as the hard disk is concerned, all files are the same. At the electronic level, there
is no difference between a file containing a program and a file containing data. All files
are named collections of bytes. Of course, what the files are used for is different. The
operating system can take a program file, copy it into main memory, and start it
running. The operating system can take a data file, and supply its information to a
running program when it asks.
Often the last part of a file's name (the extension) shows what the file is expected to
be used for. For example, in "mydata.txt" the ".txt" means that the file is expected to
be used as a collection of text, that is, characters. With "Netscape.exe" the ".exe"
means that the file is an "executable," that is, a program that is ready to run. With
"program1.java" the ".java" means that the file is a source program in the language
java (there will be more about source programs later on in these notes.) To the hard
disk, each of these files is the same sort of thing: a collection of bytes.
Physical Address Extension
The processor hardware is augmented with additional address lines used to select the
additional memory, and 36-bit page tables, but regular application software continues
to use instructions with 32-bit addresses and a flat memory model limited to 4
gigabytes. The operating system uses PAE to map this 32-bit address space onto the 64
gigabytes of total memory, and the map can be and usually is different for each
process. In this way the extra memory is useful even though regular applications cannot
access it all simultaneously.
For application software which needs access to more than 4 gigabytes of memory some
special mechanism may be provided by the operating system in addition to the regular
PAE support. On Microsoft Windows this mechanism is called Address Windowing
Extensions (AWE), while on Unix systems a variety of tricks are used, such as using
mmap() to map regions of a file into and out of the address space as needed, none
having been blessed as a standard.
Enabling PAE (by setting bit 5, PAE, of the system control register CR4) causes major
changes to this scheme. By default, the size of each page remains as 4K. Each entry in
the page table and page directory is extended to 64 bits (8 bytes) rather than 32 to
allow for additional address bits; the table size does not change, however, so each table
now has only 512 entries. Because this allows only a quarter as many entries as the
original scheme, an extra level of hierarchy must be added, so CR3 now points to the
Page Directory Pointer Table, a short table which contains pointers to 4 page
directories.
Additionally, the entries in the page directory have an additional flag, named 'PS' (for
Page Size). If this bit (bit 7) is set to 1, the page directory entry does not point to a
page table, but a single large page (2MB in length).
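Putting the numbers together: with PAE enabled and 4K pages, a 32-bit linear address
splits into a 2-bit Page Directory Pointer Table index, two 9-bit table indexes (512
entries each), and a 12-bit page offset. A small Python sketch of that split (the function
and field names are ours):

# Split a 32-bit linear address under PAE with 4K pages:
# 2 bits PDPT index | 9 bits page-directory index | 9 bits page-table index | 12 bits offset
def split_pae(addr):
    offset     = addr & 0xFFF           # bits 0-11
    pt_index   = (addr >> 12) & 0x1FF   # bits 12-20 (512 entries)
    pd_index   = (addr >> 21) & 0x1FF   # bits 21-29 (512 entries)
    pdpt_index = (addr >> 30) & 0x3     # bits 30-31 (4 page directories)
    return pdpt_index, pd_index, pt_index, offset

print(split_pae(0xC0ABCDEF))            # (3, 5, 188, 3567)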
Lesson X
Claims
I claim:
1. A circuit comprising: a plurality of cascaded stages, each stage being capable of being
placed in one of a set state or a reset state, wherein the set state of a particular stage is
established by using a majority of charge supplied from an immediately preceding stage,
and the reset state of the particular stage is established by using a minority of charge
supplied directly from a subsequent stage.
2. A circuit as in claim 1 wherein each stage comprises at least one PMOS transistor and
at least one NMOS transistor.
5. A circuit as in claim 2 wherein the PMOS and NMOS transistors are interconnected to
form an inverter.
6. A circuit as in claim 5 wherein the PMOS and NMOS transistors include gates commonly
connected.
7. A circuit as in claim 1 wherein the circuit is coupled between a first and a second
potential source, and a selected number of the stages each comprises: an input node; an
output node; a first transistor connected to the input node, the output node, and to one of
the first and second potential sources, for connecting the output node to one of the first and
second potential sources in response to a first type signal on the input node; and a second
transistor connected to the output node, to the other of the first and second potential
sources, and to the following stage, for connecting the output node to the other of the
first and second potential sources in delayed response to the earlier first type signal on
the input node.
8. A circuit as in claim 7 wherein for each of the selected number of stages: the first
transistor has a gate connected to the input node, a source connected to the second
potential source, and a drain connected to the output node; and the second transistor has
a gate connected to an output node of the following stage, a drain connected to the
output node, and a source connected to the first potential source.
9. A circuit as in claim 8 wherein for each of the selected number of stages: the first
transistor comprises an NMOS transistor; and the second transistor comprises a PMOS
transistor.
10. A circuit as in claim 9 wherein for each of the selected number of stages, each stage
further comprises: a third transistor connected to the output node, to the other of the first
and second potential sources and to the input node, for connecting the output node to the
other of the first and second potential sources in response to a second type signal on the input
node.
11. A circuit as in claim 10 wherein for each of the selected number of stages, each stage
further comprises: the third transistor has a source connected to the other of the first and
second potential sources, a drain connected to the output node, and a gate connected to
the input node.
12. A circuit as in claim 1 wherein the input node of the particular stage is connected to
the output node of the stage immediately preceding the particular stage.
13. A circuit as in claim 12 wherein the subsequent stage is an even number of stages
after the particular stage.
15. A CMOS circuit as in claim 14 wherein: each stage includes an input node and an
output node, the input node of a stage being connected to the output node of a preceding
stage; wherein the input node of an odd-numbered stage is connected to control the
PMOS transistor, the input node of an even-numbered stage is connected to control the
NMOS transistor; and wherein the NMOS transistor in an odd-numbered stage is
connected to be controlled by the output node of a subsequent stage, and the PMOS
transistor in an even-numbered stage is controlled by the output node of a subsequent
stage.
16. A circuit as in claim 15 wherein: the stages are capable of being placed in a first logic
state or a second logic state; the NMOS transistor in the odd-numbered stages is coupled
to a subsequent stage and controls the first logic state of the odd-numbered stages; and
the PMOS transistor in the even-numbered stages is coupled to a subsequent stage and
controls the first logic state of the even-numbered stages.
17. A circuit as in claim 16 wherein: the NMOS transistor in the even-numbered stages is
coupled to an immediately preceding stage and controls the second logic state of the
even-numbered stages; the PMOS transistor in the odd-numbered stages is coupled to an
immediately preceding stage and controls the second logic state of the odd-numbered
stages.
18. A circuit as in claim 14 coupled between a lower potential and an upper potential
wherein each stage comprises: an input node; an output node; a PMOS transistor having
a gate connected to the input node, a source connected to the upper potential, and a
drain connected to the output node; and an NMOS transistor having a gate connected to
the input node, a source connected to the lower potential, and a drain connected to the
output node.
19. A circuit comprising: a first stage; a plurality of cascaded stages; a last stage; each
cascaded stage including set means and reset means; the set means for each particular
one of the cascaded stages being coupled to and driven by a previous stage, and the
reset means for each cascaded stage being coupled to and driven by a subsequent stage,
wherein virtually all of the power available during the switching of the previous stage is
available for driving the set means for the particular cascaded stage, thereby increasing
the switching speed of the set means of the particular cascaded stage; and wherein a
minor portion of the power available during the switching of the subsequent stage is used
for driving the reset means of the particular cascaded stage, thereby accomplishing the
reset of the particular cascaded stage without significantly altering the switching speed of
the subsequent stage.
20. A logic circuit comprising: a first node for receiving an input signal having energy; a
second node for supplying an output signal; a plurality of cascaded stages each having a
control input node, a reset input node, and an output node, the control input node of a
first stage of the plurality being connected to the first node, the output node of a last
stage of the plurality being connected to the second node; each said stage being capable
of assuming a set state and a reset state, the set state for each stage being controlled by
a signal on its control input node which for all stages except the first stage is coupled to
the output node of an earlier stage, whereby most of the energy available from the input
signal for the first cascaded stage is used for setting that cascaded stage and most of the
energy available from each subsequent stage is used for setting the next stage thereafter;
and the reset state for each stage being controlled by a signal on the reset input node
supplied from an output node of a subsequent stage, whereby energy to reset each
particular cascaded stage comes from a subsequent stage.
21. A logic circuit as in claim 20 wherein the set state and the reset state for each
cascaded stage are controlled by logic switches whose conduction depends upon the state
of the control
input node for that stage.
22. A circuit as in claim 21 wherein the logic switches for each cascaded stage connected
between the output node of that stage and a most positive potential source comprise
PMOS transistors.
23. A circuit as in claim 21 wherein the logic switches for each cascaded stage connected
between the output node for that stage and a most negative potential source comprise
NMOS transistors.
24. A logic circuit as in claim 20 wherein the subsequent stage is an even number of
stages following the particular stage.
26. A circuit for providing control signals to other circuits comprising: a plurality of
serially-connected stages, each capable of being placed in a set state or a reset state;
wherein a majority of charge to switch a stage to a set state comes from a prior stage
and a majority of charge to switch a stage to a reset state comes directly from a later
stage.
27. A method of increasing the speed of operation of a CMOS circuit having multiple
serially-connected stages comprising: providing a pulse having charge at an input node to
a selected stage; using a majority of the charge of the pulse to place the selected stage in
an active state; propagating the active state of the selected stage to later stages to
thereby also place them in an active state; and using an output signal from one of the
later stages connected directly to the selected stage to place the selected stage in a reset
state to await arrival of another pulse.
The Hard-Wired Control Unit
The ring counter provides a sequence of six consecutive active signals that cycle
continuously. Synchronized by the system clock, the ring counter first activates its T0
line, then its T1 line, and so forth. After T5 is active, the sequence begins again with T0.
Figure 3 shows how the ring counter might be organized internally.
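In software terms, the ring counter is a cyclic one-hot sequence. A small Python sketch
of ours showing the T0-T5 cycling (including the wraparound back to T0):

from itertools import cycle, islice

# One-hot ring counter: exactly one of T0..T5 is active on each clock tick.
ring = cycle(range(6))
for tick, active in enumerate(islice(ring, 8)):
    lines = ["1" if i == active else "0" for i in range(6)]
    print(f"tick {tick}: T{active} active  {' '.join(lines)}")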
The instruction decoder takes its four-bit input from
the op-code field of the instruction register and
activates one and only one of its 8 output lines.
Each line corresponds to one of the instructions in
the computer's instruction set. Figure 4 shows the
internal organization of this decoder.
IP = T2
W = T5*STA
LP = T3*JMP + T3*NF*JN
LD = T4*STA
LA = T5*LDA + T4*ADD + T4*SUB
EA = T4*STA + T3*MBA
EP = T0
S = T3*SUB
A = T3*ADD
LI = T2
LM = T0 + T3*LDA + T3*STA
ED = T2 + T5*LDA
R = T1 + T4*LDA
EU = T4*ADD + T4*SUB
EI = T3*LDA + T3*STA + T3*JMP + T3*NF*JN
LB = T3*MBA
To understand how this diagram was obtained, we must look carefully at the machine's
instruction set (Table 1).
Mnemonic (meaning)      Op-code  Operation        Micro-operations        Step  Active signals
(STA, continued)                                  3. RAM(MAR) <-- MDR     T5    W
ADD (Add B to ACC)      3        ACC <-- ACC + B  1. ALU <-- ACC + B      T3    A
                                                  2. ACC <-- ALU          T4    EU, LA
SUB (Sub. B from ACC)   4        ACC <-- ACC - B  1. ALU <-- ACC - B      T3    S
                                                  2. ACC <-- ALU          T4    EU, LA
MBA (Move ACC to B)     5        B <-- ACC        1. B <-- A              T3    EA, LB
JMP (Jump to Address)   6        PC <-- RAM       1. PC <-- IR            T3    EI, LP
JN (Jump if Negative)   7        PC <-- RAM,      1. PC <-- IR,           T3    NF: EI, LP
                                 if negative      if NF set
                                 flag is set
Table 2 shows which control signals must be active at each ring counter pulse for each of
the instructions in the computer's instruction set (and for the instruction fetch operation).
The table was prepared by simply writing down the instructions in the left-hand column.
(In the circuit these will be the output lines from the decoder). The various control
signals are placed horizontally along the top of the table. Entries into the table consist of
the moments (ring counter pulses T0, T1, T2, T3, T4, or T5) at which each control signal
must be active in order to have the instruction executed. This table is prepared very
easily by reading off the information for each instruction given in Table 1. For example,
the Fetch operation has the EP and LM control signals active at ring pulse T0, R active at
T1, and ED, LI, and IP active at T2. Therefore the first row (Fetch) of Table 2 has T0
entered below EP and LM, T1 below R, and T2 below IP, ED, and LI.
Control Signal:  IP    LP     EP    LM    R     W     LD    ED    LI    EI     LA    EA    A     S     EU    LB
Instruction:
---------------------------------------------------------------------------------------------------------------
"Fetch"          T2           T0    T0    T1                T2    T2
LDA                                 T3    T4                T5          T3     T5
STA                                 T3          T5    T4                T3           T4
MBA                                                                                  T3                      T3
ADD                                                                     T4                  T3          T4
SUB                                                                     T4                        T3    T4
JMP                    T3                                               T3
JN                     T3*NF                                            T3*NF
Once Table 2 has been prepared, the logic required for each control signal is easily
obtained. For each an AND operation is performed between any active ring counter (Ti)
signals that were entered into the signal's column and the corresponding instruction
contained in the far left-hand column. If a column has more than one entry, the output
of the ANDs are ORed together to produce the final control signal. For example, the LM
column has the following entries: T0 (Fetch), T3 associated with the LDA instruction,
and T3 associated with the STA instruction. Therefore, the logic for this signal is:
LM = T0 + T3*LDA + T3*STA
This means that control signal LM will be activated whenever any of the following
conditions is satisfied: (1) ring pulse T0 (first step of an instruction fetch) is active, or
(2) an LDA instruction is in the IR and the ring counter is issuing pulse 3, or (3) an
STA instruction is in the IR and the ring counter is issuing pulse 3.
The entries in the JN (Jump Negative) row of this table require some further
explanation. The LP and EI signals are active during T3 for this instruction if and only if
the accumulator's negative flag has been set. Therefore the entries that appear above
these signals for the JN instruction are T3*NF, meaning that the state of the negative
flag must be ANDed in for the LP and EI control signals.
Figure 6 gives the logical equations required for each of the control signals used on our
machine. These equations have been read from Table 2, as explained above. The circuit
diagram of the control matrix (Figure 5) is constructed directly from these equations.
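In software terms the control matrix is just this AND-OR logic. A Python sketch of ours,
covering a few of the signals from Figure 6, evaluates each control signal from the
active ring pulse and the decoded instruction:

# Evaluate a few control signals from the ring pulse and decoded instruction,
# following the equations of Figure 6 (T is the active ring pulse, 0-5;
# instr is the decoded mnemonic; nf is the accumulator's negative flag).
def control_signals(T, instr, nf=False):
    return {
        "LM": T == 0 or (T == 3 and instr in ("LDA", "STA")),
        "EP": T == 0,
        "IP": T == 2,
        "LP": T == 3 and (instr == "JMP" or (instr == "JN" and nf)),
        "W":  T == 5 and instr == "STA",
        "A":  T == 3 and instr == "ADD",
    }

print(control_signals(3, "LDA"))   # LM is True, everything else False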
It should be noticed that the HLT line from the instruction decoder does not enter the
control matrix. Instead, this signal goes directly to circuitry (not shown) that will stop
the clock and thus terminate execution.
Figure 6. The logical equations required for each of the hardwired control
signals on the basic computer. The machine's control matrix is designed from
these equations.
RISC?
RISC, or Reduced Instruction Set Computer, is a type of microprocessor architecture that
utilizes a small, highly optimized set of instructions, rather than a more specialized set of
instructions often found in other types of architectures.
History
The first RISC projects came from IBM, Stanford, and UC-Berkeley in the late 70s and
early 80s. The IBM 801, Stanford MIPS, and Berkeley RISC 1 and 2 were all designed
with a similar philosophy which has become known as RISC. Certain design features
have been characteristic of most RISC processors:
· one cycle execution time: RISC processors have a CPI (clock cycles per instruction) of one
cycle. This is due to the optimization of each instruction on the CPU and a technique
called pipelining;
MIPS Architecture
The Stanford research group had a strong background in compilers, which led them to
develop a processor whose architecture would represent the lowering of the compiler to
the hardware level, as opposed to the raising of hardware to the software level, which
had been a long running design philosophy in the hardware industry.
Thus, the MIPS processor implemented a smaller, simpler instruction set. Each of the
instructions included in the chip design ran in a single clock cycle. The processor used a
technique called pipelining to more efficiently process instructions.
MIPS used 32 registers, each 32 bits wide (a bit pattern of this size is referred to as a
word).
Instruction Set
The MIPS instruction set consists of about 111 total instructions, each represented in 32
bits. An example of a MIPS instruction is below:
Above is the assembly (left) and binary (right) representation of a MIPS addition
instruction. The instruction tells the processor to compute the sum of the values in
registers 7 and 8 and store the result in register 12. The dollar signs are used to
indicate an operation on a register. The colored binary representation on the right
illustrates the 6 fields of a MIPS instruction. The processor identifies the type of
instruction by the binary digits in the first and last fields. In this case, the processor
recognizes that this instruction is an addition from the zero in its first field and the 20
(hexadecimal 0x20, binary 100000) in its last field.
The operands are represented in the blue and yellow fields, and the desired result
location is presented in the fourth (purple) field. The orange field represents the shift
amount, something that is not used in an addition operation.
The instruction set includes, among others:
· 25 branch/jump instructions
· 15 load instructions
· 10 store instructions
· 8 move instructions
· 4 miscellaneous instructions
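For concreteness, here is a small Python sketch of ours of the standard MIPS R-type
encoding of the addition described above (opcode 0, rs=7, rt=8, rd=12, shamt=0,
funct=0x20):

# Encode the MIPS R-type instruction "add $12, $7, $8":
# 6-bit opcode | 5-bit rs | 5-bit rt | 5-bit rd | 5-bit shamt | 6-bit funct
def encode_rtype(rs, rt, rd, shamt, funct, opcode=0):
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

word = encode_rtype(rs=7, rt=8, rd=12, shamt=0, funct=0x20)
print(f"{word:032b}")   # 00000000111010000110000000100000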
MIPS Today
MIPS Computer Systems, Inc. was founded in 1984 upon the Stanford research from
which the first MIPS chip resulted. The company was purchased by Silicon Graphics,
Inc. in 1992, and was spun off as MIPS Technologies, Inc. in 1998. Today, MIPS powers
many consumer electronics and other devices.
How Pipelining Works
Pipelining, a standard feature in RISC processors, is much like an assembly line. Because
the processor works on different steps of the instruction at the same time, more
instructions can be executed in a shorter period of time.
A useful method of demonstrating this is the laundry analogy. Let's say that there are
four loads of dirty laundry that need to be washed, dried, and folded. We could put the
first load in the washer for 30 minutes, dry it for 40 minutes, and then take 20
minutes to fold the clothes. Then pick up the second load and wash, dry, and fold it, and
repeat for the third and fourth loads. Supposing we started at 6 PM and worked as
efficiently as possible, we would still be doing laundry until midnight.
However, a smarter approach to the problem would be to put the second load of dirty
laundry into the washer after the first was already clean and whirling happily in the dryer.
Then, while the first load was being folded, the second load would dry, and a third load
could be added to the pipeline of laundry. Using this method, the laundry would be
finished by 9:30.
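To make the arithmetic concrete, here is a small Python sketch (illustrative only, not from the original text) that replays the stage times from the analogy for both schedules:

# Sequential vs. pipelined laundry, using the stage times from the analogy.
WASH, DRY, FOLD = 30, 40, 20   # minutes per stage
LOADS = 4

# Sequential: each load finishes completely before the next begins.
sequential = LOADS * (WASH + DRY + FOLD)

# Pipelined: each machine starts a load as soon as it is free and the load
# has finished the previous stage.
washer_free = dryer_free = folder_free = 0
for _ in range(LOADS):
    washer_free = washer_free + WASH
    dryer_free = max(washer_free, dryer_free) + DRY
    folder_free = max(dryer_free, folder_free) + FOLD

print(sequential)    # 360 minutes: 6 PM to midnight
print(folder_free)   # 210 minutes: 6 PM to 9:30 PM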
RISC Pipelines
A RISC processor pipeline operates in much the same way, although the stages in the
pipeline are different. While different processors have different numbers of steps, they
are basically variations of these five, used in the MIPS R3000 processor:
1. fetch the instruction from memory
2. read the registers and decode the instruction
3. execute the instruction or calculate an address
4. access an operand in data memory
5. write the result into a register
If you glance back at the laundry example, you'll notice that although the washer
finishes in half an hour, the dryer takes an extra ten minutes, and thus the wet
clothes must wait ten minutes for the dryer to free up. Thus, the rate of the pipeline
is set by its longest stage. Because RISC instructions are simpler than those used in
pre-RISC processors (now called CISC, or Complex Instruction Set Computer), they are
more conducive to pipelining. While CISC instructions vary in length, RISC instructions
are all the same length and can be fetched in a single operation. Ideally, each of the
stages in a RISC processor pipeline should take one clock cycle, so that the processor
finishes an instruction each clock cycle and averages one clock per instruction (CPI).
Pipeline Problems
In practice, however, RISC processors often operate at more than one cycle per instruction.
The processor might occasionally stall as a result of data dependencies and branch
instructions.
For example, suppose an add is immediately followed by an instruction that needs its result:
add $r3, $r2, $r1
sub $r5, $r3, $r4
The subtraction cannot begin until the new value of $r3 is available, so the pipeline stalls.
MIPS' solution to this problem is code reordering. If, as in the example above, the
following instructions have nothing to do with the first two, the code could be
rearranged so that those instructions are executed in between the two dependent
instructions and the pipeline could flow efficiently. The task of code reordering is
generally left to the compiler, which recognizes data dependencies and attempts to
minimize performance stalls.
Branch instructions are those that tell the processor to make a decision about which
instruction to execute next, based on the result of another instruction.
Branch instructions can be troublesome in a pipeline if a branch is conditional on the
results of an instruction which has not yet finished its path through the pipeline.
For example:
Loop : add $r3, $r2, $r1
sub $r6, $r5, $r4
beq $r3, $r6, Loop
The example above instructs the processor to add r1 and r2 and put the result in r3,
then subtract r4 from r5, storing the difference in r6. In the third instruction, beq
stands for branch if equal. If the contents of r3 and r6 are equal, the processor should
execute the instruction labeled "Loop." Otherwise, it should continue to the next
instruction. In this example, the processor cannot make a decision about which branch
to take because neither the value of r3 nor that of r6 has been written into the registers yet.
The processor could stall, but a more sophisticated method of dealing with branch
instructions is branch prediction. The processor makes a guess about which path to take;
if the guess is wrong, anything written into the registers must be cleared, and the
pipeline must be started again with the correct instruction. Some methods of branch
prediction depend on stereotypical behavior. Branches pointing backward are taken
about 90% of the time, since backward-pointing branches are often found at the bottom
of loops. On the other hand, branches pointing forward are only taken approximately
50% of the time. Thus, it would be logical for processors to always follow the branch
when it points backward, but not when it points forward. Other methods of branch
prediction are less static: processors that use dynamic prediction keep a history for
each branch and use it to predict future branches. These processors are correct in their
predictions about 90% of the time.
Still other processors forgo the entire branch prediction ordeal. The RISC System/6000
fetches and starts decoding instructions from both sides of the branch. When it
determines which branch should be followed, it then sends the correct instructions down
the pipeline to be executed.
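The static and dynamic prediction schemes described above can be sketched in a few lines of Python (an illustration, not code from any of the processors mentioned; the 2-bit saturating counter shown is one common way to keep a per-branch history):

# Static rule: predict taken if the branch points backward (typical of loops),
# not taken if it points forward.
def static_predict(branch_pc, target_pc):
    return target_pc < branch_pc

# Dynamic rule: a 2-bit saturating counter per branch address.
# States 0-1 predict not taken; states 2-3 predict taken.
class TwoBitPredictor:
    def __init__(self):
        self.counters = {}                     # branch address -> state

    def predict(self, branch_pc):
        return self.counters.get(branch_pc, 1) >= 2

    def update(self, branch_pc, taken):
        c = self.counters.get(branch_pc, 1)
        self.counters[branch_pc] = min(3, c + 1) if taken else max(0, c - 1)

# A loop branch that is taken nine times and then falls through.
predictor = TwoBitPredictor()
outcomes = [True] * 9 + [False]
correct = 0
for taken in outcomes:
    if predictor.predict(100) == taken:        # 100 = the branch's address
        correct += 1
    predictor.update(100, taken)
print(correct, "of", len(outcomes))            # 8 of 10 predicted correctly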
Pipelining Developments
In order to make processors even faster, various methods of optimizing pipelines have
been devised.
Super pipelining refers to dividing the pipeline into more steps. The more pipe stages
there are, the faster the pipeline is because each stage is then shorter. Ideally, a
pipeline with five stages should be five times faster than a non-pipelined processor (or
rather, a pipeline with one stage). The instructions are executed at the speed at which
each stage is completed, and each stage takes one fifth of the amount of time that the
non-pipelined instruction takes. Thus, a processor with an 8-step pipeline (the MIPS
R4000) will be even faster than its 5-step counterpart. The MIPS R4000 chops its
pipeline into more pieces by dividing some steps into two. Instruction fetching, for
example, is now done in two stages rather than one. The stages are as shown:
1. Instruction Fetch (first half)
2. Instruction Fetch (second half)
3. Register Fetch
4. Instruction Execute
5. Data Cache Access (first half)
6. Data Cache Access (second half)
7. Tag Check
8. Write Back
Dynamic pipelines have the capability to schedule around stalls. A dynamic pipeline is
divided into three units: the instruction fetch and decode unit, five to ten execute or
functional units, and a commit unit. Each execute unit has reservation stations, which
act as buffers and hold the operands and operations.
While the functional units have the freedom to execute out of order, the instruction
fetch/decode and commit units must operate in-order to maintain simple pipeline
behavior. When the instruction is executed and the result is calculated, the commit unit
decides when it is safe to store the result. If a stall occurs, the processor can schedule
other instructions to be executed until the stall is resolved. This, coupled with the
efficiency of multiple units executing instructions simultaneously, makes a dynamic
pipeline an attractive alternative.
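A toy Python sketch of this structure (illustrative only; real dynamic pipelines are far more involved): instructions wait in reservation stations until their operands are ready, may execute out of order, and are committed strictly in program order:

# In-order fetch, out-of-order execute, in-order commit.
regs = {"r1": 1, "r2": 2, "r4": 40, "r5": 50}
program = [
    ("add", "r3", ("r1", "r2")),   # r3 = r1 + r2
    ("sub", "r6", ("r3", "r4")),   # depends on r3, so it must wait
    ("add", "r7", ("r4", "r5")),   # independent, so it may execute early
]
ops = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b}

stations = list(enumerate(program))  # reservation stations, in program order
done = {}                            # index -> result (filled out of order)
committed = 0                        # next instruction allowed to commit

while committed < len(program):
    # Execute any instruction whose source registers already hold values.
    for i, (op, dest, (s1, s2)) in stations:
        if i not in done and s1 in regs and s2 in regs:
            done[i] = ops[op](regs[s1], regs[s2])
            print("executed", i)
    # The commit unit writes results back strictly in program order.
    while committed in done:
        regs[program[committed][1]] = done[committed]
        print("committed", committed)
        committed += 1

# Instruction 2 executes before instruction 1, but commits after it.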
The simplest way to examine the advantages and disadvantages of RISC architecture is
by contrasting it with its predecessor: CISC (Complex Instruction Set Computer)
architecture.
Multiplying Two Numbers in Memory
Consider the storage scheme for a generic computer. The main memory is divided into
locations numbered from (row) 1: (column) 1 to (row) 6: (column) 4. The execution
unit is responsible for carrying out all computations. However, the execution unit can
only operate on data that has been loaded into one of the six registers (A, B, C, D, E,
or F). Let's say we want to find the product of two numbers - one stored in location 2:3
and another stored in location 5:2 - and then store the product back in location 2:3.
A complex instruction set computer (CISC) is a microprocessor instruction set
architecture (ISA) in which each instruction can execute several low-level operations,
such as a load from memory, an arithmetic operation, and a memory store, all in a
single instruction. The term was coined in contrast to reduced instruction set computer
(RISC).
In a CISC architecture, this task could be completed with a single instruction (written
here, for illustration, as MULT 2:3, 5:2) that loads the two values into separate
registers, multiplies the operands in the execution unit, and then stores the product
back in memory. One of the primary advantages of this system is that the compiler has
to do very little work to translate a high-level language statement into assembly.
Because the length of the code is relatively short, very little RAM is required to store
instructions. The emphasis is put on building complex instructions directly into the
hardware.
RISC processors only use simple instructions that can be executed within one clock
cycle. Thus, the "MULT" command described above could be divided into three separate
commands: "LOAD," which moves data from the memory bank to a register, "PROD,"
which finds the product of two operands located within the registers, and "STORE,"
which moves data from a register to the memory banks. In order to perform the exact
series of steps described in the CISC approach, a programmer would need to code four
lines of assembly:
LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A
At first, this may seem like a much less efficient way of completing the operation.
Because there are more lines of code, more RAM is needed to store the assembly level
instructions. The compiler must also perform more work to convert a high-level
language statement into code of this form.
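A small Python sketch (the LOAD/PROD/STORE mnemonics are the illustrative ones used above, not a real instruction set) that interprets these four lines against the 6x4 memory grid and registers A-F:

# Toy interpreter for the illustrative RISC-style instructions above.
memory = {(2, 3): 6, (5, 2): 7}          # sample values at the two locations
registers = {r: 0 for r in "ABCDEF"}

def run(program):
    for line in program:
        op, rest = line.split(None, 1)
        args = [a.strip() for a in rest.split(",")]
        if op == "LOAD":                  # LOAD reg, row:col
            row, col = map(int, args[1].split(":"))
            registers[args[0]] = memory[(row, col)]
        elif op == "PROD":                # PROD reg1, reg2 (result kept in reg1)
            registers[args[0]] *= registers[args[1]]
        elif op == "STORE":               # STORE row:col, reg
            row, col = map(int, args[0].split(":"))
            memory[(row, col)] = registers[args[1]]

run(["LOAD A, 2:3", "LOAD B, 5:2", "PROD A, B", "STORE 2:3, A"])
print(memory[(2, 3)])                     # 42: the product overwrites 2:3

Note that after the program runs, the operand in register B is still there, ready for reuse without another LOAD.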
CISC                                         RISC
Emphasis on hardware                         Emphasis on software
Includes multi-clock                         Single-clock,
complex instructions                         reduced instruction only
Memory-to-memory:                            Register to register:
"LOAD" and "STORE"                           "LOAD" and "STORE"
incorporated in instructions                 are independent instructions
Separating the "LOAD" and "STORE" instructions actually reduces the amount of work
that the computer must perform. After a CISC-style "MULT" command is executed, the
processor automatically erases the registers. If one of the operands needs to be used
for another computation, the processor must re-load the data from the memory bank
into a register. In RISC, the operand will remain in the register until another value is
loaded in its place.
The Performance Equation
The performance of a computer is commonly expressed in terms of the performance
equation:

time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

The CISC approach attempts to minimize the number of instructions per program,
sacrificing the number of cycles per instruction. RISC does the opposite, reducing the
cycles per instruction at the cost of the number of instructions per program.
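A worked illustration with made-up numbers (these are not measurements): suppose the same task takes 10 complex instructions at 6 cycles each on one machine, or 40 simple instructions at 1 cycle each on another, with the same 2 ns clock:

cycle_time_ns = 2                      # time/cycle, identical for both machines

cisc_time = cycle_time_ns * 6 * 10     # 6 cycles/instr x 10 instrs = 120 ns
risc_time = cycle_time_ns * 1 * 40     # 1 cycle/instr  x 40 instrs =  80 ns
print(cisc_time, risc_time)

Which side wins depends entirely on the actual instruction counts and cycle counts; the equation only makes the trade-off explicit.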
RISC Roadblocks
Despite the advantages of RISC-based processing, RISC chips took over a decade to gain
a foothold in the commercial world, largely because of a lack of software support.
Another major setback was the presence of Intel. Although their CISC chips were
becoming increasingly unwieldy and difficult to develop, Intel had the resources to plow
through development and produce powerful processors. Although RISC chips might
surpass Intel's efforts in specific areas, the differences were not great enough to
persuade buyers to change technologies.
Today, the Intel x86 is arguably the only chip which retains CISC architecture. This is
primarily due to advancements in other areas of computer technology. The price of RAM
has decreased dramatically. In 1977, 1 MB of DRAM cost about $5,000. By 1994, the
same amount of memory cost only $6 (when adjusted for inflation). Compiler
technology has also become more sophisticated, so that the RISC use of RAM and
emphasis on software has become ideal.
CISC and RISC Convergence
State-of-the-art processor technology has changed significantly since RISC chips were first
introduced in the early '80s. Because a number of advancements (including the ones
described above) are used by both RISC and CISC processors, the lines between the two
architectures have begun to blur. In fact, the two architectures almost seem to have
adopted the strategies of the other.

Because processor speeds have increased, CISC chips are now able to execute more than
one instruction within a single clock. This also allows CISC chips to make use of
pipelining. With other technological improvements, it is now possible to fit many more
transistors on a single chip. This gives RISC processors enough space to incorporate more
complicated, CISC-like commands. RISC chips also make use of more complicated
hardware, making use of extra function units for superscalar execution.

All of these factors have led some groups to argue that we are now in a "post-RISC" era,
in which the two styles have become so similar that distinguishing between them is no
longer relevant. However, it should be noted that RISC chips still retain some important
traits. RISC chips strictly utilize uniform, single-cycle instructions. They also retain the
register-to-register, load/store architecture. And despite their extended instruction sets,
RISC chips still have a large number of general-purpose registers.
Simultaneous Multi-Threading
Normal thread execution requires threads to be switched on and off the processor, as a
single thread dominates the processor for a period of time. This allows some tasks
that involve waiting (for disk accesses, or network usage) to execute more efficiently.
SMT allows threads to execute at the same time by pulling instructions into the pipeline
from different threads. This way, multiple threads advance in their processes and no
one thread dominates the processor at any given time.
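A toy Python sketch of the issue idea (illustrative; real SMT hardware is considerably more complex): each cycle, the single issue slot is filled from whichever thread has a ready instruction, so one thread's stall does not waste the slot:

# Each thread is a list of (instruction, extra stall cycles after it).
threads = {
    "A": {"prog": [("A0", 2), ("A1", 0), ("A2", 0)], "i": 0, "ready_at": 0},
    "B": {"prog": [("B0", 0), ("B1", 0), ("B2", 0)], "i": 0, "ready_at": 0},
}

cycle = 0
while any(s["i"] < len(s["prog"]) for s in threads.values()):
    for name in ("A", "B"):               # fill the slot from any ready thread
        s = threads[name]
        if s["i"] < len(s["prog"]) and s["ready_at"] <= cycle:
            instr, stall = s["prog"][s["i"]]
            print(f"cycle {cycle}: issue {instr}")
            s["i"] += 1
            s["ready_at"] = cycle + 1 + stall   # e.g., waiting on memory
            break
    else:
        print(f"cycle {cycle}: slot wasted")
    cycle += 1

While thread A waits on its slow A0 (a stall of two cycles), thread B's instructions fill the slot, so no cycle is wasted.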
Value Prediction
Value prediction is the prediction of the value that a particular load instruction will
produce. Load values are generally not random, and approximately half of the load
instructions in a program will fetch the same value as they did in a previous execution.
Thus, predicting that the load value will be the same as it was last time speeds up the
processor since it allows the computer to continue without having to wait for the load
memory access. As loads tend to be one of the slowest and most frequently executed
instructions, this improvement makes a significant difference in processor speed.
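A minimal Python sketch of last-value prediction (illustrative; the trace below is made up): each load address remembers the value it returned last time, and the prediction is a hit when the new value matches:

# Each trace entry is (load address, value actually fetched from memory).
trace = [(0x40, 7), (0x48, 0), (0x40, 7), (0x48, 0), (0x40, 9), (0x40, 9)]

last_value = {}    # load address -> value seen on its previous execution
hits = 0
for addr, value in trace:
    if addr in last_value and last_value[addr] == value:
        hits += 1  # the speculative work based on the guess can be kept
    last_value[addr] = value

print(hits, "of", len(trace), "loads predicted correctly")   # 3 of 6

On a hit, the processor keeps the work it started with the guessed value instead of waiting for the memory access; on a miss, that work is discarded.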
Example system configurations
Connection Point Services (CPS) lends itself to various configurations, according to your
needs. A few examples follow.
You can maintain both Phone Book Service (PBS) and Phone Book Administrator (PBA)
on a single computer running an operating system in the Windows Server 2003 family.
Even though PBA posts to the same server on which it resides, you must use the same
procedures for setting permissions and posting phone books as you would with any
other configuration.
Dedicated Phone Book Service server with a Phone Book Administrator client
Example uses:
• Medium to large corporations
• When ownership and responsibilities for phone book administration and server
maintenance are split between groups
In this configuration, PBS and PBA are installed on separate computers. PBA could be
installed on a server or on a workstation running Windows XP Professional. The
following illustration shows this configuration.
Dedicated Phone Book Service server with remote administration of Phone Book
Administrator client
Example use: When the primary computer running Phone Book Administrator is not
physically accessible to the administrator, you can use this dual-mode system.
You can configure PBA to run on a primary (dedicated) computer and on a remote
workstation at the same time. The following illustration shows this configuration.
All data files reside on the primary computer, never on the remote workstation. The
remote workstation accesses the data files on the primary computer.
You can install PBA on a primary computer and on multiple remote workstations. PBS is
installed on a staging server and on multiple host servers residing in a less secure
environment outside a firewall. The following illustration shows this configuration.
The remote workstations access phone book data on the primary PBA computer. Phone
book updates are posted to the staging server. Using a content replication method,
phone book updates are then copied from the staging server through the firewall to the
host servers.
Lesson XI
Advanced Architectures
Classes of Architecture:
I originally used the term "class type" because I first started with this approach using
object-oriented (OO) technology, although I have since used it for component-based
architectures, service-oriented architectures (SOAs), and combinations thereof.
Throughout this article I still refer to classes within the layers, although there is
absolutely nothing stopping you from using non-OO technology to implement the layers.
The five layers are summarized in Table 1, as are the skills required to successfully work
on them (coding is applicable to all layers, so it's not listed).
Layer: Interface

Description: This layer wraps access to the logic of your system. There are two
categories of interface class - user interface (UI) classes that provide people access to
your system, and system interface (SI) classes that provide external systems access to
your system. Java Server Pages (JSPs) and graphical user interface (GUI) screens
implemented via the Swing class library are commonly used to implement UI classes
within Java. Web services and CORBA wrapper classes are good options for
implementing SI classes.

Skillset:
For user interfaces:
· User interface design skills
· Usability skills
· Ability to work closely with stakeholders
For system interfaces:
· API design skills
· Legacy analysis skills
Collaboration within a layer is allowed. For example, UI objects can send messages to
other UI objects and business/domain objects can send messages to other
business/domain objects. Collaboration can also occur between layers connected by
arrows. As you see in Figure 1, interface classes may send messages to domain classes
but not to persistence classes. Domain classes may send messages to persistence classes,
but not to interface classes. By restricting the flow of messages to only one direction, you
dramatically increase the portability of your system by reducing the coupling between
classes. For example, the domain classes don’t rely on the user interface of the system,
implying that you can change the interface without affecting the underlying business logic.
All types of classes may interact with system classes. This is because your system layer
implements fundamental software features such as inter-process communication (IPC), a
service that classes use to collaborate with classes on other computers, and audit logging,
which classes use to record critical actions taken by the software. For example, if your
user interface classes are running on a personal computer (PC) and your domain classes
are running on an EJB application server on another machine, then your interface
classes will send messages to the domain classes via the IPC service in the system layer.
This service is often implemented via the use of middleware.
It’s critical to understand that this isn’t the only way to layer an application, but instead
that it is a very common one. The important thing is that you identify the layers that are
pertinent to your environment and then act accordingly.
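A tiny Python sketch of this one-way flow (the class and method names are hypothetical): the interface calls down to the domain, the domain calls down to persistence, and no layer holds a reference to the layer above it:

class PersistenceLayer:
    def save(self, record):
        print("persisting", record)          # e.g., write to a database

class DomainLayer:
    def __init__(self, persistence):
        self.persistence = persistence       # domain may call down only

    def place_order(self, item):
        order = {"item": item, "status": "placed"}   # business logic
        self.persistence.save(order)
        return order

class InterfaceLayer:
    def __init__(self, domain):
        self.domain = domain                 # interface may call down only

    def handle_request(self, item):
        return self.domain.place_order(item)

ui = InterfaceLayer(DomainLayer(PersistenceLayer()))
print(ui.handle_request("widget"))

Because the interface never touches persistence and the lower layers know nothing of the layers above, the user interface can be replaced without disturbing the business logic, exactly the portability argument made above.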
Software architecture
Dataflow is a software architecture based on the idea that changing the value of a
variable should automatically force recalculation of the values of other variables.
A data flow diagram (DFD) is a graphical representation of the "flow" of data through
an information system. A data flow diagram can also be used for the visualization of
data processing (structured design). It is common practice for a designer to draw a
context-level DFD first which shows the interaction between the system and outside
entities. This context-level DFD is then "exploded" to show more detail of the system
being modeled.
Larry Constantine, the original developer of structured design, based data flow diagrams
on Martin and Estrin's "data flow graph" model of computation. Data flow diagrams
(DFDs) are one of the three essential perspectives of SSADM. The sponsor of a project
and the end users will need to be briefed and consulted throughout all stages of a
system's evolution. With a dataflow diagram, users are able to visualize how the system
will operate, what the system will accomplish, and how the system will be implemented.
Old system dataflow diagrams can be drawn up and compared with the new system's
dataflow diagrams to draw comparisons and implement a more efficient system.
Dataflow diagrams can be used to provide the end user with a physical idea of where
the data they input ultimately has an effect upon the structure of the whole system,
from order to dispatch to restock. How any system is developed can be determined
through a dataflow diagram.
Components
A data flow diagram illustrates the processes, data stores, and external entities in a
business or other system and the connecting data flows.
Data flow diagram notation
External Entities/Terminators
are outside of the system being modeled. Terminators represent where information
comes from and where it goes. In designing a system, we have no idea about what
these terminators do or how they do it.
Processes
modify the inputs in the process of generating the outputs
Data Stores
represent a place in the process where data comes to rest. A DFD does not say
anything about the relative timing of the processes, so a data store might be a
place to accumulate data over a year for the annual accounting process.
Data Flows
are how data moves between terminators, processes, and data stores (those that
cross the system boundary are known as IO or Input Output Descriptions).
Every page in a DFD should contain fewer than 10 components. If a process has more
than 10 components, then one or more components (typically a process) should be
combined into one, and another DFD generated to describe that component in more
detail. Each component should be numbered, as should each subcomponent, and so on.
So, for example, a top-level DFD would have components 1, 2, 3, 4, and 5; the
subcomponent DFD of component 3 would have components 3.1, 3.2, 3.3, and 3.4; and
the sub-subcomponent DFD of component 3.2 would have components 3.2.1, 3.2.2, and
3.2.3.
Data store
A data store is a repository for data. Data stores can be manual, digital, or temporary.
Duplication
External entities and data stores can be duplicated in the diagram for clarity, while
processes cannot. External entities that have been replicated are marked by an asterisk
(*) in the lower left part of the oval that represents that entity. Data stores have a
double line on the left side of their box.
Developing a DFD
Top-Down Approach
1. The system designer makes a context level DFD, which shows the interaction (data
flows) between the system (represented by one process) and the system
environment (represented by terminators).
2. The system is decomposed in a lower-level DFD (level zero) into a set of processes,
data stores, and the data flows between these processes and data stores.
3. Each process is then decomposed into an even lower level diagram containing its
sub processes.
4. This approach then continues on the subsequent sub processes, until a necessary
and sufficient level of detail is reached which is called the primitive process (aka
chewable in one bite).
5. Each process is linked (with incoming data flows) directly with other
processes or via data stores, so that it has enough information to respond to a
given event.
DFD tools
· Concept Draw - Windows and MacOS X data flow diagramming tool
· Microsoft Visio - Windows diagramming tool which includes very basic DFD support
(images only; it does not record data flows)
One benefit of dataflow is that it can reduce the amount of coupling-related code in a
program. For example, without dataflow, if a variable X depends on a variable Y, then
whenever Y is changed X must be explicitly recalculated. This means that Y is coupled to
X. Since X is also coupled to Y (because X's value depends on Y's value), the
program ends up with a cyclic dependency between the two variables. Most good
programmers will get rid of this cycle by using an observer pattern, but only at the cost
of introducing a non-trivial amount of code. Dataflow improves this situation by making
the recalculation of X automatic, thereby eliminating the explicit coupling from Y to X.
Dataflow makes implicit a significant amount of code that otherwise would have had to
be tediously explicit.
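A minimal Python sketch of the idea (illustrative): a cell records which computed cells read it, and writing a new value recomputes its dependents automatically, so the use site contains no observer wiring:

class Cell:
    def __init__(self, value=None):
        self.value = value
        self.dependents = []             # cells computed from this one

    def set(self, value):
        self.value = value
        for dep in self.dependents:
            dep.recompute()              # push the change downstream

class ComputedCell(Cell):
    def __init__(self, formula, *inputs):
        super().__init__()
        self.formula = formula
        self.inputs = inputs
        for cell in inputs:
            cell.dependents.append(self)
        self.recompute()

    def recompute(self):
        self.value = self.formula(*(c.value for c in self.inputs))

y = Cell(3)
x = ComputedCell(lambda v: v * 2, y)     # X depends on Y
print(x.value)                           # 6
y.set(10)                                # no explicit recalculation of X
print(x.value)                           # 20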
There have been a few programming languages created specifically to support dataflow.
In particular, many (if not most) visual programming languages have been based on
the idea of dataflow. A good example of a Java-based framework is Pervasive
DataRush.
Diagrams
The term dataflow may also be used to refer to the flow of data within a system, and
is the name normally given to the arrows in a data flow diagram that represent the flow
of data between external entities, processes, and data stores.
Concurrency
A dataflow network is a network of concurrently executing processes or automata
that can communicate by sending data over channels (see message passing).
Kahn process networks, named after one of the pioneers of dataflow networks, are a
particularly important class of such networks. In a Kahn process network the processes
are determinate. This implies that they satisfy the so-called Kahn's principle, which,
roughly speaking, states that each determinate process computes a continuous function
from input streams to output streams, and that a network of determinate processes is
itself determinate, thus computing a continuous function. This implies that the
behaviour of such networks can be described by a set of recursive equations, which can
be solved using fixed-point theory.
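A small Python sketch of such a network (illustrative): the processes run concurrently and communicate only over channels, and because each read blocks until data arrives, the output is the same on every run:

import threading, queue

def producer(out_channel, values):
    for v in values:
        out_channel.put(v)               # send data over the channel

def adder(in_a, in_b, out_channel, count):
    for _ in range(count):
        a = in_a.get()                   # blocking read: waits for data
        b = in_b.get()
        out_channel.put(a + b)

ch_a, ch_b, ch_out = queue.Queue(), queue.Queue(), queue.Queue()
procs = [
    threading.Thread(target=producer, args=(ch_a, [1, 2, 3])),
    threading.Thread(target=producer, args=(ch_b, [10, 20, 30])),
    threading.Thread(target=adder, args=(ch_a, ch_b, ch_out, 3)),
]
for p in procs:
    p.start()
for p in procs:
    p.join()
print([ch_out.get() for _ in range(3)])  # always [11, 22, 33]: determinate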
Hardware architecture
Hardware architectures for dataflow were a major topic in computer architecture
research in the 1970s and early 1980s. Jack Dennis of MIT pioneered the field of static
dataflow architectures, while the Manchester Dataflow Machine and the MIT Tagged
Token architecture were major projects in dynamic dataflow.
A compiled program for a dataflow machine would keep information about the data
dependencies among its instructions. A dataflow compiler would record these
dependencies by creating unique tags for each dependency instead of using variable
names. By giving each dependency a unique tag, it exposes any possibility of parallel
execution of non-dependent instructions. Each instruction, along with its tagged
operands, would be stored in the compiled binary code.
Once the instruction was completed by the execution unit, its output data would be
broadcast (with its tag) to the CAM memory. Any other instructions that were
dependent on this particular datum (identified by its tag value) would be updated. In
this way, subsequent instructions would be activated.
Instructions would be activated in data order, that is when all of the required data
operands were available. This order can be different from the sequential order
envisioned by the human programmer, the programmed order.
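A toy Python sketch of tag-driven activation (illustrative only): results are broadcast under their tags, and any instruction whose operand tags are all present fires, in data order rather than in the programmer's order:

instructions = [
    # (name, operand tags required, result tag, operation)
    ("i1", ("t0", "c"), "t1", lambda x, c: x * c),   # needs i2's result
    ("i2", ("a", "b"), "t0", lambda a, b: a + b),
    ("i3", ("a", "c"), "t2", lambda a, c: a - c),    # independent
]
tokens = {"a": 2, "b": 3, "c": 10}       # initial data tokens
fired = set()

while len(fired) < len(instructions):
    for name, needs, result_tag, op in instructions:
        if name not in fired and all(t in tokens for t in needs):
            tokens[result_tag] = op(*(tokens[t] for t in needs))
            fired.add(name)
            print("fired", name)         # i2 and i3 fire before i1

Although i1 comes first in program order, it fires last, because its operand tag t0 is produced by i2; the independent i3 proceeds without waiting on either.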
The instructions along with their required data would be transported as packets to the
execution units. These packets are often known as instruction tokens. Similarly, data
results are transported back to the CAM as data tokens. The packetization of
instructions and results allowed for parallel execution of activated instructions on a large
scale. Connection networks would deliver the activated instruction tokens to the
execution units and return data tokens to the instruction CAM memory. In contrast to
the conventional von Neumann architecture, data tokens are not permanently stored in
memory, rather they are transient messages that only exist when in transit to the
instruction storage.
Earlier designs that only used instruction addresses as data dependency tags were
called static dataflow machines. These machines could not allow instructions from
multiple loop iterations (or multiple calls to the same routine) to be issued
simultaneously as the simple tags could not differentiate between the different loop
iterations (or each invocation of the routine). Later designs called dynamic dataflow
machines used more complex tags to allow greater parallelism from these cases.
One practical difficulty with these designs was building CAMs large enough to hold all of
the dependencies of a real program.
A computer network is two or more computers connected together using a
telecommunication system for the purpose of communicating and sharing resources.

This lesson uses the definition which requires two or more computers to be connected
together to form a network. The same basic functions are generally present in this case
as with larger numbers of connected computers.
Computers
Many of the components of an average network are individual computers, which are
generally either workstations (including personal computers) or servers.
Types of Workstations
There are many types of workstations that may be incorporated into a particular
network, some of which have high-end displays, multiple CPUs, large amounts of
RAM, large amounts of hard drive storage space, or other enhancements required
for special data processing tasks, graphics, or other resource intensive applications.
(See also network computer).
Types of Servers
The following lists some common types of servers and their purpose.
File Server
Stores various types of files and distributes them to other clients on the network.
Print Server
Controls and manages one or more printers and accepts print jobs from other
network clients, spooling the print jobs, and performing most or all of the other
functions that a workstation would perform to accomplish a printing task if the
printer were connected directly to the workstation's printer port.
Mail Server
Stores, sends, receives, routes, and performs other email related operations for
other clients on the network.
Fax Server
Stores, sends, receives, routes, and performs other functions necessary for the
proper transmission, reception, and distribution of faxes.
Telephony Server
Performs telephony related functions such as answering calls automatically,
performing the functions of an interactive voice response system, storing and
serving voice mail, routing calls between the Public Switched Telephone Network
(PSTN) and the network or the Internet (e.g., voice over IP (VoIP) gateway), etc.
Proxy Server
Performs some type of function on behalf of other clients on the network to
increase the performance of certain operations (e.g., prefetching and caching
documents or other data that is requested very frequently) or as a security
precaution to isolate network clients from external threats.
Remote Access Server (RAS)
Monitors modem lines or other network communications channels for requests to
connect to the network from a remote location, answers the incoming telephone
call or acknowledges the network request, and performs the necessary security
checks and other procedures necessary to log a user onto the network.
Application Server
Performs the data processing or business logic portion of a client application,
accepting instructions for operations to perform from a workstation and serving the
results back to the workstation, while the workstation performs the user interface
or GUI portion of the processing (i.e., the presentation logic) that is required for the
application to work properly.
Web Server
Stores HTML documents, images, text files, scripts, and other Web related data
(collectively known as content), and distributes this content to other clients on the
network on request.
Backup Server
Has network backup software installed and has large amounts of hard drive storage
or other forms of storage (tape, etc.) available to it to be used for the purpose of
ensuring that data loss does not occur in the network.
Printers
Many printers are capable of acting as part of a computer network without any
other device, such as a print server, to act as an intermediary between the printer
and the device that is requesting a print job to be completed.
Dumb Terminals
Many networks use dumb terminals instead of workstations either for data entry
and display purposes or in some cases where the application runs entirely on the
server.
Other Devices
There are many other types of devices that may be used to build a network, many
of which require an understanding of more advanced computer networking
concepts before they are able to be easily understood (e.g., hubs, routers, bridges,
switches, hardware firewalls, etc.). On home and mobile networks, connecting
consumer electronics devices such as video game consoles is becoming increasingly
common.
A Simple Network
A simple computer network may be constructed from two computers by adding a
network adapter (Network Interface Controller (NIC)) to each computer and then
connecting them together with a special cable called a crossover cable. This type of
network is useful for transferring information between two computers that are not
normally connected to each other by a permanent network connection or for basic
home networking applications. Alternatively, a network between two computers can
be established without dedicated extra hardware by using a standard connection
such as the RS-232 serial port on both computers, connecting them to each other
via a special cross linked null modem cable.
Practical Networks
Practical networks generally consist of more than two interconnected computers
and generally require special devices in addition to the Network Interface Controller
that each computer needs to be equipped with. Examples of some of these special
devices are listed above under Other Devices.
Types of Networks:
Below is a list of the most common types of computer networks.
Local Area Network (LAN):
A network that is limited to a relatively small spatial area such as a room, a single
building, a ship, or an aircraft. Local area networks are sometimes called a single
location network.
Note: For administrative purposes, large LANs are generally divided into smaller
logical segments called workgroups. A workgroup is a group of computers that
share a common set of resources within a LAN.
Internetwork:
Two or more networks or network segments connected using devices that operate
at layer 3 (the 'network' layer) of the OSI Basic Reference Model, such as a router.
Note: Any interconnection among or between public, private, commercial,
industrial, or governmental networks may also be defined as an internetwork.
Internet, The:
A specific internetwork, consisting of a worldwide interconnection of governmental,
academic, public, and private networks based upon the Advanced Research
Projects Agency Network (ARPANET) developed by ARPA of the U.S. Department of
Defense – also home to the World Wide Web (WWW) and referred to as the
'Internet' with a capital 'I' to distinguish it from other generic internetworks.
Synonyms for the 'Internet' also include the 'Web' or, in a more comical sense, the
'Interweb'.
Intranet:
A network or internetwork that is limited in scope to a single organization or entity;
the term is also used more specifically for such a network that uses the TCP/IP
protocol suite, HTTP, FTP, and other network protocols and software commonly used
on the Internet.
Note: Intranets may also be categorized as a LAN, CAN, MAN, WAN, or other type
of network.
Extranet:
A network or internetwork that is limited in scope to a single organization or entity
but which also has limited connections to the networks of one or more other,
usually but not necessarily trusted, organizations or entities (e.g., a company's
customers may be provided access to some part of its intranet, thus creating an
extranet, while at the same time the customers may not be considered 'trusted'
from a security standpoint).
Note: Technically, an extranet may also be categorized as a CAN, MAN, WAN, or
other type of network, although, by definition, an extranet cannot consist of a
single LAN, because an extranet must have at least one connection with an outside
network.
Intranets and extranets may or may not have connections to the Internet. If
connected to the Internet, the intranet or extranet is normally protected from being
accessed from the Internet without proper authorization. The Internet itself is not
considered to be a part of the intranet or extranet, although the Internet may serve
as a portal for access to portions of an extranet.
By network layer
Computer networks may be classified according to the network layer at which they
operate according to some basic reference models that are considered to be
standards in the industry such as the seven layer OSI reference model and the five
layer TCP/IP model.
By scale
Computer networks may be classified according to the scale or extent of reach of
the network, for example as a Personal area network (PAN), Local area network
(LAN), Wireless local area network (WLAN), Campus area network (CAN),
Metropolitan area network (MAN), or Wide area network (WAN).
By connection method
Computer networks may be classified according to the technology that is used to
connect the individual devices in the network such as HomePNA, Power line
communication, Ethernet, or WiFi.
By functional relationship
Computer networks may be classified according to the functional relationships which
exist between the elements of the network, for example Active Networking,
Client-server and Peer-to-peer (workgroup) architectures.
By network topology
Computer networks may be classified according to the network topology upon which the
network is based, such as Bus network, Star network, Ring network, Mesh network,
Star-bus network, Tree or Hierarchical topology network, etc.
By services provided
Computer networks may be classified according to the services which they provide,
such as storage area networks, server farms, process control networks, value-added
networks, SOHO networks, wireless community networks, etc.
By protocol
Computer networks may be classified according to the communications protocol that is
being used on the network. See the articles on List of network protocol stacks and List
of network protocols for more information.
Sample Networks