Unit 2 - Advanced Computer Architecture - WWW - Rgpvnotes.in
COMPARE - Compare numbers.
IN - Input information from a device, e.g., keyboard.
JUMP - Jump to designated RAM address.
LOAD - Load information from RAM to the CPU.
OUT - Output information to a device, e.g., monitor.
STORE - Store information to RAM.
Computers are classified on the basis of the instruction set they have as:
CISC Scalar Processors
CISC (Complex Instruction Set Computer): A CISC-based computer will have shorter programs, which are made up of symbolic machine language. A Complex Instruction Set Computer (CISC) supplies a large number of complex instructions at the assembly-language level. During the early years, memory was slow and expensive and programming was done in assembly language. Since memory was slow and instructions could be retrieved up to 10 times faster from a local ROM than from main memory, programmers tried to put as many instructions as possible in microcode.
RISC Processors
Advantages:
Speed: Since a simplified instruction set allows for a pipelined, superscalar design, RISC processors often achieve higher performance than CISC processors built with comparable technology and running at the same clock rates.
Simpler Hardware: Because the instruction set of a RISC processor is so simple, it uses up much less chip space; extra functions, such as memory management units or floating-point arithmetic units, can also be placed on the same chip. Smaller chips allow a semiconductor manufacturer to place more parts on a single silicon wafer, which can lower the per-chip cost dramatically.
Shorter Design Cycle: Since RISC processors are simpler than corresponding CISC processors, they can be designed more quickly, and can take advantage of other technological developments sooner than corresponding CISC designs, leading to greater leaps in performance between generations.
Difference between CISC and RISC
VLIW Architecture
Very long instruction word (VLIW) describes a computer processing architecture in which a language compiler or pre-processor breaks program instructions down into basic operations that can be performed by the processor in parallel (that is, at the same time). These operations are put into a very long instruction word, which the processor can then take apart without further analysis, handing each operation to an appropriate functional unit.
VLIW is sometimes viewed as the next step beyond the reduced instruction set computing (RISC) architecture, which also works with a limited set of relatively basic instructions and can usually execute more than one instruction at a time (a characteristic referred to as superscalar). The main advantage of VLIW processors is that complexity is moved from the hardware to the software, which means that the hardware can be smaller, cheaper, and require less power to operate. The challenge is to design a compiler or pre-processor that is intelligent enough to decide how to build the very long instruction words. If dynamic pre-processing is done as the program is run, performance may be a concern.
Figure 2.2: A VLIW processor architecture and instruction format
Figure 2.3: Pipeline execution
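The compiler's job described above can be illustrated with a minimal sketch: greedily grouping independent operations into fixed-width long instruction words. This is an illustrative toy, not a real VLIW scheduler; the operation format and function names are assumptions, and the dependence check only considers registers written earlier in the same bundle (it ignores anti-dependences for simplicity).

```python
# Toy sketch of VLIW bundle packing (illustrative names, not a real scheduler).
# Each operation is a (dest, src1, src2) register triple; an operation may not
# share a bundle with an earlier operation that writes a register it touches.

def pack_bundles(ops, slots_per_word=4):
    """Greedily group operations into long instruction words of up to
    slots_per_word mutually independent operations."""
    bundles = []
    current, written = [], set()
    for dest, src1, src2 in ops:
        independent = not ({dest, src1, src2} & written)
        if current and (len(current) >= slots_per_word or not independent):
            bundles.append(current)          # emit the current long word
            current, written = [], set()
        current.append((dest, src1, src2))
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles

program = [
    ("r1", "r2", "r3"),   # r1 = r2 op r3
    ("r4", "r5", "r6"),   # independent of the first -> same bundle
    ("r7", "r1", "r4"),   # reads r1 and r4 -> must start a new bundle
]
print(pack_bundles(program))
```

The point of the sketch is that all of this grouping happens before execution, so the hardware can dispatch each slot of a bundle directly to a functional unit without dependence analysis.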
Memory Hierarchy
The total memory capacity of a computer can be visualized as a hierarchy of components. The memory hierarchy system consists of all storage devices contained in a computer system, from the slow auxiliary memory to the faster main memory and to the smaller, still faster cache memory. Auxiliary memory access time is generally 1000 times that of the main memory; hence it is at the bottom of the hierarchy.
The main memory occupies the central position because it is equipped to communicate directly with the CPU and with auxiliary memory devices through the input/output (I/O) processor.
When a program not residing in main memory is needed by the CPU, it is brought in from auxiliary memory. Programs not currently needed in main memory are transferred into auxiliary memory to provide space in main memory for other programs that are currently in use.
The cache memory is used to store program data which is currently being executed in the CPU. The approximate access time ratio between cache memory and main memory is about 1 to 7-10.
3. Main memory or RAM (Random Access Memory): It is a type of computer memory and is a hardware component. It can be increased provided the operating system can handle it. Typical PCs these days use 8 GB of RAM. It is slower to access than cache.
4. Hard disk: A hard disk is a hardware component in a computer. Data is kept permanently in this memory. Memory on the hard disk is not directly accessed by the CPU; hence it is slower. As compared with RAM, the hard disk is cheaper per bit.
5. Magnetic tape: Magnetic tape memory is usually used for backing up large data. When the system needs to access a tape, the tape is first mounted to access the data, and un-mounted after the data has been accessed. Memory access is slower on magnetic tape, and it usually takes a few minutes to access a tape.
Figure 2.5: The inclusion property and data transfer between adjacent levels
The following three principles lead to an effective implementation of the memory hierarchy for a system:
1. Make the Common Case Fast: This principle says that the data which is more frequently used should be kept in the faster device. It is based on a fundamental law, called Amdahl's Law, which states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used. Thus, if the faster mode serves relatively infrequently used data, then most of the time the faster device will not be used, and the speedup achieved will be less than if the faster device were used more frequently.
2. Principle of Locality: It is a very common trend of programs to reuse data and instructions that were used recently. From this observation comes an important program property called locality of references: the instructions and data a program will use in the near future can be predicted from its accesses in the recent past. A famous 90/10 rule that comes from empirical observation is:
"A program spends 90% of its time in 10% of its code"
These localities can be categorized into three types:
a. Temporal locality: states that data items and code that are recently accessed are likely to be accessed in the near future. Thus, if location M is referenced at time t, then it (location M) will be referenced again at some time t + Δt.
b. Spatial locality: states that items tend to reside in proximity in the memory, i.e., the items whose addresses are near to each other are likely to be referenced together in time. Thus we can say memory accesses are clustered with respect to the address space. Thus, if location M is referenced at time t, then another location M ± Δm will be referenced at time t + Δt.
c. Sequential locality: Programs are stored sequentially in memory, and these programs normally have a sequential trend of execution. Thus we say instructions are stored in memory in certain array patterns and are accessed sequentially, one memory location after another. Thus, if location M is referenced at time t, then locations M+1, M+2, … will be referenced at times t + Δt, t + Δt′, etc. In each of these patterns, both Δm and Δt are "small." H&P suggest that 90 percent of the execution time in most programs is spent executing only 10 percent of the code. One of the implications of locality is that data and instructions should have separate data and instruction caches. The main advantage of separate caches is that one can fetch instructions and operands simultaneously. This concept is the basis of the design known as the Harvard architecture, after the Harvard Mark series of electromechanical machines, in which the instructions were supplied by a separate unit.
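The difference between good and poor spatial locality can be sketched with two traversals of the same 2-D array. In Python the effect on the cache is hidden by the interpreter, so this is only an illustration of the access pattern; in a language like C, the row-by-row version touches consecutive addresses (good spatial locality) while the column-by-column version strides across the address space:

```python
# Two traversals with identical results but different access patterns.

def row_major_sum(matrix):
    total = 0
    for row in matrix:                 # consecutive elements: good spatial locality
        for x in row:
            total += x
    return total

def column_major_sum(matrix):
    total = 0
    rows, cols = len(matrix), len(matrix[0])
    for j in range(cols):              # each access jumps a full row ahead
        for i in range(rows):
            total += matrix[i][j]
    return total

m = [[1, 2], [3, 4]]
print(row_major_sum(m), column_major_sum(m))   # both 10
```

Both loops also exhibit temporal locality on the accumulator `total`, which a compiler would keep in a register.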
3. Smaller is Faster: Smaller pieces of hardware will generally be faster than larger pieces.
Together, the above principles suggest that one should try to keep recently accessed items in the fastest memory.
While designing the memory hierarchy, the following points are always considered.
Inclusion property: If a value is found at one level, it should be present at all of the levels below it. The implication of the inclusion property is that all items of information in the "innermost" memory level (cache) also appear in the outer memory levels. The inverse, however, is not necessarily true. That is, the presence of a data item in level Mi+1 does not imply its presence in level Mi. We call a reference to a missing item a "miss."
The Coherence Property
The value of any data item should be consistent at all levels. The coherence property is, of course, never completely true at every instant, but it does represent a desired state. That is, as information is modified by the processor, copies of that information should be placed in the appropriate locations in outer memory levels. The requirement that copies of data items at successive memory levels be consistent is called the "coherence property."
Coherence Strategies
Write-through
As soon as a data item in Mi is modified, an immediate update of the corresponding data item(s) in Mi+1, Mi+2, … Mn is required. This is the most aggressive (and expensive) strategy.
Write-back
The data item in Mi+1 corresponding to a modified item in Mi is not updated until it (or the block/page/etc. in Mi that contains it) is replaced or removed. This is the most efficient approach, but it cannot be used (without modification) when multiple processors share Mi+1, …, Mn.
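The two strategies can be contrasted in a minimal sketch of a one-level cache over a backing store. The class and method names are illustrative, not a real cache API, and replacement policy is ignored:

```python
# Sketch of the two coherence strategies (illustrative names only).

class WriteThroughCache:
    def __init__(self, backing):
        self.backing = backing          # dict standing in for M_{i+1}
        self.lines = {}

    def write(self, addr, value):
        self.lines[addr] = value
        self.backing[addr] = value      # update M_{i+1} immediately

class WriteBackCache:
    def __init__(self, backing):
        self.backing = backing
        self.lines = {}
        self.dirty = set()

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)            # defer the update to M_{i+1}

    def evict(self, addr):
        if addr in self.dirty:
            self.backing[addr] = self.lines[addr]   # write back on replacement
            self.dirty.discard(addr)
        self.lines.pop(addr, None)

memory = {}
wt = WriteThroughCache(memory)
wt.write(0x10, 7)
print(memory[0x10])                    # visible in memory at once

memory2 = {}
wb = WriteBackCache(memory2)
wb.write(0x10, 7)
print(0x10 in memory2)                 # memory still stale until eviction
wb.evict(0x10)
print(memory2[0x10])                   # updated only on eviction
```

The sketch makes the trade-off concrete: write-through pays a backing-store update on every write, while write-back batches updates but leaves the outer level temporarily inconsistent, which is exactly why it needs extra machinery when Mi+1 is shared.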
Memory Capacity Planning:
The performance of a memory hierarchy is determined by the effective access time (Teff) to any level in the hierarchy. It depends on the hit ratios and access frequencies at successive levels.
Hit Ratio (h): This is a concept defined for any two adjacent levels of a memory hierarchy. When an information item is found in Mi, it is a hit; otherwise, it is a miss. The hit ratio (hi) at Mi is the probability that an information item will be found in Mi. The miss ratio at Mi is defined as 1 - hi.
The access frequency to Mi is defined as
fi = (1-h1)(1-h2)…(1-hi-1)hi
Effective Access Time (Teff):
In practice, we wish to achieve as high a hit ratio as possible at M1. Every time a miss occurs, a penalty must be paid to access the next higher level of memory. The Teff of an n-level memory hierarchy is given by:
Teff = Σ (i = 1 to n) fi·ti
where ti is the access time of level Mi.
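These definitions can be checked numerically. The hit ratios and access times below are illustrative; the last level is given a hit ratio of 1, since a request that misses everywhere else must be satisfied there:

```python
# Effective access time Teff = sum_i f_i * t_i, where
# f_i = (1-h1)(1-h2)...(1-h_{i-1}) * h_i is the access frequency of level i.

def access_frequencies(hit_ratios):
    freqs, miss_so_far = [], 1.0
    for h in hit_ratios:
        freqs.append(miss_so_far * h)   # reach level i only after missing above
        miss_so_far *= (1.0 - h)
    return freqs

def effective_access_time(hit_ratios, access_times):
    return sum(f * t for f, t in
               zip(access_frequencies(hit_ratios), access_times))

# Cache: 95% hit ratio at 1 ns; main memory (last level, h = 1) at 10 ns.
print(effective_access_time([0.95, 1.0], [1.0, 10.0]))   # 0.95*1 + 0.05*10 = 1.45
```

Note how heavily Teff depends on the first-level hit ratio: dropping h1 from 0.95 to 0.90 raises Teff from 1.45 ns to 1.9 ns with the same hardware.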
Hierarchy Optimization:
The total cost of a memory hierarchy is estimated as:
Ctotal = Σ (i = 1 to n) ci·si
where ci is the cost per byte and si the capacity of level Mi.
Note that this means consecutive addresses are stored within the same module, except at the boundary. The above arrangement is called high-order interleaving, because it uses the high-order, i.e. most significant, bits of the address to determine which module the word is stored in.
Low-Order Interleaving
An alternative would be to use the low bits for that purpose. In our example here, for instance, this would entail feeding bus lines A0-A1 into the decoder, with bus lines A2-A27 being tied to the address pins of the memory modules. This would mean the following storage pattern:
In other words, consecutive addresses are stored in consecutive modules, with the understanding that this is mod 4, i.e. we wrap back to M0 after M3.
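The two decodings can be sketched directly. The module count and module size below are illustrative (four modules, as in the example above):

```python
# Address decoding for 4-way interleaving.
# Low-order: module number = low address bits, so consecutive addresses
#            land in consecutive modules (mod m).
# High-order: module number = high address bits, so consecutive addresses
#             stay in one module until its boundary is crossed.

M = 4             # number of modules
WORDS = 16        # words per module (illustrative size)

def low_order(addr):
    return addr % M, addr // M          # (module, offset within module)

def high_order(addr):
    return addr // WORDS, addr % WORDS  # (module, offset within module)

print([low_order(a)[0] for a in range(6)])    # modules 0,1,2,3,0,1 - wraps mod 4
print([high_order(a)[0] for a in range(6)])   # all in module 0
```

Low-order interleaving is what lets a block of consecutive words be fetched from all m modules in parallel, which is the basis of the bandwidth discussion below.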
Bandwidth
The memory bandwidth (B) of an m-way interleaved memory is lower-bounded by 1 and upper-bounded by m. Hellerman's approximation of B is:
B = m^0.56 ≈ √m
In this equation m denotes the number of interleaved memory modules. The equation indicates that the effective memory bandwidth is approximately twice that of a single module when four memory modules are used.
This pessimistic estimate stems from the fact that block accesses of different lengths and accesses of single words are randomly mixed in user programs. Hellerman's calculation was based on a single-processor system. The effective memory bandwidth decreases further if memory-access conflicts from multiple processors are considered.
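Hellerman's estimate is easy to evaluate for a few module counts, which makes the sub-linear scaling concrete:

```python
# Hellerman's estimate B = m**0.56 (close to sqrt(m)) for the effective
# bandwidth of m interleaved modules.

for m in (1, 4, 16, 64):
    print(m, round(m ** 0.56, 2))
# m = 4 gives about 2.17 (roughly double a single module);
# m = 16 gives only about 4.72, far below the upper bound of 16.
```

The gap between m and m^0.56 is the cost of random mixing of access patterns; an ideal sequential block access would approach the upper bound m instead.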
Fault Tolerance
To achieve various interleaved memory organizations, low-order and high-order interleaving can be combined. In high-order interleaved memory, sequential addresses are allocated within each memory module. This makes it simple to isolate faulty memory modules in a memory bank of m memory modules: if one module failure is detected, the remaining modules can still be used by opening a window in the address space. This fault isolation cannot be performed in low-order interleaved memory, where a module failure may paralyze the complete memory bank. Hence, low-order interleaved memory is not fault tolerant.
Backplane Buses
A backplane bus interconnects processors, data storage and peripheral devices in a tightly coupled hardware configuration. The system bus must be designed to allow communication between devices on the bus without disturbing the internal activities of all the devices attached to the bus. These are typically 'intermediate' buses, used to connect a variety of other buses to the CPU-memory bus. They are called backplane buses because they are restricted to the backplane of the system.
Backplane bus specification
They are generally connected to the CPU-memory bus by a bus adaptor, which handles translation between the buses. Commonly, this is integrated into the CPU-memory bus controller logic. While these buses can be used to directly control devices, they are mostly used as 'bridges' to other buses (for example, the AGP bus). A backplane bus interconnects the circuit boards containing processor, memory and I/O interfaces in an interconnection communication structure within the chassis. In the VME bus, for example:
Data, address and control lines form the data transfer bus (DTB).
The DTB arbitration bus provides control of the DTB to a requester using the arbitration logic.
The interrupt and synchronization bus is used for handling interrupts.
The utility bus includes signals that provide periodic timing and coordinate the power-up and power-down sequences of the system.
The backplane bus is made of signal lines and connectors. A special bus controller board is used to house the backplane control logic, such as the system clock driver, arbiter, bus timer and power driver.
Functional modules: A functional module is a collection of electronic circuitry that resides on one functional board and works to achieve special bus control functions. These functions are:
An arbiter is a functional module that accepts bus requests from the requester modules and grants control of the DTB to one requester at a time.
A bus timer measures the time each data transfer takes on the DTB and terminates the DTB cycle if a transfer takes too long.
An interrupter module generates an interrupt request and provides status/ID information when an interrupt handler module requests it.
A location monitor is a functional module that monitors data transfers over the DTB.
A power monitor watches the status of the power source and signals when the power becomes unstable.
A system clock driver is a module that provides a clock timing signal on the utility bus. In addition, board interface logic is needed to match the signal line impedance, the propagation time and the termination values between the backplane and the plug-in board.
Asynchronous Data Transfer
All the operations in a digital system are synchronized by a clock that is generated by a pulse generator. The CPU and I/O interface can be designed independently, or they can share a common clock. If the CPU and I/O interface share a common clock, the transfer of data between the two units is said to be synchronous. There are some disadvantages of synchronous data transfer, such as:
• It is not flexible, as all bus devices run at the same clock rate.
• Execution times are multiples of clock cycles (if an operation needs 3.1 clock cycles, it will take 4 cycles).
• The bus frequency has to be adapted to the slower devices; thus, one cannot take full advantage of the faster ones.
• It is particularly unsuitable for an I/O system in which the devices are comparatively much slower than the processor.
To overcome these problems, asynchronous data transfer is used for the input/output system. The word 'asynchronous' means not in step with the elapse of time. In asynchronous data transfer, the CPU and I/O interface are independent of each other; each uses its own internal clock to control its registers. There are two popular techniques used for such data transfer: strobe control and handshaking.
Strobe Control
In strobe control, a control signal called the strobe pulse, supplied from one unit to the other, indicates that a data transfer has to take place. Thus, for each data transfer, a strobe is activated either by the source or by the destination unit. A strobe is a single control line that informs the destination unit that valid data is available on the bus. The data bus carries the binary information from the source unit to the destination unit.
Data transfer from source to destination
The st eps involved in dat a t ransfer from source t o dest inat ion are as follow s: (i) The source unit places dat a on
t he dat a bus.
(ii) A source act ivat es t he st robe aft er a brief delay in order t o ensure t hat dat a values are st eadily placed on
t he dat a bus.
(iii) The informat ion on dat a bus and st robe signal remain act ive for some t ime t hat is sufficient for t he
dest inat ion t o receive it .
(iv) Aft er t his t ime t he sources remove t he dat a and disable t he st robe pulse, indicat ing t hat dat a bus does
not cont ain t he valid dat a.
(v) Once new dat a is available, st robe is enabled again.
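The steps above can be sketched as a toy simulation of a source-initiated strobe transfer. The class and signal names are illustrative; real hardware would involve timing constraints that a sequential program cannot capture:

```python
# Toy simulation of source-initiated strobe transfer: the source drives
# the data bus, raises the strobe, and the destination latches the data
# while the strobe is active (illustrative names only).

class Bus:
    def __init__(self):
        self.data = None
        self.strobe = False

class Destination:
    def __init__(self):
        self.latched = None

    def sample(self, bus):
        if bus.strobe:                 # strobe says valid data is on the bus
            self.latched = bus.data

def strobe_transfer(bus, dest, value):
    bus.data = value                   # (i)  place data on the data bus
    bus.strobe = True                  # (ii) activate the strobe
    dest.sample(bus)                   # (iii) destination receives the data
    bus.strobe = False                 # (iv) disable strobe, remove data
    bus.data = None

bus, dest = Bus(), Destination()
strobe_transfer(bus, dest, 0x2A)
print(hex(dest.latched))              # the destination latched 0x2a
```

Note what the sketch cannot express: the destination never acknowledges receipt, which is exactly the weakness that the handshaking technique (with its request and reply lines) addresses.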
[Figure: Source-initiated strobe transfer - source unit connected to the destination unit by a data bus, with the strobe line driven by the source]
[Figure: Destination-initiated strobe transfer - source unit connected to the destination unit by a data bus, with the strobe line driven by the destination]
[Figure: Source-initiated handshake transfer - source and destination units connected by a data bus with request and reply lines]
[Figure: Destination-initiated handshake transfer - destination and source units connected by a data bus with request and reply lines]
BGACK - At the end of the current bus cycle, the potential bus master takes control of the system buses and asserts a bus grant acknowledge signal to inform the old bus master that it is now controlling the buses. This signal should not be asserted until the following conditions are met:
1. A bus grant has been received.
2. Address strobe is inactive, which indicates that the microprocessor is not using the bus.
3. Data transfer acknowledge is inactive, which indicates that neither memory nor peripherals are using the bus.
4. Bus grant acknowledge is inactive, which indicates that no other device is still claiming bus mastership.
On a typical I/O bus, however, there may be multiple potential masters, and there is a need to arbitrate between simultaneous requests to use the bus. The arbitration can be either central or distributed. Centralized bus arbitration is a scheme in which a dedicated arbiter has the role of bus arbitration. In the central scheme, it is assumed that there is a single device (usually the CPU) that has the arbitration hardware. The central arbiter can determine priorities and can force termination of a transaction if necessary. Central arbitration is simpler and lower in cost for a uniprocessor system. It does not work as well for a symmetric multiprocessor design unless the arbiter is independent of the CPUs.
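The core decision a central arbiter makes can be sketched as a fixed-priority selection among simultaneous requests. This is a minimal illustration, assuming a simple fixed-priority policy; real arbiters may use rotating or fairness-based schemes:

```python
# Sketch of a fixed-priority central arbiter: among simultaneous
# requests, the lowest-numbered (highest-priority) device wins the bus.

def arbitrate(requests):
    """requests is a list of booleans indexed by device priority.
    Returns the index of the granted device, or None if the bus is idle."""
    for device, requesting in enumerate(requests):
        if requesting:
            return device
    return None

print(arbitrate([False, True, True]))    # device 1 outranks device 2
print(arbitrate([False, False, False]))  # no requests: bus stays idle
```

A fixed-priority policy like this can starve low-priority devices under heavy load, which is one reason multiprocessor designs favor an arbiter that is independent of any single CPU.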
[Figure: Centralized bus arbitration - central arbiter with bus request and bus release lines]