
HOME WORK 3

ON

COMPUTER ORGANIZATION AND ARCHITECTURE

Class: M.C.A

SECTION: RE3004

Course Code: CAP211

Submitted To: Lect. Richa Malhotra
Submitted by: Vikas Gupta
Part – A

Q1: Give a comparative study of RISC and CISC architectures.

Ans. RISC vs CISC is a topic quite popular on the Net. Every time Intel (CISC) or Apple (RISC) introduces a new CPU, the topic pops up again. But what are CISC and RISC exactly, and is one of them really better?

This answer explains in simple terms what RISC and CISC are and what the future might bring for both of them. It is by no means intended to be pro-RISC or pro-CISC; draw your own conclusions.

CISC:-

Pronounced sisk, CISC stands for Complex Instruction Set Computer. Most PCs use CPUs based on this architecture. Typically, CISC chips have a large number of different and complex instructions. The philosophy behind them is that hardware is always faster than software, so one should build a powerful instruction set that lets programmers do a lot with short assembly programs. In general, CISC chips are relatively slow per instruction (compared to RISC chips), but programs need fewer instructions.

RISC:-

Pronounced risk, RISC stands for Reduced Instruction Set Computer. RISC chips evolved around the mid-1980s as a reaction to CISC chips. The philosophy behind them is that almost no one uses the complex assembly-language instructions offered by CISC; people mostly use compilers, which rarely emit complex instructions. Apple, for instance, uses RISC chips. Therefore fewer, simpler and faster instructions would be better than the large, complex and slower CISC instructions, although more instructions are needed to accomplish a task. Another advantage of RISC is that, in theory, because of the simpler instructions, RISC chips require fewer transistors. Finally, it is easier to write powerful optimised compilers, since fewer instructions exist.
Characteristics of RISC vs CISC

RISC                                   CISC
Simple instructions taking 1 cycle     Complex instructions taking multiple cycles
Only LOAD/STORE reference memory       Any instruction can reference memory
Highly pipelined                       Not pipelined or less pipelined
Instructions executed by hardware      Instructions interpreted by microprogram
Fixed-format instructions              Variable-format instructions
Few instructions and modes             Many instructions and modes
Complexity is in the compiler          Complexity is in the microprogram

There is still considerable controversy among experts about which architecture is better. Some say that RISC is cheaper and faster and therefore the architecture of the future. Others note that by making the hardware simpler, RISC puts a greater burden on the software: software needs to become more complex, and software developers need to write more lines for the same tasks. They therefore argue that RISC is not the architecture of the future, since conventional CISC chips are becoming faster and cheaper anyway.

RISC has now existed for more than 10 years and hasn't been able to kick CISC out of the market. If we forget about the embedded market and mainly look at the market for PCs, workstations and servers, I guess at least 75% of the processors are based on the CISC architecture. Most of them use the x86 standard (Intel, AMD, etc.), but even in mainframe territory CISC is dominant via the IBM/390 chip. It looks like CISC is here to stay.

Is RISC then really not better? The answer isn't quite that simple. RISC and CISC architectures are becoming more and more alike. Many of today's RISC chips support just as many instructions as yesterday's CISC chips. The PowerPC 601, for example, supports more instructions than the Pentium, yet the 601 is considered a RISC chip, while the Pentium is definitely CISC. Furthermore, today's CISC chips use many techniques formerly associated with RISC chips.

Q2: Taking a suitable example, illustrate how compiler-based optimization is performed in RISC systems.

Ans. The numerical solution of Maxwell's equations is a computationally intensive task, and the use of high-performance parallel computing facilities is necessary for the larger class of practical problems in scattering, propagation and antenna modeling. It is therefore necessary to carefully consider algorithm optimizations aimed at improving the code's run-time performance on the computing platform employed. Although some performance improvement can be derived from compiler-level optimizations, further speed-up may involve manual effort in algorithm restructuring, data layout, and parallelization.

Typically, the most dramatic speed-up after code optimization is achieved by concentrating on the matrix-fill and solution steps (LU decomposition or the inverse FFT employed in the iterative solver) of the code. We therefore concentrate on optimization techniques which emphasize performance improvements for these steps and are aimed at reducing both CPU and wall-clock time.

Optimization Process Diagram

Source code
  |
  | 1. Hand-tuning
  v
Hand-tuned source code
  |
  | 2. Preprocessor optimization
  v
Preprocessor-tuned source code
  |
  | 3. Compiler front-end optimization
  v
Intermediate language code
  |
  | 4. Back-end optimization
  v
Object code

The optimization process

At the hand-tuning stage the user performs a number of optimizations at the source-code level. Examples of such optimizations include reordering programming statements or expressions and changing the memory access patterns of loops [2, 3].

Next, an optimizing preprocessor, if available, takes the source code and performs transformations enabled by user-selectable switches. Typical examples are dead-code elimination, inlining, interprocedural analysis, library-call generation, etc.

The output source code is also optimized to take advantage of architectural features of the host system. Then the output source code is submitted to the compiler, whose front end translates it into intermediate language (IL).

The compiler's back-end optimizer then translates the IL code into machine language and, in this process, may apply a wide range of optimizations at the IL level, depending on user-selectable flag settings which invoke specific sets of optimizations.

The sufficiency of the overall program performance is subsequently assessed either intuitively or, more formally, by using performance characterization tools which provide bounds on the achievable performance of the code [4, 5].

A performance-bound hierarchy model may successively include the effects of machine peak performance, high-level application workloads, compiler-inserted overhead, compiler-generated instruction schedule, cache effects, etc., and may be applied to loops, procedures, sections or entire codes. If the later effects can be reduced or eliminated, performance approaches the earlier bounds, which represent the potential performance of the application.

Once a program has been sufficiently optimized for a single processor, the next step is to assess whether the application code can take advantage of multi-processor computing platforms. If so, the application is then parallelized and optimized for parallel execution.

Optimization Examples

It is hard to overstress the importance of optimization. The examples below demonstrate some simple techniques that can significantly improve code performance and speed-up. Details on the employed techniques themselves and further examples may be found in [2, 3].

An Array-Processing Example
Consider the two codes shown below, which perform element-wise array multiplication [6]. The two codes are clearly functionally equivalent.
a) stride_n.f:

      do i=1,n
        do j=1,n
          c(i,j)=c(i,j)+a(i,j)*b(i,j)
        enddo
      enddo

b) stride1.f:

      do j=1,n
        do i=1,n
          c(i,j)=c(i,j)+a(i,j)*b(i,j)
        enddo
      enddo

FIGURE: The array multiplication. (a) Original (stride_n). (b) Loop interchanged (stride1).

The only difference between the two examples in the figure is that the array elements are referenced in a different order. All runs were made on an IBM RISC System/6000, Model 530 with a 64 KB cache. The arrays were all declared REAL*8. A timing loop was inserted around the loops in the examples so that the reported time is the average over 50 million inner-loop iterations.

The next figure shows this time (in microseconds) as a function of n.

[Chart omitted: time per iteration in microseconds (0 to 5) plotted against n (10 to 500) for the stride1 and stride_n versions]

FIGURE: Performance on a RISC System/6000 Model 530 with 64 KB data cache

As seen in the figure, the performance differs significantly between the two codes. For small n, there is little difference in performance, but as n grows, stride1 runs significantly faster than stride_n. In FORTRAN, arrays are stored in "column major order", implying that the leftmost subscript changes most rapidly as memory-adjacent elements are accessed.

In the stride1 routine, successive iterations of the inner loop access array elements that are adjacent in memory. That is, the array elements are accessed in the same order as they are stored in memory.

However, in stride_n successive iterations of the inner loop access array elements that are stored n entries (one array column) apart in memory. In this case the arrays are said to be accessed with stride n.

When a single element is read into the processor, adjacent elements


(comprising one “cache line”) are automatically brought into the high-
speed cache memory along with it. The user has no choice regarding this
automatic procedure of cache loading. Clearly, if all entries brought into
the cache are soon referenced (as in stride1), there is a memory access
delay only for the first element in each cache line that the processor reads
in.

However, if other entries in this line are referenced much later (as in
stride_n), the line with the referenced entries may get replaced in the
cache before they are referenced; referencing an element that is in the
cache is called a cache hit, otherwise the reference is a cache miss and
suffers a delay called the miss penalty. The advantage of the stride1 code
is that there is roughly one miss per cache line of elements accessed,

whereas almost every access to an element is a miss in the stride_n code, unless n is small enough that all three arrays fit in the cache and remain there indefinitely. This scenario is easily seen in the figure, where both stride1 and stride_n versions take the same time to run for n ≤ 50: three arrays of 50 × 50 × 8 bytes = 60 KB < size of the cache (64 KB).

Obviously, an understanding of the machine’s cache structure is


important in writing code routines that have the best potential for
optimum performance.
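The same effect can be reproduced in C. Note that C stores arrays in row-major order, the opposite of FORTRAN's column-major order, so in C it is the rightmost subscript that must vary fastest for stride-1 access. A minimal sketch (the array size and function names are illustrative, not from the original study):

```c
#include <assert.h>

#define N 512   /* illustrative size; three double arrays exceed a
                   64 KB cache once N is larger than about 53 */

static double a[N][N], b[N][N], c[N][N];

/* stride-N version: the inner loop steps down a column, jumping a
 * whole row (N doubles) per iteration, so nearly every access misses
 * the cache once the arrays outgrow it. */
void multiply_stride_n(void) {
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            c[i][j] += a[i][j] * b[i][j];
}

/* stride-1 version: the inner loop walks along a row, touching
 * memory-adjacent elements, so every element of a fetched cache
 * line is used before the line is evicted. */
void multiply_stride_1(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] += a[i][j] * b[i][j];
}
```

Timing each function (for example with clock() around repeated calls) should show the stride-1 version running several times faster for large N, even though the two are functionally identical.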

Q3: Can instructions be executed in a pipeline? If yes, take an example instruction and execute it using an instruction pipeline.

Ans. An instruction pipeline is very similar to a manufacturing assembly line. Imagine an assembly line partitioned into four stages:

1st stage receives some parts, performs its assembly task, and passes the results to the second stage;

2nd stage takes the partially assembled product from the first stage,
performs its task, and passes its work to the third stage;

3rd stage does its work, passing the results to the last stage, which
completes the task and outputs its results.

As the first piece moves from the first stage to the second stage, a new set
of parts for a new piece enters the first stage. Ultimately, every stage
processes a piece simultaneously. This is how time is saved. Each product
requires the same amount of time to be processed (actually slightly more,
to account for the transfers between stages), but products are
manufactured more quickly because several are being created at the same
time.

Consider a non-pipelined machine with 6 execution stages of lengths 50 ns, 50 ns, 60 ns, 60 ns, 50 ns, and 50 ns.
- Find the instruction latency on this machine.
- How much time does it take to execute 100 instructions?

Instruction latency = 50 + 50 + 60 + 60 + 50 + 50 = 320 ns
Time to execute 100 instructions = 100 × 320 ns = 32,000 ns

Suppose we introduce pipelining on this machine. Assume that when introducing pipelining, the clock skew adds 5 ns of overhead to each execution stage.
- What is the instruction latency on the pipelined machine?
- How much time does it take to execute 100 instructions?

Solution:

Remember that in the pipelined implementation, the lengths of the pipe stages must all be the same, i.e., the speed of the slowest stage plus overhead. With 5 ns overhead this comes to:

Length of pipelined stage = MAX(lengths of unpipelined stages) + overhead = 60 + 5 = 65 ns
Instruction latency = 6 × 65 ns = 390 ns
Time to execute 100 instructions = 6 × 65 + 99 × 65 = 390 + 6435 = 6825 ns
(the first instruction takes the full six-stage latency; after that, one instruction completes every 65 ns cycle)
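The arithmetic above can be checked mechanically. A small sketch in C (the stage lengths and 5 ns overhead are those of the question; the function names are mine):

```c
#include <assert.h>

/* Non-pipelined latency: the sum of all stage lengths. */
int nonpipelined_latency(const int *stages, int n) {
    int t = 0;
    for (int i = 0; i < n; i++)
        t += stages[i];
    return t;
}

/* Pipelined clock period: the slowest stage plus the skew overhead. */
int pipelined_cycle(const int *stages, int n, int overhead) {
    int max = 0;
    for (int i = 0; i < n; i++)
        if (stages[i] > max)
            max = stages[i];
    return max + overhead;
}

/* Total pipelined time for k instructions: the first takes a full
 * n-stage latency, then one instruction completes every cycle. */
int pipelined_total(const int *stages, int n, int overhead, int k) {
    int cycle = pipelined_cycle(stages, n, overhead);
    return n * cycle + (k - 1) * cycle;
}
```

Plugging in the six stage lengths and 5 ns overhead reproduces the 320 ns, 65 ns, 390 ns and 6825 ns figures worked out above.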

Part - B

.
Q4: How are RISC pipelines implemented in a RISC environment?

Ans. RISC AND PIPELINING

One of the major advantages of the RISC approach is the simplicity of its pipeline implementation.

RISC design features that make pipelining easy include:

(a) Single-length instructions
(b) Relatively few instruction formats
(c) A load/store instruction set (only loads and stores access memory)
(d) Operands aligned in memory

One possible configuration of a RISC pipeline is the one implemented in the SPARC MB86900 CPU.

The IBM 801, the first RISC computer, also uses a four-stage instruction pipeline. Other processors, such as the RISC II, use only three stages; they combine the execute and store-result operations into a single stage.

A RISC pipeline therefore has three or four stages. Note that each stage has a register that latches its data at the end of the stage to synchronize data flow between stages.
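As a sketch of how such a pipeline overlaps work (a simplified model, not of any particular CPU), the cycle in which each instruction leaves a k-stage pipeline can be computed as follows:

```c
#include <assert.h>

#define STAGES 4   /* e.g. fetch, decode, execute, store result */

/* With one instruction entering the pipeline per cycle, instruction i
 * (0-based) enters the fetch stage in cycle i + 1 and leaves the last
 * stage STAGES - 1 cycles later. */
int completion_cycle(int i) {
    return i + STAGES;
}

/* Total cycles to run n instructions: the pipeline fills once, then
 * retires one instruction per cycle. */
int total_cycles(int n) {
    return STAGES + (n - 1);
}
```

For 100 instructions this gives 103 cycles instead of the 400 a non-pipelined four-stage machine would need, which is exactly the overlap the assembly-line analogy in Q3 describes.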

Q5: Give the super scalar architectures of Pentium processor.

Ans. Let us explain the superscalar architecture of the Pentium processor with the help of an example:

Intel Pentium ("P5" / "P54C")


Intel's new fifth-generation chip was expected to be called the 586, following their earlier naming conventions. However, with the rise of AMD and Cyrix, Intel wanted to be able to register the name of their new CPU as a trademark, and numbers can't be trademarked. Thus, the Pentium was born. It is now one of the most recognized trademarks in the computer world, one reason why Intel doesn't seem to ever want to make another processor whose name doesn't have "Pentium" in it somewhere.
The Pentium is the defining processor of the fifth generation. It has in fact
had several generations itself; the first Pentiums are different in many ways
from the latest ones. It has been the target for compatibility for AMD's K5
and Cyrix's 6x86 chips, as well as generations that have followed. The chip
itself is instruction set compatible with earlier x86 CPUs, although it does
include a few new (rarely used) instructions.

The Pentium provides greatly increased performance over the 486 chips that
precede it, due to several architectural changes. Roughly speaking,
a Pentium chip is double the speed of a 486 chip of the same clock speed. In
addition, the Pentium goes to much higher clock speeds than the 486 ever
did. The following are the key architectural enhancements made in the
Pentium over the 486-class chips (note that some of these are present in
Cyrix's 5x86 processor, but that chip was developed after the Pentium):

Superscalar Architecture: The Pentium is the first superscalar processor; it uses two parallel execution units. Some people have likened the Pentium to a pair of 486s in the same chip for this reason, though this really isn't totally accurate. It is only partially superscalar because the second execution unit doesn't have all the capabilities of the first; some instructions won't run in the second pipeline. In order to take advantage of the dual pipelines, code must be optimized to arrange the instructions in a way that will let both pipelines run at the same time. This is why you sometimes see reference to "Pentium optimization". Regardless, the performance is much higher than the single pipeline of the 486.

Wider Data Bus: The Pentium's data bus is doubled to 64 bits, providing
double the bandwidth for transfers to and from memory.

Much Faster Memory Bus: Most Pentiums run on 60 or 66 MHz system buses; most 486s run on 33 MHz system buses. This greatly improves performance. Pentium motherboards also incorporate other performance-enhancing features, such as pipelined burst cache. The Pentium processor was also the first specifically designed to work with the (then new) PCI bus.

Branch Prediction: The Pentium uses branch prediction to prevent pipeline stalls when branches are encountered.

Integrated Power Management: All Pentiums have built-in SMM power management (optional on most of the 486s).

Split Level 1 Cache: The Pentium uses a split level 1 cache, 8 KB each for
data and instructions. The cache was split so that the data and instruction
caches could be individually tuned for their specific use.
Improved Floating Point Unit: The floating point unit of the Pentium is
significantly faster than that of the 486.

The Pentium is available in a wide variety of speeds, and in regular and OverDrive versions. It is also available in several packaging styles, although the pin grid array (PGA) is still the most prevalent. The original Pentiums, the 60 and 66 MHz versions, were very different from the later versions used in most PCs; they used older, 5-volt technology and had significant problems with heat. Intel solved this with the later (75-200 MHz) versions by going to a smaller circuit size and 3.3-volt power.

Pentiums use three different sockets. The original Pentium 60 and 66 use Socket 4. Pentiums from 75 to 133 MHz will fit in either Socket 5 or Socket 7; the Pentium 150, 166 and 200 require Socket 7. Intel makes Pentium OverDrives that allow the use of faster Pentiums in older Pentium sockets (in addition to OverDrives that go in 486 motherboards).

The Pentium processor achieved a certain level of "fame" as a result of the bug that was discovered in its floating point unit not long after it was released. This is commonly known as the "FDIV" bug, after the instruction (floating point divide) in which it most commonly turns up. While bugs in processors are relatively common, they usually are minor and don't have a direct impact on computation results. This one did, and achieved great notoriety in part because Intel didn't own up to the problem and offer to correct it immediately. Intel does offer a replacement on affected processors, which were only found in early versions (60 to 100 MHz) sold in 1994 and earlier.

If you suspect your Pentium of having the FDIV bug, try this computation
test using a spreadsheet or calculator program: take the number 4,195,835
and divide it by 3,145,727. Then take the result and multiply it by the same
number again (3,145,727). You should of course get the same 4,195,835
back that you started with. On a PC with the FDIV bug you will get
4,195,579 (an error of 256), but beware that some operating systems and
applications have been patched to compensate for this bug, so a simple math
test isn't necessarily conclusive. Intel's web site has replacement information, if you suspect that you have an FDIV bug on your older Pentium chip.
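The check described above is easy to script. A minimal sketch in C (the function name is mine; on any FDIV-free FPU it returns 0):

```c
#include <assert.h>

/* The FDIV self-test from the text: divide 4,195,835 by 3,145,727 and
 * multiply the result back.  On a correct FPU the round trip matches
 * the original number up to floating-point rounding; on a Pentium
 * with the FDIV bug the result is 4,195,579, off by roughly 256. */
int has_fdiv_bug(void) {
    double x = 4195835.0;
    double y = 3145727.0;
    double roundtrip = (x / y) * y;
    double diff = roundtrip - x;
    if (diff < 0.0)
        diff = -diff;
    /* Anything beyond a tiny rounding error signals the bug. */
    return diff > 1.0;
}
```

As the text notes, a patched OS or application can mask the bug, so this software test isn't conclusive on an affected chip.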

For many years, the Pentium processor was the mainstream processor of
choice, but finally the Pentium with MMX has driven it to the economy
market. With the regular Pentium maxing out at 200 MHz and the Pentium
with MMX 166 dropping well below $200, the "Pentium Classic" doesn't
make nearly as much sense as it used to for new PCs. The 60 and 66 are
obsolete due to their slow speed and older technology, and the 75 to 150 are
obsolete because their performance is much lower than the 166 and 200, for
almost the same amount of money.

The entire classic Pentium line is now technically obsolete, due to the
availability of inexpensive, faster Pentium with MMX chips (as well as
comparable offerings from AMD and Cyrix). The non-MMX Pentium is no
longer generally used in new systems. However, since the Pentium with
MMX requires split rail voltage, the classic Pentium 200 remains a great
chip for those who have socket 7 motherboards and want to upgrade, but
who do not have split rail voltage support.

Q6: Is there any difference between the working of vector processors and array processors? Differentiate between SIMD and MIMD array processors.

Ans. Array processors are able to efficiently handle large amounts of data, but since this capability requires the CPU to be more complex, simpler operations are harder to perform. Differences between scalar and array processors became less pronounced as microprocessors advanced.

Vector and array processing are essentially the same: with slight and rare differences, a vector processor and an array processor are the same type of processor. A processor, or central processing unit (CPU), is a computer chip that handles most of the information and functions processed through a computer; in a vector or array processor, the instruction set includes operations that can perform mathematical operations on multiple data elements simultaneously. (The traditional distinction is one of implementation: a vector processor streams the elements through deeply pipelined functional units, while an array processor spreads them across many processing elements operating in lockstep.)

SIMD processor organization

• This type of machine typically has an instruction dispatcher, a very high-bandwidth internal network, and a very large array of very small-capacity instruction units.

• Thus a single instruction is executed by different processing units on different sets of data, as shown in the figure.

• Best suited for specialized problems characterized by a high degree of regularity, such as image processing and vector computation.

• Synchronous (lockstep) and deterministic execution.

• Two varieties: processor arrays (e.g., Connection Machine CM-2, MasPar MP-1, MP-2) and vector pipeline processors (e.g., IBM 9000, Cray C90, Fujitsu VP, NEC SX-2, Hitachi S82).

P1                 P2                 Pn
Prev instruct      Prev instruct      Prev instruct
Load A(1)          Load A(2)          Load A(n)
Load B(1)          Load B(2)          Load B(n)
C(1)=A(1)*B(1)     C(2)=A(2)*B(2)     C(n)=A(n)*B(n)
Store C(1)         Store C(2)         Store C(n)
Next instruction   Next instruction   Next instruction

Execution of instructions in a SIMD processor
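The lockstep operation in the figure can be mimicked in C. This scalar sketch performs the lane bodies one after another; on a SIMD machine (or with a vectorizing compiler) all lanes execute the same instruction simultaneously. The names and lane count are illustrative:

```c
#include <assert.h>

#define LANES 8   /* number of processing elements P1..Pn */

/* One "instruction" from the figure: every lane k loads A(k) and B(k),
 * computes C(k) = A(k) * B(k), and stores C(k).  The loop stands in
 * for the parallel hardware that would execute all lanes at once. */
void simd_multiply(const double *a, const double *b, double *c) {
    for (int k = 0; k < LANES; k++)
        c[k] = a[k] * b[k];
}
```

The point of the SIMD organization is that this whole loop body, for all lanes, is one instruction fetch and decode rather than LANES of them.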

Multiple instruction stream, multiple data stream (MIMD)

• Multiple Instruction: every processor may be executing a different instruction stream.

• Multiple Data: every processor may be working with a different data stream, as shown in the figure; shared memory can provide the multiple data streams.

• Can be categorized as loosely coupled or tightly coupled, depending on the sharing of data and control.

• Execution can be synchronous or asynchronous, deterministic or nondeterministic.

P1                 P2                 Pn
Prev instruct      Prev instruct      Prev instruct
Load A(1)          Call funcD         Do 10 i=1,n
Load B(1)          X=y*z              Alpha=w**3
C(1)=A(1)*B(1)     Sum=x*2            Zeta=c(i)
Store C(1)         Call sub1(i,j)     10 continue
Next instruction   Next instruction   Next instruction

Execution of instructions in a MIMD processor
