CRISC Research
All content following this page was uploaded by Aws Fida El-Din on 20 April 2014.
ABSTRACT
The comparative study of CISC (Complex Instruction Set Computer) and RISC
(Reduced Instruction Set Computer) has been a well-known research area for many
years. In our paper, we try to address the new trend of these two architectures, which
is CRISC (Complex-Reduce Instruction Set Computer). We chose the Intel Core Duo
processor, Intel's most recent processor, to be the focus of our study. The Core Duo
processor features will be highlighted, focusing on pipelining stages, clock speed,
number of transistors, instruction set architecture (ISA), and the improvement in
cache technology.
1. Introduction
From an architectural point of view, microprocessor chips can be classified into
two categories: Complex Instruction Set Computers (CISC) and Reduced Instruction
Set Computers (RISC). In either case, the objective is to improve system
performance. The debates between these two architectures have made this research area
very interesting, challenging, and sometimes confusing.
From the 1960s onward, CISC processors became prevalent, each successive processor
having more complicated hardware and a more complex instruction
set. In Intel's line, this trend ran from the 80486 and Pentium MMX to the Pentium III.
RISC (Reduced Instruction Set Computer) chips evolved around the mid-1970s as a
reaction to CISC chips. In the 1970s, John Cocke at IBM's T.J. Watson Research Center
provided the fundamental concepts of RISC. The idea came from the IBM 801
minicomputer, built in 1971, which was used as a fast controller in a very large telephone
switching system. This machine had many of the traits a later RISC chip would have: few
instructions, fixed-size instructions in a fixed format, single-cycle execution, and
a load/store architecture. These ideas were further refined and
articulated by a group at the University of California, Berkeley, led by David Patterson,
who coined the term "RISC". They realized that RISC promised higher performance,
lower cost, and faster design time [1]. Simple load/store computers such as MIPS
are commonly called RISC architectures. After David A. Patterson coined the
term RISC, John L. Hennessy developed the MIPS architecture as a representative
RISC design [2].
The second section of this paper provides a comparison between CISC and RISC.
In section 3, we present some related work showing the attempts that have been made
on these two architectures, based on different techniques. Section 4 presents the
CRISC technology. Our work focuses on Intel's processor family to illustrate
the concept of CRISC. Finally, the last section concludes this work.
In this section, our comparison will be based on the above key factors, to highlight the
major differences between the two architectures CISC and RISC.
As time passed, one non-RISC architecture with a large market was the Intel x86
family, which has some specific characteristics that became associated with CISC
[1]:
• Segmented memory model
• Few registers
• Poor floating-point performance
Typically, CISC chips have a large number of different and complex instructions. It
was believed that hardware is always faster than software; therefore one should build
a powerful instruction set, which provides programmers with assembly instructions that
do a lot in short programs. Generally speaking, CISC chips are relatively slow per
instruction compared to RISC chips, but use fewer instructions than RISC.
Most actual RISC machines such as the RISC I and RISC II from the University of
California at Berkeley and the MIPS from Stanford University have most of the
following common properties: [1]
• Simple primitive instructions and addressing modes
• Instructions execute in one clock cycle
• Uniform-length instructions and fixed instruction format
• Instructions interface with memory via fixed mechanisms (load/store)
• Pipelining
• Instruction set is orthogonal (little overlapping of instruction functionality)
• Hardwired control
• Complexity pushed to the compiler
Additional properties of CISC and RISC, regarding cost and performance together
[14] are:
• RISC design is approximately twice as cost-effective as CISC.
• RISC architectures are designed for a good cost/performance, whereas CISC
architectures are designed for a good performance on slow memories.
The essence of RISC architecture is that it allows the execution of more operations in
parallel and at a higher rate than possible with a CISC architecture employing similar
implementation complexity. It can not only improve parallelism using pipelining, but
also make superscalar and out-of-order execution possible [15].
RISC has also been called a "scalable architecture" because it is possible to go from
one technology to another with practically the same design [3].
Back in the middle to late 1980s, the battle between RISC and CISC was mainly non-Intel
versus Intel x86, and RISC seemed to have a clear advantage until the appearance of the
i486, the Pentium, and later the Pentium II and Pentium III. Intel's machines still run
the old instruction set, but they have adopted some RISC-like characteristics such as
single-cycle execution, clean memory models, deep pipelining, superscalar operation,
many registers, and even out-of-order execution. They run faster and faster, with decent
floating-point performance. On the other hand, some RISC machines have added more
instructions to their architectures for new data types. So it seems the RISC-CISC gap
has narrowed.
So, nowadays, the difference between RISC and CISC is no longer one of instruction
sets, but of the whole chip architecture and system. The designations RISC and CISC
are no longer meaningful in the original sense. What counts in the real world is always
how fast a chip can execute the instructions it is given and how well it runs existing
software [1].
RISC's original goal was to limit the number of instructions on the chip so that each
could be allocated enough transistors to make it execute in one cycle. Rather than
provide a mul instruction, for example, the microprocessor's designers might make
sure that add executes in one clock. Then a compiler could multiply a and b by adding
a to itself b times or b to itself a times. A CISC could multiply 5 by 10 as follows
[5]:

        mov ax, 10
        mov bx, 5
        mul bx          ; dx:ax = ax * bx, so ax = 50

A RISC-style sequence would reach the same result by repeated addition:

        mov ax, 0       ; clear the accumulator
        mov bx, 10      ; value to add
        mov cx, 5       ; number of additions
Begin:
        add ax, bx      ; ax = ax + bx
        loop Begin      ; decrement cx and repeat until cx = 0 (5 passes)
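The trade-off in the two listings above can be made concrete with a small sketch. The Python function below is only an illustration, not part of the cited article: it mimics the repeated-addition loop and also reports how many add operations were executed, so the dynamic instruction count can be compared with the single mul of the CISC version. The function name and the returned tuple are assumptions made for this example.

```python
def multiply_by_repeated_addition(a: int, b: int) -> tuple[int, int]:
    """Multiply a by b using only additions.

    Returns (product, number_of_adds). b must be non-negative,
    because it is used as a loop counter (like cx in the listing).
    """
    if b < 0:
        raise ValueError("b must be non-negative in this sketch")
    acc = 0    # plays the role of ax
    adds = 0   # how many add operations were executed
    for _ in range(b):   # plays the role of the cx-controlled loop
        acc += a         # add ax, bx
        adds += 1
    return acc, adds


if __name__ == "__main__":
    product, adds = multiply_by_repeated_addition(10, 5)
    print(product, adds)
```

With a = 10 and b = 5 the loop performs five additions, which is why a compiler would normally pick the smaller operand as the loop count.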
The two architectures, CISC and RISC, can be compared based on the instruction set,
which is an important feature of computer architecture. The instruction set chosen for
a particular processor determines the way that machine language programs are
constructed. Another way of comparing these two kinds of processor is to study the
available addressing modes. This gives us an idea of how memory and registers are
referenced, which is one of the important factors in a performance
comparison. Other factors for comparing the two architectures are the integer
and floating-point units. Recently, one of the most important factors has been instruction
pipelining. In general, more pipeline stages allow a higher clock rate and higher
instruction throughput, provided that stalls between dependent instructions are kept
low. Several researchers have been working on instruction
pipelining because of the impact of this feature on overall performance. The
cache and the main memory are also primary factors affecting processor
performance. The microprocessor industry has always been greedy for speed, while the
memory industry has been greedy for capacity, causing a big gap between CPU and memory
speeds [6]. The number of transistors in each processor can also affect the speed of
the processor. If the processor contains more transistors, it has more
gates. This makes the design of the processor at the gate level more compact, and as
a result it tends to be faster, with better performance.
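The effect of pipeline depth on throughput can be sketched with the usual textbook idealization, which is an assumption here and is not taken from the cited works: every stage takes exactly one cycle and there are no stalls. Under that assumption, n instructions on a k-stage pipeline need k + n - 1 cycles, against n * k cycles without pipelining. The function names are hypothetical.

```python
def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles to run n instructions on an ideal k-stage pipeline (no stalls)."""
    if n_instructions == 0:
        return 0
    # The first instruction needs k cycles to drain through the pipeline;
    # each following instruction completes one cycle later.
    return n_stages + n_instructions - 1


def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles when each instruction must finish all k steps before the next starts."""
    return n_instructions * n_stages


def ideal_speedup(n_instructions: int, n_stages: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts."""
    return unpipelined_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages
    )


if __name__ == "__main__":
    for n in (1, 10, 100, 10_000):
        print(n, pipelined_cycles(n, 14), round(ideal_speedup(n, 14), 2))
```

The speedup approaches the number of stages only for long instruction streams, and real pipelines fall short of this bound because of data dependencies and branches.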
Table-2 shows some system comparisons between selected CISC and RISC
processors: [4]
Figure-1 shows the system architecture for the combined machine at high level of
abstraction.
The researcher concluded that the resulting ability to increase computation power by
adding extra hardware when software is added provides an effective means of
accommodating new, more demanding applications within the confines of an existing
machine [8].
Table-3 shows the summary of MIPS R2000 and 80386 architectures [1].
Another experimental study was done by Bhandarkar and Clark [9]. The study
compared an example implementation of each of the RISC and CISC architectures (a MIPS
M2000 and a VAX 8700) on nine of the ten SPEC benchmarks. The organizational
similarities of these machines provide an opportunity to examine the purely
architectural advantages of RISC. The RISC approach promises many advantages
over CISC architectures, including superior performance, design simplicity, and rapid
development time [9]. The researchers' finding was that the RISC machine, the MIPS M2000,
has significantly higher architecturally determined performance than the CISC machine, the
Digital VAX 8700.
The performance of a RISC depends greatly on the quality of the code that it is executing. If
the programmer (or compiler) does a poor job of instruction scheduling, the
processor can spend quite a bit of time stalling, waiting for the result of one instruction
before it can proceed with a subsequent instruction.
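A minimal sketch of why scheduling matters is given below. It assumes a deliberately simplified in-order machine that issues at most one instruction per cycle and stalls until all source registers are ready; the Instr type, the latencies, and the register names are hypothetical and are not taken from any of the processors discussed here.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str               # register written by the instruction
    srcs: tuple[str, ...]   # registers read by the instruction
    latency: int            # cycles after issue until dest can be used


def count_stalls(program: list[Instr]) -> int:
    """Count stall cycles on a single-issue, in-order machine."""
    ready_at: dict[str, int] = {}   # register -> first cycle its value is usable
    cycle = 0
    stalls = 0
    for ins in program:
        # An instruction cannot issue before all of its sources are ready.
        earliest = max([ready_at.get(r, 0) for r in ins.srcs], default=0)
        if earliest > cycle:
            stalls += earliest - cycle
            cycle = earliest
        ready_at[ins.dest] = cycle + ins.latency
        cycle += 1
    return stalls


# Two independent load/add pairs; loads take 3 cycles, adds take 1 (assumed).
load_r1 = Instr("r1", (), 3)
add_r2 = Instr("r2", ("r1",), 1)
load_r3 = Instr("r3", (), 3)
add_r4 = Instr("r4", ("r3",), 1)

naive_order = [load_r1, add_r2, load_r3, add_r4]
scheduled_order = [load_r1, load_r3, add_r2, add_r4]

if __name__ == "__main__":
    print(count_stalls(naive_order), count_stalls(scheduled_order))
```

In this model the naive order stalls for four cycles and the reordered program for one, because the second load is moved into the shadow of the first.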
Since CISC machines perform complex actions with single instructions, whereas RISC
machines may require multiple instructions for the same action, code expansion can
be a problem for RISC.
In the nineties, people at Sun developed the Java language and also the Java Virtual
Machine, which contains a complete, stack-based instruction list. It can be seen as a
CISC instruction list, but compared to RISC it is the opposite in every detail [11]:
The JVM byte code architecture is:
There have been several research papers that covered the impact of system software
on processor and memory performance. John Ousterhout [10] evaluated several
hardware platforms (MIPS M2000, VAX 8800, Sun-3/75, and Sun-4/280) and several
operating systems (Ultrix, SunOS, RISC/os, and Sprite) using a set of
benchmarks. The study highlights issues for both hardware
designers and operating system developers to think about [10].
The term CISC, like its antonym RISC, has become less meaningful with the continued
evolution of both CISC and RISC designs and implementations. The first pipelined
"CISC" CPUs, such as 486s [19] from Intel, AMD, Cyrix, and IBM, certainly
supported every instruction that their predecessors did, but achieved high efficiency
only on a fairly simple x86 subset (resembling a non load/store "RISC" instruction
set).
In our study, we focused on the Intel family microprocessors, to illustrate the future
trend of CISC and RISC, which is CRISC. An example of CRISC is the Intel
Pentium-Pro, which is an interesting blend of the two architectures. It still executes
the CISC instruction set, but the internal implementation is a high performance "Post
RISC" CPU [12].
Table-3 shows some information about the Intel Family [7].
[Figure: Number of Transistors. Bar chart; y-axis: Number of Transistors (0 to 350,000,000); x-axis: Microprocessor (Intel processors from the 4004 through the Core 2 Duo).]
Table-4 shows additional information about the Intel Family [7].
[Figure: Speed Processor. Bar chart; y-axis: Speed (0 to 7,000,000); x-axis: Intel processors from the 4004 through the Core 2 Duo.]
the clock frequency for high power efficiency. Although it is clocked at a slower rate than
most of its competitors, its shorter stages and wider issue pipeline compensate
with a higher number of instructions per cycle (IPC). In addition, the Core
2 Duo processor has more ALU units [18].
The Intel Core 2 Duo processor uses the following instruction sets: x86, MMX, SSE,
SSE2, SSE3, SSSE3, and x86-64. The most recent instruction set that the Core 2 Duo
processor uses is Supplemental Streaming SIMD Extensions 3 (SSSE3), Intel's
name for the fourth iteration of the SSE instruction set. The previous state of the art was
SSE3, and Intel has added an S rather than incrementing the version number, as it
appears to consider SSSE3 merely a revision of SSE3. SSSE3 contains 16 new discrete
instructions over SSE3 [7].
The Core 2 Duo processor has 14 pipeline stages. The slightly deeper pipeline
enables increased clock speeds, and techniques such as memory disambiguation and
improved prefetch logic help offset any advantage an integrated memory
controller offers. The processor includes a shared 4 MB L2 cache and an L1 cache of 32
KB + 32 KB. Intel has chosen not to integrate a memory controller in its Core 2
processors, preferring to use the additional on-die transistors to increase the Level 2
cache to 4 MB; the chip has 291 million transistors [7]. The improvements implemented
in the Intel Core 2 chips make them more efficient processors, able to decode and process
more instructions per clock cycle.
The five main features of Intel Core 2 Duo contributing towards its high performance
are [18]:
• Intel’s Wide Dynamic Execution
• Intel’s Advanced Digital Media Boost
• Intel’s Intelligent Power Capability
• Intel’s Advanced Smart Cache
• Intel’s Smart Memory Access
Table-5 shows additional information about two different Intel Core processors
[7].
We believe that the debate about the two architectures, CISC and RISC, is no longer of
interest in today’s computer technology. Computers today, though they bear the brand
of RISC or CISC, are in fact not pure implementations of either one. They are truly
hybrid systems. RISC architects have adopted a larger set of instructions, and CISC
architects have realized the benefits of implementing a core set of instructions that can
execute in a single CPU cycle. From our observation, Intel's Core Duo processor has
an improved cache system. It has two high-speed caches (L1 and L2) in each
core (processor), which store the most recently used main memory locations. A cache
typically requires only one to two processor cycles to access data, compared with 10
to 200 cycles for a main memory access. We must note that bigger caches
improve the hit rate but reduce the access speed. Intel's Core Duo processor
uses a 14-stage pipeline to speed up execution, although data dependencies
remain one of the problems in processors using pipelining. Another fact about Intel's
Core Duo processor is the increased number of transistors (291 million), which
means closer circuits and, as a result, better performance.
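The cache trade-off mentioned above (a higher hit rate against a slower access) can be illustrated with the standard average memory access time relation, hit time plus miss rate times miss penalty. This is a generic textbook model rather than a measurement from this study; the cycle counts are taken from the ranges quoted above, while the miss rates are assumed values chosen only for illustration.

```python
def average_access_cycles(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time in cycles: hit_time + miss_rate * miss_penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


if __name__ == "__main__":
    # Smaller, faster cache: 2-cycle hit, 5% misses (assumed), 200-cycle memory access.
    small = average_access_cycles(hit_time=2, miss_rate=0.05, miss_penalty=200)
    # Bigger, slower cache: 3-cycle hit, 2% misses (assumed), same memory access cost.
    big = average_access_cycles(hit_time=3, miss_rate=0.02, miss_penalty=200)
    print(small, big)
```

Under these assumed numbers the bigger cache still wins (about 7 cycles against about 12), but the slower hit time would dominate if the miss rate improved only slightly.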
Most researchers who have worked in this field, focusing on the
hardware or software aspects of the CISC and RISC architectures, have in fact
chosen two old processors, one to represent a pure CISC and the other to
represent a pure RISC. This type of research does not have much impact on the design
of current chips. Moreover, current processors are not pure CISC or RISC
designs. Comparing pure designs will lead us to wrong conclusions, because
recent processors are CRISC (Complex-Reduce Instruction Set Computer), a
hybrid of CISC and RISC features. These types of processors have
focused on increasing the number of pipeline stages and the number of instruction
sets.
5. Conclusion
A new trend in CISC and RISC architectures is addressed. Some previous work
was highlighted, and a new technology is presented, Intel’s Core 2 Duo processor. For
the best performance and scalability of the Intel Core 2 Duo processor, the following
are important factors: (1) fast cache-to-cache communication, (2) large L2 or shared
capacity, (3) fast L2 access delay, and (4) fair resource (cache) sharing.
References:
[1] Yi Gao, Shilang Tang, Zhangli Ding, "Comparison between CISC and RISC",
2000.
[2] John L. Hennessy, David A. Patterson, "Computer Architecture A Quantitative
Approach", Third Edition, 2006.
[3] Margarita Esponda, Raúl Rojas, "The RISC Concept – A Survey of
Implementations", 1991.
[4] William Stallings, "Computer Architecture Designing for Performance",
"Seventh Edition", 2006.
[5] Jeff Prosise, "RISC vs. CISC: The Real Story", PC Magazine, 1995.
[6] Carlos Carvalho,"The Gap between Processor and Memory Speeds", 2002.
[7] Intel Group, "Intel Core 2 Duo Processor Specification",
http://www.intel.com/core2duo , 2007.
[8] Simon C.J. Garth, "Combining RISC and CISC in PC System", 1991.
[9] Dileep Bhandarkar, Douglas W. Clark, "Performance from Architecture
Comparing a RISC and a CISC with Similar Hardware Design", 1991.
[10] John Ousterhout, "Why aren't Operating Systems Getting Faster As Fast As
Hardware", 1989.
[11] Stefan Blixt, "Processors Designs for Embedded Systems", 2006.
[12] Mark Brehob, Travis Doom, Richard Enbody, "Beyond RISC – The Post RISC
Architecture", 1996.
[13] Marc Tremblay, "Challenges and Trends in Processor Design",1998.
[14] Jim Thomas, "Evaluation of Performance Instruction Set Architectures: RISC
versus CISC", 2003.
[15] Richard S. Piopho, William S. Wu , "A Comparison of RISC Architectures",
1989.
[16] Dileep Bhandarkar, "RISC versus CISC: A Tale of Two Chips", 1997.
[17] Carlos Jorge Lopes,"Improve Computing Performance, without any Hardware
Update", 2002.
[18] Tribuvan Kumar Prakash, "Performance Analysis of Intel Core 2 Duo
Processor",Thesis, 2007.
[19] Paul DeMone, "RISC vs CISC Still Matters", 2000.