Analysis of 16-Bit and 32-Bit RISC Processors

2021 7th International Conference on Advanced Computing & Communication Systems (ICACCS)

Animesh Kulshreshtha, Department of Electrical Engineering, Delhi Technological University, New Delhi, India, animesh.kulshreshthal@gmail.com
Anmol Moudgil, Department of Electrical Engineering, Delhi Technological University, New Delhi, India, anmolmoudgil@gmail.com
Abhishek Chaurasia, Department of Electrical Engineering, Delhi Technological University, New Delhi, India, abhishekchaurasia_2k17ee08@dtu.ac.in
Bharat Bhushan, Department of Electrical Engineering, Delhi Technological University, New Delhi, India, bharat@dce.ac.in

2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS) | 978-1-6654-0521-8/20/$31.00 ©2021 IEEE | DOI: 10.1109/ICACCS51430.2021.9441873

Abstract: The reduced instruction set computer (RISC) is a microprocessor that executes small, uniform instructions that each take about the same time. The objective is to reduce the complexity of instructions, which in turn reduces cost, cycle time and operating power. Though the 16-bit RISC has been around since the 1970s, it has not been up to the mark and has posed a significant number of technical barriers. This was the very reason for the development of 32-bit and 64-bit RISC processors and the concept of pipelining. In this paper, our objective is to study behavioral models of 16-bit and 32-bit RISC processors and their independent instruction sets. The 16-bit RISC processor is a non-pipelined, Harvard architecture-based CPU with separate data memory and instruction memory. The 32-bit RISC is a pipelined processor borrowing its implementation strategies from the MIPS architecture. The processors include GPRs (General Purpose Registers) and flag registers (Carry, Zero etc.). The model discussed simulates an optimized multiplier algorithm and tries to optimize the data path, since arithmetic and logical operations consume more power and incur high execution delay. The paper aims to draw a comparative study between the models based on their instruction sets and performance elements such as speedup and power dissipated. The individual models have been designed and simulated, and have finally been integrated in a top-level module via the Xilinx ISE Design Suite 14.7 along with power analysis.

Keywords: RISC (Reduced Instruction Set Computing), 16-Bit, 32-Bit, pipelining, Verilog, Xilinx, Instruction Set, delay, power analysis, operating frequency, system architecture

I. INTRODUCTION
As the name suggests, the RISC microprocessor has a limited number of instructions, which can be executed much faster since they are smaller and simpler. It requires far fewer transistors, which makes it cost effective in terms of both design and production. The idea behind the implementation is that RISC executes simplified instructions rather than complex ones, which would require a complex design. The RISC instruction set comprises simple, basic instructions, while a complex instruction can be executed via a combination of these basic instructions. Since the instructions complete in a single cycle, the processor can handle multiple instructions at the same time. The instructions are register based; hence, data transfer takes place from one register to another. The instruction register was programmed to accept a combination of binary fields, namely the opcode bits and the operands. There was an opcode for every type of instruction that could be implemented in the CPU under certain classifications. This made the instruction set of the RISC processor much easier to understand [1]. The 16-Bit RISC CPU underlines all the above points and expresses the simplicity of the instruction set architecture.

Unlike the 16-Bit RISC, the modern-day 32-Bit MIPS processor, which happens to be a commercial success, accommodates the concept of pipelining, because of which the processor improves on many fronts compared to its predecessor generations. Instructions are stored and executed stage by stage, so that successive instructions overlap during execution. The pipeline is divided into several stages, and all these stages are connected together via pipe-like structures (latches). Each pipeline segment comprises two parts, i.e., input registers followed by combinational circuits. The data passing through the pipeline is held in the registers, and the combinational logic circuits perform arithmetic and binary operations on it. The output from one stage is given as an input to the next one. Since the introduction of pipelining, throughput has increased through the reduction of CPI. Instructions effectively complete at a rate of one per clock cycle [2].

Our objective is to draw a comparative analysis between the two processors on the basis of their performance, physical quantities and the complexity of instructions that can be executed, which provides a detailed basis for choosing between the two processors according to the application requirement. Sections II and III discuss the architectures of the two processors, followed by experimental analysis in Section IV and a conclusion based on the results obtained in Section V.

II. SYSTEM ARCHITECTURE OF A 16-BIT RISC PROCESSOR
The Harvard architecture-based design of the 16-Bit RISC processor presented here incorporates 8 general purpose registers, a basic Arithmetic Logic Unit (ALU) for basic operations such as addition and shift operations, data memories and an instruction set of 14 instructions. It supports a load-store architecture where all operations are performed in the registers. Since it is a non-pipelined processor, it is much easier to understand and implement.
Fig 1. Architecture of 16-Bit Non-Pipelined RISC processor

Fig 1 gives a detailed view of the 16-Bit RISC processor. The architecture consists mainly of 5 logical blocks [3]: the Control Unit, Arithmetic Logic Unit (ALU), Execution Unit, Program Counter (PC) and Memory Unit. The register file holds the operands, which are extracted from the memory unit. On the appropriate signal from the control unit, the operation is executed in the ALU, the corresponding read/write operation is performed, and the program counter supplies the next instruction.

A. Program Counter
The objective of this 16-Bit latch is to fetch the address of the next instruction present in the memory. Every time an instruction is executed, the program counter is incremented and the next instruction is loaded for execution. Until the first instruction has been executed, the second instruction cannot be loaded. The PC is connected to the control unit, since most of the instructions that control the PC are present in the instruction set. At the same time, this complicates the hardware and causes more power dissipation in the circuit. Hence it is suitable for a load/store architecture [4].

B. Arithmetic and Logic Unit
The ALU performs most of the instructions in the instruction set. The ALU is controlled by the ALU_Op signal from the ALU Control unit. It implements arithmetic operations such as Add and Subtract, logical operations such as AND and NOT (invert) of binary numbers, and rotate operations, which come in handy during operations pertaining to signal processing such as correlation [5]. The number of registers/data elements used depends on the type of operation. While a basic addition uses 3 registers (2 source and 1 destination), add immediate uses only 2 registers (1 source and 1 destination) along with a binary data element that is available for computation directly in the opcode [6]. The effect of an operation is reflected in the output of the ALU and in the flags, which indicate any byproduct of the operation performed. Table I gives a detailed overview of the instructions executed in the ALU.

Table I. ALU Control Design

ALU Op   Opcode (Hex)   ALU_cnt   ALU Operation   Instruction
10       xxxx           000       LW/SW           Load/Store
00       0002           000       ADD             Addition
00       0003           001       SUB             Subtract
00       0004           010       INVERT          Invert
00       0005           011       LSL             Logical Shift Left
00       0006           100       LSR             Logical Shift Right
00       0007           101       AND             Logical And
00       0008           110       OR              Logical Or
00       0009           111       SLT             Set Less Than

C. Control Unit
The control unit acts as the brain of the processor and receives input from opcodes stored in the memory. Based on them, it enables/disables the signals which decide what operation is to be performed.

Referring to Fig 2, the instruction type format belongs to one of the following:
1. Memory Access
2. Data Processing
3. Branch
4. Jump

Fig 2. Instruction Format of RISC processor
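The ALU_cnt mapping of Table I can be sketched in a few lines of Python. This is an illustrative behavioral model only, not the authors' Verilog: the function name `alu`, the choice that INVERT acts on the first operand, that the shift amount comes from the second operand, and that SLT compares signed two's-complement values are all assumptions made for this example.

```python
MASK16 = 0xFFFF  # the 16-Bit processor works on 16-bit words


def to_signed16(value: int) -> int:
    """Interpret a 16-bit pattern as a two's-complement number."""
    value &= MASK16
    return value - 0x10000 if value & 0x8000 else value


def alu(alu_cnt: int, a: int, b: int) -> int:
    """Behavioral sketch of the ALU selected by the 3-bit ALU_cnt of Table I."""
    a &= MASK16
    b &= MASK16
    if alu_cnt == 0b000:      # ADD (also used for LW/SW address arithmetic)
        result = a + b
    elif alu_cnt == 0b001:    # SUB
        result = a - b
    elif alu_cnt == 0b010:    # INVERT (assumed to act on operand a)
        result = ~a
    elif alu_cnt == 0b011:    # LSL (assumed shift amount in operand b)
        result = a << b
    elif alu_cnt == 0b100:    # LSR
        result = a >> b
    elif alu_cnt == 0b101:    # AND
        result = a & b
    elif alu_cnt == 0b110:    # OR
        result = a | b
    elif alu_cnt == 0b111:    # SLT (assumed signed comparison)
        result = 1 if to_signed16(a) < to_signed16(b) else 0
    else:
        raise ValueError("ALU_cnt must be a 3-bit value")
    return result & MASK16    # keep only the 16-bit result


if __name__ == "__main__":
    print(alu(0b000, 7, 5))        # addition
    print(hex(alu(0b001, 5, 7)))   # subtraction wraps to 16 bits
```

Masking the result to 16 bits mirrors the fixed register width; the carry or borrow that is discarded here is what a flag register would capture.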


$$\frac{\text{time}}{\text{program}} = \frac{\text{time}}{\text{cycle}} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{instructions}}{\text{program}}$$

Fig. 3 Performance Equation of a Processor

Though the 16-Bit RISC processor is non-pipelined, it is able to execute simple instructions effectively and overcomes the shortcomings of the traditional CISC processor, where the number of instructions is minimized at the cost of CPI (cycles per instruction). RISC follows a different strategy by reducing the CPI at the cost of the number of instructions per program. This is depicted in Fig. 3. The next section discusses what the 32-Bit RISC processor brings in with increasing technological requirements.

III. SYSTEM ARCHITECTURE OF 32-BIT RISC PROCESSOR
The 32-Bit RISC processor is a 5-stage pipelined, Harvard architecture-based implementation of the MIPS processor, which has been a commercial success. A 32-bit machine works better than a 16-bit machine since, with addresses as wide as the data word, it can reach far more unique memory locations (2^32 rather than 2^16), and it can also access data in memory in wider "chunks". The advantage is that, all other things being equal, a 32-bit computer is functionally going to be slightly less than twice as fast as its 16-bit counterpart.

With data width increasing from 16 to 32 bits, new application areas such as graphics or manipulation of large data structures opened up, along with improved scope of working in domains such as convolution. Complex computations such as calculating the nth power of a number take a large amount of effort and time, and hence a 16-bit processor is not fit for this purpose [7].

Fig 4 describes how an operation is executed in the proposed model, from fetching the instruction from the memory through memory write back to obtaining the final result of the operation [8].

A. Pipelining
As the name suggests, pipelining allows instructions to be stored and executed stage by stage. The traditional CPU suffered from excessive delay because no other task was executed while one was being processed. This is a scenario similar to that of an operating system sitting idle while an I/O device is busy. This problem is solved by the principle of pipelining, which allows parallel execution, i.e., while opcode 1 has been fetched and is being executed, opcode 2 can be fetched and decoded, leading to multiple instructions running in parallel at the same time [9]. The processor receives the following advantages due to pipelining:
• The cycles per instruction (CPI) of the processor are reduced, while the speedup (in theory) increases by a factor equal to the number of pipeline stages.
• Pipelining also reduces the delay between completed instructions, which improves throughput. This is because multiple instructions are processed at every instant, so the average time taken per instruction reduces.
• Pipelining allows more complex instructions to be incorporated in the ALU, since the design is much faster than the previous ones. This also increases the range of applications where such a processor can be used, e.g., DSP (Digital Signal Processing) applications such as convolution.
• Pipelining enables CPUs to operate at a higher frequency than the RAM, due to which the overall performance of the computer increases substantially. This is because the combinational circuit in each stage is simplified, reducing the net time period (delay).
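The claim that the theoretical speedup approaches the number of pipeline stages can be checked with a short calculation. The sketch below uses the standard idealized model and is not taken from the paper: it assumes every instruction passes through all k stages, all stages take one clock cycle, and there are no stalls or hazards. The function names are hypothetical.

```python
def cycles_unpipelined(n_instructions: int, k_stages: int) -> int:
    """Each instruction occupies the whole datapath for k cycles."""
    return n_instructions * k_stages


def cycles_pipelined(n_instructions: int, k_stages: int) -> int:
    """The first instruction needs k cycles to fill the pipeline;
    every later instruction completes one cycle after its predecessor."""
    return k_stages + (n_instructions - 1)


def ideal_speedup(n_instructions: int, k_stages: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts (idealized, no stalls)."""
    return cycles_unpipelined(n_instructions, k_stages) / cycles_pipelined(
        n_instructions, k_stages
    )


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, round(ideal_speedup(n, 5), 3))
```

For a single instruction there is no gain, and as the instruction count grows the ratio tends toward k = 5, which is the "in theory" speedup quoted for a 5-stage pipeline. Real programs fall short of this because of branch and data hazards.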

Fig. 5 provides a detailed architecture of the 32-Bit MIPS processor which has been implemented. It incorporates a 5-stage pipeline, whose stages are as follows.
1. Instruction Fetch, where the instruction is retrieved from the memory and stored in a register. The instruction to be executed is determined via the PC, which contains its address.
2. Instruction Decode, where the opcode and operands are separated from each other, the opcode is matched against the instruction set available to the processor, and the register operands are read from the register file.
3. Execute, where the operands are made available to the ALU, the computation takes place, and the output of the computation is available as ALU_out.
4. Memory Access, where the output of the ALU is stored at a location in Data Memory.


5. Write Back, where the computed data is written back to the register, the location of which is available in the final opcode.

Fig 5. Architecture of 32-Bit Pipelined RISC Processor

The objective of achieving maximum processing speed compared to a similar non-pipelined processor is accomplished by utilizing the hardware to the maximum extent, which improves the speed even when more complicated instructions are being executed [10].

B. Modified Instruction Set
Due to the availability of more general-purpose registers and the increase in the length of the opcode, 6 bits are used for defining the type of instruction, which means that up to 64 instructions can be realized. The proposed MIPS 32-Bit architecture contains complex instructions such as multiplication and comparison instructions such as "contents of register equal to zero".

As presented in Fig. 6, there are 3 major types of instruction encoding present in this processor.
1. R type instruction encoding, where the ALU works only upon source and destination registers. The 5 bits ranging from 10 to 6 contain the shift amount (shamt) for shifting and rotating, while the last 6 bits, that is 5 to 0, contain the opcode extension (funct) for additional functions to be performed.
2. I type instruction encoding, where the ALU operations are performed on one source register and on immediate data (specified in the opcode), and the result is stored in the destination register. The last 16 bits are designated for the immediate data on which the operation is to be performed, as specified by the user.
3. J type instruction encoding, which is used for jump instructions that change the flow of execution in the processor and take the program sequence to the specified memory address [11]. Here the instruction is of 32 bits. The first 6 bits from the left contain the opcode of the operation to be performed, while the next 26 bits contain the immediate address to which the program sequence is to be transferred.

Fig 6. Instruction Format

Table II. Instruction Set

Opcode   ALU Operation   Instruction
000000   ADD             R Type: Add
000001   SUB             R Type: Subtract
000010   AND             R Type: AND
000011   OR              R Type: OR
000100   SLT             R Type: Set Less than
000101   MUL             R Type: Multiply
000110   XOR             R Type: XOR
000111   INVERT          R Type: Invert
001000   LW              I Type: Load
001001   SW              I Type: Store
001010   ADI             I Type: Add Immediate
001011   SUBI            I Type: Subtract Immediate
001100   SLTI            I Type: SLT Immediate
001101   BNEQZ           I Type: Content not equal to 0
001110   BEQZ            I Type: Content equal to 0
001111   BNE             I Type: Contents not equal
010000   BEQ             I Type: Contents are equal
010001   J               J Type: Jump to segment
011111   HLT             R Type: Halt
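The three encodings can be illustrated by packing fields into a 32-bit word. The sketch below is a hedged illustration, not the authors' implementation: the opcodes come from Table II, while the field names `rs`, `rt`, `rd` and the helper function names are assumptions borrowed from the usual MIPS convention. The field widths (6/5/5/5/5/6, 6/5/5/16 and 6/26) follow the encodings described in this section.

```python
# Opcodes taken from Table II; only the ones used below are listed.
OPCODES = {"ADD": 0b000000, "ADI": 0b001010, "J": 0b010001}


def encode_r(opcode: int, rs: int, rt: int, rd: int,
             shamt: int = 0, funct: int = 0) -> int:
    """R type: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)."""
    return ((opcode & 0x3F) << 26 | (rs & 0x1F) << 21 | (rt & 0x1F) << 16
            | (rd & 0x1F) << 11 | (shamt & 0x1F) << 6 | (funct & 0x3F))


def encode_i(opcode: int, rs: int, rt: int, imm: int) -> int:
    """I type: opcode(6) | source rs(5) | destination rt(5) | immediate(16)."""
    return ((opcode & 0x3F) << 26 | (rs & 0x1F) << 21
            | (rt & 0x1F) << 16 | (imm & 0xFFFF))


def encode_j(opcode: int, address: int) -> int:
    """J type: opcode(6) | target address(26)."""
    return (opcode & 0x3F) << 26 | (address & 0x3FFFFFF)


def opcode_of(word: int) -> int:
    """Extract the 6 most significant bits of a 32-bit instruction word."""
    return (word >> 26) & 0x3F


if __name__ == "__main__":
    adi = encode_i(OPCODES["ADI"], rs=0, rt=1, imm=10)   # R1 = R0 + 10
    add = encode_r(OPCODES["ADD"], rs=1, rt=2, rd=4)     # R4 = R1 + R2
    print(format(adi, "032b"))
    print(format(add, "032b"))
```

A 6-bit opcode field gives 2^6 = 64 distinct codes, of which Table II uses 19, which is why the instruction set leaves room for expansion.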


The proposed model incorporates a total of 19 instructions, which can be further expanded depending upon the utility. It is evident from Table II that there is scope for expansion in the proposed instruction set. Apart from the basic ones available in the instruction set, there are shift commands, which are used for multiplication/division of the contents of a register by 2 [12]. This is a better method than the traditional multiplication operation available in the instruction set when the data present in a register needs to be multiplied by powers of 2.

C. Flag Register
Flag registers are available in the 16-Bit RISC processors as well, but their reach and usage are limited due to simpler computations and a limited instruction set. When an instruction is executed, the output of the ALU produces a byproduct which can be evaluated through the changes in the flag register [13]. 5 bits are used out of 32 bits. These flags are shown in Fig. 7.

1. Parity Flag (PF)
The 1st bit of the flag register is occupied by PF, whose objective is to count the number of 1s in the result. If the number of 1s is even then PF is set (1), else it remains reset (0).

2. Zero Flag (ZF)
The 2nd bit of the flag register is occupied by ZF, whose objective is to determine whether the result of the ALU or the instruction (BEQZ, BNE etc.) is zero or not. If the result is 0, the ZF is set, else it remains reset.

3. Sign Flag (SF)
The 3rd bit of the flag register is occupied by SF, whose objective is to indicate the sign of a signed result. The MSB of the result is used for this purpose. If it is 1 then the number is negative, while if it is 0 then the number is positive.

4. Auxiliary Carry Flag (ACF)
The 4th bit of the flag register is occupied by ACF, which denotes the carry generated every time an arithmetic operation (addition, subtraction) takes place (except for the MSB). It stores the intermediate carry or borrow, is used for bit-by-bit computation, and keeps updating itself once used.

5. Carry Flag (CF)
The 5th bit of the flag register is occupied by CF, which stores the carry generated by the MSB or the borrow generated for the MSB. If there is a carry/borrow, the CF flag is set, else it is reset.

Bit    4    3    2    1    0
Flag   CF   ACF  SF   ZF   PF

Fig 7. Flag Register

The next section provides details about the experimental results, power analysis and various other factors which highlight the performance of one processor over the other.

IV. EXPERIMENTAL RESULTS AND POWER ANALYSIS
The performance comparison is best achieved by simulating a similar program on both processors and then judging them on various fronts. For this purpose, a simple addition program has been implemented first on the 16-Bit RISC processor and then on the 32-Bit RISC processor. The implementation has been done in Xilinx ISE Suite 14.7.

For the 16-Bit processor, since register 3 (R3) can contain garbage values, the subtract command is used in order to clear the contents of R3 ('0000110110100001') and transfer them to register 2 (R2). After this the number 7 is added ('1110100110000111') to the contents of R2 and the result is stored in R3. Similar operations of adding 5 (0101) and 13 (1101) are conducted, and the result is finally stored in R2. Computation takes place at every positive edge of the clock (clk).

The limitation of the simulation is that the computation can only use 8-Bit numbers (max), even though the result of the ALU can go up to 16 bits. This is evident in Fig. 8. The shortcoming is overcome by the usage of the 32-Bit RISC, as given in Fig. 9.

Fig 8. Simulation Result for 16-Bit RISC

In the case of the 32-Bit processor, both the ADI command and ADD are used. Since the registers are already in the clear state, register R1 is initialized with the value 10 using the ADI command ('001010 00000 00001 0000000000001010'). Similarly, R2 and R3 are initialized with 20 and 25 respectively.
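The five flags of Fig. 7 can be sketched for a 32-bit addition as follows. This is an illustrative model under stated assumptions rather than the paper's hardware: the text does not say which bit position drives the Auxiliary Carry Flag, so the sketch assumes the common convention of a carry out of the low 4 bits, and the function names are hypothetical.

```python
def flags_after_add(a: int, b: int, width: int = 32) -> dict:
    """Compute PF, ZF, SF, ACF and CF for a + b on `width`-bit operands."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    total = a + b            # may be one bit wider than the register
    result = total & mask    # value actually kept in the register
    return {
        "PF": int(bin(result).count("1") % 2 == 0),   # even number of 1s
        "ZF": int(result == 0),
        "SF": (result >> (width - 1)) & 1,            # MSB of the result
        "ACF": int((a & 0xF) + (b & 0xF) > 0xF),      # assumed: carry out of bit 3
        "CF": (total >> width) & 1,                   # carry out of the MSB
    }


def pack_flags(flags: dict) -> int:
    """Pack the flags into bits 0..4 in the order shown in Fig. 7."""
    order = ["PF", "ZF", "SF", "ACF", "CF"]           # bit 0 up to bit 4
    return sum(flags[name] << bit for bit, name in enumerate(order))


if __name__ == "__main__":
    flags = flags_after_add(10, 20)   # the values loaded into R1 and R2
    print(flags, format(pack_flags(flags), "05b"))
```

Adding 10 and 20 gives 30 (binary 11110), which has an even number of 1s, so only PF is set; adding 1 to 0xFFFFFFFF wraps to zero and sets PF, ZF, ACF and CF together.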


[Figure: simulation waveform of the pipeline registers (IF_ID, ID_EX, EX_MEM, MEM_WB) over a 1000 ns run. Final register values printed by the simulator: R0 = 0, R1 = 10, R2 = 20, R3 = 25, R4 = 30, R5 = 55.]

Fig 9. Simulation Result for 32-Bit RISC

Once all the registers are assigned their individual values, the contents of R1 and R2 are added and the result is stored in R4 ('000000 00001 00010 00100 00000 000000'). Similarly, the contents of R3 and R4 are added and stored in R5.

The 32-Bit RISC processor is also capable of complex calculations such as the factorial of a number, which is made practical by an inbuilt multiplier. The multiplier used here is a basic Wallace Tree multiplier [14], which is more effective than the traditional method of bit-by-bit multiplication. For instance, the factorial of 7 has been calculated and displayed by the ALU in Fig. 10. Functions such as load, multiply and jump statements have been used in order to implement a recursion-based factorial algorithm.

Fig 10. Factorial computation on 32-Bit processor

On the basis of the above computations, the power and timing analysis of the 2 RISC processors is done via Xilinx [15] [16]. Fig. 11 elaborates on the power analysis of the 16-Bit RISC processor, while Fig. 12 elaborates on the power analysis of the 32-Bit RISC processor.

On-Chip Power (16-Bit)
Dynamic:                  0.002 W (2%)
  Clocks:                 0.001 W (14%)
  Signals:                0.001 W (6%)
  Logic:                  0.001 W (5%)
  I/O:                    0.001 W (75%)
Device Static:            0.104 W (98%)
Total On-Chip Power:      0.106 W
Design Power Budget:      Not Specified
Power Budget Margin:      N/A
Junction Temperature:     26.2 °C
Thermal Margin:           58.8 °C (4.9 W)
Effective θJA:            11.5 °C/W
Power supplied to off-chip devices: 0 W
Confidence level:         Medium

Fig 11. Power Analysis of 16-Bit RISC Processor

On-Chip Power (32-Bit)
Dynamic:                  0.064 W (38%)
  Clocks:                 0.007 W (10%)
  Signals:                0.006 W (9%)
  Logic:                  0.004 W (6%)
  I/O:                    0.048 W (75%)
Device Static:            0.105 W (62%)
Total On-Chip Power:      0.169 W
Design Power Budget:      Not Specified
Power Budget Margin:      N/A
Junction Temperature:     27.0 °C
Thermal Margin:           58.0 °C (4.9 W)
Effective θJA:            11.5 °C/W
Power supplied to off-chip devices: 0 W
Confidence level:         Low

Fig 12. Power Analysis of 32-Bit RISC Processor
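The power figures reported in Fig. 11 and Fig. 12 can be compared with a few lines of arithmetic. The numbers below are copied from the two power reports; the dictionary layout and function name are illustrative choices made for this example, not part of the original tool output.

```python
# Power values in watts, as reported in Fig. 11 (16-Bit) and Fig. 12 (32-Bit).
POWER = {
    "16-bit": {"dynamic": 0.002, "static": 0.104, "total": 0.106},
    "32-bit": {"dynamic": 0.064, "static": 0.105, "total": 0.169},
}


def percent_increase(old: float, new: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    for name, p in POWER.items():
        # Sanity check: dynamic + static should reproduce the reported total.
        print(name, round(p["dynamic"] + p["static"], 3), p["total"])

    total_up = percent_increase(POWER["16-bit"]["total"], POWER["32-bit"]["total"])
    dyn_ratio = POWER["32-bit"]["dynamic"] / POWER["16-bit"]["dynamic"]
    print(f"total power increase: {total_up:.1f}%")
    print(f"dynamic power ratio: {dyn_ratio:.0f}x")
```

The static power is almost identical for both designs (0.104 W versus 0.105 W), so nearly all of the difference in total power comes from the dynamic component.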

Table III. Timing and Frequency Summary Comparison

                                   16-Bit Nonpipelined   32-Bit Pipelined RISC
Speedup                            1                     5
Maximum Operating Frequency        78.654 MHz            139.438 MHz
Maximum Combinational Delay        13.981 ns             7.028 ns

While the power analysis highlights the power consumption of each processor, the timing analysis and maximum operating frequency given in Table III highlight the extent to which the minimum time period and combinational delay are affected [17]. Even though the word length has been increased, the effect of pipelining is more prominent, hence reducing the net combinational delay.

The next section draws a detailed comparison between the two processors based on the experimental results and observations made.

V. CONCLUSION
In general, a 16-Bit processor costs less than a 32-Bit processor, owing to the fact that its internal data path is narrower, so fewer transistors are required to manufacture it, which leaves a reasonable amount of area available. This space can be used to accommodate additional features (chip memory, peripheral interfaces etc.).

The aforementioned point is evident from the observations made in the previous section, i.e., the total power consumption of the 32-Bit processor is about 60% more than that of the 16-Bit processor. Comparing the two figures, we can see that the dynamic power consumption of the 32-Bit processor is much higher than that of the 16-Bit processor. The reason for this phenomenon is the higher operational frequency. A higher frequency leads to a higher number of operations being performed per unit time, which in turn increases the dynamic power consumption. The 32-Bit processor is approximately 77% faster than the 16-Bit processor in terms of maximum operating frequency (139.438 MHz versus 78.654 MHz in Table III). These observations are expected, as the 32-Bit processor is capable of storing more computational values, and the pipelined architecture of the processor shortens each instruction cycle, thereby increasing the operating frequency and decreasing the combinational delay. In the case of arithmetic calculations, a 16-Bit processor can offer a decent processing speed in a small system at minimum cost, but for applications that require high efficiency, such as floating-point arithmetic, 32-Bit processors should be preferred. The 16-Bit processor loses a significant amount of time dealing with the stack element.

All the aforementioned points suggest that the 16-Bit RISC processor is now limited to situation-specific demand: while operating on 16 bits increases code density compared to a fixed 32-Bit format, the instruction set performance and ability to improvise make the 32-Bit processor a much better option in general.

VI. REFERENCES
[1] Sivarama P. Dandamudi, "A Guide To RISC Processor For Programmers And Engineers", Springer.
[2] G Mamun B, Shabiul I. and Sulaiman S, "A Single Clock Cycle MIPS RISC Processor Design using VHDL".
[3] Samiappa Sakthikumaran, S. Salivahanan, V. S. Kanchana Bhaaskaran, "16-Bit RISC processor design for convolution application", 2011 International Conference on Recent Trends in Information Technology (ICRTIT), 2011.
[4] Chandran Venkatesan, M. Thabsera Sulthana, M. G. Sumithra, M. Suriya, "Design of a 16-Bit Harvard Structure RISC Processor in Cadence 45nm Technology", 2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS), 2019.
[5] "Processor Design", Springer Science and Business Media LLC, 2007.
[6] Guang-Ming Tang, Pei-Yao Qu, Xiao-Chun Ye, Dong-Rui Fan, R. Nicole, "Logic Design of a 16-bit Bit-Slice Arithmetic Logic Unit for 32-/64-bit RSFQ Microprocessors", IEEE Transactions on Applied Superconductivity, vol. 28, issue no. 4 - 31, Jan. 2018.
[7] Chandran V, Ali K S, Gnanaprakash V, "Energy Efficient and High speed Rounding-Based Approximate Multiplier".
[8] J. B. Dennis, G. R. Gao, "An efficient pipelined dataflow processor architecture", IEEE Supercomputing '88: Proceedings of the 1988 ACM/IEEE Conference on Supercomputing, Vol. I, 6 Aug. 2002.
[9] Iro Pantazi-Mytarelli, "The history and use of pipelining computer architecture: MIPS pipelining implementation", 2013 IEEE Long Island Systems, Applications and Technology Conference (LISAT), 3 May 2013.
[10] Bai-ZhongYing, Computer Organization, Science Press, 2000.11.
[11] J. L. Hennessy, "VLSI Processor Architecture", IEEE Transaction on Computers, vol. c-33, no. 12, Dec. 1984.
[12] Rohit Sharma, Vivek Kumar Sehgal, Nitin Nitin1, Pranav Bhasker, Ishita Verma, "Design and Implementation of a 64-bit RISC Processor using VHDL", UKSim 2009: 11th International Conference on Computer Modelling and Simulation, 978-0-7695-3593-7/09, 2009 IEEE.
[13] S. P. Ritpurkar, M. N. Thakare, G. D. Korde, "Design and simulation of 32-Bit RISC architecture based on MIPS using VHDL", IEEE 2015 International Conference on Advanced Computing and Communication Systems, 12 Nov. 2015.
[14] N. Sureka, R. Porselvi and K. Kumuthapriya, "An Efficient High Speed Wallace Tree Multiplier", 2013 International Conference on Information Communication and Embedded Systems (ICICES).
[15] Vivado Design Suite User Guide: Synthesis (UG901).
[16] Vivado Design Suite User Guide: Implementation (UG904).
[17] Sivarama P. Dandamudi, "Processor Design Issues" in Guide to RISC Processors, Springer, pp. 13-36.
