Design and Verification of 16 Bit RISC Processor
Abstract— Reduced Instruction Set Computer (RISC) is a design that offers better performance and higher speed of operation and favors a smaller, simpler set of instructions. The 16-bit RISC processor designed in this paper is capable of executing a larger number of instructions with a simple design. It is written in the Verilog Hardware Description Language (HDL) and simulated in the Xilinx ISE 14.7 design suite. The main achievement of this work is that the multiplier unit in the Arithmetic and Logic Unit (ALU) and in the Multiplier and Accumulator (MAC) is implemented using Vedic Sutras. The main principle of Vedic mathematics is to reduce the typical calculations of conventional mathematics to very simple ones and hence reduce the overall computational complexity. In addition to these blocks, the designed RISC processor consists of other blocks such as the Control Unit and data path, Register Bank, Program Counter and Memory. The proposed RISC processor is very simple and capable of executing 14 instructions. The design achieves power savings of 44% for the MAC and 12% for the ALU compared with the conventional MAC and ALU, respectively. The delay is also reduced by 45% for the MAC and 35% for the ALU in comparison with the conventional MAC and ALU. These Vedic MAC and ALU units are then integrated with the other blocks of the processor to develop a 16-bit Vedic processor. This reduces the delay by 34% and saves around 88% power compared with a conventional processor. Hence improved speed of operation, reduced power utilization and lower area utilization are the key features of the designed RISC processor.

Keywords— Reduced Instruction Set Computer; Von-Neumann architecture; Verilog HDL; Vedic Mathematics; Urdhva-Tiryagbhyam Sutra.

I. INTRODUCTION
The Reduced Instruction Set Computer (RISC) is a microprocessor CPU design with highly optimized instructions and a smaller, more specialized instruction set than that often found in other architectures such as the Complex Instruction Set Computer (CISC). The main difference between CISC and RISC architectures is that a RISC processor is optimized with a large number of registers and instruction pipelining, which allows a low number of clock cycles per instruction. Another main feature of RISC is its LOAD/STORE architecture. In CISC the controller design is complex and its performance was not up to expectation. For that reason a typical RISC architecture has very few instructions, and the processor typically accesses data in memory only through Load and Store.

Vedic Mathematics is derived from ancient Indian scriptures and gives mathematical results in simple, understandable forms. The word Vedic comes from the word Veda, which means the storehouse of all knowledge. Vedic Mathematics is mainly based on 16 Sutras, which provide manipulations in arithmetic, logical math, geometry, etc.

In this paper, section 1 describes related work and the literature survey on RISC processors, section 2 describes the implementation and design methodology of the processor, section 3 describes the results observed in the system, and section 4 shows the verification results of the proposed system.

Vedic Mathematics offers a different method of calculation and a system of mathematics that rests on 16 Sutras. Using these procedures in processing algorithms improves execution time, area, power, complexity and so forth. The Vedic system proposes a unique and highly effective approach covering a wide range, beginning with elementary multiplication and ending with a relatively advanced topic, the solution of non-linear partial differential equations. The Vedic scheme, however, is not just a collection of quick techniques; it is a system, a unified approach.

S. Lad and V. S. Bendre [1] compared various types of multipliers and showed that Vedic multipliers are more efficient than tree and array multipliers. In [2], Vishwas V. Balpande, Abhishek B. Pande, Meeta J. Walke, Bhavna D. Choudhari and Kiran R. Bagade described the basic blocks of a RISC processor with RTL. SeungPyo Jung, JingzheXu, Donghoon Lee, Ju Sung Park, Kang-jookim and Koon-shik Cho [3] presented a design that uses the Harvard architecture and a 5-stage pipeline structure; the 16-bit RISC processor is designed and its verification is done in 3 stages. In [4], Abhyarthana Bisoyil, MituBaratl and Manoja Kumar presented a comparison of a 32-bit Vedic multiplier with a conventional binary multiplier. The binary multiplier is designed to find the product of two n-bit binary numbers and is then implemented on a Nexys 3, Spartan 6 FPGA board. After implementation, the 32-bit multiplier is compared with the conventional multiplier based on their summary reports. In [6], Mr. Nishant G. Deshpande and Prof. Rashmi Mahajan presented an Ancient Indian Vedic Mathematics based Multiplier Design for High Speed and Low Power Processor. That work describes multipliers with low power requirements and maximum speed based on the "Urdhva-Tiryagbhyam" algorithm of ancient Indian Vedic mathematics, which is used for multiplication to improve speed and area. In [8], Pravin S. Mane, Indra Gupta and M. K. Vasantha presented Implementation of RISC Processor on FPGA. In that paper, the simulation of a five-stage
Authorized licensed use limited to: UNIVERSITY OF CONNECTICUT. Downloaded on May 15,2021 at 06:26:50 UTC from IEEE Xplore. Restrictions apply.
pipelined 16-bit Microprocessor without Interlocked Pipeline Stages (MIPS), which is a RISC design, is presented.

II. PROPOSED METHODOLOGY

A. RISC Processor System:
This section outlines the RISC processor architecture and the working of the system, which uses synchronous execution of instructions and hence allows the processor to work effectively. Depending on the power or speed requirements of the organization, the architecture or design is chosen according to the various parameters of the application. The function of the processor is to execute every instruction efficiently as per the machine language.

Fig. 1. Block Diagram of Processor

Fig. 1 shows the block diagram of the processor. The blocks of the proposed system are as follows:

1) Arithmetic Logic Unit (ALU):
The ALU (Arithmetic and Logic Unit) is a combinational circuit. It is designed to perform various operations on numbers according to the instruction set. In the processor, the ALU inputs consist of an instruction (machine word), which contains the operation code (opcode), and some operands. The opcode tells the ALU which operation is to be performed, and the operands are then used in that operation.

2) Register Bank:
The register bank is a small set of data holding places. The ALU stores the result of an operation in the accumulator, from which it is later placed in a storage register, and the bits are checked to indicate whether the operation was performed successfully. If it was not, a status is shown in what is known as the Z-flag or status register. The function of the register bank is to let programs execute and operate efficiently on the data stored in memory.

3) Instruction Register (IR):
A processor has a set of instructions, each of which is a command to perform a task in the computer. The control unit holds the instruction to be executed. The CPU contains registers such as an address register, a data register and an instruction register. The function of the CPU is to fetch, decode and execute operations on memory according to the registers. The tasks of the IR include decoding the opcode, determining the instruction, determining which operands are in memory, retrieving the operands from memory, and then commanding the processor to execute the instruction. This is done with the help of the control unit, which generates the timing signals that control the various processing elements involved in executing the instruction.

4) Program Counter (PC):
The program counter points to the next instruction to be executed. In the complete instruction cycle, the instruction is loaded into the instruction register after the processor fetches it from the memory location pointed to by the program counter.

5) Control Unit:
The control unit is an essential part of any computer or system because this circuit generates the timing and control signals for the operations performed by the CPU. Here it handles the communication between the ALU and the main memory, as it controls the transmission of signals between the processor, memory and the various buses.

6) Memory Address Register (MAR):
The MAR is also called the address buffer. The address in the program counter is applied to memory through the MAR, so the PC can be incremented to the next address while the current instruction is read from its memory location. The MAR is loaded with a binary word that points to the location of a word in RAM; this location stores the instruction.

7) Multiplexer (MUX):
The multiplexer block works as an input selector. It is a circuit that takes multiple inputs and gives a single output, with control wires acting as its select lines.

8) Instruction Set Architecture:
The Instruction Set Architecture (ISA) provides the information needed to write a program in machine language. It also allows a high-level language program to be translated to machine language.

Fig. 2. Instruction Set Format

The figure above shows a 16-bit instruction frame in which the first 12 bits (bits 0-11) indicate an address. The next 3 bits (bits 12-14) specify the operation code (opcode). The remaining bit (bit 15) is the addressing mode bit I. The following table shows the instructions and corresponding codes considered while designing the instruction set in this work.

TABLE I. INSTRUCTION SET
Instruction          Instruction Code   Operation
Addition             0000               Out=a+b
Subtraction          0001               Out=a-b
Multiplication       0010               Out=a*b
Division             0011               Out= a/b
NOT                  0100               NOT R0, R0 = !R0
Read                 0101               RD FECH_ADDR R2 xxx; R2 = M[xxx]
Write                0110               WR R3 xxx, - M[xxx] = R3
Branch               0111               BR loop1
Branch-Zero (BRZ)    1000               BRZ Exit1
IOSTS                1011               IOSTS R1; R1 = IO_STU
Shift Left           1100               R0, R0 = R0<<1
Shift Right          1101               R0, R0 = R0>>1
ADRI                 1110               ADRI BYMEM; Addr = *p(raw_addr)
*Vedic Mult Sutra    1111               rawnum(*); -> value may change by main()

B. Vedic Mathematics Sutra:
All signal processing applications require multiplication as a basic operation. Studies show that multipliers require more silicon area and consume more power due to switching activity, and that they create the longest path through the adders and carry chain. This limits the maximum speed of the processor. To overcome the limitations of critical path and power in processors, Vedic multipliers are designed using various sutras for different multiplication algorithms. Vedic multipliers using different Vedic sutras have proved to be more efficient, reducing both critical path and dynamic power. The sutra used to design the multiplier in the proposed system is the "Urdhva-Tiryagbhyam" sutra. The Urdhva-Tiryagbhyam sutra is applied to multiplication and follows the vertically and crosswise pattern. A detailed study of multiplier design and a comparison of the various sutras using Vedic mathematics algorithms is given in [1].

III. SIMULATION RESULTS

A. MAC using Urdhva-Tiryagbhyam Sutra:
The following simulation result of the 16-bit multiplier is the output of the Vedic multiplier using the Urdhva-Tiryagbhyam sutra. It shows the execution of MAC operations performed with two inputs, a=252 and b=846, giving the output value c=213192.

As the number of bits increases, the delay and gate area increase very slowly compared with other multipliers. This makes the processor efficient in speed, power and time.

The following table shows the performance parameters obtained from the simulation of the MAC.

TABLE II. VALUES OBTAINED FOR MAC USING URDHVA-TIRYAGBHYAM SUTRA
Parameters              Urdhva Tiryagbhyam
                        This Work (MAC)    Existing MAC [9]
Number of Slice LUTs    363                1243
Number of IOB           64                 66
Delay (ns)              21.55              38.82
Power (mW)              48.5               86.91

B. ALU using Urdhva-Tiryagbhyam:
The 16-bit ALU schematic consists of circuit blocks built from reversible full adders designed using gates.

The design of the ALU requires four 8-bit multipliers and 2 adders. Here the simulation is based on the Urdhva-Tiryagbhyam sutra, which is used to simulate the basic operations of the Arithmetic Logic Unit. This technique uses fewer computational steps to design the multiplier and obtain its results.

Fig. 5. RTL Schematic of 16-Bit ALU
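The vertically-and-crosswise pattern of the Urdhva-Tiryagbhyam sutra, and the idea of building a 16-bit multiplier from four 8-bit multipliers, can be illustrated with a small behavioral model. The sketch below is written in Python rather than Verilog and is not the authors' RTL; the function names `urdhva_multiply`, `vedic_16x16_from_8x8` and `mac` are hypothetical, operands are assumed to be unsigned, and the way the partial products are added is chosen for clarity rather than taken from the paper's two-adder arrangement.

```python
def urdhva_multiply(a: int, b: int, width: int = 16) -> int:
    """Multiply two unsigned `width`-bit integers column by column.

    Column k collects every "crosswise" bit product a[i] & b[k - i];
    the column sum plus the incoming carry gives result bit k and the
    carry passed on to column k + 1.
    """
    limit = 1 << width
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must fit in `width` unsigned bits")

    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    result = 0
    carry = 0
    for k in range(2 * width - 1):
        column = carry
        # Only index pairs (i, k - i) that lie inside both operands.
        for i in range(max(0, k - width + 1), min(k, width - 1) + 1):
            column += a_bits[i] & b_bits[k - i]
        result |= (column & 1) << k
        carry = column >> 1
    # The last carry becomes the most significant result bit.
    result |= carry << (2 * width - 1)
    return result


def vedic_16x16_from_8x8(a: int, b: int) -> int:
    """Build a 16x16 product from four 8x8 products (illustrative split)."""
    a_lo, a_hi = a & 0xFF, (a >> 8) & 0xFF
    b_lo, b_hi = b & 0xFF, (b >> 8) & 0xFF
    lo_lo = urdhva_multiply(a_lo, b_lo, 8)  # vertical, low halves
    lo_hi = urdhva_multiply(a_lo, b_hi, 8)  # crosswise
    hi_lo = urdhva_multiply(a_hi, b_lo, 8)  # crosswise
    hi_hi = urdhva_multiply(a_hi, b_hi, 8)  # vertical, high halves
    return lo_lo + ((lo_hi + hi_lo) << 8) + (hi_hi << 16)


def mac(acc: int, a: int, b: int) -> int:
    """One multiply-accumulate step: acc + a * b."""
    return acc + urdhva_multiply(a, b)
```

With the inputs quoted for the MAC simulation, `urdhva_multiply(252, 846)` and `vedic_16x16_from_8x8(252, 846)` both evaluate to 213192. Because every column sum is independent of the others until the carries are combined, the columns can be formed in parallel in hardware.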
the instruction execution which controls the operations. The simulation figure below shows the control signal for instruction 0011, which performs the division operation, while the data undergoes a left shift operation.
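How an instruction code such as 0011 selects an operation can be shown with a minimal behavioral sketch. It uses only the arithmetic, NOT and shift codes listed in Table I. The function name `alu_execute`, the 16-bit wrap-around of every result (including the product), the use of integer division, and the reading of `!R0` as a bitwise complement are all assumptions made for illustration, not details confirmed by the paper.

```python
MASK16 = 0xFFFF  # assumed 16-bit result width


def alu_execute(code: int, a: int, b: int = 0) -> int:
    """Return the 16-bit result selected by an instruction code."""
    if code == 0b0000:        # Addition: Out = a + b
        result = a + b
    elif code == 0b0001:      # Subtraction: Out = a - b
        result = a - b
    elif code == 0b0010:      # Multiplication: Out = a * b
        result = a * b
    elif code == 0b0011:      # Division: Out = a / b (integer division assumed)
        if b == 0:
            raise ZeroDivisionError("division by zero")
        result = a // b
    elif code == 0b0100:      # NOT: R0 = !R0 (bitwise complement assumed)
        result = ~a
    elif code == 0b1100:      # Shift Left: R0 = R0 << 1
        result = a << 1
    elif code == 0b1101:      # Shift Right: R0 = R0 >> 1
        result = a >> 1
    else:
        raise ValueError(f"code {code:04b} is not modeled in this sketch")
    return result & MASK16    # keep the low 16 bits only
```

For example, `alu_execute(0b0011, 12, 4)` selects division and returns 3, and `alu_execute(0b1100, 0x8001)` shifts left and returns 0x0002 once the result is truncated to 16 bits.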
Fig. 8. Subtraction Operation

F. Processing Unit:
The processor design consists of basic blocks for implementing the hardware. The complete processing unit consists of the Storage Register File, Instruction Register, Control Signal and various multiplexer select lines. A processor is limited in speed, power, delay, etc., so the Vedic multiplier is integrated into the processor design to overcome these limitations.
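The contents of the instruction register can be split into fields following the 16-bit frame described with Fig. 2: bits 0-11 for the address, bits 12-14 for the opcode, and bit 15 for the addressing mode bit I. The sketch below is a minimal illustration of that layout only. The names `DecodedInstruction`, `decode` and `encode` are hypothetical, and because Table I lists 4-bit instruction codes while the frame description gives a 3-bit opcode field, no mapping between the two is assumed here.

```python
from typing import NamedTuple


class DecodedInstruction(NamedTuple):
    mode_bit: int  # bit 15, addressing mode bit I
    opcode: int    # bits 12-14
    address: int   # bits 0-11


def decode(word: int) -> DecodedInstruction:
    """Split a 16-bit instruction word into its three fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction word must be 16 bits")
    return DecodedInstruction(
        mode_bit=(word >> 15) & 0x1,
        opcode=(word >> 12) & 0x7,
        address=word & 0xFFF,
    )


def encode(mode_bit: int, opcode: int, address: int) -> int:
    """Pack the three fields back into a 16-bit instruction word."""
    if mode_bit not in (0, 1) or not 0 <= opcode <= 0x7 or not 0 <= address <= 0xFFF:
        raise ValueError("field out of range")
    return (mode_bit << 15) | (opcode << 12) | address
```

For instance, `decode(0xB123)` yields mode bit 1, opcode 3 and address 0x123, and `encode(1, 3, 0x123)` packs the same fields back into 0xB123.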
Fig. 17. Verification of 16 bit Vedic Multiplier.
International Conference on Advanced Communications, Control and Computing Technologies, Ramanathapuram, 2014, pp. 1757-1760, doi: 10.1109/ICACCCT.2014.7019410.
[6] N. G. Deshpande and R. Mahajan, "Ancient Indian Vedic Mathematics based Multiplier Design for High Speed and Low Power Processor," IJAREEIE, Pune, 2014.
[7] P. Jain and G. S. Virdi, "Multiplier-Accumulator (MAC) Unit," International Journal of Digital Application & Contemporary Research, vol. 5, no. 3, Oct. 2016.
[8] P. S. Mane, I. Gupta and M. K. Vasantha, "Implementation of RISC Processor on FPGA," 2006 IEEE International Conference on Industrial Technology, Mumbai, 2006, pp. 2096-2100, doi: 10.1109/ICIT.2006.372448.
[9] G. Ram, Y. Lakshmanna, D. Rani and B. Kandula, "Area efficient modified vedic multiplier," 2016, pp. 1-5, doi: 10.1109/ICCPCT.2016.7530294.