Design and Implementation of Synthesizable 32-Bit Four Stage Pipelined RISC Processor in FPGA Using Verilog/VHDL
Bikash Poduel¹, Prasanna Kansakar¹, Sujit R. Chhetri¹ and Shashidhar Ram Joshi²
¹Department of Electronics and Communication, Thapathali Campus, Institute of Engineering, Kathmandu, Nepal
²Department of Electronics and Communication, Institute of Engineering, Pulchowk Campus, Lalitpur
e-mail: [email protected]
Abstract
This paper describes the design and implementation of a high-performance, synthesizable 32-bit pipelined Reduced Instruction Set Computer (RISC) core. The design of the Harvard-architecture-based 32-bit RISC core involves the design of a 32-bit data-path unit, a control unit, a 32-bit instruction memory, a 32-bit data memory, and a register file in which each register is 32 bits wide. The processor is divided into Fetch, Decode, Execute and Write Back blocks in order to implement a four-stage pipeline. A 2×16 LCD is connected to the processor I/O block to show the instruction execution sequence for demonstration on the FPGA. The RISC core is designed using Verilog HDL and VHDL and is tested in the ISim simulator. The processor is implemented on a Spartan-3E Starter Board using Xilinx ISE 14.7. All of the instructions supported by the processor have been tested successfully, both in simulation and in hardware on the FPGA.
Key words: SoC, Harvard architecture, Xilinx ISE, IP Core, Register file.
Nepal Journal of Science and Technology Vol. 15, No.1 (2014) 81-88
jobs. This high-tech business has huge upside potential, both technically and economically, and thus creates a myriad of job opportunities. This paper marks the start of our team's effort to write and contribute IP cores, which in our case involves designing and verifying a processor core. We believe it can serve as a reference for students who want to learn how to design and verify a processor core, for teachers who want to teach a simple real-world implementation of a processor core on an FPGA, and for anyone who wants to contribute IP cores. The project gives a good insight into real high-tech work, namely designing and verifying cores such as processors, buses and peripherals, and so brings closer and into focus a horizon that was hazy because nobody had done this here before.

Methodology
Basic building block of RISC core
The HDL design of the RISC core involves the design of the following components, which are the major building blocks of most processors: memory unit, data path unit, and control unit.

In the first stage, Instruction Fetch, the address of the instruction to be fetched is loaded into the Program Counter (PC). The address is also saved in a pipeline register for internal use. The instruction pointed to by the PC is fetched from the Instruction Memory into the Instruction Register (IR). The second stage is the Instruction Decode stage, where the Instruction Decoder decodes the fetched instruction. Here, the decoder extracts the source operands, the destination operand, and the operation to be performed from the 32-bit instruction. Next is the Execution stage, where the operation on the source operands is performed and the result is placed in the internal pipeline register. The corresponding flags (Negative, Carry, Zero and Overflow) are set according to the result of the arithmetic or logical operation performed. For load and store instructions, the memory location to be read from or written to is calculated. The final stage is the Write Back or Memory Read/Write stage. For arithmetic and logical operations, the result of the ALU operation is written to the destination register; for load/store instructions, the data memory is read or written.
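As an illustration of the flag update described for the Execution stage, the following Python sketch models a 32-bit addition and derives the Negative, Carry, Zero and Overflow flags from it. It is a behavioural model only, not the HDL of the core; the function name `add32` and the dictionary layout are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF  # keeps values within 32 bits


def add32(a, b):
    """Add two 32-bit values; return (result, flags) with N, C, Z, V flags."""
    a &= MASK32
    b &= MASK32
    full = a + b               # may need 33 bits
    result = full & MASK32     # 32-bit value that would be written back
    flags = {
        "N": bool(result >> 31),   # sign bit of the result
        "C": full > MASK32,        # carry out of bit 31
        "Z": result == 0,          # every result bit is zero
        # signed overflow: both operands differ in sign from the result
        "V": bool(((a ^ result) & (b ^ result)) >> 31),
    }
    return result, flags
```

For instance, adding 1 to 0x7FFFFFFF sets N and V (the signed result wraps to negative), while adding 1 to 0xFFFFFFFF sets C and Z.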
Bikash Poduel et al./Design and Implementation of.......
Fig.1. A four stage pipeline showing the stages and the respective operations
Pipeline hazards: Conditions that may force the execution of an instruction in the pipeline to be delayed are referred to as pipeline hazards (Jiang 2006). There are three main types of pipeline hazard: structural hazards, data hazards, and control hazards.

Structural hazard refers to the situation in which two instructions must use a common resource, thereby resulting in a resource conflict. For example, a common problem of this form occurs when there is only one memory device for both instructions and data and a load or store instruction is encountered. Since an instruction must be fetched during each clock cycle, a memory access conflict results whenever an instruction already in the pipeline also requires access to memory.
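The memory conflict described for the structural hazard can be counted with a small Python sketch. It assumes an idealised four-stage pipeline with no stalls, in which a load or store accesses data memory in the fourth stage, three cycles after its own fetch. The mnemonics `LOAD` and `STORE` and the function name are hypothetical.

```python
def memory_conflicts(program, unified_memory):
    """Count cycles in which a fetch and a data access need the same memory."""
    if not unified_memory:
        return 0  # separate instruction and data memories never collide
    conflicts = 0
    for i, op in enumerate(program):
        data_cycle = i + 3  # the load/store reaches stage 4 in this cycle
        # instruction number `data_cycle` (if any) is fetched in that cycle
        if op in ("LOAD", "STORE") and data_cycle < len(program):
            conflicts += 1
    return conflicts


program = ["LOAD", "ADD", "ADD", "ADD", "STORE"]
print(memory_conflicts(program, unified_memory=True))   # single shared memory
print(memory_conflicts(program, unified_memory=False))  # separate memories
```

With a single shared memory the `LOAD` collides with the fetch of the fourth instruction; with separate instruction and data memories the count is always zero.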
Table 2. Four-instruction execution with four-stage pipelining takes 7 clock cycles (less than half of what it takes without pipelining)
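The cycle count in the Table 2 caption matches the usual relation for an ideal pipeline: n instructions on a k-stage pipeline need k + (n - 1) cycles, against n × k cycles when instructions do not overlap. A minimal Python sketch, assuming no stalls or branches; the function names and the stage abbreviations IF, ID, EX and WB are chosen here for illustration.

```python
def pipelined_cycles(n_instructions, n_stages=4):
    """Cycles for an ideal pipeline: fill once, then one instruction per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def unpipelined_cycles(n_instructions, n_stages=4):
    """Cycles when each instruction finishes all stages before the next starts."""
    return n_instructions * n_stages


def schedule(n_instructions, stages=("IF", "ID", "EX", "WB")):
    """One row per instruction, one column per clock cycle."""
    rows = []
    for i in range(n_instructions):
        idle_after = n_instructions - 1 - i
        rows.append(["--"] * i + list(stages) + ["--"] * idle_after)
    return rows


for row in schedule(4):
    print(" ".join(row))
print(pipelined_cycles(4), "cycles pipelined,", unpipelined_cycles(4), "without")
```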
Control hazard occurs when an unconditional or conditional branch instruction is encountered. In the absence of branch instructions, all instructions can typically be fetched in strict sequential order. If a fixed-length instruction format is used, a sequence of instructions can then be pre-fetched before the first instruction of the sequence has completed execution. However, if the first instruction happens to be a branch instruction, all the subsequent instructions fetched in this manner will have been fetched incorrectly.

Pipeline hazard solution adopted: In order to remove the structural hazard, separate memories are used for data and instructions. Thus, there are two sets of address and data lines in the processor. No mechanism is used to remove the data hazard; it is left to the assembly programmer to schedule instructions in the assembly code so that the data hazard does not arise. For the control hazard, whenever a branch instruction is encountered and the branch is taken, the following instructions currently in the pipeline are marked and subsequent instructions are fetched starting from the branch-target address. Whenever a marked instruction reaches the execution stage, it is nullified by not storing its execution result.

Instruction set: The RISC core has 32-bit instructions, most of which are register-based operations. Since registers are fast memory, the RISC core mostly uses the register-based addressing mode. However, there is support for immediate operands and a load/store mechanism for memory access. The instructions are divided into four categories: Data Processing Instructions, Load/Store Instructions, Interrupt Instructions, and Branch Instructions.
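The marking scheme adopted for the control hazard can be sketched in Python as follows. The model makes several assumptions that are not taken from the paper: a taken branch is resolved in the execute step, write-back is folded into that step for brevity, and the two instruction forms `("SET", reg, value)` and `("JUMP", target_index)` are hypothetical stand-ins for the real instruction set.

```python
def run(program, max_cycles=100):
    """Simulate fetch/decode/execute slots; return registers after draining."""
    regs = {}
    pc = 0
    fetch = decode = None  # each occupied slot is [instruction, marked]
    for _ in range(max_cycles):
        execute = decode   # advance every instruction by one stage
        decode = fetch
        fetch = [program[pc], False] if pc < len(program) else None
        pc += 1
        if execute is None:
            if fetch is None and decode is None:
                break      # pipeline is empty: the program has drained
            continue
        instr, marked = execute
        if marked:
            continue       # nullified: the result is not stored
        if instr[0] == "SET":
            regs[instr[1]] = instr[2]
        elif instr[0] == "JUMP":
            for slot in (fetch, decode):
                if slot is not None:
                    slot[1] = True  # mark instructions already in the pipeline
            pc = instr[1]           # next fetch starts at the branch target
    return regs


program = [("SET", "r1", 1), ("JUMP", 4), ("SET", "r2", 2),
           ("SET", "r3", 3), ("SET", "r4", 4)]
print(run(program))
```

In this run the two instructions that follow the jump are fetched but never write their results, so only `r1` and `r4` end up holding values.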
Load/store instruction
Table 4. List of memory access instructions and their respective address fields
Branch instruction
Table 5. List of branch instructions and their respective address fields
Interrupt instruction
Table 6. List of interrupt instructions and their respective address fields
Programmer model of the RISC core
The RISC core executes 32-bit instructions. The registers are all 32 bits wide, and the ALU is 32-bit. The programmer model for this RISC core is shown in Table 7. There are 16 visible registers (R0-R15), each 32 bits wide, and four status registers.
Table 7. Programmer's model for the RISC core
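The programmer-visible state can be pictured with a small Python sketch. The class and method names are hypothetical, and the four status entries are assumed here to be the N, C, Z and V flags mentioned for the Execution stage; the text does not name the status registers.

```python
class ProgrammerModel:
    """Sixteen 32-bit registers R0-R15 plus four status bits (assumed N, C, Z, V)."""

    WIDTH_MASK = 0xFFFFFFFF

    def __init__(self):
        self.regs = [0] * 16
        self.status = {"N": False, "C": False, "Z": False, "V": False}

    def _check(self, index):
        if not 0 <= index < 16:
            raise IndexError(f"no such register: R{index}")

    def write(self, index, value):
        """Store a value in R<index>, truncated to 32 bits."""
        self._check(index)
        self.regs[index] = value & self.WIDTH_MASK

    def read(self, index):
        """Return the 32-bit value held in R<index>."""
        self._check(index)
        return self.regs[index]


model = ProgrammerModel()
model.write(15, 0x123456789)   # a 33-bit value is truncated to 32 bits
print(hex(model.read(15)))
```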
The simulation and results of this processor verify all of the instructions incorporated in the RISC core. The RISC core can be used to develop a 32-bit microcontroller by simply adding peripherals and a bus. Since this core is synthesizable and reconfigurable, one can upgrade it by increasing the memory of the processor, by moving up to a 5-stage pipeline, by adding data forwarding to remove the data hazard, or by adding a branch prediction block to remove the control hazard.

Acknowledgements
We take this opportunity to express our profound gratitude and deep regards to Professor Dr. Shashidhar Ram Joshi for his exemplary guidance, monitoring and constant encouragement throughout the course of this research project. The support, help and guidance he gave us from time to time shall carry us a long way in the journey of life on which we are about to embark. We

References
Jiang, H. 2006. Pipeline: Hazards. cse.unl.edu/~jiang/cse430/Lecture%20Notes/Pipeline_Hazards.ppt.
Iannucci, R. A. 1988. Towards a Dataflow / von Neumann Hybrid Architecture. In: Proceedings of the 15th Annual International Symposium on Computer Architecture (May 1988), IEEE Computer Society Press, Los Alamitos, CA, USA, pp. 131-140.
Kilts, S. 2007. Advanced FPGA Design: Architecture, Implementation, Optimization. Wiley-IEEE Press, New Delhi, India. pp. 170-265.
Kim, J. and S. Leibson. 2005. Configurable processors: A new era in chip design. Computer 38:51-59.
Stallings, W. 2011. Computer organization and architecture, designing for performance. Dorling Kindersley India Pvt. Ltd., New Delhi, India. pp. 498-536.
Wikipedia. 2011. Semiconductor intellectual property core. http://en.wikipedia.org/wiki/Semiconductor_intellectual_property_core.
Xilinx. 2000. FPGA vs ASIC. http://www.xilinx.com/fpga/asic.htm