Assignment Report On Cordic Algorithm Implementation Using Verilog

The document summarizes the implementation of a CORDIC algorithm using Verilog for VLSI design. It describes a parallel and serial architecture for the CORDIC algorithm. The parallel architecture supports both rotation and vector modes using addition/subtraction units across stages. The serial architecture similarly performs rotations in a single stage. The document also outlines the design parameters, verification strategy using simulation tools, and functional checklist for testing the CORDIC implementation.

Assignment Report on CORDIC Algorithm Implementation Using Verilog

MOS VLSI Design (ELL734)

Saahil Kr Nakami 2022EEY7581
Yash Juyal 2022EEY7582
Madhav 2023EEY7503
Ayanabho Banerjee 2023EEY7505

Under the guidance of
Prof. Kaushik Saha

Department of Electrical Engineering
Indian Institute of Technology Delhi
2023-24
MOS VLSI ASSIGNMENT

1 Introduction

1.1 Referenced Documents

[1] J. Duprat and J. M. Muller, "The CORDIC algorithm: new results for fast VLSI implementation," IEEE Transactions on Computers, vol. 42, no. 2, pp. 168-178, Feb. 1993, doi: 10.1109/12.204786.

[2] "Implementation of fast angle calculation and rotation using online CORDIC," in Proc. ISCAS'88, pp. 2703-2706.

[3] D. S. Phatak, "Double step branching CORDIC: a new algorithm for fast sine and cosine generation," IEEE Transactions on Computers, vol. 47, no. 5, pp. 587-602, May 1998, doi: 10.1109/12.677251.

[4] J. E. Volder, "The CORDIC Trigonometric Computing Technique," IRE Trans. Electronic Computers, vol. 8, pp. 330-334, Sept. 1959.

1.2 Design Library Name

/afs/iitd.ac.in/user/e/ee/eey237505/Synthesis_65LP/cordic_final_1/pd

1.3 People Involved in the Design

• Saahil Kr Nakami
• Yash Juyal
• Ayanabho Banerjee
• Madhav

2 Function

2.1 Brief Overview

• Specifications

Sl. No.   Parameter             Value
1         Clock frequency       100 MHz
2         Input delay           0.1 ns
3         Output delay          0.1 ns
4         Clock latency         0.1 ns
5         Clock uncertainty     0.1 ns
6         Maximum capacitance   1000 fF
7         Maximum fan-out       150

Figure 2.1: Specifications

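• Inputs and Outputs

The design takes one angle input and produces sine and cosine outputs with 8-bit precision. In the Verilog listing in Section 9, the angle port is 16 bits wide (the two MSBs select the quadrant) and each output is a 9-bit two's complement value (one sign bit plus 8 magnitude bits).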

• CORDIC Algorithm

1. Introduction

The Coordinate Rotation Digital Computer (CORDIC) algorithm, conceived by Jack E. Volder in 1956 and published in 1959 [4], computes arithmetic, trigonometric, and hyperbolic functions using only shifts, additions, and table look-ups. It is widely used in DSP, image processing, communication, and industrial systems because it avoids hardware multipliers. This section reviews the underlying theory and the practical implementation of the CORDIC algorithm.

2. Theoretical Framework

The CORDIC algorithm performs vector rotations through a series of iterative and
incremental steps. Its primary application involves trigonometric calculations, particularly
rotation and transformation operations. CORDIC operates using fixed-point arithmetic,
making it efficient for hardware implementations.

Here's a detailed breakdown of the rotation method used in the CORDIC algorithm:

a) Initial Setup:

Input Vector: Begin with an initial 2D vector (x, y) that represents the magnitude and direction of the vector to be rotated. Figure 2.2 shows three points (x, y), (x1, y1) and (x2, y2) on a circular path in the x-y coordinate system; all three lie at the same distance r from the origin.

Target Rotation Angle: Define the desired angle of rotation (θ) that you want to achieve.

Figure 2.2: Rotation on circular path

The objective is to rotate the point (x, y) anticlockwise towards the point (x2, y2).

A point (x, y) at angle θ on the circle satisfies

\[ x = r\cos\theta, \qquad y = r\sin\theta \]

and rotating the point (x1, y1) anticlockwise by ϕ1 to reach (x2, y2) gives

\[ x_2 = \cos\phi_1\,(x_1 - y_1\tan\phi_1), \qquad y_2 = \cos\phi_1\,(y_1 + x_1\tan\phi_1) \]
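The factored rotation equations above are what make a shift-and-add implementation possible. As a short worked step (standard CORDIC reasoning, written with the iteration index i and direction d_i that the rotation-mode equations use), restricting each elementary angle so that its tangent is a signed power of two turns the multiplication by the tangent into a right shift:

```latex
\begin{aligned}
\tan\phi_i &= d_i\,2^{-i}, \qquad d_i \in \{-1, +1\} \\
x_{i+1} &= \cos\phi_i\,\bigl(x_i - d_i\,y_i\,2^{-i}\bigr) \\
y_{i+1} &= \cos\phi_i\,\bigl(y_i + d_i\,x_i\,2^{-i}\bigr) \\
\cos\phi_i &= \frac{1}{\sqrt{1 + 2^{-2i}}}
\end{aligned}
```

Because the cosine factor does not depend on d_i, it can be left out of every iteration and corrected once at the end. The product of the omitted factors over n iterations is the constant gain A_n that appears in the rotation-mode result.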

b) Rotation mode:

In rotation mode, the angle accumulator is initialized with the desired rotation angle. The
rotation decision at each iteration is made to diminish the magnitude of the residual angle
in the angle accumulator. The decision at each iteration is therefore based on the sign of
the residual angle after each step. Naturally, if the input angle is already expressed in the
binary arctangent base, the angle accumulator may be eliminated. For rotation mode, the
CORDIC equations are:

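\[ x_{i+1} = x_i - y_i\,d_i\,2^{-i} \]

\[ y_{i+1} = y_i + x_i\,d_i\,2^{-i} \]

\[ z_{i+1} = z_i - d_i\tan^{-1}(2^{-i}) \]

where

\[ d_i = -1 \text{ if } z_i < 0, \quad +1 \text{ otherwise} \]

which provides the following result

\[ x_n = A_n\,[x_0\cos z_0 - y_0\sin z_0] \]

\[ y_n = A_n\,[y_0\cos z_0 + x_0\sin z_0] \]

\[ z_n = 0 \]

\[ A_n = \prod_{i=0}^{n-1}\sqrt{1 + 2^{-2i}} \]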
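The rotation-mode iteration can be checked numerically with a small floating-point model. The sketch below is a Python reference model and not part of the Verilog design; the function name `cordic_rotate` and the default of 13 iterations are assumptions chosen for illustration.

```python
import math

def cordic_rotate(x0, y0, z0, n=13):
    """Rotation-mode CORDIC in floating point; returns (x_n, y_n, z_n, A_n)."""
    x, y, z = x0, y0, z0
    gain = 1.0
    for i in range(n):
        d = -1.0 if z < 0 else 1.0                 # d_i = -1 if z_i < 0, +1 otherwise
        # Both updates use the old x and y (tuple assignment evaluates the right side first).
        x, y = x - d * y * 2.0**-i, y + d * x * 2.0**-i
        z -= d * math.atan(2.0**-i)                # drive the residual angle towards zero
        gain *= math.sqrt(1.0 + 2.0**(-2 * i))     # accumulate A_n
    return x, y, z, gain

# The gain does not depend on the input angle, so it can be computed once.
_, _, _, a_n = cordic_rotate(1.0, 0.0, 0.0)

# Pre-scaling x0 by 1/A_n makes the outputs approximate cos(z0) and sin(z0).
c, s, residual, _ = cordic_rotate(1.0 / a_n, 0.0, math.radians(30))
```

With x0 pre-scaled in this way, `c` and `s` approximate cos(30°) and sin(30°), and the remaining error is bounded by the last elementary angle, atan(2^-(n-1)).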

3. Vectoring Mode

In vectoring mode, the input vector is rotated until one coordinate (conventionally y) is driven to zero. The angle accumulator then holds the angle of the original vector, and the remaining coordinate holds its scaled magnitude. This mode is used mainly for computing magnitudes (absolute values) and angles, and its direction decision and result equations differ from those of rotation mode.
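As an illustration of vectoring mode, the sketch below reuses the rotation-mode update equations but chooses the direction from the sign of y instead of the sign of z (d_i = +1 if y_i < 0, -1 otherwise). It is a floating-point Python model with hypothetical names, it assumes x0 > 0, and it is not taken from the Verilog design.

```python
import math

def cordic_vector(x0, y0, z0=0.0, n=13):
    """Vectoring-mode CORDIC; returns (magnitude, residual y, accumulated angle)."""
    x, y, z = x0, y0, z0
    gain = 1.0
    for i in range(n):
        d = 1.0 if y < 0 else -1.0                 # rotate so that y moves towards zero
        x, y = x - d * y * 2.0**-i, y + d * x * 2.0**-i
        z -= d * math.atan(2.0**-i)                # accumulate the angle rotated through
        gain *= math.sqrt(1.0 + 2.0**(-2 * i))
    return x / gain, y, z                          # remove the gain A_n from the magnitude

magnitude, y_residual, angle = cordic_vector(3.0, 4.0)
```

For the input (3, 4), `magnitude` approaches 5 and `angle` approaches atan(4/3), while `y_residual` shrinks towards zero as the number of iterations grows.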

4. Hyperbolic Mode

CORDIC's versatility extends to hyperbolic mode, where coordinates are rotated along a
hyperbola. This mode introduces additional functions and equations, broadening the
algorithm's applicability.
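For reference, the commonly used hyperbolic variant replaces atan with atanh, starts at i = 1, and repeats certain iterations (4, 13, ...) to guarantee convergence. The sketch below is a floating-point Python model under those assumptions; the helper names are hypothetical and the report's Verilog design does not implement this mode.

```python
import math

def hyperbolic_indices(n):
    """First n iteration indices: 1, 2, 3, 4, 4, 5, ... with 4 and 13 repeated."""
    seq = []
    i = 1
    while len(seq) < n:
        seq.append(i)
        if i in (4, 13) and len(seq) < n:
            seq.append(i)                          # repeated iteration
        i += 1
    return seq

def cordic_hyperbolic(t, n=16):
    """Approximates (cosh t, sinh t) for |t| up to about 1.1."""
    idx = hyperbolic_indices(n)
    gain = 1.0
    for i in idx:
        gain *= math.sqrt(1.0 - 2.0**(-2 * i))     # hyperbolic gain is below 1
    x, y, z = 1.0 / gain, 0.0, t                   # pre-scale so no final correction is needed
    for i in idx:
        d = -1.0 if z < 0 else 1.0
        x, y = x + d * y * 2.0**-i, y + d * x * 2.0**-i
        z -= d * math.atanh(2.0**-i)
    return x, y

cosh_half, sinh_half = cordic_hyperbolic(0.5)
```

The only structural changes from the circular case are the sign of the x update, the atanh table, and the repeated indices, which is why the same add/shift datapath can serve both modes.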

5. Implementation of CORDIC

CORDIC implementation can adopt either a parallel or serial architecture. The parallel
approach supports both rotation and vector modes, proving beneficial for hardware
implementations. On the other hand, the serial architecture is simpler and often employed
in software implementations.

6. Real-world Applications

CORDIC finds diverse applications in calculators, complex system implementations, generating transform domains (e.g., FFT, Wavelet), QR decomposition, and DSP applications for computing sine and cosine. Its efficiency and adaptability make it a useful computational tool across these domains, offering a compromise between accuracy and hardware cost.

3 Implementation of CORDIC
There are two main types of architecture, parallel and serial. The implementation of both architectures is described below.

3.1 Parallel architecture


This architecture supports both rotation and vector mode. Each iteration corresponds to one stage, and each stage has three add/sub units that add or subtract depending on the value of the signal σ, which is computed by the ph1 block as shown in the figure. When σ is 1, subtraction is performed. The RSH block shifts data to the right. The scale block divides the output of the last stage by kn. The control signal for the add/sub units at the inversion stage is given by the ph2 block.

Figure 3.1: Parallel architecture of CORDIC
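The structure of one stage can be sketched as a small integer model. The Python code below is an illustrative model only: the stage count, the 2^16-per-turn angle scaling, and the way σ is mapped onto the three add/sub units are assumptions, and pipeline registers between stages are omitted.

```python
import math

N_STAGES = 12        # assumption: number of unrolled stages
ANGLE_BITS = 16      # assumption: 2**16 angle units per full turn

# Per-stage constants atan(2^-i), expressed in angle units.
ATAN = [round(math.atan(2.0**-i) / (2 * math.pi) * 2**ANGLE_BITS)
        for i in range(N_STAGES)]
# CORDIC gain removed by the scale block.
GAIN = math.prod(math.sqrt(1.0 + 2.0**(-2 * i)) for i in range(N_STAGES))

def addsub(a, b, sigma):
    """Add/sub unit: subtracts when sigma is 1, adds when sigma is 0."""
    return a - b if sigma else a + b

def stage(x, y, z, i):
    """One rotation-mode stage: two right shifts (RSH) and three add/sub units."""
    sigma = 1 if z >= 0 else 0                 # role of ph1 in this sketch
    x_next = addsub(x, y >> i, sigma)          # x - y*2^-i when z >= 0
    y_next = addsub(y, x >> i, 1 - sigma)      # y + x*2^-i when z >= 0
    z_next = addsub(z, ATAN[i], sigma)         # z - atan(2^-i) when z >= 0
    return x_next, y_next, z_next

def parallel_cordic(x0, y0, z0):
    """Chains all stages, as the unrolled (parallel) architecture does."""
    x, y, z = x0, y0, z0
    for i in range(N_STAGES):
        x, y, z = stage(x, y, z, i)
    return x, y, z

# 30 degrees is 30/360 * 65536 = 5461 angle units.
x_out, y_out, z_out = parallel_cordic(10000, 0, 5461)
cos_est = x_out / GAIN / 10000                 # scale block: divide by the gain
sin_est = y_out / GAIN / 10000
```

Python's `>>` on negative integers is an arithmetic shift, which matches the signed right shift a hardware RSH block would perform.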

3.2 Serial architecture


The serial architecture is similar to one stage of the parallel CORDIC architecture. The control signals for the add/sub units at the inversion stage and at the scaling and inversion stage are the same as in the parallel CORDIC architecture. The rotation angles are pre-stored in a LUT, and a 4-bit counter is used to fetch them. The VRSH block is a variable right-shift block that shifts the input operands, and the FDC block is a controlled register block that stores data when the enable signal is asserted. The total number of iterations required to evaluate a function depends on the data width and the required precision.

Figure 3.2: Serial architecture of CORDIC
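The serial datapath can be modelled as a single register set that is updated once per clock. The Python sketch below is a behavioural illustration with hypothetical names (`SerialCordic`, `load`, `clock`); the iteration count of 12 and the 2^16-per-turn angle scaling are assumptions.

```python
import math

ITERATIONS = 12      # assumption: fits within the 4-bit counter range 0..15
# LUT of atan(2^-i) in angle units, assuming 2**16 units per full turn.
LUT = [round(math.atan(2.0**-i) / (2 * math.pi) * 2**16) for i in range(ITERATIONS)]

class SerialCordic:
    """One shared add/sub stage reused for every iteration."""

    def __init__(self):
        self.x = self.y = self.z = 0
        self.count = 0           # 4-bit iteration counter
        self.busy = False        # acts as the register enable

    def load(self, x0, y0, z0):
        self.x, self.y, self.z = x0, y0, z0
        self.count = 0
        self.busy = True

    def clock(self):
        """One clock edge: a single micro-rotation while enabled."""
        if not self.busy:
            return
        i = self.count
        angle = LUT[i]                          # LUT read addressed by the counter
        xs, ys = self.x >> i, self.y >> i       # variable right shift (VRSH)
        if self.z >= 0:
            self.x, self.y, self.z = self.x - ys, self.y + xs, self.z - angle
        else:
            self.x, self.y, self.z = self.x + ys, self.y - xs, self.z + angle
        self.count = (self.count + 1) & 0xF
        if self.count == ITERATIONS:
            self.busy = False

core = SerialCordic()
core.load(10000, 0, 5461)        # 5461 angle units is about 30 degrees
cycles = 0
while core.busy:
    core.clock()
    cycles += 1
```

The trade-off against the parallel form is visible in `cycles`: one result takes as many clocks as there are iterations, in exchange for a single set of add/sub and shift hardware.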

4 Design Parameters

4.1 Performance Requirements


The operating frequency of the clock must be 100MHz with no DRC violations after RTL to
GDS synthesis.

4.2 Clock distribution


N/A

4.3 Reset

An active high synchronous reset was used in this design.

4.4 Timing Description


The clock must operate at 100 MHz. Input delay and output delay were each set to 0.1 ns in the constraint file, and clock latency and clock uncertainty were each set to 0.1 ns.

5 Verification Strategy

5.1 Objectives
• Pertaining to Verilog code:
After the Verilog code for the algorithm is written, a testbench must be created to check that the code functions correctly for a set of input combinations.

• Pertaining to RTL to GDS synthesis:
The main goal of verification for RTL to GDS synthesis is to make sure there are no DRC violations or other errors in the subsequent steps, except scan-chain errors.

5.2 Tools and version

• Xilinx Vivado 2022.2
• Cadence Innovus (TM) Implementation System 16.26
• Cadence Genus (TM) Synthesis Solution 2019

5.3 Checking mechanisms

• Check that there are no DRC violations after the entire flow.
• Check that no errors other than scan-chain errors occur at any point in the flow.

6 Functional Checklist

7 Testbench
Verilog code for the testbench of the design:

`timescale 1ns / 1ps  // time unit assumed to be 1 ns so that #10 is one 100 MHz period

module cordic_test();
    localparam SZ = 8;                 // bits of accuracy
    localparam CLK100_SPEED = 10;      // 100 MHz = 10 ns period
    localparam FALSE = 1'b0;
    localparam TRUE  = 1'b1;

    reg  [2*SZ - 1:0] angle;           // 16-bit angle code, 2^16 = 360 degrees
    wire [SZ:0] Xout, Yout;            // cosine and sine outputs of the design
    reg  CLK_100MHZ;
    reg  reset;
    reg  signed [63:0] i;              // test angle in degrees
    reg  start;

    // Design under test
    main sine_cosine (CLK_100MHZ, reset, angle, Xout, Yout);

    // Waveform generator
    initial
    begin
        start = FALSE;
        reset = FALSE;
        $display("Starting sim");
        angle = 16'b0000000000000000;
        i = 130;
        // Expected: Xout = 255*cos(angle), Yout = 255*sin(angle)
        #10 reset = 1'b1;
        #10 reset = 1'b0;
        #1000;
        @(posedge CLK_100MHZ);
        start = TRUE;
        // sin/cos output; uncomment one loop (and begin/end) to sweep angles
        //for (i = 0; i < 360; i = i + 1)   // from 0 to 359 degrees in 1 degree increments
        //for (i = 30; i < 60; i = i + 30)  // increment by 30 degrees only
        //begin
        @(posedge CLK_100MHZ);
        start = FALSE;
        angle = ((1 << 16)*i)/360;      // convert degrees to the 16-bit angle code
        $display("angle = %d, %h", i, angle);
        //end
        #500;
        $display("Simulation has finished");
        $stop;
    end

    // 100 MHz clock generator
    initial
    begin
        CLK_100MHZ = 1'b0;
        $display("CLK_100MHZ started");
        #5;
        forever
        begin
            #(CLK100_SPEED/2) CLK_100MHZ = 1'b1;
            #(CLK100_SPEED/2) CLK_100MHZ = 1'b0;
        end
    end
endmodule
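The testbench converts a test angle in degrees to a 16-bit code with `((1 << 16)*i)/360`, and the design reads the quadrant from the two MSBs of that code. The Python sketch below mirrors that arithmetic for non-negative angles; the function names `encode_angle` and `quadrant` are hypothetical helpers, not part of the testbench.

```python
def encode_angle(degrees, bits=16):
    """Map whole degrees onto the angle code where 2**bits is a full turn."""
    full_turn = 1 << bits
    # Integer division truncates, as the Verilog expression does for non-negative values.
    return (((1 << bits) * degrees) // 360) % full_turn

def quadrant(code, bits=16):
    """Return the two MSBs of the angle code (0..3)."""
    return code >> (bits - 2)

code = encode_angle(130)
print(f"angle = 130, {code:04x}")   # prints: angle = 130, 5c71
```

For the 130 degree test angle the code is 0x5C71, whose two MSBs are 2'b01, so the design treats it as a second-quadrant angle.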

8 Tests Specifications
All the test suites and functional tests performed are described in the previous section.

9 Design Microarchitecture

• Verilog code for the design of the algorithm:

module main(clk, reset, angle, cos_out, sin_out);
    parameter  PROCESSOR_DEPTH = 8;   // bit width of input and output data
    localparam NO_OF_STAGES    = 13;  // depth of the X, Y and Z register arrays

    input clk, reset;

    // The 2 MSBs of the angle give the quadrant:
    // 2 MSB = 2'b00 which represents 0 - PI/2 range
    // 2 MSB = 2'b01 which represents PI/2 to PI range
    // 2 MSB = 2'b10 which represents PI to 3*PI/2 range (i.e. -PI/2 to -PI)
    // 2 MSB = 2'b11 which represents 3*PI/2 to 2*PI range (i.e. 0 to -PI/2)
    input [2*PROCESSOR_DEPTH - 1:0] angle;

    // 1 bit is for the sign and the output is in 2's complement notation
    output signed [PROCESSOR_DEPTH:0] cos_out;
    output signed [PROCESSOR_DEPTH:0] sin_out;

    wire [PROCESSOR_DEPTH - 1:0] Xinitial;
    wire [PROCESSOR_DEPTH - 1:0] Yinitial;

    // For 8 bits the max value of cos(theta) or sin(theta) can be 255, corresponding to value 1.
    // The initial value of X is pre-scaled by Kn = 0.607, which gives 0.607*255 = 155.
    assign Xinitial = 8'd155;
    assign Yinitial = 8'd0;

    // Note: tan_inverse holds NO_OF_STAGES entries of 16 bits each, giving
    // resolution up to atan(2^-12). Entries use the same scale as the angle
    // input (2^16 = 360 degrees), so entry i = atan(2^-i)/360 * 65536.
    wire [15:0] tan_inverse [0:NO_OF_STAGES - 1];
    assign tan_inverse[0]  = 16'd8192; // atan(2^0)   = 45.000 degrees
    assign tan_inverse[1]  = 16'd4836; // atan(2^-1)  = 26.565 degrees
    assign tan_inverse[2]  = 16'd2555; // atan(2^-2)  = 14.036 degrees
    assign tan_inverse[3]  = 16'd1297; // atan(2^-3)  =  7.125 degrees
    assign tan_inverse[4]  = 16'd651;  // atan(2^-4)  =  3.576 degrees
    assign tan_inverse[5]  = 16'd326;  // atan(2^-5)  =  1.790 degrees
    assign tan_inverse[6]  = 16'd163;  // atan(2^-6)  =  0.895 degrees
    assign tan_inverse[7]  = 16'd81;   // atan(2^-7)  =  0.448 degrees
    assign tan_inverse[8]  = 16'd41;   // atan(2^-8)  =  0.224 degrees
    assign tan_inverse[9]  = 16'd20;   // atan(2^-9)  =  0.112 degrees
    assign tan_inverse[10] = 16'd10;   // atan(2^-10) =  0.056 degrees
    assign tan_inverse[11] = 16'd5;    // atan(2^-11) =  0.028 degrees
    assign tan_inverse[12] = 16'd3;    // atan(2^-12) =  0.014 degrees
    //------------------------------------------------------------------------------
    // Registers
    //------------------------------------------------------------------------------
    // Stage-wise outputs
    reg signed [PROCESSOR_DEPTH:0]         X [0:NO_OF_STAGES-1];
    reg signed [PROCESSOR_DEPTH:0]         Y [0:NO_OF_STAGES-1];
    reg signed [2*PROCESSOR_DEPTH - 1:0]   Z [0:NO_OF_STAGES-1];

    //------------------------------------------------------------------------------
    // Stage 0
    //------------------------------------------------------------------------------
    wire [1:0] quadrant;
    assign quadrant = angle[15:14];

    always @(posedge clk)
    begin
        if (reset) begin
            X[0] <= 0;
            Y[0] <= 0;
            Z[0] <= 0;
        end
        else
        begin
            case (quadrant)
                // No initial rotation
                2'b00,
                2'b11:
                begin
                    X[0] <= Xinitial;
                    Y[0] <= Yinitial;
                    Z[0] <= angle;
                end
                // Subtract 90 degrees if in second quadrant
                2'b01:
                begin
                    X[0] <= -Yinitial;
                    Y[0] <= Xinitial;
                    Z[0] <= {2'b00, angle[13:0]};
                end
                // Add 90 degrees if in third quadrant
                2'b10:
                begin
                    X[0] <= Yinitial;
                    Y[0] <= -Xinitial;
                    Z[0] <= {2'b11, angle[13:0]};
                end
            endcase
        end
    end
    //------------------------------------------------------------------------------
    // Generate stages 1 to NO_OF_STAGES-1
    //------------------------------------------------------------------------------
    genvar i;
    generate
        for (i = 0; i < (NO_OF_STAGES-1); i = i + 1)
        begin: XYZ
            wire Z_sign;
            wire signed [PROCESSOR_DEPTH:0] X_shr, Y_shr;
            assign X_shr = X[i] >>> i; // signed shift right
            assign Y_shr = Y[i] >>> i;
            // the sign of the current rotation angle
            assign Z_sign = Z[i][15];  // Z_sign = 1 if Z[i] < 0
            always @(posedge clk)
            begin
                // add/subtract shifted data
                X[i+1] <= Z_sign ? X[i] + Y_shr : X[i] - Y_shr;
                Y[i+1] <= Z_sign ? Y[i] - X_shr : Y[i] + X_shr;
                Z[i+1] <= Z_sign ? Z[i] + tan_inverse[i] : Z[i] - tan_inverse[i];
            end
        end
    endgenerate

    //------------------------------------------------------------------------------
    // Output
    //------------------------------------------------------------------------------
    assign cos_out = X[NO_OF_STAGES-1];
    assign sin_out = Y[NO_OF_STAGES-1];
endmodule
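The constants in the listing (the arctangent table and the initial X value) can be regenerated and compared against the code with a few lines of Python. The sketch below assumes the angle scaling implied by the quadrant decoding, in which 2^16 angle units make a full turn, and the 12 micro-rotations produced by the generate loop; the function names are hypothetical.

```python
import math

def atan_table(stages=13, angle_bits=16):
    """atan(2^-i) for each stage, in units where 2**angle_bits is 360 degrees."""
    full_turn = 1 << angle_bits
    return [round(math.atan(2.0**-i) / (2.0 * math.pi) * full_turn)
            for i in range(stages)]

def cordic_gain(rotations):
    """Gain accumulated by the given number of micro-rotations."""
    return math.prod(math.sqrt(1.0 + 2.0**(-2 * i)) for i in range(rotations))

table = atan_table()
x_initial = round(255 / cordic_gain(12))   # full scale 255, pre-divided by the gain

for i, value in enumerate(table):
    print(f"tan_inverse[{i:2d}] = {value}")
print("Xinitial =", x_initial)
```

Entry i of the table must pair with a shift by i bits in the same stage, so any mismatch between the two shows up directly as an angle error at the outputs.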

9.1 Top Level Interface

Figure 9.1: CORDIC Genus Synthesized Architecture

• The inputs are clk, reset and angle<15:0>.
• The outputs are sin_out<8:0> and cos_out<8:0>.

9.2 Sub-Block Description

• The design was first written and functionally validated using the Vivado tool.
• After the design phase, the RTL code was synthesized using Cadence Genus, which maps it onto pre-existing standard cells and generates a schematic.
• The schematic shows the sub-blocks generated by Cadence Genus during synthesis; each sub-block contributes to the overall function of the design.
• The synthesis flow follows standard industry practice, which keeps the results consistent and repeatable.

9.3 Structural Mapping Process


N/A

10 Physical Hierarchy

10.1 Default Layout Generated

Figure 10.1: Default Layout Generated using Innovus Tool

Figure 10.2: Console View

• Figure 10.1 shows the default layout generated by the Cadence Innovus tool.
• Figure 10.2 shows the console view, which reports no errors.

10.2 Floor planning

Figure 10.3: Specified Floorplan

• Figure 10.3 shows the floorplan. Its boundary was established by trial and error.
• The boundary was adjusted until the floorplan produced no Design Rule Check (DRC) violations.

10.3 Power planning


10.3.1 Adding Power Ring

Figure 10.4: Power Rings Added

• Figure 10.4 shows the layout after the power rings were added.
• The ring widths and spacings were chosen to avoid Design Rule Check (DRC) violations.

10.3.2 Adding Power Stripes

Figure 10.5: Power Stripes Added

• Figure 10.5 shows the layout after the power stripes were added.
• The stripe widths and spacings were chosen to avoid Design Rule Check (DRC) violations.

10.3.3 Power Routing

Figure 10.6: Power Routings

• Figure 10.6 shows the layout after power routing.

10.4 Placement

Figure 10.7: After Placing Pre CTS

Figure 10.8: Console View After Placing

• Figure 10.7 shows the placement of the standard cells of the top-level design generated by Genus.
• Figure 10.8 shows the console window, which reports scan chain errors. These errors are not critical at this stage and can be ignored for now.

A. Placement Plan:
This process is automated by the Cadence Innovus tool.

B. Routing Plan:
This process is automated by the Cadence Innovus tool.

10.5 Clocktree Insertion

Figure 10.9: Post CTS Layout After Nanorouting

Figure 10.10: Console View Post CTS

• Figure 10.9 shows the layout after clock tree synthesis, which distributes the clock signal uniformly to all sequential elements in the circuit.
• Clock tree synthesis keeps the sequential elements of the circuit synchronized.
• In Figure 10.10, the console view shows that clock tree synthesis in Cadence Innovus completed without errors.

10.6 Layout Strategies

• Throughout the Register-Transfer Level (RTL) to Graphic Data System (GDS) flow, the dimensions of all stripes and rings were kept fixed.
• Widths and spacings were maintained so that no Design Rule Check (DRC) violations occur.
• Potential DRC problems were addressed at each stage rather than at the end of the flow, so that the design progressed cleanly from RTL to GDS.

11 Results
11.1 Simulation Waveforms

• At π/6 radians

Figure 11.1: Simulation Waveforms at π/6

• At π/4 radians

Figure 11.2: Simulation Waveforms at π/4

• At π/3 radians

Figure 11.3: Simulation Waveforms at π/3

• Manual calculation table to check the accuracy of the outputs obtained

Table 11.1: Manual Calculations to check the accuracy

• Manual calculations at several angles show a maximum deviation of 5% in the worst case.
• On this basis, an 8-bit CORDIC processor was adopted for the sine and cosine computations.
• The 8-bit processor meets the accuracy requirement while keeping the hardware small.
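The manual check can be reproduced with a short script that computes the ideal outputs and the percentage deviation. The Python sketch below assumes the full-scale value of 255 stated in the design comments; the function names are hypothetical, and no measured values from the simulation are included because they appear only in the waveform figures.

```python
import math

FULL_SCALE = 255   # value that represents 1.0 in the 8-bit design

def expected_outputs(angle_rad):
    """Ideal (cos_out, sin_out) values before rounding to integers."""
    return FULL_SCALE * math.cos(angle_rad), FULL_SCALE * math.sin(angle_rad)

def percent_deviation(measured, expected):
    """Absolute deviation of a measured output, as a percentage of the ideal value."""
    return abs(measured - expected) / abs(expected) * 100.0

for label, angle in (("pi/6", math.pi / 6), ("pi/4", math.pi / 4), ("pi/3", math.pi / 3)):
    cos_ref, sin_ref = expected_outputs(angle)
    print(f"{label}: cos_out ~ {cos_ref:.1f}, sin_out ~ {sin_ref:.1f}")
```

Passing a value read from the waveform as `measured` to `percent_deviation` gives the figure to compare against the 5% worst-case bound.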

11.2 Area

Figure 11.4: Area Report

11.3 Gate Count

Figure 11.5: Gate Count

11.4 Timing

11.4.1 Pre-Clock Tree Synthesis (CTS)

• Worst Case (Setup Delay)

Figure 11.6: Pre-CTS Worst Case Delay (Setup Delay)

• Best Case (Hold Delay)

Figure 11.7: Pre-CTS Best Case Delay (Hold Delay)

11.4.2 Post-Clock Tree Synthesis (CTS)

• Worst Case (Setup Delay)

Figure 11.8: Post-CTS Worst Case Delay (Setup Delay)

• Best Case (Hold Delay)

Figure 11.9: Post-CTS Best Case Delay (Hold Delay)

• Post CTS Slack

Figure 11.10: Post-CTS Slack

11.5 Testability Analysis


This has been explained in Section 6 in detail.

11.6 DRC rule violations

Figure 11.11: DRC Window

• Figure 11.11 shows that there are no DRC violations.
