
FPGA Based Modified Karatsuba Multiplier

This document discusses a modified Karatsuba multiplier for finite field arithmetic. It begins by introducing finite field arithmetic and comparing different finite field multipliers. It then proposes a modified Karatsuba multiplier that splits the product terms into two alternative forms and expresses all terms repeatedly to optimize the Karatsuba algorithm. This modified design saves 14.9% computation time and uses 45.5% fewer slices than the existing Karatsuba multiplier. The design is also applied to compute circular convolution more efficiently. Simulation results show the modified Karatsuba algorithm provides a 26.5% faster computation time and 61.7% reduction in slices for an 8-bit circular convolution compared to the original Karatsuba algorithm.


International Conference on VLSI and Signal Processing (ICVSP): 10 – 12 January, 2014

FPGA Based Modified Karatsuba Multiplier

Jagannath Samanta, Razia Sultana, Jaydeb Bhaumik
Dept. of ECE, Haldia Institute of Technology, Haldia, India

Abstract: Finite field arithmetic is becoming increasingly prominent in many applications. In this paper, the complexity and delay of six different multipliers (Mastrovito, Paar-Roelse, Massey-Omura, Hasan-Masoleh, Berlekamp and Karatsuba) are compared. The paper also presents a modified multiplier based on the Karatsuba multiplication algorithm. To optimize the Karatsuba algorithm, the product terms are split into two alternative forms and all the terms are expressed in a repeated fashion. The modified architecture reduces computation time by 14.9% and uses 45.5% fewer slices than the existing Karatsuba multiplier. The proposed architecture has been simulated and synthesized with the Xilinx ISE design suite for the Spartan and Virtex device families. The new architecture is simple. The proposed Modified Karatsuba Multiplier (MKM) is also applied to compute circular convolution for DSP applications. On the Spartan3E FPGA device family, computing an 8-bit circular convolution with the Modified Karatsuba Algorithm (MKA) is 26.5% faster than with the Karatsuba Algorithm (KA), and it uses 61.7% fewer slices than the existing KA-based convolution.

Keywords- Karatsuba Algorithm; Finite fields; FPGA; VLSI; polynomial multiplication; Circular Convolution

I. INTRODUCTION

Galois fields have gained widespread application in error correcting codes and cryptographic algorithms. Further applications are found in signal processing and pseudo-random number generation. Modern applications in many cases call for VLSI implementations of the arithmetic modules in order to satisfy high-speed requirements. VLSI allows designers to place complex systems consisting of several thousand or even millions of transistors on one or very few chips. VLSI modules containing a Galois field multiplier can be classified into three categories: bit-serial multipliers [6], bit-parallel multipliers, and hybrid multipliers. Bit-parallel architectures tend to be faster and use only combinatorial logic [5]. Bit-serial architectures, on the other hand, require less area and use registers in addition to combinatorial logic. Hybrid multipliers are partially bit-serial and partially bit-parallel; they are faster than bit-serial multipliers, while their area is smaller than that of bit-parallel ones. Efficient VLSI implementation needs a suitable hardware architecture, obtained by suitably combining the field operations of addition and multiplication. Addition can be implemented with very low space complexity. Multiplication is required to be fast, but it is implemented with higher complexity. Efficient architectures require low-complexity, fast multipliers. Assuming a basis representation of the field elements, addition is a relatively inexpensive operation, whereas the other field operation, multiplication, is costly in terms of gate count and delay.

In polynomial multiplication, the Karatsuba algorithm makes multiplication more efficient by saving multiplications at the cost of extra additions, because multiplication is more costly than addition. Addition of two m-bit numbers requires m XOR gates. Koc et al. [8] have proposed a recursive algorithm for fast multiplication of large integers having a precision of $2^k$ computer words, where k is an integer. Their algorithm is derived from the Karatsuba-Ofman algorithm and has the same asymptotic complexity. They claim that the running time of their algorithm is slightly better, as it makes one third as many recursive calls. Murat Cenk et al. [9] gave improved formulas for multiplying polynomials of small degree over $F_2$ using the Chinese Remainder Theorem (CRT), which improve multiplication complexity. Gang Zhou et al. have presented a complexity analysis and efficient FPGA (Field Programmable Gate Array) implementations of bit-parallel mixed Karatsuba-Ofman multipliers in [10]. By introducing common expression sharing and a complexity analysis of odd-term polynomials, they achieved a lower gate bound than previous ASIC implementations. They extended the analysis to 4-input/6-input lookup tables (LUTs) on FPGAs and evaluated the LUT complexity and area-time product tradeoffs on FPGAs with different computer-aided design (CAD) tools. They claim that their bit-parallel multipliers consume the least resources among known FPGA implementations.

In this paper, a modified multiplier based on the Karatsuba multiplication algorithm is proposed. To optimize the Karatsuba algorithm, the product terms are split into two alternative forms and all the terms are computed in a repeated fashion. The modified architecture reduces computation time by 14.9% and uses 45.5% fewer slices than the existing Karatsuba multiplier. The proposed design has been simulated and synthesized for the Xilinx Spartan and Virtex FPGA device families. The new architecture is simple. It is also applied to compute circular convolution. On the Spartan3E FPGA device family, computation of 8-bit circular convolution using MKA is 26.5% faster than


KA. It also uses 61.7% fewer slices than the existing KA-based convolution.

The rest of the paper is organized as follows. The basics of Galois field arithmetic and a comparison of different GF multipliers are presented in Section II. A new method for implementing Karatsuba multipliers is proposed in Section III. Results and discussion are provided in Section IV. Section V describes the application of the proposed algorithm to compute circular convolution, and the paper is concluded in Section VI.

II. GALOIS FIELD ARITHMETICS

A Galois field GF($p^m$) is a field with $p^m$ elements, where $p$ is a prime number [7]. The order of a Galois field is the number of elements in the field. Addition and multiplication are the two basic operations of Galois field arithmetic. Addition and subtraction of elements of GF($2^m$) are simple XOR operations on the two operands. Each element of the field is first represented as a corresponding polynomial. Multiplication over the Galois field is a more complex operation than addition. For m = 4, the product is represented as follows:

$$A(x) = a_3x^3 + a_2x^2 + a_1x + a_0 \quad (1)$$

$$B(x) = b_3x^3 + b_2x^2 + b_1x + b_0 \quad (2)$$

$$A(x) \times B(x) = (a_3b_3)x^6 + (a_3b_2 + a_2b_3)x^5 + (a_3b_1 + a_2b_2 + a_1b_3)x^4 + (a_3b_0 + a_0b_3 + a_2b_1 + a_1b_2)x^3 + (a_2b_0 + a_1b_1 + a_0b_2)x^2 + (a_1b_0 + a_0b_1)x + a_0b_0$$

The result has seven coefficients, which must be converted back into a 4-tuple to achieve closure. This is done by substituting $x^6$, $x^5$ and $x^4$ with their polynomial representations and summing terms:

$$A(x) \times B(x) = (a_3b_3 + a_3b_0 + a_2b_1 + a_1b_2 + a_0b_3)x^3 + (a_3b_3 + a_3b_2 + a_2b_3 + a_2b_0 + a_1b_1 + a_0b_2)x^2 + (a_3b_2 + a_2b_3 + a_3b_1 + a_2b_2 + a_1b_3 + a_1b_0 + a_0b_1)x + (a_3b_1 + a_2b_2 + a_1b_3 + a_0b_0) \quad (3)$$

Eqn. (3) is often expressed in matrix form:

$$\begin{bmatrix} a_0 & a_3 & a_2 & a_1 \\ a_1 & a_3+a_0 & a_3+a_2 & a_1+a_2 \\ a_2 & a_1 & a_3+a_0 & a_3+a_2 \\ a_3 & a_2 & a_1 & a_3+a_0 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix} \quad (4)$$

The multiplications in eqn. (3) can be implemented as logical ANDs and the additions as logical XORs. Thus, the expression requires only 16 AND and 15 XOR gates to implement.

GF multipliers depend on addition and multiplication. Addition is easy: it equates to a bit-wise XOR of the m-tuples and is realized by an array of m XOR gates. The GF multiplier is much more complicated and is the key to developing efficient GF computational circuits. In this section, we survey six different GF multipliers: the Mastrovito, Paar-Roelse, Massey-Omura, Hasan-Masoleh, Berlekamp and Karatsuba multipliers. The six multipliers are compared by device utilization and combinational path delay using Xilinx simulation tools on FPGA platforms. All designs are coded in Verilog HDL.

Karatsuba Multiplier (KM)

In this section, we introduce the fundamental Karatsuba algorithm, which can be applied to polynomial multiplication. The Karatsuba algorithm was introduced by Karatsuba in 1962. The fundamental Karatsuba multiplication for polynomials in GF($2^m$) is a recursive divide-and-conquer technique. It is considered one of the fastest ways to multiply long numbers. For polynomial multiplication with the original Karatsuba method, both operands are divided into two equal parts. Each sub-operand is then divided again into two parts, and the process continues until each part is a single coefficient.

Figure 1 shows the block diagram of the Karatsuba multiplier for degree-3 polynomials. Let A(x) and B(x) be field polynomials of degree 3 over GF($2^4$). Splitting the polynomials using KM gives the auxiliary variables

$$D_0 = a_0b_0, \quad D_1 = a_1b_1, \quad D_2 = a_2b_2, \quad D_3 = a_3b_3$$
$$D_{0,1} = (a_0 + a_1)(b_0 + b_1)$$
$$D_{0,2} = (a_0 + a_2)(b_0 + b_2)$$
$$D_{1,3} = (a_1 + a_3)(b_1 + b_3)$$
$$D_{3,2} = (a_3 + a_2)(b_3 + b_2)$$
$$D_{0,1,2,3} = (a_0 + a_1 + a_2 + a_3)(b_0 + b_1 + b_2 + b_3)$$

Field multiplication is performed in two steps. First, an ordinary polynomial multiplication of the two field elements is performed. Second, a reduction by an irreducible polynomial is performed to obtain a polynomial of degree (m - 1). Once the irreducible polynomial $p(x) = x^4 + x + 1$ has been selected, the reduction step can be accomplished using XOR gates only [9]. From p(x) we can replace $x^4 = x + 1$, $x^5 = x^2 + x$ and $x^6 = x^3 + x^2$ to obtain C'(x) as follows (over GF(2), subtraction coincides with addition, so every sign below is an XOR):

$$C'(x) = A(x)B(x) \bmod p(x)$$

$$C'(x) = (D_{0,1,2,3} - D_{1,3} - D_{0,2} - D_{3,2} - D_{0,1} + D_0 + D_1 + D_2)x^3 + (D_{0,2} + D_{3,2} + D_1 - D_0)x^2 + (D_{0,1} + D_{1,3} + D_{3,2} - D_0)x + (D_{1,3} - D_1 - D_3 + D_2 + D_0) \quad (5)$$

III. Modified Karatsuba Multiplier (MKM)

This section discusses our Modified Karatsuba Algorithm (MKA). In MKA all techniques are the same as in the fundamental Karatsuba multiplier except for the splitting technique. To optimize the Karatsuba multiplication algorithm, the product terms are split into two alternative forms. This technique requires less area and less delay than the other multiplication algorithms considered here. The results are compared using Xilinx synthesis tools on different FPGA device families such as Spartan and Virtex. Our synthesis results are better than those of the basic Karatsuba algorithm, as shown in the following section.
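The Section II formulas can be cross-checked in software before looking at the modified scheme. The following Python sketch is not the authors' Verilog design; it assumes field elements are packed into 4-bit integers with bit i holding coefficient a_i, and the function names are hypothetical. It compares a plain shift-and-XOR product reduced by p(x) = x^4 + x + 1 against the nine-product Karatsuba form of equation (5) for all 256 operand pairs.

```python
P_X = 0b10011  # p(x) = x^4 + x + 1, the irreducible polynomial used in the text


def gf16_mul_reference(a: int, b: int) -> int:
    """Shift-and-XOR product of two 4-bit field elements, reduced modulo p(x)."""
    prod = 0
    for i in range(4):
        if (b >> i) & 1:
            prod ^= a << i  # add a(x) * x^i; addition over GF(2) is XOR
    for deg in (6, 5, 4):  # replace x^6, x^5, x^4 by their reduced forms
        if (prod >> deg) & 1:
            prod ^= P_X << (deg - 4)
    return prod


def gf16_mul_km(a: int, b: int) -> int:
    """Karatsuba form of equation (5): nine AND products, everything else XOR."""
    a0, a1, a2, a3 = ((a >> i) & 1 for i in range(4))
    b0, b1, b2, b3 = ((b >> i) & 1 for i in range(4))
    d0, d1, d2, d3 = a0 & b0, a1 & b1, a2 & b2, a3 & b3
    d01 = (a0 ^ a1) & (b0 ^ b1)
    d02 = (a0 ^ a2) & (b0 ^ b2)
    d13 = (a1 ^ a3) & (b1 ^ b3)
    d32 = (a3 ^ a2) & (b3 ^ b2)
    d0123 = (a0 ^ a1 ^ a2 ^ a3) & (b0 ^ b1 ^ b2 ^ b3)
    # Signs in equation (5) all become XOR in characteristic 2.
    c3 = d0123 ^ d13 ^ d02 ^ d32 ^ d01 ^ d0 ^ d1 ^ d2
    c2 = d02 ^ d32 ^ d1 ^ d0
    c1 = d01 ^ d13 ^ d32 ^ d0
    c0 = d13 ^ d1 ^ d3 ^ d2 ^ d0
    return (c3 << 3) | (c2 << 2) | (c1 << 1) | c0


# Exhaustive comparison over all 16 x 16 operand pairs.
mismatches = [(a, b) for a in range(16) for b in range(16)
              if gf16_mul_reference(a, b) != gf16_mul_km(a, b)]
print("mismatching operand pairs:", len(mismatches))
```

An exhaustive comparison is feasible here only because GF(2^4) has 16 elements; for larger fields a randomized comparison would be the more practical choice.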


Assume A(x) and B(x) are two field polynomials of degree 3 in GF($2^4$):

$$A(x) = a_3x^3 + a_2x^2 + a_1x + a_0$$
$$B(x) = b_3x^3 + b_2x^2 + b_1x + b_0$$

Fig. 1: Block diagram of Karatsuba multiplier for degree-3 polynomials

Fig. 2: Block diagram of Modified Karatsuba multiplier for degree-3 polynomials

Splitting the coefficients of the polynomial C(x) = A(x)B(x) using MKA gives the following auxiliary variables (the index order is immaterial, so $D_{i,j} = D_{j,i}$):

$$D_0 = a_0b_0, \quad D_1 = a_1b_1, \quad D_2 = a_2b_2, \quad D_3 = a_3b_3$$
$$D_{3,2} = (a_3 + a_2)(b_3 + b_2)$$
$$D_{1,3} = (a_3 + a_1)(b_3 + b_1)$$
$$D_{0,3} = (a_3 + a_0)(b_3 + b_0)$$
$$D_{1,2} = (a_1 + a_2)(b_1 + b_2)$$
$$D_{0,2} = (a_0 + a_2)(b_0 + b_2)$$
$$D_{0,1} = (a_0 + a_1)(b_0 + b_1)$$

Here the operands are split into two alternative terms. Using the auxiliary variables, we obtain the following expression:

$$C(x) = D_3x^6 + (D_{3,2} - D_2 - D_3)x^5 + (D_{1,3} - D_1 - D_3 + D_2)x^4 + (D_{0,3} - D_0 - D_3 + D_{1,2} - D_1 - D_2)x^3 + (D_{0,2} - D_2 - D_0 + D_1)x^2 + (D_{0,1} - D_1 - D_0)x + D_0 \quad (6)$$

Then C'(x) is computed from the relationship C'(x) = C(x) mod p(x). Using the irreducible polynomial $p(x) = x^4 + x + 1$, the terms $x^4$, $x^5$ and $x^6$ in C(x) are replaced by $(x + 1)$, $(x^2 + x)$ and $(x^3 + x^2)$ respectively. The simplified expression of C'(x) is as follows:

$$C'(x) = (D_{0,3} - D_0 + D_{1,2} - D_1 - D_2)x^3 + (D_{0,2} + D_{3,2} + D_1 - D_0)x^2 + (D_{0,1} + D_{1,3} + D_{3,2} - D_0)x + (D_{1,3} - D_1 - D_3 + D_2 + D_0) \quad (7)$$

Figure 2 shows the block diagram of the Modified Karatsuba multiplier for degree-3 polynomials.

IV. RESULTS & DISCUSSION

We have studied the performance of each multiplier over GF($2^4$) using the Xilinx ISE simulation tool. The multipliers are implemented on a Spartan3E xc3s100e-4 device and compared by number of slices, number of 4-input LUTs, bonded I/O blocks and delay.

TABLE 1: Comparison of different multipliers in GF($2^4$)

| GF multiplier | No. of slices (out of 960) | No. of 4-input LUTs (out of 1920) | No. of bonded IOBs (out of 66) | Max. combinational path delay (ns) |
|---|---|---|---|---|
| Mastrovito [2] | 7 | 12 | 12 | 13.195 |
| Paar-Roelse [3] | 7 | 12 | 12 | 13.083 |
| Massey-Omura [4] | 7 | 13 | 12 | 14.932 |
| Hasan-Masoleh [5] | 7 | 12 | 12 | 13.271 |
| Berlekamp [6] | 8 | 15 | 12 | 12.985 |
| Karatsuba Multiplier (KM) [7] | 9 | 15 | 12 | 14.790 |
| Modified Karatsuba Multiplier (MKM) | 6 | 11 | 12 | 13.057 |

Table 1 shows the device utilization and combinational path delay of the various GF($2^4$) multipliers. The proposed multiplier uses fewer slices and LUTs than every other multiplier in the table. It is also faster than all of them except the Berlekamp multiplier.
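The MKM expressions of Section III, equations (6) and (7), can also be checked in software. The sketch below is an illustration under stated assumptions rather than the authors' implementation: elements are 4-bit integers with bit i holding a_i, all function names are hypothetical, and every sign in (6) and (7) is taken as XOR because the arithmetic is over GF(2). It evaluates both the unreduced form (6) followed by the substitution for x^4, x^5, x^6, and the simplified form (7), and compares each with a shift-and-XOR reference.

```python
def gf16_mul_reference(a: int, b: int) -> int:
    """Shift-and-XOR product reduced modulo p(x) = x^4 + x + 1."""
    prod = 0
    for i in range(4):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in (6, 5, 4):
        if (prod >> deg) & 1:
            prod ^= 0b10011 << (deg - 4)
    return prod


def _mkm_terms(a: int, b: int):
    """Return the four D_i products and a helper computing D_{i,j}."""
    abits = [(a >> i) & 1 for i in range(4)]
    bbits = [(b >> i) & 1 for i in range(4)]
    d = [abits[i] & bbits[i] for i in range(4)]

    def pair(i: int, j: int) -> int:
        return (abits[i] ^ abits[j]) & (bbits[i] ^ bbits[j])

    return d, pair


def gf16_mul_mkm_eq6(a: int, b: int) -> int:
    """Equation (6), then reduction with x^4=x+1, x^5=x^2+x, x^6=x^3+x^2."""
    d, pair = _mkm_terms(a, b)
    c = [
        d[0],                                              # x^0
        pair(0, 1) ^ d[1] ^ d[0],                          # x^1
        pair(0, 2) ^ d[2] ^ d[0] ^ d[1],                   # x^2
        pair(0, 3) ^ d[0] ^ d[3] ^ pair(1, 2) ^ d[1] ^ d[2],  # x^3
        pair(1, 3) ^ d[1] ^ d[3] ^ d[2],                   # x^4
        pair(3, 2) ^ d[2] ^ d[3],                          # x^5
        d[3],                                              # x^6
    ]
    r3 = c[3] ^ c[6]
    r2 = c[2] ^ c[5] ^ c[6]
    r1 = c[1] ^ c[4] ^ c[5]
    r0 = c[0] ^ c[4]
    return (r3 << 3) | (r2 << 2) | (r1 << 1) | r0


def gf16_mul_mkm_eq7(a: int, b: int) -> int:
    """Simplified form of equation (7): ten AND products in total."""
    d, pair = _mkm_terms(a, b)
    c3 = pair(0, 3) ^ d[0] ^ pair(1, 2) ^ d[1] ^ d[2]
    c2 = pair(0, 2) ^ pair(3, 2) ^ d[1] ^ d[0]
    c1 = pair(0, 1) ^ pair(1, 3) ^ pair(3, 2) ^ d[0]
    c0 = pair(1, 3) ^ d[1] ^ d[3] ^ d[2] ^ d[0]
    return (c3 << 3) | (c2 << 2) | (c1 << 1) | c0


all_pairs = [(a, b) for a in range(16) for b in range(16)]
agree = all(gf16_mul_reference(a, b) == gf16_mul_mkm_eq6(a, b) == gf16_mul_mkm_eq7(a, b)
            for a, b in all_pairs)
print("equations (6) and (7) agree with the reference:", agree)
```

The ten AND products used here (four D_i and six D_{i,j}) correspond to the MKM multiplication count listed for m = 4 in Table 2.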


Fig. 3: Time delay graph of various multipliers in GF($2^4$)

Figure 3 shows the delay of the various finite field multipliers. From Table 1, the Berlekamp multiplier has the lowest combinational path delay of the finite field multipliers, and the Massey-Omura multiplier has the highest.

TABLE 2: Complexity comparison between KM and MKM for different GF fields

| m | KM # MUL | KM # ADD | MKM # MUL | MKM # ADD |
|---|---|---|---|---|
| 2 | 3 | 4 | 3 | 4 |
| 3 | 6 | 13 | 6 | 12 |
| 4 | 9 | 24 | 10 | 23 |
| 8 | 36 | 139 | 36 | 109 |

TABLE 3: Comparison of resource utilization between KM and MKM in GF($2^8$) for different Xilinx FPGA devices (resource entries are given as used/available)

| Device | Algorithm | # Slices | # 4-input LUTs | # Bonded IOBs | Delay (ns) |
|---|---|---|---|---|---|
| Spartan2 (xc2s15) | KM | 66/192 | 115/384 | 24/90 | 19.835 |
| Spartan2 (xc2s15) | MKM | 36/192 | 62/384 | 24/90 | 15.857 |
| Spartan 2E (xc2s50e) | KM | 66/768 | 115/1536 | 24/182 | 19.095 |
| Spartan 2E (xc2s50e) | MKM | 36/768 | 62/1536 | 24/182 | 15.279 |
| Spartan 3 (xc3s50) | KM | 66/768 | 115/1536 | 24/63 | 16.206 |
| Spartan 3 (xc3s50) | MKM | 36/768 | 62/1536 | 24/63 | 13.948 |
| Spartan 3E (xc3s100e) | KM | 66/960 | 115/1920 | 24/66 | 20.028 |
| Spartan 3E (xc3s100e) | MKM | 36/960 | 62/1920 | 24/66 | 17.035 |
| Virtex (xcv50) | KM | 66/768 | 115/1536 | 24/184 | 24.699 |
| Virtex (xcv50) | MKM | 36/768 | 62/1536 | 24/184 | 19.703 |
| Virtex2 (xc2v50) | KM | 66/256 | 115/512 | 24/88 | 14.759 |
| Virtex2 (xc2v50) | MKM | 36/256 | 62/512 | 24/88 | 12.601 |
| Virtex2P (xc2vp2) | KM | 66/1408 | 115/2816 | 24/140 | 9.14 |
| Virtex2P (xc2vp2) | MKM | 36/1408 | 62/2816 | 24/140 | 7.754 |
| Virtex4 (xc4vFx12) | KM | 66/5472 | 115/10944 | 24/240 | 8.311 |
| Virtex4 (xc4vFx12) | MKM | 36/5472 | 62/10944 | 24/240 | 7.199 |
| Virtex E (xcv50e) | KM | 66/768 | 115/1536 | 24/98 | 16.659 |
| Virtex E (xcv50e) | MKM | 36/768 | 62/1536 | 24/98 | 13.041 |

Table 2 shows the complexity of KM and MKM for m = 2, 3, 4 and 8. For m = 4, KM requires 24 additions and 9 multiplications to compute C(x), whereas MKA requires 10 multiplications and 23 additions, saving 1 addition at the cost of 1 extra multiplication. For m = 8, KM requires 139 additions and 36 multiplications to compute C(x), whereas MKA needs 36 multiplications and 109 additions, so MKA saves 30 additions. Table 3 compares the Karatsuba multiplier (KM) and the Modified Karatsuba Multiplier (MKM) in GF($2^8$) on different Spartan and Virtex FPGA device families.

Fig. 4: Delay graph of 8×8 KM and MKM on different FPGA devices

Fig. 5: Area occupied (% slices) of 8×8 KM and MKM on different FPGA devices

Figure 4 shows the multiplication delay of the MKM in comparison with KM on the different FPGA devices. The proposed architecture has a smaller delay and lower device utilization than KM on every device in Table 3. Figure 5 shows the resource utilization as the percentage of slices needed for the implementation. On Spartan3E, our modified Karatsuba multiplier is 14.9% faster than the Karatsuba multiplier and uses 45.5% fewer slices.

Fig. 6: Simulation results of Modified Karatsuba Multiplier

The simulation results of the 8×8 MKM are shown in Fig. 6. The figure shows the decimal equivalent of the product of two 8-bit numbers. Ports 'a' and 'b' are the two input ports that accept the numbers to be multiplied, and port 'c' is the output port where their product is obtained.
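The percentages quoted for Spartan3E follow directly from the Table 3 entries. As a worked check, the short Python sketch below recomputes the relative delay and slice reductions for every device in Table 3; the data are copied from the table, while the dictionary layout and the helper name `percent_reduction` are choices made for this illustration.

```python
# Combinational path delay in ns from Table 3, as (KM, MKM) per device.
TABLE3_DELAY_NS = {
    "Spartan2 (xc2s15)": (19.835, 15.857),
    "Spartan 2E (xc2s50e)": (19.095, 15.279),
    "Spartan 3 (xc3s50)": (16.206, 13.948),
    "Spartan 3E (xc3s100e)": (20.028, 17.035),
    "Virtex (xcv50)": (24.699, 19.703),
    "Virtex2 (xc2v50)": (14.759, 12.601),
    "Virtex2P (xc2vp2)": (9.14, 7.754),
    "Virtex4 (xc4vFx12)": (8.311, 7.199),
    "Virtex E (xcv50e)": (16.659, 13.041),
}

# Slice usage is the same on every device in Table 3.
KM_SLICES, MKM_SLICES = 66, 36


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative saving of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


for device, (km_ns, mkm_ns) in TABLE3_DELAY_NS.items():
    saving = percent_reduction(km_ns, mkm_ns)
    print(f"{device:<24} delay reduction {saving:5.1f} %")

print(f"slice reduction {percent_reduction(KM_SLICES, MKM_SLICES):.1f} %")
```

For the Spartan 3E row this gives (20.028 - 17.035) / 20.028, about 14.9%, and (66 - 36) / 66, about 45.5%, matching the figures stated in the text.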

TABLE 4: Comparison of device utilization and combinational path delay of 8×8 KM and MKM using different primitive polynomials

| Primitive polynomial | Algorithm | # Slices (out of 768) | # 4-input LUTs (out of 1536) | # Bonded IOBs (out of 63) | Delay (ns) |
|---|---|---|---|---|---|
| $p_1(x) = x^8 + x^4 + x^3 + x^2 + 1$ | KM | 66 | 115 | 24 | 16.206 |
| $p_1(x) = x^8 + x^4 + x^3 + x^2 + 1$ | MKM | 36 | 62 | 24 | 13.948 |
| $p_2(x) = x^8 + x^5 + x^3 + x^2 + 1$ | KM | 69 | 121 | 24 | 15.206 |
| $p_2(x) = x^8 + x^5 + x^3 + x^2 + 1$ | MKM | 36 | 62 | 24 | 13.539 |
| $p_3(x) = x^8 + x^5 + x^3 + x + 1$ | KM | 67 | 116 | 24 | 16.553 |
| $p_3(x) = x^8 + x^5 + x^3 + x + 1$ | MKM | 34 | 59 | 24 | 13.798 |

Fig. 7: (a) Delay (ns); (b) Area occupied (% slices) using different primitive polynomials

Table 4 shows the simulation results for device utilization and combinational path delay of the 8×8 KM and MKM using three different primitive polynomials. The multipliers are implemented on the Xilinx Spartan3 xc3s50e-4 FPGA device. Figure 7(a) shows the delay of KM and MKM for the three primitive polynomials. Figure 7(b) shows the area of KM and MKM for the three primitive polynomials, given as the number of slices needed for the implementation. From Table 4, in all three cases the MKM requires fewer slices and at the same time has the smaller critical path delay.

V. APPLICATION

This section presents the computation of circular convolution using the proposed Modified Karatsuba Algorithm. Assume A and B are two sequences, where $A = \{a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7\}$ and $B = \{b_0, b_1, b_2, b_3, b_4, b_5, b_6, b_7\}$. All the points of A are placed on the outer circle in the counterclockwise direction. Starting at the same point as A, all points of B are placed on the inner circle in the clockwise direction.

The expression for $d_0$ is obtained by multiplying the corresponding sample points and then adding the product terms:

$$d_0 = a_0b_0 + a_7b_1 + a_6b_2 + a_5b_3 + a_4b_4 + a_3b_5 + a_2b_6 + a_1b_7 \quad (8)$$

Applying the Modified Karatsuba Algorithm (MKA) to equation (8), we obtain

$$d_0 = a_0b_0 + (a_7 + a_1)(b_7 + b_1) + a_7b_7 + a_1b_1 + (a_5 + a_3)(b_5 + b_3) + a_5b_5 + a_3b_3 + (a_2 + a_6)(b_2 + b_6) + a_2b_2 + a_6b_6 + a_4b_4 \quad (9)$$

Similarly, the expressions for $d_1$ to $d_7$ are obtained as follows:

$$d_1 = a_0b_1 + a_1b_0 + a_2b_7 + a_3b_6 + a_4b_5 + a_5b_4 + a_6b_3 + a_7b_2$$
$$= (a_0 + a_1)(b_0 + b_1) + a_0b_0 + a_1b_1 + (a_2 + a_7)(b_2 + b_7) + a_2b_2 + a_7b_7 + (a_3 + a_6)(b_3 + b_6) + a_3b_3 + a_6b_6 + (a_5 + a_4)(b_5 + b_4) + a_5b_5 + a_4b_4 \quad (10)$$

$$d_2 = a_0b_2 + a_1b_1 + a_2b_0 + a_3b_7 + a_4b_6 + a_5b_5 + a_6b_4 + a_7b_3$$
$$= a_1b_1 + (a_0 + a_2)(b_0 + b_2) + a_0b_0 + a_2b_2 + (a_7 + a_3)(b_7 + b_3) + a_7b_7 + a_3b_3 + (a_4 + a_6)(b_4 + b_6) + a_4b_4 + a_6b_6 + a_5b_5 \quad (11)$$

$$d_3 = a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0 + a_4b_7 + a_5b_6 + a_6b_5 + a_7b_4$$
$$= (a_0 + a_3)(b_0 + b_3) + a_0b_0 + a_3b_3 + (a_1 + a_2)(b_1 + b_2) + a_1b_1 + a_2b_2 + (a_4 + a_7)(b_4 + b_7) + a_4b_4 + a_7b_7 + (a_6 + a_5)(b_6 + b_5) + a_5b_5 + a_6b_6 \quad (12)$$

$$d_4 = a_0b_4 + a_1b_3 + a_2b_2 + a_3b_1 + a_4b_0 + a_5b_7 + a_6b_6 + a_7b_5$$
$$= (a_0 + a_4)(b_0 + b_4) + a_0b_0 + a_4b_4 + (a_1 + a_3)(b_1 + b_3) + a_1b_1 + a_3b_3 + (a_5 + a_7)(b_5 + b_7) + a_5b_5 + a_7b_7 + a_2b_2 + a_6b_6 \quad (13)$$

$$d_5 = a_0b_5 + a_1b_4 + a_2b_3 + a_3b_2 + a_4b_1 + a_5b_0 + a_6b_7 + a_7b_6$$
$$= (a_0 + a_5)(b_0 + b_5) + a_0b_0 + a_5b_5 + (a_1 + a_4)(b_1 + b_4) + a_1b_1 + a_4b_4 + (a_6 + a_7)(b_6 + b_7) + a_6b_6 + a_7b_7 + (a_2 + a_3)(b_2 + b_3) + a_2b_2 + a_3b_3 \quad (14)$$

$$d_6 = a_0b_6 + a_1b_5 + a_2b_4 + a_3b_3 + a_4b_2 + a_5b_1 + a_6b_0 + a_7b_7$$
$$= (a_0 + a_6)(b_0 + b_6) + a_0b_0 + a_6b_6 + (a_1 + a_5)(b_1 + b_5) + a_1b_1 + a_5b_5 + (a_2 + a_4)(b_2 + b_4) + a_2b_2 + a_4b_4 + a_7b_7 + a_3b_3 \quad (15)$$

$$d_7 = a_0b_7 + a_1b_6 + a_2b_5 + a_3b_4 + a_4b_3 + a_5b_2 + a_6b_1 + a_7b_0$$
$$= (a_0 + a_7)(b_0 + b_7) + a_0b_0 + a_7b_7 + (a_1 + a_6)(b_1 + b_6) + a_1b_1 + a_6b_6 + (a_2 + a_5)(b_2 + b_5) + a_2b_2 + a_5b_5 + (a_3 + a_4)(b_3 + b_4) + a_4b_4 + a_3b_3 \quad (16)$$
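Equations (8) to (16) share one pattern: each cross pair a_i b_j + a_j b_i is replaced by (a_i + a_j)(b_i + b_j) + a_i b_i + a_j b_j. The Python sketch below illustrates that pattern under an explicit assumption that is not spelled out in the text: the samples are single bits and "+" is XOR, which is what makes the replacement exact without any subtraction. The function names and the bit-packing helper are hypothetical and do not reflect the authors' Verilog code.

```python
N = 8  # sequence length used in equations (8)-(16)


def circular_convolution_direct(a, b):
    """d_k = XOR of a_i AND b_j over all i + j = k (mod N), as in equation (8)."""
    d = [0] * N
    for i in range(N):
        for j in range(N):
            d[(i + j) % N] ^= a[i] & b[j]
    return d


def circular_convolution_mka(a, b):
    """Paired form of equations (9)-(16): 8 products a_i b_i plus 28 pair products."""
    sq = [a[i] & b[i] for i in range(N)]
    d = [0] * N
    for i in range(N):
        d[(2 * i) % N] ^= sq[i]  # unpaired term a_i b_i
        for j in range(i + 1, N):
            pair = (a[i] ^ a[j]) & (b[i] ^ b[j])
            # pair + a_i b_i + a_j b_j equals a_i b_j + a_j b_i over GF(2)
            d[(i + j) % N] ^= pair ^ sq[i] ^ sq[j]
    return d


def bits(value):
    """Unpack an integer into N bits, least significant first (illustrative helper)."""
    return [(value >> i) & 1 for i in range(N)]


# Compare both forms on a stepped subset of the 2^16 possible input pairs.
consistent = all(
    circular_convolution_direct(bits(x), bits(y)) == circular_convolution_mka(bits(x), bits(y))
    for x in range(256)
    for y in range(0, 256, 7)
)
print("direct and paired forms agree on the sampled inputs:", consistent)
```

The paired form uses 8 + 28 = 36 AND products, the same number as the MKM multiplication count listed for m = 8 in Table 2.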


Fig. 8: Simulation result of circular convolution using MKA

TABLE 5: Comparison of device utilization and combinational path delay to compute circular convolution using KA and MKA

| Length | Algorithm | # Slices (out of 960) | # 4-input LUTs (out of 1920) | # Bonded IOBs (out of 66) | Delay (ns) |
|---|---|---|---|---|---|
| 4-bit | Circular convolution using KA | 10 | 17 | 12 | 16.949 |
| 4-bit | Circular convolution using MKA | 7 | 12 | 12 | 11.324 |
| 8-bit | Circular convolution using KA | 68 | 118 | 24 | 18.469 |
| 8-bit | Circular convolution using MKA | 26 | 45 | 25 | 13.567 |

Fig. 9: Delay for comparing circular convolution using KA and MKA

Fig. 10: Area occupied (% slices) between circular convolution using KA and MKA

The circular convolution algorithm is coded in Verilog HDL. It is simulated and synthesized using the Xilinx ISE 7.1i software tool. Table 5 compares the device utilization and combinational path delay of circular convolution computed using KA and MKA. Circular convolution based on MKA requires the least area and path delay. Figure 9 shows the delay in computing the convolution with the two algorithms, and Figure 10 shows the resource utilization as the percentage of slices needed for the implementation. On the Spartan3E FPGA device family, computation of 8-bit circular convolution based on MKA is 26.5% faster than with KA. It also uses 61.7% fewer slices than the existing KA-based convolution.

VI. CONCLUSION

In this paper, modified Karatsuba multipliers for degree-3 and degree-7 polynomials have been implemented on an FPGA platform. The device utilization and combinational path delay of the MKM have been compared with the standard 8×8 KM. The proposed multiplier has better timing performance than the standard KM. On the Spartan3E FPGA device, the proposed multiplier has 14.9% less delay than KM and uses 45.5% fewer slices. The new architecture is very simple. This is advantageous for obtaining a suitable trade-off between area and speed when implementing the circular convolution algorithm in VLSI. On the Spartan3E FPGA device family, computation of 8-bit circular convolution using MKA is 26.5% faster than with KA, and it uses 61.7% fewer slices than the existing KA-based convolution. MKM may also be used to design cryptosystems. The proposed multiplier is faster and more hardware-efficient than the existing Karatsuba multiplier.

REFERENCES

[1] Z. J. Shi and H. Yun, "Software implementations of elliptic curve cryptography," International Journal of Network Security, vol. 7, no. 1, pp. 141-150, 2008.
[2] T. Zhang and K. K. Parhi, "Systematic Design of Original and Modified Mastrovito Multipliers for General Irreducible Polynomials," IEEE Trans. Computers, vol. 50, no. 7, pp. 734-749, July 2001.
[3] C. Paar, P. Fleischmann, and P. Roelse, "Efficient Multiplier Architectures for Galois Fields GF($2^{4n}$)," IEEE Trans. Computers, vol. 47, no. 2, pp. 162-170, Feb. 1998.
[4] C. A. Wang, T. K. Truong, H. M. Shao, L. J. Deutsch, J. K. Omura, and I. S. Reed, "VLSI architectures for computing multiplications and inverses in GF($2^m$)," IEEE Transactions on Computers, vol. 34, no. 8, pp. 709-717, Aug. 1985.
[5] A. Reyhani-Masoleh and M. A. Hasan, "A New Construction of Massey-Omura Parallel Multiplier over GF($2^m$)," IEEE Trans. Computers, vol. 51, no. 5, pp. 511-520, May 2002.
[6] E. R. Berlekamp, "Bit-Serial Reed-Solomon Encoder," IEEE Trans. Inform. Theory, vol. IT-28, pp. 869-874, 1982.
[7] A. Karatsuba and Y. Ofman, "Multiplication of many-digital numbers by automatic computers," in Doklady Akad. Nauk SSSR, vol. 145, no. 293-294, pp. 85, 1962.
[8] C. K. Koc and S. S. Erdem, "A Less Recursive Variant of Karatsuba-Ofman Algorithm for Multiplying Operands of Size a Power of Two," Proceedings of the 16th IEEE Symposium on Computer Arithmetic, 1063-1069, 2003.
[9] M. Cenk and F. Özbudak, "Improved Polynomial Multiplication Formulas over $F_2$ Using Chinese Remainder Theorem," IEEE Transactions on Computers, vol. 58, no. 4, pp. 572-576, April 2009.
[10] G. Zhou, H. Michalik, and L. Hinsenkamp, "Complexity Analysis and Efficient Implementations of Bit Parallel Finite Field Multipliers Based on Karatsuba-Ofman Algorithm on FPGAs," IEEE Transactions on Very Large Scale Integration Systems, vol. 18, no. 7, pp. 1057-1066, 2010.

Project Base Paper
No ratings yet
Project Base Paper
6 pages
Haskell High Performance Programming
From Everand
Haskell High Performance Programming
Samuli Thomasson
No ratings yet
Fast Multiplication Algorithms
No ratings yet
Fast Multiplication Algorithms
171 pages
An Efficient and High Speed Overlap Free Karatsuba Based Finite Field Multiplier For Fpga Implementation
No ratings yet
An Efficient and High Speed Overlap Free Karatsuba Based Finite Field Multiplier For Fpga Implementation
15 pages
Temenos T24 Template Programming V 4.0
67% (3)
Temenos T24 Template Programming V 4.0
35 pages
An Efficient and High Speed Overlap Free Karatsuba Based Finite Field Multiplier For Fpga Implementation
No ratings yet
An Efficient and High Speed Overlap Free Karatsuba Based Finite Field Multiplier For Fpga Implementation
167 pages
Grade 5 Mathematics Textbook PDF
50% (4)
Grade 5 Mathematics Textbook PDF
2 pages
Digital Modulations using Matlab
From Everand
Digital Modulations using Matlab
Mathuranathan Viswanathan
4/5 (6)
Vlsi Mtech Document
No ratings yet
Vlsi Mtech Document
72 pages
Khatibzadeh Amir Ali
No ratings yet
Khatibzadeh Amir Ali
114 pages
Sastry1999 Book NonlinearSystems
100% (2)
Sastry1999 Book NonlinearSystems
690 pages
Hardware Implementation of Bit-Parallel Finite Field Multipliers
No ratings yet
Hardware Implementation of Bit-Parallel Finite Field Multipliers
68 pages
(Graduate Texts in Mathematics #67) Jean-Pierre Serre - Local Fields-Springer (1979)
No ratings yet
(Graduate Texts in Mathematics #67) Jean-Pierre Serre - Local Fields-Springer (1979)
248 pages
Efficient Hardware Architectures For Modular Multi
No ratings yet
Efficient Hardware Architectures For Modular Multi
59 pages
DICD Fall 2024 Lecture 09 Arithmetic Circuits
No ratings yet
DICD Fall 2024 Lecture 09 Arithmetic Circuits
52 pages
Fall Semester 2024-25 - STS3007 - TH - AP2024252001241 - 2024-09-10 - Reference-Material-I
No ratings yet
Fall Semester 2024-25 - STS3007 - TH - AP2024252001241 - 2024-09-10 - Reference-Material-I
26 pages
Kris Gaj: Research and Teaching Interests
No ratings yet
Kris Gaj: Research and Teaching Interests
47 pages
Karatsuba Algorithm
No ratings yet
Karatsuba Algorithm
22 pages
29-Karatsuba Algorithm-23-05-2023
No ratings yet
29-Karatsuba Algorithm-23-05-2023
21 pages
Electronics 12 00605 v2
No ratings yet
Electronics 12 00605 v2
19 pages
Applsci 14 03323 v2
No ratings yet
Applsci 14 03323 v2
15 pages
Applsci 14 04085
No ratings yet
Applsci 14 04085
15 pages
Higher Nationals - Summative Assignment Feedback Form: Unit 11: Maths For Computing
100% (3)
Higher Nationals - Summative Assignment Feedback Form: Unit 11: Maths For Computing
15 pages
L16 - Karatsuba Algorithm
No ratings yet
L16 - Karatsuba Algorithm
17 pages
Resize-Pdf - Base Paper 6 - Copy-Numbered
No ratings yet
Resize-Pdf - Base Paper 6 - Copy-Numbered
13 pages
Shimura G. - Automorphic Functions and Number Theory (1968) PDF
No ratings yet
Shimura G. - Automorphic Functions and Number Theory (1968) PDF
38 pages
2017 Mastrovito Form of Non-Recursive Karatsuba Multiplier For All Trinomials
No ratings yet
2017 Mastrovito Form of Non-Recursive Karatsuba Multiplier For All Trinomials
12 pages
1 s2.0 S0045790624001459 Main
No ratings yet
1 s2.0 S0045790624001459 Main
11 pages
2022 Optimized Interpolation of Four-Term Karatsuba Multiplication and A Method of Avoiding Negative Multiplicands
No ratings yet
2022 Optimized Interpolation of Four-Term Karatsuba Multiplication and A Method of Avoiding Negative Multiplicands
11 pages
Hardware Complexity of Modular Multiplication and Exponentiation
No ratings yet
Hardware Complexity of Modular Multiplication and Exponentiation
12 pages
An Efficient and High-Speed Overlap-Free Karatsuba-Based Finite-Field Multiplier For FGPA Implementation
No ratings yet
An Efficient and High-Speed Overlap-Free Karatsuba-Based Finite-Field Multiplier For FGPA Implementation
10 pages
1MV21EC079 Advance VLSI
No ratings yet
1MV21EC079 Advance VLSI
10 pages
2018 Efficient Implementation of Karatsuba Algorithm Based Three-Operand Multiplication Over Binary Extension Field
No ratings yet
2018 Efficient Implementation of Karatsuba Algorithm Based Three-Operand Multiplication Over Binary Extension Field
9 pages
Exploring The Design Space For FPGA Base
No ratings yet
Exploring The Design Space For FPGA Base
9 pages
2) Karatsuba Algorithm
No ratings yet
2) Karatsuba Algorithm
8 pages
DSP48E Efficient Floating Point Multiplier Architectures On FPGA
No ratings yet
DSP48E Efficient Floating Point Multiplier Architectures On FPGA
6 pages
Efficient Design of Single Precision Floating Point Multiplier Paper
No ratings yet

A(x) and B(x) are two field polynomials with degree 3 in

for m=8, KM requires 139 additions and 36 multiplications

Fig. 3: Time delay graph of various multipliers in GF(2^4)

TABLE 3: Comparison of resource utilization between KM and MKM in

Applying Modified Karatsuba Algorithm (MKA) in equation

Similarly, the expressions for d1, d2, d3, d4, d5, d6 and d7 are

d4 = a0b4 + a1b3 + a2b2 + a3b1 + a4b0 + a5b7 + a6b6 + a7b5
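
The index pattern in the d4 expression is that of an 8-point circular convolution: each product a_i b_j satisfies i + j ≡ 4 (mod 8). The following is a minimal Python sketch that regenerates those index pairs and evaluates every d_k directly. The function names are hypothetical, and plain integer arithmetic is assumed purely for illustration; in a finite field the additions and multiplications would be the field operations, and this direct form is not the Karatsuba-based design the paper proposes.

```python
def circular_index_pairs(k, n=8):
    """Return the (i, j) pairs with i + j congruent to k modulo n."""
    return [(i, (k - i) % n) for i in range(n)]


def term_string(k, n=8):
    """Write d_k as a sum of products in the a<i>b<j> notation used above."""
    products = " + ".join(f"a{i}b{j}" for i, j in circular_index_pairs(k, n))
    return f"d{k} = {products}"


def circular_convolution(a, b):
    """Direct n-point circular convolution (assumption: integer arithmetic)."""
    n = len(a)
    if len(b) != n:
        raise ValueError("a and b must have the same length")
    return [
        sum(a[i] * b[j] for i, j in circular_index_pairs(k, n))
        for k in range(n)
    ]


print(term_string(4))
# d4 = a0b4 + a1b3 + a2b2 + a3b1 + a4b0 + a5b7 + a6b6 + a7b5
```

The direct form uses n products per output term, i.e. 64 products for n = 8, which is the cost a Karatsuba-style decomposition aims to reduce.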

area and path delay. Figure 9 shows the delay in computing
