Mini-Project Report Final
Chapter 1
INTRODUCTION
The design of the 32-bit MAC unit incorporates a Vedic multiplier and a
carry-save adder, and can be divided into two distinct parts. The first part focuses on the
multiplier unit, where the conventional multiplier is replaced with a Vedic multiplier
based on the "Urdhva Tiryagbhyam" sutra. Historically, these sutras have been employed
for multiplying two numbers in the decimal number system. In this project,
the same technique is applied to multiplication in the binary number system,
yielding a formulation that is better suited to digital systems. The
technique is a general multiplication formula that can be applied to all
multiplication cases.[1]
The proposed Vedic multiplier relies on the "Urdhva Tiryagbhyam" sutra, which
translates to "Vertically and Crosswise." In the decimal number system, the sutra
offers faster and more convenient calculations. The advantage of a multiplier
based on this sutra lies in its efficiency,
which improves as the number of bits increases.
The architecture of the MAC unit incorporates vital components such as multipliers,
adders, and accumulators, which have been optimized to achieve high throughput and low
latency. The architecture is implemented in a hardware description language
(HDL), synthesized with Cadence Genus, and simulated with Cadence Xcelium. The project
encompasses a comprehensive analysis of the MAC unit's performance, power
consumption, area utilization, and accuracy. Throughput and latency measurements will
offer valuable insights into the processing speed of the units, while power consumption
analysis will facilitate the assessment of their energy efficiency. Additionally, the
evaluation of area utilization will ensure optimal utilization of chip space, while accuracy
assessment will guarantee reliable and precise outcomes.
Further investigation revealed that the MAC unit plays a crucial role in numerous
digital processing systems. However, it was observed that conventional multipliers used in
these systems are not efficient.
To address this issue, the project turned to the principles of Vedic mathematics—a
system of mathematics derived from the ancient Indian wisdom found in the Vedas, the
sacred texts of India. Incorporating Vedic principles into the MAC unit design is essential,
as it acknowledges the contributions of the scholars who dedicated their lives to formulating
these principles. Moreover, it serves as a valuable contribution to future generations,
inspiring them with the works that originated in India and setting an example for young
learners.
To address these challenges, the project uses Vedic multiplication
techniques and carry-save addition in the design of a 32-bit MAC unit. The
Vedic multiplier offers significant speed improvements over conventional multiplication
techniques by exploiting the parallelism inherent in Vedic mathematics. This enables faster
execution of digital signal processing algorithms, reducing latency and meeting the
stringent timing requirements of real-time systems. Additionally, the carry save adder
reduces power consumption by eliminating unnecessary carry propagation during addition
operations, leading to improved energy efficiency.
Researchers are driven by their desire to contribute to the advancements in the field,
publish their findings in academic journals or conference proceedings, and potentially
develop intellectual property for commercial purposes.
The expected outcomes of this project include an improved processing system that
provides enhanced accuracy and reduced computation time. By utilizing the Vedic
multiplication technique within the MAC unit, the overall effectiveness of signal
processing algorithms is enhanced, enabling more efficient analysis and interpretation of
data in diverse fields.
Chapter 2
Adders play a crucial role in digital systems as they are widely used in arithmetic
operations such as subtraction, multiplication, and division. Therefore, the performance of
binary adders significantly impacts the execution of binary operations within a circuit that
consists of such components. Additionally, when considering other aspects of integrated
circuits (ICs) such as area and power, it becomes evident that the hardware dedicated to
addition plays a significant role in these areas. Given the importance of adders in digital
systems, it is crucial to carefully select the appropriate adder design for a given application.
Choosing the right adder design can contribute to achieving optimal performance and
efficiency in the overall IC design.
The control logic within the MAC unit plays a crucial role in coordinating the
sequencing and timing of operations. It generates the necessary control signals for various
components, ensuring proper synchronization and efficient operation of the MAC unit.
In the realm of DSP and various other applications, multipliers play a crucial role.
With advancements in technology, researchers and scholars have been exploring the design,
development, and implementation of multipliers with specific features such as high speed,
low power consumption, regularity of layout, and compactness. These features are desired
in the creation of high-speed, low-power, and compact VLSI circuits.
to increased computational speed [7]. In this paper, we propose a specific method for
accumulating these intermediate products with minimal delay. Multiplication is an essential
operation in Digital Signal Processing (DSP) and finds numerous applications in this field.
When applying this algorithm, the multiplication process is divided into smaller,
parallel steps. Partial products are generated by multiplying corresponding digits of the
multiplicand and multiplier. These partial products are then added concurrently in parallel,
combining the individual results to obtain the final product. By leveraging parallel
processing, the algorithm improves efficiency and accelerates the multiplication process.
The parallel execution of concurrent additions allows for faster computation and enhances
the overall performance of multiplication algorithms based on the Vedic mathematics
principles. The multiplication stages of the Urdhva Tiryagbhyam sutra for two decimal
integers are illustrated in Figure 2.1. At each step, the digits at the two ends of
each line are multiplied, and the products of that step are added together with any
carry from the preceding step. The units digit of this sum is the result digit, and
the remaining higher digits form the carry for the next step. Because all the partial
products and their sums are computed in parallel, the multiplier does not depend on
the processor's clock frequency. As a result,
microprocessors do not need to operate at ever higher frequencies to increase
processing power. The multiplier can be laid out easily on a silicon chip because of its regular structure.
In comparison to the traditional method of multiplication, it saves time, energy, and space [6].
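The column-by-column procedure described above can be sketched as a short behavioural model. This is a minimal Python illustration, not the project's HDL; the function name `urdhva_decimal` is hypothetical, and the model assumes non-negative integer operands.

```python
def urdhva_decimal(x, y):
    """Multiply two non-negative integers with the vertically-and-crosswise scheme."""
    a = [int(d) for d in str(x)][::-1]  # least significant digit first
    b = [int(d) for d in str(y)][::-1]
    n = max(len(a), len(b))
    a += [0] * (n - len(a))             # pad both operands to the same length
    b += [0] * (n - len(b))

    digits = []
    carry = 0
    for k in range(2 * n - 1):
        # Crosswise products whose digit positions add up to k.
        column = sum(a[i] * b[k - i] for i in range(n) if 0 <= k - i < n)
        total = column + carry
        digits.append(total % 10)       # units digit becomes the result digit
        carry = total // 10             # the rest is carried into the next step

    # The final carry supplies the most significant digits.
    result = carry
    for d in reversed(digits):
        result = result * 10 + d
    return result


print(urdhva_decimal(12, 13))  # 156
```

In hardware, the column products of every step are formed at the same time; the loop here only serialises them for readability.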
The Vedic multiplier is based on the ancient Indian mathematical technique called
Urdhva Tiryakbhyam sutra, which translates to "Vertically and Crosswise." The
algorithm for the Vedic multiplier can be summarized as follows:
1. Split the given two numbers (multiplicand and multiplier) into their respective digits
or bit positions.
2. Begin with the rightmost digit/bit of the multiplier and multiply it with each digit/bit
of the multiplicand.
3. Record the partial products obtained from each multiplication.
4. Move to the next digit/bit of the multiplier, one position to the left.
5. Repeat steps 2-4 until all digits/bits of the multiplier have been processed.
6. Sum up all the partial products obtained in step 3, aligning them properly based on
their positions.
7. The final sum represents the product of the multiplicand and multiplier.
Multiplicand: 1 1 0 1
Multiplier: 1 0 1 0
• Multiply the rightmost digit/bit of the multiplier (0) with each digit/bit of the
multiplicand:
Partial products: 0 0 0 0
• Multiply the next digit/bit of the multiplier (1) with each digit/bit of the
multiplicand:
Partial products: 1 1 0 1
• Multiply the next digit/bit of the multiplier (0) with each digit/bit of the
multiplicand:
Partial products: 0 0 0 0
• Multiply the leftmost digit/bit of the multiplier (1) with each digit/bit of the
multiplicand:
Partial products: 1 1 0 1
• Add the partial products, each shifted left by the position of its multiplier bit:
          0 0 0 0
        1 1 0 1
      0 0 0 0
+   1 1 0 1
  ---------------
  1 0 0 0 0 0 1 0 (130 in decimal)
Thus, the product of 13 and 10 using the Vedic multiplier algorithm is 130.
Note: The algorithm can be extended to handle larger numbers by considering additional
digits/bits and following the same procedure [10].
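The seven steps listed above can be checked with a small behavioural model. The sketch below is written in Python for illustration only; the function names `partial_products` and `multiply` and the default `width` of 4 bits are assumptions, not part of the project code.

```python
def partial_products(multiplicand, multiplier, width=4):
    """Return one partial product per multiplier bit, already aligned (shifted)."""
    rows = []
    for pos in range(width):
        bit = (multiplier >> pos) & 1          # steps 2 and 4: take the next multiplier bit
        row = multiplicand if bit else 0       # step 2: multiply it with the multiplicand
        rows.append(row << pos)                # step 6: align by the bit position
    return rows


def multiply(multiplicand, multiplier, width=4):
    """Step 6 and 7: sum the aligned partial products to obtain the product."""
    return sum(partial_products(multiplicand, multiplier, width))


rows = partial_products(0b1101, 0b1010)
print([format(r, "08b") for r in rows])
print(multiply(0b1101, 0b1010))  # 130
```

For operands wider than 4 bits, `width` must be set to the number of multiplier bits.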
Chapter 3
LITERATURE SURVEY
Sachin B. Jadhav, Nikhil N. Man “A Novel High Speed FPGA Architecture for FIR
Filter Design" [8] proposed a hardware implementation of linear phase FIR filter using
merged MAC architecture, which reduces the complexity and increases the speed of
convolution operation. The paper uses sparse powers of two partial products terms
coefficients to implement an FIR filter tap with low hardware cost. The paper also exploits
word and bit level parallelism to achieve high sampling rates. The paper employs modified
4:2 and 5:2 compressor circuits to construct a binary tree based architecture for
multiplication and accumulation. The paper aims to minimize the number of combinational
gates in the critical path by using higher n:2 compressors for array multiplier.
The Vedic multiplier is based on the Urdhva Tiryagbhyam sutra and uses two half adders and
four AND gates for each 2-bit multiplication. The Booth multiplier uses the radix-4 algorithm
and reduces the number of partial products by half. The results show that the Vedic
multiplier has less area, power, and delay than the Booth multiplier.
Aki Vamsi Krishna, S. Deepthi & M. Nirmala Devi "Design of 32-Bit MAC Unit
Using Vedic Multiplier and XOR Logic” [12] presents a design of a 32-bit MAC unit using
a Vedic multiplier and XOR logic. The Vedic multiplier is based on the Nikhilam
Navatashcaramam Dashatah sutra and uses XOR gates to perform subtraction. The XOR
logic is used to reduce the number of adders required for the accumulation process. The
results show that the proposed design has less area, power and delay than a conventional
MAC unit.
Akella Srinivasa, Krishna Vamsi and Ramesh S R "An Efficient Design of 16 Bit
MAC Unit using Vedic Mathematics" [9] note that a Multiply and Accumulate (MAC)
unit is a key component of digital signal processing applications. It performs the operation
of multiplying two operands and adding the result to an accumulator. The performance of
a MAC unit depends on the efficiency of the multiplier and the adder units. Vedic
Mathematics is a system of mathematics that offers fast and simple methods for performing
various arithmetic operations. One of the methods, called Urdhva Tiryakbhyam sutra, can
be used to design a Vedic multiplier that can multiply two binary numbers in parallel. A
Carry save adder (CSA) is a type of adder that can add three or more binary numbers
without propagating the carry bits. A CSA can be used to reduce the partial products
generated by the Vedic multiplier and to speed up the addition process. The paper
proposes the design and implementation of a 16-bit MAC unit using a 16-bit Vedic multiplier
and a 32-bit CSA circuit. The proposed design is compared with existing designs in terms
of power, area, and delay using Verilog HDL and Xilinx Vivado tools.
Ankush Nikam, Swati Salunke, Sweta Bhurse "Design and Implementation of 32bit
Complex Multiplier using Vedic Algorithm" [11] presents a design and implementation of
a 32-bit complex multiplier using a Vedic algorithm based on Nikhilam Sutra. The paper
also presents 8-bit and 16-bit versions of the complex multiplier and compares them with
other existing methods. It also gives comparison of 8-bit, 16-bit, 32-bit complex multiplier
on various performance parameters like power and delay. The proposed system is designed
using VHDL and Verilog and is implemented with the Xilinx ISE 14.2 navigator and
ModelSim v6.3 software. The results show that the proposed complex multiplier has less
area, delay, and power consumption than the other methods.
Shauvik Panda, Dr. Alpana Agarwal “A New High Speed 16x16 Vedic Multiplier”
[14] describe the spanning tree adder (STA), a type of adder that can add multiple binary
numbers using a tree structure. The tree structure consists of multiple levels of full adders
and half adders that are connected in a way that minimizes the critical path delay. An STA
can also reduce the number of partial products generated by the Vedic multiplier and speed
up the addition process. A 16-bit MAC unit using a 16 bit Vedic multiplier and a 32-bit
CSA circuit can perform the operation of multiplying two 16-bit operands and adding the
result to a 32-bit accumulator. The proposed design is implemented using Verilog HDL and
Xilinx Vivado tools. The results show that the proposed design achieves significant
improvement in power and delay over the existing designs.
Chapter 4
The 32-bit MAC unit is designed in Verilog HDL using a top-down design
methodology, and the major functional blocks of the code are detailed here. The
code consists of several modules for performing arithmetic operations using various adders
and multipliers. The modules are organized based on the number of bits involved in the
operations.
• `module ha (a, b, sum, carry) ` - This module represents a half adder. It takes two inputs
`a` and `b` and produces two outputs `sum` and `carry`. The sum output is the XOR of
the inputs, while the carry output is the AND of the inputs.
• `module vedic_2x2(a, b, c) ` - This module represents a 2x2 multiplier using the Vedic
multiplication technique. It takes two 2-bit inputs `a` and `b` and produces a 4-bit output
`c`. The multiplication is performed by decomposing the inputs into smaller units and
using half adders and full adders for intermediate calculations.
• `module add_4_bit (input1, input2, answer) ` - This module represents a 4-bit adder. It
takes two 4-bit inputs `input1` and `input2` and produces a 4-bit output `answer`. The
addition is performed using full adders.
• `module add_6_bit (input1, input2, answer) ` - This module represents a 6-bit adder.
It takes two 6-bit inputs `input1` and `input2` and produces a 6-bit output `answer`. The
addition is performed using full adders.
• `module vedic_4x4(a, b, c)` - This module represents a 4x4 multiplier using the Vedic
multiplication technique. It takes two 4-bit inputs `a` and `b` and produces an 8-bit
output `c`. The multiplication is performed by decomposing the inputs into smaller units
and using 2x2 multipliers and adders for intermediate calculations.
• `module add_8_bit(input1, input2, answer)` - This module represents an 8-bit adder. It
takes two 8-bit inputs `input1` and `input2` and produces an 8-bit output `answer`. The
addition is performed using full adders.
• `module add_12_bit(input1, input2, answer)` - This module represents a 12-bit adder.
It takes two 12-bit inputs `input1` and `input2` and produces a 12-bit output `answer`.
The addition is performed using full adders.
• `module vedic_8x8(a, b, c) ` - This module represents an 8x8 multiplier using the Vedic
multiplication technique. It takes two 8-bit inputs `a` and `b` and produces a 16-bit
output `c`. The multiplication is performed by decomposing the inputs into smaller units
and using 4x4 multipliers and adders for intermediate calculations.
• `module add_16_bit (input1, input2, answer) ` - This module represents a 16-bit adder.
It takes two 16-bit inputs `input1` and `input2` and produces a 16-bit output `answer`.
The addition is performed using full adders.
• `module add_24_bit (input1, input2, answer) ` - This module represents a 24-bit adder.
It takes two 24-bit inputs `input1` and `input2` and produces a 24-bit output `answer`.
The addition is performed using full adders.
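The adder modules above all follow the same pattern: a chain of full adders in which each stage passes its carry to the next. A minimal behavioural sketch of that pattern in Python is given below. The names `full_adder` and `add_n_bit` are hypothetical, and discarding the final carry-out is an assumption based on the stated output widths (an N-bit `answer` for N-bit inputs).

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add_n_bit(input1, input2, n):
    """N-bit ripple-carry addition built from full adders; result is truncated to n bits."""
    answer = 0
    carry = 0
    for i in range(n):
        bit1 = (input1 >> i) & 1
        bit2 = (input2 >> i) & 1
        s, carry = full_adder(bit1, bit2, carry)  # carry ripples into the next stage
        answer |= s << i
    return answer                                 # assumption: final carry-out is dropped


print(add_n_bit(200, 55, 8))  # 255
```

The same function models `add_4_bit` through `add_24_bit` by changing `n`.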
2. The code also defines modules for different types of adders, such as `add_4_bit`,
`add_6_bit`, `add_8_bit`, `add_12_bit`, `add_16_bit`, `add_24_bit`, `add_32_bit`, and
`add_48_bit`. These modules implement N-bit addition using the previously defined half
adders and full adders.
3. The code defines a module `vedic_2x2` that represents a 2x2 multiplier using the Vedic
Multiplication algorithm. It takes two 2-bit inputs (`a` and `b`) and produces a 4-bit output
(`c`).
4. The code defines a module `vedic_4x4` that represents a 4x4 multiplier using the Vedic
Multiplication algorithm. It takes two 4-bit inputs (`a` and `b`) and produces an 8-bit output
(`c`).
5. The code defines a module `vedic_8x8` that represents an 8x8 multiplier using the Vedic
Multiplication algorithm. It takes two 8-bit inputs (`a` and `b`) and produces a 16-bit output
(`c`).
6. The code defines a module `vedic_16x16` that represents a 16x16 multiplier using the
Vedic Multiplication algorithm. It takes two 16-bit inputs (`a` and `b`) and produces a 32-
bit output (`c`).
7. Finally, the code defines a module `vedic_32x32` that represents a 32x32 multiplier
using the Vedic Multiplication algorithm. It takes two 32-bit inputs (`a` and `b`) and
produces a 64-bit output (`c`).
Each module uses the smaller adders and multipliers defined earlier to perform the
necessary addition and multiplication operations. The output of each module represents the
result of the corresponding multiplication operation.
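The way each multiplier is built from four multipliers of half the width can be illustrated with a behavioural model. The Python sketch below is not the project's Verilog: the generic function `vedic` is hypothetical, and ordinary integer addition stands in for the dedicated `add_N_bit` modules. The 2x2 stage uses four AND gates and two half adders.

```python
def ha(a, b):
    """Half adder: sum is XOR of the inputs, carry is AND of the inputs."""
    return a ^ b, a & b


def vedic_2x2(a, b):
    """2x2 multiplier from four AND gates and two half adders; returns a 4-bit product."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    c0 = a0 & b0                       # vertical product of the low bits
    c1, k1 = ha(a1 & b0, a0 & b1)      # crosswise products
    c2, c3 = ha(a1 & b1, k1)           # vertical product of the high bits plus carry
    return c0 | (c1 << 1) | (c2 << 2) | (c3 << 3)


def vedic(a, b, n):
    """n x n multiplier (n a power of two >= 2) built from four n/2 x n/2 multipliers."""
    if n == 2:
        return vedic_2x2(a, b)
    h = n // 2
    mask = (1 << h) - 1
    al, ah = a & mask, a >> h          # split each operand into low and high halves
    bl, bh = b & mask, b >> h
    q0 = vedic(al, bl, h)              # low x low
    q1 = vedic(ah, bl, h)              # high x low
    q2 = vedic(al, bh, h)              # low x high
    q3 = vedic(ah, bh, h)              # high x high
    # Integer addition replaces the add_N_bit modules of the hardware design.
    return q0 + ((q1 + q2) << h) + (q3 << n)


print(vedic(13, 10, 4))  # 130
```

Calling `vedic(a, b, 32)` walks the same 32, 16, 8, 4, 2 hierarchy as the module list above and returns the full 64-bit product.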
Chapter 5
SOFTWARE SPECIFICATIONS
The project is implemented on the Cadence tools which are very particular in their
system requirements. Hence the system requirements are given below.
Note: Cadence SPB and OrCAD products do not support Windows XP 64-bit, Windows 7
Starter and Home Basic, and Windows Server 2003.
In addition, Windows Server support does not include support for Windows Remote
Desktop. So, the following are the Software & Hardware requirements:
In this project, the Cadence Genus tool proves to be highly useful. Genus is an RTL
synthesis tool that takes RTL code, written in a hardware description language like Verilog
or VHDL, and generates a gate-level netlist. By utilizing Genus, users can optimize their
Vedic multiplier designs and obtain gate-level netlists that meet timing, area, and
power constraints.
One of the key advantages of Genus is its ability to perform design exploration.
With this tool, users can experiment with various optimization options and techniques to
explore different design choices and trade-offs. By adjusting synthesis settings and
constraints, they can achieve the desired performance, area, and power characteristics for
their Vedic multipliers.
Genus also allows users to specify constraints for the synthesis process, such as
timing requirements and area constraints. By defining these constraints, users can ensure
that the resulting designs meet the desired performance targets and resource utilization
limits. Genus efficiently manages these constraints and optimizes the designs accordingly.
Another crucial aspect of the Cadence Genus tool is its formal equivalence
checking capability. It rigorously verifies that the gate-level netlists synthesized by
Genus are functionally equivalent to the original RTL designs. This step ensures that the
synthesized designs perform correctly and match the intended behaviour.
Overall, Cadence Genus offers a comprehensive synthesis flow for Vedic multiplier
projects. Its features include RTL synthesis, design exploration, constraint management,
formal equivalence checking, and DFM considerations. Utilizing Genus can significantly
contribute to the design, optimization, and verification of projects, enabling users to
achieve highly efficient and manufacturable Vedic multiplier designs.
One of the key uses of Xcelium is its ability to perform efficient and accurate
functional verifications. With these tools, the researchers could simulate their Vedic
multiplier designs using test vectors and stimuli to ensure that they behave correctly under
various input conditions. Xcelium supports both behavioural and gate-level simulations,
allowing them to verify the designs at different levels of abstraction.
Xcelium also provides advanced debugging features that help identify and resolve
issues in the designs. It offers powerful waveform visualization and analysis capabilities,
allowing the researchers to inspect and analyse the simulation results in detail. By
examining the waveforms and signals, they could pinpoint potential bugs, timing
violations, or other functional issues, and take appropriate corrective actions.
Another significant advantage of Xcelium is its support for various advanced
verification methodologies, such as SystemVerilog Assertions (SVA) and Universal
Verification Methodology (UVM). These methodologies enable the creation of
comprehensive testbenches and the assertion-based verification of the Vedic multiplier
designs. Xcelium's compatibility with these methodologies enhances the effectiveness and
efficiency of the verification process.
Xcelium also incorporates features to improve simulation performance, such as multi-core
simulation acceleration and advanced optimization techniques. These capabilities allow the
researchers to speed up the verification process, especially when dealing with complex
Vedic multiplier designs that require extensive simulation runs.
Furthermore, Xcelium seamlessly integrates with other Cadence tools and design
flows, such as RTL synthesis tools like Genus. This integration facilitates a smooth
transition from synthesis to simulation, enabling them to verify the synthesized gate-level
netlists using Xcelium.
Cadence Genus and Xcelium are valuable tools for this project. Cadence Genus is
a synthesis tool utilized to generate gate-level netlists from RTL designs. It optimizes
designs for area, power, and performance, and incorporates design rule checking and static
timing analysis to ensure adherence to required specifications. Xcelium, on the other hand,
is a simulation tool employed to verify the functionality and performance of designs. It
conducts simulations at various levels of abstraction, including RTL, gate-level, or mixed-
signal, and facilitates code coverage and assertion-based verification to ascertain the
accuracy and completeness of designs. Both Cadence Genus and Xcelium are extensively
employed in industry and academia for digital design and verification. Their inclusion in
this project can enhance the overall quality and reliability of the outcomes [11].
Hence, in this project, synthesis is carried out in the Cadence Genus tool and
simulation is conducted in the Cadence Xcelium tool.
Chapter 6
PROJECT IMPLEMENTATION
sum produced by the carry save adder. In this project, the accumulator is designed to be 48
bits wide, which provides sufficient storage capacity for the accumulated results. The
accumulator maintains the intermediate and final results of the MAC operation, allowing
for iterative calculations.
Together, these three blocks form the main components of the 32-bit MAC Unit. The 16 x
16 Vedic Multiplier generates partial products, the 32-bit Carry Save Adder combines these
partial products to produce a sum, and the 48-bit Accumulator stores and accumulates the
results. This block diagram represents the functional flow of the MAC Unit, highlighting
the key operations involved in multiplying and accumulating data.
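The multiply, carry-save add, and accumulate flow can be sketched as a behavioural model. The Python sketch below is illustrative only: the class `MacModel` and its method names are hypothetical, the multiplier is modelled with Python's `*` operator, and keeping the accumulator as a separate sum vector and carry vector between operations is an assumption about how the carry-save adder is used, not a description of the project's Verilog.

```python
ACC_MASK = (1 << 48) - 1   # 48-bit accumulator width, as stated in the text


def carry_save_add(x, y, z):
    """3-input carry-save adder: returns (sum_vector, carry_vector) with x+y+z == s+c."""
    s = x ^ y ^ z                                  # bitwise sum, no carry propagation
    c = ((x & y) | (y & z) | (x & z)) << 1         # carries, moved to the next bit position
    return s, c


class MacModel:
    def __init__(self):
        self.sum_vec = 0      # accumulator kept in redundant (carry-save) form
        self.carry_vec = 0

    def accumulate(self, a, b):
        product = (a & 0xFFFF) * (b & 0xFFFF)      # stands in for the 16 x 16 multiplier
        s, c = carry_save_add(product, self.sum_vec, self.carry_vec)
        self.sum_vec, self.carry_vec = s & ACC_MASK, c & ACC_MASK

    def value(self):
        # A single carry-propagating addition resolves the final 48-bit result.
        return (self.sum_vec + self.carry_vec) & ACC_MASK


mac = MacModel()
mac.accumulate(3, 4)
mac.accumulate(5, 6)
print(mac.value())  # 42
```

Only `value()` performs a carry-propagating addition; each `accumulate()` call uses bitwise operations alone, which is the property that makes carry-save addition attractive for repeated accumulation.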
Figure 4.2 then shows the detailed block diagram, consisting of all the
intermediate blocks in the 32-bit MAC Unit.
It shows the split and flow of the various blocks inside the MAC unit, such as the
half adder, full adder, multiplier, and accumulator. The analysis focuses on identifying
potential optimization opportunities to further enhance the MAC unit's efficiency.
Techniques such as reducing power consumption, improving throughput, and optimizing
the area are explored to achieve performance improvements [9].
The data path of the MAC unit includes registers, multiplexers, adders, and other
arithmetic and logic elements. It facilitates the movement and manipulation of data during
multiplication and accumulation operations. The design of the data path should consider
factors such as data width, precision, and the number of stages required for efficient
operation.
In this project, the MAC unit utilizes the Vedic multiplication technique. Vedic
mathematics is a system of ancient Indian mathematics that offers efficient methods for
arithmetic operations. The Vedic multiplication technique, also known as the
"Urdhva Tiryagbhyam" method, decomposes complex multiplication into simpler
steps, reducing computational complexity [11].
The Vedic multiplication technique involves the following steps:
a. Vertical and crosswise multiplication: The multiplicand and multiplier are aligned
vertically, and crosswise products are computed by multiplying the corresponding digits.
b. Addition and carry propagation: The crosswise products are added, considering the
appropriate place values, and any carry is propagated to the subsequent steps.
c. Sub-sum generation: Partial products obtained from addition are combined to form sub-
sums, which are accumulated to obtain the result.
The Vedic multiplication technique offers advantages such as parallel computation,
reduced complexity, and scalability, making it a suitable choice for implementing
high-performance MAC units, as shown in Table 6.1.
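Steps a to c can be illustrated for binary operands with a short behavioural model. The Python sketch below is an illustration under stated assumptions, not the project's implementation: the function name `urdhva_binary` is hypothetical, and the columns are evaluated in a loop although the hardware forms all crosswise products in parallel.

```python
def urdhva_binary(a, b, n):
    """Multiply two n-bit operands column by column (vertically and crosswise)."""
    result = 0
    carry = 0
    for k in range(2 * n - 1):
        # Step a: crosswise AND products whose bit positions add up to k.
        lo, hi = max(0, k - n + 1), min(k, n - 1)
        column = sum(((a >> i) & 1) & ((b >> (k - i)) & 1) for i in range(lo, hi + 1))
        # Step b: add the column together with the carry from the previous column.
        total = column + carry
        # Step c: the low bit is the result bit, the rest is carried forward.
        result |= (total & 1) << k
        carry = total >> 1
    result |= carry << (2 * n - 1)     # the last carry is the most significant bit
    return result


print(urdhva_binary(0b1101, 0b1010, 4))  # 130
```

Each column depends on the previous one only through `carry`, so the AND products of all columns can be generated at once and only the column sums need to be chained.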
The control logic of the MAC unit coordinates the timing and sequencing of various
operations. It generates control signals to enable/disable the multiplier, accumulator, and
other components within the data path. The control logic ensures proper synchronization
and coordination of data flow, enabling efficient operation of the MAC unit.
Chapter 7
RESULTS
7.1 The Synthesis and Simulation Results in Cadence
The simulation results were obtained from the Cadence tools on a college system with
the following specifications: 16 GB RAM, 512 GB hard disk, Windows 10 Professional,
inbuilt Intel Xeon graphics, 32 GB virtual memory, and a broadband connection of up to
50 MBps. As mentioned before, the Cadence tools have specific system requirements. Hence, the
MAC unit code could only be simulated on this system. The following snapshots are the
results obtained during the execution:
This is the waveform obtained after running the code together with the test bench in
the simulation tool, where the result of the multiplication of two 32-bit numbers is
obtained as follows. The different test cases for the MAC unit are shown.
Figure 7.2 below shows the synthesized schematic block diagram of the 32-bit MAC Unit,
which includes a Vedic_16x16 module for multiplication, a 32-bit adder for accumulation,
and a 48-bit adder for overflow handling. It was obtained using the Cadence Genus tool and
shows the connectivity and functionality of these components in the overall MAC Unit.
Figure 7.2 - 32 Bit MAC Unit Schematic Block Diagram Comprising of Vedic_16x16 module and
32 Bit & 48 Bit Adder
Figure 7.3 shows the schematic block diagram of the Vedic_16x16 module, which
implements Vedic multiplication using an 8x8 module, a 16-bit
adder, and a 24-bit adder. Created in the Cadence Genus tool, the diagram illustrates the
interconnection and operation of these components within the Vedic_16x16 module.
Figure 7.3 - Vedic_16x16 module Schematic Block Diagram Comprising of Vedic_8x8 module
and 16 Bit & 24 Bit Adder.
Figure 7.4 below shows the schematic block diagram of the Vedic_8x8 module, which
implements Vedic multiplication through a combination of a 4x4 module, an 8-bit
adder, and a 12-bit adder. Developed using the Cadence Genus tool, the diagram represents
the arrangement and functionality of these elements within the Vedic_8x8 module.
Figure 7.4 - Vedic_8x8 module Schematic Block Diagram Comprising of Vedic_4x4 module and 8 Bit & 12
Bit Adder
The Cadence Genus tool was used to generate the large-scale adder circuit shown in
the 48-bit adder block diagram. This block diagram shows how the full adder logic
gates and half adder modules are arranged and connected to perform addition on 48-bit
operands. The adder's structure, including the cascaded stages and carry propagation
pathways, is represented visually in the diagram. It enables a thorough examination of the
circuit's operation and can be used as a guide to understanding the design and implementation
of wide addition operations.
The block diagram highlights the input and output connections of the adder, as well
as the internal data paths and control signals. It visually represents the flow of data through
the different stages of the adder, including the carry-in and carry-out signals used for carry
propagation.
The internal block diagram helps in understanding the functional units within the
48-bit adder, such as full adders or sub-adders, and their arrangement to perform the
addition operation. It enables designers to analyse the circuit's performance, optimize
critical paths, and ensure proper data flow and signal integrity throughout the adder design.
The Linux terminal window lists the modules used by the design, giving
important details about its architecture and component arrangement. In this
instance, the terminal window reports 17 distinct modules and 3029 module instances
overall.
Users and developers can better comprehend the design's structure and composition
because of the information that is displayed. Each module represents a unique functional
unit or component that contributes to the functionality of the whole. The 17 distinct
modules show the range of components used in the design.
Additionally, the 3029 module instances indicate that specific
modules are extensively reused and replicated throughout the design.
Figure 7.9 - Terminal Window: Displaying 17 unique Modules and 3029 instance Modules.
Chapter 8
8.1 Conclusion
The methodology for this project entails the design of a 32-bit MAC unit architecture
utilizing Vedic multiplication principles. The architecture encompasses crucial components
such as the multiplier, adder, and accumulator, which are optimized to achieve high
throughput and low latency. The implementation of the architecture is executed through the
utilization of hardware description languages (HDL), while simulation is performed using
Cadence design tools. These tools provide the necessary environment for modelling,
designing, and verifying the functionality of the architecture. By employing HDL and
Cadence design tools, the project aims to ensure accurate representation and efficient
simulation of the 32-bit MAC unit architecture.
The results obtained from the MAC unit implementation are analysed, compared,
and validated against reference results and existing multiplication techniques. The analysis
focuses on identifying potential optimization opportunities to further enhance the MAC
unit's efficiency. Techniques such as reducing power consumption, improving throughput,
and optimizing the area are explored to achieve performance improvements.
By leveraging the benefits of Vedic mathematics, this project aims to contribute to the
advancement of efficient and high-performance computational systems for many
applications in future development.
Firstly, there is potential for expanding the design to support higher bit-width MAC
Units. With the increasing demand for more powerful computing systems, extending the
MAC Unit to larger word sizes, such as 64-bit or 128-bit, would enhance its applicability
in high-performance applications.
Secondly, future work can focus on optimization techniques to further improve the
MAC Unit's performance and efficiency. This may involve exploring advanced algorithmic
improvements, parallel processing techniques, or power optimization methods. By refining
the design and incorporating these optimization strategies, the MAC Unit can deliver even
faster and more energy-efficient computations [14].
The MAC Unit can be integrated into more complex digital architectures, such as
DSP (Digital Signal Processing) systems or neural network accelerators. Future research
could explore the integration of the MAC Unit into such architectures to enhance their
overall performance and functionality. This integration may involve adapting the MAC
Unit design to suit the specific requirements and constraints of these advanced
architectures.
One area is in the field of digital signal processing, where MAC Units play a crucial
role in computations involving filtering, audio/video processing, and communications. By
further enhancing the performance and efficiency of MAC Units, everyday applications
like audio and video editing software, voice recognition systems, and image processing
algorithms can benefit from faster and more accurate calculations [15].
Future work could involve conducting a comparative analysis of the proposed MAC
Unit with other existing multiplication and accumulation techniques for floating-point numbers.
Benchmarking the performance, area utilization, power consumption, and other relevant
metrics against alternative approaches would provide insights into the strengths and
weaknesses of the Vedic multiplier-based MAC Unit and guide further improvements [16].
REFERENCES
[1] Manpreet Kaur and Amandeep Kaur - "Review Paper on Vedic Multiplier by Using
Different Methods" published at International Journal of Engineering Development
and Research (IJEDR) 2018, Volume 6, Issue 2.
[2] Shraddha Lad and Varsha S. Bendre - "Design and Comparison of Multiplier using
Vedic Sutras" published at 2019 5th International Conference On Computing,
Communication, Control And Automation (ICCUBEA) 21 September 2019.
[5] Vishal Galphat, Nitin Lonbale - "The High Speed Multiplier by using Prefix Adder
with MUX and Vedic Multiplication" published at International Journal of Science
and Research (IJSR) Volume 5 Issue 1, January 2016.
[8] Vishnu Prasad Patidar and Sourabh Sharma - "A Novel High Speed MAC-16x16
Vedic Multiplier Using ripple carry adder on FPGA" published at International
Journal of Engineering Research and Management Studies June 2016.
[9] Akella Srinivasa, Krishna Vamsi and Ramesh S - "An Efficient Design of 16 Bit
MAC Unit using Vedic Mathematics" published at International Conference on
Communication and Signal Processing, April 4-6, 2019, India.
[11] Krishnaveni D and Umarani T.G - "VLSI Implementation of Vedic Multiplier with
reduced delay" published at International Journal of Advanced Technology &
Engineering Research (IJATER) Volume 2, Issue 4, July 2012.
[12] Vijendra Bairwa and Poonam Jindal - "Analysis of 16x16 Vedic and 16 Bit Floating
Point Multiplier -A Comparative Study" published at Proceedings of the 3rd
International Conference on Contents, Computing & Communication (ICCCC-
2022).
[13] Mounika and Ashraf - "A 32 BIT MAC Unit Design Using Vedic Multiplier and
Reversible Logic Gate" published at International Journal and Magazine of
Engineering, Technology, Management and Research.
[14] Suma Nair, K. Sai Naveen, M. Nagamani, M. Sushma Nivasini - "Design of Vedic
Mathematics Based On Mac Unit for Power Optimization" published at
International Journal of Research in Engineering and Science (IJRES) Volume 10
Issue 6, 2022.
[15] A. Abdelgawad and Magdy Bayoumi - "High Speed and Area-Efficient Multiply
Accumulate (MAC) Unit for Digital Signal Processing Application" published at
2007 IEEE International Symposium on Circuits and Systems 30 May 2007.
[16] Akanksha Kant and Shobha Sharma - "Applications of Vedic multiplier designs -
A review" published at 2015 4th International Conference on Reliability, Infocom
Technologies and Optimization (ICRITO) 04 September 2015.