
Efficient Approximate Parallel Prefix Adder Design

The document discusses various types of adders used in digital systems, focusing on Multi-bit Adders and their structures, including Ripple Carry Adder and Parallel Prefix Adder. It highlights the introduction of Approximate Computing techniques to optimize adders for energy efficiency and circuit area, leading to the development of Approximate PPA and its limitations. The paper proposes an Efficient Approximate PPA (EAxPPA) that combines different approximate techniques to enhance performance, supported by experimental results demonstrating significant improvements in circuit area and power savings.

Uploaded by

mailtochipmatrix

CHAPTER 1

INTRODUCTION

Adders are widely used in digital systems. Their applications are not limited to basic arithmetic
operations such as multiplication and decimal addition but extend to more advanced uses in
processor cores and accelerators. An adder typically employs the Full-Adder as its basic building
block, which is composed of two Half-Adders, and a Multi-bit Adder is built from multiple
Full-Adders. Various structures for Multi-bit Adders have been proposed, such as the Ripple Carry
Adder (RCA), Carry Lookahead Adder (CLA), Carry Select Adder (CSLA), and Carry Skip Adder
(CSKA). The simplest structure, RCA, connects Full-Adders in a linear fashion, making it the most
area-efficient Multi-bit Adder. However, it suffers from significant delay due to the propagation
of the carry signal. To resolve this problem, structures such as CLA, CSLA, and CSKA were
proposed, each with its own method of carry calculation. To resolve the delay problem more
effectively, the Parallel Prefix Adder (PPA) structure was introduced. A PPA consists of three
stages: pre-processing, prefix-processing, and post-processing. Different PPAs have been proposed
based on the configuration of the prefix-processing stage. A PPA offers the advantage of minimal
delay compared to traditional serial adders. However, it requires extra logic for parallel carry
calculation, which can result in worse circuit area and power efficiency than serial adders. To
address these weaknesses of each adder design, research has been conducted on applying
Approximate Computing (AxC) to adders, trading accuracy for reduced circuit area and energy
consumption and resulting in the Approximate Adder (AxA). AxC can be applied at the transistor
level, the full-adder level, the multi-bit adder level, and the PPA level. We conduct research on
the Approximate PPA, with AxC applied at the PPA level. A structure with AxC applied to the PPA,
known as AxPPA, has been proposed previously. This paper analyzes the limitations of AxPPA and
introduces the Efficient Approximate PPA (EAxPPA).
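
As a point of reference for the ripple-carry structure described above, here is a minimal bit-level sketch in Python. The function names (`half_adder`, `full_adder`, `ripple_carry_add`) and the default 16-bit width are illustrative assumptions, not taken from the report.

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Full adder built from two half adders plus an OR gate."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2


def ripple_carry_add(x, y, width=16):
    """Chain `width` full adders; the carry ripples from bit 0 upward."""
    carry = 0
    total = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry
```

Each bit has to wait for the carry of the bit below it, which is the linear delay that the CLA, CSLA, CSKA and PPA structures try to shorten.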

1.1. Parallel Prefix Adder


A PPA typically consists of three stages: pre-processing, prefix-processing, and post-processing,
with each stage employing different circuit configurations to perform specific functions. The first
stage, pre-processing, is responsible for encoding the input operands 𝑎 and 𝑏. It includes logic
for generating the carry, 𝑔, and logic for propagating the carry, 𝑝. The boolean equations for the
𝑝 and 𝑔 of the i-th bit are given by (1), (2).

Figure 1. AxPPA structure

The second stage, prefix-processing, groups 𝑝 and 𝑔 to generate the final carry signal over
several steps. Prefix-processing is composed of operation blocks called Prefix Operators (PO).
The boolean equations for the PO are provided by (3), (4). 𝐺 is generated from 𝑝𝑖 and 𝑔𝑖 of the
i-th bit and the previous node's 𝑔𝑘. 𝑃 is generated from 𝑝𝑖 of the i-th bit and the previously
calculated 𝑝𝑘. The outputs 𝑃 and 𝐺 from a PO are then connected as inputs to other POs. Unlike
traditional serial adders, the POs in a PPA are connected in parallel over multiple stages to
compute the carry quickly. The last stage, post-processing, combines the 𝑝 from the pre-processing
stage and the individual bit carries 𝐺 calculated in the prefix-processing stage to generate the
final sum 𝑆 of the operands. The boolean equation for the i-th bit is given by (5).
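
The numbered equations (1) to (5) did not survive in this copy. The standard textbook forms that match the description above are sketched below; the notation, including the convention that 𝐺 with index i-1 is the carry into bit i and that there is no carry-in to bit 0, is an assumption and may differ from the original figures.

```latex
\begin{align}
p_i &= a_i \oplus b_i \tag{1}\\
g_i &= a_i \cdot b_i \tag{2}\\
G   &= g_i + p_i \cdot g_k \tag{3}\\
P   &= p_i \cdot p_k \tag{4}\\
S_i &= p_i \oplus G_{i-1} \tag{5}
\end{align}
```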

PPA performance varies based on the configuration of the POs, leading to the proposal of
different PPA types such as the Kogge-Stone (KS) PPA, Brent-Kung (BK) PPA, Sklansky (SK) PPA,
and Ladner-Fischer (LF) PPA.
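
To make the three stages concrete, the following Python sketch models a Sklansky-style prefix network at the bit level. It is an illustration under stated assumptions (no carry-in, standard generate/propagate definitions, a hypothetical function name `sklansky_add`), not the report's implementation.

```python
def sklansky_add(a, b, width=16):
    """Bit-level model of a Sklansky parallel prefix adder (assumes no carry-in)."""
    # Pre-processing: per-bit propagate and generate.
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]

    # Prefix-processing: one level of prefix operators per doubling of span.
    G, P = g[:], p[:]
    d = 0
    while (1 << d) < width:
        for i in range(width):
            if (i >> d) & 1:
                # k is the top bit of the block directly below bit i's block.
                k = ((i >> d) << d) - 1
                G[i] = G[i] | (P[i] & G[k])
                P[i] = P[i] & P[k]
        d += 1

    # Post-processing: sum bit = propagate XOR carry into that bit.
    total = 0
    for i in range(width):
        carry_in = G[i - 1] if i > 0 else 0
        total |= (p[i] ^ carry_in) << i
    return total, G[width - 1]
```

For a 16-bit adder the loop runs four levels, compared with sixteen sequential carry steps in a ripple-carry adder. Other PPA types differ only in which (i, k) pairs are combined at each level.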

1.2. Approximate Technique
The approximate techniques used in the design of an Approximate PPA can be broadly classified
into three main techniques:
1) Elimination Technique: The elimination technique removes the gates from the existing
operator and connects the input of the operator directly to its output with wires.
2) Constant Technique: The constant technique removes the gates from the existing operator and
ties the output to a constant value, typically cte-0 (1'b0) or cte-1 (1'b1).
3) Simplification Technique: The simplification technique replaces the gates in the existing
operator with other, more area-efficient gates (and, nand, or, nor, xor, xnor).
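
The three techniques can be illustrated on a single prefix operator. The sketch below is hypothetical: which input is wired through, which constant is used, and which gate replaces the AND-OR logic are illustrative choices, not the specific variants evaluated in the report.

```python
from itertools import product


def po_exact(g_i, p_i, g_k, p_k):
    """Exact prefix operator: returns (G, P)."""
    return g_i | (p_i & g_k), p_i & p_k


def po_eliminated(g_i, p_i, g_k, p_k):
    """Elimination (assumed wiring): gates removed, inputs wired to outputs."""
    return g_i, p_i


def po_constant(g_i, p_i, g_k, p_k, g_const=0, p_const=0):
    """Constant: gates removed, outputs tied to cte-0 or cte-1."""
    return g_const, p_const


def po_simplified(g_i, p_i, g_k, p_k):
    """Simplification (assumed gate choice): AND-OR replaced by one OR gate."""
    return g_i | g_k, p_i & p_k


def g_mismatches(po):
    """Count input combinations (out of 16) where G differs from the exact PO."""
    return sum(po(*bits)[0] != po_exact(*bits)[0]
               for bits in product((0, 1), repeat=4))
```

Counting mismatches for one operator is only a local measure. As Chapter 3 argues, a result passes through several operators, so the error of a single PO does not determine the error of the whole adder.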

1.3. AxPPA Design
Approximate PPA refers to a modified structure of the existing PPA in which approximate
techniques are applied to improve the area and power efficiency of the traditional PPA. One
prominent structure is the previously proposed AxPPA, which introduces modifications to address
these inefficiencies. Figure 1 shows the structure of AxPPA. The key idea of AxPPA is to divide
the entire PPA into two parts and apply the elimination technique to the POs of the lower part
to enhance efficiency.

CHAPTER 2
LITERATURE REVIEW

2.1. Introduction to Approximate Computing in Adders


Approximate computing has gained prominence as a strategy to enhance energy efficiency in
digital designs. Parallel prefix adders (PPAs) are a key focus due to their widespread use in
arithmetic units.

● H. Jiang et al. introduced approximate PPAs (AxPPA) that optimize delay and energy
consumption for multimedia and machine learning workloads.
● Z. Liu et al. reviewed approximate adders and their trade-offs in speed, power, and
accuracy, offering insights into their application in resource-constrained environments.

2.2. Architectural Innovations in Approximate PPAs

Innovative architectural techniques have been proposed to improve the efficiency of PPAs.

● R. V. Shenoy et al. designed hybrid approximate PPAs for low-power applications by
combining carry-skip logic with approximate logic gates, achieving 25% area savings.
● S. Srivastava and A. Tiwari explored parallel prefix structures optimized for high-speed
and area-constrained environments.

2.3. FPGA Implementation of Approximate PPAs

Field-programmable gate array (FPGA) implementations are essential to evaluate real-world
performance.

● K. Nagarajan and D. Malhotra demonstrated FPGA designs for PPAs, highlighting
significant power reductions in digital signal processing (DSP) tasks.
● P. Chen et al. focused on FPGA-based hybrid approximate PPAs, achieving up to 40%
energy savings in image processing systems.

2.4. Applications in Internet of Things (IoT)

Approximate PPAs have been tailored for IoT devices that demand low power and acceptable
error rates.

● J. Kim et al. developed lightweight approximate PPAs optimized for energy-constrained
IoT applications, demonstrating their potential in wearable devices.
● L. Zhang et al. proposed designs that balance accuracy and energy consumption for IoT
edge computing platforms.

2.5. Accuracy and Error Metrics

Error-tolerant designs are crucial for approximate adders to maintain functional correctness.

● M. Zhao et al. evaluated accuracy metrics in approximate PPAs, proposing error-resilient
designs for multimedia applications.
● V. K. Singh et al. analyzed the trade-off between precision and power efficiency in
error-tolerant PPA architectures.

2.6. Power and Energy Efficiency in Approximate PPAs

Power efficiency is a key driver for approximate adder designs.

● A. Gupta and S. Bansal achieved 30% energy savings by simplifying logic paths in PPAs
for low-power devices.
● K. Verma et al. examined power-aware PPAs, focusing on voltage scaling and logic
simplification to minimize energy consumption.

2.7. Applications in Neural Networks and AI

Approximate PPAs have found applications in AI accelerators, where exact precision is not
always required.

● N. Patel et al. implemented PPAs in neural network hardware accelerators, showing up to
20% speed improvements.
● T. Kumar et al. optimized adder designs for inference tasks, highlighting their utility in
edge AI systems.

2.8. Security and Robustness in Approximate PPAs

Security concerns in approximate computing include fault tolerance and resistance to hardware
attacks.

● H. Li et al. analyzed the impact of faults in PPAs, proposing secure and fault-resilient
designs for mission-critical applications.
● S. Kumar and P. Roy addressed vulnerabilities in approximate computing and introduced
robust designs to mitigate security risks.

2.9. Hybrid Designs and Emerging Trends

Hybrid approximate PPAs combine the strengths of multiple architectures.

● J. Wang et al. integrated carry-lookahead and ripple-carry designs into a hybrid PPA to
balance speed and power.
● F. Gao et al. explored trends in approximate computing, including machine-learning-guided
PPA optimization.

CHAPTER 3

EXISTING WORK

3.1. Limits of AxPPA

AxPPA has several limitations that need to be considered. First, the approximate technique is
applied only to POs; the effect of applying it to operators other than POs is not examined and
should not be ignored. Second, a result is produced by passing through multiple operators from
input to output, so selecting and applying an approximate technique based on the error rate of a
single PO is insufficient. Third, because only a single technique is provided, there is no
comparative analysis of different techniques. In this paper, we address these limitations and
design our proposed Approximate PPA.

3.2. Optimal Approximate Technique Combination

Figure 2. EAxPPA structure

Table 1. Circuit Area by design

Table 2. Top Z-Score by Combination

Figure 3. Z-Score by Optimal Combination Bit

Figure 4. Z-Score by Area-Efficient Combination Bit

We find the optimal approximate technique combination applicable to the PPA and apply it to our
design. At each stage, a different approximate technique is applied, and a comprehensive Z-Score
analysis is conducted, considering Mean Absolute Error (MAE), Mean Relative Error Distance
(MRED), and circuit area as the three metrics. The circuit area is calculated as the total area
for each circuit in Table 1, synthesized with Synopsys Design Compiler using a CMOS 28nm cell
library and a 1.1V operating voltage. MAE, MRED, and Z-Score are defined as in (6), (7), and (8).

For a sample size of 𝑛, 𝑥 represents the actual value and 𝑦 represents the approximate value.
MAE and MRED are used as accuracy evaluation metrics. The Z-Score standardizes the original
value 𝑥 of each metric, where 𝜇 represents the mean and 𝜎 represents the standard deviation of
that metric. The overall metric is obtained by adding the Z-Scores of all three metrics, and we
define the combination with the lowest overall Z-Score as optimal. The experiment is conducted
10^6 times for each combination, and the metrics are averaged.
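
Equations (6) to (8) are not reproduced in this copy. Under the usual definitions, MAE is the mean of |x - y|, MRED is the mean of |x - y| / |x|, and the Z-Score is (x - μ) / σ. The Python sketch below follows those definitions. Skipping samples whose exact value is zero in MRED and using the population standard deviation are assumptions made here, since the text does not specify either.

```python
from statistics import mean, pstdev


def mae(exact, approx):
    """Mean Absolute Error over paired samples."""
    return mean(abs(x - y) for x, y in zip(exact, approx))


def mred(exact, approx):
    """Mean Relative Error Distance; samples with exact value 0 are skipped (assumption)."""
    return mean(abs(x - y) / abs(x) for x, y in zip(exact, approx) if x != 0)


def z_scores(values):
    """Standardize one metric across candidates (population sigma, an assumption)."""
    mu = mean(values)
    sigma = pstdev(values)  # must be non-zero, i.e. the candidates must differ
    return [(v - mu) / sigma for v in values]


def overall_scores(mae_list, mred_list, area_list):
    """Sum the three per-metric Z-Scores; the lowest total is treated as optimal."""
    return [sum(t) for t in zip(z_scores(mae_list),
                                z_scores(mred_list),
                                z_scores(area_list))]
```

Because all three metrics are "lower is better", adding their Z-Scores gives one number per combination in which accuracy and area carry equal weight.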

Table 3. AxPPA and EAxPPA Comparison

Table 2 presents the results of the experiments, ranked by overall Z-Score, used to determine the
optimal approximate technique for the design. In the experimental results, the combination
[nor, cte-0, cte-0, cte-0, nor] exhibited the best performance.

3.3. Optimal Combination by Bit

The optimal number of bits to which the most efficient approximate technique is applied is
determined through quantitative experiments. The optimal approximate technique combination
derived in Section 3.2 is applied to the lower bits, and performance is compared as the number
of approximated bits varies in order to derive the optimal ratio. The experimental results are
shown in Figure 3. The design performs best when 10 bits are approximated. Subsequently, based
on the derived ratio, the part where the approximate technique is applied is further divided into
three parts, with the most area-efficient combination applied to the lower bits. The experimental
results are shown in Figure 4. The results show optimal performance when the most area-efficient
combination is applied to 8 bits. Figure 2 shows the EAxPPA structure proposed in this paper and
verified through experiments. Different approximate techniques are applied for each stage and
bit, and these choices are the result of quantitative analysis through experiments. EAxPPA is
based on the 16-bit Sklansky PPA. The upper 6 bits are the same as in the exact PPA, and
approximate techniques are applied to the lower 10 bits. Of these, the upper 2 bits use the
optimal approximate technique combination, and the lower 8 bits use the area-efficient
approximate technique combination.
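
The resulting bit partition can be written down as a small lookup, which may help when reading Figure 2. The function name and the region labels are hypothetical; only the 6/2/8 split comes from the text.

```python
def eaxppa_region(bit, width=16, approx_bits=10, area_bits=8):
    """Return which treatment a bit position receives in the 16-bit EAxPPA layout."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    if bit >= approx_bits:
        return "exact"            # upper 6 bits: unchanged Sklansky PPA logic
    if bit >= area_bits:
        return "optimal"          # bits 9..8: optimal technique combination
    return "area-efficient"       # bits 7..0: area-efficient combination


layout = [eaxppa_region(i) for i in range(16)]
```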

3.4. Experimental Results based on ASIC

We conduct ASIC-based performance evaluations of the proposed structure. Using a CMOS 28nm
standard cell library and a 1.1V operating voltage, we perform synthesis with Synopsys Design
Compiler. We compare the results in terms of circuit area saving, power saving, MAE, and MRED.
For the comparison, we implement PPA, AxPPA, and EAxPPA in Verilog. The approximate technique is
applied with the bit configuration that is optimal for the proposed structure. Table 3 presents
the experimental results. AxPPA achieves a 33.03% circuit area saving and a 37.16% power saving
compared to PPA. EAxPPA achieves a 68.35% circuit area saving and a 64.02% power saving compared
to PPA, demonstrating superior hardware performance. Compared directly with AxPPA, EAxPPA
exhibits a 52.74% circuit area saving and a 42.74% power saving. Additionally, it achieves a
53.15% reduction in MAE and a 53.6% reduction in MRED.
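
The EAxPPA-versus-AxPPA figures follow from the two savings measured against the exact PPA. A short worked check in Python (the helper name `saving_vs_reference` is illustrative):

```python
def saving_vs_reference(saving_new, saving_ref):
    """Saving of a new design relative to a reference design.

    Both arguments are percentage savings measured against the same baseline.
    """
    remaining_new = 100.0 - saving_new   # cost left, as % of the baseline
    remaining_ref = 100.0 - saving_ref
    return 100.0 * (1.0 - remaining_new / remaining_ref)


area_vs_axppa = saving_vs_reference(68.35, 33.03)   # about 52.74
power_vs_axppa = saving_vs_reference(64.02, 37.16)  # about 42.74
```

EAxPPA keeps 31.65% of the exact PPA's area while AxPPA keeps 66.97%, so the ratio is about 0.4726, which corresponds to the reported 52.74% saving. The power figures work the same way.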

CHAPTER 4

PROPOSED METHOD

The full adder (FA) is the fundamental block in adder design. Relaxing numerical exactness
provides the freedom to study imprecise or approximate computation, and this freedom offers a
route to low-power and high-speed designs. Existing work introduces error to substantially reduce
power consumption with little loss in output quality. The Error Tolerant Adder (ETA) achieves
tremendous improvements in power consumption and speed by restricting accuracy. ETA II does not
eliminate the entire carry propagation path; instead, it divides the carry propagation path into
a number of smaller paths and completes the carry propagation in these shorter paths
simultaneously. As a result, the performance of the adder is significantly improved in terms of
power consumption and speed.

The building blocks for implementing bio-inspired systems do not require fully precise digital
logic circuits. This allows inaccurate computation, which reduces logic complexity and cost. Soft
additions are generally based on deterministic approximate logic or probabilistic imprecise
arithmetic. The bio-inspired LOA has been designed based on approximate logic. The LOA is the
slowest, but has low power dissipation. Dynamic segmentation divides the adder into adders of
smaller bit width by bit-slicing the data path. However, the segmentation approach incurs a
significant error in meta-function computation because the error accumulates over multiple
cycles. To overcome this issue, an improved dynamic segmentation with multi-cycle error
compensation technique (DSEC) improves accuracy under a wide range of over-scaled voltages.

Approximate computing techniques in error-tolerant applications, such as image and video
processing, provide a considerable improvement in speed and power with a trade-off in quality.
The accuracy requirements of different applications differ from each other. Even the same
application needs different computations with different accuracy requirements, which vary over
time and with user requirements. Therefore, accuracy-configurable arithmetic circuits are
important. With an Accuracy-Configurable Approximate Adder (ACAA), the accuracy can be configured
at runtime by changing the circuit structure, with a trade-off between accuracy, performance, and
power. The Almost Correct Adder (ACA) is the most power-consuming approach, with moderate
accuracy [9].

Logic complexity reduction for adders at the bit level provides better power savings than
conventional low-power design techniques. The logic complexity of a conventional MA cell has been
reduced by reducing the number of transistors. Based on logic complexity reduction, five
simplified versions of the MA have been proposed. The existing AAs (AA1, AA2, AA3, AA4, AA5) and
the proposed AAs (AA6, AA7, AA8, AA9, AA10, AA11, AA12) are discussed in the following
subsections.

5.1. Existing AAs (AA1-AA5)


Two approaches are used in the existing AAs. The first approximates Sum alone and leaves Cout
unchanged. The second approximates both Sum and Cout. The AA1, AA3, AA4 and AA5 designs
approximate both Sum and Cout. The AA2 design approximates Sum only [10]. Table 1 provides the
Sum and Cout for the existing AAs.
5.1.1. Approximate adder 1 (AA1)
The design approach of AA1 involves approximation in both Sum and Cout. In AA1, Sum is precise
for 6 out of 8 cases and Cout is precise for 7 out of 8 cases. The errors occur for A = 0, B = 1,
Cin = 0 (Sum and Cout) and for A = 1, B = 0, Cin = 0 (Sum only). The logic equations for Sum and
Cout of AA1 are given in Eqs. (1) and (2).
Sum = A B Cin + Cout' Cin (1)
Cout = A Cin + B (2)
5.1.2. Approximate adder 2 (AA2)
The design approach of AA2 approximates Sum alone. Sum is precise for 6 out of 8 cases and Cout
is precise for all cases in AA2. The advantage of AA2 is that it introduces only two errors, both
in Sum, while Cout remains correct in all cases. Thus, compared to AA1, AA2 has a lower
probability of error. The logic equations are given in Eqs. (3) and (4).
Sum = Cout' (3)
Cout = A B + B Cin + A Cin (4)
5.1.3. Approximate adder 3 (AA3)
The design approach of AA3 involves approximation in Sum and Cout. Sum is precise for 5 out of 8
cases and Cout is precise for 7 out of 8 cases in AA3. AA3 uses a buffer to compute Sum with
input Cout'. The logic equations are given in Eqs. (5) and (6).
Sum = Cout' (5)
Cout = A Cin + B (6)
5.1.4. Approximate adder 4 (AA4)
The design approach of AA4 involves approximation in Sum and Cout. In AA4, at the logic level,
Sum is precise for 5 out of 8 cases and Cout is precise for 6 out of 8 cases. The logic equations
are given in Eqs. (7) and (8).
Sum = A B Cin + Cout' Cin (7)
Cout = A (8)
5.1.5. Approximate adder 5 (AA5)
The design approach of AA5 involves approximation in Sum and Cout. Sum is precise for 4 out of 8
cases and Cout is precise for 6 out of 8 cases. With the equations below, Cout is wrong for
A = 0, B = 1, Cin = 1 and for A = 1, B = 0, Cin = 0, and Sum is wrong whenever A and Cin differ.
Sum and Cout are buffered copies of B and A, respectively. The logic equations of AA5 are given
in Eqs. (9) and (10).
Sum = B (9)
Cout = A (10)
The existing AAs (AA1-AA5) have been designed based on approximate computing at the transistor
level and use two approaches. The AA1, AA3, AA4 and AA5 designs approximate both Sum and Cout,
whereas the AA2 design approximates Sum alone and leaves Cout unchanged. Among the existing AAs,
AA5 has the highest performance, with a trade-off in accuracy due to the maximum logic
approximation applied to Sum and Cout. Nevertheless, better accuracy under approximation is
possible. This is an opportunity to develop AAs based on approximate computing at the logic level
of the FA. Therefore, various AAs with different errors have been proposed. The proposed designs
have fewer errors and higher performance than the existing AAs, except for AA5, which has the
maximum error.
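
The accuracy counts quoted for AA1 to AA5 can be reproduced by enumerating the eight input cases against an exact full adder. The Python sketch below encodes Eqs. (1) to (10) as printed; the helper names are illustrative.

```python
from itertools import product


def exact_fa(a, b, cin):
    """Exact full adder: returns (Sum, Cout)."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)


def aa1(a, b, cin):
    cout = (a & cin) | b
    return (a & b & cin) | ((1 - cout) & cin), cout


def aa2(a, b, cin):
    cout = (a & b) | (b & cin) | (a & cin)
    return 1 - cout, cout


def aa3(a, b, cin):
    cout = (a & cin) | b
    return 1 - cout, cout


def aa4(a, b, cin):
    cout = a
    return (a & b & cin) | ((1 - cout) & cin), cout


def aa5(a, b, cin):
    return b, a


def correct_cases(cell):
    """Return (Sum cases correct, Cout cases correct) out of 8."""
    sum_ok = cout_ok = 0
    for a, b, cin in product((0, 1), repeat=3):
        s, c = cell(a, b, cin)
        es, ec = exact_fa(a, b, cin)
        sum_ok += int(s == es)
        cout_ok += int(c == ec)
    return sum_ok, cout_ok
```

Calling `correct_cases` on each cell gives (6, 7) for AA1, (6, 8) for AA2, (5, 7) for AA3, (5, 6) for AA4 and (4, 6) for AA5, matching the counts in the text.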

5.2. Proposed AAs


In this section, seven AAs (AA6-AA12) are proposed. During the operation of an adder, the
accuracy of Cout is more significant than that of Sum. Two approaches have been considered to
design the AAs. The first approximates Sum alone and leaves Cout unchanged. The second
approximates both Sum and Cout. The proposed AAs (AA6-AA12) cover both approaches, although the
first approach is the more significant one. The AA7, AA8, AA9, AA10, AA11, and AA12 designs
approximate Sum only. The AA6 design approximates both Sum and Cout. The truth tables for all
input combinations of the AAs are given in Table 2.
5.2.1. Proposed approximate adder 6 (AA6)
In the AA6 design, given in Table 2, both Sum and Cout are approximated. From Table 2, Sum is
correct for 5 out of 8 cases and Cout is correct for 6 out of 8 cases. The logic equations of AA6
are given in Eqs. (11) and (12).
Sum = A' + B Cin (11)
Cout = A (12)
5.2.2. Proposed approximate adder 7 (AA7)
The FA truth table in Table 2 shows that, for AA7, Sum is correct for 6 out of 8 cases and Cout
is correct for all 8 cases. Thus, compared to AA6, AA7 has a lower probability of error. The
logic equations of AA7 are given in Eqs. (13) and (14).
Sum = A' (B + Cin) + B Cin (13)
Cout = A B + B Cin + A Cin (14)
5.2.3. Proposed approximate adder 8 (AA8)
The AA8 design introduces three errors in Sum, and Cout is unchanged. From Table 2, Sum is
correct for 5 out of 8 cases. The logic equations of AA8 are given in Eqs. (15) and (16).
Sum = A' + B Cin (15)
Cout = A B + B Cin + A Cin (16)
5.2.4. Proposed approximate adder 9 (AA9)
As a further simplification of AA8, AA9 introduces one error in Sum, and Cout is unchanged. From
Table 2, Sum is correct for 7 out of 8 cases. The logic equations of AA9 are given in Eqs. (17)
and (18).
Sum = A' B' + B' Cin' + A B Cin + A' B Cin' (17)
Cout = A B + B Cin + A Cin (18)
5.2.5. Proposed approximate adder 10 (AA10)
The AA10 design introduces three errors in Sum, and Cout is unchanged. The FA truth table in
Table 2 shows that, for AA10, Sum is correct for 5 out of 8 cases and Cout is correct for all 8
cases. The logic equations of AA10 are given in Eqs. (19) and (20).
Sum = A' + B Cin (19)
Cout = A B + B Cin + A Cin (20)
5.2.6. Proposed approximate adder 11 (AA11)
The FA truth table in Table 2 shows that, for AA11, Sum is correct for 6 out of 8 cases and Cout
is correct for all 8 cases. This design therefore introduces two errors in Sum and leaves Cout
unchanged. AA11 has fewer errors in Sum than AA10. The logic equations of AA11 are given in
Eqs. (21) and (22).
Sum = A' B' + Cin' + A' B Cin' (21)
Cout = A B + B Cin + A Cin (22)
5.2.7. Proposed approximate adder 12 (AA12)
AA12 reduces the error of AA11 further by introducing only one error in Sum. From Table 2, the
FA truth table shows that Sum is correct for 7 out of 8 cases and Cout is correct for all 8
cases. The logic equations of AA12 are given in Eqs. (23) and (24).
Sum = A B + B Cin' + A' B' Cin + A B' Cin' (23)
Cout = A B + B Cin + A Cin (24)
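
The same truth-table enumeration can be applied to the proposed cells. The Python sketch below covers a subset (AA6 to AA9 and AA12) and encodes their equations as printed above; the helper names are illustrative.

```python
from itertools import product


def exact_fa(a, b, cin):
    """Exact full adder: returns (Sum, Cout)."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)


def majority(a, b, cin):
    """Exact carry, shared by AA7 to AA12."""
    return (a & b) | (b & cin) | (a & cin)


def aa6(a, b, cin):
    return (1 - a) | (b & cin), a


def aa7(a, b, cin):
    return ((1 - a) & (b | cin)) | (b & cin), majority(a, b, cin)


def aa8(a, b, cin):
    return (1 - a) | (b & cin), majority(a, b, cin)


def aa9(a, b, cin):
    na, nb, nc = 1 - a, 1 - b, 1 - cin
    return (na & nb) | (nb & nc) | (a & b & cin) | (na & b & nc), majority(a, b, cin)


def aa12(a, b, cin):
    na, nb, nc = 1 - a, 1 - b, 1 - cin
    return (a & b) | (b & nc) | (na & nb & cin) | (a & nb & nc), majority(a, b, cin)


def correct_cases(cell):
    """Return (Sum cases correct, Cout cases correct) out of 8."""
    sum_ok = cout_ok = 0
    for a, b, cin in product((0, 1), repeat=3):
        s, c = cell(a, b, cin)
        es, ec = exact_fa(a, b, cin)
        sum_ok += int(s == es)
        cout_ok += int(c == ec)
    return sum_ok, cout_ok
```

For these cells the enumeration gives (5, 6) for AA6, (6, 8) for AA7, (5, 8) for AA8, and (7, 8) for both AA9 and AA12.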
The proposed AAs (AA7-AA12) have been developed to reduce the errors in Sum while keeping Cout
free of errors, whereas AA6 has errors in both Sum and Cout. As given in Table 2, AA6 has five
errors, AA8 and AA10 have three errors, AA7 and AA11 have two errors, and AA9 and AA12 have one
error. Among the proposed designs, AA6 has the highest probability of error and AA9 and AA12 have
the lowest. In the proposed AAs, the errors are introduced for different combinations of inputs.
Approximating Cout introduces errors in the carry, and a carry error propagates through
subsequent stages, which further increases the error. The existing AAs have lower performance
than the proposed AA6 design in terms of area, delay and power, except for AA5, which has more
errors in Sum and Cout. From the performance point of view discussed in Section 4.1 and from
Table 2, the approach that approximates both Sum and Cout gives better results. Not all of the
proposed designs claim low power and high speed. Only the proposed AA6 claims

CHAPTER 5

RESULTS AND DISCUSSION

5.1. Existing Approximate Technique

How It Works

1. Division of Inputs:

o The 15-bit inputs A and B are split into two parts:
▪ Accurate Part: The most significant 8 bits (14:7).
▪ Approximate Part: The least significant 7 bits (6:0).

2. Approximate Addition:

o The approximate part is computed using a simple bitwise OR operation, reducing
hardware complexity compared to full adders.

3. Accurate Addition:

o A standard binary addition is performed on the accurate part.

4. Final Sum:

o The results of the accurate and approximate parts are concatenated to form the
16-bit result.

Example Simulation

For the following inputs:

▪ A = 15'b101010101010101 (binary)
▪ B = 15'b010101010101010 (binary)

The operation will produce:

▪ Approximate Part (SUM_approx):
o OR operation on LSBs: A[6:0] | B[6:0] = 7'b1111111.
▪ Accurate Part (SUM_acc):
o Full addition on MSBs: A[14:7] + B[14:7].

Result:

▪ SUM = {SUM_acc, SUM_approx}.
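
The steps above can be modelled in a few lines of Python. This is a behavioural sketch, not the report's Verilog; the function name `lower_or_add` and its parameters are illustrative.

```python
def lower_or_add(a, b, approx_bits=7):
    """Lower-part OR adder: exact add on the upper bits, bitwise OR on the lower bits."""
    mask = (1 << approx_bits) - 1
    sum_approx = (a & mask) | (b & mask)              # A[6:0] | B[6:0]
    sum_acc = (a >> approx_bits) + (b >> approx_bits)  # A[14:7] + B[14:7], 9 bits with carry
    return (sum_acc << approx_bits) | sum_approx       # {SUM_acc, SUM_approx}


A = 0b101010101010101
B = 0b010101010101010
result = lower_or_add(A, B)
```

For the example inputs no bit position has both operands set, so the OR of the lower part equals the true sum and the result is exact (0x7FFF). An input such as 1 + 1 shows the approximation: the OR yields 1 instead of 2, and no carry reaches the accurate part.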

5.2. Proposed Technique

Key Features and Operation:

1. Inputs and Outputs:

o A and B: 16-bit inputs.
o Sum: 16-bit output for the addition result.
o CarryOut: Carry-out signal for overflow.

2. Generate (G) and Propagate (P) Signals:

o G = A & B: This represents the generation of carry.
o P = A ^ B: This represents the propagation of carry.

3. Carry Generation Stages:

o Stage 1 (Exact Calculation for Lower Bits):
▪ Handles the lower 4 bits (C[0] to C[3]) using exact logic for accuracy.
These bits are critical for reducing overall error in approximation.
o Stage 2 (Approximate Mid-Range Carry):
▪ The next 4 bits (C[4] to C[7]) are calculated using approximations. For
instance, C[4] reuses C[2] instead of C[3], which simplifies the logic.
o Stage 3 (Approximate Higher Bits):
▪ Bits C[8] to C[11] are further approximated by using earlier carry signals
like C[4], C[5], etc., instead of precise dependencies.
o Stage 4 (Final Approximation for Upper Bits):
▪ The highest bits (C[12] to C[15]) are calculated with coarse
approximations, reducing the complexity of logic for higher carry signals.

4. Sum Calculation:

o Sum[i] = P[i] ^ C[i-1]: Combines the propagate signal and the previous carry to
compute the sum for each bit.
o For the least significant bit, Sum[0] = P[0].

5. CarryOut:

o The final carry-out signal is C[15], which is derived from the approximate carry
propagation.
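
A behavioural sketch of this staged carry scheme is given below in Python. The text only gives examples of which earlier carry each stage reuses (C[4] from C[2], C[8] from C[4]); the per-stage lag table used here, in particular the lag of 8 for the top stage, is a hypothetical choice made for illustration and not the report's exact logic.

```python
def approx_carry_add(a, b, width=16):
    """Adder whose carry C[i] is formed from an earlier carry C[i - lag]."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # G = A & B
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # P = A ^ B

    # Assumed lags: stage 1 exact (1), stage 2 (2), stage 3 (4), stage 4 (8, hypothetical).
    lag = [1] * 4 + [2] * 4 + [4] * 4 + [8] * 4

    c = [0] * width
    for i in range(width):
        j = i - lag[i]
        carry_src = c[j] if j >= 0 else 0
        c[i] = g[i] | (p[i] & carry_src)

    total = 0
    for i in range(width):
        carry_in = c[i - 1] if i > 0 else 0    # Sum[i] = P[i] ^ C[i-1], Sum[0] = P[0]
        total |= (p[i] ^ carry_in) << i
    return total, c[width - 1]                 # CarryOut = C[15]
```

With these lags the result is exact whenever no carry has to travel through the approximated stages, for example `approx_carry_add(0x00F0, 0x0F0F)`. It can be wrong when it does: `approx_carry_add(0x0030, 0x0010)` drops the carry out of bit 5 because C[5] looks at C[3] rather than C[4].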

Design Benefits:

1. Reduced Complexity:

o By introducing approximations in the mid and higher stages, the logic becomes
simpler, reducing the delay and hardware cost.

2. Lower Power Consumption:

o Approximate logic reduces switching activity and the number of gates, which
decreases power usage.

3. Error-Tolerance:

o The design keeps the lower bits exact (the exact stage), so the carries that feed
the approximate stages are computed without error.

Advantages:

1. Low Power Consumption:

o The approximate carry logic reduces the switching activity and the number of
logic gates, leading to significant power savings compared to conventional adders.

2. High-Speed Computation:

o The use of approximations in carry generation for mid-range and higher-order bits
reduces the critical path delay, making the adder suitable for high-speed
applications.

3. Error-Tolerance:

o By maintaining exact computation in the lower bits and controlled approximation
in the higher bits, the design minimizes the impact of errors on the final result,
making it suitable for applications where exact precision is not critical.

4. Hardware Efficiency:

o The simplified carry logic in the approximate stages reduces the hardware
complexity, resulting in a smaller area footprint and fewer resources.

Limitations:

1. Reduced Accuracy:

o The approximations in the carry generation logic introduce errors, which might
not be acceptable in systems requiring high precision.

2. Application Dependency:

o This adder is more suitable for error-tolerant applications like multimedia
processing, machine learning, or approximate computing. It may not be suitable
for applications requiring strict accuracy, such as cryptography or financial
computations.

Applications:

1. Digital Signal Processing (DSP):

o Low-power and high-speed addition make it ideal for DSP tasks such as filtering
or image processing.

2. Machine Learning:

o Approximate arithmetic can accelerate training and inference, especially for large-
scale neural networks where minor errors do not significantly affect the results.

3. IoT Devices:

o Its energy-efficient design aligns with the needs of battery-powered IoT devices.

4. Multimedia Systems:

o Image and video processing applications can tolerate minor errors without
significant degradation in quality, making this adder a good fit.
