Efficient Approximate Parallel Prefix Adder Design

The document discusses various types of adders used in digital systems, focusing on Multi-bit Adders and their structures, including Ripple Carry Adder and Parallel Prefix Adder. It highlights the introduction of Approximate Computing techniques to optimize adders for energy efficiency and circuit area, leading to the development of Approximate PPA and its limitations. The paper proposes an Efficient Approximate PPA (EAxPPA) that combines different approximate techniques to enhance performance, supported by experimental results demonstrating significant improvements in circuit area and power savings.

Uploaded by

mailtochipmatrix


CHAPTER 1

INTRODUCTION

Adders are widely used in digital systems. Their applications are not limited to basic arithmetic
operations such as multiplication and decimal addition but extend to more advanced uses in
processor cores and accelerators. An adder typically uses the Full-Adder as its basic building
block, which is composed of two Half-Adders, and a Multi-bit Adder is built from multiple
Full-Adders. Various structures for Multi-bit Adders have been proposed, such as the Ripple
Carry Adder (RCA), Carry Lookahead Adder (CLA), Carry Select Adder (CSLA), and Carry Skip
Adder (CSKA). The simplest structure, RCA, connects Full-Adders in a linear fashion, making it
the most area-efficient Multi-bit Adder. However, it suffers from significant delay because the
carry signal must propagate through every stage. To resolve this problem, structures such as CLA,
CSLA, and CSKA were proposed, each with its own method of carry calculation. To resolve the
delay problem more effectively, the Parallel Prefix Adder (PPA) structure was introduced. PPA
consists of three stages: pre-processing, prefix-processing, and post-processing. Different PPAs
have been proposed based on the configuration of the prefix-processing stage. PPA offers lower
delay than traditional serial adders. However, it requires additional logic for parallel carry
calculation, which can make it worse than serial adders in circuit area and power efficiency.
To address these trade-offs, research has applied Approximate Computing (AxC) to adders in order
to balance accuracy against circuit area and energy consumption, resulting in the Approximate
Adder (AxA). AxC can be applied at the transistor level, full-adder level, multi-bit adder
level, and PPA level. We conduct research on Approximate PPA with AxC applied at the PPA level.
A structure with AxC applied to PPA, known as AxPPA, has been proposed previously. This paper
analyzes the limitations of AxPPA and introduces the Efficient Approximate PPA (EAxPPA).

1.1. Parallel Prefix Adder

PPA typically consists of three stages: pre-processing, prefix-processing, and post-processing,
with each stage using a different circuit configuration to perform a specific function. The first
stage, pre-processing, encodes the input operands 𝑎 and 𝑏. It includes logic for generating the
carry, 𝑔, and logic for propagating the carry, 𝑝. The Boolean equations for 𝑝 and 𝑔 of the
i-th bit are given by (1) and (2).

The second stage, prefix-processing, groups 𝑝 and 𝑔 to generate the final carry signal over
several steps. Prefix-processing is composed of an operation block called the Prefix Operator
(PO). The Boolean equations for the PO are given by (3) and (4). 𝐺 is generated from the
previous node's 𝑔𝑘 together with 𝑝𝑖 and 𝑔𝑖 of the i-th bit. 𝑃 is generated from 𝑝𝑖 of the
i-th bit and the previously calculated 𝑝𝑘. The outputs 𝑃 and 𝐺 of a PO are then connected as
inputs to other POs. Unlike traditional serial adders, POs in PPA are connected in parallel over
multiple stages to compute the carry quickly.

Figure 1. AxPPA structure

The last stage, post-processing, combines the 𝑝 from the pre-processing stage with the
individual bit carries 𝐺 calculated in the prefix-processing stage to generate the final sum 𝑆
of the operands. The Boolean equation for the i-th bit is given by (5).
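
For reference, the relations described in this section are commonly written as follows. This is
the standard parallel prefix formulation, assumed here to correspond to (1) through (5). The use
of XOR for the propagate signal is also an assumption taken from common practice, since some
designs use OR instead.

```latex
\begin{align}
p_i &= a_i \oplus b_i \tag{1}\\
g_i &= a_i \cdot b_i \tag{2}\\
G &= g_i + p_i \cdot g_k \tag{3}\\
P &= p_i \cdot p_k \tag{4}\\
S_i &= p_i \oplus G_{i-1} \tag{5}
\end{align}
```

Here $G_{i-1}$ denotes the carry out of bits $i-1$ down to 0, with $G_{-1} = 0$ when there is no
carry-in.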

PPA performance varies with the configuration of the POs in the prefix-processing stage, which
has led to different PPA types such as Kogge-Stone (KS) PPA, Brent-Kung (BK) PPA, Sklansky (SK)
PPA, and Ladner-Fischer (LF) PPA.
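
The three stages can be sketched in software as a behavioural model. The following is a minimal
Python sketch of an exact Sklansky-style PPA, not the Verilog used in this work. The function name
`sklansky_add`, the XOR-based propagate signal, and the zero carry-in are assumptions made for
illustration.

```python
def sklansky_add(a: int, b: int, width: int = 16) -> int:
    """Return a + b computed with a Sklansky-style parallel prefix network."""
    bits_a = [(a >> i) & 1 for i in range(width)]
    bits_b = [(b >> i) & 1 for i in range(width)]

    # Pre-processing: propagate p_i and generate g_i for every bit.
    p = [x ^ y for x, y in zip(bits_a, bits_b)]
    g = [x & y for x, y in zip(bits_a, bits_b)]

    # Prefix-processing: each level merges a block with its lower neighbour.
    G, P = g[:], p[:]
    span = 1
    while span < width:
        for i in range(width):
            if i & span:                      # bit i sits in the upper half of its block
                k = (i & ~(span - 1)) - 1     # top bit of the lower half of that block
                G[i] = G[i] | (P[i] & G[k])   # prefix operator, carry part
                P[i] = P[i] & P[k]            # prefix operator, propagate part
        span *= 2

    # Post-processing: sum bit i is p_i XOR the carry out of bit i-1.
    total = 0
    for i in range(width):
        carry_in = G[i - 1] if i > 0 else 0
        total |= (p[i] ^ carry_in) << i
    return total | (G[width - 1] << width)    # append the carry-out bit


if __name__ == "__main__":
    print(hex(sklansky_add(0x1234, 0x0FF0)))
```

The number of prefix levels grows with the logarithm of the bit width, which is where the delay
advantage over a ripple-carry chain comes from.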

1.2. Approximate Technique
The approximate techniques used in the design of Approximate PPA fall into three main
categories:
1) Elimination Technique: gates are removed from the existing operator, and the inputs of the
operator are connected directly to its outputs using wires.
2) Constant Technique: gates are removed from the existing operator, and a constant value,
typically cte-0 (1'b0) or cte-1 (1'b1), is connected to the output.
3) Simplification Technique: the gates in the existing operator are replaced with other, more
area-efficient gates (and, nand, or, nor, xor, xnor).
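
As a rough illustration of the three techniques, the Python sketch below applies each one to a
single Prefix Operator and counts how often the resulting G output differs from the exact one over
all 16 input combinations. The specific variants are assumptions chosen for illustration
(pass-through wiring for elimination, cte-0 for the constant technique, a single OR gate for
simplification) and are not necessarily the ones evaluated in this work.

```python
from itertools import product


def exact_po(g_i, p_i, g_k, p_k):
    """Exact prefix operator: returns (G, P)."""
    return g_i | (p_i & g_k), p_i & p_k


def elimination_po(g_i, p_i, g_k, p_k):
    """Elimination: gates removed, inputs wired straight to the outputs."""
    return g_i, p_i


def constant_po(g_i, p_i, g_k, p_k):
    """Constant: gates removed, both outputs tied to cte-0."""
    return 0, 0


def simplification_po(g_i, p_i, g_k, p_k):
    """Simplification (hypothetical variant): the AND-OR for G becomes one OR gate."""
    return g_i | g_k, p_i & p_k


def g_mismatches(po):
    """Count input combinations where the G output differs from the exact PO."""
    return sum(po(*bits)[0] != exact_po(*bits)[0]
               for bits in product((0, 1), repeat=4))


if __name__ == "__main__":
    for po in (elimination_po, constant_po, simplification_po):
        print(po.__name__, g_mismatches(po), "of 16")
```

A per-operator count like this is only a local view. The error seen at the adder output also
depends on how the operators are chained.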

1.3. AxPPA Design
Approximate PPA refers to a modified PPA structure in which approximate techniques are applied
to improve on the area and power inefficiency inherent in the traditional PPA. One prominent
structure is the previously proposed AxPPA, which introduces modifications to address these
inefficiencies. Figure 1 shows the structure of AxPPA. Its key proposal is to divide the entire
PPA into two parts and apply the elimination technique to the POs of the lower part to improve
efficiency.

CHAPTER 2
LITERATURE REVIEW

2.1. Introduction to Approximate Computing in Adders

Approximate computing has gained prominence as a strategy to enhance energy efficiency in
digital designs. Parallel prefix adders (PPAs) are a key focus due to their widespread use in
arithmetic units.

● H. Jiang et al. introduced approximate PPAs (AxPPA) that optimize delay and energy
consumption for multimedia and machine learning workloads.
● Z. Liu et al. reviewed approximate adders and their trade-offs in speed, power, and
accuracy, offering insights into their application in resource-constrained environments.

2.2. Architectural Innovations in Approximate PPAs

Innovative architectural techniques have been proposed to improve the efficiency of PPAs.

● R. V. Shenoy et al. designed hybrid approximate PPAs for low-power applications by
combining carry-skip logic with approximate logic gates, achieving 25% area savings.
● S. Srivastava and A. Tiwari explored parallel prefix structures optimized for high-speed
and area-constrained environments.

2.3. FPGA Implementation of Approximate PPAs

Field-programmable gate array (FPGA) implementations are essential to evaluate real-world
performance.

● K. Nagarajan and D. Malhotra demonstrated FPGA designs for PPAs, highlighting
significant power reductions in digital signal processing (DSP) tasks.
● P. Chen et al. focused on FPGA-based hybrid approximate PPAs, achieving up to 40%
energy savings in image processing systems.

2.4. Applications in Internet of Things (IoT)

Approximate PPAs have been tailored for IoT devices that demand low power and acceptable
error rates.

● J. Kim et al. developed lightweight approximate PPAs optimized for energy-constrained
IoT applications, demonstrating their potential in wearable devices.
● L. Zhang et al. proposed designs that balance accuracy and energy consumption for IoT
edge computing platforms.

2.5. Accuracy and Error Metrics

Error-tolerant designs are crucial for approximate adders to maintain functional correctness.

● M. Zhao et al. evaluated accuracy metrics in approximate PPAs, proposing error-resilient
designs for multimedia applications.
● V. K. Singh et al. analyzed the trade-off between precision and power efficiency in
error-tolerant PPA architectures.

2.6. Power and Energy Efficiency in Approximate PPAs

Power efficiency is a key driver for approximate adder designs.

● A. Gupta and S. Bansal achieved 30% energy savings by simplifying logic paths in PPAs
for low-power devices.
● K. Verma et al. examined power-aware PPAs, focusing on voltage scaling and logic
simplification to minimize energy consumption.

2.7. Applications in Neural Networks and AI

Approximate PPAs have found applications in AI accelerators, where exact precision is not
always required.

● N. Patel et al. implemented PPAs in neural network hardware accelerators, showing up to
20% speed improvements.
● T. Kumar et al. optimized adder designs for inference tasks, highlighting their utility in
edge AI systems.

2.8. Security and Robustness in Approximate PPAs

Security concerns in approximate computing include fault tolerance and resistance to hardware
attacks.

● H. Li et al. analyzed the impact of faults in PPAs, proposing secure and fault-resilient
designs for mission-critical applications.
● S. Kumar and P. Roy addressed vulnerabilities in approximate computing and introduced
robust designs to mitigate security risks.

2.9. Hybrid Designs and Emerging Trends

Hybrid approximate PPAs combine the strengths of multiple architectures.

● J. Wang et al. integrated carry-lookahead and ripple-carry designs into a hybrid PPA to
balance speed and power.
● F. Gao et al. explored trends in approximate computing, including
machine-learning-guided PPA optimization.

CHAPTER 3

EXISTING WORK

3.1. Limits of AxPPA

AxPPA has several limitations that need to be considered. First, the approximate technique is
applied only to POs, and the effect on performance of applying it to the other operators is not
considered. Second, a result is produced by passing through multiple operators from input to
output, so selecting and applying an approximate technique based on the error rate of a single
PO is insufficient. Third, because only a single technique is provided, there is no comparative
analysis of applying different techniques. In this paper, we address these limitations and
design our proposed Approximate PPA.

3.2. Optimal Approximate Technique Combination

Figure 2. EAxPPA structure

Table 1. Circuit Area by design.

Table 2. Top Z-Score by Combination

Figure 3. Z-Score by Optimal Combination Bit.

Figure 4. Z-Score by Area-Efficient Combination Bit

We find the optimal approximate technique combination applicable to PPA and apply it to our
design. At each stage, a different approximate technique is applied, and a comprehensive Z-Score
analysis is conducted using Mean Absolute Error (MAE), Mean Relative Error Distance (MRED), and
circuit area as the three metrics. The circuit area is calculated as the total area of each
circuit in Table 1, synthesized with Synopsys Design Compiler using the CMOS 28nm cell library
and a 1.1V operating voltage. MAE, MRED, and Z-Score are defined as in (6), (7), and (8).
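
For reference, the usual definitions of these three quantities are given below, with 𝑥, 𝑦, 𝑛,
𝜇, and 𝜎 used as in the surrounding text. They are assumed to correspond to (6), (7), and (8).

```latex
\begin{align}
\mathrm{MAE} &= \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - y_i \rvert \tag{6}\\
\mathrm{MRED} &= \frac{1}{n}\sum_{i=1}^{n} \frac{\lvert x_i - y_i \rvert}{x_i} \tag{7}\\
Z &= \frac{x - \mu}{\sigma} \tag{8}
\end{align}
```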

For a sample size of 𝑛, 𝑥 represents the actual value and 𝑦 represents the approximate value.
MAE and MRED are used as accuracy evaluation metrics. The Z-Score standardizes the original
value 𝑥 of each metric, where 𝜇 is the mean and 𝜎 is the standard deviation of that metric.
The overall metric is the sum of the Z-Scores of all three metrics, and the combination with the
lowest overall Z-Score is defined as optimal. The experiment is conducted 106 times for each
combination, and the metrics are averaged.
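
The scoring procedure can be sketched in Python as follows. This is a minimal sketch under
several assumptions: the function names are hypothetical, the population standard deviation is
used for 𝜎, exact values are assumed to be nonzero for MRED, and the example numbers are made up
purely to show the mechanics.

```python
from statistics import mean, pstdev


def mae(exact, approx):
    """Mean Absolute Error between exact values x and approximate values y."""
    return mean([abs(x - y) for x, y in zip(exact, approx)])


def mred(exact, approx):
    """Mean Relative Error Distance; assumes every exact value x is nonzero."""
    return mean([abs(x - y) / x for x, y in zip(exact, approx)])


def z_scores(values):
    """Standardize one metric across all candidates."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def overall_scores(metrics):
    """metrics maps a metric name to one value per candidate; lower total is better."""
    per_metric = [z_scores(vals) for vals in metrics.values()]
    return [sum(col) for col in zip(*per_metric)]


# Made-up numbers for three hypothetical candidates, for illustration only.
example = {
    "mae": [4.0, 2.0, 3.0],
    "mred": [0.04, 0.02, 0.03],
    "area": [90.0, 120.0, 100.0],
}
scores = overall_scores(example)
best = scores.index(min(scores))

if __name__ == "__main__":
    print(scores, "best candidate index:", best)
```

Because each metric is standardized before summing, area and the two error metrics contribute on
a comparable scale even though their raw units differ.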

Table 3. AxPPA and EAxPPA Comparison

Table 2 presents the experimental results, ranked by overall Z-Score, that were used to determine
the optimal approximate technique for the design. In these experiments, the combination [nor,
cte-0, cte-0, cte-0, nor] exhibited the best performance.

3.3. Optimal Combination by Bit

The optimal number of bits to which the most efficient approximate technique is applied is
determined through quantitative experiments. The optimal approximate technique combination
derived in Section 3.2 is applied to the lower bits, and performance is compared as the number of
approximated bits varies in order to derive the optimal ratio. The experimental results are shown
in Figure 3. The design performs best when 10 bits are approximated. Subsequently, based on the
derived ratio, the approximated part is further divided, giving three parts overall, and the most
area-efficient combination is applied to its lower bits. The experimental results are shown in
Figure 4. The results show optimal performance when this combination is applied to 8 bits.
Figure 2 shows the EAxPPA structure proposed in this paper and verified through experiments.
Different approximate techniques are applied for each stage and bit, and these choices are the
result of quantitative analysis through experiments. EAxPPA is based on the 16-bit Sklansky PPA.
The upper 6 bits are the same as in the exact PPA, and approximate techniques are applied to the
lower 10 bits. Of these 10 bits, the upper 2 bits use the optimal approximate technique
combination and the lower 8 bits use the area-efficient approximate technique combination.

3.4. Experimental Results based on ASIC

We conduct ASIC-based performance evaluations of the proposed structure. Synthesis is performed
with Synopsys Design Compiler using a CMOS 28nm standard cell library and a 1.1V operating
voltage. We compare the results in terms of circuit area saving, power saving, MAE, and MRED.
For comparison, we implement PPA, AxPPA, and EAxPPA in Verilog. In the approximate designs, the
number of approximated bits is configured to match the optimal configuration of the proposed
structure. Table 3 presents the experimental results. AxPPA achieves a 33.03% circuit area saving
and a 37.16% power saving compared to PPA, whereas EAxPPA achieves a 68.35% circuit area saving
and a 64.02% power saving. Compared directly with AxPPA, EAxPPA shows a 52.74% circuit area
saving and a 42.74% power saving. It also achieves a 53.15% reduction in MAE and a 53.6%
reduction in MRED.
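
The EAxPPA-versus-AxPPA figures follow from the two savings measured against the exact PPA. The
short Python check below shows the arithmetic. The helper name `relative_saving` is hypothetical,
and the only inputs are the percentages quoted above.

```python
def relative_saving(saving_new: float, saving_ref: float) -> float:
    """Saving of a new design relative to a reference design.

    Both arguments are fractions saved against the same baseline, so the
    remaining cost of each design is (1 - saving).
    """
    return 1 - (1 - saving_new) / (1 - saving_ref)


area = relative_saving(0.6835, 0.3303)    # EAxPPA vs AxPPA, circuit area
power = relative_saving(0.6402, 0.3716)   # EAxPPA vs AxPPA, power

if __name__ == "__main__":
    print(f"area saving vs AxPPA:  {area:.2%}")
    print(f"power saving vs AxPPA: {power:.2%}")
```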

CHAPTER 4
PROPOSED METHOD

The full adder (FA) is the fundamental block of adder design. Relaxing numerical exactness gives
the freedom to study imprecise or approximate computation, which offers a route to low-power and
high-speed designs. The existing work introduces error to substantially reduce power consumption
with little loss in output quality. The Error Tolerant Adder (ETA) achieves large improvements
in power consumption and speed by restricting accuracy. ETA II does not eliminate the entire
carry propagation path; instead, it divides the path into a number of shorter paths and completes
the carry propagation in these shorter paths simultaneously. As a result, the performance of the
adder is significantly improved in terms of power consumption and speed.

The building blocks for implementing bio-inspired systems do not require fully precise digital
logic circuits. This permits inaccurate computation, which reduces logic complexity and cost.
Soft additions are generally based on deterministic approximate logic or on probabilistic
imprecise arithmetic. The bio-inspired LOA has been designed based on approximate logic. The LOA
is the slowest, but it has low power dissipation. Dynamic segmentation divides the adder into
adders of smaller bit width by bit-slicing the data path. However, the segmentation approach
incurs a significant error in meta-function computation because the error accumulates over
multiple cycles. To overcome this issue, an improved dynamic segmentation with multi-cycle error
compensation technique (DSEC) improves the accuracy under a wide range of over-scaled voltages.

Approximate computing techniques in error-tolerant applications, such as image and video
processing, provide a considerable improvement in speed and power with a trade-off in quality.
The accuracy requirements of applications differ from each other. Even a single application needs
different computations with different accuracy requirements, which vary over time and with user
requirements. Therefore, accuracy-configurable arithmetic circuits are important. With an
Accuracy-Configurable Approximate adder (ACAA), the accuracy is configured at runtime by changing
the circuit structure, with a trade-off among accuracy, performance, and power. The Almost
Correct Adder (ACA) is the most power-consuming approach, with moderate accuracy [9]. Logic
complexity reduction for adders at the bit level provides better power savings than conventional
low-power design techniques. Logic complexity reduction of a conventional MA cell has been
achieved by reducing the number of transistors. Based on logic complexity reduction, five
simplified versions of the MA have been proposed. The existing AAs (AA1, AA2, AA3, AA4, AA5) and
the proposed AAs (AA6, AA7, AA8, AA9, AA10, AA11, AA12) are discussed in the following
subsections.

5.1. Existing AAs (AA1-AA5)

The existing AAs follow two approaches. The first approximates Sum alone and leaves C_out
unchanged. The second approximates both Sum and C_out. The AA1, AA3, AA4, and AA5 designs
approximate both Sum and C_out, whereas the AA2 design approximates Sum alone [10]. Table 1
provides the Sum and C_out for the existing AAs.

5.1.1. Approximate adder 1 (AA1)
AA1 approximates both Sum and C_out. Sum is precise for 6 out of 8 cases and C_out is precise for
7 out of 8 cases. The exceptions are A = 0, B = 1, C_in = 0 for both Sum and C_out, and A = 1,
B = 0, C_in = 0 for Sum. The logic equations for Sum and C_out of AA1 are given in Eqs. (1) and
(2).
$\mathrm{Sum} = A B C_{in} + C_{out}' C_{in}$ (1)
$C_{out} = A C_{in} + B$ (2)

5.1.2. Approximate adder 2 (AA2)
AA2 approximates Sum alone. Sum is precise for 6 out of 8 cases and C_out is precise for all
cases. The advantage of AA2 is that its only two errors occur in Sum, while C_out is correct in
all cases. Thus, compared to AA1, AA2 has a lower probability of error. The logic equations are
given in Eqs. (3) and (4).
$\mathrm{Sum} = C_{out}'$ (3)
$C_{out} = A B + B C_{in} + A C_{in}$ (4)

5.1.3. Approximate adder 3 (AA3)
AA3 approximates both Sum and C_out. Sum is precise for 5 out of 8 cases and C_out is precise for
7 out of 8 cases. AA3 uses a buffer to compute Sum with C_out' as its input. The logic equations
are given in Eqs. (5) and (6).
$\mathrm{Sum} = C_{out}'$ (5)
$C_{out} = A C_{in} + B$ (6)

5.1.4. Approximate adder 4 (AA4)
AA4 approximates both Sum and C_out. At the logic level, Sum is precise for 5 out of 8 cases and
C_out is precise for 6 out of 8 cases. The logic equations are given in Eqs. (7) and (8).
$\mathrm{Sum} = A B C_{in} + C_{out}' C_{in}$ (7)
$C_{out} = A$ (8)

5.1.5. Approximate adder 5 (AA5)
AA5 approximates both Sum and C_out. Sum is precise for 4 out of 8 cases and C_out is precise for
6 out of 8 cases. Sum and C_out are buffered copies of B and A, respectively. With these
equations, C_out is wrong for A = 0, B = 1, C_in = 1 and for A = 1, B = 0, C_in = 0, and Sum is
wrong whenever A and C_in differ. The logic equations of AA5 are given in Eqs. (9) and (10).
$\mathrm{Sum} = B$ (9)
$C_{out} = A$ (10)

The existing AAs (AA1-AA5) were designed using approximate computing at the transistor level and
follow two approaches. The AA1, AA3, AA4, and AA5 designs approximate both Sum and C_out, whereas
the AA2 design approximates Sum alone and leaves C_out unchanged. Among the existing AAs, AA5 has
the highest performance, at the cost of accuracy, because it applies the most logic approximation
to Sum and C_out. However, more accurate approximations are possible. This is an opportunity to
develop AAs based on approximate computing at the logic level of the FA. Therefore, various AAs
with different error counts are proposed. The proposed designs have fewer errors and higher
performance than the existing AAs, except for AA5, which has the maximum error.
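
The "precise for k out of 8 cases" figures can be checked by enumerating the full-adder truth
table. The Python sketch below does this for AA1 to AA5 using the logic equations listed in this
section. The function names are hypothetical, and the prime in the equations is read as logical
complement.

```python
from itertools import product


def exact_fa(a, b, cin):
    """Exact full adder: returns (Sum, Cout)."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)


def aa1(a, b, cin):
    cout = (a & cin) | b
    return (a & b & cin) | ((1 - cout) & cin), cout


def aa2(a, b, cin):
    cout = (a & b) | (b & cin) | (a & cin)
    return 1 - cout, cout


def aa3(a, b, cin):
    cout = (a & cin) | b
    return 1 - cout, cout


def aa4(a, b, cin):
    cout = a
    return (a & b & cin) | ((1 - cout) & cin), cout


def aa5(a, b, cin):
    return b, a


def correct_cases(cell):
    """Return (cases with correct Sum, cases with correct Cout) out of 8."""
    sum_ok = cout_ok = 0
    for a, b, cin in product((0, 1), repeat=3):
        s, c = cell(a, b, cin)
        exact_s, exact_c = exact_fa(a, b, cin)
        sum_ok += int(s == exact_s)
        cout_ok += int(c == exact_c)
    return sum_ok, cout_ok


if __name__ == "__main__":
    for cell in (aa1, aa2, aa3, aa4, aa5):
        print(cell.__name__, correct_cases(cell))
```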

5.2. Proposed AAs

In this section, seven AAs (AA6-AA12) are proposed. During the operation of an adder, the
accuracy of C_out is more significant than that of Sum. Two approaches are considered in
designing the AAs. The first approximates Sum alone and leaves C_out unchanged. The second
approximates both Sum and C_out. The proposed AAs (AA6-AA12) cover both approaches, although the
first approach is the more significant one for approximation. The AA7, AA8, AA9, AA10, AA11, and
AA12 designs approximate Sum alone. The AA6 design approximates both Sum and C_out. The truth
table for all input combinations of the AAs is given in Table 2.

5.2.1. Proposed approximate adder 6 (AA6)
In AA6, both Sum and C_out are approximated. From Table 2, Sum is correct for 5 out of 8 cases
and C_out is correct for 6 out of 8 cases. The logic equations of AA6 are given in Eqs. (11) and
(12).
$\mathrm{Sum} = A' + B C_{in}$ (11)
$C_{out} = A$ (12)

5.2.2. Proposed approximate adder 7 (AA7)
The FA truth table in Table 2 shows that, for AA7, Sum is correct for 6 out of 8 cases and C_out
is correct for all 8 cases. Thus, compared to AA6, AA7 has a lower probability of error. The
logic equations of AA7 are given in Eqs. (13) and (14).
$\mathrm{Sum} = A' (B + C_{in}) + B C_{in}$ (13)
$C_{out} = A B + B C_{in} + A C_{in}$ (14)

5.2.3. Proposed approximate adder 8 (AA8)
The AA8 design introduces three errors in Sum and leaves C_out unchanged. From Table 2, Sum is
correct for 5 out of 8 cases. The logic equations of AA8 are given in Eqs. (15) and (16).
$\mathrm{Sum} = A' + B C_{in}$ (15)
$C_{out} = A B + B C_{in} + A C_{in}$ (16)

5.2.4. Proposed approximate adder 9 (AA9)
As a further refinement of AA8, AA9 introduces only one error in Sum and leaves C_out unchanged.
From Table 2, Sum is correct for 7 out of 8 cases. The logic equations of AA9 are given in Eqs.
(17) and (18).
$\mathrm{Sum} = A' B' + B' C_{in}' + A B C_{in} + A' B C_{in}'$ (17)
$C_{out} = A B + B C_{in} + A C_{in}$ (18)

5.2.5. Proposed approximate adder 10 (AA10)
The AA10 design introduces three errors in Sum and leaves C_out unchanged. The FA truth table in
Table 2 shows that, for AA10, Sum is correct for 5 out of 8 cases and C_out is correct for all 8
cases. The logic equations of AA10 are given in Eqs. (19) and (20).
$\mathrm{Sum} = A' + B C_{in}$ (19)
$C_{out} = A B + B C_{in} + A C_{in}$ (20)

5.2.6. Proposed approximate adder 11 (AA11)
The FA truth table in Table 2 shows that, for AA11, Sum is correct for 6 out of 8 cases and C_out
is correct for all 8 cases. This design therefore introduces two errors in Sum, fewer than AA10.
The logic equations of AA11 are given in Eqs. (21) and (22).
$\mathrm{Sum} = A' B' + C_{in}' + A' B C_{in}'$ (21)
$C_{out} = A B + B C_{in} + A C_{in}$ (22)

5.2.7. Proposed approximate adder 12 (AA12)
AA12 reduces the error of AA11 further by introducing only one error in Sum. From Table 2, the FA
truth table shows that Sum is correct for 7 out of 8 cases and C_out is correct for all 8 cases.
The logic equations of AA12 are given in Eqs. (23) and (24).
$\mathrm{Sum} = A B + B C_{in}' + A' B' C_{in} + A B' C_{in}'$ (23)
$C_{out} = A B + B C_{in} + A C_{in}$ (24)
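
To see how a single approximate cell behaves inside a multi-bit adder, the cell can be chained in
ripple-carry fashion. The Python sketch below does this for AA6 (approximate C_out) and AA7
(exact C_out) using Eqs. (11) to (14). The ripple-carry arrangement, the 4-bit default width, and
all function names are assumptions made for illustration and do not describe the evaluation setup
of this work.

```python
def aa6(a, b, cin):
    """AA6 cell: Sum = A' + B*Cin, Cout = A."""
    return (1 - a) | (b & cin), a


def aa7(a, b, cin):
    """AA7 cell: Sum = A'(B + Cin) + B*Cin, Cout is the exact majority."""
    s = ((1 - a) & (b | cin)) | (b & cin)
    return s, (a & b) | (b & cin) | (a & cin)


def ripple_add(cell, x, y, width=4):
    """Chain `width` copies of `cell`, feeding each Cout into the next Cin."""
    carry, total = 0, 0
    for i in range(width):
        s, carry = cell((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total | (carry << width)


def mean_abs_error(cell, width=4):
    """Average |approximate - exact| over all operand pairs of the given width."""
    n = 1 << width
    errors = [abs(ripple_add(cell, x, y, width) - (x + y))
              for x in range(n) for y in range(n)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    for cell in (aa6, aa7):
        print(cell.__name__, mean_abs_error(cell))
```

For example, with both operands equal to zero the AA7 chain returns 0, while the AA6 chain
returns 15 because every Sum bit evaluates to A' = 1.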
The proposed AAs AA7-AA12 were developed to reduce errors in Sum with no errors in C_out,
whereas AA6 has errors in both Sum and C_out. As given in Table 2, AA6 has five errors, AA8 and
AA10 have three errors, AA7 and AA11 have two errors, and AA9 and AA12 have one error. Among the
proposed designs, AA6 has the highest probability of error and AA9 and AA12 have the lowest. In
the proposed AAs, the errors are introduced for different input combinations. Approximating C_out
introduces errors in the carry, and a carry error propagates through subsequent stages, which
further increases the error. The existing AAs have lower performance than the proposed AA6 design
in terms of area, delay, and power, except for AA5, which has more errors in Sum and C_out. From
the performance point of view discussed in Section 4.1 and from Table 2, approximating both Sum
and C_out gives better results. Not all the proposed designs claim low power and high speed.
Only the proposed AA6 claims

CHAPTER 5
RESULTS AND DISCUSSION

5.1. Existing Approximate Technique

How It Works

1. Division of Inputs:
   o The 15-bit inputs A and B are split into two parts:
      ▪ Accurate Part: the most significant 8 bits (14:7).
      ▪ Approximate Part: the least significant 7 bits (6:0).

2. Approximate Addition:
   o The approximate part is computed using a simple bitwise OR operation, reducing hardware
     complexity compared to full adders.

3. Accurate Addition:
   o A standard binary addition is performed on the accurate part.

4. Final Sum:
   o The 9-bit result of the accurate part (8 sum bits plus the carry) and the 7-bit result of
     the approximate part are concatenated to form the 16-bit result.

Example Simulation

For the following inputs:

   ▪ A = 15'b101010101010101 (binary)
   ▪ B = 15'b010101010101010 (binary)

The operation will produce:

   ▪ Approximate Part (SUM_approx):
      o OR operation on LSBs: A[6:0] | B[6:0] = 7'b1111111.
   ▪ Accurate Part (SUM_acc):
      o Full addition on MSBs: A[14:7] + B[14:7] = 9'b011111111.

Result:

   ▪ SUM = {SUM_acc, SUM_approx} = 16'b0111111111111111.
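
The steps above can be modelled in a few lines of Python. This is a behavioural sketch of the
described split (OR on the lower 7 bits, exact addition on the upper 8 bits, no carry passed
between the parts), not the HDL used for the simulation. The function name `lower_or_add` and its
parameters are hypothetical.

```python
def lower_or_add(a: int, b: int, width: int = 15, approx_bits: int = 7) -> int:
    """Approximate sum: bitwise OR on the low part, exact add on the high part."""
    mask = (1 << width) - 1
    low_mask = (1 << approx_bits) - 1
    a &= mask
    b &= mask

    sum_approx = (a | b) & low_mask                      # approximate part: OR of LSBs
    sum_acc = (a >> approx_bits) + (b >> approx_bits)    # accurate part: exact add of MSBs
    return (sum_acc << approx_bits) | sum_approx         # {SUM_acc, SUM_approx}


if __name__ == "__main__":
    A = 0b101010101010101
    B = 0b010101010101010
    print(bin(lower_or_add(A, B)), "exact:", bin(A + B))
```

For the example operands the lower parts never have a 1 in the same position, so the OR equals
the true sum and the result is exact. Operands such as A = 1 and B = 1 show the error, since the
model returns 1 instead of 2.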

5.2. Proposed Technique

Key Features and Operation:

1. Inputs and Outputs:
   o A and B: 16-bit inputs.
   o Sum: 16-bit output for the addition result.
   o CarryOut: Carry-out signal for overflow.

2. Generate (G) and Propagate (P) Signals:
   o G = A & B: This represents the generation of carry.
   o P = A ^ B: This represents the propagation of carry.

3. Carry Generation Stages:
   o Stage 1 (Exact Calculation for Lower Bits):
      ▪ Handles the lower 4 bits (C[0] to C[3]) using exact logic for accuracy. These bits are
        critical for reducing the overall error of the approximation.
   o Stage 2 (Approximate Mid-Range Carry):
      ▪ The next 4 bits (C[4] to C[7]) are calculated using approximations. For instance, C[4]
        reuses C[2] instead of C[3], which simplifies the logic.
   o Stage 3 (Approximate Higher Bits):
      ▪ Bits C[8] to C[11] are further approximated by using earlier carry signals like C[4],
        C[5], etc., instead of precise dependencies.
   o Stage 4 (Final Approximation for Upper Bits):
      ▪ The highest bits (C[12] to C[15]) are calculated with coarse approximations, reducing
        the complexity of the logic for higher carry signals.

4. Sum Calculation:
   o Sum[i] = P[i] ^ C[i-1]: Combines the propagate signal and the previous carry to compute
     the sum for each bit.
   o For the least significant bit, Sum[0] = P[0].

5. CarryOut:
   o The final carry-out signal is C[15], which is derived from the approximate carry
     propagation.
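
The staged carry scheme can be sketched as a behavioural Python model. Only the G and P
definitions, the exact first stage, the C[4]-reuses-C[2] example, and the Sum and CarryOut
relations come from the description above. The remaining carry dependencies (C[i-2] throughout
Stage 2, C[i-4] in Stage 3, generate-only carries in Stage 4) and the function name
`approx_carry_add` are assumptions chosen to illustrate the idea, not the actual design.

```python
def approx_carry_add(a: int, b: int):
    """Return (Sum, CarryOut) for 16-bit operands using staged approximate carries."""
    width = 16
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]   # G = A & B
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]   # P = A ^ B

    c = [0] * width
    for i in range(width):
        if i == 0:
            c[i] = g[i]                        # no carry-in is assumed
        elif i < 4:
            c[i] = g[i] | (p[i] & c[i - 1])    # Stage 1: exact carry chain
        elif i < 8:
            c[i] = g[i] | (p[i] & c[i - 2])    # Stage 2: ASSUMED, skips one carry (C[4] uses C[2])
        elif i < 12:
            c[i] = g[i] | (p[i] & c[i - 4])    # Stage 3: ASSUMED, reuses C[i-4] (C[8] uses C[4])
        else:
            c[i] = g[i]                        # Stage 4: ASSUMED, generate-only carry

    total = p[0]                               # Sum[0] = P[0]
    for i in range(1, width):
        total |= (p[i] ^ c[i - 1]) << i        # Sum[i] = P[i] ^ C[i-1]
    return total, c[width - 1]                 # CarryOut = C[15]


if __name__ == "__main__":
    print(approx_carry_add(0x1234, 0x0101))    # operands with no carries at all
    print(approx_carry_add(0x0008, 0x0018))    # a carry that must ripple past bit 4
```

In this model, 0x0008 + 0x0018 returns 0 instead of 0x20: the carry generated at bit 3 is consumed
correctly by Sum[4], but C[4] looks at C[2] and therefore misses it, so Sum[5] is wrong.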

Design Benefits:

1. Reduced Complexity:
   o By introducing approximations in the mid and higher stages, the logic becomes simpler,
     reducing the delay and hardware cost.

2. Lower Power Consumption:
   o Approximate logic reduces switching activity and the number of gates, which decreases
     power usage.

3. Error-Tolerance:
   o The design focuses on minimizing error in the lower bits (exact stages), where
     inaccuracies have a larger impact on the final result.

Advantages:

1. Low Power Consumption:
   o The approximate carry logic reduces the switching activity and the number of logic gates,
     leading to significant power savings compared to conventional adders.

2. High-Speed Computation:
   o The use of approximations in carry generation for mid-range and higher-order bits reduces
     the critical path delay, making the adder suitable for high-speed applications.

3. Error-Tolerance:
   o By maintaining exact computation in the lower bits and controlled approximation in the
     higher bits, the design minimizes the impact of errors on the final result, making it
     suitable for applications where exact precision is not critical.

4. Hardware Efficiency:
   o The simplified carry logic in the approximate stages reduces the hardware complexity,
     resulting in a smaller area footprint and fewer resources.

Limitations:

1. Reduced Accuracy:
   o The approximations in the carry generation logic introduce errors, which might not be
     acceptable in systems requiring high precision.

2. Application Dependency:
   o This adder is more suitable for error-tolerant applications like multimedia processing,
     machine learning, or approximate computing. It may not be suitable for applications
     requiring strict accuracy, such as cryptography or financial computations.

Applications:

1. Digital Signal Processing (DSP):
   o Low-power and high-speed addition make it well suited to DSP tasks such as filtering or
     image processing.

2. Machine Learning:
   o Approximate arithmetic can accelerate training and inference, especially for large-scale
     neural networks where minor errors do not significantly affect the results.

3. IoT Devices:
   o Its energy-efficient design aligns with the needs of battery-powered IoT devices.

4. Multimedia Systems:
   o Image and video processing applications can tolerate minor errors without significant
     degradation in quality, making this adder a good fit.

No ratings yet
STA of of Approximate Parallel Prefix Adders With Results
13 pages
Block-Based Carry Speculative Approximate Adder For Energy-Efficient Applications
No ratings yet
Block-Based Carry Speculative Approximate Adder For Energy-Efficient Applications
5 pages
VZXF
No ratings yet
VZXF
5 pages
A Low-Power, High-Performance Approximate Multiplier With Configurable Partial Error Recovery
No ratings yet
A Low-Power, High-Performance Approximate Multiplier With Configurable Partial Error Recovery
4 pages
A Family of Adders: Simon Knowles Element 14, Aztec Centre, Bristol, UK
No ratings yet
A Family of Adders: Simon Knowles Element 14, Aztec Centre, Bristol, UK
8 pages
An Enhanced Approximate Multiplier Using Error Report Propagation Full Adders 111
No ratings yet
An Enhanced Approximate Multiplier Using Error Report Propagation Full Adders 111
6 pages
Intro 2
No ratings yet
Intro 2
4 pages
Performance Analysis of Parallel Prefix Adder For Datapath Vlsi Design
No ratings yet
Performance Analysis of Parallel Prefix Adder For Datapath Vlsi Design
4 pages
10 1109@tcsii 2019 2901060
No ratings yet
10 1109@tcsii 2019 2901060
5 pages
IJPREMS41000036838
No ratings yet
IJPREMS41000036838
6 pages
Asynchronous Hybrid Kogge-Stone Structure Carry Select Adder Based IEEE-754 Double-Precision Floating-Point Adder
No ratings yet
Asynchronous Hybrid Kogge-Stone Structure Carry Select Adder Based IEEE-754 Double-Precision Floating-Point Adder
8 pages
Design and Analysis of 32-Bit Parallel Prefix Adde
No ratings yet
Design and Analysis of 32-Bit Parallel Prefix Adde
5 pages
Parallel Prefix Adder
No ratings yet
Parallel Prefix Adder
4 pages
Design and Estimation of Delay, Power and Area For Parallel Prefix Adders
No ratings yet
Design and Estimation of Delay, Power and Area For Parallel Prefix Adders
6 pages
MAC - Low Power and Area
No ratings yet
MAC - Low Power and Area
6 pages
64-Bit Prefix Adders: Power-Efficient Topologies and Design Solutions
No ratings yet
64-Bit Prefix Adders: Power-Efficient Topologies and Design Solutions
4 pages
Roba Multiplier A Rounding-Based Approximate Multiplier For High-Speed Energy-Efficient DSP
No ratings yet
Roba Multiplier A Rounding-Based Approximate Multiplier For High-Speed Energy-Efficient DSP
7 pages
FPGA-Based Multiplier With A New Approximate Full Adder For Error-Resilient Applications
No ratings yet
FPGA-Based Multiplier With A New Approximate Full Adder For Error-Resilient Applications
5 pages
64 Bit Parallel Prefix Adder PDF
No ratings yet
64 Bit Parallel Prefix Adder PDF
4 pages
Phases of Moon
No ratings yet
Phases of Moon
7 pages
SKF TrainingCalendar 2019-20 - India
No ratings yet
SKF TrainingCalendar 2019-20 - India
84 pages
Triaxial Test For Rocks
No ratings yet
Triaxial Test For Rocks
12 pages
Medical Devices Report 2020 Revmar2021
No ratings yet
Medical Devices Report 2020 Revmar2021
15 pages
3 Hinge Analysis of Masonry Arches PDF
No ratings yet
3 Hinge Analysis of Masonry Arches PDF
5 pages
Autobiography Rubric: Category 4 3 2 1
No ratings yet
Autobiography Rubric: Category 4 3 2 1
1 page
The Man Behind The Famous Bee (Jollibee)
No ratings yet
The Man Behind The Famous Bee (Jollibee)
2 pages
Uuuu U U U U: Registers (16-Bit)
No ratings yet
Uuuu U U U U: Registers (16-Bit)
3 pages
Physics Ia (Electricity)
No ratings yet
Physics Ia (Electricity)
5 pages
OU Diary-2020 Informatica PDF
No ratings yet
OU Diary-2020 Informatica PDF
75 pages
Disability Project Work
No ratings yet
Disability Project Work
16 pages
DR Fixit Polymer Mortar PX 75 1
No ratings yet
DR Fixit Polymer Mortar PX 75 1
3 pages
Minutes of Meeting Attendance: Present
No ratings yet
Minutes of Meeting Attendance: Present
3 pages
Sun and Eames in ST of Energy 1995
No ratings yet
Sun and Eames in ST of Energy 1995
16 pages
Adv PT1
No ratings yet
Adv PT1
23 pages
Modal Verbs
No ratings yet
Modal Verbs
8 pages
Abs 1
No ratings yet
Abs 1
2 pages
4bs1 02 Rms 20230824
No ratings yet
4bs1 02 Rms 20230824
25 pages
Interview To A Manager
No ratings yet
Interview To A Manager
5 pages
Politeness On Instagram Comment Section
No ratings yet
Politeness On Instagram Comment Section
12 pages
Exercise 37. Read and Find The Appropriate Translation For The Words Below in The Text
No ratings yet
Exercise 37. Read and Find The Appropriate Translation For The Words Below in The Text
3 pages
Harmony 895 Logitech Manuel Us
No ratings yet
Harmony 895 Logitech Manuel Us
17 pages
Skin and Temperature Control
No ratings yet
Skin and Temperature Control
3 pages
DB Ex3
No ratings yet
DB Ex3
4 pages
Ebook Golden Rules For Futures Traders
No ratings yet
Ebook Golden Rules For Futures Traders
15 pages
Saipem Modern Slavery Statement 22 FINAL
No ratings yet
Saipem Modern Slavery Statement 22 FINAL
20 pages
Stealthy
No ratings yet
Stealthy
3 pages
Pedestrain Abstarct
No ratings yet
Pedestrain Abstarct
2 pages
Accepted For Publication-Journal of Biomolecular Structure & Dynamics - Decision On Manuscript ID TBSD-2023-4739
No ratings yet
Accepted For Publication-Journal of Biomolecular Structure & Dynamics - Decision On Manuscript ID TBSD-2023-4739
3 pages
Feds Env Habitats
No ratings yet
Feds Env Habitats
2 pages
Organizational Behavior Assignment: Submitted By-Dachiraju Chandana Varma Section-D 141356
No ratings yet
Organizational Behavior Assignment: Submitted By-Dachiraju Chandana Varma Section-D 141356
4 pages

CHAPTER 1

1.1. Parallel Prefix Adder

Figure 1. AxPPA structure

2.1. Introduction to Approximate Computing in Adders

2.2. Architectural Innovations in Approximate PPAs

● R. V. Shenoy et al. designed hybrid approximate PPAs for low-power applications by

2.3. FPGA Implementation of Approximate PPAs

Field-programmable gate array (FPGA) implementations are essential to evaluate real-world

● K. Nagarajan and D. Malhotra demonstrated FPGA designs for PPAs, highlighting

2.4. Applications in Internet of Things (IoT)

● J. Kim et al. developed lightweight approximate PPAs optimized for energy-constrained

2.5. Accuracy and Error Metrics

● M. Zhao et al. evaluated accuracy metrics in approximate PPAs, proposing error-resilient

2.6. Power and Energy Efficiency in Approximate PPAs

Power efficiency is a key driver for approximate adder designs.

2.7. Applications in Neural Networks and AI

● N. Patel et al. implemented PPAs in neural network hardware accelerators, showing up to

2.8. Security and Robustness in Approximate PPAs

2.9. Hybrid Designs and Emerging Trends

Hybrid approximate PPAs combine the strengths of multiple architectures.

3.1. Limits of AxPPA

3.2. Optimal Approximate Technique Combination

Figure 2. EAxPPA structure

Table 1. Circuit Area by design.

Figure 3. Z-Score by Optimal Combination Bit.

Table 3. AxPPA and EAxPPA Comparison

3.3. Optimal Combination by Bit

3.4. Experimental Results based on ASIC

We conduct ASIC-based performance evaluations of the proposed structure. Using a CMOS

5.1. Existing AAs (AA1-AA5)

5.2. Proposed AAs

5.1. Existing Approximate Technique

o The 15-bit inputs A and B are split into two parts:

▪ Accurate Part: The most significant 8 bits (14:7).

▪ Approximate Part: The least significant 7 bits (6:0).

o The approximate part is computed using a simple bitwise OR operation, reducing

o A standard addition is performed on the accurate part using normal binary

For the following inputs:

The operation will produce:

▪ Approximate Part (SUM_approx):

o OR operation on LSBs: A[6:0] | B[6:0] = 7'b1111111.

▪ Accurate Part (SUM_acc):

o Full addition on MSBs: A[14:7] + B[14:7].

▪ SUM = {SUM_acc, SUM_approx}.
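
The split described above can be modelled behaviourally in a few lines. The following Python sketch is only an illustration under stated assumptions: the function name `approx_or_adder` is hypothetical, and keeping the carry out of the accurate part as an extra top bit is an assumption, since the report excerpt does not state the width of SUM_acc.

```python
def approx_or_adder(a: int, b: int) -> int:
    """Behavioural model of a 15-bit adder with an OR-based lower part.

    Assumption: the carry out of the accurate part is kept as an extra
    top bit; the source does not say whether it is kept or dropped.
    """
    a &= 0x7FFF  # restrict inputs to 15 bits
    b &= 0x7FFF

    # Approximate part: bitwise OR of the 7 least significant bits (6:0).
    sum_approx = (a | b) & 0x7F

    # Accurate part: ordinary binary addition of the upper 8 bits (14:7).
    # No carry from the approximate part is fed into this addition.
    sum_acc = (a >> 7) + (b >> 7)

    # SUM = {SUM_acc, SUM_approx}
    return (sum_acc << 7) | sum_approx


if __name__ == "__main__":
    a, b = 0x55, 0x2A           # low bits do not overlap: result is exact
    print(approx_or_adder(a, b), a + b)
    a, b = 0x7F, 0x7F           # low bits fully overlap: result is too small
    print(approx_or_adder(a, b), a + b)
```

The OR result matches the exact sum whenever the two lower parts have no set bit in common; errors appear only when both operands have a 1 in the same lower bit position, because the resulting carry is discarded.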

5.2. Proposed Technique

1. Inputs and Outputs:

o A and B: 16-bit inputs.

o Sum: 16-bit output for the addition result.

o CarryOut: Carry-out signal for overflow.

2. Generate (G) and Propagate (P) Signals:

o G = A & B: This represents the generation of carry.

o P = A ^ B: This represents the propagation of carry.

3. Carry Generation Stages:

o Stage 1 (Exact Calculation for Lower Bits):

o Stage 2 (Approximate Mid-Range Carry):

o Stage 3 (Approximate Higher Bits):

o Stage 4 (Final Approximation for Upper Bits):

▪ The highest bits (C[12] to C[15]) are calculated with coarse

o For the least significant bit, Sum[0] = P[0].
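
The generate and propagate definitions above can be checked with a small reference model. The Python sketch below is a hedged illustration, not the report's design: it evaluates the exact carry recurrence C[i+1] = G[i] | (P[i] & C[i]) serially for all 16 bits, and it does not reproduce the approximate carry stages, whose details are not given in this excerpt. The function name `gp_adder16` and the zero carry-in are assumptions.

```python
def gp_adder16(a: int, b: int) -> tuple[int, int]:
    """Exact 16-bit addition using generate/propagate signals.

    Returns (Sum, CarryOut). Serves as an exact baseline against which
    an approximate carry scheme could be compared.
    """
    a &= 0xFFFF
    b &= 0xFFFF

    g = a & b  # G = A & B: bit positions that generate a carry
    p = a ^ b  # P = A ^ B: bit positions that propagate a carry

    carry = 0  # assumed carry-in of 0, so Sum[0] = P[0]
    total = 0
    for i in range(16):
        p_i = (p >> i) & 1
        g_i = (g >> i) & 1
        total |= (p_i ^ carry) << i   # Sum[i] = P[i] ^ C[i]
        carry = g_i | (p_i & carry)   # C[i+1] = G[i] | (P[i] & C[i])

    return total, carry


if __name__ == "__main__":
    print(gp_adder16(1234, 4321))
    print(gp_adder16(0xFFFF, 1))
```

A parallel prefix adder computes the same carries with a tree of combined (G, P) pairs instead of this serial loop; the loop is used here only because it is the shortest way to state the recurrence.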

2. Lower Power Consumption:

1. Low Power Consumption:

o By maintaining exact computation in the lower bits and controlled approximation

o The simplified carry logic in approximate stages reduces the hardware

o This adder is more suitable for error-tolerant applications like multimedia

1. Digital Signal Processing (DSP):
