
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 07 Issue: 05 | May 2020 www.irjet.net p-ISSN: 2395-0072

Optimization of Area and Power in Feed Forward Cut Set Free MAC Unit
using EXOR Full Adder and 4:2 Compressor
V. Mohanapriya1, S. Purushothaman2, S. Tamilarasi3, P. Vinitha4
1PG Student, Dept. of ECE, PGP College of Engineering and Technology, Tamilnadu, India.
2Assistant Professor, Dept. of ECE, PGP College of Engineering and Technology, Tamilnadu, India.
3Assistant Professor, Dept. of ECE, PGP College of Engineering and Technology, Tamilnadu, India.
4Assistant Professor, Dept. of ECE, PGP College of Engineering and Technology, Tamilnadu, India.

---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract: MAC (multiply–accumulate) computation plays an important role in digital signal processing (DSP). The MAC is a common step that computes the product of two numbers and adds that product to an accumulator. Generally, a pipelined architecture is used to improve performance by reducing the length of the critical path. However, the pipelined architecture requires many additional flip-flops, which reduces the efficiency of the MAC and increases the power consumption. On the basis of the machine learning algorithm, this paper proposes a feed forward-cutset-free (FCF) pipelined MAC architecture that is specialized for a high-performance machine learning accelerator, and also proposes the new design concept of the MFCF-PA using a column addition stage with the 4:2 compressor. The proposed design method reduces the area and the power consumption by decreasing the number of flip-flops inserted for pipelining, compared with the existing pipelined architecture for MAC computation. Finally, the proposed feed forward-cutset-free pipelined architecture for the MAC is implemented in VHDL, synthesized in Xilinx, and compared in terms of area, power, and delay reports.

Keywords: Hardware accelerator, Machine Learning, Multiply–Accumulate (MAC) unit, Pipelining.

1. INTRODUCTION

In a machine learning accelerator, a large number of multiply–accumulate (MAC) units are included for parallel computations, and the timing-critical paths of the system are often found in these units. A multiplier typically consists of several computational parts, including partial product generation, column addition, and a final addition. An accumulator consists of a carry-propagation adder. Long critical paths through these stages degrade the performance of the overall system. To minimize this problem, various methods have been studied. The Wallace [5] and Dadda [6] multipliers are well-known examples for achieving fast column addition, and the carry-lookahead (CLA) adder is often used to reduce the critical path in the accumulator or the final addition stage of the multiplier. Meanwhile, a MAC operation is performed in the machine learning algorithm to compute a partial sum that is the accumulation of the input multiplied by the weight. In a MAC unit, the multiply and accumulate operations are usually merged to reduce the number of carry-propagation steps from two to one [10]. Such a structure, however, still has a long critical path delay that is approximately equal to the critical path delay of a multiplier.

It is well known that pipelining is one of the most popular approaches for increasing the operating clock frequency. Although pipelining is an efficient way to reduce critical path delays, it increases the area and the power consumption due to the insertion of many flip-flops. In particular, the number of flip-flops tends to be large because the flip-flops must be inserted along the feed forward-cutset to ensure functional equality before and after the pipelining. The problem worsens as the number of pipeline stages is increased.

The main idea of this paper is the ability to relax the feed forward-cutset rule in the MAC design for machine learning applications, because only the final value is used out of the large number of multiply–accumulations. In other words, unlike in the conventional MAC unit, intermediate accumulation values are not used here, and hence they do not need to be correct as long as the final value is correct. Under such a condition, the final value is correct if each binary input of the adders inside the MAC participates in the calculation once and only once, irrespective of the cycle. Therefore, it is not necessary to set an accurate pipeline boundary. Based on this idea, this paper proposes a feed forward-cutset-free (FCF) pipelined MAC architecture for a high-performance machine learning accelerator.
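As a behavioral illustration of the MAC step described above (a Python sketch, not the paper's VHDL), the partial sum in a machine learning workload is simply the running accumulation of input-times-weight products:

```python
def mac(inputs, weights):
    """Behavioral model of the MAC: accumulate input x weight products."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc += x * w  # one multiply-accumulate per cycle
    return acc

print(mac([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

In hardware, each `acc += x * w` step is the merged multiply-and-accumulate whose carry-propagation delay the proposed pipelining targets.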
© 2020, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 943

2. EXISTING SYSTEM

Recently, the deep neural network (DNN) has emerged as a powerful tool for various applications, including image classification and speech recognition. Since an enormous amount of vector-matrix multiplication is required in a typical DNN application, a variety of dedicated hardware accelerators for machine learning have been proposed to accelerate the computations. In a machine learning accelerator, a large number of multiply–accumulate (MAC) units are included for parallel computations, and the timing-critical paths of the system are often found in these units.

The main idea of this paper is the ability to relax the feed forward-cutset rule in the MAC design for machine learning applications, because only the final value is used out of the large number of multiply–accumulations. In other words, unlike in the conventional MAC unit, intermediate accumulation values are not used here, and hence they do not need to be correct as long as the final value is correct. Under such a condition, the final value is correct if each binary input of the adders inside the MAC participates in the calculation once and only once, irrespective of the cycle. Therefore, it is not necessary to set an accurate pipeline boundary. Based on this idea, this paper proposes a feed forward-cutset-free (FCF) pipelined MAC architecture that is specialized for a high-performance machine learning accelerator. The proposed design method reduces the area and the power consumption by decreasing the number of flip-flops inserted for pipelining.

2.1 Preliminary: Feed forward-Cutset Rule for Pipelining

It is well known that pipelining is one of the most effective ways to reduce the critical path delay, thereby increasing the clock frequency. This reduction is achieved through the insertion of flip-flops into the data path. In addition to reducing critical path delays through pipelining, it is also important to satisfy functional equality before and after pipelining. The point at which the flip-flops are inserted to ensure functional equality is called the feed forward-cutset.

Cutset: A set of the edges of a graph such that, if these edges are removed, the graph becomes disjoint.

Feed forward-cutset: A cutset where the data move in the forward direction on all of the cutset edges.

2.2 Disadvantages

 The number of inserted flip-flops increases with the number of pipeline stages.
 Larger area and high critical path delay.
 High power consumption.

3. PROPOSED SYSTEM

MAC (multiply–accumulate) computation plays an important role in digital signal processing (DSP). The MAC is a common step that computes the product of two numbers and adds that product to an accumulator. Generally, a pipelined architecture is used to improve performance by reducing the length of the critical path. However, the pipelined architecture requires many additional flip-flops, which reduces the efficiency of the MAC and increases the power consumption. On the basis of the machine learning algorithm, this paper proposes a feed forward-cutset-free (FCF) pipelined MAC architecture that is specialized for a high-performance machine learning accelerator. The proposed design method reduces the area and the power consumption by decreasing the number of flip-flops inserted for pipelining, compared with the existing pipelined architecture for MAC computation. Finally, the proposed feed forward-cutset-free pipelined architecture for the MAC is implemented in VHDL, synthesized in Xilinx, and compared in terms of area, power, and delay reports.

3.1 Proposed FCF Pipelining

Fig. 1 shows examples of the two-stage 32-bit pipelined accumulator (PA) that is based on the ripple carry adder (RCA). A[31:0] represents data that move from the outside to the input buffer register. AReg[31:0] represents the data that are stored in the input buffer. S[31:0] represents the data that are stored in the output buffer register as a result of the accumulation. In the conventional PA structure [Fig. 1(a)], the flip-flops must be inserted along the feed forward-cutset to ensure functional equality. Since the accumulator in Fig. 1(a) comprises two pipeline stages, the number of additional flip-flops for the pipelining is 33 (gray-colored flip-flops). If the accumulator is pipelined to n stages, the number of inserted flip-flops becomes 33(n−1), which confirms that the number of flip-flops for the pipelining increases significantly as the number of pipeline stages is increased.

Fig. 1(b) shows the proposed FCF-PA. For the FCF-PA, only one flip-flop is inserted for the two-stage pipelining. Therefore, the number of additional flip-flops for the n-stage pipeline is only n − 1.
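The FCF idea can be illustrated with a behavioral sketch (Python, not the paper's VHDL; the 16/16 split of the 32-bit RCA is the two-stage case of Fig. 1). Only the lower-half carry is registered, i.e., the single inserted flip-flop, so it joins the upper half one cycle late; the intermediate sums differ from a conventional PA, but after one flush cycle the final value matches an ordinary 32-bit accumulation:

```python
def fcf_accumulate(samples):
    """Two-stage FCF-PA model: a 32-bit RCA split into 16-bit halves.
    The lower-half carry is registered (the one inserted flip-flop)
    and is absorbed by the upper half one clock cycle later."""
    lo = hi = 0       # halves of the output buffer S
    carry_ff = 0      # the single pipeline flip-flop
    for a in samples + [0]:                         # extra cycle flushes the carry
        t = lo + (a & 0xFFFF)                       # lower-half addition
        hi = (hi + (a >> 16) + carry_ff) & 0xFFFF   # upper half uses LAST cycle's carry
        carry_ff = t >> 16
        lo = t & 0xFFFF
    return (hi << 16) | lo

data = [0x0001FFFF, 0x00020003, 0x7000FFFF]
assert fcf_accumulate(data) == sum(data) & 0xFFFFFFFF  # final value is exact
```

Each input bit still participates in exactly one addition, which is why the final result is correct even though no full feed forward-cutset is registered.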


Fig -1: Schematics and timing diagrams of two-stage 32-bit accumulators. (a) Conventional PA. (b) Proposed FCF-PA.

In the conventional PA, the correct accumulation values of all the inputs up to the corresponding clock cycle are produced in each clock cycle, as shown in the timing diagram of Fig. 1(a). A two-cycle difference exists between the input and the corresponding output due to the two-stage pipeline. On the other hand, in the proposed architecture, only the final accumulation result is valid, as shown in the timing diagram of Fig. 1(b).

Fig. 2 shows examples of the ways that the conventional PA and the proposed method (FCF-PA) work. In the conventional two-stage PA, the accumulation output (S) is produced two clock cycles after the corresponding input is stored in the input buffer. In the proposed structure, on the other hand, the output is generated one clock cycle after the input arrives. Moreover, in the proposed scheme, the carry generated by the lower half of the 32-bit adder is involved in the accumulation one clock cycle later than in the conventional pipelining. For example, in the conventional case, the carry generated by the lower half and the corresponding inputs are fed into the upper-half adder in the same clock cycle, as shown in cycles 4 and 5 of Fig. 2 (left). In the proposed FCF-PA, however, the carry from the lower half is fed into the upper half one cycle later than the corresponding input for the upper half, as depicted in clock cycles 3-5 of Fig. 2 (right). This characteristic makes the intermediate result stored in the output buffer of the proposed accumulator different from the result of the conventional pipelining case.

Fig -2: Two-stage 32-bit pipelined-accumulation examples with the conventional pipelining (left) and the proposed FCF-PA (right). The binary number "1" between the two 16-bit hexadecimal numbers is a carry from the lower half.

The proposed accumulator, however, produces the same final output (cycle 5) as the conventional one. In addition, for the two architectures, the number of cycles from the initial input to the final output is the same. The characteristic of the proposed FCF pipelining method can be summarized as follows: when adders are used to process data in an accumulator, the final accumulation result is the same even if the binary inputs are fed to the adders in arbitrary clock cycles, as long as each is fed once and only once.

Meanwhile, the CLA adder has mostly been used to reduce the critical path delay of the accumulator. The carry prediction logic in the CLA, however, causes a significant increase in the area and the power consumption. For the same critical path delay, the FCF-PA can be implemented with less area and lower power consumption than the CLA-based accumulator.

3.2 Full adder designs using XNOR and XOR gates for sum logic

A full adder design employing two successive stages of XNOR gates for the sum logic, and one employing two successive stages of XOR gates, are depicted in Fig. 3.

Fig -3: Full adder using XOR gates and a MUX.

Fig -4: Pipelined column addition structure with the Dadda multiplier. (a) Conventional pipelining. (b) Proposed FCF pipelining. HA: half adder. FA: full adder.
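The XOR-and-MUX full adder of Fig. 3 can be checked behaviorally. The sketch below (Python; it assumes the usual two-XOR-plus-MUX structure, since the paper gives no netlist) verifies the sum and carry against the full-adder truth table:

```python
def full_adder_xor_mux(a, b, cin):
    """Full adder with two XOR stages for the sum and a 2:1 MUX for the carry."""
    p = a ^ b                  # first XOR stage (propagate signal)
    s = p ^ cin                # second XOR stage gives the sum
    cout = cin if p else a     # MUX: carry = cin when a != b, else a (= b)
    return s, cout

# Exhaustive check against the arithmetic definition a + b + cin = s + 2*cout
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, c = full_adder_xor_mux(a, b, cin)
            assert a + b + cin == s + 2 * c
```

The MUX works because when the propagate signal is 0 the two inputs are equal, so the carry equals either input; otherwise the carry is simply the incoming carry.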

3.3 Modified FCF-PA for Further Power Reductions

Although the proposed FCF-PA can reduce the area and the power consumption by replacing the CLA, there are certain input conditions under which an undesired data transition occurs in the output buffer, reducing the power efficiency when 2's complement numbers are used. Fig. 4 shows an example of the undesired data transition. The inputs are 4-bit 2's complement binary numbers. AReg[7:4] is the sign extension of AReg[3], which is the sign bit of AReg[3:0]. In the conventional pipelining [Fig. 4 (left)], the accumulation result (S) in cycle 3 and the data stored in the input buffer (AReg) in cycle 2 are added and stored in the output buffer (S). In this case, the "1" in AReg[2] in cycle 2 and the "1" in S[2] in cycle 3 are added, thereby generating a carry. The carry is transmitted to the upper half of S, and hence S[7:4] remains "0000".

Fig -5: Proposed (a) FCF-PA and (b) MFCF-PA for the improvement of the power efficiency.

3.4 4:2 Compressor Design

The 4:2 compressor, which is used to reduce the number of device computations and thereby the area and power of a MAC unit, is depicted in Fig. 6.
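A common construction of the 4:2 compressor cascades two full adders; five input bits of weight 1 are compressed into one sum bit and two carry bits of weight 2. The sketch below (Python, a generic model rather than the paper's specific circuit) verifies the defining identity x1 + x2 + x3 + x4 + cin = sum + 2·(carry + cout) exhaustively:

```python
import itertools

def compressor_4_2(x1, x2, x3, x4, cin):
    """4:2 compressor from two cascaded full adders.
    Note: cout depends only on x1..x3, so the cout->cin chain
    between neighboring compressors does not ripple."""
    s1 = x1 ^ x2 ^ x3
    cout = (x1 & x2) | (x2 & x3) | (x1 & x3)     # first full-adder carry
    s = s1 ^ x4 ^ cin
    carry = (s1 & x4) | (x4 & cin) | (s1 & cin)  # second full-adder carry
    return s, carry, cout

# Exhaustive check of the compression identity over all 32 input patterns
for bits in itertools.product((0, 1), repeat=5):
    s, carry, cout = compressor_4_2(*bits)
    assert sum(bits) == s + 2 * (carry + cout)
```

Because `cout` is independent of `cin`, a row of these compressors adds four partial-product rows in constant time per column, which is what makes it attractive for the column addition stage.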


Fig -6: MAC unit using 4:2 compressor. The figure shows the 8×8 partial-product array (bits a_i b_j) being reduced with half adders, full adders, and 4:2 compressors, with feed forward-cutset pipeline stage boundaries.

3.5 Advantages

 The feed forward-cutset-free technique decreases the number of flip-flops in the pipeline stages.
 Less area and a shorter critical path delay when using the Dadda multiplier.
 Low power consumption.

4. RESULT AND DISCUSSION

4.1 Power Report

Fig -7: Power report of MAC unit using 4:2 compressor.

4.2 Delay Report

Fig -8: Delay report of MAC unit using 4:2 compressor.

4.3 Area Report

Fig -9: Area report of MAC unit using 4:2 compressor.
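The column-addition principle behind Fig. 6 can be modeled behaviorally. The Python sketch below is a simplified reduction that uses only 3:2 full-adder counters (not the exact half-adder/4:2-compressor scheduling of the figure): partial-product bits are dropped into weighted columns, each column is compressed until at most two bits remain, and a final addition produces the product:

```python
def column_addition_multiply(a, b, width=8):
    """Multiply via partial-product generation, column addition with
    full adders (3:2 counters), and a final carry-propagate addition."""
    # 1) partial-product generation: bit (a_i & b_j) lands in column i + j
    cols = {}
    for i in range(width):
        for j in range(width):
            cols.setdefault(i + j, []).append((a >> i & 1) & (b >> j & 1))
    # 2) column addition: compress until every column holds at most 2 bits
    while max(len(bits) for bits in cols.values()) > 2:
        nxt = {}
        for c in sorted(cols):
            bits = list(cols[c])
            while len(bits) >= 3:                  # one full adder per 3 bits
                x, y, z = bits.pop(), bits.pop(), bits.pop()
                nxt.setdefault(c, []).append(x ^ y ^ z)                      # sum stays
                nxt.setdefault(c + 1, []).append((x & y) | (y & z) | (x & z))  # carry moves up
            nxt.setdefault(c, []).extend(bits)     # 0-2 leftover bits pass through
        cols = nxt
    # 3) final addition of the two remaining rows
    return sum(bit << c for c, bits in cols.items() for bit in bits)

assert column_addition_multiply(200, 171) == 200 * 171
```

Replacing groups of the 3:2 counters with 4:2 compressors, as in Fig. 6, reduces the number of reduction stages and hence the devices on the critical path, which is the source of the area and power savings reported below.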


4.4 Simulation Output

Fig -10: Simulation output of MAC unit using 4:2 compressor.

5. CONCLUSION

We introduced the FCF pipelining method in this paper. In the proposed scheme, the number of flip-flops in a pipeline can be reduced by relaxing the feed forward-cutset constraint, thanks to the unique characteristic of the machine learning algorithm. We applied the FCF pipelining method to the accumulator (FCF-PA) design, and then optimized the power dissipation of the FCF-PA by reducing the chance of undesired data transitions (MFCF-PA). The proposed scheme was also expanded and applied to the MAC unit (FCF-MAC). For the evaluation, the conventional and proposed MAC architectures were synthesized in a 65-nm CMOS technology. The proposed accumulator showed reductions in area and power consumption of 17% and 19%, respectively, compared with the accumulator based on the conventional CLA adder. In the case of the MAC architecture, the proposed scheme reduced both the area and the power by 20%. In the future, we will design the MAC unit using the MFCF-PA with a 4:2 compressor and an XOR-MUX full adder, and compare it with conventional full adder designs. We believe that the proposed idea of utilizing the unique characteristic of 4:2 compressor computation for more efficient MAC design can be adopted in many hardware accelerator designs.

6. REFERENCES

[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolution neural networks," in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[2] K. Simonyan and A. Zisserman, "Very deep convolution networks for large-scale image recognition," 2014. [Online]. Available: https://arxiv.org/abs/1409.1556
[3] A. Graves, A.-R. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), 2013, pp. 6645–6649.
[4] Y. H. Chen, T. Krishna, J. S. Emer, and V. Sze, "Eyeriss: An energy-efficient reconfigurable accelerator for deep convolution neural networks," IEEE J. Solid-State Circuits, vol. 52, pp. 127–138, 2017.
[5] C. S. Wallace, "A suggestion for a fast multiplier," IEEE Trans. Electron. Comput., vol. EC-13, no. 1, pp. 14–17, Feb. 1964.
[6] L. Dadda, "Some schemes for parallel multipliers," Alta Frequenza, vol. 34, no. 5, pp. 349–356, Mar. 1965.
[7] P. F. Stelling and V. G. Oklobdzija, "Implementing multiply-accumulate operation in multiplication time," in Proc. 13th IEEE Symp. Comput. Arithmetic, Jul. 1997, pp. 99–106.
[8] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New Delhi, India: Wiley, 1999.
[9] T. T. Hoang, M. Sjalander, and P. Larsson-Edefors, "A high-speed, energy-efficient two-cycle multiply-accumulate (MAC) architecture and its application to a double-throughput MAC unit," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 12, pp. 3073–3081, Dec. 2010.
[10] W. J. Townsend, E. E. Swartzlander, and J. A. Abraham, "A comparison of Dadda and Wallace multiplier delays," Proc. SPIE, Adv. Signal Process. Algorithms, Archit., Implement. XIII, vol. 5205, pp. 552–560, Dec. 2003, doi: 10.1117/12.507012.
