Block Floating Point Interval ALU For Digital Signal Processing
Outline
Introduction
Background
Architecture
Results
Conclusions and Future Work
References
Introduction
Problem Statement
To provide reliable arithmetic for embedded systems.
Low power
Small footprint
Real-time computing
Applications
Digital signal processing & Control
Fuzzy systems
Adaptive filtering
Decision systems
Introduction
Problem Statement
Fixed point implementations
Limited dynamic range
Overflow in an interval summation: $\sum_{n=0}^{75} a_n$, where each term $a_n$ is the interval [1.10, 1.15]
Build a fixed point interval ALU whose arithmetic stays reliable even in the presence of overflow.
[Figure: lower bound and upper bound of the interval summation]
Background
Previous Work
[Figure: lower endpoint envelope]
BLOCK NORMALIZATION
Scale the data to a common exponent before the operation.
Perform fixed point operations to process that block.
Mathematical Formulation of Block Floating Point for Intervals
Block Exponent
γ can also be evaluated as the negated minimum number of leading sign bits in the binary representation of the block's values
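As a rough illustration of the sign-bit formulation of γ above, the following Python sketch computes a block exponent for a block of intervals and then normalizes the block. The 16-bit two's complement word length, the helper names, and the choice to count only the redundant copies of the sign bit (not the sign bit itself) are assumptions made for this example, not details taken from the slides.

```python
WORD_BITS = 16  # assumed word length

def redundant_sign_bits(value, bits=WORD_BITS):
    """Count the bits below the MSB that merely repeat the sign bit."""
    word = value & ((1 << bits) - 1)       # two's complement bit pattern
    sign = (word >> (bits - 1)) & 1
    count = 0
    for pos in range(bits - 2, -1, -1):
        if (word >> pos) & 1 != sign:
            break
        count += 1
    return count

def block_exponent(block, bits=WORD_BITS):
    """gamma = negated minimum redundant-sign-bit count over all endpoints."""
    return -min(redundant_sign_bits(e, bits) for iv in block for e in iv)

def normalize_block(block, bits=WORD_BITS):
    """Shift every endpoint left by -gamma so the block shares one exponent."""
    gamma = block_exponent(block, bits)
    shift = -gamma
    return [(lo << shift, hi << shift) for lo, hi in block], gamma

# Hypothetical block of two intervals, each stored as (lower, upper).
block = [(0x0100, 0x0200), (-0x0040, 0x0080)]
normalized, gamma = normalize_block(block)
```

Because the shift is the minimum over the whole block, the largest endpoint loses all of its redundant sign bits while no endpoint overflows the assumed word length.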
Design Specifications
Handling Fixed Point Overflows
Conditional Block Floating-point Scaling (CBFS)
Overflow is mainly associated with the addition operation
CBFS is based on correcting errors after they occur
Procedure:
Perform the operation
Check whether overflow occurred
If it did, scale down the result by a factor of 2 and increment the output block exponent
If it did not, retain the result; the output block exponent is the same as the input block exponent
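The CBFS procedure can be sketched in Python for a single interval addition. This is a minimal sketch under stated assumptions: 16-bit signed endpoints, a hypothetical function name `cbfs_add`, and outward rounding applied when halving (floor for the lower endpoint, ceiling for the upper). In a full block operation the other values in the block would also have to follow the new exponent, which is not shown here.

```python
def cbfs_add(x, y, exponent, bits=16):
    """Add intervals x and y (same block exponent), scaling only on overflow."""
    lo_limit = -(1 << (bits - 1))
    hi_limit = (1 << (bits - 1)) - 1

    lo = x[0] + y[0]
    hi = x[1] + y[1]

    overflow = lo < lo_limit or hi > hi_limit
    if overflow:
        lo = lo >> 1            # arithmetic shift: rounds toward -infinity
        hi = -((-hi) >> 1)      # negate, shift, negate: rounds toward +infinity
        exponent += 1           # result is now scaled by one more factor of 2
    return (lo, hi), exponent

# Overflowing case: 20001 + 15000 exceeds the assumed 16-bit signed range.
scaled, exp_scaled = cbfs_add((20000, 20001), (15000, 15000), 0)
# Non-overflowing case: result and exponent are retained unchanged.
kept, exp_kept = cbfs_add((100, 200), (1, 2), 3)
```

Halving is enough to bring any sum of two in-range endpoints back into range, which is why a single right shift suffices per addition.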
Design Specifications
Rounding
Outward Rounding
The output interval must remain correct, i.e. it must contain the exact result
Retain the rounding scheme from IALU [Ruchir2006]
Lower endpoint: truncate by discarding the higher precision (low-order) bits
Upper endpoint: truncate, then add the OR-ed result of the discarded bits to round the result to +∞.
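A small Python sketch of this outward rounding scheme follows. The function name `round_outward` and the parameter `drop` (number of low-order bits discarded) are hypothetical, and Python integers stand in for two's complement hardware words; it is an illustration of the idea rather than the IALU implementation.

```python
def round_outward(lo, hi, drop):
    """Discard `drop` low-order bits from both endpoints, rounding outward."""
    mask = (1 << drop) - 1

    # Truncation of a two's complement value rounds toward -infinity,
    # which is the safe direction for the lower endpoint.
    lo_rounded = lo >> drop

    # OR of the discarded bits: 1 if any precision was lost, else 0.
    sticky = 1 if (hi & mask) else 0

    # Truncate, then add the sticky bit to round toward +infinity.
    hi_rounded = (hi >> drop) + sticky
    return lo_rounded, hi_rounded

inexact = round_outward(0b10110, 0b10110, 2)   # 22 / 4 = 5.5
exact = round_outward(8, 8, 2)                 # 8 / 4 = 2 exactly
negative = round_outward(-5, -5, 2)            # -5 / 4 = -1.25
```

When no bits are lost the sticky bit is zero and both endpoints stay exact, so the interval only widens when precision is actually discarded.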
Architecture
Hardware Architecture
Top Level Hardware Architecture
Hardware Architecture
Flag Generator
Compare (XL with YU); (XU with YL)
Sets OVFL_L, a one-bit signal, high to indicate overflow to the Scale Synchronizer
[Figure: block diagram. Recoverable labels: output interval, DIV, SUB, WIDTH, UNION, MIN, MAX, OR, AND, XOR, EXP. DET., SIGNED LEFT SHIFT, SELECT LOGIC; signals XL and cmd1; width annotations 16 and 4; select labels s2, s3, s4, s6, s7, s8, s9, s10]
Hardware Architecture
Upper Bound Module
EXPONENT DETECTION
Identify the redundant sign bits by XOR of successive data bits.
LEFT SHIFTING
The integer output from the Priority Encoder is the value of γ
Single cycle Normalize: Select the normalized value from shifted versions of the input, using γ as the select line
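The exponent detection and single cycle normalize steps can be modelled in software roughly as follows. This is a behavioural sketch, not the hardware design: the 16-bit width, the function names, and the idea of OR-ing the XOR patterns of all words in a block before priority encoding are assumptions. The encoder output is used here as a non-negative shift count; how its sign maps onto γ is left to the formulation given earlier.

```python
BITS = 16                      # assumed word length
MASK = (1 << BITS) - 1

def transitions(word):
    """Bit i is 1 where bits i+1 and i of the word differ (i = 0..BITS-2)."""
    return (word ^ (word >> 1)) & (MASK >> 1)

def priority_encode(pattern):
    """Count zeros above the highest set bit: the redundant sign bits."""
    for count in range(BITS - 1):
        if (pattern >> (BITS - 2 - count)) & 1:
            return count
    return BITS - 1            # no transition at all (all 0s or all 1s)

def normalize(words):
    """Detect the shared shift, then pick each result from shifted copies."""
    combined = 0
    for w in words:
        combined |= transitions(w & MASK)
    select = priority_encode(combined)

    out = []
    for w in words:
        # All candidate shifts exist side by side, as in a multiplexer;
        # the encoder output acts as the select line.
        candidates = [((w & MASK) << s) & MASK for s in range(BITS)]
        out.append(candidates[select])
    return out, select

normalized, select = normalize([0x0100, 0x0200, -256])
```

Selecting from precomputed shifted versions mirrors a multiplexer, which is what allows the normalization of one word to complete in a single cycle.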
Hardware Architecture
Scale Synchronizer
Main functions
Point-wise operations
Rounding scheme could be Truncation or Rounding to +∞
No synchronization needed
Updating the block exponent increment, depending on:
Whether overflow occurred or not
Whether special case rounding occurred or not
Whether the operations are iterative or not
Hardware Architecture
Scaling Modules
Results
Module Execution Rates
For an interval block of size N, (N) cycles are needed each for Exponent Detection and for Left-shifting to Normalize
A 3 cycle penalty per overflow is associated with flushing the MAC feedback path, reloading the new block exponent and resuming operations.
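For a back-of-the-envelope feel for these rates, the overhead can be tallied as below. This sketch assumes that "(N) cycles" means exactly N cycles per stage, that the two stages do not overlap, and that overflow penalties simply add up; the function name is hypothetical and the figures it produces are not measured results.

```python
def overhead_cycles(block_size, overflows):
    """Estimate non-arithmetic cycles for one block under the stated assumptions."""
    exponent_detection = block_size      # assumed N cycles
    left_shifting = block_size           # assumed N cycles
    overflow_penalty = 3 * overflows     # 3 cycles per overflow
    return exponent_detection + left_shifting + overflow_penalty

# Hypothetical block of 64 intervals with 2 overflows during processing.
estimate = overhead_cycles(64, 2)
```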
Conclusions and Future Work
Future Work
References
[Ruchir2006] R. Gupte, W. Edmonson, J. Gianchandani, S. Ocloo, and W. Alexander,
"Pipelined ALU for signal processing to implement interval arithmetic," Signal
Processing Systems Design and Implementation, IEEE, pp. 95-100, 2006.
[Amaricai2007] A. Amaricai, M. Vladutiu, L. Prodan, M. Udrescu, and O. Boncalo,
"Design of addition and multiplication units for high performance interval
arithmetic processor," Design and Diagnostics of Electronic Circuits and
Systems (DDECS '07), IEEE, April 2007.
[Schultz2000] M. J. Schulte and E. E. Swartzlander, "A family of variable-precision interval
arithmetic processors," IEEE Transactions on Computers, vol. 49, May 2000.
[Stine1998] J. E. Stine and M. J. Schulte, "A combined interval and floating-point
multiplier," 8th Great Lakes Symposium on VLSI, pp. 208-213, Feb 1998.
[Stine1998a] J. E. Stine and M. J. Schulte, "A combined interval and floating-point divider,"
IEEE Conference Record on Signals, Systems and Computers, 1998.
[Akkas2002] A. Akkas, "A combined interval and floating-point comparator/selector,"
Application-Specific Systems, Architectures and Processors, pp. 208-217, July 2002.
[Oppenheim1970] A. Oppenheim, "Realization of digital filters using block-floating-point
arithmetic," IEEE Transactions on Audio and Electroacoustics, vol. 18, pp. 130-136,
Jun 1970.
[Erickson1992] A. C. Erickson and B. S. Fagin, "Calculating the FHT in hardware," IEEE
Transactions on Signal Processing, vol. 40, June 1992.
[Bidet1995] E. Bidet, D. Castelain, C. Joanblanq, and P. Senn, "A fast single-chip
implementation of 8192 complex point FFT," IEEE Journal of Solid-State Circuits,
vol. 30, no. 3, pp. 300-305, Mar 1995.
[Van Emden2001] M. Van Emden, T. Hickey, and Q. Ju, "Interval arithmetic: From
principles to implementation," Journal of the ACM, vol. 48, pp. 1038-1068,
September 2001.
[Liang2000] Q. Liang and J. M. Mendel, "Overcoming time-varying co-channel interference
using Type-2 fuzzy adaptive filters," IEEE Transactions on Circuits and Systems - II,
vol. 47, Dec 2000.
[Chhabra1999] Chhabra and R. Iyer, "A block floating point implementation on the
TMS320C54x DSP," Tech. Rep., Texas Instruments, December 1999. Application
report SPRA610.
[Kalliojarvi1996] K. Kalliojarvi and J. Astola, "Roundoff errors in block-floating-point
systems," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 44, pp.
783-790, April 1996.
[Deschamps2006] J.-P. Deschamps, G. J. A. Bioul, and G. D. Sutter, Synthesis of Arithmetic
Circuits. John Wiley & Sons, 2006.
[Cragon1996] H. G. Cragon, Memory Systems and Pipelined Processors. Sudbury,
Massachusetts: Jones and Bartlett Publishers, 1996.
[Hansen2004] E. Hansen and G. W. Walster, Global Optimization Using Interval Analysis.
Marcel Dekker, Inc. and Sun Microsystems, Inc., 2004.
[intervalhomepage] http://www.cs.utep.edu/interval-comp/intsoft.html
Thank You !!