ENSC254 - Floating Point Computation
• Floating-point numbers come into play when we are interested in a large range of
values
• Traditionally, floating-point algorithms have been manually ported to the integer
domain for embedded systems. But as the complexity and performance of embedded
systems grow, floating-point computations are becoming increasingly common
IEEE 754 - 2008
• This standard specifies interchange and arithmetic formats and methods for
floating-point arithmetic in computer environments.
• It specifies exception conditions and their default handling.
• An implementation of a floating-point system conforming to this standard may
be realized entirely in software/firmware, entirely in hardware, or in any
combination of software/firmware and hardware.
Floating Point in C environment
• IEEE 754-2008 defines 4 binary FP formats: 16, 32, 64 and 128 bits
(half precision, single precision, double precision and quad precision)
Floating point Support in ARM
• Three approaches to floating point: Manual Porting (to integer code),
Software Libraries (emulation), or Hardware (an FPU)
F = (−1)^s ∗ 2^(exp−bias) ∗ 1.f

bit 31: s (sign) | bits 30–23: exponent | bits 22–0: fraction
See Hohl textbook page 180 for analogous 16 and 64 bit formats
Example from textbook (Page 181)
• 6.5/2^1 = 3.25
• 6.5/2^2 = 1.625 => exp-bias=2
Example from Textbook
• 0.4375/2^(−1) = 0.875
• 0.4375/2^(−2) = 1.75 => exp-bias=-2
Range of FP Numbers
• Floating-point numbers cover a much larger range than integers, but they are
represented with the same number of bits (32 for single precision)
• This means that they can be less accurate: there can be a larger gap between
consecutive representable values, leading to possible ERRORS in the
REPRESENTATION
Accuracy of Floating Point Numbers
F = (−1)^s ∗ 2^(exp−bias) ∗ 1.f
De-normal / Subnormal Numbers
F = (−1)^s ∗ 2^(−126) ∗ 0.f

In this case we can add 2^23 more numbers to the list of representable floats.
Sub-normal numbers are indicated by an exponent field of zero: all numbers with
exp = 0 (and non-zero fraction) are considered de-normalized.
Zero and Infinity
• FP representation includes two zero (+/-0) and two infinity values (+/-∞)
Infinity is considered as a mathematical concept, and NOT as the
maximum representable value!
bit 31: s (sign) | bits 30–23: exponent | bits 22–0: fraction
Not a Number (NaN)
• IEEE 754-2008 requires that the result of any operation involving a NaN is a
NaN
• NaNs are encoded with an exponent of all ‘1’s and a non-zero significand
Implementation of FP Operations in ARM
• Recent ARM architectures, such as the Cortex family, allow the use of hardware
FP calculation (an FPU)
Co-Processor Computation
[Diagram: the microprocessor passes Operands from its register file (rfile) to the
Co-Processor, which has its own rfile and returns Results]
Coprocessor vs Memory mapped peripheral
• Memory-mapped peripherals are loosely coupled to the CPU. They have NO
IMPACT on the CPU architecture and ISA, which sees the peripheral (e.g. a
divider) as part of the memory. But memory accesses can be demanding in terms
of cycle time
Coprocessor vs Memory mapped peripheral (2)
z=x/y;

MTC CR0, r4               ; MTC = Move to Coprocessor
MTC CR1, r5
COP.div CR2, CR0, CR1
[....wait necessary time ....]
MFC CR2, r2               ; MFC = Move from Coprocessor

[Diagram: the CPU and the DIVIDER coprocessor communicate over the bus]
• In this case, it is customary for the coprocessor to have an internal register file to
store temporary values internally minimizing transfers to/from the main CPU
Floating Points Units
• The ARM FPU also has two control FP registers: FPSCR (Floating Point
Status and Control Register) and CPACR (Coprocessor Access Control
Register)
Cortex FP Registers
• The 32 GP registers are “flat”: there is no specific usage convention, as there is
for the integer registers; they can all be used interchangeably. The mnemonics s#
(single precision) or d# (double precision) are used instead of r# to
indicate floating-point registers
• FPSCR (Floating Point Status and Control Register) is the equivalent of the
integer CPSR, and stores operation information:
Loading FP Registers
Loading/Storing data to/from the FPU
Moving FP data between GP and FP Registers
VMOV.f32 S#, R#
VMOV.f32 R#, S#
VMOV.f32 S#, S#
VMOV.f32 S#, immed
• NOTE: VMOV, VLDR, VSTR as well as all other floating point operations are
part of the ARM ISA, so they support all conditional execution suffixes
Double Precision FP move
VMOV S#,S#,R#,R#
VMOV R#,R#,S#,S#
VMOV D#, R#,R#
VMOV R#,R#,D#
• Note that the last two are not independent operations, but a different way
(aliasing) of writing the same operation: they correspond to the same
machine code
Floating Point Processing Instructions
ABS / NEG / ADD / SUB / MUL / MLA / MLS / CMP / DIV / SQRT
Format Conversion Instructions
Instructions for format conversion: VCVTB, VCVTT, VCVT

C Code:
int a;
float b;
main()
{ b=(float) a; }
Tutorial 1: Disassembling a simple FPU Code
Cortex M4 Version
C Code:
float a,b=2.3,c=3.4;
main()
{a=b+c;}
ARM7TDMI Version: the FP add is performed by a call to an emulation function
(software FP, no FPU)
Tutorial 1: Profiling Information
Single cycle
Tutorial 2: Disassembling an FP Division
C Code:
float a,b=2.3,c=3.4;
main()
{a=c/b;}     // the FP division takes 14 cycles
Note: For your reference, a SP MUL operation takes one cycle, an MLA 3 cycles