Fused Multiply Add
A Fused Multiply Add (FMA) computes a multiply-accumulate
FMA(A,B,C) = A*B+C
on floating point numbers with a single rounding: the product A*B is kept exact internally, and only the final sum is rounded.
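When implemented in a microprocessor, this is typically faster than a multiply operation followed by an add. It also makes it possible to recover the bottom half of a multiplication, that is, the part of the exact product that is lost when the product is rounded. For example, barring overflow and underflow, H below is the rounded product and L is the remainder A*B-H: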
- H = FMA(A,B,0.0)
- L = FMA(A,B,-H)
FMA is implemented on the PowerPC and Itanium processor families. With this instruction available, a dedicated hardware divide or square root unit is not strictly necessary, since both operations can be implemented in software using the FMA.
The FMA operation was added to the IEEE 754 standard in its 2008 revision, which was developed under the name IEEE 754r.