denormalize the product of 2 floating point numbers or not

Question


780 views Asked by AudioBubble At 12 July 2020 at 11:36

I'm trying to multiply 2 floating point numbers without using the floating point instructions. Everything was going well until I came across denormalized numbers. How do I know whether I should normalize or denormalize the product? This uncertainty makes rounding the product hard. My intuition tells me that the product should be denormalized if both factors are denormalized numbers.


There are 2 answers

chux - Reinstate Monica On 12 July 2020 at 12:11

> How do I know whether I should normalize or denormalize the product?

When the product is so small that its biased exponent would be less than 1, the result is a subnormal, or 0.0 when the biased exponent would be less than 1 minus the number of significant bits.

For binary64, that is when the magnitude of the product is less than DBL_MIN yet at least as large as DBL_TRUE_MIN:

DBL_MIN 2.225...E-308 or 0X1P-1022

DBL_TRUE_MIN 4.940...E-324 or 0X1P-1074

> My intuition tells me that the product should be denormalized if both factors are denormalized numbers.

The product of 2 numbers that are so small is itself so small that typically the product rounds to zero.

A product can land in the subnormal range even with normal arguments. Example:

DBL_MIN * 0.5 --> subnormal
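The example above can be checked directly. This is a minimal sketch, assuming `double` is IEEE 754 binary64 and that flush-to-zero is not enabled; the helper names `is_subnormal` and `half_of_dbl_min` are made up for illustration and do not come from the answer.

```c
#include <float.h>

/* Hypothetical helper: 1 if x is nonzero but smaller in magnitude than DBL_MIN. */
int is_subnormal(double x)
{
    return (x > 0.0 && x < DBL_MIN) || (x < 0.0 && x > -DBL_MIN);
}

/* Product of two normal numbers that lands in the subnormal range. */
double half_of_dbl_min(void)
{
    volatile double a = DBL_MIN;   /* smallest normal, 0x1p-1022 */
    volatile double b = 0.5;       /* volatile discourages compile-time folding */
    return a * b;                  /* exactly 0x1p-1023 */
}
```

Calling `is_subnormal(half_of_dbl_min())` should report a subnormal, while `is_subnormal(DBL_MIN)` should not.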

**Peter Cordes** · Accepted Answer · 2020-07-12T11:54:12+00:00

Subnormal numbers are very close to zero. For a subnormal x, x^2 has about twice the (negative) unbiased exponent, and that's way too small for even a subnormal to represent. (This holds even if x is the largest subnormal, i.e. nextafter(FLT_MIN, -INFINITY).) Things are similar for any two subnormal numbers.

The product of two subnormal numbers always fully underflows to +0.0 or -0.0 (in the default round-to-nearest mode).
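To see why, take binary32: every subnormal is below 2^-126, so the product of two of them is below 2^-252, while anything under 2^-150 (half the smallest subnormal, 2^-149) rounds to zero in round-to-nearest. A small sketch of the worst case follows, assuming `float` is IEEE 754 binary32 with default rounding; the function name is hypothetical.

```c
/* Hypothetical demo: square the largest binary32 subnormal. */
float largest_subnormal_squared(void)
{
    /* 0x1.fffffcp-127f is 2^-126 - 2^-149, the same value as
       nextafterf(FLT_MIN, 0.0f), written as a literal to avoid libm. */
    volatile float x = 0x1.fffffcp-127f;
    volatile float p = x * x;   /* exact value is just under 2^-252 */
    return p;                   /* rounds to +0.0f */
}
```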

The result of any operation should always be normalized if possible. The only time it's not possible is when the exponent would be too small. In that case, subnormal (aka denormal) numbers give you gradual underflow: the exponent stays at its minimum value and the leading bits of the mantissa are left as zero. https://en.wikipedia.org/wiki/Single-precision_floating-point_format explains subnormal numbers in general pretty well.
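For a software multiply, this rule could be sketched roughly as follows: normalize the inputs, multiply the significands, and only denormalize the result when its biased exponent would drop below 1. This is an illustration, not the asker's code. It assumes binary32 bit patterns, finite inputs (NaN and infinity are not handled), and round-to-nearest-even; the name `soft_mul_bits` is hypothetical.

```c
#include <stdint.h>

/* Sketch: multiply two binary32 bit patterns with integer operations only.
   Assumptions: both inputs are finite (NaN and infinity are not handled),
   rounding is round-to-nearest-even, overflow returns infinity. */
uint32_t soft_mul_bits(uint32_t a, uint32_t b)
{
    uint32_t sign = (a ^ b) & 0x80000000u;
    int32_t ea = (int32_t)((a >> 23) & 0xFFu);
    int32_t eb = (int32_t)((b >> 23) & 0xFFu);
    uint32_t ma = a & 0x7FFFFFu;
    uint32_t mb = b & 0x7FFFFFu;

    if ((ea == 0 && ma == 0) || (eb == 0 && mb == 0))
        return sign;                        /* zero times anything finite */

    /* Normal input: restore the implicit leading 1.
       Subnormal input: no implicit 1, and exponent field 0 means exponent 1. */
    if (ea != 0) ma |= 0x800000u; else ea = 1;
    if (eb != 0) mb |= 0x800000u; else eb = 1;

    /* Normalize subnormal inputs so that bit 23 is the leading 1. */
    while (!(ma & 0x800000u)) { ma <<= 1; ea--; }
    while (!(mb & 0x800000u)) { mb <<= 1; eb--; }

    uint64_t prod = (uint64_t)ma * mb;      /* in [2^46, 2^48) */
    int32_t e = ea + eb - 127;              /* biased exponent of the product */
    if (prod & (1ull << 47)) e++;           /* significand product was >= 2 */
    else prod <<= 1;                        /* either way, leading 1 is now bit 47 */

    int shift = 24;                         /* keep 24 bits for a normal result */
    if (e < 1) {                            /* too small to normalize: */
        shift += 1 - e;                     /* drop extra low bits (denormalize) */
        e = 1;                              /* significand is scaled for exponent 1 */
    }
    if (shift > 48)
        return sign;                        /* below half the smallest subnormal */

    uint64_t kept = prod >> shift;
    uint64_t rem  = prod & ((1ull << shift) - 1);
    uint64_t half = 1ull << (shift - 1);
    if (rem > half || (rem == half && (kept & 1)))
        kept++;                             /* round to nearest, ties to even */

    /* For a normal result, kept has its leading 1 in bit 23, so adding it to
       (e - 1) << 23 produces exponent field e. A subnormal result has bit 23
       clear and ends up with exponent field 0. A rounding carry propagates
       into the exponent field on its own. */
    uint32_t mag = ((uint32_t)(e - 1) << 23) + (uint32_t)kept;
    if (mag >= 0x7F800000u)
        mag = 0x7F800000u;                  /* overflow to infinity */
    return sign | mag;
}
```

For instance, `soft_mul_bits(0x00800000u, 0x3F000000u)` (FLT_MIN times 0.5) takes the `e < 1` branch and returns `0x00400000`, the subnormal 2^-127. Note that rounding happens once, after the denormalizing shift, so the result is rounded directly from the exact product.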

This is a general rule for floating point, always: IEEE754 formats like binary32 and binary64 leave no choice in how to represent any given finite value. A non-zero exponent encoding implies a leading 1 in the mantissa, so you can't have a denormalized float or double except for subnormal. The x87 80-bit extended-precision format has all its mantissa bits stored explicitly, so it's possible to encode a number with a non-zero exponent but leading zeros in the mantissa. However, hardware may even consider that invalid, and you should definitely never do it because it means throwing away more mantissa bits than necessary (if this was a multiply).
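The implicit leading 1 can be made concrete with a small decoder. This is a sketch assuming `float` is IEEE 754 binary32 and the input is finite (an all-ones exponent field, used for infinity and NaN, is not treated specially); `significand24` is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the 24-bit significand that a finite binary32 value
   stands for, with the implicit bit made explicit. */
uint32_t significand24(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);         /* well-defined way to read the bits */
    uint32_t exp_field = (bits >> 23) & 0xFFu;
    uint32_t frac = bits & 0x7FFFFFu;
    /* Non-zero exponent field: leading 1 is implied, so bit 23 is set.
       Zero exponent field: subnormal or zero, no implied 1. */
    return exp_field ? (frac | 0x800000u) : frac;
}
```

With this, `significand24(1.5f)` gives `0xC00000`, while the subnormal `0x1p-127f` gives `0x400000` with bit 23 clear.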

Addition or subtraction can also produce subnormal numbers, if the signs differ/match respectively. e.g. nextafter(FLT_MIN, +INFINITY) - FLT_MIN cancels all but the lowest mantissa bit (an example of "catastrophic cancellation"), leaving a number too small to be represented as a normalized float.
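The subtraction example can be reproduced with a short sketch. It assumes `float` is IEEE 754 binary32 and writes the operands as hexadecimal literals rather than calling `nextafterf`, so that no math library is needed; the function name is made up for illustration.

```c
/* Hypothetical demo: subtract FLT_MIN from the next float above it. */
float gap_above_flt_min(void)
{
    volatile float lo = 0x1p-126f;          /* FLT_MIN, the smallest normal */
    volatile float hi = 0x1.000002p-126f;   /* same value as nextafterf(FLT_MIN, INFINITY) */
    volatile float diff = hi - lo;          /* only the lowest mantissa bit survives */
    return diff;                            /* 0x1p-149f, the smallest subnormal */
}
```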
