#1,091 – Subnormal Floating Point Numbers

32-bit binary floating point numbers are normally stored in a normalized form, that is:

1091-001

where d is the fractional part of the mantissa, consisting of 23 binary digits and e is the exponent, represented by 8 bits.

In this form, the minimum allowed value for e is -126, which is stored in the 8-bit exponent as a value of 1.  Because the leading 1 is implicit, this means that the  minimum positive floating point value is:

1091-002

We could use the 8-bit value of 0 in the exponent to represent an exponent of -127, but that would only gain us a single power of two, or one more value that we could store.

Instead, a value of of 0 stored in the 8-bit exponent is a signal to drop the leading 1 in the mantissa.  This allows storing a set of much smaller numbers, known as subnormal numbers, of the form:

1091-003

We can now use all 23 digits in the mantissa, allowing us to store numbers as low as 2^-149.

Advertisements