Learning · Step 1
IEEE-754: float & double
A computer doesn't store 3.14 as decimal digits. It
stores it the way scientific notation works — a sign, a fraction,
and a power of two:
(−1)^s × 1.f × 2^e.
The IEEE-754 standard fixes exactly how those three
parts are packed into the bits of a float or a
double.
A 32-bit float splits into 1 sign bit · 8 exponent bits ·
23 mantissa bits; a 64-bit double into 1 · 11 · 52.
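As a rough sketch of that layout, the snippet below pulls the three fields out of a value with Python's standard `struct` module. The helper names `float32_fields` and `float64_fields` are made up for this illustration, and the exponent is returned exactly as it is stored in the bits.

```python
import struct

def float32_fields(x):
    """Return (sign, exponent, mantissa) fields of x rounded to a 32-bit float."""
    # Pack as a big-endian float, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, as stored
    mantissa = bits & ((1 << 23) - 1)  # 23 bits, fraction only
    return sign, exponent, mantissa

def float64_fields(x):
    """Return (sign, exponent, mantissa) fields of x as a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, as stored
    mantissa = bits & ((1 << 52) - 1)  # 52 bits, fraction only
    return sign, exponent, mantissa

print(float32_fields(1.0))
print(float64_fields(1.0))
```

The same value lands in different field widths depending on the format, which is the point of comparing the two calls.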
The exponent is stored biased: the field holds the true exponent
plus 127 for a float, or plus 1023 for a double, so negative powers
fit in an unsigned field. The mantissa stores only the fraction;
the leading 1. is implied.
Two exponent patterns are special. All-zeros means zero (mantissa all zeros) or a subnormal (any mantissa bit set); all-ones means infinity (mantissa all zeros) or NaN (any mantissa bit set). Everything else is a normal number. This is the representation your code holds before you port it to fixed point, and the mantissa's fixed width is exactly why precision eventually runs out.
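The decoding rules can be sketched by hand for a 32-bit float, as below. This is a minimal illustration, not a reference decoder: `decode_float32` is a hypothetical name, and the subnormal branch uses the standard single-precision scale of 2^-126 with no implied leading 1, a detail the text above does not spell out.

```python
import math
import struct

def decode_float32(bits):
    """Decode a 32-bit pattern by hand: sign, biased exponent, fraction."""
    sign = -1.0 if bits >> 31 else 1.0
    exponent = (bits >> 23) & 0xFF          # stored (biased) exponent
    fraction = (bits & 0x7FFFFF) / 2**23    # the f in 1.f

    if exponent == 0xFF:                    # all-ones: infinity or NaN
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:                       # all-zeros: zero or subnormal
        return sign * fraction * 2.0**-126  # no implied leading 1 here
    # Normal number: restore the implied 1 and remove the bias of 127.
    return sign * (1.0 + fraction) * 2.0**(exponent - 127)

# Cross-check one pattern against the platform's own conversion.
pattern = 0x40490FDB  # the float closest to pi
(expected,) = struct.unpack(">f", struct.pack(">I", pattern))
print(decode_float32(pattern) == expected)
```

A double would follow the same shape with an 11-bit exponent, a 52-bit fraction, and a bias of 1023.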
Try it
Click any bit to flip it, type a value, or pick a preset — the
decode updates live. Switch between float and
double to see the same number use a different layout.
What to notice
- Type 0.1: the mantissa fills with a repeating pattern. 0.1 has no exact binary form, so it's rounded; that tiny error is the root of most fixed-point surprises.
- Flip a high exponent bit: the value jumps by a huge power of two. The exponent sets the scale, the mantissa the detail.
- Set every exponent bit to 1: that's Infinity, or NaN the moment any mantissa bit is also set.
- Switch to double: same three fields, but 52 mantissa bits instead of 23, so far more precision for the same value.
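The four observations above can also be reproduced without the interactive widget. The sketch below assumes Python's `float` is an IEEE-754 double, as it is on mainstream platforms; `f32_bits` and `f32_from_bits` are hypothetical helpers written for this example.

```python
import math
import struct
from fractions import Fraction

def f32_bits(x):
    """Bit pattern of x after rounding to a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits

def f32_from_bits(bits):
    """Value of a 32-bit pattern, widened to a Python float."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x

# 1. 0.1 has no exact binary form: the mantissa shows a repeating 1001 pattern.
tenth_bits = f32_bits(0.1)
mantissa_pattern = format(tenth_bits & 0x7FFFFF, "023b")

# 2. Flipping the top exponent bit (bit 30) rescales the value by 2**128.
flipped = f32_from_bits(tenth_bits ^ (1 << 30))
scale = flipped / f32_from_bits(tenth_bits)

# 3. All exponent bits set: infinity with an empty mantissa, NaN otherwise.
infinity = f32_from_bits(0x7F800000)
not_a_number = f32_from_bits(0x7F800001)

# 4. Rounding error of 0.1 in each format, measured exactly with Fraction.
err_float = abs(Fraction(f32_from_bits(tenth_bits)) - Fraction(1, 10))
err_double = abs(Fraction(0.1) - Fraction(1, 10))

print(mantissa_pattern)
print(scale == 2.0**128, infinity, math.isnan(not_a_number))
print(float(err_float), float(err_double))
```

The last line prints the two rounding errors side by side, which makes the gap between 23 and 52 mantissa bits concrete.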
Step exam
Answer all 3 questions correctly to complete this step.
- In IEEE-754 single precision (float), how many bits hold the exponent?
- In the value formula (-1)^S · 1.mantissa · 2^(e - bias), what does the exponent field set?
- Double precision carries how many mantissa bits, versus float's 23?