printf is printing that decimal representation. However, what surprised me is that this imprecise representation can be expanded to so many new decimal digits.
printf prints the decimal expansion of the binary number corresponding to our original literal. That binary number approximates the literal. There can be at most (53 - binary exponent) digits in the decimal expansion (at least for "normal" numbers less than 1). Read below for the details.
printf. How many digits can I expect in this decimal expansion?
sign * significand * (base ^ exponent)
1 * 1.011 * (base ^ 2), sign is 1, significand is 1.011 and we need to scale it two positions to the right, so exponent is
double we can do the following steps.
P / Q where Q is an exact power of 2 and P is an integer.
0.000 11111100110101101110100110111010001101111011001011001 011011...
1 we should also add 1 to that large part for rounding purposes. In our case it's 0 so we are fine.
1 is 4 positions to the right of the binary point. So in our case the exponent will be -4. However, in IEEE 754 exponents are stored with a particular bias which has to be added before we store it in bits. For double precision this bias is 1023, so we have to add that to -4, getting 1019, which we need to store as an unsigned integer in the 11 bits of the exponent (those numbers can also be taken from the table above). Why store the exponent with a bias and not as "sign + absolute value" or "two's complement"? The main reason is that this way we can use an integer comparator to compare floating-point numbers. Also, it leads to a nice zero representation with all 0 bits. See here for the details.
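The bias arithmetic can be checked directly against the stored bits. This is a minimal sketch using only the standard library; the helper names `biased_exponent` and `unbiased_exponent` are made up for illustration, and the unbiased value is only meaningful for normal numbers (not zero, subnormals, infinities or NaN).

```rust
/// Biased exponent field: bits 52..=62 of the IEEE 754 double layout.
fn biased_exponent(x: f64) -> u64 {
    (x.to_bits() >> 52) & 0x7ff
}

/// Unbiased exponent of a normal double (the bias is 1023).
fn unbiased_exponent(x: f64) -> i64 {
    biased_exponent(x) as i64 - 1023
}

fn main() {
    let x = 0.1234567890123456_f64;
    println!(
        "biased = {}, unbiased = {}",
        biased_exponent(x),
        unbiased_exponent(x)
    );

    // For non-negative doubles, comparing the raw bit patterns as unsigned
    // integers gives the same order as comparing the values themselves.
    let (a, b) = (0.1234567890123456_f64, 1.5_f64);
    assert_eq!(a < b, a.to_bits() < b.to_bits());

    // Positive zero is the all-zero bit pattern.
    assert_eq!(0.0_f64.to_bits(), 0);
}
```

Note that the integer comparison shown here only covers non-negative values; negative numbers have the sign bit set and need extra handling.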
0 for sign. Exponent is 01111111011 if represented with 11 bits. And we've got our 53 precision bits of the significand which we need to pack into 52 bits. This is easily done, since the first bit is always 1, so we never store it (it's called the "hidden" or "implicit" bit). In the end we get this:
0 01111111011 1111100110101101110100110111010001101111011001011001
0.1234567890123456 in memory! Whoa, that was a lot of work. Let's check if we did it right with some Rust code (it's easier to print bits in Rust than in C):
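A minimal sketch of such a check, assuming only the standard library. The helper name `double_bits` is hypothetical; it splits the 64-bit pattern into the sign, exponent and fraction fields so the result can be compared with the layout derived by hand.

```rust
/// Formats a double as "sign exponent fraction" in binary (1 + 11 + 52 bits).
fn double_bits(x: f64) -> String {
    // Zero-padded 64-character binary string of the raw bit pattern.
    let s = format!("{:064b}", x.to_bits());
    format!("{} {} {}", &s[0..1], &s[1..12], &s[12..64])
}

fn main() {
    println!("{}", double_bits(0.1234567890123456));
}
```

The printed fields should match the sign, exponent and 52 stored significand bits worked out above.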
double d = 1.111111111111111111111111111111111111111111111111;
For decimal floating constants [...] the result is either the nearest representable value, or the larger or smaller representable value immediately adjacent to the nearest representable value, chosen in an implementation-defined manner.
printf. It takes all those bits we used for the binary representation, converts them back to an exact decimal and prints it with the specified precision.
is. This corresponds to the value of
i-th bit in the significand.
printf and we also have an upper bound on them. For example in our case we should expect no more than 53 + 4 digits after the point. "+4" because we use the -4 exponent, which can add more digits. Indeed, if we add some more precision to our original program, we can see that the decimal representation of our double has 56 digits (and zeros after that):
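The digit count can also be reproduced from the bits themselves. The sketch below is an illustration, not the original program: the helper `exact_fraction_digits` is a hypothetical name, and it only handles normal doubles in (0, 1) whose exponent is at least -8, so that the long multiplication fits in a `u64`. It writes the value as m / 2^shift and peels off one decimal digit per step.

```rust
/// Exact decimal digits after the point for a normal double in (0, 1).
/// Sketch only: needs 52 - exponent <= 60 so the arithmetic fits in u64.
fn exact_fraction_digits(x: f64) -> String {
    let bits = x.to_bits();
    let exponent = ((bits >> 52) & 0x7ff) as i64 - 1023;
    assert!(x > 0.0 && x < 1.0 && (-8..0).contains(&exponent));
    // 53-bit significand with the hidden bit restored: x = m / 2^shift.
    let m = (bits & ((1u64 << 52) - 1)) | (1u64 << 52);
    let shift = (52 - exponent) as u32;
    let mask = (1u64 << shift) - 1;
    let mut rem = m;
    let mut digits = String::new();
    while rem != 0 {
        // Multiply the remainder by 10; the integer part is the next digit.
        rem *= 10;
        digits.push(char::from(b'0' + (rem >> shift) as u8));
        rem &= mask;
    }
    digits
}

fn main() {
    let x = 0.1234567890123456_f64;
    let digits = exact_fraction_digits(x);
    println!("0.{} ({} digits)", digits, digits.len());
    // Asking the standard formatter for 60 places shows the same digits
    // followed by zero padding.
    println!("{:.60}", x);
}
```

For our number the lowest stored significand bit is set and has weight 2^-56, which is why the expansion stops after exactly 56 digits, one short of the 53 + 4 bound.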