Floating-point numbers aren’t evenly distributed

Last Saturday, in my explanations about numbers on computers, I wrote that, unlike mathematical numbers, Double (64-bit) floating-point numbers on computers aren’t infinite in number, and are spread very unevenly. Here I show just how uneven that spread is, and its effect on the numbers that your computer calculates with.

Perhaps the easier of my two graphs to understand is that showing the density of Double values.

Both axes of this graph are logarithmic, for reasons which become obvious when you look at the values. Along the X (horizontal) axis are Double values ranging from 0.0001 up to 1.0 x 10^32 – that’s 1 followed by 32 noughts, which is far larger than even Apple’s annual revenue. On the Y (vertical) axis is the number of different Double values in each unit interval, that is, for each change of 1.0 in value.

So at a Double value of 1.0, there are 4,503,599,627,370,496 (that’s 2^52) different Doubles for every 1.0 change in value, which is a great many different numbers. At a value of 0.5 x 10^16, each step of 1.0 brings just one different Double, and at 1 x 10^32, there are less than 6.0 x 10^-17 Doubles in every change of 1.0.
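
Those densities can be checked with a short sketch. It's written in Python rather than Swift, purely for illustration, and assumes Python 3.9 or later for `math.ulp`; the helper name `doubles_per_unit` is my own invention, and it estimates the density as the reciprocal of the spacing at that value.

```python
import math

def doubles_per_unit(x: float) -> float:
    """Approximate count of distinct Doubles per 1.0 change in value near x."""
    # math.ulp(x) is the spacing between x and the next Double above it,
    # so its reciprocal is the local density of Doubles.
    return 1.0 / math.ulp(x)

for x in (1.0, 0.5e16, 1.0e32):
    print(f"near {x:g}: about {doubles_per_unit(x):.3g} Doubles per unit")
```

Because the spacing doubles at each power of two, the density is a step function rather than a smooth curve, which is why it only makes sense to plot it on logarithmic axes.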

The other way of looking at this is what’s known as the ulp (unit in the last place), effectively the difference between consecutive Doubles. Around 1.0, they differ by around 2.2 x 10^-16, which is tiny by any standard. At 0.5 x 10^16, the difference is just 1.0, and when you get up to really huge Doubles at 1.0 x 10^32, consecutive Doubles are nearly 2 x 10^16 apart.
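
One way to see those gaps directly is to ask for the next representable Double and subtract. This is a minimal Python sketch, assuming Python 3.9 or later for `math.nextafter` and `math.ulp`; `gap_above` is a hypothetical helper name, not anything from a library.

```python
import math

def gap_above(x: float) -> float:
    """Distance from x to the next larger representable Double."""
    return math.nextafter(x, math.inf) - x

for x in (1.0, 0.5e16, 1.0e32):
    # For these values the measured gap agrees with math.ulp(x).
    print(f"gap above {x:g}: {gap_above(x):.3g} (math.ulp gives {math.ulp(x):.3g})")
```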

Negative Doubles behave as mirror images of these graphs: the closer they are to 0.0, the more there are, and the smaller their difference. The more negative they become, the fewer there are of them.
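
The mirror symmetry is easy to confirm with a small Python sketch, again assuming Python 3.9 or later for `math.nextafter`; the function name `gaps_mirror` is made up for this example.

```python
import math

def gaps_mirror(x: float) -> bool:
    """True if the gap just beyond -x equals the gap just beyond +x."""
    gap_positive = math.nextafter(x, math.inf) - x
    gap_negative = -x - math.nextafter(-x, -math.inf)
    return gap_positive == gap_negative

for x in (0.0001, 1.0, 0.5e16, 1.0e32):
    print(f"{x:g}: mirrored = {gaps_mirror(x)}")
```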

This uneven distribution comes back to bite when you perform arithmetic with large numbers, particularly when subtracting two large numbers whose difference is relatively small. Indeed, if that difference is small enough, the two large numbers might be represented by identical Doubles, making their difference 0.0. That’s a cancellation error.
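
Here's a sketch of that effect in Python, whose `float` is a 64-bit Double on common platforms. The values are chosen only for illustration, and `float_difference` is a hypothetical helper. Near 1.0 x 10^32, neighbouring Doubles are almost 2 x 10^16 apart, so adding 10^15 doesn't reach the next one.

```python
def float_difference(big: float, small: float) -> float:
    """Compute (big + small) - big entirely in Double arithmetic."""
    return (big + small) - big

# Mathematically the answer is 1e15, but big + small rounds back to big.
print(float_difference(1.0e32, 1.0e15))

# Python integers are exact, so the same sum done with them keeps the difference.
print((10**32 + 10**15) - 10**32)

# With a modest 'big', the small term survives almost intact.
print(float_difference(1.0, 1.0e-3))
```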

4 Comments


1

Andrew Reilly on April 27, 2021 at 8:12 am

Bravo! Always a good idea to understand how computers actually work. The way to think about floating point numbers is that they’re like the scientific notation setting on a calculator: there’s always an exponent displayed, and there’s always a fixed number of digits, only one of which is to the left of the decimal point. With floating point numbers in a computer the digits are binary: only one and zero, and the exponent is powers of two.
Modern processors also support a variation that does kind of use decimals (actual basis is thousands), mostly for the benefit of financial calculations, and for reducing user surprise in calculator-like applications, because fractions like 0.1 can be represented exactly (binary floats can’t represent them exactly, which can lead to _interesting_ result outputs). (Older computers, like some IBM mainframes, used a floating point based on 4-bit hexadecimal digits, which turned out to be more complicated than necessary.)
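
As a rough illustration of the 0.1 point, here is a Python sketch. Note the assumption: Python's `decimal` module does decimal arithmetic in software, so it stands in for the idea here and is not the hardware format described above.

```python
from decimal import Decimal

binary_sum = 0.1 + 0.2                          # binary Doubles
decimal_sum = Decimal("0.1") + Decimal("0.2")   # decimal arithmetic

print(binary_sum == 0.3)               # False: neither operand is stored exactly
print(decimal_sum == Decimal("0.3"))   # True: 0.1 is exact in decimal
print(Decimal(0.1))                    # the exact value the Double 0.1 really holds
```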

The rules of constant-precision have an exception very close to zero, the “denormalised numbers”, which are intended to provide a little more wriggle room for getting a useful answer in those difference-between-close-values situations.
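
A minimal Python sketch of that wriggle room, assuming Python 3.9 or later for `math.ulp`; the values of `x` and `y` are picked only to make the point. Two different Doubles just above the smallest normal value still give a non-zero difference, because the result lands among the denormalised numbers instead of flushing to 0.0.

```python
import math
import sys

smallest_normal = sys.float_info.min   # about 2.2e-308
smallest_denormal = math.ulp(0.0)      # about 4.9e-324

x = 1.5 * smallest_normal
y = 1.25 * smallest_normal
difference = x - y                     # a quarter of the smallest normal value

print(difference != 0.0)               # the difference is kept, not lost
print(difference < smallest_normal)    # it is a denormalised number
print(smallest_denormal / 2)           # below the last denormal there is only 0.0
```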

Some languages (Python) support “arbitrary precision” integers, where numbers can be represented (internally) by arrays of machine integers, and so aren’t limited by the usual 2^63 range.
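
A short Python sketch of the difference this makes; the variable names are only for illustration. The integer result keeps every digit, while the same values converted to Doubles lose the final +1.

```python
limit = 2**63
square = limit * limit        # 2**126, far beyond any 64-bit machine integer
big = square + 1

print(big)                            # all digits are kept
print(big - square)                   # exact: 1
print(float(big) - float(square))     # as Doubles the +1 has vanished: 0.0
```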

Some (some of the lisps and schemes) support arbitrary precision fractions too.
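
The comment names Lisps and Schemes; as a stand-in, Python's standard `fractions` module gives a rough sketch of the same idea, with exact numerators and denominators built on arbitrary-precision integers.

```python
from fractions import Fraction

tenth = Fraction(1, 10)

print(tenth + Fraction(2, 10) == Fraction(3, 10))   # exact rational arithmetic
print(0.1 + 0.2 == 0.3)                             # the Double equivalent fails
print(Fraction(0.1))                                # the exact ratio stored for the Double 0.1
```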

Those fancier variants trade quite a lot of speed for their additional reduction of surprise, but sometimes that’s the right trade-off.

- 2
  
  hoakley on April 27, 2021 at 8:55 pm
  
  Thank you.
  Yes, I did mention some of those variants in my article here last Saturday.
  Howard.
  
3

Ed on April 28, 2021 at 9:25 am

Summary: doubles store 16 precision digits :)

- 4
  
  hoakley on April 28, 2021 at 10:05 am
  
  Sure, but that doesn’t convey the consequences, such as cancellation errors, to most users.
  Howard
  
