I'm very annoyed by this title. It's a beginner's mistake, and FP numbers just are what they are -- a tradeoff between precision, range, and efficiency. If you use FP, you should be familiar with its intricacies in the same way a C programmer needs to be aware of unsigned overflows.
I wish that fixed-point numbers (as in Qx.y) were first-class citizens in programming languages, as well as saturation arithmetic.
Actually, I wonder whether anybody has done a performance/power comparison of floating point vs fixed point math on some common tasks, using modern CPUs with extensive FP support.
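For anyone unfamiliar with the Qx.y notation, here's a minimal sketch of what signed Q16.16 with saturating operations might look like. This is a toy illustration in Python, not a proposal for language syntax; the 32-bit container and the function names are my own assumptions.

    # Minimal sketch of signed Q16.16 fixed-point with saturating add/mul.
    FRAC_BITS = 16
    INT32_MAX = 2**31 - 1
    INT32_MIN = -2**31

    def _saturate(q: int) -> int:
        return max(INT32_MIN, min(INT32_MAX, q))

    def to_q16(x: float) -> int:
        return _saturate(round(x * (1 << FRAC_BITS)))

    def from_q16(q: int) -> float:
        return q / (1 << FRAC_BITS)

    def q_add(a: int, b: int) -> int:
        return _saturate(a + b)                       # clamp instead of wrapping on overflow

    def q_mul(a: int, b: int) -> int:
        return _saturate((a * b) >> FRAC_BITS)        # rescale the double-width product

    print(from_q16(q_mul(to_q16(1.5), to_q16(2.25))))          # 3.375, exact
    print(from_q16(q_add(to_q16(30000.0), to_q16(30000.0))))   # saturates just below 32768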
Floating point behaviors are well understood. This type of problem shouldn’t happen if your developers are qualified.
In Julia, there is the `isapprox` function to inexactly compare two numbers (you can use it in infix form: `x ≈ y`). By default, two numbers are considered approximately equal if their relative difference is less than `sqrt(eps(typeof(x)))` (around 1e-8 for 64-bit floating point numbers).
Using equality to compare two floating point numbers doesn't get you anywhere, especially if you are using mathematical functions (the implementation of `sin` or `exp` can vary across operating systems or software versions).
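For what it's worth, Python's standard library has an analogous `math.isclose`, though its default relative tolerance is 1e-09 rather than sqrt(eps). A quick illustration:

    import math

    a = 0.1 + 0.2
    print(a == 0.3)                              # False: exact equality fails
    print(math.isclose(a, 0.3))                  # True: within the default rel_tol of 1e-09
    print(math.isclose(a, 0.3, rel_tol=1e-16))   # False once the tolerance is tighter than the error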
This is an issue that has bitten people before. Indeed, there's probably been some discussion of it, and there might even be some common mitigations taught as "best practice" in some places.
Do we ever need to store floats? Using them in flight is one thing, but stored data so often needs to be some fixed precision.
Asking basic questions about floating-point in interviews remains to this day a good way to distinguish good and bad programmers.
You'd think everyone would know the basics, but even in stuff like finance, people routinely fuck up decimals.
And that's why you use a "currency"-style type, both for calculation in your programming language (Go in this article) and in the DB (PostgreSQL in this article). A fixed-width "floating" number that is actually your largest integer type (64 bits these days, though 128-bit libraries are plentiful as well) is your friend.
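A minimal sketch of the idea, with amounts carried as an integer count of cents end to end (the names and the tax rate here are just for illustration, not from the article):

    # Money as integer cents: addition and multiplication by integers stay exact;
    # rounding only happens where you explicitly choose it (e.g. applying a percentage).
    def to_cents(dollars: int, cents: int) -> int:
        return dollars * 100 + cents

    price = to_cents(19, 99)                   # $19.99
    subtotal = price * 3                       # 5997 cents, exact
    tax = (subtotal * 825 + 5000) // 10000     # 8.25% tax, rounded half up to the nearest cent
    total = subtotal + tax

    print(f"${total // 100}.{total % 100:02d}")   # $64.92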
Floating point arithmetic doesn't «suck», it's the code that treats floats as decimal numbers that sucks.
> We are sending the data to the warehouse as a float
There is your problem
Any vector addition done concurrently will result in indeterminate roundoff errors (to be precise, there might be O(N!) different possible outputs depending on the order of calculation); this is literally the point of the field of numerical analysis. But that's really the tip of the iceberg. Beware any mathematical operation done with numbers of vastly different orders of magnitude. Beware rolling your own matmul, especially if you can't guarantee some normalcy in the numbers in your matrices.
Calculating an average is fairly painless, you can create an upper bound on the possible error that isn't usually dangerous. This isn't the case for all computations.
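A quick Python illustration of the order dependence, plus `math.fsum` as one standard mitigation (a sketch, not a universal fix):

    import math

    # Same three numbers, two different orders of addition:
    print((1e16 + 1) - 1e16)      # 0.0 -- the 1 is absorbed when added to 1e16 first
    print((1e16 - 1e16) + 1)      # 1.0 -- reordering recovers it

    # For long sums, math.fsum gives a correctly rounded, order-independent result:
    xs = [0.1] * 10
    print(sum(xs))                # 0.9999999999999999
    print(math.fsum(xs))          # 1.0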
The vast majority of problems with floating point numbers are simply due to people not understanding number bases. Only a tiny subset is caused by people not understanding floats proper.
Consider the number 1/3. In ternary (base 3), you write that as 0.1, whereas in decimal (base 10) you write it as 0.3333... recurring. If you try to represent that number with a fixed number of decimal places, you have precision issues. E.g. decimal 0.333 converts to 0.02222220210... in ternary.
Now, the thing with that example is that we treat decimal as a special, privileged representation, so we accept that 1/3 doesn't have a finite representation in decimal as an entirely natural fact of life, while we treat that same problem converting between decimal and binary as a fundamental deficiency of binary.
Let's talk about why some numbers have finite representations while others don't. 53/100 is written as 0.53 in decimal. The general rule is that if the denominator is a power of ten, you just write the numerator and place the decimal mark so that there are as many digits to its right as the exponent of that power of ten. If the denominator is not a power of ten, you make it one: 1/2 turns into 5/10, which is 5 with one decimal place, or 0.5. Obviously, you can't actually do this for 1/3. There's no integer `a` where 3a is a power of ten. The general rule here is that if the denominator has any prime factors not present in the base, you don't have a finite representation. 4/25 = 16/100 = 0.16 has a finite representation, but 5/7 and 13/3 don't.
Now, because 3 is coprime with 10, no number (other than an integer) with a finite ternary representation can have a finite decimal representation (and vice versa), and we're used to it. Where binary vs decimal becomes tricky and confuses people is that 10 = 2 * 5, so numbers that can be expressed with a power-of-two denominator have finite representations in both binary and decimal, so you can convert some numbers back and forth with no loss of precision. Numbers with a factor of 5 somewhere in their denominator can have finite decimal representations (1/5 = 2/10 = 0.2), but can't have finite representations in binary. 0.1 = 1/10 = 1/(2 * 5) and you can't get rid of that five. And that's why everybody gets bitten by 0.1 seeming to be broken.
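You can see this directly in Python by asking for the exact value that the literal 0.1 actually stores; the Fraction and Decimal constructors both expose it:

    from fractions import Fraction
    from decimal import Decimal

    # 0.5 and 0.25 have power-of-two denominators, so they round-trip exactly:
    print(Fraction(0.5), Fraction(0.25))      # 1/2 1/4

    # 0.1 = 1/(2*5) can't be a finite binary fraction; the stored value is the
    # nearest representable double, whose denominator is a power of two:
    print(Fraction(0.1))   # 3602879701896397/36028797018963968
    print(Decimal(0.1))    # 0.1000000000000000055511151231257827021181583404541015625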
Ancient COBOL had binary coded decimal.
Yep, floats should be considered dangerous.
$ python3 -c 'print(.1 + .2)'
0.30000000000000004
Everyone knows this from using calculators in school; even non-programmers know this.
Obligatory: What Every Computer Scientist Should Know About Floating-Point Arithmetic
https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.h...
The problem is that we're still stuck with only binary floating point types in our CPUs, compilers, and runtime environments, over a decade after IEEE 754 standardized decimal floating point formats [1].
Once we finally move away from binary floats, these nasty lossy conversions between binary and decimal exponents will end (as will the horribly complicated scanf and printf algorithms).
This is a solved problem. Our actual problem now is momentum of adoption.
I've anticipated that this will eventually happen (hopefully sooner rather than later), and proactively marked binary floats as "legacy formats" [2].
[1] https://en.wikipedia.org/wiki/Decimal64_floating-point_forma...
[2] https://github.com/kstenerud/concise-encoding/blob/master/ce...
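Software decimal arithmetic is already available in most languages in the meantime, e.g. Python's decimal module (arbitrary-precision decimal per the General Decimal Arithmetic spec rather than hardware decimal64, but the same idea):

    from decimal import Decimal

    # Construct from strings so the decimal digits are taken literally:
    a = Decimal("0.1") + Decimal("0.2")
    print(a)                     # 0.3
    print(a == Decimal("0.3"))   # True: no binary/decimal conversion loss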