Double Equality
Today I was once again hit by a simple question: when are two double
(or float) values equal?
Intro
Please note that I refer to double in the following text, but everything said is
also true for float, and not only for Java: most programming languages share some of these
problems.
If you get here you are probably familiar with the problem:
the simple question “are two double values equal?” does not have a simple answer
on a computer. The standard ANSI / IEEE Std 754, which nowadays is used for all
floating point arithmetic on computers and is implemented directly in their processors,
defines some special numbers:
- There is not just one zero, there is a positive and a negative zero. The idea behind that is that you can distinguish from which side an underflow was reached, and that you can decide whether dividing by the given value results in positive or negative infinity. Still the standard says that they represent the same number and have to be handled as if they are equal.
- So there are two infinities, too: positive and negative infinity. Again, dividing a finite number by either will give positive or negative zero.
- Last there is also NaN (Not a Number), which you get e.g. by calculating 0.0/0.0. Indeed, there is not only one such number, but a whole universe of them, which is sometimes used to hide data inside a double (a technique known as NaN Boxing). Nearly any math calculation with a NaN involved will result in NaN. The surprising fact regarding NaN is that nearly all comparisons where NaN is involved result in false, because it is considered out of the range of numbers. So it is unequal to any number including itself: Double.NaN == Double.NaN is false. And it does not compare, so the same happens for <, <=, >=, and >. Only Double.NaN != Double.NaN is true.
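The special values from the list above can be checked directly. The following is only a minimal sketch; the class name SpecialValues and its helper method reciprocal are made up for this illustration:

```java
// Illustration only: class and method names are invented for this sketch.
class SpecialValues {
    /** The sign of a zero only becomes visible when dividing by it. */
    static double reciprocal(double v) {
        return 1.0 / v;
    }

    public static void main(String[] args) {
        double nan = 0.0 / 0.0;                              // one way to produce a NaN

        System.out.println(-0.0 == 0.0);                     // true: both zeros compare as equal
        System.out.println(reciprocal(0.0));                 // Infinity
        System.out.println(reciprocal(-0.0));                // -Infinity
        System.out.println(1.0 / Double.NEGATIVE_INFINITY);  // -0.0
        System.out.println(nan == nan);                      // false
        System.out.println(nan < 1.0 || nan >= 1.0);         // false: NaN does not compare
        System.out.println(nan != nan);                      // true
    }
}
```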
It’s obvious that this last feature of NaN makes it hard to check two floating
point values for equality. Keeping the above in mind, the following simple
solution will return false when called with two NaN values:
// DON'T USE THIS!
static boolean areEqual(double d1, double d2)
{
  return d1 == d2;
}
Please note that there is a related question: is a double value equal to a constant?
In this case using == is no problem, as you know that at least one of the values
is not NaN (if you don’t use NaN deliberately, but why should you).
So you can still use something like if (x == 3.0) (see chapter
A Last Warning below for why this is still a bad idea).
Standard Solution
The standard solution in Java is falling back to
java.lang.Double#compare(double, double). If it returns 0, both
values are considered equal:
// YOU SHOULDN'T USE THIS! But sadly that's how Java works.
static boolean areEqual(double d1, double d2)
{
  return Double.compare(d1, d2) == 0;
}
It does indeed avoid the basic NaN quirk, but let’s see how it is implemented:
// CITATION FROM JAVA SOURCECODE OF java.lang.Double
public static int compare(double d1, double d2) {
if (d1 < d2)
return -1; // Neither val is NaN, thisVal is smaller
if (d1 > d2)
return 1; // Neither val is NaN, thisVal is larger
// Cannot use doubleToRawLongBits because of possibility of NaNs.
long thisBits = Double.doubleToLongBits(d1);
long anotherBits = Double.doubleToLongBits(d2);
return (thisBits == anotherBits ? 0 : // Values are equal
(thisBits < anotherBits ? -1 : // (-0.0, 0.0) or (!NaN, NaN)
1)); // (0.0, -0.0) or (NaN, !NaN)
}
The implementor added helpful comments, so its workings should be clear.
For two equal values, or if at least one is NaN, everything falls through,
and then the bits are compared directly. It will indeed return 0
when called with two Double.NaN values. All NaN values will be placed after
everything else including the infinities.
But there is a remaining problem. First the good news: as said there is a myriad of possible
NaN values (exactly 9,007,199,254,740,990 in a 64-bit double precision value,
split almost evenly between quiet and signalling NaNs), but the compare method above
does not differ between them. It uses Double.doubleToLongBits() and not
Double.doubleToRawLongBits(), and the former collapses every NaN to one canonical
bit pattern. This agrees with Double.isNaN(), which returns true for any of them.
In general the rest of Java is basically designed as if there is only one NaN value.
The real problem is that compare will still differ between the two 0.0
values, and this is quite unnatural as they represent the same mathematical number,
and the standard == comparison will consider them the same.
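To see what Double.compare() actually returns for the special values, here is a small sketch. The class CompareDemo and the method compareEqual are invented names, and the second NaN is built from a hand-picked bit pattern purely for illustration:

```java
// Illustration only: class and method names are invented for this sketch.
class CompareDemo {
    /** Equality as defined by Double.compare(). */
    static boolean compareEqual(double d1, double d2) {
        return Double.compare(d1, d2) == 0;
    }

    public static void main(String[] args) {
        // A NaN intended to carry a bit pattern other than the canonical Double.NaN
        double otherNaN = Double.longBitsToDouble(0x7ff8000000000001L);

        System.out.println(compareEqual(Double.NaN, Double.NaN)); // true
        System.out.println(compareEqual(Double.NaN, otherNaN));   // true
        System.out.println(compareEqual(-0.0, 0.0));              // false
        System.out.println(Double.compare(-0.0, 0.0) < 0);        // true: -0.0 sorts first
        System.out.println(
            Double.compare(Double.POSITIVE_INFINITY, Double.NaN) < 0); // true: NaN sorts last
    }
}
```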
Improved Solution
The following solution, which is contained in the de.caff.generics.Primitives
class in the generics module of my de·caff Commons,
provides what I consider the most natural equality check. It is shown
for both double and float, and you don’t need the whole jar: you can
easily copy the following into your own project. It collapses both positive and negative
zero as well as all possible NaN values.
/**
* Are two primitive {@code double} values equal?
* <p>
* Thanks to the underlying standard ANSI / IEEE Std 754-1985 defining floating point arithmetic
* {@code Double.NaN} is not equal to anything, not even to itself. This is often not what you want.
* {@link java.lang.Double#compare(double, double)} falls back to the binary representation in
* some cases, which again is not always a good idea as it differs between {@code -0.0} and
* {@code +0.0} which are considered not equal, although they represent the same mathematical number.
* <p>
* This method considers two values equal if they both represent the same number.
* So two {@code NaN} values are equal, and both {@code 0.0} values are equal in any combination, too.
* This seems the most natural way to compare doubles.
* @param v1 first value to compare
* @param v2 second value to compare
* @return {@code true}: if both values are considered equal according to the above definition<br>
* {@code false}: otherwise
*/
public static boolean areEqual(double v1, double v2)
{
if (v1 == v2) { // this handles -0.0 == 0.0 correctly
return true;
}
return Double.isNaN(v1) && Double.isNaN(v2); // make any 2 NaN values also equal
}
/**
* Are two primitive {@code float} values equal?
* <p>
* Thanks to the underlying standard ANSI / IEEE Std 754-1985 defining floating point arithmetic
* {@code Float.NaN} is not equal to anything, not even to itself. This is often not what you want.
* {@link java.lang.Float#compare(float, float)} falls back to the binary representation in
* some cases, which again is not always a good idea as it differs between {@code -0.0f} and
* {@code +0.0f} which are considered not equal, although they represent the same mathematical number.
* <p>
* This method considers two values equal if they both represent the same number.
* So two {@code NaN} values are equal, and both {@code 0.0f} values are equal in any combination, too.
* This seems the most natural way to compare floats.
* @param v1 first value to compare
* @param v2 second value to compare
* @return {@code true}: if both values are considered equal according to the above definition<br>
* {@code false}: otherwise
*/
public static boolean areEqual(float v1, float v2)
{
if (v1 == v2) { // this handles -0.0f == 0.0f correctly
return true;
}
return Float.isNaN(v1) && Float.isNaN(v2); // make any 2 NaN values also equal
}
Porting to other languages should be dead simple: the library method isNaN(v)
just returns the result of v != v, which is only true for NaN values.
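As a rough sketch of such a port, here is the same check written without any library call. It stays in Java for consistency with the rest of this text, and the class name PortableEquality is just a placeholder:

```java
// Illustration only: a library-free variant of the equality check above.
class PortableEquality {
    /** NaN check without library support: only NaN is unequal to itself. */
    static boolean isNaN(double v) {
        return v != v;
    }

    /** Equal if both are the same number (including both zeros) or both are NaN. */
    static boolean areEqual(double v1, double v2) {
        return v1 == v2 || (isNaN(v1) && isNaN(v2));
    }

    public static void main(String[] args) {
        System.out.println(areEqual(-0.0, 0.0));               // true
        System.out.println(areEqual(Double.NaN, 0.0 / 0.0));   // true
        System.out.println(areEqual(Double.NaN, 1.0));         // false
        System.out.println(areEqual(1.5, 1.5));                // true
    }
}
```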
Still this does not handle hashing (which is also broken in that it differs between -0
and +0, although all possible NaN values do get the same hash code) and sorting.
Thanks to automatic boxing any double value can become a Double
with different behavior (see examples in the next section), which makes
it impossible to circumvent these problems without considerable overhead, even if you want to.
The Primitives class above still provides methods for hashing double and float values in
a way that agrees with the equality methods above.
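What such a hash method could look like is sketched below. This is not the actual code from the Primitives class but my own assumption of a minimal version, with HashSketch and hash as invented names. It relies on Double.hashCode(double) being based on doubleToLongBits(), which already collapses all NaN values, so only the two zeros need special handling:

```java
// Assumption: a minimal hash that agrees with areEqual(double, double);
// not the real de.caff.generics.Primitives implementation.
class HashSketch {
    static int hash(double v) {
        if (v == 0.0) {   // true for both -0.0 and +0.0
            v = 0.0;      // normalize to positive zero
        }
        return Double.hashCode(v);   // all NaN values share one hash code here
    }

    public static void main(String[] args) {
        System.out.println(hash(-0.0) == hash(0.0));                         // true
        System.out.println(Double.hashCode(-0.0) == Double.hashCode(0.0));   // false
    }
}
```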
A Dark Corner of Java
The standard solution for equality checks in Java is the comparison method mentioned above.
This leads to a dark corner because it makes java.lang.Double and double behave differently
and can lead to subtle bugs.
All asserts in the following code are fulfilled:
// === Negative and positive zero ===
final double nz = -0.0; // negative zero
final double pz = +0.0; // positive zero, same as 0.0
final Double abNZ = nz; // automatically boxed negative zero (i.e. Double.valueOf())
final Double abPZ = pz; // automatically boxed positive zero (i.e. Double.valueOf())
// Equality
assert nz == pz; // equal because they represent the same number (IEEE 754)
assert !abNZ.equals(abPZ); // but Java considers them to be different when boxed
// Hashing
assert Double.hashCode(nz) != Double.hashCode(pz); // hashcodes differ
assert Double.hashCode(nz) == abNZ.hashCode();
assert Double.hashCode(pz) == abPZ.hashCode();
// Comparison
assert !(nz < pz); // still the same, so neither is smaller or greater
assert abNZ.compareTo(abPZ) < 0; // Java thinks differently
// Automatic boxing/unboxing
assert abNZ == pz && !abNZ.equals(pz); // same, but not equal
// === NaN ===
final double nan = Double.NaN;
final Double bNaN = nan;
assert bNaN != nan && bNaN.equals(nan); // deliberately: different values, but equal
Thanks to automatic boxing and unboxing, especially the line commented with “same, but not equal” is basically pure evil: these two things are identical, but they are not equal.
As a result any Map with Double keys (or Set with Double values) will show
strange behavior if you use both -0.0 and +0.0. This may easily happen
if the key comes from a calculation which results in an underflow of the possible range.
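The effect on collections can be reproduced with a few lines. In this sketch the class ZeroKeys, the helper toSet and the underflowing multiplication are arbitrary choices made just for the demonstration:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: class and method names are invented for this sketch.
class ZeroKeys {
    /** Collects the given values in a HashSet, boxing each one to a Double. */
    static Set<Double> toSet(double... values) {
        Set<Double> set = new HashSet<>();
        for (double v : values) {
            set.add(v);
        }
        return set;
    }

    public static void main(String[] args) {
        double underflow = -1e-320 * 1e-10;   // too small for a double: becomes -0.0

        System.out.println(underflow == 0.0);              // true
        System.out.println(toSet(underflow, 0.0).size());  // 2: treated as different values

        Map<Double, String> map = new HashMap<>();
        map.put(0.0, "zero");
        System.out.println(map.get(underflow));            // null: the lookup misses
    }
}
```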
As this behavior has been codified deep inside the implementation of java.lang.Double
since Java 1.0, there
is not much hope that the above will ever change. Software might depend on it, although I’m pretty sure
that for more than 99% of usages this behavior regarding the two zero values is unwanted, unexpected
and therefore dangerous.
A Last Warning
In most cases you should not check floating point values for exact equality as is done above.
Usually floating point calculations introduce inaccuracies, so you should always be prepared
for deviations by comparing with a small margin, usually a small number named epsilon.
It is often not simple to decide on a good epsilon, as it depends on the expected range of handled numbers.
Nevertheless, for completeness here are also implementations of the good way to compare floating point numbers:
/**
* Are two primitive {@code double} values nearly equal?
* @param v1 first value to compare
* @param v2 second value to compare
* @param eps allowed deviation for both values to be still considered equal, a small non-negative number
* @return {@code true}: if both values are considered equal within the given allowance<br>
* {@code false}: if they differ too much
*/
public static boolean areEqual(double v1, double v2, double eps)
{
assert eps >= 0.0;
return Math.abs(v1 - v2) <= eps;
}
/**
* Are two primitive {@code float} values nearly equal?
* @param v1 first value to compare
* @param v2 second value to compare
* @param eps allowed deviation for both values to be still considered equal, a small non-negative number
* @return {@code true}: if both values are considered equal within the given allowance<br>
* {@code false}: if they differ too much
*/
public static boolean areEqual(float v1, float v2, float eps)
{
assert eps >= 0.0f;
return Math.abs(v1 - v2) <= eps;
}
Using an epsilon of 0.0 in the above methods would also provide an exact equality
method which usually does okay, but involves a few more calculations. As expected,
a NaN value for any parameter will result in false, i.e. not equal, which again
implies NaN != NaN. But as you use these methods on calculated and/or expected values,
calling them with NaN would be a sign of a problem elsewhere, and the assertions
(if enabled) would take care of the stupid idea of making epsilon NaN.
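A short sketch may show why the margin matters and why a good epsilon depends on the range of the numbers. The method areEqual repeats the absolute check from above (without the assertion), while areEqualRelative is a hypothetical variant of my own that scales the margin with the magnitude of the values. It is one common option, not part of the Primitives class:

```java
// Illustration only: EpsilonDemo and areEqualRelative are invented for this sketch.
class EpsilonDemo {
    /** Absolute margin, same logic as areEqual(double, double, double) above. */
    static boolean areEqual(double v1, double v2, double eps) {
        return Math.abs(v1 - v2) <= eps;
    }

    /** Hypothetical variant: margin scaled by the larger magnitude of both values. */
    static boolean areEqualRelative(double v1, double v2, double relEps) {
        return Math.abs(v1 - v2) <= relEps * Math.max(Math.abs(v1), Math.abs(v2));
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                                  // false
        System.out.println(areEqual(sum, 0.3, 1e-9));                    // true
        System.out.println(areEqual(Double.NaN, Double.NaN, 1e-9));      // false

        double big = 1e20;
        System.out.println(areEqual(big, big + 1e5, 1e-9));              // false
        System.out.println(areEqualRelative(big, big + 1e5, 1e-12));     // true
    }
}
```

An absolute epsilon that works well around 1.0 is far too strict for values around 1e20, which is what the last two lines show.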