Language Design: Equality & Identity
Part 4: Fixing Haskell
Please refrain from linking this page on community foris of any language mentioned herein.
Basic goals, as mentioned in the previous parts:
You should not lose values inside a data structure.
Here is the simple example again, demonstrating the issue:
elem (0.0/0.0) [0.0/0.0] -- False
To be clear, elem
is picked as the simplest example possible.1
Status Quo
Why is Eq
not doing its job, or rather – what is its job description in the first place?
According to Data.Eq
, not much:
The Haskell Report defines no laws for Eq. However, instances are encouraged to follow these properties: […]
Why does Eq
provide no laws, compared to many other typeclasses that state them strongly and explicitly?
The reason becomes obvious if one scrolls though the list of typeclass instances. If the “encouraged properties” were upgraded to “laws”, Haskell would have law-breaking typeclass instances in one of its most central modules.
The fault of Haskell’s Eq
is that it lives in a world in which there is only a single definition of equality per type, which is incorrect in general. Case in point: floating point numbers, which have multiple definitions (see IEE754 §5.10 and §5.11).
There can be multiple useful definitions of equality per type, and – depending on the use-case – having only one to pick is not sufficient.
Even simple functions like list’s elem
might require both equality and identity operations to work correctly,
which means the frequently suggested newtype hack to swap one implementation of Eq
for another one is insufficient.
Available Options
The key point is that only two of the three properties can be satisfied:
- correct data structures
- abstraction/polymorphism
- a single “universal” equality function
The status quo is discarding property 1.:
- Data structures which lose values, example:
elem (0.0/0.0) [0.0/0.0]
returnsFalse
. - Polymorphic data structures like
List
s andVector
s. - A single equality function
==
onEq
.
Discarding property 2. would give you:
- (Some) correctly working data structures, example:
elem (0.0/0.0) someFloatList
returnsTrue
. - Specialized data structures like
FloatList
s orDoubleVector
s (likely in addition to polymorphic variants), which implement functions likeelem
using bothEq
’s==
and functions specific toFloat
s orDouble
s. Semantics would differ between polymorphic and specialized variants. - A single equality function
==
onEq
, and type-specific functions in specialized data structures.
Discarding property 3. would give you:
- Correctly working data structures, example:
elem (0.0/0.0) [0.0/0.0]
returnsTrue
. - Polymorphic data structures like
List
s andVector
s. - An identity function (
===
), in addition toEq
s existing equality function (==
).
Here is how the last approach can be implemented:
A Solution
The simplest fix (instead of introducing a new typeclass for identity) is to extend Eq
as follows:
class Eq a where
(==), (/=), (===), (/==) :: a -> a -> Bool
x == y = not (x /= y)
x /= y = not (x == y)
x === y = x == y -- new
x /== y = not (x === y) -- new
Nothing would change for 99% of Eq
’s instances, but the typeclass instances for Float
, Double
, … would now be defined as:
instance Eq Float where
(==) = eqFloat
(===) = Numeric.IEEE.identicalIEEE -- new
Then change the documentation replacing the “expected properties” of ==
and /=
with “laws” of ===
and /==
.
As a last exercise, adjust usages of x == y
with x == y || x === y
where appropriate (e. g. list’s elem
).
And there you have it:
- strong assurances instead of “it would be nice”
- lawful instances where there were none before
- more reliable behavior of data structures
- no need to spin new types for things which should work out-of-the-box, because deriving
Eq
just works
-
Even arguing that
False
is the correct result of the code example does not detract from the points being made, as there are plenty of other examples. TakingOrd
’s issues into account, which are very similar, would add even more examples. ↩