Language Design: Equality & Identity

Part 4: Fixing Haskell

Published on 2019-03-14. Last updated on 2022-06-09

Basic goals, as mentioned in the previous parts:

You should not lose values inside a data structure.

Here is the simple example again, demonstrating the issue:

elem (0.0/0.0) [0.0/0.0] -- False

To be clear, elem is picked as the simplest example possible.¹

Status Quo

Why is Eq not doing its job, or rather – what is its job description in the first place? According to Data.Eq, not much:

The Haskell Report defines no laws for Eq. However, instances are encouraged to follow these properties: […]

Why does Eq provide no laws, compared to many other typeclasses that state them strongly and explicitly?

The reason becomes obvious if one scrolls though the list of typeclass instances. If the “encouraged properties” were upgraded to “laws”, Haskell would have law-breaking typeclass instances in one of its most central modules.

The fault of Haskell’s Eq is that it lives in a world in which there is only a single definition of equality per type, which is incorrect in general. Case in point: floating point numbers, which have multiple definitions (see IEE754 §5.10 and §5.11).

There can be multiple useful definitions of equality per type, and – depending on the use-case – having only one to pick is not sufficient. Even simple functions like list’s elem might require both equality and identity operations to work correctly, which means the frequently suggested newtype hack to swap one implementation of Eq for another one is insufficient.

Available Options

The key point is that only two of the three properties can be satisfied:

correct data structures
abstraction/polymorphism
a single “universal” equality function

The status quo is discarding property 1.:

Data structures which lose values, example: elem (0.0/0.0) [0.0/0.0] returns False.
Polymorphic data structures like Lists and Vectors.
A single equality function == on Eq.

Discarding property 2. would give you:

(Some) correctly working data structures, example: elem (0.0/0.0) someFloatList returns True.
Specialized data structures like FloatLists or DoubleVectors (likely in addition to polymorphic variants), which implement functions like elem using both Eq’s == and functions specific to Floats or Doubles. Semantics would differ between polymorphic and specialized variants.
A single equality function == on Eq, and type-specific functions in specialized data structures.

Discarding property 3. would give you:

Correctly working data structures, example: elem (0.0/0.0) [0.0/0.0] returns True.
Polymorphic data structures like Lists and Vectors.
An identity function (===), in addition to Eqs existing equality function (==).

Here is how the last approach can be implemented:

A Solution

The simplest fix (instead of introducing a new typeclass for identity) is to extend Eq as follows:

class  Eq a  where
  (==), (/=), (===), (/==) :: a -> a -> Bool

  x ==  y              = not (x /= y)
  x /=  y              = not (x == y)
  x === y              = x == y         -- new
  x /== y              = not (x === y)  -- new

Nothing would change for 99% of Eq’s instances, but the typeclass instances for Float, Double, … would now be defined as:

instance Eq Float where
  (==)  = eqFloat
  (===) = Numeric.IEEE.identicalIEEE  -- new

Then change the documentation replacing the “expected properties” of == and /= with “laws” of === and /==.

As a last exercise, adjust usages of x == y with x == y || x === y where appropriate (e. g. list’s elem).

And there you have it:

strong assurances instead of “it would be nice”
lawful instances where there were none before
more reliable behavior of data structures
no need to spin new types for things which should work out-of-the-box, because deriving Eq just works

Even arguing that False is the correct result of the code example does not detract from the points being made, as there are plenty of other examples. Taking Ord’s issues into account, which are very similar, would add even more examples. ↩