When writing some code using UndecidableInstances
earlier, I ran into something that I found very odd. I managed to unintentionally create some code that typechecks when I believed it should not:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}
data Foo = Foo
class ConvertFoo a b where
convertFoo :: a -> b
instance (ConvertFoo a Foo, ConvertFoo Foo b) => ConvertFoo a b where
convertFoo = convertFoo . (convertFoo :: a -> Foo)
evil :: Int -> String
evil = convertFoo
Specifically, the convertFoo
function typechecks when given any input to produce any output, as demonstrated by the evil
function. At first, I thought that perhaps I managed to accidentally implement unsafeCoerce
, but that is not quite true: actually attempting to call my convertFoo
function (by doing something like evil 3
, for example) simply goes into an infinite loop.
I sort of understand what’s going on in a vague sense. My understanding of the problem is something like this:
- The instance of
ConvertFoo
that I have defined works on any input and output,a
andb
, only limited by the two additional constraints that conversion functions must exist froma -> Foo
andFoo -> b
. - Somehow, that definition is able to match any input and output types, so it would seem that the call to
convertFoo :: a -> Foo
is picking the very definition ofconvertFoo
itself, since it’s the only one available, anyway. - Since
convertFoo
calls itself infinitely, the function goes into an infinite loop that never terminates. - Since
convertFoo
never terminates, the type of that definition is bottom, so technically none of the types are ever violated, and the program typechecks.
Now, even if the above understanding is correct, I am still confused about why the whole program typechecks. Specifically, I would expect the ConvertFoo a Foo
and ConvertFoo Foo b
constraints to fail, given that no such instances exist.
I do understand (at least fuzzily) that constraints don’t matter when picking an instance—the instance is picked based solely on the instance head, then constraints are checked—so I could see that those constraints might resolve just fine because of my ConvertFoo a b
instance, which is about as permissive as it can possibly be. However, that would then require the same set of constraints to be resolved, which I think would put the typechecker into an infinite loop, causing GHC to either hang on compilation or give me a stack overflow error (the latter of which I’ve seen before).
Clearly, though, the typechecker does not loop, because it happily bottoms out and compiles my code happily. Why? How is the instance context satisfied in this particular example? Why doesn’t this give me a type error or produce a typechecking loop?
I wholeheartedly agree that this is a great question. It speaks to how our intuitions about typeclasses differ from the reality.
Typeclass surprise
To see what is going on here, going to raise the stakes on the type signature for
evil
:Clearly the
Covert a b
instance is being chosen as there is only one instance of this class. The typechecker is thinking something like this:Convert a X
is true if...Convert a X
is true [true by assumption]Convert X X
is trueConvert X X
is true if...Convert X X
is true [true by assumption]Convert X X
is true [true by assumption]Convert X b
is true if...Convert X X
is true [true from above]Convert X b
is true [true by assumption]The typechecker has surprised us. We do not expect
Convert X X
to be true as we have not defined anything like it. But(Convert X X, Convert X X) => Convert X X
is a kind of tautology: it is automatically true and it is true no matter what methods are defined in the class.This might not match our mental model of typeclasses. We expect the compiler to gawk at this point and complain about how
Convert X X
cannot be true because we have defined no instance for it. We expect the compiler to stand at theConvert X X
, to look for another spot to walk to whereConvert X X
is true, and to give up because there is no other spot where that is true. But the compiler is able to recurse! Recurse, loop, and be Turing-complete.We blessed the typechecker with this capability, and we did it with
UndecidableInstances
. When the documentation states that it is possible to send the compiler into a loop it is easy to assume the worst and we assumed that the bad loops are always infinite loops. But here we have demonstrated a loop even deadlier, a loop that terminates – except in a surprising way.(This is demonstrated even more starkly in Daniel's comment:
.)
The perils of undecidability
This is the exact sort of situation that
UndecidableInstances
allows. If we turn that extension off and turnFlexibleContexts
on (a harmless extension that is just syntactic in nature), we get a warning about a violation of one of the Paterson conditions:"No smaller than instance head," although we can mentally rewrite it as "it is possible this instance will be used to prove an assertion of itself and cause you much anguish and gnashing and typing." The Paterson conditions together prevent looping in instance resolution. Our violation here demonstrates why they are necessary, and we can presumably consult some paper to see why they are sufficient.
Bottoming out
As for why the program at runtime infinite loops: There is the boring answer, where
evil :: a -> b
cannot but infinite loop or throw an exception or generally bottom out because we trust the Haskell typechecker and there is no value that can inhabita -> b
except bottom.A more interesting answer is that, since
Convert X X
is tautologically true, its instance definition is this infinite loopWe can similarly expand out the
Convert A B
instance definition.This surprising behavior, and how constrained instance resolution (by default without extensions) is meant to be as to avoid these behaviors, perhaps can be taken as a good reason for why Haskell's typeclass system has yet to pick up wide adoption. Despite its impressive popularity and power, there are odd corners to it (whether it is in documentation or error messages or syntax or maybe even in its underlying logic) that seem particularly ill fit to how we humans think about type-level abstractions.