Type inference for Haskell, part 15
This is part 15 of the series on type inference for Haskell.
Welcome to the third and final section of this series. The first section covered an algorithm for basic type inference. In the second section, that algorithm was extended with polymorphism, recursive bindings, explicitly typed bindings, kind checking, and pattern matching. This third section will cover the biggest extension, type classes.
From here on out, the code referenced will all come from the Typing Haskell in Haskell paper itself. The code in previous posts attempted to imitate the paper’s code as closely as possible, so many functions there should look familiar. The website for the paper does include a downloadable copy of the code, however it no longer works with modern Haskell compilers (it is nearly 20 years old, after all!). You can find an updated copy here.
In this post, I’ll start with a informally describing type classes, the rules placed on them, and how they behave.
One last note before getting started. There are many extensions to Haskell’s type system, and a lot of them have to do with type classes. Examples include multi-parameter type classes and functional dependencies. The discussion here does not cover these extensions. The language without any extensions offers more than enough complexiy on its own!
Remember that polymorphic functions are those that can take on different types in each place they are used. For example:
head :: [a] -> a
is the type of a function that returns the first element of a list. This function is polymorphic because it can be applied to a list of any type. The
a stands in for a as-of-yet unknown type.
What about functions that need to do some operation on the unknown type? For example, what about a function that finds the least element in a list?
minOfList [x] = x minOfList (x:xs) = let mxs = minOfList xs in if x < mxs then x else xms
Say the type was
minOfList :: [a] -> a
What would happen if it were given an list of things that cannot be compared with
<? I suppose that a runtime error is the best you could do. This is why type classes are useful. They allow code to request that a type support some function without needing to specify what that type is.
The correct type for
minOfList :: (Ord a) => [a] -> a
(Ord a) part specifies that whatever type gets used in the place of
a must have an instance of the
Ord class. (
Ord provides comparison operations like
<.) This part is called a class predicate, or just predicate for short. It’s a logical predicate in the sense that it holds true whenever
(Ord a) => in the type almost looks like an extra argument. If you read about dictionary passing (which is outside of the scope of this series), you’ll see why this is no mistake.
The parts of the puzzle
Depending on how much you’ve used Haskell, there is a good chance you’ve created your own typeclass before. Still, you might find some details in this section you weren’t aware of.
In order to use a class, it first has to be declared. That declaration includes:
- A unique class name.
- A type variable.
- One or more method type declarations.
- Zero or more superclasses.
For example, the definition of
Ord look roughly like:
Ord definition has more methods, but this is enough for demonstration purposes.)
Let’s break down what is in this definition. This is declaring a class called
Ord. It uses the type variable
a to stand for some type implementing this class. The
(Eq a) => part says that there is one superclass,
Eq. This means that all types implementing
Ord must also have an implementation of
Eq and that the implementations of
Ord may implicitly depend on
Eq without declaring it. The superclass hierarchy must be acyclic.
where keyword, the types of the methods in the class are declared. Note how they involve the type variable
a that was declared before. It’s required that the type of each method involve the type variable somewhere (that’s important for deciding which instance is the right one to use). Note that the type variable doesn’t necessarily need to appear in one of the argument positions. For example, this class is silly, but completely valid even though
a only appears in the return type:
One detail on the allowed types for the methods: They are allowed to have class predicates on any type variable other than then one that is the instance of the class. For example, this is valid:
But this is not (at least, not without enabling non-standard language features):
It will fail to compile with
Constraint 'Show a' in the type of 'someMethod' constrains only the class type variables.
Lastly, it’s OK for those predicates to contain cyclic references to the class. This will compile:
I’ll bet at least something here was a feature of typeclasses you didn’t know of before.
In order to use a class for anything, you probably want an instance of it. An instance contains three things:
- Which class this is an instance of.
- What type this instance is for.
- Zero or more predicates on that type.
- An implementation for each of the methods.
For example, one possible instance of the silly class above is:
This is saying that there is an instance of the class
ListOf for the type
Bool, and provides the logic for implementing the method. Note that you are actually not allowed to declare the type of the
listOf function here. The combination of the type in the class declaration along with the type
Bool is enough for the compiler to tell what type it is supposed to have.
You can optionally have some predicates on the type that the implementation is for. As an example, look at this class and instance:
This says that in order to use the
PrettyPrint class on the
Either type, the first type
Either is parameterized with must implement
Show and the second must implement
PrettyPrint. Unlike the superclass hierarchy, this is allowed to contain cyclic references to classes.
Any given class can only be implemented for a given type once. For example, you can’t have two different
instance ListOf Bool where declarations in the same module. In fact, you cannot even have the type “overlap” with the type for any other instance. Two types overlap if, even though they might not be identical, they share some types in common. For example,
(a, Int) and
(Bool, b) overlap because
(Bool, Int) is an instance of both of them. This usually isn’t a problem unless you have the
FlexibleInstances language feature enabled, because by default
(a, Int) and
(Bool, b) aren’t valid types for an instance declaration.
The reason those types aren’t valid is because the types are required to be a type constructor (so it can’t be just a type variable) instantiated with zero or more type variables. The problem with both
(a, Int) and
(Bool, b) is that the type constructor for a 2-tuple (which is spelled
(,,) in Haskell) is being instantiated with another type constructor (
Int in the first case) and not just type variables. As another example,
[a] is a valid type for an instance but
[String] is not.
For more details, see Class and instance declarations in the GHC users guide.
How type classes are used
When code uses a function from a typeclasses, the only thing that might change about the code is that its binding (the variable or function declaration) might get some predicates attached to its type.
As always, examples are best. Let’s start with an example that doesn’t use type classes in order to see how it changes when type classes are added. Imagine we had a concrete
showInt function that converts integers to strings. I’ll write these functions along with the type that gets inferred for them.
Now let’s generalize that function to any showable input type with the help of the
show function from the
Remarkably little has changed. The
[Int] part of the type was generalized to
[a], and a predicate
(Show a) was added. How did it know that was the right type? It knows that the argument is a list because of the pattern matching in the definition of the function. The type inside the list is left a generic
a because nothing specifies the type any more than that. Even the call to
show doesn’t force it to be a particular type. That is because
show’s type is
(Show a) => a -> String. However, the call to
show does add a predicate,
(Show a). In this case, that predicate just gets bubbled up to the function declaration.
It’s possible to use type classes without a predicate showing up in the function’s type. To see why, look at this really silly function:
show, yet there is no predicate
(Show something) in the function’s type. While calling
show does still add that predicate, we soon learn that
show is only being called on the concrete type
Bool. That means the predicate now looks like
(Show Bool). Since there is a
Show instance for
Bool, we know that predicate always holds true. So why bother writing it?
This example also used another type class,
Eq by using the
== function. That predicate, which became
(Eq String), was removed in a similar way.
So, predicates can be kept and added to the function’s type, or they can be removed because they are obviously true. Can anything else happen? It turns out, yes! Predicates can be modified before being added to the function’s type.
Time to look at another silly function.
This uses the
Eq type class because it uses the
== function. The type of
(Eq a) => a -> a -> Bool. Here, we’re applying it to lists. That means the type will instantiate to
(Eq [t]) => [t] -> [t] -> Bool. Based on this, you might expect the type of
areTheSame to be
(Eq [t]) -> t -> t -> Bool. After all, there needs to be a way to compare lists of t with
==, not just individual elements of
t, right? Why does it only require that elements of
t be comparable with
This is where predicates can be reduced using knowledge of what instances of a class exist. In this case, it takes advantage of the fact that there is an instance of
Eq for lists,
instance (Eq a) => Eq [a] where .... This says that there is an instance of
Eq for a list type provided that there is also an instance of
Eq for the type inside the list. When trying to reduce the
(Eq [t]) predicate on the
areTheSame function, it uses that instance to “trade in”
(Eq [t]) for
This is a good thing, because Haskell requires that all class predicates reference a type variable, not a concrete type.
There is one more possibility: trying to use a function from a type class might result in an error. For example:
(Remember that in Haskell, list concatenation is done with
+ function is from the
Num class and has the type
(Num a) => a -> a -> a. Thus, using the
+ function will add the predicate
(Num [Bool]). Unlike
Eq, there is no instance of
Num for a list of anything. The step where it tries to simplify the predicate to something more like
(Num t) fails as a result.
Now, at this point it won’t leave
(Num [Bool]) as a predicate on your
doesNotCompile function. It already figured out that there is no possible type that can be used in that function, so it’s better to give an error to tell the programmer that this function is broken rather than leave a function that can never be called.
To talk in more general terms about the behavior of type classes, there are a few things that might happen to the class constraints that get added when using a method from a typeclass.
- The predicates be added to the type directly, like
- They might be reduced before being added to the type.
- The predicates might even disappear completely.
- They might result in a type error if it becomes apparent that we are trying to use a type as a class it clearly does not have an implementation of.
This actually isn’t the full set of what can happen. There are two edge cases that I’ll explain in individual posts, ambiguities and deferred predicates. For now, don’t worry about those.
To give a sneak peak of what the next posts cover, here’s one tidbit about how inference for this actually works: type inference for expressions just captures a list of all the predicates it sees and returns them. It doesn’t do any processing to them. It’s only when dealing with a binding that the inference starts checking that they make sense.