Type inference for Haskell, part 12
This is part 12 of the series on type inference for Haskell.
This post will take a look at how to deal with functions (or variables) where the user declared a type. Bindings with a type siguature are called “explicitly typed” bindings. As seen in the previous post, it is sometimes necessary to declare a type in order to get the type you want for a recursive function.
Declaring a type can also be used to restrict the type of a binding to be a less general type than what type inference would have picked. A programmer might find that ability useful for documenting the indent of a function or to help catch bugs. For example, type inference would have picked the type
[a] -> [a] for this function:
(THIH Section 11.6.1, Explicitly Typed Bindings)
There are a few issues to consider when dealing with explicitly typed bindings. For instance, what if a user declares the wrong type? If the given type is one that the binding cannot possibly have, that should result in a type error. This means that there needs to be some code to check if the type given by the user actually makes sense.
You also have to think about how explicitly typed bindings interact with implicitly typed bindings. Which order do you process the bindings in? Let’s look at an example with cyclic dependencies (
b, which references
c, which references
b wasn’t explicitly typed, these three definitions would constitute a single binding group. Due to the restriction on the types of things in binding groups,
a would have the type
Char -> [Char] and
b would have the type
Char -> Char. (
c’s type is
Char -> [Char] regardless of whether
b is explicitly typed or not.)
Explicitly specifying the type for
b breaks apart this binding group. Now the type of
t -> [t] and the type of
a -> a. Explicitly stating the type of
b changed not only
b’s type, but also the type of
a. That’s because
a is not longer in a binding group with
Before computing the binding groups and sorting them, explicitly typed bindings are first pulled out of the call graph1. They are kept in a separate list. Then the remaining bindings (the implicitly typed ones) are examined to find the dependencies between them. All this is done with the goal of making the types for these bindings be as general as possible. Typing
t -> [t] is preferrable to typing it as
Char -> [Char].
A new list is added to the
BindGroup type to hold the explicitly typed bindings:
Expl holds an explicitly typed binding. It’s like an
Impl with a type scheme added to it.
We have to be careful about the order we process these bindings in. As you saw in the example above, an implicitly typed binding might depend on an explicitly typed binding. That means that the types of the explicitly typed bindings have to be added to the assumptions when inferring the type of implicitly typed bindings.
Explicitly typed bindings can also reference implicitly typed ones. When checking that the type of the explicitly typed binding is valid, we have to know the type of the implicitly typed bindings in order to be able to tell if the type the user provided is correct or not. This means that checking the validity of explicitly typed bindings must happen after processing the implicitly typed ones.
When processing the implicitly typed bindings, we have to trust that the user-provided types are correct because we can’t actually check them until after processing the implicitly typed bindings. Here’s what that logic now looks like:
tiBindGroup :: Infer BindGroup [Assump] tiBindGroup as (es,iss) = do let as' = [ v :>: sc | (v, sc, _) <- es ] as'' <- tiSeq tiImpls (as' ++ as) iss _ <- mapM (tiExpl (as'' ++ as' ++ as)) es return $ as'' ++ as'
In this code,
es is the list of explicitly typed bindings.
as' is the assumptions for the explicitly typed bindings (trusting that the provided type is correct). Then
as'' is the list of assumptions that come from the implicitly typed bindings. Finally,
tiExpl is applied to each of the explicitly typed bindings to check that their type is valid.
tiExpl :: Infer Expl () tiExpl as (i, sc, e) = do t <- freshInst sc tiBinding as e t s <- getSubst let t' = apply s t let fs = tv (apply s as) let gs = tv t' \\ fs let sc' = quantify gs t' if sc /= sc' then fail "signature too general" else return ()
The logic in
tiExpl starts off by using the same
tiBinding function that
tiImpls used. Remember that
tiBinding unifies the type found for the expression
e with the type
t. If the declared type is invalid in some way, that unification will fail. For example, if the type contained
Bool where the code needed an
Int, that wouldn’t unify. That’s the first way this could fail.
If the explicit type is less general than the most general type that could have been inferred, unification will update the substitution to restrict the type down to what is explicitly stated.
The middle section of
tiExpl finds the set of type variables
gs to quantify over. This is the same process that happened in
The last section quantifies the type and checks that the quantified type matches the explicitly declared type. If it doesn’t, that means the type is too general. For example, if you wrote
myFunction :: a -> a when it actually only works with strings, that type declaration would be too general. Types that are too specific or just plain wrong would have already been caught by unification. But if the declared type was too general, unification would have happily accepted it. That’s why this additional check is needed. This is the second way this function could fail.
That’s all we need to handle explicitly typed bindings. Though the logic is a bit subtle, the code doesn’t need to change that much.
The finished product is now in explicit.hs.
So far we’ve added three extensions to our type system: let polymorphism, recursive bindings, and now explicitly typed bindings. There are a few minor things left to add. Stay tuned for a discussion of kinds in the next post.
more work that we assume happens as a preprocessing stage↩