Type inference for Haskell, part 17
This is part 17 of the series on type inference for Haskell.
The previous post defined the data structures we’ll use when working with type classes. Today, we’re going to be looking at a behavior of Haskell where it “cleans up” the set of predicates attached to a binding. The resulting types are intuative enough that you might not have even realized that this process was happening.
(THIH Section 7.3, Entailment)
The feature is called entailment (at least, it’s called that by Typing Haskell in Haskell). It’s easiest to show what it is with some examples.
When you have a function with a type signature like
(Show a) => String -> Maybe a -> String
The (Show a)
part is a predicate. It says that the type a
must have an implementation of the Show
class. You can turn this into a question: Does a
implement Show
? If so, that predicate holds true.
Consider another type
(Eq a, Ord a) => a -> a -> Bool
This is actually equivalent to writing
(Ord a) => a -> a -> Bool
Why can we leave off the Eq a
? That’s because Eq
is a superclass of Ord
, meaning that if a
is an instance of Ord
, then it definitely is also an instance of Eq
. So we don’t have to manually write down the fact that we want it to also be an instance of Eq
- we get that for free.
That’s the first of the two related things that entailment handles.
Now look at this type signature:
foo :: (Show Int) => Int -> Int -> String
foo a b = show (max a b)
Writing Show Int
isn’t useful here because we already know that Int
implements Show
. If it didn’t, the show
call in the function body would fail to type check. Either way, we aren’t helped by writing Show Int
. It’s better to leave it out of the function’s type.
This is the second thing that entailment looks at.
When inferring the type for a function, we might get predicates like those in the examples above. The entailment algorithm is used to clean them up. (It is also used for a few other things that we’ll come to in another post.)
I still haven’t told you what the function actually does. It takes in a predicate (like (Eq a)
) that we want to evaluate and a list of other predicates that we are assuming hold true (like (Ord a)
). It also takes information about what classes exist and what instances they have. It returns true if, given the other predicates, we know that the predicate holds true. It has no error cases.
Here’s how the logic works for that. If our predicate appears in the list of other predicates, then it trivially holds true if they all hold true (that’s like asking, given that A is true, is A true?). It also holds true if it appears in any of the superclasses of the other predicates (that’s what handles our (Eq a, Ord a)
example). Finally, the predicate also holds true if there is an instance of the class for that type (the (Show Int)
case) where any predicates on that instance also hold true (with a recursive call to entail
).
Show Int
doesn’t have any predicates on the instance. For an example that does have some, consider an instance declaration like
instance (Foo a) => Foo [a] where
And a predicate like (Foo [Bar])
. When checking that predicate, we’ll first find the instance above. Foo [a]
matches with Foo [Bar]
, so the instance can be used. From there, the (Foo a)
predicate on the instance means that we will next have to check if (Foo Bar)
holds true. If there is an instance of class Foo
for the type Bar
, then we’ll recursively check its predicates, too.
Here’s the actual entail
function from THIH:
entail :: ClassEnv -> [Pred] -> Pred -> Bool
entail ce ps p =
any (p `elem`) (map (bySuper ce) ps) ||
case byInst ce p of
Nothing -> False
Just qs -> all (entail ce ps) qs
Note that it ORs two cases together. The first case handles the situation where the predicate is either directly in the list of predicates we are assuming hold true, or it is a superclass of at least one of them. That would return true for Eq a
if Ord a
was in the list of given predicates. The second case is for the situation where the class has an instance for the type mentioned in the predicate. Note that this case involves a recursive call to entail
. That call checks the predicates (if there are any) on that particular instance of the class.
Note that bySuper
includes the predicate itself, not just superclasses.
bySuper :: ClassEnv -> Pred -> [Pred]
bySuper ce p@(IsIn i t) =
p : concat [ bySuper ce (IsIn i' t) | i' <- super ce i ]
And here is byInst
. It attempts to find an instance of the class mentioned in the predicate that works for the type mentioned in the predicate. It uses a function called matchPred
to determine if a given instance of the class works for this particular case. msum
takes the first Just _
value in the list, or returns Nothing
if there are none.
byInst :: ClassEnv -> Pred -> Maybe [Pred]
byInst ce p@(IsIn i t) = msum [ tryInst it | it <- insts ce i ]
where tryInst (ps :=> h) = do
u <- matchPred h p
Just (map (apply u) ps)
The matchPred
first checks to make sure the classes being talked about are the same, then checks if the types in the predicate “match.” If you look at the code of match
, you are probably reminded of the mgu
function. match
is like a one-sided version of mgu
. Instead of finding a substitution that can be applied to both types to make them the same, it finds a substitution that can be applied to the left type that can turn it into the right type.
This allows the right type to be more specific than the left type, but the left type cannot be more specific than the right. This would mean that we could match [a]
with [Bar]
to determine that an instance of a class on [a]
can be used for the type [Bar]
. The match
function is how you tell if the particular instance you are looking at can be used for the type you have.
matchPred :: Pred -> Pred -> Maybe Subst
matchPred = lift match
lift m (IsIn i t) (IsIn i' t')
| i == i' = m t t'
| otherwise = fail "classes differ"
match :: Monad m => Type -> Type -> m Subst
match (TAp l r) (TAp l' r') = do
sl <- match l l'
sr <- match r r'
merge sl sr
match (TVar u) t | kind u == kind t =
return (u +-> t)
match (TCon tc1) (TCon tc2) | tc1 == tc2 =
return nullSubst
match t1 t2 =
fail "types do not match"
+->
is that infix function defined a long while ago that creates a single-element substitution.
Also notice that matching is using merge
to combine substitutions instead of composing them. merge
is interesting because it might fail. Let’s look at why.
For example:
match `(a, b)` `(Int, String)`
works and returns the substitution {a: Int, b: String}
. However, in this example:
match `(a, a)` `(Int, String)`
it would fail because we’d have to substitute a
for Int
on one side and for String
on the other. This is where merge
detects an error because the two substitutions don’t agree.
merge :: Monad m => Subst -> Subst -> m Subst
merge s1 s2 = if agree then return (s1++s2) else fail "merge fails"
where agree = all (\v -> apply s1 (TVar v) == apply s2 (TVar v))
(map fst s1 `intersect` map fst s2)
That’s it for the logic behind entailment. We won’t see where entail
is called for another two posts, and it will be even longer before finding out where this hooks back up to the type inference algorithm. The short version of the story is that this is one of the steps that happens when processing the predicates that are going to be saved for the type of a binding.
To unwind the stack of where we currently are:
merge
combines two substitutions, assuming they agree.match
checks that you can change one type into the other with a substitution.matchPred
both runsmatch
on the types and verifies that the predicates are talking about the same class.- That allows
byInst
to find an instance of a given class that matches the type mentioned in the predicate. - If
byInst
returns a type, that’s used byentail
to check if a predicate like(Show Int)
is known to be true thanks to an instance declaration.
And all of this is in the service of making nice type signatures! (And other things, but those are for another day.)