Type inference for Haskell, part 17
This is part 17 of the series on type inference for Haskell.
The previous post defined the data structures we’ll use when working with type classes. Today, we’re going to be looking at a behavior of Haskell where it “cleans up” the set of predicates attached to a binding. The resulting types are intuative enough that you might not have even realized that this process was happening.
(THIH Section 7.3, Entailment)
The feature is called entailment (at least, it’s called that by Typing Haskell in Haskell). It’s easiest to show what it is with some examples.
When you have a function with a type signature like
(Show a) => String -> Maybe a -> String
The (Show a) part is a predicate. It says that the type a must have an
implementation of the Show class. You can turn this into a question: Does a
implement Show? If so, that predicate holds true.
Consider another type
(Eq a, Ord a) => a -> a -> Bool
This is actually equivalent to writing
(Ord a) => a -> a -> Bool
Why can we leave off the Eq a? That’s because Eq is a superclass of Ord,
meaning that if a is an instance of Ord, then it definitely is also an
instance of Eq. So we don’t have to manually write down the fact that we want
it to also be an instance of Eq - we get that for free.
That’s the first of the two related things that entailment handles.
Now look at this type signature:
foo :: (Show Int) => Int -> Int -> String
foo a b = show (max a b)
Writing Show Int isn’t useful here because we already know that Int
implements Show. If it didn’t, the show call in the function body would fail
to type check. Either way, we aren’t helped by writing Show Int. It’s better
to leave it out of the function’s type.
This is the second thing that entailment looks at.
When inferring the type for a function, we might get predicates like those in the examples above. The entailment algorithm is used to clean them up. (It is also used for a few other things that we’ll come to in another post.)
I still haven’t told you what the function actually does. It takes in a
predicate (like (Eq a)) that we want to evaluate and a list of other
predicates that we are assuming hold true (like (Ord a)). It also takes
information about what classes exist and what instances they have. It returns
true if, given the other predicates, we know that the predicate holds true. It
has no error cases.
Here’s how the logic works for that. If our predicate appears in the list of
other predicates, then it trivially holds true if they all hold true (that’s
like asking, given that A is true, is A true?). It also holds true if it appears
in any of the superclasses of the other predicates (that’s what handles our (Eq a, Ord a) example). Finally, the predicate also holds true if there is an
instance of the class for that type (the (Show Int) case) where any predicates
on that instance also hold true (with a recursive call to entail).
Show Int doesn’t have any predicates on the instance. For an example that does
have some, consider an instance declaration like
instance (Foo a) => Foo [a] where
And a predicate like (Foo [Bar]). When checking that predicate, we’ll first
find the instance above. Foo [a] matches with Foo [Bar], so the instance can
be used. From there, the (Foo a) predicate on the instance means that we will
next have to check if (Foo Bar) holds true. If there is an instance of class
Foo for the type Bar, then we’ll recursively check its predicates, too.
Here’s the actual entail function from THIH:
entail :: ClassEnv -> [Pred] -> Pred -> Bool
entail ce ps p =
any (p `elem`) (map (bySuper ce) ps) ||
case byInst ce p of
Nothing -> False
Just qs -> all (entail ce ps) qsNote that it ORs two cases together. The first case handles the situation where
the predicate is either directly in the list of predicates we are assuming hold
true, or it is a superclass of at least one of them. That would return true for
Eq a if Ord a was in the list of given predicates. The second case is for
the situation where the class has an instance for the type mentioned in the
predicate. Note that this case involves a recursive call to entail. That call
checks the predicates (if there are any) on that particular instance of the
class.
Note that bySuper includes the predicate itself, not just superclasses.
bySuper :: ClassEnv -> Pred -> [Pred]
bySuper ce p@(IsIn i t) =
p : concat [ bySuper ce (IsIn i' t) | i' <- super ce i ]And here is byInst. It attempts to find an instance of the class mentioned in
the predicate that works for the type mentioned in the predicate. It uses a
function called matchPred to determine if a given instance of the class works
for this particular case. msum takes the first Just _ value in the list, or
returns Nothing if there are none.
byInst :: ClassEnv -> Pred -> Maybe [Pred]
byInst ce p@(IsIn i t) = msum [ tryInst it | it <- insts ce i ]
where tryInst (ps :=> h) = do
u <- matchPred h p
Just (map (apply u) ps)The matchPred first checks to make sure the classes being talked about are the
same, then checks if the types in the predicate “match.” If you look at the
code of match, you are probably reminded of the mgu function. match is
like a one-sided version of mgu. Instead of finding a substitution that can be
applied to both types to make them the same, it finds a substitution that can
be applied to the left type that can turn it into the right type.
This allows the right type to be more specific than the left type, but the left
type cannot be more specific than the right. This would mean that we could match
[a] with [Bar] to determine that an instance of a class on [a] can be used
for the type [Bar]. The match function is how you tell if the particular
instance you are looking at can be used for the type you have.
matchPred :: Pred -> Pred -> Maybe Subst
matchPred = lift match
lift m (IsIn i t) (IsIn i' t')
| i == i' = m t t'
| otherwise = fail "classes differ"
match :: Monad m => Type -> Type -> m Subst
match (TAp l r) (TAp l' r') = do
sl <- match l l'
sr <- match r r'
merge sl sr
match (TVar u) t | kind u == kind t =
return (u +-> t)
match (TCon tc1) (TCon tc2) | tc1 == tc2 =
return nullSubst
match t1 t2 =
fail "types do not match"+-> is that infix function defined a long while ago that creates a
single-element substitution.
Also notice that matching is using merge to combine substitutions instead of
composing them. merge is interesting because it might fail. Let’s look at why.
For example:
match `(a, b)` `(Int, String)`
works and returns the substitution {a: Int, b: String}. However, in this
example:
match `(a, a)` `(Int, String)`
it would fail because we’d have to substitute a for Int on one side and for
String on the other. This is where merge detects an error because the two
substitutions don’t agree.
merge :: Monad m => Subst -> Subst -> m Subst
merge s1 s2 = if agree then return (s1++s2) else fail "merge fails"
where agree = all (\v -> apply s1 (TVar v) == apply s2 (TVar v))
(map fst s1 `intersect` map fst s2)That’s it for the logic behind entailment. We won’t see where entail is called
for another two posts, and it will be even longer before finding out where this
hooks back up to the type inference algorithm. The short version of the story is
that this is one of the steps that happens when processing the predicates that
are going to be saved for the type of a binding.
To unwind the stack of where we currently are:
mergecombines two substitutions, assuming they agree.matchchecks that you can change one type into the other with a substitution.matchPredboth runsmatchon the types and verifies that the predicates are talking about the same class.- That allows
byInstto find an instance of a given class that matches the type mentioned in the predicate. - If
byInstreturns a type, that’s used byentailto check if a predicate like(Show Int)is known to be true thanks to an instance declaration.
And all of this is in the service of making nice type signatures! (And other things, but those are for another day.)