Type inference for Haskell, part 17

Posted on January 17, 2019

This is part 17 of the series on type inference for Haskell.

The previous post defined the data structures we’ll use when working with type classes. Today, we’re going to be looking at a behavior of Haskell where it “cleans up” the set of predicates attached to a binding. The resulting types are intuative enough that you might not have even realized that this process was happening.

(THIH Section 7.3, Entailment)

The feature is called entailment (at least, it’s called that by Typing Haskell in Haskell). It’s easiest to show what it is with some examples.

When you have a function with a type signature like

(Show a) => String -> Maybe a -> String

The (Show a) part is a predicate. It says that the type a must have an implementation of the Show class. You can turn this into a question: Does a implement Show? If so, that predicate holds true.

Consider another type

(Eq a, Ord a) => a -> a -> Bool

This is actually equivalent to writing

(Ord a) => a -> a -> Bool

Why can we leave off the Eq a? That’s because Eq is a superclass of Ord, meaning that if a is an instance of Ord, then it definitely is also an instance of Eq. So we don’t have to manually write down the fact that we want it to also be an instance of Eq - we get that for free.

That’s the first of the two related things that entailment handles.

Now look at this type signature:

foo :: (Show Int) => Int -> Int -> String
foo a b = show (max a b)

Writing Show Int isn’t useful here because we already know that Int implements Show. If it didn’t, the show call in the function body would fail to type check. Either way, we aren’t helped by writing Show Int. It’s better to leave it out of the function’s type.

This is the second thing that entailment looks at.

When inferring the type for a function, we might get predicates like those in the examples above. The entailment algorithm is used to clean them up. (It is also used for a few other things that we’ll come to in another post.)

I still haven’t told you what the function actually does. It takes in a predicate (like (Eq a)) that we want to evaluate and a list of other predicates that we are assuming hold true (like (Ord a)). It also takes information about what classes exist and what instances they have. It returns true if, given the other predicates, we know that the predicate holds true. It has no error cases.

Here’s how the logic works for that. If our predicate appears in the list of other predicates, then it trivially holds true if they all hold true (that’s like asking, given that A is true, is A true?). It also holds true if it appears in any of the superclasses of the other predicates (that’s what handles our (Eq a, Ord a) example). Finally, the predicate also holds true if there is an instance of the class for that type (the (Show Int) case) where any predicates on that instance also hold true (with a recursive call to entail).

Show Int doesn’t have any predicates on the instance. For an example that does have some, consider an instance declaration like

instance (Foo a) => Foo [a] where

And a predicate like (Foo [Bar]). When checking that predicate, we’ll first find the instance above. Foo [a] matches with Foo [Bar], so the instance can be used. From there, the (Foo a) predicate on the instance means that we will next have to check if (Foo Bar) holds true. If there is an instance of class Foo for the type Bar, then we’ll recursively check its predicates, too.

Here’s the actual entail function from THIH:

Note that it ORs two cases together. The first case handles the situation where the predicate is either directly in the list of predicates we are assuming hold true, or it is a superclass of at least one of them. That would return true for Eq a if Ord a was in the list of given predicates. The second case is for the situation where the class has an instance for the type mentioned in the predicate. Note that this case involves a recursive call to entail. That call checks the predicates (if there are any) on that particular instance of the class.

Note that bySuper includes the predicate itself, not just superclasses.

And here is byInst. It attempts to find an instance of the class mentioned in the predicate that works for the type mentioned in the predicate. It uses a function called matchPred to determine if a given instance of the class works for this particular case. msum takes the first Just _ value in the list, or returns Nothing if there are none.

The matchPred first checks to make sure the classes being talked about are the same, then checks if the types in the predicate “match.” If you look at the code of match, you are probably reminded of the mgu function. match is like a one-sided version of mgu. Instead of finding a substitution that can be applied to both types to make them the same, it finds a substitution that can be applied to the left type that can turn it into the right type.

This allows the right type to be more specific than the left type, but the left type cannot be more specific than the right. This would mean that we could match [a] with [Bar] to determine that an instance of a class on [a] can be used for the type [Bar]. The match function is how you tell if the particular instance you are looking at can be used for the type you have.

+-> is that infix function defined a long while ago that creates a single-element substitution.

Also notice that matching is using merge to combine substitutions instead of composing them. merge is interesting because it might fail. Let’s look at why.

For example:

match `(a, b)` `(Int, String)`

works and returns the substitution {a: Int, b: String}. However, in this example:

match `(a, a)` `(Int, String)`

it would fail because we’d have to substitute a for Int on one side and for String on the other. This is where merge detects an error because the two substitutions don’t agree.

That’s it for the logic behind entailment. We won’t see where entail is called for another two posts, and it will be even longer before finding out where this hooks back up to the type inference algorithm. The short version of the story is that this is one of the steps that happens when processing the predicates that are going to be saved for the type of a binding.

To unwind the stack of where we currently are:

  • merge combines two substitutions, assuming they agree.
  • match checks that you can change one type into the other with a substitution.
  • matchPred both runs match on the types and verifies that the predicates are talking about the same class.
  • That allows byInst to find an instance of a given class that matches the type mentioned in the predicate.
  • If byInst returns a type, that’s used by entail to check if a predicate like (Show Int) is known to be true thanks to an instance declaration.

And all of this is in the service of making nice type signatures! (And other things, but those are for another day.)