Type inference for Haskell, part 9

Posted on January 9, 2019

This is part 9 of the series on type inference for Haskell.

This is the end of the first third of the series. Thus far, we’ve defined a small subset of Haskell, the type rules for it, and implemented type inference for those rules. This system is far from complete. It is missing syntax features (patterns), semantic features (recursion), and type system features (let polymorphism, explicitly typed bindings, and type classes).

Review

Now is a good time to recap what you’ve seen so far. A substitution is a mapping from type variable to type. It is used to store the current solution by mapping a type variable to what we know about it thus far. By replacing one type variable with another, it can capture relationships where two types need to be the same.

The process of unification updates that substitution with more information about the solution. Its job is to figure out how to unify two types. If it succeeds, the substitution will be updated. That updated substitution will now, if applied to those two types, turn them into the same thing. Unification is where most type errors are discovered.

The type inference algorithm then uses unification, the ability to generate fresh type variables, information about the variables already defined, and of course the substitution to implement the type rules of the language. The code is a fairly direct implementation of the rules, so understanding the rules is helpful for understanding the code.

The Code

Here’s the full code for the type inference system so far. There are a few pieces in there that have not been discussed yet, so let’s go over those now.

The code uses a monad to manage the type inference computation. This used for storing the current substitution (called s in the code below) and the number of type variables generated thus far (n in the code below).

In the post on substitutions, a function called apply was defined. It applied a substitution to a type. This function is useful to apply (no pun intended) to other data types as well, so it is generalized into a type class. That type class also contains a function called tv which returns the list of type variables found in the given type. Right now this function is only used for the occurs check, but later it is used more heavily.

(nub is a function that deduplicates a list.)

Wrapping up

At this point, I recommend you read over the entire file. If there are any sections you find confusing, it might be helpful to re-read the relevant post in this series. For anything that still doesn’t make sense, feel free to ask me on Twitter.

Once you feel like you have a good grasp of what the code is doing, try re-implementing it in some other language. There is nothing quite like implementing it yourself for understanding an algorithm!


The next third of this series will add some extensions to the type system, after which the last third will add type classes.