Type inference for Haskell, part 4
This is part 4 of the series on type inference for Haskell.
After seeing some examples but before getting to the actual type inference code, it is good to know the rules of the type system. If you wanted, you could start from these rules and arrive at a proof of the soundness of the type system. However, I’m interested in going in another direction with them, and using the rules to help clarify what the type inference code is doing. The code is actually a fairly direct translation of the rules. Hopefully knowing the rules will help bridge the gap between what the type inference code is doing and what is going on in the code it is inferring the type for.
These rules align to syntactic forms. To recap, those forms are literal values, variables, let expressions, function abstractions (aka lambda expressions), constructors, and function applications (aka function calls).
The process of type inference can be thought of as two processes: reading the syntax tree to find what rules apply and then finding a solution that conforms to those rules. In practice, these two steps are done at once, but it would be possible to separate them if wanted. All of the constraints either say something about the type of the current expression, or the type of one of the sub-expressions. The final solution will meet the constraints defined by all of the rules.
Literals are simple. The only rule is that the type of a literal is determined directly by what sort of literal it is.
123 is an
"foo" is a string, and so on.
Variables, on the other hand, are somewhat subtle. The two rules could be summarized as: the variable must be in scope, and the type of an expression which is a variable is the type of that variable. However, there are some details not really captured by those two points.
The first detail is what it means for a variable to be in scope. In an expression like
let v1 = e1 in e2, the variable v1 is in scope only in the expression e2 (and not in e11). In an expression like
(\v1 -> e1), v1 is in scope in e1. In either case, v1 might shadow another definition of the variable that is in scope already.
Second, “the type of that variable” is rather hand-wavey. This refers to the final type of the variable after all the constraints have been solved for. If a variable is used in multiple places, it will have exactly the same type in all of those places2. That means that if one usage of a variable impacts the type, then it impacts the type for all the different places where that variable is used. More on that in a moment.
Let bindings also have two rules. In an expression like
let v1 = e1 in e2, the type of the variable v1 and the expression e1 must be the same. Note that this constraint is bi-directional - the type of v1 can influence the type of e1 the same way that the type of e1 can influence the type of v1. A restriction to the type of either applies to both. As mentioned previously, v1 is not in scope in e1, so that needs to be taken into account when inferring the type for e1.
Note that this requirement that the types be equal is different from how languages like Java work where the expression could possibly be a subtype of the variable’s type (e.g.
List x = new ArrayList()). When the program is run, the value computed by the expression e1 will be assigned to v1, which will certainly be valid if both e1 and v1 are the same type.
The second rule is that the type of the whole let expression is the type of the second sub-expression e2. The type of that is inferred with v1 in scope. That’s because when a let expression is evaluated, the result of the second inner expression is used as the result of the whole expression.
(Remember that all functions actually have a single argument.)
First of all, the type of a function abstraction needs to be a function type. That is to say, the type for an expression like
(\v1 -> e1) will have some type
t1 -> t2.
The type t1 will be whatever type is found for the variable v1. This refers to the final type the variable gets at the end of type inference, after all the usages of the variable have been taken into account.
Similarly, the return type t2 will be the type that is found for the expression e1. The variable v1 is in scope when inferring the type of e1. That variable shadows any existing definition of a variable with the same name.
Function applications like
e1 e2 have the function on the left-hand side (e1) and the argument on the right-hand side (e2).
Starting with the left-hand side, the type of e1 must be some function type. That means it will be in the form
t1 -> t2, for some t1 and t2.
Looking at the right-hand side, the type of e2 must be the same as the argument type t1 of the function.
Finally, the type of the whole expression is the return type of the function t2. Like with variables, this refers to the final type the return type of the function takes on when the type inference process has finished evaluating all the other rules.
Left) are handled a bit differently. Their type is determined by the declaration of that data type. For example, the type of the constructor
Nothing would be
Maybe a. The constructor
Left would have the type
a -> Either a b. It’s a function type because you use
Left just like a function, applying it to another value. For example,
Left "an error" is an expression of type
Either String b.
We won’t be dealing with the logic to figure out what the types are based on the data type declarations, but it isn’t particularly complicated to implement.
There are some details and implications of these rules that might not be clear from the statement of the rules themselves.
The equality of types is transitive. If one rule states that the expression e1 has the same type as the expression e2, and e2 has the same type as e3, then you know that the type of e1 is the same as the type of e3. If some rules then tells you something about the type of e1, that gets applied to the type of e2 and of e3 as well. When a rule says that the type of one thing should be the same as the type of some other thing, it means “add that constraint to the list of things we need to solve for” not “use what we currently know about the type of the other thing as the type of this thing.” All the rules will be things that hold true when type inference has finished.
It’s also worth pointing out the fact that you can’t fully determine the type of an expression until you’ve looked at all of the other expressions. Any expression somewhere else in the program might use a variable and restrict its type. If the expression you are looking at also uses that variable, that could impact the resulting type of this expression.
Function applications can provide information about the type of both the left-hand side (the expression that results in a function) and the right-hand side (the expression that produces the value to use as an argument). In fact, function application is the only operation that provides more information about the type of other expressions. That information will ultimately result in updating the type of variables (assuming, of course, that there are no type errors).
An implicit assumption of these rules is that there are no type errors in inner expressions - these rules assume that the rules in sub-expressions also get solved for by the type inference algorithm.
Another implicit assumption is that the type of a variable is the same everywhere that variable is used (ignoring variable shadowing, which technically creates a new variable). This is not the case in Haskell. To extend this system to match Haskell’s behavior, the rule about a variable will need to be changed in an interesting way. For now, just assume that a variable has a single type and that any rules that talk about its type are all referring to the same thing.
It might sound like it would be really complicated to implement logic to figure out how to solve for all of these rules. However, you are in luck. The type inference algorithm actually consists of a few simple uses of a handful of smaller algorithms. That’s what we’ll look at in the next post.
For anyone comparing this to what they are reading elsewhere, know that the type system these rules describe is slightly less powerful than Hindley-Milner type inference (aka Algorithm-W). Our system is (at the moment) missing let-polymorphism.