Type inference for Haskell, part 23
This is part 23, the final part of the series on type inference for Haskell.
All the hard work for type classes is done. The data structures have been defined. We dealt with context reduction, which involved predicate entailment and converting predicates to head normal form. Most of the predicates needed to be retained on the current binding, but some needed to be deferred to an enclosing binding. Some predicates resulted in ambiguities (some of which could be resolved by defaulting). And finally, we added the monomorphism restriction.
Now we just need to attach this logic to the rest of the type inference system. You already saw two changes to existing functions in the previous post. The top level function
tiProgram was changed to apply a defaulting substitution, and
tiImpls gained logic to deal with the monomorphism restriction. There was another change to
tiImpls: It now returns a list of predicates in addition to the list of assumptions it was returning before. Where do those predicates come from?
Ultimately, they can come from one of four places.
First, a number literal now comes with a predicate attached. Where we previously had hard-coded a type of
Integer for a number, we can now return a generic type with the stipulation that the type be in some
Num class. The
tiLit function is now:
tiLit :: Literal -> TI ([Pred], Type) tiLit (LitChar _) = return (, tChar) tiLit (LitInt _) = do v <- newTVar Star return ([IsIn "Num" v], v) tiLit (LitStr _) = return (, tString) tiLit (LitRat _) = do v <- newTVar Star return ([IsIn "Fractional" v], v)
LitRat case handles literals like
Second, the N+K pattern also can add a predicate. This restricts the type a bit more than a literal number does (
Integral is a subclass of
If you look at the other cases in
tiPat (I won’t duplicate all of them here), you’ll see that most do one of two things. Either they return an empty list of predicates, or they return the list of predicates from processing an inner pattern. The constructor pattern is interesting because it needs to combine multiple sources of predicates:
tiPat (PCon (i:>:sc) pats) = do (ps,as,ts) <- tiPats pats t' <- newTVar Star (qs :=> t) <- freshInst sc unify t (foldr fn t' ts) return (ps++qs, as, t')
qs are lists of predicates. The list
ps is also the result of combining lists of predicates:
tiPats :: [Pat] -> TI ([Pred], [Assump], [Type]) tiPats pats = do psasts <- mapM tiPat pats let ps = concat [ ps' | (ps',_,_) <- psasts ] as = concat [ as' | (_,as',_) <- psasts ] ts = [ t | (_,_,t) <- psasts ] return (ps, as, ts)
This is a general pattern you’ll see throughout the code. A small handful of things actually give predicates. The rest either give an empty list of predicates or return the predicates given by nested calls. Throughout all the type inference code, it just builds up a list of all the predicates it can find.
The third place you can get a predicate from is a variable. If a variable’s binding carries some predicates, they will be added to the list when that variable is used. This might happen if the variable is the name of a method in a type class. As an example, if you use the function
show, you would get the predicate
Finally, the fourth source of predicates is constructor expressions. In practice, these will mostly have an empty list of predicates.
After that, everything else just collects and returns predicates. That holds true for the other cases in
tiExpr ce as (Lit l) = do (ps,t) <- tiLit l return (ps, t) tiExpr ce as (Ap e f) = do (ps,te) <- tiExpr ce as e (qs,tf) <- tiExpr ce as f t <- newTVar Star unify (tf `fn` t) te return (ps ++ qs, t) tiExpr ce as (Let bg e) = do (ps, as') <- tiBindGroup ce as bg (qs, t) <- tiExpr ce (as' ++ as) e return (ps ++ qs, t)
You might notice that the type for
Infer Expr Type, which makes use of the type alias
Infer. That is defined as follows:
The rule of gathering and returning predicates is also followed by
tiAlt :: Infer Alt Type tiAlt ce as (pats, e) = do (ps, as', ts) <- tiPats pats (qs,t) <- tiExpr ce (as'++as) e return (ps++qs, foldr fn t ts) tiAlts :: ClassEnv -> [Assump] -> [Alt] -> Type -> TI [Pred] tiAlts ce as alts t = do psts <- mapM (tiAlt ce as) alts mapM (unify t) (map snd psts) return (concat (map fst psts))
The same is true of
tiBindGroup :: Infer BindGroup [Assump] tiBindGroup ce as (es,iss) = do let as' = [ v:>:sc | (v,sc,alts) <- es ] (ps, as'') <- tiSeq tiImpls ce (as'++as) iss qss <- mapM (tiExpl ce (as''++as'++as)) es return (ps++concat qss, as''++as')
tiSeq :: Infer bg [Assump] -> Infer [bg] [Assump] tiSeq ti ce as  = return (,) tiSeq ti ce as (bs:bss) = do (ps,as') <- ti ce as bs (qs,as'') <- tiSeq ti ce (as'++as) bss return (ps++qs, as''++as')
Let’s now look at where the rubber meets the road and see where these lists of predicates gathered up end up going to all that code we defined before. We need to look at both
Explicitly typed bindings are handled by
tiExpl. This means there was a type declared for them. That type is stored in the scheme
sc. Remember before how we needed to check that the declared type wasn’t too general for the binding? We need to do something similar with predicates. When the scheme
sc is instantiated, we get the predicates
qs. After type inference is done, we apply the current substitution
qs to get
qs'. This makes sure all the type variables mentioned in
qs are up to date.
Back where the inference happened, you can see that
tiAlts gathers a list of predicates
ps. This is what we need to check works with
qs'. If the declared type for the binding didn’t mention a predicate that turns out to be necessary, that would be a problem. Haskell wouldn’t allow you to write this because the type you declare is missing a
(Show a) predicate:
So, we need to check that the predicates you supply cover everything that is needed. This is the second place the
entail function is called (the first was in simplifying the list of predicates). The predicates in
ps are filtered out if they are entailed by the declared predicates
qs. Any that remain are stored in
It’s actually possible for it to be OK for there to be predicates left in
ps'. If all of the predicates that are left are deferred, then that means that there is no problem with the declared type. Some other binding will need to have those predicates. The call to
split determines whether the predicate is deferred or retained. If any retained predicates are left, that does indicate an error.
tiExpl :: ClassEnv -> [Assump] -> Expl -> TI [Pred] tiExpl ce as (i, sc, alts) = do (qs :=> t) <- freshInst sc ps <- tiAlts ce as alts t s <- getSubst let qs' = apply s qs t' = apply s t fs = tv (apply s as) gs = tv t' \\ fs sc' = quantify gs (qs':=>t') ps' = filter (not . entail ce qs') (apply s ps) (ds,rs) <- split ce fs gs ps' if sc /= sc' then fail "signature too general" else if not (null rs) then fail "context too weak" else return ds
You have already seen the last few lines of
tiImpls in the previous post. Here is the function in its entirety:
tiImpls :: Infer [Impl] [Assump] tiImpls ce as bs = do ts <- mapM (\_ -> newTVar Star) bs let is = map fst bs scs = map toScheme ts as' = zipWith (:>:) is scs ++ as altss = map snd bs pss <- sequence (zipWith (tiAlts ce as') altss ts) s <- getSubst let ps' = apply s (concat pss) ts' = apply s ts fs = tv (apply s as) vss = map tv ts' gs = foldr1 union vss \\ fs (ds,rs) <- split ce fs (foldr1 intersect vss) ps' if restricted bs then let gs' = gs \\ tv rs scs' = map (quantify gs' . (:=>)) ts' in return (ds++rs, zipWith (:>:) is scs') else let scs' = map (quantify gs . (rs:=>)) ts' in return (ds, zipWith (:>:) is scs')
It’s handling of predicates is actually simpler than
tiExpl because it doesn’t have to check if any set of predicates is correct. When the types are inferred for each implicitly typed binding, the result is a list of lists of predicates
pss. These are concatenated together before having the current substitution
s applied to them, resulting in
Those predicates are then passed to
split to decide which are deferred (
ds) and which are retained (
rs). From there, the rest is all logic that was discussed in the previous post, but there’s one last thing I want to mention. Assuming the monomorphism restriction doesn’t apply to these bindings, the retained predicates
rs are added to the type schemes of all of the bindings in the group. That might make one of the types ambiguous if one of those retained predicates referred to a type variable not found in the type of one of those bindings. To ensure that doesn’t happen, the list of type variables passed to
split is the intersection (not the union) of the type variables (
vss) that appear in all the binding’s types (
We’re finally done! I hope this has been useful to you.
This covered nearly all of the code found in the Typing Haskell in Haskell paper. If you haven’t already, I highly encourage you to read over both that paper and the included code. It has a number of details that I didn’t include here.
The code (and the THIH paper) can be found at https://web.cecs.pdx.edu/~mpj/thih/. However, this code is from 2000 and is thus slightly out of date. I’ve uploaded a modified version that compiles with GHC 8.0 here (alt link).