Type inference for Haskell, part 21

Posted on January 22, 2019

This is part 21 of the series on type inference for Haskell.

The previous post looked at an issue where predicates might get attached to a inner binding when it would have made more sense for them to be on an outer binding. Today we are going to take a look at a related issue. What happens when you have a predicate that doesn’t seem to belong to any binding?

In Haskell, there are various number types including Int and Float. It’s often valid to use any one of those types. Sometimes there is no clear way to decide which to use. Consider:

stringInc x = show (read x + 1)

stringInc takes a string and returns a string. Yet, while processing, it produces a number that might be an Int, or it might be a Float (or any number type). The type of the expression read x has to be some kind of number in order to be able to add one to it, but any kind of number will do. The type of stringInc contains predicates that aren’t attached to anything in the rest of the type:

(Read a, Num a) => String -> String

This means that there is an ambiguity.

Note that the syntax 1 can be used for either integer or floating values. It would still be possible to create the ambiguity even if that were not the case. For example:

stringDouble x = show (let n = read x in n + n)

Finding ambiguities

The code to find ambiguities is fairly small:

type Ambiguity = (Tyvar, [Pred])

ambiguities :: ClassEnv -> [Tyvar] -> [Pred] -> [Ambiguity]
ambiguities ce vs ps =
  [ (v, filter (elem v . tv) ps) | v <- tv ps \\ vs ]

This takes the predicates for a binding (just the ones that are being retained) and finds ambiguities in them, if there are any. The resulting ambiguities are grouped, not by predicate, but by type variable in the predicate. If there are two ambiguous predicates that reference the same type variable (like (Read a, Num a)) they will be grouped together here (and the output will be (a, [Read a, Num a])).

What are the arguments to this function? First up is ce. Interestingly, the code here from the THIH paper doesn’t end up using the class environment (ce) that is passed in. ps is the list of predicates we are working on. In the middle, there is vs. What is the list of [Tyvar] that is being given? If you look up the description of the split function in THIH section 11.5, you can see that it is the combination of type variables appearing in the current scope and the type variables in the type of the binding that we currently processing predicates for. Essentially, it’s all type variables that might reasonably appear in a predicate. If a predicate has a type variable not in that list, it is ambiguous.

Does this function ever fail? No. What does it return? Lists of ambiguous predicates, grouped by the type variable that is ambiguous.

What does it do? It first finds all the type variables that appear in the predicates but not in that list passed in. This (tv ps \\ vs) evaluates to a list of type variables that aren’t bound in the type of the current binding or any other binding in scope. For each of those type variables, it grabs all the predicates that contain it.

OK, we can find ambiguities. What does Haskell do about them?

Finding types that might work

Haskell’s designers made the choice to have the language pick a default type to use when a type is ambiguous. This is only allowed when certain criteria are met, though in practice it tends to be allowed for most ambiguities you’d see in real code. Any ambiguities that don’t meet the criteria result in a type error.

What are the criteria? First, know that these requirements apply to a value of type Ambiguity, so a tuple containing a type variable and a list of predicates. There are four requirements must be met.

The types in all the predicates in the ambiguity must be just a type variable (i.e. not a type variable applied to another type).
A least one of the classes involved must be a numeric class (Num, Integral, Floating, etc). This means that defaulting only applies for types that we know will be some sort of number.
All of the classes involved must be standard classes. This includes number classes, plus some others like Show, Eq, Ord, etc. See THIH for the full list.
There has to be a type in the list of default types that satisfies all the predicates. Satisfying the predicates amounts to having an implementation of all of the classes they mention.

What is this list of default types? It’s a list of types stored in the class environment, defined way back in section 7.2 of THIH. The list is simply [tInteger, tDouble]. Haskell code is allowed to change the list of defaults, though I’ve yet to see any real code bother to do so. If more than one default type works, the first one is chosen.

Example: say our predicates were (Read a, Num a). The first requirement is met since a is just a type variable. The second requirement is met because Num is one of the classes. The third requirement is met because both Read and Num are standard classes. Finally, we can look at the list of default types. First up, Integer. Does this implement Read and Num? It does, so that’s the type we’ll go with.

Going back to the original example, which was:

stringInc x = show (read x + 1)

This will be given the type String -> String (note that the predicates aren’t there any more). The type of the expression read x is now Integer.

On to the code that implements these four rules:

candidates           :: ClassEnv -> Ambiguity -> [Type]
candidates ce (v, qs) =
  [ t' | let is = [ i | IsIn i t <- qs ]
         ts = [ t | IsIn i t <- qs ],
         all ((TVar v)==) ts,
         any (`elem` numClasses) is,
         all (`elem` stdClasses) is,
         t' <- defaults ce,
         all (entail ce []) [ IsIn i t' | i <- is ] ]

The nested list comprehensions are a little confusing, but just know what the lines staring with any or all act as filters on the result list. This function can “fail” in the sense that when the 4 criteria aren’t met it returns an empty list.

Combining the two

Now that we can find ambiguities in a list of predicates and can find types for resolving ambiguities, it’s time to put the two together.

withDefaults :: Monad m => ([Ambiguity] -> [Type] -> a)
                  -> ClassEnv -> [Tyvar] -> [Pred] -> m a
withDefaults f ce vs ps
    | any null tss  = fail "cannot resolve ambiguity"
    | otherwise     = return (f vps (map head tss))
      where vps = ambiguities ce vs ps
            tss = map (candidates ce) vps

defaultedPreds :: Monad m => ClassEnv -> [Tyvar] -> [Pred] -> m [Pred]
defaultedPreds = withDefaults (\vps ts -> concat (map snd vps))

The function withDefaults is given another function f which it runs on the list of ambiguities (vps) and the list of a default types to pick for each one (map head tss). It fails if it was unable to find a working default type for any of the ambiguities discovered (any null tss).

defaultedPreds uses withDefaults to grab a list of all the predicates that can be removed by defaulting (which is just the predicates that appear in all of the ambiguities, because at this point we know that there is a solution to the ambiguities).

The paper also defines another similar function, defaultSubst, but we won’t get to what this is about until the next post.

defaultSubst   :: Monad m => ClassEnv -> [Tyvar] -> [Pred] -> m Subst
defaultSubst = withDefaults (\vps ts -> zip (map fst vps) ts)

Another example

Before jumping back up a level, let’s look at one example where defaulting fails.

foo s = show [read s]

Compiling this results in errors like

Ambiguous type variable ‘t0’ arising from a use of ‘read’
prevents the constraint ‘(Read t0)’ from being solved.

The type of the value returned by read s needs to be an instance of Read, hence the (Read t0) predicate. There is also a (Show [t0]) predicate that gets converted down to a (Show t0) predicate when converting to HNF.

Since the type of foo is String -> String, the predicates (Read t0, Show t0) are both ambiguous. When trying to deal with them, we go through the checklist again. Point 1 holds true, but there is no numeric class in the list, so point 2 doesn’t. As a result, we don’t get any default type and type inference for this function fails.

Split

Finally, we can finish the code for what happens to the predicates:

split :: Monad m => ClassEnv -> [Tyvar] -> [Tyvar] -> [Pred]
                      -> m ([Pred], [Pred])
split ce fs gs ps = do
  ps' <- reduce ce ps
  let (ds, rs) = partition (all (`elem` fs) . tv) ps'
  rs' <- defaultedPreds ce (fs++gs) rs
  return (ds, rs \\ rs')

This:

Does context reduction by calling reduce. This is where the predicates were converted to HNF then redundant ones were removed with the help of entail.
Splits predicates into deferred and retained lists.
Applies defaulting to any retained predicates that need it, potentially failing if some ambiguities can’t be dealt with.

This then returns the deferred predicates and the smaller list of retained predicates. There, finally the processing of predicates is done!

That split function we just defined ties together all the logic for processing predicates. As we’ll see in the next posts, this is the function that gets called from the rest of the type inference code. We’re almost done! There is just one tricky point left to deal with: the infamous “monomorphism restriction.”