Changes between Version 39 and Version 40 of GhcKinds/KindInference

Feb 23, 2018 4:50:18 PM (21 months ago)



  • GhcKinds/KindInference

    v39 v40  
    33This page summarises several alternative designs for doing kind inference for
    4 types and classes, under `-XPolyKinds`.  See the extensive discussion on #9200.  (See also #9201.)
     4types and classes, under `-XPolyKinds`.
     5Though originally written with respect to #9200, there are more issues at play here. Relevant other tickets: #9201, #9427, #14451, #14847, and (to a lesser extent) #12088, which is all about instances.
     7== Trouble when fixing #14066 ==
     9Ticket #14066 requests a more robust way of tracking the scope of type
     10variables. A full discussion is out of scope here, but it required hard
     11thought about kind inference. Here are some notes I (Richard) have
     12drawn up around it:
     14=== Introduction ===
     16The central challenge is this:
     19data Proxy1 a where
     20  Mk :: Proxy1 (a :: k)
     23To infer the right kind for `Proxy1`, GHC has to guess a kind for `a` -- call it kappa -- and then process the type of `Mk`. What's very bizarre is that the occurrence of `Proxy1` in the type of `Mk` leads GHC to unify kappa with `k`, even though `k` has a local scope. In other words, kappa unifies with an out-of-scope skolem. GHC's current (8.4) kind-generalization process ends up quantifying over the `k` in the kind of `Proxy1`, and so disaster is averted. But it really works only by accident.
     25And it doesn't always work.
     28data Proxy2 a where
     29  Mk1 :: Proxy2 (a :: k)
     30  Mk2 :: Proxy2 (b :: j)
     33GHC does the same thing here, but it rejects the definition because `k` doesn't unify with `j`. This is absurd, because these two variables are skolems that share no scope whatsoever. Interestingly, GHC 7.10 accepts this last definition, so the rejection is a hitherto-unknown regression.
     35=== Simon's Proposed Solution ===
     37Datatype declarations are kind-checked in two passes. The first pass looks through all the constructors, accumulating constraints on the type's kind. Then, once the kind is known, all the constructors are checked ''again'' with respect to the known kind. Note that we need to look at constructors to determine the kind of the datatype; otherwise, GHC would choke on declarations like
     40data App a b = MkApp (a b)
     43The kinds for `a` and `b` can be known only by looking at `MkApp`.
     45Simon suggested that, in the first pass, we instantiate all the user-written type variables in a constructor's declaration with unification variables and then proceed to infer the kind. This would mean that the `j` and `k` in `Proxy2`'s constructors would both be instantiated with unification variables. These would all unify but would remain unconstrained. GHC would then quantify over this kind, as it should. Note that it's possible for GHC to infer something silly here, but it would then be caught on the second pass.
     47=== Aside: Polymorphic Recursion and Type Inference ===
     49GHC claims that it infers most general types. But this claim is simply not true in the presence of polymorphic recursion. For example, consider this equation defining `f`:
     52f x y = f y x
     55The most general type for `f` is `forall a b c. a -> b -> c`. GHC indeed accepts this type signature for `f`. However, if `f` is written without a type signature, GHC infers `forall a b. a -> a -> b`.
     57What has happened to GHC's claim of inferring most general types? It is true, but it is true only with respect to the fragment of Haskell that prohibits polymorphic recursion. Note that GHC's inferred type leads to no polymorphic recursion. The lesson here is that a claim of most general types makes sense only with respect to some restriction on the features in the language. (For example, we all know that GHC won't infer a higher-rank type, even if one is more general than another type.)
     59=== Simon's Algorithm Breaks Principal Types ===
     61Under Simon's algorithms, some forms of polymorphic recursion are indeed accepted. For example:
     64data T a where
     65  Mk :: forall k1 k2 (a :: k1) (b :: k2). T b -> T a
     68Simon's algorithm instantiates `k1`, `k2`, `a`, and `b` with fresh unification variables. Suppose `T` is guessed to have kind `kappa -> Type`. Then, `kappa`, `k1`, and `k2` all get unified together, with no constraints. GHC will quantify, giving `T` the kind `forall k. k -> Type`. During the second pass, this kind works fine, instantiated to `k1` and `k2`. Thus, GHC accept the polymorphic-recursive `T`.
     70So: what forms of polymorphic recursion are accepted? Not all of them. Take this example:
     73data T2 a where
     74  Mk :: forall (a :: k). T2 Maybe -> T2 a
     77Under Simon's algorithm, the `T2` is guessed to have kind `kappa -> Type`, and then `kappa` will unify with `Type -> Type`. `T2` is assigned the kind `(Type -> Type) -> Type`. On the second pass, `Mk` is rejected, applying `T2` to something of kind `k`. We can see, though, that `T2` could be assigned the kind `forall k. k -> Type` quite profitably. Nevertheless, I agree with Simon's algorithm that rejecting is the correct behavior -- we don't want to infer polymorphic recursion.
     79After some thinking, Adam, Csongor and I came up with this rule: Simon's algorithm accepts polymorphic recursive definitions when the recursive occurrences are at types that instantiate kind variables with kind variables, but never concrete kinds. Note that in `T`, the recursive occurrence replaces `k1` with `k2`, but `T2` replaces `k` with `Type -> Type`. Let's call a polymorphic recursive definition where recursive occurrences replace variables with variables "pseudopolymorphic recursive".
     81However, Simon's algorithm does not always infer the most general type with respect to the fragment of the language containing pseudopolymorphic recursion (but not general polymorphic recursion). For example:
     84data T3 a b where
     85  Mk :: T3 b a -> T3 a b
     88Simon's algorithm infers the kind `forall k. k -> k -> Type` for `T3`, even though it could have the kind `forall k1 k2. k1 -> k2 -> Type` and remain pseudopolymorphic recursive. Thus, I claim that Simon's algorithm does not infer most general types.
    690== Current strategy (GHC 7.8) ==