Changes between Version 1 and Version 2 of NewImpredicativity


Ignore:
Timestamp:
Jun 16, 2015 6:40:32 AM (4 years ago)
Author:
AlejandroSerrano
Comment:

Move page to better location

Legend:

Unmodified
Added
Removed
Modified
  • NewImpredicativity

    v1 v2  
    1 = New impredicativity =
    2 
    3 The **goal** is to build a better story for impredicative and higher-rank polymorphism in GHC. For that aim we introduce a new type of constraint, `InstanceOf t1 t2`, which expresses that //type `t1` is an instance of `t2`//. This new type of constraint is inspired on ideas from the MLF and HML systems.
    4 
    5 == Notation ==
    6 
    7 ||=Type variables                   =|| `alpha`, `beta`, `gamma` ||
    8 ||=Type constructors                =|| `T`                      ||
    9 ||=Type families                    =|| `F`                      ||
    10 ||=Constraints                      =|| `Q`                      ||
    11 ||=Monomorphic types                =|| `mu    ::= alpha | a | mu -> mu | T mu ... mu | F mu ... mu`                   ||
    12 ||=Types without top-level `forall` =|| `tau   ::= alpha | a | sigma -> sigma | T sigma ... sigma | F sigma ... sigma` ||
    13 ||=Polymorphic types                =|| `sigma ::= forall a. Q => tau`                                                 ||
    14 
    15 == Some basic facts ==
    16 
    17 * `InstanceOf` has kind `* -> * -> Constraint`.
    18 * The evidence for `InstanceOf sigma1 sigma2` is a function `sigma2 -> sigma1`.
    19 * The canonical forms associated with the constraint are `InstanceOf sigma1 alpha1` and `InstanceOf alpha2 sigma2`, where `sigma1` is not a type variable.
    20 
    21 //Implementation note//: `InstanceOf` needs to be defined in `libraries/ghc-prim/GHC/Types.hs`.
    22 
    23 //Implementation note//: new canonical forms need to be defined in `compiler/typecheck/TcRnTypes.hs` by extending the `Ct` data type.
    24 
    25 == Changes to constraint solver ==
    26 
    27 Luckily, in order to work with `InstanceOf` constraints, we only need to add new rules to the canonicalization step in the solver. These rules are:
    28 
    29 * [IOCan1] `InstanceOf t1                     (T sigma1 ... sigman)`   ---->  `t1 ~ (T sigma1 ... sigman)`
    30 * [IOCan2] `InstanceOf (T sigma1 ... sigman)  (forall a. Q2 => tau2)`  ---->  `(T sigma1 ... sigman) ~ [a/alpha]tau2  /\  [a/alpha]Q2`
    31 * [IOCan3] `InstanceOf (forall a. Q1 => tau1) sigma2`                  ---->  `forall a. (Q1 => InstanceOf tau1 sigma2)`
    32 
    33 But we also need to generate evidence for each of these steps!
    34 
    35 //Implementation note//: implement these rules in `compiler/typecheck/TcCanonical.hs`.
    36 
    37 //Implementation note//: add the new types of evidence to `EvTerm` in `compiler/typecheck/TcEvidence.hs`.
    38 
    39 //Implementation note//: extend the desugarer to convert from evidence to actual functions.
    40 
    41 === Example ===
    42 
    43 Suppose we want to type check `runST ($) (e :: forall s. ST s Int)`. Let us denote `alpha` the type of `runST`, `beta` the type of `e` and `gamma` the type of the entire expression. The initial set of constraints which are generated (details on generation below) are:
    44 
    45 {{{
    46 InstanceOf (alpha -> beta -> gamma) (forall a b. (a -> b) -> a -> b)  [from ($)]
    47 InstanceOf alpha (forall a. (forall s. ST s a) -> a)                  [from runST]
    48 InstanceOf beta  (forall s. ST s Int)                                 [from e]
    49 }}}
    50 
    51 The series of solving steps are:
    52 
    53 {{{
    54 InstanceOf (alpha -> beta -> gamma) (forall a b. (a -> b) -> a -> b)  [1]
    55 InstanceOf alpha (forall a. (forall s. ST s a) -> a)                  [2]
    56 InstanceOf beta  (forall s. ST s Int)                                 [3]
    57 
    58 ----> [IOCan2] over [1]
    59 
    60 (alpha -> beta -> gamma) ~ ((delta -> epsilon) -> delta -> epsilon)   [4]
    61 + [2] and [3]
    62 
    63 ----> type deconstruction in [4]
    64 
    65 alpha ~ delta -> epsilon
    66 beta  ~ delta
    67 gamma ~ epsilon
    68 + [2] and [3]
    69 
    70 ----> substitution in [2] and [3]
    71 
    72 InstanceOf (delta -> epsilon) (forall a. (forall s. ST s a) -> a)     [5]
    73 InstanceOf delta (forall s. ST s Int)                                 [6]
    74 
    75 ----> [IOCan2] over [5]
    76 
    77 (delta -> epsilon) ((forall s. ST s eta) -> eta)                      [7]
    78 InstanceOf delta (forall s. ST s Int)
    79 
    80 ----> type deconstruction in [7]
    81 
    82 delta   ~ forall s. ST s eta
    83 epsilon ~ eta
    84 InstanceOf delta (forall s. ST s Int)
    85 
    86 ----> substitution
    87 
    88 InstanceOf (forall s. ST s eta) (forall s. ST s Int)                  [8]
    89 
    90 ----> [IOCan3] over [8]
    91 
    92 forall s. (_ => Instance (ST s eta) (forall s'. ST s' Int)            [9]
    93 
    94 ----> [IOCan2] under (=>) of [9]
    95 
    96 forall s. (_ => Instance (ST s eta) (ST pi Int)                       [10]
    97 
    98 ----> canonicalization under (=>)
    99 
    100 forall s. (_ => s ~ pi /\ eta ~ Int)                                  [11]
    101 
    102 ----> float constraints out of (=>)
    103 
    104 eta ~ Int
    105 forall s. (_ => s ~ pi)
    106 
    107 ----> FINISHED!
    108 }}}
    109 
    110 We get that the type assigned to the whole expression is `gamma ~ epsilon ~ eta ~ Int`, as we expected :)
    111 
    112 === Evidence generation ===
    113 
    114 For [IOCan1] we want to find evidence for `W1 :: InstanceOf t1 (T sigma1 ... sigman)` from `W2 :: t1 ~ (T sigma1 ... sigman)`. Such an evidence must be a function `W1 :: (T sigma1 ... sigman) -> T1`. We can get it by applying the coercion resulting from `W2`. More schematically:
    115 
    116 {{{
    117 W1 :: InstanceOf t1 (T sigma1 ... sigman)
    118 
    119 ---->
    120 
    121 W1 :: T sigma1 ... sigman -> t1
    122 W1 = \x -> x |> (sym W2)
    123 
    124 W2 :: t1 ~ (T sigma1 ... sigman)
    125 }}}
    126 
    127 
    128 {{{
    129 W1 :: InstanceOf (T sigma1 ... sigman)  (forall a. Q1 ... Qn => tau2)
    130 
    131 ---->
    132 
    133 W1 :: (forall a. Q1 ... Qn => tau2) -> T sigma1 ... sigman
    134 W1 = \x -> (x alpha V1 ... Vn) |> (sym W2)
    135 
    136 W2 :: (T sigma1 ... sigman) ~ [a/alpha]tau2
    137 V1 :: [a/alpha]Q1, ..., Vn :: [a/alpha]Qn
    138 }}}
    139 
    140 
    141 The case [IOCan3] is the most complex one: we need to generate a function from the evidence generated by an implication. Such an implication generates a series of bindings, which we denote here using `[ ]`. Note that we abstract by values, types and constraints, but this is OK, because it is a System FC term.
    142 
    143 {{{
    144 W1 :: InstanceOf (forall a. Q1 => tau1) sigma2
    145 
    146 ---->
    147 
    148 W1 :: sigma2 -> (forall a. Q1 => tau1)
    149 W1 = \x -> /\a -> \(d : Q1) -> let [ ] in (W2 x)
    150 
    151 W2 :: forall a. (d : Q1) => (W2 :: InstanceOf tau1 sigma2)
    152 }}}
    153 
    154 
    155 === Design choice: `InstanceOf` and `->` ===
    156 
    157 In the designed proposed above, `->` is treated as any other type constructor. That means that if we are canonicalizing `InstanceOf (sigma1 -> sigma2) (sigma3 -> sigma4), the result is `sigma1 ~ sigma3 /\ sigma2 ~ sigma4`. That is, `->` is treated invariantly in both arguments. Other possible design choices are:
    158 
    159 * `->` treated co- and contravariantly, leading to `InstanceOf sigma3 sigma1 /\ InstanceOf sigma2 sigma4`.
    160 * Treat inly the co-domain covariantly, leading to `sigma1 ~ sigma3 /\ InstanceOf sigma2 sigma4`.
    161 
    162 Which are the the benefits of each option?
    163 
    164 
    165 == Changes to approximation ==
    166 
    167 One nasty side-effect of this approach is that the solver may produce non-Haskell 2010 types. For example, when type checking `singleton id`, where `singleton :: forall a. a -> [a]` and `id :: forall a. a -> a`, the result would be `forall a. InstanceOf a (forall b. b -> b) => [a]`. In short, we want to get rid of the `InstanceOf` constraints once a binding has been found to type check. This process is part of a larger one which in GHC is known as **approximation**.
    168 
    169 There are two main procedures to move to types without `InstanceOf` constraints:
    170 
    171 * Convert all `InstanceOf` into type equality. In the previous case, the type of `singleton id` is `forall a. a ~ forall b. b -> b => [a]`, or equivalently, `[forall b. b -> b]`.
    172 * Generate a type with the less possible polymorphism, by moving quantifiers out of the `InstanceOf` constraints to top-level. In this case, the type given to `singleton id` is `forall b. [b -> b]`.
    173 
    174 We aim to implement the second option, since it leads to types which are more similar to those already inferred by GHC. Note that this approximation only applies to unannotated top-level bindings: the user can always ask to give `[forall a. a -> a]` as a type for `singleton id` via an annotation.
    175 
    176 The procedure works by appling repeatedly the following rules:
    177 
    178 {{{
    179 InstanceOf a (forall b. Q => tau)  ---->  a ~ [b/beta]tau  /\  [b/beta]Q
    180 InstanceOf (forall b. Q => tau) a  ---->  a ~ (forall b. Q => tau)
    181 }}}
    182 
    183 The first rule is a version of [IOCon2] which applies to canonical `InstanceOf` constraints. The second rule ensures that the `InstanceOf` constraint is satisfied.
    184 
    185 //Implementation note//: change the `simplifyInfer` function in `compiler/typecheck/TcSimplify.hs` to generate candidate approximations using the previous two rules.
    186 
    187 
    188 == Changes to constraint generation ==
    189 
    190 Constraint generation is the phase prior to solving, in which constraints reflecting the relations between types in the program are created. We describe constraint generation rules in this section using the same formalism as OutsideIn(X), that is, as a judgement `Gamma |- e: tau --> C`: under a environment `Gamma`, the expression `e` is assigned type `tau` subject to constraints `C`.
    191 
    192 In principle, the only rule that needs to change is that of variables in the term level, which is the point in which instantiation may happen:
    193 
    194 {{{
    195    x : sigma \in \Gamma        alpha fresh
    196 --------------------------------------------- [VAR]
    197 Gamma |- x : alpha --> InstanceOf alpha sigma
    198 }}}
    199 
    200 
    201 Unfortunately, this is not enough. Suppose we have the following piece of code:
    202 
    203 {{{
    204 (\f -> (f 1, f True)) (if ... then id else not)
    205 }}}
    206 
    207 We want to typecheck it, and we give the argument `f` a type variable `alpha`, and each of its appearances the types variables `beta` and `gamma`. The constraints that are generated are:
    208 
    209 {{{
    210 InstanceOf beta  alpha  [usage in (f 1)]
    211 InstanceOf gamma alpha  [usgae in (f True)]
    212 InstanceOf alpha (forall a. a -> a)
    213 InstanceOf alpha (Bool -> Bool)
    214 }}}
    215 
    216 At this point we are stuck, since we have no rule that could be applied. One might think about applying transitivity of `InstanceOf`, but this is just calling trouble, because it is not clear how to do this without losing information.
    217 
    218 Our solution is to make this situation impossible by generating `beta ~ alpha` and `gamma ~ alpha` instead of their `InstanceOf` counterparts. We do this by changing the [VAR] rule in such a way that `~` is generated when the variable comes from an unannotated abstraction or unannotated `let`. The environment is responsible for keeping track of this fact for each binding, by a small tag.
    219 
    220 {{{
    221     x :_~ sigma \in \Gamma
    222 ------------------------------ [VAR~]
    223 Gamma |- x : sigma --> nothing
    224 }}}
    225 
    226 Notice the change from `:` to `:_~` in the rule. As stated above, some other rules need to be changed in order to generate this tag for their enclosed variables:
    227 
    228 {{{
    229 alpha fresh    Gamma, (x :_~ alpha) |- e : tau --> C
    230 ----------------------------------------------------
    231        Gamma |- \x -> e : alpha -> tau --> C
    232 
    233 Gamma, (x :_~ alpha) |- e : tau1 --> C1    Gamma, (x :_~ alpha) |- b : tau2 --> C2
    234 ----------------------------------------------------------------------------------
    235            Gamma |- let x = e in b : tau2 --> C1 /\ C2 /\ alpha ~ tau1
    236 }}}
    237 
    238 With this change, our initial example leads to an error (`f cannot be applied to both Bool and Int`), from which one can recover by adding an extra annotation. This is a better situation, though, that getting stuck in the middle of the solving process.
    239 
    240 
    241 //Implementation note//: the type of local environments, `TcLclEnv` in `compiler/typecheck/TcExpr.hs`, needs to be upgraded to take into account whether a variable is tagged as generating `~`. Maybe just change `type TcTypeEnv = NameEnv (TcTyThing, Bool)`?
    242 
    243 //Implementation note//: constraint generation appears in GHC source code as `tcExpr` in `compiler/typecheck/TcExpr.hs`.
    244 
    245 === Adding propagation ===
    246 
    247 Still, this is not enough! Suppose you write the following code:
    248 
    249 {{{
    250 f :: (forall a. a -> a) -> (Int, Bool)
    251 f x = (x 1, x True)
    252 
    253 g = f (\x -> x)
    254 }}}
    255 
    256 None of them will work! The problem is that, in the first case, we do not use the information in the signature when generating constraints for the function. Thus, `x` will be added to the environment with the `~` tag, effectively forbidding to be applied to both `Int`s and `Bool`s.
    257 
    258 In the second case the solver does not know that it should generalize at the point of the `\x -> x` expression. Thus, we will come to a point where we have `tau -> tau ~ forall a. a -> a`, which leads to an error, since quantified and not quantified types cannot be equated.
    259 
    260 However, we expect both cases to work. After all, the information is there, we only have to make it flow to the right place. This is exactly the goal of adding **propagation** to the constraint generation phase. Lucikly, GHC already does some propagation now, as reflected in the type of the function `tcExpr`. The main change is that, whereas the current implementation pushes down and infers shapes of functions, the new one is simpler, and only pushes information down. A PDF with the rules is available at https://goo.gl/FjNiui
    261 
    262 The most surprising rule is the one named [AppFun], which applies when we have a block of known expressions `f1 ... fm` whose type can be recovered from the environment followed by some other freely-shaped expressions. For example, the case of `f (\x -> x)` above, where `f` is in the environment of `g`. In that case, we compute the type that the first block ought to have, and propagate it to the rest of arguments.
    263 
    264 The reason for including a block of `fi`s is to cover cases such as `runST $ do ...`, or more clearly `($) runST (do ...)`, where some combinators are used between functions. Should the rule [AppFun] only include the case `f e1 ... fm`, the common `runST $ do ...` could not be typed without an annotation.
    265 
    266 
    267 == Type classes and families ==
    268 
    269 There are some unwanted interactions between type classes and families and the `InstanceOf` constraint. For example, if we want to type `[] == []`, we obtain as canonical constraints:
    270 
    271 {{{
    272 Eq a /\ InstanceOf a (forall b. [b])
    273 }}}
    274 
    275 At this point we are stuck. We need to instantiate `b` before `Eq` can scrutinize its argument to check whether an instance is available. One possibility is to instantiate by default every type linked to a variable appearing in a type class or type family.
    276 
    277 That solution poses its own problems. Consider the following type family:
    278 
    279 {{{
    280 type family F a b
    281 type instance F [a] b = b -> b
    282 }}}
    283 
    284 Using the rule of always instantiating, the result of `gamma ~ F [Int] b, InstanceOf b (forall a. a -> a)` is `gamma ~ (delta -> delta) -> (delta -> delta)`. We have lost polymorphism in a way which was not expected. What we hoped is to get `gamma ~ (forall a. a -> a) -> (forall a. a -> a)`.
    285 
    286 Thus, we need to have a way to instantiate variables appearing in type classes and families, but only as necessary. We do this by temporarily instantiating variables when checking for axiom application, and returning extra constraints which make this instantiation possible if the match is successful. For example, in the previous case we want to apply the axiom `forall e. Eq e => Eq [e]`, and thus we need to instantiate `a`. We return as residual constraints `Eq e /\ Eq a ~ Eq [e]`, and the solver takes care of the rest, that is, `InstanceOf [e] (forall b. [b])`.
    287 
    288 //Implementation note//: the changes need to be done in the `lookupInstEnv'` function in `compiler/types/InstEnv.hs`. The solver needs to be changed at `compiler/typecheck/TcInteract.hs` to use the new information.
     1Moved to [[wiki:ImpredicativePolymorphism/Impredicative-2015]]