Changes between Version 9 and Version 10 of ImplementingTreesThatGrow/TreesThatGrowGuidance


Ignore:
Timestamp:
May 31, 2018 8:28:07 AM (16 months ago)
Author:
simonpj
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ImplementingTreesThatGrow/TreesThatGrowGuidance

    v9 v10  
    22[http://www.jucs.org/jucs_23_1/trees_that_grow/jucs_23_01_0042_0062_najd.pdf The Trees that Grow (TTG) idiom] can be used to provide different forms of extensions and variations on an AST. Since April 2018, the HsSyn AST inside GHC supports the TTG idiom. This page provides a set of guiding principles for GHC developers on how to understand and use the TTG idiom in HsSyn.
    33
    4 == Context and Scope ==
    5 The new HsSyn AST supporting the TTG idiom (from now on referred to as TTG HsSyn) is engineered to subsume five different representations of Haskell syntax:
     4== Context and Scope ==
     5
     6The new HsSyn AST supporting the TTG idiom (from now on referred to as TTG HsSyn) is designed to subsume five different representations of Haskell syntax:
    67
    78* AST GhcPs: the AST used in GHC's parsing phase
     
    1112* AST HSE:   the AST used in an external tool such as Haskell-Src-Exts
    1213
     14By "subsume" we mean that it should be possible to instantiate TTG HsSyn to serve all five use-cases.
     15
    1316The subsumption of above five ASTs is done by providing instances for the extension type families.
    1417For instance, the AST for GHC's parsing, renaming, and typechecking are defined by providing instances of the extension type families using accordingly the indices `GhcPs`, `GhcRn`, and `GhcTc`.
    1518[https://github.com/ghc/ghc/blob/master/compiler/hsSyn/HsExpr.hs#L737-L835 Here] is the actual code providing such instances for the `HsExpr` datatype of expressions in the TTG HsSyn.
    16  
    1719
    18 Subsuming above five trees fixes the scope of the design space. For example, TTG HsSyn is not intended to subsume the AST in the GHC's backend (i.e., GHC Core), but it can indeed be used for other purposes like prettyprinting and IDEs.
     20== General pattern for TTG ==
    1921
    20  
    21 == Guiding Principles ==
     22In general a TTG-idiom data type has
     23* A type parameter, called the ''phase descriptor'', that indexes which particular instantiation is required
     24* One ''extension field'' in each data constructor, whose type is given by a type family.  By giving phase-specific instances to this type family, we can add phase-specific information to the constructor.
     25* One unary ''extension constructor'' for each data type, whose argument type is given by a type family. By giving phase-specific instances to this type family, we can add extra phase-specific constructors to the type.
    2226
    23 a. The instantiation of TTG HsSyn should result in a tree that does not have extra fields and constructors.
    24    
     27For example:
     28{{{
     29data Exp x
     30  = Var (XVar x) (IdP x)
     31  | Lam (XLam x) (IdP x)  (Exp x)
     32  | App (XApp x) (Exp x) (Exp x)
     33  | New (XNew x)
     34
     35type family XVar x
     36type family XLam x
     37type family XApp x
     38type family XNew x
     39
     40}}}
     41Here the phase descriptor is `x`.  The first field of each constructor (of type `XVar x` etc) are the extension fields.  The data construcotr `XNew` is the extension constructor.
     42
     43All fields of the data constructors except the first (extension) field are called ''payload fields''.  They are present in every instantiation of the data type.
     44
     45== Guiding Principles ==
     46
     47The design of TTG HsSyn follows these principles:
     48
     49a. The base TTG HsSyn should have all the constructors common across all five ASTs (the ''common data constructors''). These constructors should have, as payload fields, all the fields common across all five ASTs.
     50
     51b. Note, however, that the type of a payload field of a constructor may vary with phase.  For example, in `Lam` above, the first payload field has type `Id x`, and that may vary with phase:
     52{{{
     53type family IdP x
     54type instance IdP GhcPs = RdrName
     55type instance IdP GhcRn = Name
     56type instance IdP GhcTc = Id
     57}}}
     58  But it is still a payload field, because every instantiation of `Exp` has a lambda with a binder; albeit the type of that binder field varies.  This happens in HsSyn: for example, the type of the common (payload) field of the common constructor `HsVar`of `HsExpr x` is `IdP x` where `IdP` is a type family and `x` the phase descriptor.
     59
     60c. The non-payload (i.e. phase-specific) fields of a data constructor are grouped together and introduced via the extension field.  Similarly the phase-specific data constructors are introduced using the extension constructor.
     61
     62d. The instantiation of TTG HsSyn, for a particular phase, should result in a tree that has no redundant fields and constructors.
     63
    2564   For example, the `HsExpr GhsPs` expressions of AST GhcPs should not have the constructor `HsUnboundVar` of the post-renaming phases, or its `HsMultiIf` constructor should also not have an unused field (of the type `Type`) to store the related type produced in the typechecking phase.
    2665
     
    2867
    2968   For example, if `HsExpr GhsPs` expressions of AST GhcPs had the constructor `HsUnboundVar` then it had to depend on the code defining `UnboundVar` (a field of `HsUnboundVar`) in the renaming phase, or if its constructor `MultiIf` had a field of type `Type` then it had to depend on the code defining `Type` in the typechecking phase.
    30    
    31    
    32 b. The base TTG HsSyn should have all the constructors common across all five ASTs, and these constructors should have all the fields common across all five ASTs (even if the type of some fields vary from an AST to another).
    33    
    34    SPJ refers to these common fields as "payload fields" (as opposed to extension fields).
    35 
    36 c. The constructors that are not common are introduced using TTG's new constructor extensions.
    37 
    38 d. For common constructors, their fields that are not common are grouped together and introduced using TTG's new field extensions.
    39 
    40 e. For common constructors, their common fields with a varying type, are given a type using a new type family that extracts from the phase descriptor the type specific to each AST.
    41 
    42    For example, the type of the common (payload) field of the common constructor `HsVar`of `HsExpr x` is `IdP x` where `IdP` is a type family and `x` the phase descriptor.
    4369
    4470
     
    5379-- ...
    5480
    55 data ExpPs 
     81data ExpPs
    5682  = Var RdrName
    57   | Abs RdrName ExpPs
     83  | Lam RdrName ExpPs
    5884  | App ExpPs   ExpPs
    5985}}}
     
    6793data ExpRn
    6894  = Var Name
    69   | Abs Name  ExpRn
     95  | Lam Name  ExpRn
    7096  | App ExpRn ExpRn
    7197  | UVar UnboundVar
     
    80106data ExpTc
    81107  = Var  Id
    82   | Abs  Id   ExpTc
     108  | Lam  Id   ExpTc
    83109  | App  Type ExpTc ExpTc
    84110  | UVar UnboundVar
     
    90116module AST where
    91117
    92 data Exp x 
    93   = Var (XVar x) (XId x)
    94   | Abs (XAbs x) (XId x) (Exp x)
     118data Exp x
     119  = Var (XVar x) (IdP x)
     120  | Lam (XLam x) (IdP x) (Exp x)
    95121  | App (XApp x) (Exp x) (Exp x)
    96122  | New (XNew x)
    97123
    98124type family XVar x
    99 type family XAbs x
     125type family XLam x
    100126type family XApp x
    101127type family XNew x
    102128
    103 type family XId  x
     129type family IdP  x
    104130}}}
    105131
     
    118144
    119145type instance XVar Ps = ()
    120 type instance XAbs Ps = ()
     146type instance XLam Ps = ()
    121147type instance XApp Ps = ()
    122148type instance XNew Ps = Void
    123149
    124 type instance XId  Ps = RdrName
     150type instance IdP  Ps = RdrName
    125151}}}
    126152
     
    136162
    137163type instance XVar Rn = ()
    138 type instance XAbs Rn = ()
     164type instance XLam Rn = ()
    139165type instance XApp Rn = ()
    140166type instance XNew Rn = UnboundVar
    141167
    142 type instance XId  Rn = Name
     168type instance IdP  Rn = Name
    143169}}}
    144170
     
    154180
    155181type instance XVar Tc = ()
    156 type instance XAbs Tc = ()
     182type instance XLam Tc = ()
    157183type instance XApp Tc = Type
    158184type instance XNew Tc = UnboundVar
    159185
    160 type instance XId  Tc = Id
     186type instance IdP  Tc = Id
    161187}}}