Changes between Version 46 and Version 47 of NewtypeOptimizationForGADTS

Sep 8, 2016 8:58:14 PM (3 years ago)



  • NewtypeOptimizationForGADTS

    v46 v47  
    11= Optimising newtype-like data types with existentials =
    3 == Motivation == 
     3== Motivation ==
    44Consider this data type declaration
    77  MkT :: !(Foo a) -> T
    9 So `a` is an existentially bound variable, and we cannot use a newtype for `T`.  So, as things stand, a value of type `T` will be represented by a heap-allocated box containing a single pointer to a value of type `(Foo ty)` for some `ty`. 
     9So `a` is an existentially bound variable, and we cannot use a newtype for `T`.  So, as things stand, a value of type `T` will be represented by a heap-allocated box containing a single pointer to a value of type `(Foo ty)` for some `ty`.
    1111This is tantalising.  The extra indirection and allocation gains nothing.  Since `MkT` is strict in its only argument, we could (at codegen time) ''represent'' a value of type `T` by a value of type `Foo ty`.
    1414that we guarantee an efficient unboxed representation for certain data types.
    16 The ticket to track this idea is #1965. 
     16The ticket to track this idea is #1965.
    1818See also [ this OCaml pull request], which does exactly the same kind of thing for OCaml.
    7676Do such constraints just pile on extra junk?
    78 === Strictness === 
     78=== Strictness ===
    7979Unlike a true `newtype`, pattern matching on the constructor ''must'' force the contents to maintain type safety.  In particular, matching on the constructor reveals an existential and/or type information. As Dan Doel found, and pumpkin relayed in, we have to be careful not to reveal such information without forcing the evidence. Since we're using the newtype optimization, the evidence is in the contained field itself.
    139139instance GetLabel (Label1 a) a where getLabel (Label1 x) = x
    141 {* can't do either of these:
    142 newtype PersonId = PersonId Int    -- no type param
     141{- can't do either of these:
     142newtype PersonId = PersonId Int      -- no type param
    143143newtype PersonId a = PersonId Int    -- type param not used
    144 *}
    146146data PersonId a where
    180180== Implementation ==
    182 I know few details about Core, but I would ''guess'' that the only Core feature we'd need to add is a flag for each data type indicating whether it is eligible for the newtype optimization. If so, then that optimization can be applied in code generation when types are dropped. There might be some future tuning to adjust how the inliner treats the constructor, but that doesn't seem at all urgent for now.
    184 That said, it would be nice also to try to avoid "double forcing" when digging through the constructors.
     182Implementation is unfortunately tricky. Simply eliminating the boxing in Stg is
     183easy, and this by itself saves us two words per value + pointer dereferencing.
     184However, the generated code will be ugly, and if we could do this in Core
     185instead of Stg, the simplifier would be able to do some follow-up optimizations
     186and generate good code.
     188To be more specific, we want to do these transformations:
     192D arg1 arg2 ... argN
     194nv_arg (where nv_arg is the only non-void argument)
     197(but we somehow need to bind other args or do substitution. If we do this Stg
     198though we don't need to bind those args as unarise doesn't care about what a
     199void argument is as long as it's void it gets rid of it and it can check
     200void-ness by looking at Id's type)
     205case <exp1> of
     206  D arg1 arg2 ... argN -> <exp2>
     208let arg1 = ...
     209    arg2 = ...
     210    arg3 = ...
     211 in <exp2>
     213(we know only one of these args will be non-void, but all of them should be
     214bound as they can be referred in <exp2>)
     216If we do this in Stg we lose some optimization opportunities and generate ugly
     217code. For example, if the first transformation happens in a let-binding RHS
     218maybe simplifier decides to inline it as it can't duplicate work after the
     219transformation. Similarly it can decide to inline the non-void argument after
     220second transformation which may lead to further optimizations etc.
     222For an example of an ugly code, suppose we had this:
     225case <exp1> of
     226  D (T x) -> <exp2>
     229in Stg this looks like
     232case <exp1> of
     233  D v -> case v of
     234           T x -> <exp2>
     237So now if we do the second transformation we get
     240let v = <exp1> in
     241case v of
     242  T x -> <exp2>
     245but ideally we'd get
     248case <exp1> of
     249  T x -> <exp2>
     252Simplifier would be able to do this after the second transformation.
     254So the problem is
     256- If we implement this in Stg we generate ugly code, and miss some optimization
     257  opportunities (and arguably it doesn't buy us much, it saves 2 words per
     258  allocation + pointer dereferencing)
     260- Implementing this in Core is very hard, if not impossible, without losing
     261  type safety.
     263(copied from [ the mailing list discussion])