This page tracks @akio's attempt at reviving @nomeata's work on [wiki:NestedCPR].
~~Latest code can be found at https://github.com/takano-akio/ghc/compare/master...nested-cpr .~~
Was last rebased by @mpickering here: https://github.com/ghc/ghc/compare/master...mpickering:nested-cpr. [https://phabricator.haskell.org/D4244 D4244] is probably more up to date.
== Design
=== Changes to `CPRResult`
* In the product case (`RedProd`), keep one `DmdResult` for each component. This is
the main change in the branch.
* Add `NeverReturns` contructor. It means that no value is returned from this
expression. It's different from `ThrowsExn` in that it might perform some
side effect. Indeed it might be an IO-performing infinite loop. The addition
of this constructor is necessary to infer a nested CPR property for a tail-
recursive function that does some I/O.
In summary, `CPRResult` is changed from:
{{{
NoCPR
/ \
RetProd RetSum ConTag
}}}
to:
{{{
NoCPR
/ \
RetProd [DmdResult] RetSum ConTag
\ /
NeverReturns
}}}
=== Changes to `DmdResult`
* The new `Converges` contructor is added to track definite convergence.
{{{
Dunno CPRResult
/ \
ThrowsExn Converges CPRResult (new)
/
Diverges
}}}
This information is used to tell when it is safe to perform a nested CPR
worker-wrapper transformation. Unpacking a nested component in the return
value is safe only when that component definitely converges.
=== Changes to the demand analyzer
* Infer nested CPR property when possible.
* Slightly strenghten the strictness analysis so that in the following code,
both `foo` and `bar` get the strictness `S(SS)` on their first argument.
Previously only `foo` got `S(SS)`, where `bar` got `S`.
{{{
{-# LANGUAGE BangPatterns #-}
data Complex a = !a :+ !a
foo :: Complex Double -> Int -> Complex Double
foo !x 0 = x
foo (a :+ b) _ = b :+ a
bar :: Complex Double -> Int -> Complex Double
bar x 1 = x
bar (a :+ b) _ = b :+ a
}}}
=== Changes to the worker-wrapper transformer
* Apply nested CPR transformation. For example, if a function that returns
`(# State# RealWorld, (Int, Int, Int) #)` has the CPR information
`m(t, tm(tm(t), m(t), t))`, then the worker function would return the type
`(# State# RealWorld, Int#, Int, Int #)`, unboxing the triple and one of the
`Int`s.
== Examples
=== simple.hs
This is a simple recursive function with an easy-to-spot nested CPR property. Note that `a - 1` is always convering in an obvious way.
Status: ok
{{{#!hs
module Foo where
f :: Int -> (Int, Int)
f 0 = (1, 2)
f n
| even n = f (div n 2)
| otherwise = case f (n - 1) of
(a, b) -> (a - 1, b)
}}}
=== strictness.hs
This one is trickier in that the analysis has to use the nested strictness of `foo` on `p` to give `p` a nested CPR property. Inspired by `nofib/imaginary/x2n1`.
Indeed, it's not obvious how the last branch has the nested CPR property. Consider the condition for the first branch, which will evaluate `p` and its two `Int`s completely. So, we have the components of `p` available deconstructed. According to `Note [CPR in a product case alternative]` this is enough to give `p` the nested CPR property, in the sense that we could supply a constructed product of that depth if we wanted to (because we can immediately deconstruct it when the wrapper is inlined, for example).
Status: ok
{{{#!hs
{-# LANGUAGE BangPatterns #-}
module Foo where
foo :: Int -> (Int, Int) -> (Int, Int)
foo n p
| even (n + uncurry (+) p), n /= 0 = foo (n - 1) p
| n == 0 = (1, 2)
| otherwise = p
}}}
=== strict_field.hs
This one needs a correct handling of strict constructor fields. Inspired by `nofib/imaginary/x2n1`.
Status: ok
{{{#!hs
{-# LANGUAGE BangPatterns #-}
module Foo where
data C a = C !a !a
pow :: C Double -> Int -> C Double
pow !_ 0 = C 0 1
pow !c 1 = c
pow c n
| even n = let d = pow c (div n 2) in mul d d
| otherwise = mul c (pow c (n - 1))
mul :: C Double -> C Double -> C Double
mul (C a b) (C d e) = C (a*d-b*e) (a*e+b*d)
}}}
=== strict_field1.hs
This is similar to `strict_field.hs`, but needs a more aggressive worker-wrapper.
Status: ok
Changing CPR analysis alone wouldn't help here. We need to give the function a better strictness as well.
{{{#!hs
module Foo (pow) where
data C a = C !a !a
pow :: C Double -> Int -> C Double
pow x y
| even y = pow (x `mul` x) (y `quot` 2)
| y == 1 = x
| otherwise = pow (x `mul` x) ((y - 1) `quot` 2) `mul` x
mul :: C Double -> C Double -> C Double
mul (C a b) (C d e) = C (a*d-b*e) (a*e+b*d)
}}}
=== peek.hs
This example involves reading a tuple of Ints from memory.
Status: needs -fcpr-depth=4 or higher.
{{{#!hs
{-# LANGUAGE BangPatterns #-}
module Foo (peek4) where
import Foreign.Storable
import Foreign.Ptr
peek4 :: Ptr Int -> Ptr Int -> (Ptr Int -> IO (Ptr Int, Ptr Int)) -> IO (Ptr Int, Ptr Int, (Int, Int, Int, Int))
peek4 ptr end req
| end `minusPtr` ptr >= 32 = do
val <- (,,,)
<$> peekByteOff ptr 0
<*> peekByteOff ptr 8
<*> peekByteOff ptr 16
<*> peekByteOff ptr 24
let !ptr' = ptr `plusPtr` 32
return (ptr', end, val)
| otherwise = do
(ptr', end') <- req ptr
peek4 ptr' end' req
}}}
== beam-word-poke.hs
This is a real-world example taken from https://github.com/tsurucapital/beamable/blob/master/src/Data/Beamable/Internal.hs#L159.
It serializes a `Word` using a variable-length encoding.
Status: ok
{{{#!hs
module Foo(beamWordPoke) where
import Data.Bits
import Data.Monoid
import Data.Word
import Foreign.Ptr
import Foreign.Storable
newtype Poke = Poke (Ptr Word8 -> IO (Ptr Word8))
instance Monoid Poke where
mempty = Poke return
mappend (Poke a) (Poke b) = Poke $ \ptr -> a ptr >>= b
beamWordPoke :: Word -> Poke
beamWordPoke n
| next == 0 = pokeWord8 firstSeptet
| otherwise = pokeWord8 (firstSeptet .|. 0x80) <> beamWordPoke next
where
firstSeptet :: Word8
firstSeptet = fromIntegral $ n .&. 0x7F
next = n `shiftR` 7
pokeWord8 :: Word8 -> Poke
pokeWord8 w = Poke $ \p -> do poke p w; return $! p `plusPtr` 1
}}}