This page tracks @akio's attempt at reviving @nomeata's work on NestedCPR.

Latest code can be found at . Was last rebased by @mpickering here: D4244 is probably more up to date.


Changes to CPRResult

  • In the product case (RedProd), keep one DmdResult for each component. This is the main change in the branch.
  • Add NeverReturns contructor. It means that no value is returned from this expression. It's different from ThrowsExn in that it might perform some side effect. Indeed it might be an IO-performing infinite loop. The addition of this constructor is necessary to infer a nested CPR property for a tail- recursive function that does some I/O.

In summary, CPRResult is changed from:

                   /    \
              RetProd   RetSum ConTag


                   /    \
  RetProd [DmdResult]    RetSum ConTag
                  \     /

Changes to DmdResult

  • The new Converges contructor is added to track definite convergence.
               Dunno CPRResult
               /         \
           ThrowsExn      Converges CPRResult (new)

This information is used to tell when it is safe to perform a nested CPR worker-wrapper transformation. Unpacking a nested component in the return value is safe only when that component definitely converges.

Changes to the demand analyzer

  • Infer nested CPR property when possible.
  • Slightly strenghten the strictness analysis so that in the following code, both foo and bar get the strictness S(SS) on their first argument. Previously only foo got S(SS), where bar got S.
{-# LANGUAGE BangPatterns #-}
data Complex a = !a :+ !a

foo :: Complex Double -> Int -> Complex Double
foo !x 0 = x
foo (a :+ b) _ = b :+ a

bar :: Complex Double -> Int -> Complex Double
bar x 1 = x
bar (a :+ b) _ = b :+ a

Changes to the worker-wrapper transformer

  • Apply nested CPR transformation. For example, if a function that returns (# State# RealWorld, (Int, Int, Int) #) has the CPR information m(t, tm(tm(t), m(t), t)), then the worker function would return the type (# State# RealWorld, Int#, Int, Int #), unboxing the triple and one of the Ints.



This is a simple recursive function with an easy-to-spot nested CPR property. Note that a - 1 is always convering in an obvious way.

Status: ok

module Foo where
f :: Int -> (Int, Int)
f 0 = (1, 2)
f n
  | even n = f (div n 2)
  | otherwise = case f (n - 1) of
    (a, b) -> (a - 1, b)


This one is trickier in that the analysis has to use the nested strictness of foo on p to give p a nested CPR property. Inspired by nofib/imaginary/x2n1.

Indeed, it's not obvious how the last branch has the nested CPR property. Consider the condition for the first branch, which will evaluate p and its two Ints completely. So, we have the components of p available deconstructed. According to Note [CPR in a product case alternative] this is enough to give p the nested CPR property, in the sense that we could supply a constructed product of that depth if we wanted to (because we can immediately deconstruct it when the wrapper is inlined, for example).

Status: ok

{-# LANGUAGE BangPatterns #-}
module Foo where

foo :: Int -> (Int, Int) -> (Int, Int)
foo n p
  | even (n + uncurry (+) p), n /= 0 = foo (n - 1) p
  | n == 0 = (1, 2)
  | otherwise = p


This one needs a correct handling of strict constructor fields. Inspired by nofib/imaginary/x2n1.

Status: ok

{-# LANGUAGE BangPatterns #-}
module Foo where
data C a = C !a !a

pow :: C Double -> Int -> C Double
pow !_ 0 = C 0 1
pow !c 1 = c
pow c n
  | even n = let d = pow c (div n 2) in mul d d
  | otherwise = mul c (pow c (n - 1))

mul :: C Double -> C Double -> C Double
mul (C a b) (C d e) = C (a*d-b*e) (a*e+b*d)


This is similar to strict_field.hs, but needs a more aggressive worker-wrapper.

Status: ok

Changing CPR analysis alone wouldn't help here. We need to give the function a better strictness as well.

module Foo (pow) where

data C a = C !a !a

pow :: C Double -> Int -> C Double
pow x y
  | even y    = pow (x `mul` x) (y `quot` 2)
  | y == 1    = x
  | otherwise = pow (x `mul` x) ((y - 1) `quot` 2) `mul` x

mul :: C Double -> C Double -> C Double
mul (C a b) (C d e) = C (a*d-b*e) (a*e+b*d)


This example involves reading a tuple of Ints from memory.

Status: needs -fcpr-depth=4 or higher.

{-# LANGUAGE BangPatterns #-}
module Foo (peek4) where

import Foreign.Storable
import Foreign.Ptr

peek4 :: Ptr Int -> Ptr Int -> (Ptr Int -> IO (Ptr Int, Ptr Int)) -> IO (Ptr Int, Ptr Int, (Int, Int, Int, Int))
peek4 ptr end req
  | end `minusPtr` ptr >= 32 = do
      val <- (,,,)
          <$> peekByteOff ptr 0
          <*> peekByteOff ptr 8
          <*> peekByteOff ptr 16
          <*> peekByteOff ptr 24
      let !ptr' = ptr `plusPtr` 32
      return (ptr', end, val)
  | otherwise = do
      (ptr', end') <- req ptr
      peek4 ptr' end' req


This is a real-world example taken from It serializes a Word using a variable-length encoding.

Status: ok

module Foo(beamWordPoke) where

import Data.Bits
import Data.Monoid
import Data.Word
import Foreign.Ptr
import Foreign.Storable

newtype Poke = Poke (Ptr Word8 -> IO (Ptr Word8))

instance Monoid Poke where
  mempty = Poke return
  mappend (Poke a) (Poke b) = Poke $ \ptr -> a ptr >>= b

beamWordPoke :: Word -> Poke
beamWordPoke n
    | next == 0 = pokeWord8 firstSeptet
    | otherwise = pokeWord8 (firstSeptet .|. 0x80) <> beamWordPoke next
        firstSeptet :: Word8
        firstSeptet = fromIntegral $ n .&. 0x7F
        next = n `shiftR` 7

pokeWord8 :: Word8 -> Poke
pokeWord8 w = Poke $ \p -> do poke p w; return $! p `plusPtr` 1
Last modified 21 months ago Last modified on Apr 3, 2018 11:57:08 AM