Opened 5 years ago

Closed 5 years ago

#9344 closed bug (duplicate)

takeWhile does not participate in list fusion

Reported by: dfeuer Owned by:
Priority: normal Milestone:
Component: libraries/base Version: 7.8.3
Keywords: Cc: hvr, ekmett
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #9132 Differential Rev(s):
Wiki Page:

Description

takeWhile doesn't do the list fusion thing. This alternative definition seems to fix that, at least to a great extent. It fused completely in a simple test, and incompletely but still usefully in a more complex one. I don't know how to write the appropriate translate/untranslate RULES for it yet.

{-# LANGUAGE ScopedTypeVariables #-}

takeWhileFB :: forall a . (a -> Bool) -> [a] -> [a]
takeWhileFB p xs = build tw'
  where
    tw' :: forall b . (a -> b -> b) -> b -> b
    tw' kons knil = foldr go knil xs
      where
        go x rest | p x       = x `kons` rest
                  | otherwise = knil

Change History (7)

comment:1 Changed 5 years ago by nomeata

I have some idea to make the translate/untranslate dance generic (and thus reliably usable by people wanting list fusion for their own functions) but I’ll ponder it some more.

Can you explain why it fails in complex cases? And how does it perform? Is a fused takeWhile (<10) [1..10000000] slower than a non-fused? What happens with takeWhile (<10) (cycle [1,100]) (if cycle fuses at all).

comment:2 Changed 5 years ago by dfeuer

I wouldn't call it "failing" so much as "incomplete success". I haven't investigated in detail, and may not be good enough with Core to figure it out anyway. The simple example

print $ length $ takeWhileFB (<10000000) [1..20000000]

allocates virtually nothing. The more complex example

print $ length $ filter (\x->x `rem` 3 == 0) $
  takeWhileFB (<10000000) $ filter even [1..20000000]

allocates a substantial amount, but only something like a third of what Prelude.takeWhile does. I haven't attempted any benchmarking, and I'm too tired right now to look into those test cases. As for speed, if I'm not very much mistaken it takes a lot of slowness and/or a good number of mispredicted branches to make up for the cache effects that excessive allocation will have when combined with non-trivial code, so I believe that should probably be a secondary concern.

comment:3 Changed 5 years ago by nomeata

As for speed, if I'm not very much mistaken it takes a lot of slowness and/or a good number of mispredicted branches to make up for the cache effects that excessive allocation will have when combined with non-trivial code, so I believe that should probably be a secondary concern.

True, and I’m not worried about that; I’m worried about the short-circuting of takeWhile: Does it properly stop the producer from calculating the rest of the list or not. From looking at the code I believe it does, though.

comment:4 Changed 5 years ago by nomeata

I have some idea to make the translate/untranslate dance generic (and thus reliably usable by people wanting list fusion for their own functions) but I’ll ponder it some more.

I have pondered them some more, but they didn’t work out so far.

comment:5 Changed 5 years ago by dfeuer

I messed around with RULES a bit, and the following seem to work okay for starters. Some more rules will be needed to make map/takeWhile and takeWhile/map and whatever else do the right thing. One thing I'm not too clear about: what's the advantage of the explicitly recursive default definition over takeWhile p = foldr (\x r -> if p x then x:r else []) []?

{-# INLINE [0] takeWhileFB #-}
takeWhileFB :: (elt -> lst -> lst) -> lst -> (elt -> Bool) -> elt -> lst -> lst
takeWhileFB kons knil p = \x rest -> if p x then x `kons` rest else knil

{-# NOINLINE [1] takeWhile #-}
takeWhile :: (a -> Bool) -> [a] -> [a]
takeWhile _ []          =  []
takeWhile p (x:xs)
            | p x       =  x : takeWhile p xs
            | otherwise =  []


{-# RULES
"takeWhile"     [~1] forall p xs. takeWhile p xs = build (\kons knil -> foldr (takeWhileFB kons knil p) knil xs)
"takeWhileList" [1]  forall p.    foldr (takeWhileFB (:) [] p) [] = takeWhile p
"takeWhileFB"        forall kons knil p q. takeWhileFB (takeWhileFB kons knil p) knil q =
                        \x rest -> if (q x && p x) then x `kons` rest else knil
 #-}

comment:6 Changed 5 years ago by nomeata

One thing I'm not too clear about: what's the advantage of the explicitly recursive default definition over takeWhile p = foldr (\x r -> if p x then x:r else []) []

Nothing per se, unless benchmarking shows that the explicit version is faster, I guess.

It might even be that simply

takeWhile p xs = build (\kons knil -> foldr (takeWhileFB kons knil p) knil xs)
{-# INLINEABLE takeWhile #-}

is good enough, assuming the non-inlined takeWhile gets compiled to good code (i.e. build, foldr and takeWhileFB are inlined). But I don’t have much experiences to predict that, so you’ll have to experiment and look at core to find out

comment:7 Changed 5 years ago by dfeuer

Resolution: duplicate
Status: newclosed
Note: See TracTickets for help on using tickets.