Opened 3 years ago

Closed 3 years ago

## #12241 closed bug (duplicate)

# Surprising constructor accumulation

Reported by: | dfeuer | Owned by: | |
---|---|---|---|

Priority: | normal | Milestone: | |

Component: | Runtime System | Version: | 7.10.3 |

Keywords: | Cc: | simonmar | |

Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |

Type of failure: | Runtime performance bug | Test Case: | |

Blocked By: | Blocking: | ||

Related Tickets: | #2607 | Differential Rev(s): | |

Wiki Page: |

### Description

`containers`

version 0.5.7.1 (and a few earlier versions) uses the following implementation of `fromList`

by Ross Paterson:

fromList :: [a] -> Seq a fromList = Seq . mkTree 1 . map_elem where {-# SPECIALIZE mkTree :: Int -> [Elem a] -> FingerTree (Elem a) #-} {-# SPECIALIZE mkTree :: Int -> [Node a] -> FingerTree (Node a) #-} mkTree :: (Sized a) => Int -> [a] -> FingerTree a mkTree !_ [] = EmptyT mkTree _ [x1] = Single x1 mkTree s [x1, x2] = Deep (2*s) (One x1) EmptyT (One x2) mkTree s [x1, x2, x3] = Deep (3*s) (One x1) EmptyT (Two x2 x3) mkTree s (x1:x2:x3:x4:xs) = case getNodes (3*s) x4 xs of (ns, sf) -> case mkTree (3*s) ns of !m -> Deep (3*size x1 + size m + size sf) (Three x1 x2 x3) m sf getNodes :: Int -> a -> [a] -> ([Node a], Digit a) getNodes !_ x1 [] = ([], One x1) getNodes _ x1 [x2] = ([], Two x1 x2) getNodes _ x1 [x2, x3] = ([], Three x1 x2 x3) getNodes s x1 (x2:x3:x4:xs) = (Node3 s x1 x2 x3:ns, d) where (ns, d) = getNodes s x4 xs map_elem :: [a] -> [Elem a] #if __GLASGOW_HASKELL__ >= 708 map_elem xs = coerce xs #else map_elem xs = Data.List.map Elem xs #endif {-# INLINE map_elem #-}

This uses one lazy list per "level" in the tree being constructed. I believe Paterson (and pretty much everyone else) expected that there would be `O(log n)`

pair constructors and conses live at any given time. Wadler's technique in Fixing some space leaks with a garbage collector, which the GHC commentary indicates is used in GHC, should clean up the pairs in `getNodes`

's `d`

thunks as they reach WHNF.

Lennart Spitzner dug into the unimpressive performance of the above code and using

main = evaluate $ S.fromList [(0::Int)..999999]

produced this heap profile. If I'm reading it right, this suggests that there are lots of `(,)`

and also `(:)`

constructors live, more `O(n)`

than `O(log n)`

.

I had previously found that I could improve performance by building the intermediate lists strictly, but that violates the generational hypothesis and leads to a slow-down for very large arguments (Spitzner's heap profile). Spitzner was able to come up with a very clever (but much trickier) implementation that skirted all these problems (profile) and avoids ever allocating the troublesome pairs.

So the problem is thoroughly bypassed for `containers`

, but it seems like something is not quite right here, and it might bear looking into.

### Change History (2)

### comment:1 Changed 3 years ago by

### comment:2 Changed 3 years ago by

Related Tickets: | → #2607 |
---|---|

Resolution: | → duplicate |

Status: | new → closed |

**Note:**See TracTickets for help on using tickets.

I haven't looked into this specific issue, but I just wanted to mention that the "selector thunk" optimisation that would eliminate those pairs in the heap is sometimes fragile. See #2607 for example.