#3076 closed bug (wontfix)
Make genericLength tail-recursive so it doesn't overflow stack
Reported by: | Syzygies | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Compiler | Version: | 6.10.1 |
Keywords: | genericLength | Cc: | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | Runtime performance bug | Test Case: | |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description
A likely use for genericLength is to count lists of more than Int elements. However, the source code is not tail-recursive, making a poor "consumer", so genericLength easily overflows stack.
Here is the source code from 6.10.1:
genericLength :: (Num i) => [b] -> i genericLength [] = 0 genericLength (_:l) = 1 + genericLength l
Here is a proposed alternative:
genericLength ∷ (Num i) ⇒ [b] → i genericLength = len 0 where len n [] = n len n (_:xt) = len (n+1) xt
In my test application (enumerating the 66,960,965,307 atomic lattices on six atoms) this alternative avoids overflowing the stack.
[This is not the same issue as http://hackage.haskell.org/trac/ghc/ticket/2962]
Attachments (1)
Change History (6)
comment:1 Changed 11 years ago by
difficulty: | → Unknown |
---|---|
Resolution: | → invalid |
Status: | new → closed |
comment:2 Changed 11 years ago by
Keywords: | genericLength added |
---|---|
Resolution: | invalid |
Status: | closed → reopened |
That was enlightening. Thanks.
The library genericLength blows stack on very short lists. My first proposal fails on infinite lists, using Peano arithmetic. So they both fail reasonable tests. However, one can make a logarithmic improvement in the stack usage of the library function, and pass both tests:
module Main where import Data.Int data Nat = Zero | Succ Nat deriving (Show, Eq) instance Num Nat where fromInteger 0 = Zero fromInteger n = Succ $ fromInteger $ n - 1 Zero + n = n Succ m + n = Succ (m + n) Zero * _ = Zero Succ m * n = n + m * n abs n = n signum Zero = Zero signum (Succ _) = Succ Zero instance Ord Nat where compare Zero Zero = EQ compare Zero _ = LT compare _ Zero = GT compare (Succ m) (Succ n) = compare m n genericLength1 ∷ (Num i) ⇒ [b] → i genericLength1 [] = 0 genericLength1 (_:l) = 1 + genericLength1 l genericLength2 ∷ (Num i) ⇒ [b] → i genericLength2 = len 0 where len n [] = n len n (_:xt) = len (n+1) xt genericLength3 ∷ (Num i) ⇒ [b] → i genericLength3 = len 1 0 where len _ n [] = n len m n (_:xt) | m == n = n + len (n+n) 1 xt | otherwise = len m (n+1) xt intLength1, intLength2, intLength3 ∷ [a] → Int64 intLength1 = genericLength1 intLength2 = genericLength2 intLength3 = genericLength3 natLength1, natLength2, natLength3 ∷ [a] → Nat natLength1 = genericLength1 natLength2 = genericLength2 natLength3 = genericLength3 list :: Int64 -> [Bool] list 0 = [] list n = True : list (n-1) main ∷ IO () main = do -- print $ intLength1 $ list $ 2^19 -- fails w/ 8388608 byte stack print $ intLength2 $ list $ 2^32 print $ intLength3 $ list $ 2^32 print $ natLength1 (repeat 1) > 5 -- print $ natLength2 (repeat 1) > 5 -- fails print $ natLength3 (repeat 1) > 5
Getting this to work requires specializing genericLength, to avoid class overhead. This is nevertheless preferable to writing a length function from scratch, which might fall into either of these traps.
Getting this to work also requires a "good producer". I was surprised to discover that in this context, [1..n] isn't a "good producer".
Changed 11 years ago by
comment:3 Changed 11 years ago by
I ran a timing test, to compare intLengthN for N=1,2,3 on an example that N=1 (the existing library code) could handle:
print $ sum $ map intLength1 $ take 1000 $ iterate tail $ list $ 2^18
intLength3 runs slower (of course) than the invalid intLength2, but 5.5x faster than the current library code intlength1.
comment:4 Changed 10 years ago by
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
This assumes that (==)
terminates, which it might not do, e.g. with some datatypes representing the reals.
comment:5 Changed 10 years ago by
Type of failure: | → Runtime performance bug |
---|
That isn't an equivalent definition, e.g.:
works
works, butfails
doesn't terminate:If you restrict the type to
Int
orInteger
, then it will work thanks to #2962, though.