Opened 13 years ago

Closed 3 years ago

#876 closed bug (fixed)

Length is not a good consumer

Reported by: ariep@… Owned by:
Priority: lowest Milestone: 7.6.2
Component: libraries/base Version: 6.5
Keywords: length Cc: rich.neswold@…, george.colpitts@…
Operating System: Linux Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: perf/should_run/T876
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

The program

module Main where

main = print $ length . filter odd $ [0 .. 999999999]

, compiled with ghc-6.5.20060508 or later with optimisation turned on (-O), overflows the stack when run. It seems that the generated code is not tail-recursive. 6.4.2 and 6.5 snapshots up to 20060507 do not have this problem.

This may be a silly example, but the same thing happens if you replace 'odd' with some more interesting predicate, and let the length of the input list be chosen by the user.

Change History (31)

comment:1 Changed 13 years ago by simonmar

Milestone: 6.6

Thanks for narrowing this down to a particular day! I'd be willing to bet money that this is the culprit:

http://www.haskell.org/pipermail/cvs-libraries/2006-May/004996.html

comment:2 Changed 13 years ago by igloo

I've confirmed that the patch Simon M indicates is indeed the problem, and I've copied it below for reference.

It makes sense to me that the above behaviour is seen: length is now a good consumer, but it generates 1+(1+(1+(1+... as it is consuming, and this causes a stack overflow. I don't think we can fix this while staying with fold/build fusion, so it looks to me like the patch should be reverted and the whole problem looked at post-6.6.

Do you agree, Simon PJ?

Thanks Ian

[Make length a good consumer
simonpj@microsoft**20060508142726

 Make length into a good consumer.  Fixes Trac bug #707.

 (Before length simply didn't use foldr.)

] {
hunk ./GHC/List.lhs 114
-length                  :: [a] -> Int
-length l                =  len l 0#
-  where
-    len :: [a] -> Int# -> Int
-    len []     a# = I# a#
-    len (_:xs) a# = len xs (a# +# 1#)
+length :: [a] -> Int
+length l = lenAcc 0# l
+
+{-# RULES
+"length" [~1] forall xs. length xs = foldr incL (I# 0#) xs
+"lenAcc" [1]  forall n#. foldr incL (I# n#) = lenAcc n#
+ #-}
+
+incL :: a -> Int -> Int        -- Internal
+{-# NOINLINE [0] incL #-}
+incL x n = n `plusInt` oneInt
+
+lenAcc :: Int# -> [a] -> Int   -- Internal
+lenAcc a# []     = I# a#
+lenAcc a# (_:xs) = lenAcc (a# +# 1#) xs
}

comment:3 Changed 13 years ago by simonmar

Component: Compilerlibraries/base
Milestone: 6.66.6.1

I rolled back the patch for 6.6 - linear stack space is more important than the constant factor speedup from fusion. Let's look at this again for 6.6.1.

comment:4 Changed 13 years ago by simonpj

Summary: stack overflow on 'length . filter odd $ [0 .. 999999999]'Length is not a good consumer

Changing title to reflect current bug

comment:5 Changed 13 years ago by simonpj

Milestone: 6.6.16.8
Owner: set to simonpj

comment:6 Changed 13 years ago by simonpj

This will probably get fixed as a result of revising list fusion: task #915

comment:7 Changed 13 years ago by igloo

Test Case: list003

comment:8 Changed 12 years ago by guest

probably this ticket should be joined to the #915?

comment:9 Changed 12 years ago by guest

Cc: rich.neswold@… added

comment:10 Changed 12 years ago by simonmar

Milestone: 6.8 branch6.10 branch

comment:11 Changed 11 years ago by igloo

Milestone: 6.10 branch6.12 branch

comment:12 Changed 10 years ago by igloo

Milestone: 6.12 branch6.12.3

comment:13 Changed 9 years ago by igloo

Milestone: 6.12.36.14.1
Priority: normallow

comment:14 Changed 9 years ago by igloo

Milestone: 7.0.17.0.2

comment:15 Changed 9 years ago by igloo

Milestone: 7.0.27.2.1

comment:16 Changed 8 years ago by igloo

Milestone: 7.2.17.4.1

comment:17 Changed 8 years ago by igloo

Milestone: 7.4.17.6.1
Priority: lowlowest

comment:18 Changed 7 years ago by igloo

Milestone: 7.6.17.6.2

comment:19 Changed 7 years ago by george.colpitts

Keywords: length added
Type of failure: None/Unknown

comment:20 Changed 7 years ago by george.colpitts

Type of failure: None/UnknownRuntime performance bug

I don't remember explicitly setting Type of failure to None/Unknown when I added a keyword of length but it seems I did. I have now changed the failure type to runtime performance bug. On ghc 7.4.1 the problem I see is that the space usage of length is O(n) not O(1) in the size of the input, perhaps this is a better summary of the bug. I'm assuming that this is what "length is not a good consumer" means. This of course also slows down execution speed. Given the preceding I'm not sure why priority is lowest.

*Main> length  [0 .. (2^20)]
1048577
it :: Int
(0.03 secs, 42303644 bytes)
*Main> length  [0 .. (2^30)]
1073741825
it :: Int
(23.08 secs, 42950370096 bytes)
*Main> 42950370096 / 42303644
1015.2877160180338

comment:21 in reply to:  20 Changed 7 years ago by george.colpitts

Architecture: x86Unknown/Multiple

Replying to george.colpitts: Got it, length is not a good consumer, refers to list fusion (deforestation) as explained in e.g. 7.17.4 of the ghc 7.4.1 Users Guide.

As this is on multiple platforms including x86, I have changed the architecture to Multiple/Unknown

comment:22 Changed 7 years ago by george.colpitts

Cc: george.colpitts@… added

comment:23 Changed 7 years ago by morabbin

Resolution: invalid
Status: newclosed

Would have been covered by #915, which is marked closed as invalid; doing the same here.

comment:24 Changed 7 years ago by morabbin

Owner: simonpj deleted
Resolution: invalid
Status: closednew

Premature; reopening as a performance bug. #915 was about the general stream fusion technique, which needs moar research. This is a good place to track an exemplar of the problem.

comment:25 Changed 7 years ago by simonpj

OK, so the problem in Ian's comment 6 years ago (ha ha!) was that the naive version of length using foldr does not lead to a tail recursive function. But after a little reflection I realise that what we want is foldl with a strict accumulating parameter; and that can be expressed using foldr. So here is a version that does work (notice the strict apply) in incL:

len :: [a] -> Int
len xs = foldr incL id xs 0

incL :: a -> (Int -> Int) -> Int -> Int
incL = \_ g x -> g $! (x+1)

To demonstrate:

foo :: Int -> Int -> Int
foo p q = len [p..q]

Compile with -O to get this nice tight code

Len.$wfoo :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
Len.$wfoo =
  \ (ww_sgn :: GHC.Prim.Int#) (ww1_sgr :: GHC.Prim.Int#) ->
    case GHC.Prim.># ww_sgn ww1_sgr of _ {
      GHC.Types.False ->
        letrec {
          $wgo_sgA [Occ=LoopBreaker]
            :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
          [LclId, Arity=2, Str=DmdType LL]
          $wgo_sgA =
            \ (w_sga :: GHC.Prim.Int#) (ww2_sgd :: GHC.Prim.Int#) ->
              case GHC.Prim.==# w_sga ww1_sgr of _ {
                GHC.Types.False ->
                  $wgo_sgA (GHC.Prim.+# w_sga 1) (GHC.Prim.+# ww2_sgd 1);
                GHC.Types.True -> GHC.Prim.+# ww2_sgd 1
              }; } in
        $wgo_sgA ww_sgn 0;
      GHC.Types.True -> 0
    }

So I think we can do this for length in the library. I need a clean nofib run to test, though.

Simon

comment:26 Changed 7 years ago by simonpj

Resolution: fixed
Status: newclosed
Test Case: list003perf/should_run/T876

This patch to GHC.List makes length a good consumer, for what it's worth. The nofib numbers didn't budge.

Simon

commit 82f56e5a331f9bbd5c8afda9e8d8ad71059e934b
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Thu Feb 14 08:22:08 2013 +0000

    Make 'length' into a good consumer, fixing Trac #876
    
    Trac #876 is the oldest ticket I have fixed in a long time.
    I finally figured out how to make foldr behave in a non
    space-leaky way for length.
    
    Thanks to Andy for re-opening.

>---------------------------------------------------------------

 GHC/List.lhs |   23 ++++++++++++++++++-----
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/GHC/List.lhs b/GHC/List.lhs index 02d6965..049aa2a 100644
--- a/GHC/List.lhs
+++ b/GHC/List.lhs
@@ -114,12 +114,25 @@ null (_:_)              =  False
 -- | /O(n)/. 'length' returns the length of a finite list as an 'Int'.
 -- It is an instance of the more general 'Data.List.genericLength',
 -- the result type of which may be any kind of number.
+{-# NOINLINE [1] length #-}
 length                  :: [a] -> Int
-length l                =  len l 0#
-  where
-    len :: [a] -> Int# -> Int
-    len []     a# = I# a#
-    len (_:xs) a# = len xs (a# +# 1#)
+length l                =  lenAcc l 0#
+
+lenAcc :: [a] -> Int# -> Int
+lenAcc []     a# = I# a#
+lenAcc (_:xs) a# = lenAcc xs (a# +# 1#)
+
+incLen :: a -> (Int# -> Int) -> Int# -> Int incLen _ g x = g (x +# 1#)
+
+-- These rules make length into a good consumer
+-- Note that we use a higher-order-style use of foldr, so that
+-- the accumulating parameter can be evaluated strictly
+-- See Trac #876 for what goes wrong otherwise {-# RULES
+"length"     [~1] forall xs. length xs = foldr incLen I# xs 0#
+"lengthList" [1]  foldr incLen I# = lenAcc  #-}
 
 -- | 'filter', applied to a predicate and a list, returns the list of
 -- those elements that satisfy the predicate; i.e.,

comment:27 Changed 6 years ago by Ian Lynagh <igloo@…>

In fb9a2203f1964c465189c12832650df2d234fb9e/ghc:

Add a test for length not causing a stack overflow (from #876)

comment:28 Changed 6 years ago by Joachim Breitner <mail@…>

In 393ea739567206d848f53e9ca75f532a49218694/ghc:

Update test cases due to call arity

Some nice improvements on already succeeding test cases (#876, #7954
and #4267)

Test #149 needed a little change, lest call arity causes a allocation
change that we do not want to test here.

comment:29 Changed 3 years ago by George

Resolution: fixed
Status: closednew

I believe this has regressed.

 length [0..10]
11
it :: Int
(0.11 secs, 85,536 bytes)
Prelude> length [0..(10^7)]
10000001
it :: Int
(0.25 secs, 720,084,176 bytes)
Prelude> 

My guess, fwiw, is that this is connected with AMP, making list an instance of Foldable and the issue that rewrite rules cannot be applied to class methods as discussed in recent emails between Conal and SimonPJ.

I think the current priority of lowest is correct but thought it worth noting that the problem still exists. Also I think that the user's guide does not accurately reflect the state of list fusion, i.e. https://ghc.haskell.org/trac/ghc/ticket/915. The User's Guide, section 9.32.6 states "This list could readily be extended; if there are Prelude functions that you use a lot which are not included, please tell us." As this bug and ticket 915 indicates extending the list is not always easy. I can file a doc bug if people would like.

comment:30 Changed 3 years ago by nomeata

This code, compiled with `-O, does fuse, and allocates nothing (or constant amounts)

module Foo where
x :: Int -> Int
x n = length [0..(10^n)::Int]
$ ghci -fobject-code -O Foo
GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling Foo              ( Foo.hs, Foo.o )
Ok, modules loaded: Foo.
Prelude Foo> :set +s
Prelude Foo> x 1
11
(0.03 secs, 14,976,744 bytes)
Prelude Foo> x 7
10000001
(0.02 secs, 0 bytes)
Prelude Foo> x 8
100000001
(0.04 secs, 0 bytes)

(almost) HEAD:

GHCi, version 8.1.20161117: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling Foo              (.hs -> .o)
WARNING: file compiler/simplCore/SimplCore.hs, line 663
  Simplifier bailing out after 4 iterations [58, 14, 2, 2]
    Size = {terms: 96, types: 32, coercions: 0}
Ok, modules loaded: Foo (Foo.o).
Prelude Foo> :set +s
Prelude Foo> x 1
11
(0.19 secs, 94,792 bytes)
Prelude Foo> x 2
101
(0.01 secs, 94,648 bytes)
Prelude Foo> x 7
10000001
(0.01 secs, 98,568 bytes)
Prelude Foo> x 8
100000001
(0.05 secs, 98,448 bytes)

Testing this in with interpreted code is not sufficient, as the optimizer does less in that case. So so far, everything seems as expected to me.

comment:31 Changed 3 years ago by simonmar

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.