Opened 10 years ago

Closed 9 years ago

Last modified 4 years ago

#3738 closed bug (fixed)

Typechecker floats stuff out of INLINE right hand sides

Reported by: rl Owned by: igloo
Priority: normal Milestone: 7.0.1
Component: Compiler Version: 6.13
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: T3738
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by bgamari)

Small program:

foo :: Num a => [a] -> a
{-# INLINE foo #-}
foo = go 0
  where
    go m (n:ns) = m `seq` go (m+n) ns
    go m []     = m

bar :: [Int] -> Int
{-# INLINE bar #-}
bar = foo

Here is what bar looks like in the interface file:

a6de4c46e53e565ed25ab5a38910e9cb
  $wgo :: GHC.Prim.Int# -> [GHC.Types.Int] -> GHC.Prim.Int#
    {- Arity: 2, HasNoCafRefs, Strictness: LS -}
6838e3faa095285614477ebc92f54987
  bar :: [GHC.Types.Int] -> GHC.Types.Int
    {- Arity: 1, HasNoCafRefs, Strictness: Sm, Inline: INLINE,
       Unfolding: InlineRule: (arity 0 False) (Foo.bar_foo) -}
5d06906ae99b9aefa1c6d251c3f2fc46
  bar_foo :: [GHC.Types.Int] -> GHC.Types.Int
    {- Arity: 1, HasNoCafRefs, Strictness: Sm,
       Unfolding: InlineRule: (arity 0 True) (\ w :: [GHC.Types.Int] ->
                                              case @ GHC.Types.Int Foo.$wgo 0 w of ww { DEFAULT ->
                                              GHC.Types.I# ww }) -}

Note that the loop has disappeared from bar's unfolding. Also, bar_foo doesn't have an INLINE pragma.

Incidentally, GHC specialises foo here and the specialisation doesn't get an INLINE pragma, either:

  foo :: forall a. GHC.Num.Num a => [a] -> a
    {- Arity: 1, HasNoCafRefs, Strictness: L, Inline: INLINE,
       Unfolding: InlineRule: (arity 1 False) ... -}

 foo_$sfoo :: [GHC.Types.Int] -> GHC.Types.Int
    {- Arity: 1, HasNoCafRefs, Strictness: Sm,
       Unfolding: InlineRule: (arity 0 False) ... -}

"SPEC Foo.foo [GHC.Types.Int]" ALWAYS forall $dNum :: GHC.Num.Num GHC.Types.Int
  Foo.foo @ GHC.Types.Int $dNum = Foo.foo_$sfoo

Change History (10)

comment:1 Changed 10 years ago by simonpj

Milestone: 6.14 branch
Summary: INLINE function loses unfoldingTypechecker floats stuff out of INLINE right hand sides

I'm not too bothered about this:

  • It's true that bar doesn't get the loop in its unfolding. But it does if you use -fno-method-sharing. Reason: the typechecker floats (foo Int) out of the RHS of bar. I will fix this when doing the new constraint simplifier. But meanwhile it doesn't matter, does it. The top-level $wgo loop works just fine.
  • The specialised version of foo doesn't seem to get an inline pragma, but what matters is that it has an InlineRule unfolding. So that part is ok. Same for bar_foo.

I'm going to leave this open with a new title, for the new constraint simplifier.

Simon

comment:2 Changed 10 years ago by rl

I have no problems with waiting for the new constraint simplifier but would like to point out that this does matter quite a bit. Change the pragma on foo to INLINE [1] and add a rewrite rule for foo. We'd expect the rule to fire when we inline bar but this doesn't happen since foo doesn't appear in the unfolding.

comment:3 Changed 10 years ago by simonpj

Not a problem. See Note [Simplifying inside InlineRules] in SimplUtils.

Since the InlineRule for bar is active in the initial phase we will not inline foo since the inlining only becomes active in phase 1.

Simon

comment:4 Changed 10 years ago by rl

Are you saying that it's not a problem now or won't be a problem with the new constraint simplifier? At the moment, foo won't appear in bar's unfolding even with {-# INLINE [1] foo #-}. This has nothing to do with the simplifier; as you point out, it is floated out by the typechecker which doesn't care about phasing.

comment:5 Changed 10 years ago by simonpj

You are quite right. I was being stupid. The type checker really, really should not float that stuff out.

Simon

comment:6 Changed 9 years ago by igloo

Milestone: 6.14 branch6.14.1

comment:7 Changed 9 years ago by rl

This seems to be fixed in the current HEAD. However, the problem affects vector quite a bit in 6.12. Would it be easy to fix? FWIW, here is a small example which demonstrates how this breaks list fusion. The two functions must be in different modules.

foo :: Num a => a -> [a]
{-# INLINE foo #-}
foo x = map (+1) (repeat x)

bar :: Int -> [Int]
{-# INLINE bar #-}
bar x = map (+2) (foo x)

comment:8 Changed 9 years ago by simonpj

Owner: set to igloo

Ian: could you turn Romans's comment immediately above into a test? The code I get at the moment for 'bar' is

T3738b.bar =
  \ (x_aaz :: GHC.Types.Int) ->
    let {
      a_smr [Dmd=Just L] :: GHC.Types.Int
      [LclId, Str=DmdType]
      a_smr =
        case x_aaz of _ { GHC.Types.I# x1_ajM ->
        GHC.Types.I# (GHC.Prim.+# (GHC.Prim.+# x1_ajM 1) 2)
        } } in
    letrec {
      xs_smt [Occ=LoopBreaker] :: [GHC.Types.Int]
      [LclId, Str=DmdType]
      xs_smt = GHC.Types.: @ GHC.Types.Int a_smr xs_smt; } in
    xs_smt

Note the nice loop for xs_smt.

To test that this stays working, here's a test:

module T3738a where

foo :: Num a => a -> [a]
{-# INLINE foo #-}
foo x = map (+1) (repeat x)

-------------------------
module Main where

import T3738a

bar :: Int -> [Int]
{-# INLINE bar #-}
bar x = map (+2) (foo x)

main = print (bar 2 !! 10000)

Running the program with +RTS -sstderr I get

-- With ghc 6.12:
./T3738 +RTS -sstderr 
5
         953,088 bytes allocated in the heap


-- With HEAD:
./T3738 +RTS -sstderr 
5
          60,368 bytes allocated in the heap

That seems like a big enough difference that the test could spot it.

Ian, could you add that? Thanks.

Simon

comment:9 Changed 9 years ago by igloo

Resolution: fixed
Status: newclosed
Test Case: T3738

Test added.

comment:10 Changed 4 years ago by bgamari

Description: modified (diff)
Note: See TracTickets for help on using tickets.