Opened 5 years ago

Closed 3 years ago

#9509 closed bug (fixed)

No automatic specialization of inlinable imports in 7.8

Reported by: dolio Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.8.3
Keywords: Cc: alpmestan@…, nfrisby
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: simplCore/should_compile/T9509
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

According to the GHC manual, any uses of an imported definition that is overloaded and INLINABLE will automatically cause a SPECIALIZE to be generated for those uses, if appropriate. However, this no longer appears to be the case in 7.8(.3). Here is a simple test case:

module A (foo) where

import Data.IORef

foo :: Ord a => a -> IO a
foo x = newIORef x >>= readIORef >>= \y -> case compare x y of LT -> return x ; _ -> return y
{-# INLINABLE foo #-}
module Main (main) where

import A

main = foo (5 :: Int) >>= print

foo is constructed to be long enough that GHC 7.8.3 will elect to not inline it.

When compiling with 7.6.3, the core contains the following:

Main.$sfoo1
  :: GHC.Types.Int
     -> GHC.Prim.State# GHC.Prim.RealWorld
     -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Types.Int #)
[GblId,
 Arity=2,
 Caf=NoCafRefs,
 Str=DmdType LL,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=2, Value=True,
         ConLike=True, WorkFree=True, Expandable=True,
         Guidance=IF_ARGS [20 0] 55 30}]
Main.$sfoo1 =
  \ (x_XkE :: GHC.Types.Int)
    (eta_B1 :: GHC.Prim.State# GHC.Prim.RealWorld) ->
    case GHC.Prim.newMutVar#
           @ GHC.Types.Int @ GHC.Prim.RealWorld x_XkE eta_B1
    of _ { (# ipv_amT, ipv1_amU #) ->
    case GHC.Prim.readMutVar#
           @ GHC.Prim.RealWorld @ GHC.Types.Int ipv1_amU ipv_amT
    of ds1_amJ { (# ipv2_Xn8, ipv3_Xna #) ->
    case x_XkE of wild_axu { GHC.Types.I# x#_axw ->
    case ipv3_Xna of _ { GHC.Types.I# y#_axA ->
    case GHC.Prim.<# x#_axw y#_axA of _ {
      GHC.Types.False -> ds1_amJ;
      GHC.Types.True -> (# ipv2_Xn8, wild_axu #)
    }
    }
    }
    }
    }

Main.$sfoo :: GHC.Types.Int -> GHC.Types.IO GHC.Types.Int
[GblId,
 Arity=2,
 Caf=NoCafRefs,
 Str=DmdType LL,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=True,
         ConLike=True, WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)}]
Main.$sfoo =
  Main.$sfoo1
  `cast` (<GHC.Types.Int>
          -> Sym <(GHC.Types.NTCo:IO <GHC.Types.Int>)>
          :: (GHC.Types.Int
              -> GHC.Prim.State# GHC.Prim.RealWorld
              -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Types.Int #))
               ~#
             (GHC.Types.Int -> GHC.Types.IO GHC.Types.Int))

...

------ Local rules for imported ids --------
"SPEC A.foo [GHC.Types.Int]" [ALWAYS]
    forall ($dOrd_sxj :: GHC.Classes.Ord GHC.Types.Int).
      A.foo @ GHC.Types.Int $dOrd_sxj
      = Main.$sfoo

However, in 7.8.3 and newer (I'm actually using 7.9.20140725 for this particular output, but 7.8.3 is similar), we see the following:

Main.main1
  :: GHC.Prim.State# GHC.Prim.RealWorld
     -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
[GblId,
 Arity=1,
 Str=DmdType <L,U>,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=1, Value=True,
         ConLike=True, WorkFree=True, Expandable=True,
         Guidance=IF_ARGS [0] 100 0}]
Main.main1 =
  \ (s_aIb :: GHC.Prim.State# GHC.Prim.RealWorld) ->
    case ((A.foo @ GHC.Types.Int GHC.Classes.$fOrdInt Main.main2)
          `cast` (GHC.Types.NTCo:IO[0] <GHC.Types.Int>_R
                  :: GHC.Types.IO GHC.Types.Int
                     ~R# (GHC.Prim.State# GHC.Prim.RealWorld
                          -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Types.Int #))))
           s_aIb
    of _ [Occ=Dead] { (# ipv_aIe, ipv1_aIf #) ->
    GHC.IO.Handle.Text.hPutStr2
      GHC.IO.Handle.FD.stdout
      (GHC.Show.$fShowInt_$cshow ipv1_aIf)
      GHC.Types.True
      ipv_aIe
    }

There is no local rules section in 7.8.3. Putting a manual SPECIALIZE pragma in the main module generates comparable code to 7.6.3, so it does not appear to be a failure at that level. Rather, GHC seems to just not be generating the SPECIALIZE equivalent.

Presumably this is a regression, but even if not, the manual should be revised.

Change History (7)

comment:1 Changed 5 years ago by alpmestan

Cc: alpmestan@… added

comment:2 Changed 3 years ago by nfrisby

Cc: nfrisby added

comment:3 Changed 3 years ago by nfrisby

With no pragma, foo has the following unfolding and gets inlined.

 Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)
         Tmpl= foo1 `cast` ...}

With the INLINABLE pragma, foo has the following unfolding and does not get inlined -- the final code passes a dictionary to foo.

 Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 210 60
         Tmpl= <lightly simplified RHS, including ETA expansion> `cast` ...}

(Edit upon Edit: The RHS of the unfolding is indeed just a cast.)

In both cases, its the unfolding is just a cast, due to eta-expanding wrt IO's state token. This cast prevents auto-specialization; see Note [Specialisation shape].

https://github.com/ghc/ghc/blob/c36904d66f30d4386a231ce365a056962a881767/compiler/specialise/Specialise.hs#L1607

I don't have much insight yet into "what's going wrong", but I do find it unexpected that adding INLINABLE causes worse code (in this simple of a case, anyway).

At the moment, it seems like this ticket serves as the real example mentioned in Note [Specialisation shape]: "Thus far I have not tried to fix this (wait till there's a real example)." (Edit: see comment:4 -- the cast shouldn't be there!)

Last edited 3 years ago by nfrisby (previous) (diff)

comment:4 Changed 3 years ago by nfrisby

I now have a theory. Based on some of the notes I've found in the source code, I'm concluding (/recalling!) that "unfoldings should only be simplified gently" and "gently implies no eta-expansion". Notably, SimplUtils.updModeForStableUnfoldings sets sm_eta_expand = False, and the simplifier invokes that before entering a stable unfolding or a rule.

However, in this case, we're seeing eta-expansion in the unfolding! I believe that is due to this use of gopt Opt_DoLambdaEtaExpansion dflags in SimplUtils.mkLam --- it's checking the DynFlag instead of checking the sm_eta_expand flag within the SimplEnv (which is not passed to mkLam). Parts of the simplifier, notably simplLam, call mkLam.

I've made "the fix" locally, and it indeed fixes this ticket's example case.

comment:5 Changed 3 years ago by simonpj

Nicolas I think you are spot on. I'll fix.

comment:6 Changed 3 years ago by Simon Peyton Jones <simonpj@…>

In e07ad4d/ghc:

Don't eta-expand in stable unfoldings

See SimplUtils Note [No eta expansion in stable unfoldings],
and Trac #9509 for an excellend diagnosis by Nick Frisby

comment:7 Changed 3 years ago by simonpj

Resolution: fixed
Status: newclosed
Test Case: simplCore/should_compile/T9509

Thanks for the analysis!

Note: See TracTickets for help on using tickets.