Opened 3 years ago

Last modified 3 years ago

#13014 new bug

Seemingly unnecessary marking of a SpecConstr specialization as a loopbreaker

Reported by: nfrisby Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.0.1
Keywords: SpecConstr Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by nfrisby)

SpecConstr creates the following rules, with the right cajoling. (I've used unboxed integers merely to avoid w/w, which only adds noise in this example.)

(See comment:3 for a Minimal Working Example.)

data VL :: [k] -> * where
  VLZ :: VL '[]
  VLS :: VL as -> VL (a ': as)

lengthVL :: GHC.Types.SPEC -> VL as -> Int#
{-# INLINABLE lengthVL #-}
lengthVL !sPEC VLZ = 0#
lengthVL !sPEC (VLS vl) = 1# +# lengthVL sPEC vl

==================== Tidy Core rules ====================
"SC:lengthVL0" [ALWAYS]
    forall (@ a) (@ (as :: [*])) (sc :: VL as).
      lengthVL @ (a : as)
               SPEC
               (VLS
                  @ * @ (a : as) @ as @ a @~ (<a : as>_N :: (a : as) ~ (a : as)) sc)
      = lengthVL_$slengthVL1 @ a @ as sc
"SC:lengthVL1" [ALWAYS]
    forall (sc :: VL '[]).
      lengthVL @ '[] SPEC sc
      = lengthVL_$slengthVL sc

But the cons-case specialization, lengthVL_$slengthVL1, is marked as a loopbreaker. Consider the following idiomatic usage to see why that is problematic.

class KnownSpine (as :: [k]) where sing :: VL as
instance KnownSpine '[] where   -- '
  {-# INLINE sing #-}
  sing = VLZ
instance KnownSpine as => KnownSpine (a ': as) where   -- '
  {-# INLINE sing #-}
  sing = VLS sing

example :: Int
example = I# $ lengthVL SPEC (sing :: VL '[Int,Char,Bool])

The right-hand side of example would ideally be simplified to 3. It's not, ultimately because the specialization is marked as a loopbreaker.

I switched on -dverbose-core2core to track the simplification of the right-hand side of example. 1) The sing dictionary is unfolded to constructor applications. 2) Those are floated out but then pre-inlined-unconditionally right back in before CSE gets a chance to spoil it. 3) Thus the VLS rule fires. But it only fires once, because of the loopbreaker designation!

I have not yet investigated why the specialization in the cons-case is marked a loopbreaker.

(Even if the specialization wasn't being considered a loopbreaker --- which immediately makes this approach to optimization a dead-end --- I don't know with any certainty how to force the specialization to be inlined in those cases where its right-hand side was relatively large.)

Attachments (1)

T13014.tar.gz (695 bytes) - added by nfrisby 3 years ago.
A demonstration for ticket #13014

Download all attachments as: .zip

Change History (12)

comment:1 Changed 3 years ago by nfrisby

I anticipate that the specialization would be marked a loopbreaker if it were created in a downstream module (i.e. as an orphan rule). But my naive attempt at that didn't work, so I'm suspecting that SpecConstr doesn't consider imported ids (even INLINABLE ones)?

comment:2 Changed 3 years ago by simonpj

Where does the specialisation come from? You can't be showing us all the code to reproduce! Can you do that? Then I can answer about the loop breaker stuff.

Simon

Changed 3 years ago by nfrisby

Attachment: T13014.tar.gz added

A demonstration for ticket #13014

comment:3 Changed 3 years ago by nfrisby

I've attached three hs files as T13014.tar.gz. They demonstrate the issue when they're all compiled with -O. (The second module specifies OPTIONS_GHC -fspec-constr.)

The characteristic of this ticket is that the -ddump-simpl for the GADTSpecConstr3.hs module does not define test as simply 3. More specifically: I expect that the SC rule for lengthVL applied to VLS should fire three times, but it's only firing once.

Embarassingly and frustratingly, I've lost the VLZ rule with my interim edits and I can't get it back. Thankfully, it's not actually necessary for this ticket.

Last edited 3 years ago by nfrisby (previous) (diff)

comment:4 Changed 3 years ago by nfrisby

I'm seeing the same behavior if I disable -fspec-constr and instead try to use SPECIALIZE INLINE.

{-# SPECIALIZE INLINE lengthVL :: SPEC -> VL (a ': as) -> Int# #-}

I get the expected rule.

==================== Tidy Core rules ====================
"SPEC lengthVL" [ALWAYS]
    forall (@ k) (@ (a :: k)) (@ (as :: [k])).
      lengthVL @ k @ (a : as)
      = lengthVL_$slengthVL @ k @ a @ as

but the specialization lengthVL_$slengthVL is again marked as a loopbreaker, and so it only fires once in the definition of test, as with the -fspec-constr route.

Edit: A ha! If I put that SPECIALIZE INLINE pragma in the downstream module instead of in the same module that defines lengthVL, then the rule does indeed fire three times in the right-hand side of test. If I add a rule for the '[] case, then GHC simplifies all the way to test = 3. (This is related to my suspicion in comment:1.)

Last edited 3 years ago by nfrisby (previous) (diff)

comment:5 Changed 3 years ago by simonpj

Ah I see. Consider

f [] = 0
f (x:xs) = f xs

boo = f [1,2,3,4,5,6,6]

Would you expect f to inline 7 times in the RHS of boo yielding 0? Maybe it'd be good, but GHC doesn't do that, because inlining a recursive function repeatedly can make the compiler loop at compile time, or just to arbitrary code growth. Yet in this case it's good.

In your example we have

lengthVL_$slengthVL
  = ...
    case lengthVL @ k @ as sPEC_XyR sc_sCE of wild_Xc { __DEFAULT ->
    ...

and a RULE

"SC:lengthVL0"
    forall ...
      lengthVL @ k @ (a : as) sc1_sCC
               (GADTSpecConstr.VLS ...)
      = lengthVL_$slengthVL @ k @ a @ as sc_sCE sc1_sCC

So if I inline lengthVL_$slengthVL I might create an opportunity for the RULE to fire; which gives ries to a new call of lengthVL_$slengthVL, and the whole process repeasts. Perhpas indefinitely.

It's all very much like the f/boo case above, and for that reason GHC marks lengthVL_$slengthVL as a loop breaker.

There is extensive commentary under Note [Choosing loop breakers] in OccurAnal.

I wish I knew how to make this better, but I don't. Yet anyway.

comment:6 Changed 3 years ago by nfrisby

Description: modified (diff)

comment:7 Changed 3 years ago by nfrisby

Thank you for your attention, Simon.

I've confirmed your example: these modules A and B give the same behavior: the specialization in A is a loopbreaker and the rule only fires once in B.

{-# Language MagicHash #-}
{-# OPTIONS_GHC -fspec-constr #-}
module A where
import GHC.Prim
import GHC.Types

f :: [a] -> Int#
f [] = 0#
f (x:xs) = f xs

f' a b = f (a:b)   -- a call pattern to specialize
{-# Language MagicHash #-}
module B where
import GHC.Types
import A

boo = I# (f [1,2,3,4,5,6,6])

And your explanation makes total sense: unexpected supercompilation could have terrible consequences. Also, that's something I'm usually aware of, when I'm not wearing my blinders :).


The SPECIALIZE INLINE alternative I mentioned in comment:4 is interesting. It's possible as a "workaround" for lengthVL precisely because the type constructors (of the spine) are 1-to-1 with the data constructors; thus SPECIALIZE can be used to emulate -fspec-constr.


I opened #13016 regarding the SPECIALIZE INLINE specialization being a loopbreaker -- that seems like a bug.

comment:8 in reply to:  5 Changed 3 years ago by nfrisby

Replying to simonpj:

[snip]

...

I wish I knew how to make this better, but I don't. Yet anyway.

One quick idea: introduce GHC.Types.DEEP_SPEC alongside SPEC. The key difference is that the occurrence analyzer would try very hard to not mark specializations involving DEEP_SPEC as loopbreakers. The docs for this would likely emphasize some of the potential downsides of this unbounded inlining.

Unless you think that's promising, let's close this ticket as "Not a Bug". I don't recall wanting aggressive inlining of a recursive function that wasn't driven by a type-refining GADT, so #13016 ought to handle my needs.

comment:9 Changed 3 years ago by simonpj

Can you be more specific about "try very hard"? The thing is, as far as I can see, it is loop breaker, as my comment shows. That is, not marking it as such would allow arbitrary inlining.

comment:10 in reply to:  9 Changed 3 years ago by nfrisby

Replying to simonpj:

Can you be more specific about "try very hard"? The thing is, as far as I can see, it is loop breaker, as my comment shows. That is, not marking it as such would allow arbitrary inlining.

My basic thought is that DEEP_SPEC is to SPEC as SPECIALIZE INLINE is to SPECIALIZE. (Thus, SPEC_INLINE might be a better/more consistent name than DEEP_SPEC.)

The hypothetical power user would intentionally choose DEEP_SPEC instead of SPEC specifically to allow arbitrary inlining. With SPEC (i.e. normal -fspec-constr), GHC assumes the burden of ensuring that the specialization cannot lead to an infinite loop. By choosing DEEP_SPEC, the user intentionally transfers this burden from GHC to themselves. The docs for DEEP_SPEC would include a warning similar to those already present for SPECIALIZE INLINE ("Warning: you can make GHC diverge by using SPECIALISE INLINE on an ordinarily-recursive function") and RULES ("GHC makes no attempt to make sure that the rules are confluent or terminating. For example: ... This rule will cause the compiler to go into an infinite loop.").

DEEP_SPEC would be a tool for the power user who wrote the f and boo functions you came up with and did want GHC to inline f 7 times so that boo was identified as 0 at compile-time.

In terms of mechanism, I don't have enough command of OccAnal to be much more specific. My basic idea is that it wouldn't mark a DEEP_SPEC specialization as a loopbreaker unless 1) the specialization directly calls itself by name or 2) for whatever reason, OccAnal has no other choice. In other words, GHC would only mark a DEEP_SPEC specialization as a loopbreaker if it concluded that not doing so would always lead to an infinite inlining --- "possibly infinite inlining" would no longer be cause enough to mark a particular specialization as a loopbreaker.

I'll mention one last time that I do not have a real-world motivation for this new behavior in-hand. It doesn't seem like an obviously bad idea (buoyed by saying "power user" and "warnings in the docs"). But it's also not obviously better than all currently existing choices for this hypothetical power user! I just haven't thought about this long enough yet.

Last edited 3 years ago by nfrisby (previous) (diff)

comment:11 Changed 3 years ago by simonpj

OK fine. Let's leave this thread open and return to it if we have applications.

Note: See TracTickets for help on using tickets.