Opened 3 years ago

Last modified 2 years ago

#12747 new feature request

INLINE vs NOINLINE vs <nothing> give three different results; two would be better

Reported by: MikolajKonarski Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.0.1
Keywords: Inlining Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #12603 #12781 Differential Rev(s):
Wiki Page:

Description (last modified by MikolajKonarski)

INLINE vs NOINLINE vs <nothing> for this function

https://github.com/LambdaHack/LambdaHack/blob/e7b31a0d937b6ef6e53665eab23663dcaf4ced81/Game/LambdaHack/Client/UI/DrawM.hs#L145

produces distinct heap memory allocation figures in all three cases, see

https://ghc.haskell.org/trac/ghc/ticket/12603?replyto=8#comment:7

and some later comments, but I'd like to move here the discussion of the specific feature request that <nothing> should at any given point be equivalent to either INLINE or NOINLINE.

Three different results are very unwieldy when optimizing code. One cannot tune the INLINE pragma for a single function in isolation, by fixing INLINE or NOINLINE for all other functions and varying only that one, because the optimal setting for some of those other functions may be <nothing>. Which setting is optimal naturally varies (as it should; that is fine), but for each particular state of the codebase and compilation flags, <nothing> should be equal to either INLINE or NOINLINE.
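To make the three cases concrete, here is a minimal sketch of the kind of experiment described above. The function `step` and its three copies are hypothetical stand-ins, not the LambdaHack function from the ticket; the only thing that differs between them is the pragma.

```haskell
-- Hypothetical stand-in for the function under study.
-- The three copies have identical bodies and differ only in the pragma.

stepInline :: Int -> Int
stepInline x = x * x + 1
{-# INLINE stepInline #-}

stepNoInline :: Int -> Int
stepNoInline x = x * x + 1
{-# NOINLINE stepNoInline #-}

-- No pragma: GHC decides at each call site using its own heuristics.
stepDefault :: Int -> Int
stepDefault x = x * x + 1

-- One saturated call site per variant, so each pragma is exercised directly.
totalInline, totalNoInline, totalDefault :: Int -> Int
totalInline   n = sum [stepInline   x | x <- [1 .. n]]
totalNoInline n = sum [stepNoInline x | x <- [1 .. n]]
totalDefault  n = sum [stepDefault  x | x <- [1 .. n]]

main :: IO ()
main = do
  print (totalInline 10)    -- 395
  print (totalNoInline 10)  -- 395
  print (totalDefault 10)   -- 395
```

All three variants compute the same value; only the generated code, and therefore allocation and run time, can differ. One way to compare them is to compile with `-O` and inspect the output of `-ddump-simpl`, or to run the program with `+RTS -s` and compare the allocation figures. A toy function this small may well show no difference at all; the divergence reported in this ticket was observed on a much larger function.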

It seems almost as if GHC considers a function for inlining in some stages and optimizes accordingly, but changes its mind in other places, being bound by neither INLINE nor NOINLINE, so the resulting stack of optimizations is different than in either case. IMHO that's too complex a behaviour for the programmer to follow. GHC should irreversibly decide whether to inline at exactly the same place where it first takes the NO/INLINE pragmas into account, if present.

[Edit: another example is the code in description of ticket https://ghc.haskell.org/trac/ghc/ticket/12603, where INLINE, NOINLINE and <nothing> each produce different "MUT time" in the place marked with "and NOINLINE lands in between".]

Change History (9)

comment:1 Changed 3 years ago by MikolajKonarski

Actually, a weaker guarantee would still be very useful and may be preferable: the guarantee above for functions that are referred to only once and, generalizing, for functions that may be inlined in many places, let <nothing> be equivalent to INLINABLE plus a certain choice of inline or noinline at each occurrence of the function.

Right now this weaker condition doesn't hold, because the function from the test is referred to only once (taking into account any other inlining).
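The per-occurrence choice mentioned in comment:1 can be sketched as follows. This assumes the `inline` and `noinline` functions from `GHC.Magic` (package `ghc-prim`); `noinline` is assumed to be available only in GHC 8.2 and later, so it postdates the 8.0.1 release this ticket was filed against. The names `step`, `hotLoop` and `coldPath` are hypothetical.

```haskell
import GHC.Magic (inline, noinline)

-- Hypothetical function; INLINABLE keeps its unfolding available
-- without by itself forcing a decision at any call site.
step :: Int -> Int
step x = x * x + 1
{-# INLINABLE step #-}

-- At this occurrence we ask GHC to inline the call.
hotLoop :: Int -> Int
hotLoop n = sum [inline step x | x <- [1 .. n]]

-- At this occurrence we ask GHC not to inline the call.
coldPath :: Int -> Int
coldPath n = noinline step n

main :: IO ()
main = do
  print (hotLoop 10)  -- 395
  print (coldPath 3)  -- 10
```

Under the weaker guarantee, the code GHC produces for an unannotated `step` would always match the code produced by some such assignment of `inline` and `noinline` to its occurrences.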

comment:2 Changed 3 years ago by MikolajKonarski

Description: modified (diff)

comment:3 Changed 3 years ago by MikolajKonarski

comment:4 Changed 3 years ago by MikolajKonarski

Description: modified (diff)

comment:5 Changed 3 years ago by MikolajKonarski

Description: modified (diff)

comment:6 Changed 3 years ago by MikolajKonarski

Description: modified (diff)

comment:7 Changed 3 years ago by MikolajKonarski

Actually, I now see that INLINABLE is named similarly to INLINE only by coincidence, and that it is also only by coincidence that it's used (though it's much more powerful than that) together with inline as a per-call-site analogue of INLINE.

I think what we need is more generally giving a precise and symmetric and orthogonal semantics to the optimization-related pragmas (including adding many more). If somebody writes a proper GHC proposal for that, please close this ticket and let's have the discussion there.

I also hereby drop the feature request that the behaviour X of GHC in the absence of the pragma controlling X for a particular piece of code should be equivalent to the source code with pragma X or its negation. I see GHC may control X in a much more fine-grained way than source code manipulation permits (e.g., at the level of Core), so it's OK (even if not ideal) if GHC produces something smarter than the programmer can possibly express.

However, I still think major GHC optimizations should possess informal semantics in terms of source code manipulation and should be controllable via a complete set of pragmas that on the level of granularity available to the programmer produce exactly the same results as said source code manipulations. I think we need per-call-site, per-expression (where applicable), per-definition, recursive-per-definition (where applicable), per-module and per-project versions of local (only within the module) and global (everywhere) versions of positive and negative versions of INLINE, KEEP_UNFOLDING_IN_HI_FILE, SPECIALIZABLE (a part of what is currently INLINABLE), FLOAT_OUT and a few more pragmas for GHC optimizations that have or can be made to have the key property that they have a clear, simple semantics in terms of original source code transformation.

Granted, a few of these combinations don't make sense, e.g., per-module and especially per-project INLINE. If GHC can do much smarter things for a chunk of program than what the simple source-manipulation semantics implies (e.g., by looking at other, distant bits of code or measuring the size and complexity of the Core resulting from subexpressions), let's keep the smart version to be used when neither pragma X nor its negation is specified, or when some special pragma X_HIPER is given, and let's nevertheless have pragma X do the simple, naive thing. I see this goes in the direction of meta-programming, but to the extent that GHC in fact does it, we should embrace meta-programming instead of hiding it. The fact that the meta-programming preserves semantics (and only changes performance) doesn't make it any less in need of proper, civilized support. Also note that while we want GHC to be as smart behind the scenes as possible in the final program, we want it to behave in an explicit and simple way when we juggle pragmas while optimizing by experimentation or while debugging GHC itself.

comment:8 Changed 3 years ago by simonpj

> However, I still think major GHC optimizations should possess informal semantics in terms of source code manipulation and should be controllable via a complete set of pragmas that on the level of granularity available to the programmer produce exactly the same results as said source code manipulations

I'm all for that in principle. It just needs someone to devise a design, and then implement it.

A complicating factor with any per-definition control is that GHC's inlining means that functions get inlined into each other like crazy, so it's never clear where one function begins and another ends.

comment:9 Changed 2 years ago by mpickering

Keywords: Inlining added