Opened 3 years ago

Last modified 19 months ago

#12463 new feature request

SPECIALIZABLE pragma?

Reported by: bgamari Owned by:
Priority: low Milestone:
Component: Compiler Version: 8.0.1
Keywords: Inlining Cc: mpickering, kosmikus
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: 14917 Differential Rev(s):
Wiki Page:

Description (last modified by bgamari)

Currently it is common practice for library authors to use the INLINEABLE pragma to make it more likely that a polymorphic function should get an unfolding in the module's interface file to ensure that GHC is able to specialize. While in practice this works reasonably well, it's not really saying what we often mean: we don't want to inline, we really just want GHC to behave like each use-site's module has a SPECIALISE pragma for each concrete type that the function is used at. For instance, consider,

module ALibrary where

aLibraryFunction :: AClass a => a -> a
aLibraryFunction = {- some large expression involving methods of AClass -}


module SomeUser where

import ALibrary

aUser :: Int -> Int
aUser = {- some large expression involving aLibraryFunction -}

Ideally, we would want GHC to take and produce one specialized version of aLibraryFunction for every concrete type which it is used at. However, without an INLINEABLE function, GHC won't even consider producing an unfolding for aLibraryFunction due to its size. We can only convince GHC to produce an unfolding for aLibraryFunction if we annotate it with an INLINEABLE pragma. While this is often effective, it doesn't really say what we mean: We don't never want GHC to inline; merely to specialize. This is issue especially prevalent in code using MTL-style effects, where we have ubiquitous overloading of very frequently-used functions (e.g. bind).

Really what we want in this case is a way of indicating to GHC that a function shouldn't be inlined (use-sites replaced with the body of the function), but rather that GHC should try hard to specialize away particular type variables. This might look like,

aLibraryFunction :: AClass a => a -> a
aLibraryFunction = {- some large expression involving methods of AClass -}
{-# SPECIALISE(a) forall a. aLibraryFunction :: a -> a #-}

The list of type binders after SPECIALISE is the set of binders which GHC would attempt to specialize.

This pragma requests that GHC keep an inlining around and produce a specialized version of aLibraryFunction every time it saw a concrete instantiation of a. Moreover, the produced symbols could be declared as weak, allowing the linker to cull duplicated code when possible.

Transitive specialisation

The above SPECIALISE pragma still doesn't address the fragility of specialisation, however. Namely, consider,

module ALibrary where
class AClass
instance AClass Int 
aLibraryFunction :: AClass a => a -> a
aLibraryFunction = {- some large expression involving methods of AClass -}
{-# SPECIALISE(a) forall a. aLibraryFunction :: a -> a #-}

module AnotherLibrary where
import ALibrary
aFunction :: AClass a => a -> a
aFunction x = {- ... -} aLibraryFunction x {- ... -}

module AUser where
import AnotherLibrary
f = let x :: Int
        x = 5
    in aFunction x

Here aLibraryFunction may depend crucially on specialisation; however, the polymorphic user aFunction has no way of knowing this and may be too large for GHC to produce an unfolding automatically. This ultimately means that GHC will be unable to specialise the eventual instantiation at Int in AUser.f. This will mean that the performance characteristics of ALibrary will be rather fragile.

One (admittedly rather heavy) approach to solving this fragility is to inform GHC that aLibraryFunction's polymorphic callsites should have unfoldings, ensuring that we are able to specialise the eventual monomorphic callsite,

aLibraryFunction :: AClass a => a -> a
aLibraryFunction = {- some large expression involving methods of AClass -}
{-# SPECIALISE_RECURSIVE(a) forall a. aLibraryFunction :: a -> a #-}

Which would ensure that polymorphic use-sites of aLibraryFunction would themselves be marked as SPECIALISE_RECURSIVE, shielding users from the need to know about aLibraryFunction's expectations of the simplifier.

A definition of SPECIALISE_RECURSIVE ==

The SPECIALISE_RECURSIVE pragma can be attached to top-level identifiers. Like INLINEABLE, SPECIALISE_RECURSIVE would force GHC to produce an unfolding for the identifier to which it is attached. Unlike INLINEABLE, it also forces GHC to produce an unfolding for all top-level identifiers which contain a polymorphic call-site of an identifier marked as SPECIALISE_RECURSIVE. This ensures that GHC is able to produce specialisations for all concrete instantiations of functions marked as SPECIALISE_RECURSIVE.

Change History (21)

comment:1 Changed 3 years ago by mpickering

Cc: mpickering added

comment:2 Changed 3 years ago by bgamari

Description: modified (diff)

comment:3 Changed 3 years ago by simonpj

I think you are describing what INLINEABLE already does, although I agree that SPECIALISABLE would be a less confusing name (or synoynym).

INLINEABLE:

  • Keeps a copy of the original (desugared) source code for the function.
  • If it has any type-class overloading in its type, it specialises the function in every module where it is called, at the types at which it is called. This works even if the function is recursive.

comment:4 Changed 3 years ago by bgamari

It is true that the SPECIALISABLE proposal is essentially INLINEABLE. However, I think the recursive variant is novel (although admittedly poorly specified).

comment:5 Changed 3 years ago by simonpj

However, I think the recursive variant is novel (although admittedly poorly specified).

What is "the recursive variant"?

The Description says:

If we include an INLINEABLE pragma (as most performance-aware authors would do) then we can convince GHC to produce an unfolding, but only at the expense of lowering its inlining cost as wel.

That is not true. INLINABLE does not reduce the inlining cost. It merely (and solely) arranges to capture the entire (Core of the) source-code defnition, including if the function is recursive. No more and no less.

We really just want GHC to behave like each use-site's module has a SPECIALISE pragma for each concrete type that the function is used at.

And that is exactly what INLINABLE does.

I feel I'm missing something.

comment:6 Changed 3 years ago by bgamari

Description: modified (diff)

What is "the recursive variant"?

The variant of the RECURSIVE_SPECIALISABLE pragma that I describe in the ticket summary. A better name might be *transitive specialisable*. See #8774 for another description of the motivation for this idea.

That is not true. INLINABLE does not reduce the inlining cost. It merely (and solely) arranges to capture the entire (Core of the) source-code defnition, including if the function is recursive. No more and no less.

Indeed, this was an inaccurate statement and I've removed it.

We really just want GHC to behave like each use-site's module has a SPECIALISE pragma for each concrete type that the function is used at.

And that is exactly what INLINABLE does.

Is that true? My understanding is that INLINEABLE will merely tell GHC to produce an unfolding; it won't ensure that GHC will use that unfolding to specialise use-sites. This is the goal here; we want to /ensure/ that GHC will specialise if at all possible.

In the case of the RECURSIVE_SPECIALISABLE pragma this even means ensuring that polymorphic use-sites are also marked as INLINEABLE, to ensure that GHC can specialise the final concrete instantiation.

Last edited 3 years ago by bgamari (previous) (diff)

comment:7 Changed 3 years ago by bgamari

Description: modified (diff)

comment:8 Changed 3 years ago by simonpj

My understanding is that INLINEABLE will merely tell GHC to produce an unfolding; it won't ensure that GHC will use that unfolding to specialise use-sites.

Actually it does tell GHC to do exactly that. In other words, it already seeks to meet the goal. It may or not be working, of course.

I did try #8774 with HEAD and it worked flawlessly.

comment:9 Changed 3 years ago by bgamari

Bah, yes, I suppose you are right. The name INLINEABLE is still rather unsatisfying but that is a very minor issue.

However, putting SPECIALIZABLE aside for a moment, I do wonder there might still be value in the RECURSIVE_SPECIALIZEABLE variant. Admittedly it is an extremely large hammer, but there are sometimes cases where you really want to avoid dictionary passing and dynamic dispatch if at all possible. This is especially true of CPS'd code (the binary library, for instance), where the entire point is that we want GHC to collapse code from various points in the program into a single straight run. Currently composing binary decoders from across modules requires quite some care as a single missing INLINEABLE can have significant performance implications which can currently only be spotted by looking carefully at the simplified Core.

comment:10 Changed 3 years ago by bgamari

Here is a only-slightly-silly example demonstrating a typical case where an unsuspecting application author is bitten by the poor performance due to missing unfoldings,

-- File: Lib.hs
module Lib where
import Data.Binary
import Data.Binary.Get
import Control.Applicative
import Control.Monad

-- | Here we have a combinator carefully crafted with an
-- INLINEABLE pragma by a library author to ensure that the
-- @Binary a@ dictionary is statically resolved.
aDecoder :: Binary a => Get ([Int], a)
aDecoder = (,) <$> replicateM 4 get <*> get
{-# INLINEABLE aDecoder #-}


-- File: User.hs
module User where
import Data.Binary
import Control.Monad
import Lib

-- | Here an unsuspecting application author tries to use aDecoder
decodeIt :: Binary a => Get [([Int], a)]
decodeIt = replicateM 5 aDecoder
--{-# INLINEABLE user1 #-}
-- If the user forgets this INLINEABLE pragma then the library
-- author's care is all for naught; the user's program will be
-- a lumbering, allocating beast for reasons he has no understanding of


-- File: Main.hs
{-# LANGUAGE TypeApplications #-}
import qualified Data.ByteString.Lazy as BS
import Data.Binary.Get
import User

-- Here is the final callsite where the user instantiates
-- @a@
main :: IO ()
main = do
    bs <- BS.getContents
    print $ runGetOrFail (decodeIt @Int) bs
Last edited 3 years ago by bgamari (previous) (diff)

comment:11 Changed 3 years ago by nomeata

Cc: kopernikus added

I’m sure that Andres has interest in a RECURSIVE_SPECIALIZEABLE pragma, or some variant thereof (e.g. attached to the type class itself).

comment:12 Changed 3 years ago by bgamari

Cc: kosmikus added; kopernikus removed

comment:13 in reply to:  11 Changed 3 years ago by simonpj

Replying to nomeata:

I’m sure that Andres has interest in a RECURSIVE_SPECIALIZEABLE pragma, or some variant thereof (e.g. attached to the type class itself).

But what is RECURSIVE_SPECIALIZEABLE? It's described only in rather elliptical fashion above, and I have no clear idea of what its specification is. Would someone care to write a spec, so we can all be sure we are discussing the same thing?

comment:14 Changed 3 years ago by mpickering

Here is my understanding of what Ben means.

---

We introduce a new top-level pragma which is introduced the the syntax {-# RECURSIVE_SPECIALISABLE varid #-}.

Only top-level functions can be marked with this pragma.

For a function f which is marked with RECURSIVE_SPECIALISABLE:

  1. When f is exported, f's unfolding is included in the interface file. (As if f was marked INLINABLE).
  2. When f is used in the definition of another function g, g's unfolding is included in the interface file when g is exported. (As if g was marked INLINABLE).

---

Is that what you mean Ben? My questions are

  1. Why would you mark your function as INLINABLE rather than RECURSIVE_SPECIALISABLE?
  2. What advantages does this pragma have over including the unfoldings of all polymorphic functions ?

comment:15 Changed 3 years ago by bgamari

Is that what you mean Ben? My questions are

  1. Why would you mark your function as INLINABLE rather than RECURSIVE_SPECIALISABLE?

RECURSIVE_SPECIALISABLE carries a potentially significant cost above INLINEABLE as it may produce many more inlinings which we have to later read and decide whether to use at every callsite.

  1. What advantages does this pragma have over including the unfoldings of all polymorphic functions?

This is a good question. It depends upon whether we feel that the costs above are large enough to warrant yet another pragma. Frankly, users complain a great deal about compiler performance and one of the reasons for this is that GHC applies all of its might to all of the code it compiles with -O. In light of this it seems like giving GHC more information about where it should be focusing its attention may be worthwhile.

Last edited 3 years ago by bgamari (previous) (diff)

comment:16 Changed 3 years ago by bgamari

Description: modified (diff)

comment:17 Changed 3 years ago by MikolajKonarski

I just wanted to confirm this feature is very much needed and point to a snippet of authoritative explanation of related things: https://ghc.haskell.org/trac/ghc/ticket/12963#comment:6

comment:18 Changed 3 years ago by carter

https://phabricator.haskell.org/D2929 resolves this or potentially does?

comment:19 Changed 3 years ago by MikolajKonarski

I'd like to point out that the combination of -fexpose-all-unfoldings plus -fspecialise-aggressively has the effect of INLINEABLE on all overloaded functions in the program. This is much less fine grained than the proposed SPECIALISE_RECURSIVE, but it works for me at the cost of much higher compilation times (e.g., travis now refuses to compile my program with -O2 due to memory constraints). I wish somebody told me about that trick before I added my 600 INLINABLE and started writing corresponding 600 SCC pragmas by hand, so I'm sharing here. :)

Last edited 3 years ago by MikolajKonarski (previous) (diff)

comment:20 Changed 3 years ago by mpickering

I recently documented -fspecialise-aggressively if people are wondering what it does.

https://phabricator.haskell.org/rGHCa8c81f3c102988e0f4216b7cb5fec7958e60b4e4

comment:21 Changed 19 months ago by andrewthad

I added a related ticket. Although it is under the guise of loosening restrictions around levity polymorphism, it's in part a request for the same thing: a guarantee that specialization will happen.

Note: See TracTickets for help on using tickets.