Opened 4 years ago

Closed 3 years ago

#11443 closed bug (wontfix)

SPECIALIZE pragma does not work + compilation times regression in GHC 8.0-rc1

Reported by: danilo2 Owned by: bgamari
Priority: highest Milestone: 8.0.2
Component: Compiler Version: 8.0.1-rc1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by bgamari)

Hello! I've just hit a strange issue. I might missinterpret how the SPECIALIZE pragma works, but if I understand correctly, then there is a bug in GHC. Lets consider this simple code:

module A:

{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE TypeOperators        #-}
{-# LANGUAGE KindSignatures       #-}
{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE FlexibleInstances    #-}

module A where

import Prelude
import GHC.TypeLits

-- TF utils

type family (a :: Nat) :== (b :: Nat) where
    a :== a = 'True
    a :== b = 'False

type family If cond (a :: Nat) (b :: Nat) where
    If 'True  a b = a
    If 'False a b = b

-- Heavy TF computations

type family HeavyTF (n :: Nat) (i :: Nat) :: Nat where
    HeavyTF n 0 = 0
    HeavyTF n i = If (HeavyTF' n :== 0) (HeavyTF n (i - 1)) 1

type family HeavyTF' (n :: Nat) :: Nat where
    HeavyTF' 0 = 0
    HeavyTF' n = HeavyTF' (n - 1)

-- Params for tests (bigger numbers = longer compile times)

type family NatOf a :: Nat
type instance NatOf Int    = 120
type instance NatOf String = 120

-- Type class to check GHC behavior
class PerfC1 a where
    perfc1 :: a -> String
instance CheckOk (HeavyTF 10 (NatOf a)) => PerfC1 a where
    perfc1 _ = "oh"
    {-# INLINABLE perfc1 #-}

class CheckOk (n :: Nat)
instance CheckOk 0 -- where

main_cache :: IO ()
main_cache = do
    print $ perfc1 (1 :: Int)
    print $ perfc1 ("a" :: String)

perfc1_Int :: Int -> String
perfc1_Int = perfc1

perfc1_String :: String -> String
perfc1_String = perfc1

{-# SPECIALIZE perfc1 :: Int -> String #-}
{-# SPECIALIZE perfc1 :: String -> String #-}

-----

perfc1' :: PerfC1 a => a -> String
perfc1' = perfc1
-- {-# INLINABLE perfc1' #-}
-- {-# NOINLINE perfc1' #-}

{-# SPECIALIZE perfc1' :: Int -> String #-}
{-# SPECIALIZE perfc1' :: String -> String #-}

module Test1:

import A

main = do
    print $ perfc1 (1 :: Int)
    print $ perfc1 ("a" :: String)

module Test2:

import A

main = do
    print $ perfc1' (1 :: Int)
    print $ perfc1' ("a" :: String)

module Test3:

import A

main = do
    print $ perfc1_Int (1 :: Int)
    print $ perfc1_String ("a" :: String)

Compile with: ghc 7.10.3 : ghc -O2 -fenable-rewrite-rules Test<n>.hs ghc 8.0-rc1 : ghc -O2 -fenable-rewrite-rules -freduction-depth=0 Test<n>.hs

(I've used -fenable-rewrite-rules explicitly just to be sure it is enabled. We can omit it because -O2 enables it)

If module A was already compiled the compilation times for ghc 7.10.3 were as follow:

  • Test1: ~ 16s
  • Test2: ~ 16s
  • Test3: almost instant

And for ghc 8.0-rc1 were as follow:

  • Test1: ~ 28s
  • Test2: ~ 28s
  • Test3: almost instant

Here are 2 bugs to note:

1) the compilation times are much longer with new GHC

2) the specialize pragmas do not work

EDIT

There is yet another funny issue here. If I try to compile the modules like so: time ghc -O2 -fenable-rewrite-rules -ddump-spec B.hs GHC prints the following lines and hangs forever eating GBs of RAM:

[1 of 2] Compiling A                ( A.hs, A.o )

==================== Specialise ====================
Result size of Specialise
  = {terms: 60, types: 80, coercions: 3,048,032}

Rec {
$dShow_a20B :: Show String
[LclId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)}]
$dShow_a20B = GHC.Show.$fShow[]_$s$fShow[]1

$dPerfC1_a1Rk :: PerfC1 Int
[LclId,
 Arity=1,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 20 60}]

EDIT 2

I would like to take the opportunity here to ask a related question – I was trying to specify the rewrite rules manually, but GHC rejects the following ones (It accepts one of them, but not both, somehow thinking that perfcx is monomorphic). I know that the rules are fired when GHC uses CORE, so typeclasses are "just normal polymorphic objects" and "hidden inputs", but are we able to specify them somehow?

{-# RULES
"perfcx/Int"    forall (a :: Int).    perfcx (a :: Int)    = perfc1_Int    a
"perfcx/String" forall (b :: String). perfcx (b :: String) = perfc1_String b
    #-}

perfcx = perfc1
{-# NOINLINE perfcx #-}

[...]

But If I'm dumping the rules generated by GHC (using -ddump-rules) I can see both of the rules generated, so there probably is a way to define them:

"SPEC perfc1'" [ALWAYS]
    forall ($dPerfC1 :: PerfC1 Int). perfc1' @ Int $dPerfC1 = $sperfc3
"SPEC perfc1'" [ALWAYS]
    forall ($dPerfC1 :: PerfC1 String).
      perfc1' @ String $dPerfC1
      = $sperfc1
"SPEC/A perfc1 @ Int" [ALWAYS]
    forall (tpl :: PerfC1 Int). perfc1 @ Int tpl = $sperfc3
"SPEC/A perfc1 @ String" [ALWAYS]
    forall (tpl :: PerfC1 String). perfc1 @ String tpl = $sperfc1

Change History (17)

comment:1 Changed 4 years ago by danilo2

Operating System: Unknown/MultipleMacOS X

comment:2 Changed 4 years ago by danilo2

Description: modified (diff)

comment:3 Changed 4 years ago by danilo2

Operating System: MacOS XUnknown/Multiple

comment:4 Changed 4 years ago by danilo2

Priority: highhighest
Summary: SPECIALIZE pragma does not work + performance drop in GHC 8.0-rc1SPECIALIZE pragma does not work + compilation times regression in GHC 8.0-rc1

comment:5 Changed 4 years ago by danilo2

Description: modified (diff)

comment:6 Changed 4 years ago by danilo2

Description: modified (diff)

comment:7 Changed 4 years ago by danilo2

Description: modified (diff)

comment:8 Changed 4 years ago by danilo2

Description: modified (diff)

comment:9 Changed 4 years ago by danilo2

Description: modified (diff)

comment:10 Changed 3 years ago by bgamari

Milestone: 8.0.2

comment:11 Changed 3 years ago by bgamari

Status: newinfoneeded
Type of failure: None/UnknownCompile-time performance bug

comment:12 Changed 3 years ago by bgamari

Status: infoneedednew

Oh dear, this one somehow slipped through the cracks during the release cycle; I'll try to have a look this week.

comment:13 Changed 3 years ago by bgamari

Owner: set to bgamari

comment:14 Changed 3 years ago by bgamari

Description: modified (diff)

    

comment:15 Changed 3 years ago by bgamari

Looking at the verbose-core2core output from Test2 it seems quite clear that the specialisation rules are firing as expected: While the dump-ds output contains an application of perfc1' @Int ... as expected, this is rewritten to A.$sperfc3 ... in the first simplifier phase (where $sperfc3 is indeed the expected Int-specialised binding).

So, the question is: why are things slowing down despite this? I know that Richard did make some changes in how type families are reduced (see 3f5d1a13f112f34d992f6b74656d64d95a3f506d and 3e1b8824c849d063c7354dbdf63ae2910cf0fdfc). Perhaps the next place to look is the tc-trace output.

comment:16 Changed 3 years ago by bgamari

Status: newinfoneeded

danilo2, could you please try this with 8.0.1 and if necessary update the testcase to reproduce the issue. I'm having some trouble reproducing your findings.

comment:17 Changed 3 years ago by bgamari

Resolution: wontfix
Status: infoneededclosed

I'm going to close this for now. Danilo2, if you can test with 8.0.1 then feel free to reopen.

Note: See TracTickets for help on using tickets.