Opened 3 years ago

Closed 3 years ago

#13056 closed bug (fixed)

Deriving Foldable causes GHC to take a long time (GHC 8.0 ONLY)

Reported by: ezyang Owned by:
Priority: normal Milestone: 8.2.1
Component: Compiler Version: 8.0.2-rc2
Keywords: deriving-perf Cc: RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case: perf/compiler/T13056
Blocked By: Blocking:
Related Tickets: #12234 Differential Rev(s):
Wiki Page:

Description (last modified by ezyang)

This file never finishes compiling with optimization (-O) on GHC 8.0.1, and GHC 8.0.2 (dated 20161213):

{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}

module Bug where
import Data.Typeable
import GHC.Generics
import Data.Data

data Condition v = Condition
    deriving (Functor, Foldable)

data CondTree v c a = CondNode
    { condTreeData        :: a
    , condTreeConstraints :: c
    , condTreeComponents  :: [CondBranch v c a]
    }
    deriving (Functor, Foldable)

data CondBranch v c a = CondBranch
    { condBranchCondition :: Condition v
    , condBranchIfTrue    :: CondTree v c a
    , condBranchIfFalse   :: Maybe (CondTree v c a)
    }
    deriving (Functor, Foldable)

The problem seems to be fixed in HEAD but I haven't looked for the commit that fixed it.

Change History (13)

comment:1 Changed 3 years ago by ezyang

Description: modified (diff)

comment:2 Changed 3 years ago by RyanGlScott

Cc: RyanGlScott added

Thanks for the report, ezyang. I'll look at this tomorrow.

comment:3 Changed 3 years ago by RyanGlScott

Keywords: deriving-perf added

Here's a stripped-down version that doesn't use any GHC extensions:

module Bug where

newtype CondTree a = CondNode
    { condTreeComponents :: [CondBranch a]
    }

data CondBranch a = CondBranch
    { condBranchIfTrue  :: CondTree a
    , condBranchIfFalse :: CondTree a
    }

instance Foldable CondBranch where
  foldr f_a3sF z_a3sG (CondBranch a1_a3sH a2_a3sI)
    = (\ b1_a3sJ b2_a3sK -> foldr f_a3sF b2_a3sK b1_a3sJ)
        a1_a3sH
        ((\ b3_a3sL b4_a3sM -> foldr f_a3sF b4_a3sM b3_a3sL)
           a2_a3sI z_a3sG)
  foldMap f_a3sN (CondBranch a1_a3sO a2_a3sP)
    = mappend
        (foldMap f_a3sN a1_a3sO)
        (foldMap f_a3sN a2_a3sP)

instance Foldable CondTree where
  foldr f_a3sQ z_a3sR (CondNode a1_a3sS)
    = (\ b3_a3sT b4_a3sU
         -> foldr
              (\ b1_a3sV b2_a3sW -> foldr f_a3sQ b2_a3sW b1_a3sV)
              b4_a3sU
              b3_a3sT)
        a1_a3sS z_a3sR
  foldMap f_a3sX (CondNode a1_a3sY)
    = foldMap (foldMap f_a3sX) a1_a3sY

This shows that the program doesn't loop forever, but rather it just takes a long time to compile:

$ time /opt/ghc/8.0.1/bin/ghc -O1 -fforce-recomp Bug.hs
[1 of 1] Compiling Bug              ( Bug.hs, Bug.o )

real    0m3.331s
user    0m3.280s
sys     0m0.044s

Adding more polymorphic recursion increases compilation time exponentially. For example, this program (with a modified definition and Foldable instance for CondBranch):

module Bug where

newtype CondTree a = CondNode
    { condTreeComponents :: [CondBranch a]
    }

data CondBranch a = CondBranch
    { condBranchIfTrue  :: CondTree a
    , condBranchIfFalse :: Maybe (CondTree a)
    }

instance Foldable CondBranch where
  foldr f_a3sL z_a3sM (CondBranch a1_a3sN a2_a3sO)
    = (\ b1_a3sP b2_a3sQ -> foldr f_a3sL b2_a3sQ b1_a3sP)
        a1_a3sN
        ((\ b5_a3sR b6_a3sS
            -> foldr
                 (\ b3_a3sT b4_a3sU -> foldr f_a3sL b4_a3sU b3_a3sT)
                 b6_a3sS
                 b5_a3sR)
           a2_a3sO z_a3sM)
  foldMap f_a3sV (CondBranch a1_a3sW a2_a3sX)
    = mappend
        (foldMap f_a3sV a1_a3sW)
        (foldMap (foldMap f_a3sV) a2_a3sX)

instance Foldable CondTree where
  foldr f_a3sY z_a3sZ (CondNode a1_a3t0)
    = (\ b3_a3t1 b4_a3t2
         -> foldr
              (\ b1_a3t3 b2_a3t4 -> foldr f_a3sY b2_a3t4 b1_a3t3)
              b4_a3t2
              b3_a3t1)
        a1_a3t0 z_a3sZ
  foldMap f_a3t5 (CondNode a1_a3t6)
    = foldMap (foldMap f_a3t5) a1_a3t6

has twice the compilation time.

$ time /opt/ghc/8.0.1/bin/ghc -O1 -fforce-recomp Bug.hs
[1 of 1] Compiling Bug              ( Bug.hs, Bug.o )

real    0m6.489s
user    0m6.396s
sys     0m0.092s

Now to find the commit responsible for fixing this and backport it to GHC 8.0.3. I have a hunch that it's the same commit that fixed #12234, but it'll be nice to confirm it.

comment:4 Changed 3 years ago by RyanGlScott

Indeed, commit 517d03e41b4f5c144d1ad684539340421be2be2a (which fixed #12234) also fixed this issue. I was a bit skeptical that it would, since I thought #12234 only applies in cases of coercibility-solving for newtypes, and the original program doesn't appear to use any newtypes. But then it occurred to me - the original program actually does involve newtypes, but they're hidden in the default definitions of some Foldable class methods:

class Foldable t where
    -- | The largest element of a non-empty structure.
    maximum :: forall a . Ord a => t a -> a
    maximum = fromMaybe (errorWithoutStackTrace "maximum: empty structure") .
       getMax . foldMap (Max #. (Just :: a -> Maybe a))
    
    -- | The least element of a non-empty structure.
    minimum :: forall a . Ord a => t a -> a
    minimum = fromMaybe (errorWithoutStackTrace "minimum: empty structure") .
       getMin . foldMap (Min #. (Just :: a -> Maybe a))
    
    -- | The 'sum' function computes the sum of the numbers of a structure.
    sum :: Num a => t a -> a
    sum = getSum #. foldMap Sum
    
    -- | The 'product' function computes the product of the numbers of a
    -- structure.
    product :: Num a => t a -> a
    product = getProduct #. foldMap Product

And (#.) is defined to be:

(#.) :: Coercible b c => (b -> c) -> (a -> b) -> (a -> c)
(#.) _f = coerce

Quite sneaky.

Until we can get 517d03e41b4f5c144d1ad684539340421be2be2a backported to GHC 8.0.3, a workaround is to manually define these Foldable methods for polymorphically recursive datatypes such that they don't use coerce:

{-# LANGUAGE CPP #-}
module Bug where

import Data.Maybe (fromMaybe)

newtype CondTree a = CondNode
    { condTreeComponents :: [CondBranch a]
    }

data CondBranch a = CondBranch
    { condBranchIfTrue  :: CondTree a
    , condBranchIfFalse :: Maybe (CondTree a)
    }

instance Foldable CondBranch where
  foldr f z (CondBranch a1 a2) = foldr f (foldr (flip (foldr f)) z a2) a1
  foldMap f (CondBranch a1 a2) = mappend (foldMap f a1) (foldMap (foldMap f) a2)
#if MIN_VERSION_base(4,8,0)
  sum     = foldr (+) 0
  product = foldr (*) 1
  minimum = fromMaybe (error "minimum: empty") . foldr (min . Just) Nothing
  maximum = fromMaybe (error "maximum: empty") . foldr (max . Just) Nothing
#endif

instance Foldable CondTree where
  foldr f z (CondNode a) = foldr (flip (foldr f)) z a
  foldMap f (CondNode a) = foldMap (foldMap f) a
#if MIN_VERSION_base(4,8,0)
  sum     = foldr (+) 0
  product = foldr (*) 1
  minimum = fromMaybe (error "minimum: empty") . foldr (min . Just) Nothing
  maximum = fromMaybe (error "maximum: empty") . foldr (max . Just) Nothing
#endif

comment:5 Changed 3 years ago by ezyang

Thanks for the speedy triage!

comment:6 Changed 3 years ago by simonpj

Ryan you are so fast! Could you add your stripped-down tests as a performance regression test? Probably redundant, but always good to check that we don't accidentally break it again. Thanks!

Simon

comment:7 Changed 3 years ago by Ryan Scott <ryan.gl.scott@…>

In 5088110/ghc:

Add performance test for #13056

This performance regression was fixed by commit
517d03e41b4f5c144d1ad684539340421be2be2a (#12234). Let's add a performance test
to ensure that it doesn't break again.

comment:8 Changed 3 years ago by RyanGlScott

Milestone: 8.0.3
Status: newmerge

This can be merged alongside #12234.

comment:9 Changed 3 years ago by RyanGlScott

Test Case: perf/compiler/T13056

comment:10 Changed 3 years ago by EyalLotem

Much smaller example that reproduces the issue:

data A a = A (B a) (B a) deriving (Functor)
data B a = B (A a) deriving (Functor)

comment:11 Changed 3 years ago by simonpj

Would the much smaller example be a good perf test?

comment:12 Changed 3 years ago by bgamari

Milestone: 8.0.38.2.1

At this point it is rather unlikely that there will be an 8.0.3. Re-milestoning.

comment:13 Changed 3 years ago by bgamari

Resolution: fixed
Status: mergeclosed
Note: See TracTickets for help on using tickets.