Opened 3 years ago
Closed 3 years ago
#12425 closed bug (fixed)
With -O1 and above causes ghc to use all available memory before being killed by OOM killer
Reported by: | erikd | Owned by: | |
---|---|---|---|
Priority: | high | Milestone: | 8.2.1 |
Component: | Compiler | Version: | 8.0.1 |
Keywords: | Inlining | Cc: | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | Compile-time performance bug | Test Case: | perf/compiler/T12425 |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description
I've just taken over an existing haskell package and in the process of getting it to work with ghc 8.0 discovered that when compiling one file ghc tires to use all my ram. It basically ends up using 40% of 16 Gig before being killed by the OOM (out-of-memory) killer.
The project is at: https://github.com/erikd/conduit-find.git
The problem can be reproduced by:
git clone https://github.com/erikd/conduit-find.git cd conduit-find git checkout test/ghc-8.0 cabal sandbox init cabal install --dependencies-only cabal build
which on two of my machines fails building the Data.Cond
module.
This is regression, because this code compiles fine with ghc-7.10.3 and ghc-7.8.4.
Change History (21)
comment:1 Changed 3 years ago by
comment:2 Changed 3 years ago by
Maybe not too surprisingly, removing all the INLINE
pragmas (51 in 500 lines of code) allows the code to compile even with -O2
. (suggested by @mpickering).
comment:3 Changed 3 years ago by
Priority: | normal → high |
---|
Reproducible with HEAD. Here is a testcase that doesn't depend on any packages that aren't in the GHC tree.
module T12425 where import Control.Applicative import Control.Monad import Control.Monad.Trans.State.Lazy (StateT(..)) data Result a m b = RecurseOnly (Maybe (CondT a m b)) | KeepAndRecurse b (Maybe (CondT a m b)) instance Monad m => Functor (Result a m) where fmap f (RecurseOnly l) = RecurseOnly (liftM (fmap f) l) fmap f (KeepAndRecurse a l) = KeepAndRecurse (f a) (liftM (fmap f) l) {-# INLINE fmap #-} newtype CondT a m b = CondT { getCondT :: StateT a m (Result a m b) } instance Monad m => Functor (CondT a m) where fmap f (CondT g) = CondT (liftM (fmap f) g) {-# INLINE fmap #-} instance Monad m => Applicative (CondT a m) where pure = undefined (<*>) = undefined instance Monad m => Monad (CondT a m) where return = undefined (>>=) = undefined
@erikd: the following change fixes the problem.
instance Monad m => Functor (CondT a m) where - fmap f (CondT g) = CondT (liftM (fmap f) g) + fmap f (CondT g) = CondT (liftA (fmap f) g) {-# INLINE fmap #-}
Tested with GHC 8 and HEAD. To compile conduit-find
with HEAD, you need to make the following other changes:
- add
Cabal < 1.25
to the .cabal file, to workaround https://github.com/ekmett/distributive/issues/17 - use
conduit
with this patch: https://github.com/snoyberg/conduit/pull/274 - use
tagged
>= 0.8.5, which fixes https://github.com/ekmett/semigroupoids/issues/48
comment:4 Changed 3 years ago by
@thomie Oh wow, that change you suggested (switching from liftM
to liftA
) prevents this problem in conduit-find
even with stock 8.0.1.
Edit: And a bit of CPP lets me use liftA
for ghc >= 7.10 and liftM
for earlier.
comment:5 Changed 3 years ago by
A guess would be that in 8.0’s library version, liftM
is implemented in a way that mentions fmap
, and liftM
is inlined to expose that. Then all your unconditinally marked inline functions become effectively recursive, but still keep inlining. But didn’t investigate deeper.
comment:6 Changed 3 years ago by
Thats seems to be a possible explanation. Is there no loop checker for the inliner?
comment:7 Changed 3 years ago by
Keywords: | Inlining added |
---|
comment:8 Changed 3 years ago by
Type of failure: | Other → Compile-time performance bug |
---|
This looks bad. Inlining should not go on for ever ... there's a tick counter as a last-ditch hard stop. It could just possibly be too many INLINE pragmas, but it's quite difficult to eat 16G.
Does anyone feel able to investigate further?
Simon
comment:9 Changed 3 years ago by
Indeed there does appear to be some sort of loop here. -ddump-inlinings
shows long repeated runs of,
... Inlining done: Control.Monad.Trans.State.Lazy.$fMonadStateT1 Inlining done: Control.Monad.Trans.State.Lazy.$fMonadStateT_$c>>= Inlining done: GHC.Base.$ Inlining done: Control.Monad.Trans.State.Lazy.runStateT Inlining done: Control.Monad.Trans.State.Lazy.runStateT1 Inlining done: Control.Monad.Trans.State.Lazy.runStateT Inlining done: Control.Monad.Trans.State.Lazy.runStateT1 Inlining done: GHC.Base.liftM Inlining done: Control.Monad.Trans.State.Lazy.$fMonadStateT_$creturn ...
These runs are punctuated by runs of,
... Inlining done: GHC.Base.$fApplicativeMaybe_$sliftM ...
The runs appear to get longer as the simplifier progresses. It indeed seems plausible that all of these bindings have INLINE pragmas attached.
comment:10 Changed 3 years ago by
Milestone: | 8.0.2 → 8.0.3 |
---|
It doesn't look like this will get fixed for 8.0.2.
comment:12 Changed 3 years ago by
Thomie, THANK YOU for the small repro case in comment:3. With its help I rapidly nailed the bug. Here's what is happening.
Suppose we have
let rec { f = ...g...g... ; g = ...f...f... } in case x of True -> ...f.. False -> ..f...
In each iteration of the simplifier the occurrence analyser OccAnal
chooses a loop breaker. Suppose in iteration 1 it choose g
as the loop breaker. That means it is free to inline f
.
Suppose that GHC decides to inline f
in the branches of the case
, but (for some reason; eg it is not satureated) in the rhs of g
. So we get
let rec { f = ...g...g... ; g = ...f...f... } in case x of True -> ...g...g..... False -> ..g..g....
Now suppose that, for some reason, in the next iteraion the occurrence analyser chooses f
as the loop breaker, so it can freely inling g
. And again for some reason the simplifer inlines g
at its calls in the case
branches, but not in the RHS of f
. Then we get
let rec { f = ...g...g... ; g = ...f...f... } in case x of True -> ...(...f...f...)...(...f..f..)..... False -> ..(...f...f...)...(..f..f...)....
You can see where this is going! Each iteration of the simplifier doubles the number of calls to f
or g
. No wonder GHC is slow!
(In the particular example in comment:3, f
and g
are the two mutually recursive fmap
instances for CondT
and Result
. They are both marked INLINE which, oddly, is why they don't inline in each other's RHS, because the call there is not saturated.)
The root cause is that we flip-flop on our choice of loop breaker. I always thought it didn't matter, and indeed for any single iteration to terminate, it doesn't matter. But when we iterate, it matters a lot!!
I think this was not happening before because, by a fluke, we tended to always choose the same one. But, perhaps as a resul of Bartosz's work on making GHC deterministic, the occurrence analyser now reliably flip-flops in each iteration, as above.
Trac #12234 turns out to be exactly the same issue
What to do? We want the occurrence analyser to choose a loop breaker based on stable criteria, not arbitrary things. Simple idea: if two or more potential loop breakers look equally plausible, choose the one that was a loop breaker in the previous iteration. That should make it stable.
Stay tuned. But I thought I'd explain what is happening in case I get distracted.
comment:14 Changed 3 years ago by
Status: | new → merge |
---|---|
Test Case: | → perf/compiler/T12425 |
I think I nailed this one, in HEAD at least. Can you see if it fixes your original library?
Probably worth merging this to 8.0.3 if we produce such a thing.
comment:15 Changed 3 years ago by
I built a compiler from git HEAD, but that wouldn't build the dependencies for my library (numerous problems with the primitve and vector libraries).
I tried checking out the ghc-8.0
branch and then cherry-picking commit d03e41b4f5 but that failed due to merge conflicts in compiler/simplCore/OccurAnal.hs
that seem rather tricky to resolve.
comment:16 Changed 3 years ago by
Thanks for trying.
I think this would be retro-fittable to 8.0.3 with a bit of work -- if it really matters.
comment:17 Changed 3 years ago by
Sorry erikd, several of those dependencies not building on GHC HEAD were probably my fault. After adjusting some template-haskell
upper version bounds on dependent packages on Hackage, I was able to successfully install conduit-find
with GHC HEAD, and I did not observe any noticeable slowness.
comment:18 Changed 3 years ago by
Thanks @RyanGlScott . As long it compiled with -O1
or above, then its fixed. The original problem was GHC chewing up all memory when *compiling* conduit-find
.
comment:19 Changed 3 years ago by
Just for the record, this bug can be reproduced by compiling Hackage's find-conduit 0.4.4 with GHC 8.0.1. find-conduit is the old version of conduit-find prior to a maintainer change, and prior to the workarounds to avoid this bug that have been incorporated into conduit-find.
comment:20 Changed 3 years ago by
Milestone: | 8.0.3 → 8.2.1 |
---|
At this point it is rather unlikely that there will be an 8.0.3. Re-milestoning.
comment:21 Changed 3 years ago by
Resolution: | → fixed |
---|---|
Status: | merge → closed |
@carter suggested that adding the
Safe
language pragma might prevent this, but it didn't help.