Opened 2 years ago

Closed 21 months ago

Last modified 20 months ago

#14338 closed bug (fixed)

Simplifier fails with "Simplifier ticks exhausted"

Reported by: dredozubov Owned by: bgamari
Priority: high Milestone: 8.4.1
Component: Compiler Version: 8.2.1
Keywords: Cc: bgamari
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Compile-time crash or panic Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

While upgrading a codebase to GHC 8.2.1(it compiles reliably with 7.10.3) we encountered an issue with core simplifier(and rewrite rules if I correctly understood the meaning of RuleFired):

ghc: panic! (the 'impossible' happened)
  (GHC version 8.2.1 for x86_64-apple-darwin):
        Simplifier ticks exhausted
  When trying RuleFired Class op $p2HModify
  To increase the limit, use -fsimpl-tick-factor=N (default 100)
  If you need to do this, let GHC HQ know, and what factor you needed
  Total ticks:     10450410

  1494659 PreInlineUnconditionally
    149100 w_ipsJ
    149095 w_ipps
    149084 w_iplQ
    149084 w1_iplR
    149084 w2_iplS
    149084 w3_iplT
    149078 w_ipm7
    149078 w1_ipm8
    149078 w2_ipm9
    149078 w3_ipma
    120 $d~_iplr
    120 $d~1_ipls
    120 irred_iplu
    120 eta_iplv
    39 v_spsR
    38 v_sp9c
    38 v1_sp9d
    38 v_spii
    38 v1_spij
    38 v2_spik
    38 v3_spil
    37 v_sp99
    37 v1_sp9a
    37 v_spin
    37 v1_spio
    37 v2_spip
    37 v3_spiq
    36 v_sp9f
    36 v1_sp9g
    36 v_spis
...skipping...
    1 cobox1_apuk
    1 cobox_apuq
    1 cobox1_apur
    1 cobox_apux
    1 cobox1_apuy
  2 CaseIdentity 2 ds1_iqAB
  4 FillInCaseDefault
    1 nt_sqyY
    1 nt_sqz9
    1 nt_sqze
    1 nt_sqzj
  Call stack:
      CallStack (from HasCallStack):
        prettyCurrentCallStack, called at compiler/utils/Outputable.hs:1133:58 in ghc:Outputable
        callStackDoc, called at compiler/utils/Outputable.hs:1137:37 in ghc:Outputable
        pprPanic, called at compiler/simplCore/SimplMonad.hs:199:31 in ghc:SimplMonad

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

We tried to shrink reproducible example to something reasonable and this is what we got: https://github.com/4e6/webapp-template-hs/tree/simpl-tick-factor

Compiling it with stack build --ghc-options='-fsimpl-tick-factor=1000'(ten times the default) will demonstrate the issue.

It fails reliably with a combination of servant and hset libraries. Removing one route from a servant API or moving the PayloadX to the head of type-level list makes it compilable again.

Attachments (2)

Runner.dump-simpl-stats (170 bytes) - added by dredozubov 2 years ago.
simpl-stats
Runner.2.dump-simpl-stats (170 bytes) - added by dredozubov 2 years ago.

Download all attachments as: .zip

Change History (40)

comment:1 Changed 2 years ago by mpickering

Can you paste the full -ddump-simpl-stats output?

Changed 2 years ago by dredozubov

Attachment: Runner.dump-simpl-stats added

simpl-stats

comment:2 Changed 2 years ago by mpickering

That file seems to be truncated to 6 lines?

Changed 2 years ago by dredozubov

Attachment: Runner.2.dump-simpl-stats added

comment:3 Changed 2 years ago by mpickering

Still truncated?, perhaps it is too big for trac to deal with? Can you upload it elsewhere?

comment:5 Changed 2 years ago by mpickering

It seems that the problems are because we are doing lots (and lots (and lots)) of inlining of functions which produce and handle coercions.

Here is the iface file which contains the functions which appear many times in the ddump-simpl-stats log. https://gist.github.com/mpickering/30adf1e6c61bcfa1462031f947079c10

comment:6 Changed 2 years ago by simonpj

I can't even get to base-camp (with HEAD). cabal install (with --allow-newer else nothing at all works) says

....lots of stuff...
Preprocessing library unordered-containers-0.2.8.0...
[1 of 8] Compiling Data.HashMap.PopCount ( Data/HashMap/PopCount.hs, dist/build/Data/HashMap/PopCount.o )
[2 of 8] Compiling Data.HashMap.Unsafe ( Data/HashMap/Unsafe.hs, dist/build/Data/HashMap/Unsafe.o )
[3 of 8] Compiling Data.HashMap.Array ( Data/HashMap/Array.hs, dist/build/Data/HashMap/Array.o )
[4 of 8] Compiling Data.HashMap.UnsafeShift ( Data/HashMap/UnsafeShift.hs, dist/build/Data/HashMap/UnsafeShift.o )
[5 of 8] Compiling Data.HashMap.Base ( Data/HashMap/Base.hs, dist/build/Data/HashMap/Base.o )
[6 of 8] Compiling Data.HashMap.Strict ( Data/HashMap/Strict.hs, dist/build/Data/HashMap/Strict.o )
[7 of 8] Compiling Data.HashMap.Lazy ( Data/HashMap/Lazy.hs, dist/build/Data/HashMap/Lazy.o )
[8 of 8] Compiling Data.HashSet     ( Data/HashSet.hs, dist/build/Data/HashSet.o )

Data/HashSet.hs:80:39: error:
    Module ‘Data.Semigroup’ does not export ‘Monoid(..)’
   |
80 | import Data.Semigroup (Semigroup(..), Monoid(..))
   |                                       ^^^^^^^^^^
cabal: Leaving directory '/tmp/cabal-tmp-22962/unordered-containers-0.2.8.0'
Installed iproute-1.7.1

comment:7 Changed 2 years ago by bgamari

simonpj, perhaps try running

git clone git://github.com/RyanGlScott/unordered-containers
cabal install --allow-newer unordered-containers/

and then try again.

comment:8 Changed 2 years ago by 4e6

I've updated webapp-template-hs example (branch simpl-tick-factor) to be compatible with GHC 7 and 8.

With ghc-7.10.3 it compiles with 2534 simplifier ticks, and with ghc-8.2.1 it panics with 'Simplifier ticks exhausted' on 256649 ticks. See readme for details.

Here is the full -ddump-simpl-stats output for both cases: https://gist.github.com/4e6/5ef65efdb309daa373a928ec36404fd7

comment:9 Changed 2 years ago by simonpj

Thanks. I now get this

simonpj@cam-05-unx:~/tmp/webapp-template-hs$ cabal install --allow-newer --with-ghc=/home/simonpj/5builds/HEAD/inplace/bin/ghc-stage2
Resolving dependencies...
Configuring basement-0.0.2...
Configuring colour-2.3.3...
Configuring blaze-builder-0.4.0.2...
Configuring primitive-0.6.2.0...
Configuring psqueues-0.2.4.0...
Configuring system-filepath-0.4.13.4...
Configuring unordered-containers-0.2.8.0...
Building basement-0.0.2...
Building colour-2.3.3...
Building blaze-builder-0.4.0.2...
Building psqueues-0.2.4.0...
Building primitive-0.6.2.0...
Building unordered-containers-0.2.8.0...
Failed to install blaze-builder-0.4.0.2
Build log ( /home/simonpj/.cabal/logs/blaze-builder-0.4.0.2.log ):
cabal: Entering directory '/tmp/cabal-tmp-35090/blaze-builder-0.4.0.2'
Configuring blaze-builder-0.4.0.2...
Building blaze-builder-0.4.0.2...
Preprocessing library blaze-builder-0.4.0.2...
[ 1 of 10] Compiling Blaze.ByteString.Builder.Internal.Write ( Blaze/ByteString/Builder/Internal/Write.hs, dist/build/Blaze/ByteString/Builder/Internal/Write.o )

Blaze/ByteString/Builder/Internal/Write.hs:122:10: error:
    • No instance for (Semigroup Poke)
        arising from the superclasses of an instance declaration
    • In the instance declaration for ‘Monoid Poke’
    |
122 | instance Monoid Poke where
    |          ^^^^^^^^^^^

Blaze/ByteString/Builder/Internal/Write.hs:132:10: error:
    • No instance for (Semigroup Write)
        arising from the superclasses of an instance declaration
    • In the instance declaration for ‘Monoid Write’
    |
132 | instance Monoid Write where
    |          ^^^^^^^^^^^^

... and lots more like that...

comment:10 Changed 2 years ago by 4e6

Ok, I pushed the update that removes almost all external dependencies. With this update, I was able to build the project with latest ghc-8.3-start-1211-g82b77ec375 (version by git-describe)

Interestingly enough, ghc-8.3 doesn't show the presence of the bug. It means that the regression was fixed after the ghc-8.2.1 release.

I also updated the gist with -ddump-simpl-stats output for all three ghc versions. Total ticks number for ghc-8.3 is similar to ghc-7.10.3 value.

comment:11 Changed 2 years ago by 4e6

I ran git bisect between ghc-8.2.1-release and master, and the first commit that doesn't have the issue is 33452dfc6c Refactor the Mighty Simplifier. Is there a chance to include these changes into the upcoming ghc-8.2.2 release?

comment:12 Changed 2 years ago by 4e6

Ugh, that's not all. I tried to build our (big) project with 33452dfc6c Refactor the Mighty Simplifier GHC revision and it was still failing. So, I returned to the test example and was able to trigger the issue on GHC master branch by implementing some methods and slightly increasing reader size.

Summary

webapp-template-hs branch simpl-tick-factor
Builds with ghc-7.10.3
Fails with ghc-8.2.1 and master branch
gist with updated -ddump-simpl-stats output

comment:13 Changed 2 years ago by bgamari

Priority: normalhigh

Hmm, bad news bears. We'll need to have a look at this.

comment:14 Changed 2 years ago by bgamari

Milestone: 8.4.1
Owner: set to bgamari

comment:15 Changed 2 years ago by taylorfausak

I also ran into this problem trying to build hasql-0.20.0.1 with GHC 8.0.2: https://github.com/nikita-volkov/hasql/issues/79#issuecomment-339441282

comment:16 Changed 2 years ago by bgamari

Note that comment:15 is likely a different issue; there are many reasons why the the simplifier may exhaust its ticks limit. I believe this particular issue wasn't observed in 8.0.2 (although it would be nice if 4e6 could confirm this).

taylorfausak, could you open a new ticket for that, please?

Last edited 2 years ago by bgamari (previous) (diff)

comment:17 Changed 23 months ago by bgamari

So this is an interesting case. The module that blows up is quite small; it seems that the blowing up code all comes from unfoldings introduced via,

mainH :: HReader AppPayload Int :<|> HReader AppPayload Int
mainH = handleReaderRequest' :<|> handleReaderRequest

where AppPayload is a type-level list. The number of coercions in the desugared core seems to scale quadratically with the length of this list,

Phase N=1 N=2 N=3 N=4 N=5
Desugar (terms) 74 87 100 113 126
Desugar (types) 350 502 694 926 1198
Desugar (coerc) 395 746 1238 1895 2741

Yet in all of these cases somehow Core Tidy brings all of these programs down to precisely 67 terms and a couple hundred types/coercions.

comment:18 Changed 23 months ago by bgamari

Indeed essentially all of this bloat comes from specialisations of the HModify and HGet dictionaries (see Data.HSet included in the testcase). For instance, we have,

-- RHS size: {terms: 5, types: 52, coercions: 426, joins: 0/0}
$dHModify
  :: Data.HSet.HModify
       '[Payload 3, Payload 4, Payload 5, Payload 6, PayloadX]
       '[Payload 3, Payload 4, Payload 5, Payload 6, PayloadX]
       PayloadX
       PayloadX
       ('TypeFun.Data.Peano.S
          ('TypeFun.Data.Peano.S
             ('TypeFun.Data.Peano.S
                ('TypeFun.Data.Peano.S 'TypeFun.Data.Peano.Z))))

It looks like we end up producing O(N) of these things, with each having O(N) (perhaps more?) coercions.

I have added this ticket to the list of "coercion pile-up" issues on Performance/Compiler, since it seems pretty clear that this ticket falls in this bucket. Consequently, this program will almost certainly benefit from the coercion dropping proposed in #8095.

Last edited 23 months ago by bgamari (previous) (diff)

comment:19 Changed 23 months ago by dredozubov

I've seen quite a lot HModify and HGet popping up in a core dump, but haven't realized it's linked to coercions. The comments above supports my experience with this piece of code and the issue. The issue definitely arises when increasing the length of a type-level list AppPayload in this case. In the context of a larger project, it was pretty difficult to pinpoint, an additional servant endpoint can trigger it for example.

Do I understand correctly that you're referring to the idea of dropping explicit coercions for compilation without -dcore-lint?

comment:20 Changed 23 months ago by simonpj

Dropping coercions won't change the number of simplifier ticks though, will it? So even if we do that, we may still get this "ticks-exhausted" issue, no? Do you have any insight about that?

comment:21 Changed 23 months ago by bgamari

Indeed dropping coercions won't fix the simplifier ticks issue. Nevertheless, I thought it would be important to characterise the coercion issue since this is one of the simplest reproducers of non-linear coercion growth to-date.

The major contributors to the ticks summary look like this,

N PreInlineUncond RuleFired BetaReduction Total
1 244 247 724 1772
2 262 284 811 1940
3 309 354 992 2295
4 396 481 1343 2963
5 569 713 2016 4227
6 907 1146 3317 6653
7 1561 1972 5848 11343
8 2860 3575 10831 20555
9 5444 6723 20696 38765

In the case of N=9 the most frequently-firing rules are,

6723 RuleFired
  4348 Class op HEq_sc
  1029 Class op $p1HModify
  1029 Class op $p2HModify
  114 Class op $p1HGet
  ...

The top PreInlineUnconditionally targets are,

5444 PreInlineUnconditionally
  505 w
  505 w
  503 w
  503 w1
  503 w2
  503 w3
  503 w
  503 w1
  503 w2
  503 w3
  86 $dHGet 
  ...

And the top beta reduction targets are,

20696 BetaReduction
  505 i
  505 e1
  505 ex
  505 els1
  505 e2
  505 els2
  505 w
  505 w1
  505 w2
  505 i
  505 e1
  505 ex
  505 els1
  505 e2
  505 els2
  505 w
  505 w1
  505 w2
  503 i
  503 e1
  503 ex
  503 els1
  503 e2
  503 els2
  503 w
  503 w1
  503 w2
  503 w3
  503 i
  503 e1
  503 ex
  503 els1
  503 e2
  503 els2
  503 w
  503 w1
  503 w2
  503 w3
  86 i
  ...

Unfortunately I'm having quite some trouble finding any of these bindings for further investigation.

Last edited 23 months ago by bgamari (previous) (diff)

comment:22 Changed 23 months ago by simonpj

Vaguely reminds me of #13253.

comment:23 Changed 23 months ago by bgamari

So I augmented the -ddump-simpl-stats output to show the types of the named binders. Here I'll be looking at the N=9 case, since this demonstrates the problem clearly.

In the case of the the PreInlineUnconditionally ticks, the high-count binders are all of one of the following forms

  505 w_i4Ig :: ('TypeFun.Data.Peano.S
                   i_i4Ia :: TypeFun.Data.Peano.N)
                Data.Type.Equality.~ (TypeFun.Data.List.IndexOf
                                        e1_i4Ib (ex_i4Ic : els1_i4Id) :: TypeFun.Data.Peano.N)
  503 w2_i4HF :: Data.HSet.HModify
                   els1_i4HA els2_i4HC e1_i4Hy e2_i4HB i_i4Hx
  503 w3_i4HG :: TypeFun.Data.List.NotElem ex_i4Hz els2_i4HC

There are ten binders in this list, each with either 505 or 503 ticks credited.

The beta-reduced binders are all type variables of one of the following forms

  505 i_i4Ia :: TypeFun.Data.Peano.N
  505 e1_i4Ib :: *
  505 els1_i4Id :: [*]
  505 w_i4Ig :: ('TypeFun.Data.Peano.S
                   i_i4Ia :: TypeFun.Data.Peano.N)
                Data.Type.Equality.~ (TypeFun.Data.List.IndexOf
                                        e1_i4Ib (ex_i4Ic : els1_i4Id) :: TypeFun.Data.Peano.N)}}}
  505 w2_i4Ii :: Data.HSet.HModify
                   els1_i4Id els2_i4If e1_i4Ib e2_i4Ie i_i4Ia

there are a few dozen of these, each with either 505 or 503 credited ticks.

comment:24 Changed 23 months ago by bgamari

To reproduce this with HEAD you will first need to install cabal-install from HEAD (this will only be a few minutes):

$ git clone git://github.com/haskell/Cabal
$ cd Cabal
$ cabal install Cabal/ cabal-install/

Now you can reproduce with,

$ git clone git@github.com:bgamari/webapp-template-hs.git
$ cd webapp-template-hs
$ export ghc=$PATH_TO_GHC
$ cabal new-build -w $ghc --only-dependencies
$ n=9 ./build.sh  -O1 -fsimpl-tick-factor=1000 -v -dverbose-core2core \
    -ddump-to-file -dsuppress-ticks -ddump-simpl-stats \
    -dsuppress-idinfo -dsuppress-coercions

There is also a ./build-all.sh script which builds all ns from 1 to 9, dumping the output of each compiler run to a directory of the form ./dump-$n/.

comment:25 Changed 23 months ago by dredozubov

It seems like you forgot to commit src/Runner.hs. There's a src/Runner.hs.in, but it fails to build dependencies with cabal new-build -w $ghc --only-dependencies without src/Runner.hs.

Last edited 23 months ago by dredozubov (previous) (diff)

comment:26 Changed 23 months ago by bgamari

Ahh, fair point: Runner.hs is generated by build.sh. It is necessary to run build.sh once before running cabal new-build.

comment:27 Changed 23 months ago by dredozubov

I ran into a few issues running this on my machine, so I made a few changes, maybe it'll be helpful to someone else:

$ git clone git@github.com:dredozubov/webapp-template-hs.git
$ cd webapp-template-hs
$ git checkout simpl-tick-factor
$ export ghc=$PATH_TO_GHC
$ cabal new-build -w $ghc --only-dependencies
$ n=9 ./build_all.sh -package-db PATH_TO_PACKAGE_DB

Path to package db was ~/.cabal/store/ghc-8.3.20171108/package.db for me

comment:28 Changed 21 months ago by mpickering

I think this regression is caused by https://git.haskell.org/ghc.git/commitdiff/1722fa106e10e63160bb2322e2ccb830fd5b9ab3

I built a compiler which comments out the relevant two lines from that patch and the number of ticks for n=10 is around 1500 vs 75000 for an unpatched compiler.

So, relevant tickets are #13032 and #11230

comment:29 Changed 21 months ago by simonpj

Going from 1,500 to 75,000 ticks is stunning. Thank you for locating the offenting patch.

I spent some while digging. Here's what I learned

  • Discarding (case e of co -> blah) is indeed unsound, unless we know that e terminates, so Richard's patch in #11230 is right.
  • But his patch does not say why the remaining "optimisation" (which works for superclass selectors over a Coercible dictionary) is sound. And indeed, it is not; in Core I could produce a bottoming Coercible dictionary. This "optimsation" in CoreOpt is all in service of Note [Getting the map/coerce RULE to work], and I'm not sure how important that is.
  • I was also perplexed that in the case of #11230 the coericon involved seemed to be dead. The code is
    testPhantom :: Phantom Char -> Phantom Bool
    testPhantom x = id x
    
    which should generate something like
    data Wag x = MkWag ()
    type role Wag phantom
    
    testPhantom x = let co :: Wag Char ~ Wag Bool = error "blah"
                    in id @(PhantomChar) x |> co
    
    so that coercion co certainly isn't dead! But what happens is this. We actually generate
    testPhantom x = let co :: Char ~# Bool = error "blah"
                    in id @(PhantomChar) x |> mkSubCo (Wag co)
    
    where the mkSubCo turns a nominal coercion Char ~# Bool into a representational one. Rather than just generate SubCo always, it pushes the sub inwards. Because of the role of Wag, we then want to turn the nominal co into a phantom version, via downgradeRole. But (currently) that dicards co (retaining only its kind) -- see Coecion.toPhantomCo -- so now co appears to be dead. This seems wrong to me; the (nominal-equality) evidence really is needed and should really still be free in the result.

That's one set of issues. Now, returning to this ticket:

  • The perf changes in this ticket are presumably because of HEq_sc selections, as reported in #13032.
  • And here the evidence really is not needed!
  • And hence it's really wrong to force the dictionary; that makes the function stricter (in its dictionary argument) than it should be. (As well as less efficient.)

Solution: do not generate all these speculative "given" bindings in the first place. Instead, in the desugarer, figure out which given bindings are needed, and only emit those ones. That will generate less code -- perhaps a lot less in some exotic programs -- and be better all round.

comment:30 Changed 21 months ago by Simon Peyton Jones <simonpj@…>

In 954cbc7c/ghc:

Drop dead Given bindings in setImplicationStatus

Trac #13032 pointed out that we sometimes generate unused
bindings for Givens, and (worse still) we can't always discard
them later (we don't drop a case binding unless we can prove
that the scrutinee is non-bottom.

It looks as if this may be a major reason for the performace
problems in #14338 (see comment:29).

This patch fixes the problem at source, by pruning away all the
dead Givens.  See Note [Delete dead Given evidence bindings]

Remarkably, compiler allocation falls by 23% in
perf/compiler/T12227!

I have not confirmed whether this change actualy helps with

comment:31 Changed 21 months ago by simonpj

OK Matthew, can you see if the above patch helps?

comment:32 Changed 21 months ago by mpickering

I will try this again tomorrow.

comment:33 Changed 21 months ago by mpickering

Resolution: fixed
Status: newclosed

Cheers Simon. That did the trick.

comment:34 Changed 21 months ago by bgamari

Status: closedmerge

comment:35 Changed 21 months ago by k-bx

Hi. Is there a chance this patch could be included in the 8.2.3 release? We think that our codebase was affected by the bug, we've resolved the issue by applying -O0 in one of the modules, but it sounds like a rather risky hack to have, and also some other modules from our codebase still go considerably higher in RAM consumption upon compilation. Waiting for 8.4 would be rather unfortunate.

Thank you!

comment:36 Changed 21 months ago by bgamari

It seems quite unlikely that there will be an 8.2.3 at this point.

comment:37 Changed 21 months ago by bgamari

Status: mergeclosed

comment:38 Changed 20 months ago by k-bx

Just wanted to mention that a direct application of a patch on top of 8.2.2 (with some conflict resolution done) doesn't work for me and I'm not experienced enough to understand what's going on https://gist.github.com/k-bx/9b4a3d95f2de9c86b50dca1c615c63e6

I first cherry-picked the commit, but then tried manual change-by-change cherry-picking, getting same error on three machines (1 macOS, 2 Ubuntu).

I guess my another attempt would be to see if building 5124b04f10adfee6390f435a493984f2b45062d0 itself as a replacement for our 8.2.2 will be a good idea.

Note: See TracTickets for help on using tickets.