Opened 3 years ago

Closed 3 years ago

#12996 closed bug (fixed)

Memory leak in recursion when switching from -O1 to -O2

Reported by: AndreasK Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.0.1
Keywords: Memory leak, optimization Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: testsuite/tests/perf/should_run/T12996
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by AndreasK)

For example code see attachment.

When compiling with -O1 the code behaves as expected with each loop taking about the same amount of time.

At -O2 each iteration takes (nonlinear) more time than the last and leaks memory. It gets slow enough that I never bothered to let it run past 40 iterations since the slowdown seems to be exponential.

I initially hit the bug in code using a Set instead of list but turns out the issue can be reduced to an example with lists.

Removing either filter, pattern matching or changing the loop fixes the issue.

I tested it under the following conditions:

  • Compiler flags: -fno-full-laziness/No flags
  • Compiler versions: GHC 8.0.1 and 7.10.3e
  • Operating Systems: Windows 10, Linux Mint

In every case -O1 works as expected with -O2 showing exponential slowdown with each iteration.

Attachments (5)

Main.hs (521 bytes) - added by AndreasK 3 years ago.
Example Code
O1.log (1.8 KB) - added by AndreasK 3 years ago.
Peformance reference for -O1
O2.log (1.2 KB) - added by AndreasK 3 years ago.
Performance reference for -O2
Main.verbose-stg2stg.O1 (2.8 KB) - added by AndreasK 3 years ago.
STG output at -O1
Main.verbose-stg2stg.O2 (6.1 KB) - added by AndreasK 3 years ago.
STG output at -O2

Download all attachments as: .zip

Change History (11)

Changed 3 years ago by AndreasK

Attachment: Main.hs added

Example Code

comment:1 Changed 3 years ago by AndreasK

Description: modified (diff)

Changed 3 years ago by AndreasK

Attachment: O1.log added

Peformance reference for -O1

Changed 3 years ago by AndreasK

Attachment: O2.log added

Performance reference for -O2

comment:2 Changed 3 years ago by AndreasK

Description: modified (diff)

Changed 3 years ago by AndreasK

Attachment: Main.verbose-stg2stg.O1 added

STG output at -O1

Changed 3 years ago by AndreasK

Attachment: Main.verbose-stg2stg.O2 added

STG output at -O2

comment:3 Changed 3 years ago by osa1

I analyzed this a little bit, here's my findings:

  • The function that causes this slowness is generated by SpecConstr. If you compile with -O2 -fno-spec-constr it works fine.
  • When appLoop function is defined like this:
appLoop :: AppState -> IO ()
appLoop s = do
  let AppState state = s
  print state
  appLoop $ AppState (cycleState state)

Demand analyzer thinks appLoop is not strict on s. If we define it like this:

appLoop :: AppState -> IO ()
appLoop (AppState state) = do
  print state
  appLoop $ AppState (cycleState state)

Worker/wrapper runs instead of SpecConstr and this version compiles to fast code.

I don't understand why -O2 version is getting slower in each iteration, but I suspect it has do to with Cmm rather than STG.

comment:4 Changed 3 years ago by simonpj

This turns out to be an example of #11731. It turns out that the thunk for cycleState state is wrongly marked "used-once", and that's why there's a massive slow-down. (Each iteration evaluates it twice, so it goes exponential.)

The patch for #11731 fixes it, but it never made it into GHC 8.0. I'll mark it thus in case we ship 8.0.3. It really is quite a serious bug.

I'll make this bug into a test case.

Meanwhile:

Demand analyzer thinks appLoop is not strict on s.

It looks strict, but the demand analyser has a hack for I/O: earlier operations migth throw an exception, and so we treat them as lazy in their continuation. See Note [IO hack in the demand analyser] in DmdAnal.

comment:5 Changed 3 years ago by Simon Peyton Jones <simonpj@…>

In 4535fa2/ghc:

Test Trac #12996

comment:6 Changed 3 years ago by simonpj

Resolution: fixed
Status: newclosed
Test Case: testsuite/tests/perf/should_run/T12996

Thank you AndreasK. It's hard to tickle this bug, but when tickled it can make things exponentially expensive as you discovered. Thanks for the nice test case.

I'll close this, leaving #11731 open so that we remember to merge to 8.0.3

Note: See TracTickets for help on using tickets.