Opened 8 years ago

Closed 8 years ago

#5631 closed bug (fixed)

Compilation slowdown from 7.0.x to 7.2.x

Reported by: bjpop Owned by: igloo
Priority: highest Milestone: 7.4.1
Component: Compiler Version: 7.2.1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case: perf/compiler/T5631
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Compile time performance for GHC appears to have slowed down significantly for the package:

http://hackage.haskell.org/package/language-python

Time to compile the package:

  • GHC 7.0.4: 3m57s
  • GHC 7.2.1: 49m22s
  • GHC 7.2.2: 49m56s

The culprits are the Happy generated modules:

Language.Python.Version3.Parser.Parser Language.Python.Version2.Parser.Parser

GHC uses up to 700MB of memory compiling these files (my machine has 2GB of RAM and it does not appear to be thrashing during the compilation).

All versions of GHC in this ticket were downloaded as binary packages from the ghc website.

System:

OS X 10.7.2 1.4 GHz Intel Core 2 Duo 2GB RAM GCC 4.2.1

Attachments (3)

test_rules2.hs (30.6 KB) - added by simonmar 8 years ago.
compilation-time regression test
test_rules2.y (2.0 KB) - added by simonmar 8 years ago.
happy input file
ghc-stage2.prof (159.8 KB) - added by simonmar 8 years ago.

Download all attachments as: .zip

Change History (15)

comment:1 Changed 8 years ago by simonmar

Milestone: 7.4.1
Priority: normalhigh

Thanks. Looks pretty serious, we'll investigate.

comment:2 Changed 8 years ago by simonpj@…

commit 207a2a63e039af2bf6c331ef5a005512dd25fd77

Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Thu Nov 17 13:13:15 2011 +0000

    Remove a quadratic complexity blow-up in coercionKind
    thereby fixing Trac #5631.
    
    See Note [Nested InstCos] in Coercion

 compiler/types/Coercion.lhs |   51 ++++++++++++++++++++++++++++++------------
 1 files changed, 36 insertions(+), 15 deletions(-)

comment:3 Changed 8 years ago by simonmar

While trying to make a test case for this bug, I appear to have found another compilation-time regression (or possibly the same one in a different guise). Compiling the attached module (test_rules2.hs), I get the following times:

GHC 7.0.3:
     642,777,576 bytes allocated in the heap
  Total time    1.44s  (  1.95s elapsed)

GHC 7.2.1:
     615,395,312 bytes allocated in the heap
  Total   time    1.38s  (  2.45s elapsed)

GHC 7.3.20111119:
   6,063,152,720 bytes allocated in the heap
  Total   time    6.81s  (  6.89s elapsed)

GHC is allocating 10x as much as 7.2.1 while compiling this module!

Changed 8 years ago by simonmar

Attachment: test_rules2.hs added

compilation-time regression test

Changed 8 years ago by simonmar

Attachment: test_rules2.y added

happy input file

comment:4 Changed 8 years ago by simonmar

Priority: highhighest

comment:5 Changed 8 years ago by simonmar

Owner: set to simonpj

We have a partial fix that I am validating right now, this brings it down to

GHC 7.3.20111119:
   1,647,677,024 bytes allocated in the heap
  Total   time    4.21s  (  4.27s elapsed)

I will also attach a profile. Simon is going to look into the remaining problems.

Changed 8 years ago by simonmar

Attachment: ghc-stage2.prof added

comment:6 Changed 8 years ago by marlowsd@…

commit ed043776f3363b2b9e56f1e9836a250f43c28d26

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Tue Nov 22 14:09:05 2011 +0000

    uType_defer: only call mkErrInfo if -ddump-tc-trace is on (#5631)
    
    Profiling shows we're wasting a lot of time calling mkErrInfo and
    tidying the context, work that is wasted if we're not tracing.
    
    There is more to this compilation performance regression than this,
    but this is one bug.

 compiler/typecheck/TcUnify.lhs |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

comment:7 Changed 8 years ago by simonmar

To remind us: we need to look into why uType_defer was being called here rather than unifying directly. Simon PJ was suspicious.

comment:8 Changed 8 years ago by dimitris

I've done a lot of modifications to remove various bottlenecks in our new design and just brought the time for this down to 1.8 secs again. I've got a big patch pending along with several more performance benchmarks for the testuite which I will push on Monday.

comment:9 Changed 8 years ago by simonpj@…

commit e99f323405a8398f0521686d87124453e85b599a

Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Sat Nov 26 21:47:39 2011 +0000

    Add missing cases in TcUnify.uUnfilledVars
    
    These missing cases dealt with unifying a meta type variable with a
    skolem when the kinds match -- a pretty common case.  The missing
    cases meant that instead of directly solving on the fly (which is easy
    in this situation) we were generating an equality constraint viat
    `utype_defer`.  This isn't *wrong*, but it's a lot less efficient than
    it could be!
    
    All this arose from investigating #5631.  This one change does this
    to the compiler allocation
    
    Before:
         821,257,552 bytes allocated in the heap
                  94 MB total memory in use (0 MB lost due to fragmentation)
    
      MUT     time    1.54s  (  1.67s elapsed)
      GC      time    1.36s  (  1.60s elapsed)
      Total   time    2.93s  (  3.27s elapsed)
    
    After:
         424,244,124 bytes allocated in the heap
                  49 MB total memory in use (0 MB lost due to fragmentation)
    
      MUT     time    0.64s  (  0.89s elapsed)
      GC      time    0.83s  (  0.77s elapsed)
      Total   time    1.47s  (  1.66s elapsed)
    
    Not bad for a 3-line change!

 compiler/typecheck/TcUnify.lhs |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

comment:10 Changed 8 years ago by simonpj

Owner: changed from simonpj to igloo
Test Case: perf/cmpiler/T5631

OK I fixed utype_defer, and added a regression test (the one Simon constructed). Ian, would you like to compare the original report error: compiling language-python with 7.0 vs HEAD?

Simon

comment:11 Changed 8 years ago by dimitris

Test Case: perf/cmpiler/T5631perf/compiler/T5631

comment:12 Changed 8 years ago by igloo

difficulty: Unknown
Resolution: fixed
Status: newclosed

language-python (with a handful of build fixes) just compiled for me with HEAD in 4:01.49 total.

Note: See TracTickets for help on using tickets.