Opened 10 years ago

Closed 10 years ago

#3655 closed bug (fixed)

Performance regression relative to 6.10

Reported by: simonmar
Owned by: simonpj
Priority: high
Milestone: 6.12.2
Component: Compiler
Version: 6.10.4
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime performance bug
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:


The attached program runs more slowly when compiled with 6.12 than with 6.10. The current HEAD is also slower than 6.10, but not as bad as 6.12.

The results are below, on x86-64/Linux, first with -O:

                time       allocation
6.10.2           9.6s      6.5GB
6.12.20091011   11.0s      7.5GB
6.13.20091111   10.2s      6.2GB

Interestingly, -O2 makes things even worse with 6.12, but makes things slightly better with both 6.10 and 6.13:

                time       allocation
6.10.2           9.5s      6.5GB
6.12.20091011   11.8s      7.5GB
6.13.20091111   10.1s      6.2GB

It may be that there is some degradation due to the new IO library, since the program generates a fair amount of output. That may account for some of the difference between 6.10.2 and 6.12/6.13, but it doesn't account for the difference between 6.12 and 6.13, which are both using the new IO library.

The program is a single module; compile it with no special options. To run it:

./pHlcm mushroom.dat 100 >/dev/null +RTS -s

Attachments (1)

3655.tar.gz (42.0 KB) - added by simonmar 10 years ago.

Change History (6)

Changed 10 years ago by simonmar

Attachment: 3655.tar.gz added

comment:1 Changed 10 years ago by simonmar

Strange things are afoot. With the STABLE buildbot build, this program only allocates 3.5GB and runs in 8.5s. Simon PJ's HEAD build has similar behaviour, but none of the other HEAD builds I have tried do - they all behave similarly to the 6.13 results above.

We continue to investigate with -ticky.

comment:2 Changed 10 years ago by simonmar

Type of failure: Runtime performance bug

comment:3 Changed 10 years ago by simonmar

We tracked it down to sequence_ not fusing with map in a couple of places. sequence_ did not have arity 2, which prevented a partially applied call from being inlined, and so the rule didn't fire. Simon PJ is going to investigate further.
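
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of code involved. It is not taken from the attached program or from the base library: sequencePF_, sequenceEta_ and runDemo are hypothetical names, and which shape of definition base actually used at the time is not stated in this ticket. The two definitions behave identically; they differ only in how many arguments appear on the left-hand side, which is what GHC looks at when deciding whether a call to an INLINE function is fully applied.

```haskell
module Main (main, runDemo) where

import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical stand-ins for sequence_ (assumptions for illustration,
-- not the definitions from base).

-- Point-free: no list argument on the left-hand side.
sequencePF_ :: Monad m => [m a] -> m ()
sequencePF_ = foldr (>>) (return ())
{-# INLINE sequencePF_ #-}

-- Eta-expanded: the list is an explicit argument on the left-hand side.
sequenceEta_ :: Monad m => [m a] -> m ()
sequenceEta_ ms = foldr (>>) (return ()) ms
{-# INLINE sequenceEta_ #-}

-- Runs the same side-effecting step over [1 .. 10] three ways and
-- returns the accumulated total.
runDemo :: IO Int
runDemo = do
  ref <- newIORef (0 :: Int)
  let step x = modifyIORef ref (+ x)
  -- The pattern from this comment: a sequence_-like function applied to map.
  sequencePF_ (map step [1 .. 10])
  sequenceEta_ (map step [1 .. 10])
  -- The direct equivalent, with no intermediate list of actions to fuse away.
  mapM_ step [1 .. 10]
  readIORef ref

main :: IO ()
main = runDemo >>= print  -- prints 165
```

All three calls perform the same actions, so the sketch only shows the shape of the code; whether the intermediate list is actually eliminated depends on the compiler version and flags, and can be checked by inspecting the Core with -ddump-simpl.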

comment:4 Changed 10 years ago by simonpj

Owner: set to simonpj

comment:5 Changed 10 years ago by simonpj

Resolution: fixed
Status: new → closed

OK, it seems ok in the HEAD, and I'm disinclined to follow it up further:

          Time     Allocation
6.10.2    9.27s    6.5G
HEAD      8.56s    3.8G

So HEAD allocates a lot less! Both spend 30% of time in GC.

For HEAD I had to add +RTS -K9m, whereas 6.10.2 worked with -K8m (but not -K7m), so stack usage seems to have increased a bit.

