Opened 11 years ago

Last modified 5 years ago

#2465 new feature request

Fusion of recursive functions

Reported by: ryani Owned by:
Priority: low Milestone:
Component: Compiler Version: 6.8.2
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: x86
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #3123 Differential Rev(s):
Wiki Page:

Description

This came up while playing with the ICFP2007 task.

I have a function dnaView :: DNA -> [D]; DNA is a [ByteString] and D is just the enum I|C|F|P. When I call a function that just pattern matches on the result, I think the pattern matching function should be fused with the view -- the view should run as a coroutine.

I've attached an example.

Attachments (2)

plzoptimize.hs (3.8 KB) - added by ryani 11 years ago.
Source code
plzoptimize.core (23.7 KB) - added by ryani 11 years ago.
ghc6.8.2 -ddump-simpl output

Download all attachments as: .zip

Change History (11)

Changed 11 years ago by ryani

Attachment: plzoptimize.hs added

Source code

Changed 11 years ago by ryani

Attachment: plzoptimize.core added

ghc6.8.2 -ddump-simpl output

comment:1 Changed 11 years ago by simonpj

difficulty: Unknown

I don't understand the example. Can you point to the exact code in the Core output that you think should be different, and what you are expecting to see instead?

Note also that dnaView is recursive, which sharply limits how much inlining GHC can do.

Simon

comment:2 Changed 11 years ago by ryani

I intentionally put the most pie-in-the-sky example in my bug; I realize this is a hard problem. The code I actually used, which still didn't get any fusion, only examined a small finite amount of the result of dnaView, following it by a recursive call to "consts'" using (dnaDrop n dna). In the "secrets of the GHC inliner" paper, I seem to recall reading that loop unrolling is just inlining of recursive functions, so I'd hope that dnaView gets loop-unrolled 2 or 3 times which would solve that problem.

That said, I do think it's possible to get from "consts + dnaView" (code which I feel is beautiful & slow) to "optConsts" (which is ugly, hand-optimized code to match what I would like the compiler to generate for this case).

So, some thoughts...

Is there a better way to write recursive functions that allows GHC to inline them better? I've seen several examples of this format:

{{{func a b c = go a b c where

go Something = base case go SomethingElse = combiner (SomeResult) (go RestOfData)}}}

If that really helps, it seems like a trivial transformation to do to simple recursive functions at compile-time.

I can experiment with a few different constructions of the program, but my point is that this is what most users would write as the direct specification -> implementation of this problem, the performance is pretty bad, and the "right" answer seems tantalizingly close.

When the optimizer is good, it's amazingly good, but when it goes wrong it looks so much worse by comparison. There isn't much of a "graceful falloff" in performance, you either get blazing fast or bad.

comment:3 Changed 11 years ago by ryani

Ugh, sorry about the formatting; forgot to click "preview".

func a b c = go a b c where
  go Something     = base case
  go SomethingElse = combiner SomeResult (go RestOfData)

comment:4 Changed 11 years ago by ryani

Also, to answer your question directly... what in the core should be different?

The list thunk allocations in $sdnaView (line 490) and dnaView (line 521) immediately get destructed by $wconsts (line 386), causing a lot of allocator churn.

If those two functions are fused, as they are in optConsts, the code runs ~5-10x faster (from my recollection).

comment:5 Changed 11 years ago by simonpj

Milestone: _|_
Priority: normallow

Fusion of recursive functions is a very interesting and well-studied area.

GHC offers so-called "short-cut deforestation" for lists. (Search for that string.) If you write dnaView using build (to make it a good producer), and consts' using foldr (to make it a good consumer) you may get better results.

Also worth a look is Coutts/Leschinskiy et al on stream fusion.

But GHC isn't going to fuse recursive functions all by itself anytime soon. So I'll refile this as low priority.

Don't let this discourage you from suggesting other optimisation "misses" though.

Simon

comment:6 Changed 10 years ago by simonmar

Type: compile-time performance bugrun-time performance bug

comment:7 Changed 10 years ago by simonmar

Type of failure: Runtime performance bug

comment:8 Changed 5 years ago by gintas

Operating System: WindowsUnknown/Multiple

comment:9 in reply to:  5 Changed 5 years ago by thomie

Summary: View + Pattern Match not fused sufficientlyFusion of recursive functions
Type: bugfeature request

But GHC isn't going to fuse recursive functions all by itself anytime soon.

Note: See TracTickets for help on using tickets.