Opened 2 years ago

Closed 7 months ago

Last modified 7 months ago

#14037 closed bug (fixed)

Fix fusion for GHC's utility functions

Reported by: goldfire Owned by: rockbmb
Priority: normal Milestone:
Component: Compiler Version: 8.3
Keywords: newcomer Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #15263 Differential Rev(s): Phab:D5249
Wiki Page:

Description

In a patchset I'm working on, I encountered a smallish "bytes allocated" regression in perf/compiler/T5631. After some sleuthing, I discovered that the regression is caused by using a zipWith3M where previously there was a zipWithM. Interestingly, it was not the zipped function that caused the problem, but the zipWith3M itself. Further sleuthing found that zipWithM is inlined into a combination of sequenceA and zipWith, both of which have been taught to work nicely with fusion.

I thought of doing the same to GHC's zipWith3M, but I discovered that base's zipWith3 has no rules or inlining pragmas, suggesting that it does not fuse. I was quite surprised to find this, thinking that all functions as basic as zipWith3 in base would play nicely with fusion. (Perhaps it does fuse -- I didn't actually check.) I then checked out GHC's MonadUtils, which defines a bunch of functions, some of which must be hammered on, and only one of which has any kind of performance-enhancing pragma. (The one is zipWithAndUnzipM, whose INLINABLE pragma was added only after much pain suffered by your dear author.)

So: this ticket is a request to sweep through MonadUtils and Util and perhaps ListSetOps, looking for performance enhancements via fusion, tighter definitions, and such. Also, get zipWith3 (and the other zipWiths) to fuse in base.

I figure this is all some fairly low-hanging fruit for performance improvements in GHC.

Change History (17)

comment:1 Changed 2 years ago by Richard Eisenberg <rae@…>

In fb75213/ghc:

Track visibility in TypeEqOrigin

A type equality error can arise from a mismatch between
*invisible* arguments just as easily as from visible arguments.
But we should really prefer printing out errors from visible
arguments over invisible ones. Suppose we have a mismatch between
`Proxy Int` and `Proxy Maybe`. Would you rather get an error
between `Int` and `Maybe`? Or between `*` and `* -> *`? I thought
so, too.

There is a fair amount of plumbing with this one, but I think
it's worth it.

This commit introduces a performance regression in test
perf/compiler/T5631. The cause of the regression is not the
new visibility stuff, directly: it's due to a change from
zipWithM to zipWith3M in TcUnify. To my surprise, zipWithM
is nicely optimized (it fuses away), but zipWith3M is not.
There are other examples of functions that could be made faster,
so I've posted a separate ticket, #14037, to track these
improvements. For now, I've accepted the small (6.6%) regression.

comment:2 Changed 14 months ago by thomie

Keywords: newcomer added

comment:3 Changed 11 months ago by rockbmb

I'd like to take this issue on, but before I do I'll take a look at the modules in question.

Is there a benchmarking suite in GHC that includes the functions mentioned? If not then that would also be part of the issue.

comment:4 Changed 11 months ago by rockbmb

Owner: set to rockbmb

comment:5 Changed 11 months ago by bgamari

There is nothing that tests these functions in isolation although it's possible improvements would be reflected in the compiler allocations measured by the Nofib suite.

comment:6 in reply to:  5 Changed 11 months ago by rockbmb

Replying to bgamari:

There is nothing that tests these functions in isolation although it's possible improvements would be reflected in the compiler allocations measured by the Nofib suite.

I've been reading this resource on NoFib, very useful. I'll keep what you said in mind.

Since the OP also mentions testsuite/tests/perf/compiler/T5631, I'm also reading https://ghc.haskell.org/trac/ghc/wiki/Building/RunningTests so I can test the specific case described.

I've already added some pragma code to zipWith3, zipWith3M, zipwith4, zipwith4M. In the coming days I'll run benchmarks.

comment:7 Changed 11 months ago by rockbmb

Differential Rev(s): D5240

comment:8 Changed 11 months ago by rockbmb

Differential Rev(s): D5240Phab:D5240

comment:9 Changed 11 months ago by rockbmb

Differential Rev(s): Phab:D5240

comment:10 Changed 11 months ago by rockbmb

Differential Rev(s): Phab:5249

comment:11 Changed 11 months ago by potato44

Differential Rev(s): Phab:5249Phab:D5249

comment:12 in reply to:  11 Changed 11 months ago by rockbmb

Replying to potato44: Thanks, I didn't notice that mistake.

comment:13 Changed 11 months ago by osa1

Status: newpatch

comment:14 Changed 11 months ago by TDecki

comment:15 Changed 7 months ago by rockbmb

Resolution: fixed
Status: patchclosed

comment:16 Changed 7 months ago by rockbmb

Closed after this MR.

Last edited 7 months ago by rockbmb (previous) (diff)

comment:17 Changed 7 months ago by Marge Bot <ben+marge-bot@…>

In 9059343e/ghc:

base: Allow fusion for zip7 and related

Fixes #14037.

Metric Decrease:
    T9872b
    T9872d

Reviewers: bgamari, simonpj, hvr

Reviewed By: simonpj

Subscribers: AndreasK, simonpj, osa1, dfeuer, rwbarton, carter

GHC Trac Issues: #14037

Differential Revision: https://phabricator.haskell.org/D5249
Note: See TracTickets for help on using tickets.