Opened 5 years ago

Closed 5 years ago

#9740 closed bug (fixed)

D380 caused fft2 regressions

Reported by: dfeuer Owned by:
Priority: normal Milestone: 7.10.1
Component: Core Libraries Version: 7.8.3
Keywords: Cc: core-libraries-committee@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D394
Wiki Page:

Description

When I got rid of the hand-unboxing in GHC.List, I missed the fact that a take helper function took an unboxed value and was therefore strict. When I changed that to a regular function, I didn't make it strict, and this caused a regression in fft2 and (I believe) also a test suite perf test. The function is properly strictified in D390.

Change History (10)

comment:1 Changed 5 years ago by dfeuer

Status: newpatch

comment:2 Changed 5 years ago by dfeuer

Cc: core-libraries-committee@… added
Component: CompilerCore Libraries

comment:3 Changed 5 years ago by Joachim Breitner <mail@…>

In 5f69c8efd94862261bc6730f8dd80c2b67b430ad/ghc:

Reorder GHC.List; fix performance regressions

Rearrange some oddly placed code.

Modify `take` to make the fold unconditionally strict in the passed
`Int`. This clears up the `fft2` regression.
This fixes #9740. Differential Revision: https://phabricator.haskell.org/D390

comment:4 Changed 5 years ago by nomeata

Resolution: fixed
Status: patchclosed

Thanks!

comment:5 Changed 5 years ago by nomeata

Resolution: fixed
Status: closednew

It fixed the regression in T7257: http://ghcspeed-nomeata.rhcloud.com/timeline/?exe=2&base=2%2B68&ben=tests%2Falloc%2FT7257&env=1&revs=50&equid=on but not the one in fft2: http://ghcspeed-nomeata.rhcloud.com/timeline/?exe=2&base=2%2B68&ben=nofib%2Fallocs%2Ffft2&env=1&revs=50&equid=on

Do your measurements disagree, or were you just expecting this to fix it. Could you have a second look?

comment:6 in reply to:  5 Changed 5 years ago by dfeuer

Replying to nomeata:

It fixed the regression in T7257: http://ghcspeed-nomeata.rhcloud.com/timeline/?exe=2&base=2%2B68&ben=tests%2Falloc%2FT7257&env=1&revs=50&equid=on but not the one in fft2: http://ghcspeed-nomeata.rhcloud.com/timeline/?exe=2&base=2%2B68&ben=nofib%2Fallocs%2Ffft2&env=1&revs=50&equid=on

Do your measurements disagree, or were you just expecting this to fix it. Could you have a second look?

That's most peculiar. I had not measured T7257, but I definitely saw an improvement in fft2, unless I read something all wrong. I'm running nofib now and will review the numbers again.

comment:7 Changed 5 years ago by dfeuer

I think I found the problem. I accidentally had an unrestricted INLINE on take. I'm trying again now.

comment:8 Changed 5 years ago by dfeuer

Differential Rev(s): Phab:D390Phab:D394
Status: newpatch

It looks like I've actually squashed this one now and recovered nomeata's full fft2 improvement.

comment:9 Changed 5 years ago by Joachim Breitner <mail@…>

In 64d0a198be05c7baff36e43ab96928a402f00a19/ghc:

Really fix fft2 regression. #9740

Rewrite `take` more aggressively for fusion. Add some more explicit
strictness to `unsafeTake` and `unsafeDrop` that seems to help code size and
allocation just a drop in some nofib tests. They were not previously
strict in their numerical arguments, but always called in contexts where
those had been forced; it didn't make a difference in simple test cases,
but made a small difference for nofib. See #9740.

Differential Revision: https://phabricator.haskell.org/D394

comment:10 Changed 5 years ago by nomeata

Resolution: fixed
Status: patchclosed

Fix confirmed. The regression had also affected mandel, where performance is recovered as well.

Thanks for taking the time to hunt down the issue!

Note: See TracTickets for help on using tickets.