Opened 6 years ago

Closed 4 years ago

Last modified 4 years ago

#8832 closed bug (fixed)

Constant-folding regression wrt `clearBit (bit 0) 0 `

Reported by: hvr Owned by: bgamari
Priority: high Milestone: 8.0.1
Component: Core Libraries Version: 7.8.1-rc2
Keywords: Cc: ekmett
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: simplCore/should_compile/T8832
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D1255
Wiki Page:

Description

While implementing zeroBits (see [83bd2f5fc7e/base]) I noticed that constant folding of the expression clearBit (bit 0) 0 regressed (and improved at the same time) from GHC 7.6.3 to GHC 7.8.1, specifically, the following module

{-# LANGUAGE CPP #-}

module M where

import Data.Bits
import Data.Int
import Data.Word

#define T(s,T) \
s :: T ; \
s = clearBit (bit 0) 0 ; \

T(i,Int)
T(i8,Int8)
T(i16,Int16)
T(i32,Int32)
T(i64,Int64)

T(w,Word)
T(w8,Word8)
T(w16,Word16)
T(w32,Word32)
T(w64,Word64)

T(z,Integer)

compiled with GHC 7.8.1RC2 results in the following Core output:

-- GHC 7.8.1RC2

i = I# (andI# 1 (notI# 1))

i8 = I8# 0
i16 = I16# 0
i32 = I32# 0
i64 = I64# 0

w = W# (__word 0)
w8 = W8# (__word 0)

w16 = W16# (__word 0)
w32 = W32# (__word 0)
w64 = W64# (__word 0)

z2 = $w$cbit 0
z1 = complementInteger z2
z = andInteger z2 z1

Thus, i and z are not properly constant-folded in GHC 7.8.1RC2. With GHC 7.6.3, however, i and z were properly folded to 0:

-- GHC 7.6.3

i = I# 0

i8 =
  case $fBitsInt8_$cbit i of _ { I8# x#_aDf ->
  case $fBitsInt8_$cbit i of _ { I8# x#1_aDr ->
  I8#
    (word2Int#
       (and#
          (int2Word# x#_aDf)
          (xor# (int2Word# x#1_aDr) (__word 18446744073709551615))))
  }
  }

i16,i32,i64 -- equivalent to i8

w = W# (__word 0)

w8 =
  case $fBitsWord8_$cbit i of _ { W8# x#_aEV ->
  case $fBitsWord8_$cbit i of _ { W8# x#1_aF5 ->
  W8# (and# x#_aEV (xor# x#1_aF5 (__word 255)))
  }
  }

w16,w32,w64 -- equivalent to w8

z = __integer 0

Change History (34)

comment:1 Changed 6 years ago by simonpj

Cc: jstolarek added

Jan might you look at this? It seems to be in the general area you were working in. Yell if you can't. Thanks!

Simon

comment:2 Changed 6 years ago by jstolarek

Sorry Simon, but I don't have enough time to look into this one. I'm working on #8707 at the moment.

comment:3 Changed 6 years ago by simonpj

Owner: set to simonpj

I have a fix for this, but only on my laptop. And since I can't build on my laptop (we have a seg-fault on Windows that needs someone to pay attention to) it'll have to wait till I'm back in the office.

Simon

comment:4 Changed 6 years ago by Simon Peyton Jones <simonpj@…>

In 8fd7d581da448d81fc2f9d47366c36c5f57ed564/ghc:

Add BuiltinRules for constant-folding not# and notI# (logical complement)

I don't know why these constant-folding rules were implemented for
and/or/xor but not for 'not'.

Adding them is part of the fix for Trac #8832.
(The other part is in Data.Bits.)

comment:5 Changed 6 years ago by Simon Peyton Jones <simonpj@…>

In ea6dcef1d9800953b1791304d52884359f415ad9/ghc:

Test Trac #8832

The test is a bit crude; -ddump-simpl | grep '#'.

I'm concerned that the -ddump-simpl output may differ on 32 and 64-bit
platforms.  So far I've only put in output for 64-bit platforms.

comment:6 Changed 6 years ago by Simon Peyton Jones <simonpj@…>

In f7a7b586bc89ca7fd56792da4172bd93a2acdae9/base:

Add shiftR and shiftL implementations to instance Bits Integer

Apart from simply making sense (avoid the conditional in 'shift'),
this makes left and right shifts on Integer more likely to inline
(plain 'shift' is just too large); and this in turn is important
when fixing the Integer case of #8832

comment:7 Changed 6 years ago by simonpj

I believe I've fixed this. I'm not sure if it's worth merging into 7.8; the status quo is definitely not a show-stopper.

Simon

comment:8 Changed 6 years ago by simonpj

Test Case: simplCore/should_compile/T8832

comment:9 Changed 6 years ago by thoughtpolice

Milestone: 7.8.1
Resolution: fixed
Status: newclosed

Thanks Simon. I don't think this is worth merging with the fix Herbert put in the 7.8 branch, so I'm punting it off the 7.8.1 milestone now.

comment:10 Changed 5 years ago by simonpj

Owner: simonpj deleted
Resolution: fixed
Status: closednew

The test fails on 32-bit machines, because we don't have 64-bit primops on a 32-bit machine, so the constant folding doesn't work.

Maybe we should have 64-bit primops on a 32-bit machine, but that's a separate story.

Meanwhile, we need to fix the test so that it omits the 64-bit test on 32-bit machines. Austin can you do this, post release?

Simon

comment:11 Changed 5 years ago by simonpj

Owner: set to thoughtpolice

comment:12 Changed 5 years ago by Herbert Valerio Riedel <hvr@…>

In 40ba3daa8ce68eea92f7a42b0c1f6c716636b494/ghc:

Expect test failure for T8832 on 32bit (re #8832)

Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>

comment:13 Changed 5 years ago by thoughtpolice

Milestone: 7.10.1

comment:14 Changed 5 years ago by simonpj

Priority: normalhigh

The bug is fixed. The only issue is that the test has some 64-bit stuff in it, and on a 32-bit machine that doesn't constant-fold. Which is probably fine.

To close the ticket we just need to make the test depend on whether we are on a 64-bit machine. I don't know how to do that, but it can't be hard. Could someone do it?

I'll make it "high" priority to get it some attention, rather than because it's terribly important.

Simon

comment:15 Changed 5 years ago by Reid Barton <rwbarton@…>

In a72614c40186521da7ba090b102436e61a80b7a7/ghc:

Make T8832 operative on 32-bit systems (#8832)

(Also, the 'extra_clean' was unnecessary.)

comment:16 Changed 5 years ago by rwbarton

Is this what you had in mind, Simon? Without the #ifdefs the grep output on a 32-bit system would contain uniques, so I couldn't just add a .stdout-ws-32 file.

comment:17 Changed 5 years ago by simonpj

Resolution: fixed
Status: newclosed

That's great thank you.

(Do you know about -dsuppress-uniques? But in any case, it's probably best to suppress the 64-bit test on a 32-bit system.)

comment:18 in reply to:  15 ; Changed 5 years ago by hvr

Replying to Reid Barton <rwbarton@…>:

In a72614c40186521da7ba090b102436e61a80b7a7/ghc:

Make T8832 operative on 32-bit systems (#8832)

(Also, the 'extra_clean' was unnecessary.)

Btw, now that I saw that commit, I've seen other testcases making use of the pre-defined SIZEOF_HSWORD (and there's also WORD_SIZE_IN_BITS) CPP symbol to test for wordsize. Would that have worked as well?

comment:19 in reply to:  18 Changed 5 years ago by rwbarton

Replying to hvr:

Btw, now that I saw that commit, I've seen other testcases making use of the pre-defined SIZEOF_HSWORD (and there's also WORD_SIZE_IN_BITS) CPP symbol to test for wordsize. Would that have worked as well?

Predefined by what? I tried ghc -E -optP-dM -cpp foo.hs; cat foo.hspp as suggested somewhere in the User's Guide to see what C preprocessor symbols were automatically defined and there wasn't anything like that among them. Or do you mean #includeing one of GHC's header files?

comment:20 Changed 5 years ago by thomie

Owner: thoughtpolice deleted
Resolution: fixed
Status: closednew

z is still not properly constant-folded.

The test is deceiving. It greps for '#', and there is no '#' in the output for z (not until #8274 is implemented, which I'm working on).

$ ghc-7.10.0.20150123 Temp.hs -ddump-simpl -fforce-recomp -dsuppress-all -O
[1 of 1] Compiling M                ( Temp.hs, Temp.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 10, types: 3, coercions: 0}

z2
z2 = bitInteger 0

z1
z1 = complementInteger z2

z
z = andInteger z2 z1

comment:21 Changed 5 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to the 7.12.1 milestone, as these tickets won't be fixed in time for the 7.10.1 release (unless you, the reader, help write a patch :)

comment:22 Changed 4 years ago by Thomas Miedema <thomasmiedema@…>

In 5e66a698dae8c01bcd1a9335346145b32016e119/ghc:

Testsuite: change some expect_fail tests to expect_broken

Change the following tests from expect_fail to expect_broken: and list
the ticket number:

    * driver/sigof03m/sigof03 (#9252)
    * driver/static001 (#8127)
    * partial-sigs/should_compile/EqualityConstraint (#9478)
    * partial-sigs/should_compile/ExtraNumAMROn (#9478)
    * partial-sigs/should_compile/PatBind2 (#9478)
    * partial-sigs/should_fail/TidyClash2 (#9478)
    * simplCore/should_compile/T8832 (#8832)

The following tests are still marked as expect_fail, but it is not
clearly documented why so:

    * gadt/lazypatok
    * indexed-types/should_fail/SkolemOccursLoop

All other expect_fail tests are only expected to fail on either a
certain platform/os or for a certain way only.

Differential Revision: https://phabricator.haskell.org/D966

comment:23 Changed 4 years ago by bgamari

The problem here is that the Bits instance for Integer overrides bit, using it's own bitInteger function in place of the usual bitDefault. bitDefault is constant folded by PrelRules by virtue of being implemented in terms of shiftL.

This is presumably done to optimize the case of construction of a BigNat for large arguments, but bitBigNat#, which is what handles this case, is currently just the naive implementation with a TODO.

comment:24 Changed 4 years ago by simonpj

Cc: ekmett added
Component: CompilerCore Libraries
Owner: set to ekmett

OK so this is a library problem, not a GHC problem. Core Libraries Committee, might you take this on?

Simon

comment:25 Changed 4 years ago by bgamari

For the record, hvr has indicated that he will eventually be implementing a more efficient bitInteger. However, this still means that Integer won't get proper constant folding (as it won't be implemented in terms of shiftL).

I'm not really sure it's fair to call this a library problem: the compiler currently makes the library author choose between having a constant-folded bit implementation (by implementing in terms of other operations which are constant folded) or an implementation which is implemented efficiently for the type in question. There are a few ways we could handle this:

  1. add a PrelRule to specifically handle Bits.bits
  2. somehow extend the rule rewriting system to allow these rules to be expressed in the source language

Option 1 is unfortunate in that Bits.bits is likely far from the last operation which will need this treatment. Moreover, it leaves users who want to implement other types implementing Bits and similar classes covered by PrelRules high and dry.

Arguably this is a limitation of the rule rewriting system, hence option 2. At first glance it would seem like allowing a rule to match conditioned on the nature of its arguments (either literal or non-literal) and allowing compile-time evaluation of the RHS may be sufficient to address this. This, however, would be non-trivial to implement (namely compile-time evaluation would require the interpreter, hence the need for PrelRules) and may present termination issues. For these reasons (and perhaps others I haven't yet thought of) I doubt this is a viable path.

Last edited 4 years ago by bgamari (previous) (diff)

comment:26 Changed 4 years ago by ekmett

Simon,

I can talk to Herbert about fixing up bitBigNat#, but if the issue is that having it at all kills constant folding, it sounds like we'd still have a problem.

comment:27 Changed 4 years ago by bgamari

Owner: changed from ekmett to bgamari

I discussed this with Simon during out GHC meeting and we agreed that the best solution here would be to add a PrelRule covering bit.

comment:28 Changed 4 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:29 Changed 4 years ago by jstolarek

Cc: jstolarek removed

comment:30 Changed 4 years ago by bgamari

Differential Rev(s): Phab:D1255
Status: newpatch

comment:31 Changed 4 years ago by Austin Seipp <austin@…>

In cf90a1e/ghc:

Add constant-folding rule for Data.Bits.bit

This adds a constant-folding rule for `Integer`'s implementation of `bit` and
fixes the `T8832` testcase. Fixes #8832.

Reviewed By: simonpj, austin

Differential Revision: https://phabricator.haskell.org/D1255

GHC Trac Issues: #8832

comment:32 Changed 4 years ago by Thomas Miedema <thomasmiedema@…>

In 4f9ee91/ghc:

Testsuite: update expected output for T8832 on 32-bit systems (#8832)

comment:33 Changed 4 years ago by thomie

Resolution: fixed
Status: patchclosed

comment:34 Changed 4 years ago by Thomas Miedema <thomasmiedema@…>

In 5883b56/ghc:

Testsuite: properly fix T8832.stdout-ws-32 (#8832)
Note: See TracTickets for help on using tickets.