Opened 8 years ago

Closed 8 years ago

#5152 closed bug (fixed)

GHC generates poor code for large 64-bit literals

Reported by: bos Owned by: igloo
Priority: normal Milestone: 7.4.1
Component: Compiler Version: 7.0.3
Keywords: Cc: as@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

I have a very simple test module:

module Word64 where

import Data.Word (Word64)

a :: Word64
a = 0x9ddfea08eb382d69

If I compile this module with a 64-bit version of GHC, I find that GHC is actually generating for a an Integer like so:

a2 :: Integer
a2 = S# 2152696470933351786

a1 :: Integer
a1 = S# 9223372036854775807

a :: Word64
a =
  case plusInteger a2 a1 of _ {
    S# i_atz ->
      W64# (int2Word# i_atz);
    J# s_au0 d_au1 ->
      case integer2Word# s_au0 d_au1
      of ww_au3 { __DEFAULT ->
      W64# ww_au3
      }
  }

This is bad because it means that a is always boxed.

If I attempt to manually inline such a definition, the code gets dramatically worse:

b :: Word64
b = 0xc3a5c85c97cb3127
{-# INLINE b #-}

Here's some generated core from a hot inner loop of a 64-bit hash function:

              case lvl3_r1aA of _ {
                Type.S# i_aVp ->
                  W64#
                    (xor#
                       (timesWord#
                          (or#
                             (uncheckedShiftL# ww10_aUE 22)
                             (uncheckedShiftRL# ww10_aUE 42))
                          (int2Word# i_aVp))
                       ww5_s17Y);
                Type.J# s_aVO d_aVP ->
                  case GMP.Internals.integer2Word# s_aVO d_aVP
                  of ww11_aVR { __DEFAULT ->
                  W64#
                    (xor#
                       (timesWord#
                          (or#
                             (uncheckedShiftL# ww10_aUE 22)
                             (uncheckedShiftRL# ww10_aUE 42))
                          ww11_aVR)
                       ww5_s17Y)
                  }
              } } in

We can see that GHC does case inspection on an Integer, then some casting and boxing on the result. This destroys performance of code that otherwise ought to be able to run from registers without any allocation at all.

Attachments (1)

CityHash.hs (2.6 KB) - added by bos 8 years ago.
The attached example code is not expected to run, as it's incomplete. I just noticed this bug while writing it, and the code does compile.

Download all attachments as: .zip

Change History (9)

Changed 8 years ago by bos

Attachment: CityHash.hs added

The attached example code is not expected to run, as it's incomplete. I just noticed this bug while writing it, and the code does compile.

comment:1 Changed 8 years ago by simonpj

Owner: set to igloo

I see. The reason for this is that, in general

  • An integer literal k means fromInteger (k::Integer)
  • If the Integer k is big, then then it is represented using a chain of multiplies and adds; see MkCore.mkIntegerExpr.

So the your 64-bit word constant is getting represented as

integerToWord64 (plusInteger (S# blah1) (S# blah2))

where blah, blah2 :: Int# are (small) int literals in Core.

GHC already has a mechanism to short-cut this stuff when we know all the types. The short cut is embodied in TcHsSyn.shortCutLit:

shortCutLit :: OverLitVal -> TcType -> Maybe (HsExpr TcId)
shortCutLit (HsIntegral i) ty
  | isIntTy ty && inIntRange i   = Just (HsLit (HsInt i))
  | isWordTy ty && inWordRange i = Just (mkLit wordDataCon (HsWordPrim i))
...etc...

All we need to do is to add cases for Word64 ... and for all the other fixed-width types.

Ian would you like to do that?

Simon

comment:2 Changed 8 years ago by igloo

Milestone: 7.4.1

comment:3 Changed 8 years ago by thoughtpolice

I've fixed this in the Word64 case at least. My branch with the fix is here:

https://github.com/thoughtpolice/ghc/tree/trac-5152

Relevant commit:

https://github.com/thoughtpolice/ghc/commit/18cf5ba840a7ecdf1860e54b8a6c8f3fcdf0441d

With GHC 7.0.4, here are my results:

$ cat ghc_5152.hs
module Main where

import Data.Word (Word64)

{-# NOINLINE a #-}
a :: Word64
a = 0x9ddfea08eb382d69

main = print a

I get:

ghc -ddump-simpl -fforce-recomp ghc_5152.hs
[1 of 1] Compiling Main             ( ghc_5152.hs, ghc_5152.o )

==================== Tidy Core ====================
Main.a [InlPrag=NOINLINE] :: GHC.Word.Word64
[GblId]
Main.a =
  GHC.Num.fromInteger
    @ GHC.Word.Word64
    GHC.Word.$fNumWord64
    (GHC.Integer.plusInteger
       (GHC.Integer.smallInteger 2152696470933351786)
       (GHC.Integer.smallInteger 9223372036854775807))

Main.main :: GHC.Types.IO ()
[GblId]
Main.main =
  System.IO.print @ GHC.Word.Word64 GHC.Word.$fShowWord64 Main.a

:Main.main :: GHC.Types.IO ()
[GblId]
:Main.main = GHC.TopHandler.runMainIO @ () Main.main

With my patch, I get:

~/code/ghc/inplace/bin/ghc-stage2 -ddump-simpl -fforce-recomp ghc_5152.hs
[1 of 1] Compiling Main             ( ghc_5152.hs, ghc_5152.o )

==================== Tidy Core ====================
Result size = 12

Main.a [InlPrag=NOINLINE] :: GHC.Word.Word64
[GblId, Caf=NoCafRefs]
Main.a = GHC.Word.W# __word 11376068507788127593

Main.main :: GHC.Types.IO ()
[GblId]
Main.main =
  System.IO.print @ GHC.Word.Word64 GHC.Word.$fShowWord64 Main.a

:Main.main :: GHC.Types.IO ()
[GblId]
:Main.main = GHC.TopHandler.runMainIO @ () Main.main

Which should be exactly what you want, Bryan.

I'm willing to extend this patch to handle other fixed-width types - I assume Int64 probably needs handling among other things. If this is what should be done, I'm willing to assign this ticket to myself and add those cases.

comment:4 Changed 8 years ago by thoughtpolice

Cc: as@… added

comment:5 Changed 8 years ago by simonpj

c.f all the stuff in #5284.

comment:6 Changed 8 years ago by bos

Looks great, Ian, thanks!

comment:7 Changed 8 years ago by simonpj

The whole approach in this ticket is a bit unsatisfactory (I know, I recommended it!) because it only works if the typechecker knows enough at the moment it encounters the literal. The vaguaries of inference might mean it doesn't know enough, and it certainly won't know enough if you have constant in code that is subsequently specialised:

f :: Num a => a -> a
f a = a + 0x9ddfea08eb382d69
{-# SPECIALISE f :: Word64 -> Word64 #-}

The Right Thing is really for the optimiser to do the right constant-folding thing. I can see two reasonable alternative approaches.

The first is to treat big integer literals using a list of coefficients thus:

makeInteger :: [Int] -> Integer

(This would replace the chain of plus/multiply stuff at the moment.) With the idea that

makeInteger [a,b,c] = a + 10000*(b + 10000*c)

(Actually 230 would probably be a better constant than 10000.) Now you could sensibly have a rule

RULE "fromInteger/Float"
 fromInteger (makeInteger (a:as)) = a + 10000*fromInteger (makeInteger as) :: Word64

or something like that. (This would allow out of range Word constants, which would just overflow. I'd prefer a static error...)

A second approach would be to make Core have a built-in, boxed, literal type Integer, not (visibly) represented with a data type. It would directly support constant-folding etc just as Int# does. Then in the back end it'd be expanded to a call to makeInteger or something like that.

comment:8 Changed 8 years ago by igloo

Resolution: fixed
Status: newclosed

With ghc -O -ddump-simpl we now get:

==================== Tidy Core ====================
Result size = 6

Word64.a :: GHC.Word.Word64
[GblId,
 Caf=NoCafRefs,
 Str=DmdType m,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=IF_ARGS [] 10 110}]
Word64.a = GHC.Word.W64# __word 11376068507788127593

Word64.b [InlPrag=INLINE (sat-args=0)] :: GHC.Word.Word64
[GblId,
 Caf=NoCafRefs,
 Str=DmdType m,
 Unf=Unf{Src=InlineStable, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=ALWAYS_IF(unsat_ok=False,boring_ok=False)
         Tmpl= GHC.Word.$fBitsWord64_$cfromInteger
                 __integer 14097894508562428199}]
Word64.b = GHC.Word.W64# __word 14097894508562428199
Note: See TracTickets for help on using tickets.