Opened 20 months ago

Last modified 13 months ago

#14971 new task

Use appropriately sized comparison instruction for small values.

Reported by: AndreasK
Owned by:
Priority: normal
Milestone:
Component: Compiler (CodeGen)
Version:
Keywords: CodeGen
Cc: vanessa.mchale@…
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

GHC currently defaults all comparisons originating from Cmm switch statements to 64 bits on x86-64.

This incurs a small overhead in instruction size. Fixing this manually gave a speedup of ~1.5% in microbenchmarks.

In detail we generate Cmm of the form:

      _s8Dg::P64 = R1;
      _c8EF::P64 = _s8Dg::P64 & 7;
      switch [1 .. 2] _c8EF::P64 {
          case 1 : goto c8Ey;
          case 2 : goto c8EC;
      }

Which results in assembly like:

	andl $7,%ebx
	cmpq $1,%rbx
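
To make the value range explicit, here is a minimal Haskell model of what the Cmm above computes. It is only a sketch: `Target`, `lowBits` and `dispatch` are hypothetical names made up for illustration, and `Nothing` stands in for whatever happens outside the switch range.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word64)

-- Hypothetical names for the two jump targets in the Cmm above.
data Target = C8Ey | C8EC
  deriving (Eq, Show)

-- Models `_c8EF = _s8Dg & 7`: keep only the low three bits.
lowBits :: Word64 -> Word64
lowBits r1 = r1 .&. 7

-- Models `switch [1 .. 2]` on the masked value.
-- Nothing stands for "outside the switch range" (an assumption of this sketch).
dispatch :: Word64 -> Maybe Target
dispatch r1 = case lowBits r1 of
  1 -> Just C8Ey
  2 -> Just C8EC
  _ -> Nothing

main :: IO ()
main = do
  -- Whatever the upper bits hold, the scrutinee stays within 0..7.
  print (maximum (map lowBits [0, 1, 0xFFFFFFFFFFFFFFFF, 0x4200A8]))
  print (dispatch 0x4200A9)
  print (dispatch 0x4200AA)
```

Since the masked value can never exceed 7, a 64-bit comparison is wider than it needs to be.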

It's obvious that _c8EF fits into a byte, but it is sized up to 64 bits. Changing this would enable us to use cmpl instead of cmpq and save a byte on each comparison, since cmpq on %rbx carries a REX.W prefix that cmpl on %ebx does not need.
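
A rough sketch of the decision this would involve, written in Haskell. This is not GHC's code generator: `Width`, `cmpWidthFor` and `cmpImm8Size` are hypothetical helpers, and the size model only covers `cmp $imm8, %reg` on the legacy registers such as %rbx/%ebx.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word64)

-- Operand widths corresponding to the cmpl / cmpq suffixes in the ticket.
data Width = W32 | W64
  deriving (Eq, Show)

-- Hypothetical helper: choose the comparison width for a scrutinee that is
-- known to lie in [0, upperBound].
cmpWidthFor :: Word64 -> Width
cmpWidthFor upperBound
  | upperBound <= 0xFFFFFFFF = W32
  | otherwise                = W64

-- Encoded size in bytes of `cmp $imm8, %reg` for a legacy register:
-- opcode 0x83, ModRM byte, imm8, plus a REX.W prefix for the 64-bit form.
-- Not modelled: %r8..%r15, which need a REX prefix at either width.
cmpImm8Size :: Width -> Int
cmpImm8Size W32 = 3 -- 83 /7 ib
cmpImm8Size W64 = 4 -- REX.W 83 /7 ib

main :: IO ()
main = do
  let bound = maxBound .&. 7 :: Word64 -- largest possible result of `x & 7`
      w     = cmpWidthFor bound
  print w
  print (cmpImm8Size W64 - cmpImm8Size w) -- bytes saved per comparison
```

The 32-bit comparison is safe here because the preceding andl already zero-extends its result into the full 64-bit register.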

While this isn't major, in my microbenchmarks it resulted in a speedup of ~1.5% for such constructs in inner loops.

Change History (3)

comment:1 Changed 20 months ago by simonpj

Keywords: CodeGen added

Sounds good! 1.5% is a lot. How did you get that figure? Do you have a patch :-)?

comment:2 in reply to:  1 Changed 20 months ago by AndreasK

Replying to simonpj:

Sounds good! 1.5% is a lot. How did you get that figure?

Indeed it's more than I would have expected.

But these are microbenchmarks, so it's hard to predict how much of that will remain when running regular programs, given that cache behaviour and CPU bottlenecks would be different there.

I've compiled two criterion benchmarks and changed the assembly by hand. One example was:

module Main (main) where

import Criterion.Main

-- Partial: only 1, 2 and 3 are matched; main only feeds it values in that range.
mapInt :: Int -> Int
mapInt 1 = 111
mapInt 2 = 211
mapInt 3 = 311

{-# NOINLINE sumAndLookup #-}
sumAndLookup :: [Int] -> Int
sumAndLookup = sum . map mapInt

main :: IO ()
main = do
  let value = map (\x -> 1 + x `mod` 3) [1..30]
  print (sumAndLookup value)
  defaultMain [
    bgroup "opSize"
        [ bench "caseOfThree"  $ whnf sumAndLookup value]
    ]

Do you have a patch :-)?

Not yet. I submitted a project proposal for Summer of Code which includes writing a patch for this.

But if that should not work out I still expect to get around to it eventually.

comment:3 Changed 13 months ago by vanessamchale

Cc: vanessa.mchale@… added