Opened 5 years ago

Closed 5 years ago

#9188 closed bug (duplicate)

quot with a power of two is not optimized to a shift

Reported by: tibbe Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.8.2
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #8311, #5615 Differential Rev(s):
Wiki Page:

Description

The follow code:

module Test where

f :: Int -> Int
f n = n `quot` 2

results in the following core:

Test.f :: GHC.Types.Int -> GHC.Types.Int
Test.f =
  \ (n_aeH :: GHC.Types.Int) ->
    case n_aeH of _ { GHC.Types.I# ww_aiQ ->
    GHC.Types.I# (GHC.Prim.quotInt# ww_aiQ 2)
    }

which in turn generates this Cmm

sjI_ret()
        { Just sjI_info:
                   const 0;
                   const 32;
        }
    cjX:
        Hp = Hp + 16;
        if (Hp > I64[BaseReg + 144]) goto ck3;
        _sjH::I64 = %MO_S_Quot_W64(I64[R1 + 7], 2);
        I64[Hp - 8] = PicBaseReg + GHC.Types.I#_con_info;
        I64[Hp + 0] = _sjH::I64;
        R1 = Hp - 7;
        Sp = Sp + 8;
        jump (I64[Sp + 0]); // [R1]
    ck1: jump (I64[BaseReg - 16]); // [R1]
    ck3:
        I64[BaseReg + 192] = 16;
        goto ck1;
}

which finally ends up as this assembly:

sjI_info:
_cjX:
	addq $16,%r12
	cmpq 144(%r13),%r12
	ja _ck3
	movl $2,%ecx
	movq 7(%rbx),%rax
	cqto
	idivq %rcx
	movq %rax,%rbx
	leaq GHC.Types.I#_con_info(%rip),%rax
	movq %rax,-8(%r12)
	movq %rbx,0(%r12)
	leaq -7(%r12),%rbx
	addq $8,%rbp
	jmp *0(%rbp)

Ideally this should have turned into a shift, not a division. compiler/nativeGen/X86/CodeGen.hs lacks any peephole optimizations for division.

Change History (3)

comment:1 Changed 5 years ago by rwbarton

See also #8311.

Note that quot by 2 rounds towards zero, so for negative Ints it's not quite just a shift. The LLVM backend does handle this well, producing this sequence for the division:

  7c:   48 89 d0                mov    %rdx,%rax
  7f:   48 c1 e8 3f             shr    $0x3f,%rax
  83:   48 01 d0                add    %rdx,%rax ; add sign bit to move towards 0
  86:   48 d1 f8                sar    %rax

A div by 2 really would be a single arithmetic shift, but at the Cmm level there is still a function call to divInt# :(

comment:2 Changed 5 years ago by jstolarek

Johan, do you think we can consider your report a duplicate of #5615?

comment:3 Changed 5 years ago by tibbe

Resolution: duplicate
Status: newclosed

Duplicate of #5615.

Note: See TracTickets for help on using tickets.