Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#10677 closed bug (fixed)

slightly silly assembly for testing whether a Word# is 0##

Reported by: rwbarton Owned by:
Priority: low Milestone: 8.0.1
Component: Compiler (CodeGen) Version: 7.11
Keywords: newcomer Cc:
Operating System: Unknown/Multiple Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #10750 Differential Rev(s): Phab:D1137
Wiki Page:

Description

{-# LANGUAGE MagicHash #-}

module Zero where

import GHC.Exts

f :: Word# -> Word#
f 0## = 1##
f x## = x##

in HEAD produces

0000000000000018 <ZZero_f_info>:
  18:	49 83 fe 01          	cmp    $0x1,%r14
  1c:	72 06                	jb     24 <ZZero_f_info+0xc>
  1e:	4c 89 f3             	mov    %r14,%rbx
  21:	ff 65 00             	jmpq   *0x0(%rbp)
  24:	bb 01 00 00 00       	mov    $0x1,%ebx
  29:	ff 65 00             	jmpq   *0x0(%rbp)

Well, cmp $0x1,%r14/jb isn't wrong. But it's a byte longer than test %r14,%r14/je, so the latter should be preferred.

GHC 7.10 produced the test/je version, so I'm guessing this is a side effect of nomeata's work on #10137.

Change History (11)

comment:1 Changed 4 years ago by rwbarton

Keywords: newcomer added

This is probably pretty easy to fix as a special case in the Cmm switch->branches code.

comment:2 Changed 4 years ago by rwbarton

See also #10750

comment:3 Changed 4 years ago by nomeata

Is there a good way to have test cases for these issues?

comment:4 Changed 4 years ago by nomeata

I tried to fix this, but now (presumably due to a later optimization) it produces

	testq %r14,%r14
	jne _cQl

I hope that is ok as well.

comment:5 Changed 4 years ago by nomeata

Differential Rev(s): Phab:D1137
Status: newpatch

comment:6 in reply to:  4 Changed 4 years ago by rwbarton

Replying to nomeata:

Is there a good way to have test cases for these issues?

Maybe grep -ddump-opt-cmm output for the condition if (%MO_F_Gt_W64(... and check that it compares something to 0, not 1?

Replying to nomeata:

I tried to fix this, but now (presumably due to a later optimization) it produces

	testq %r14,%r14
	jne _cQl

I hope that is ok as well.

Yes, that's just as good (generally speaking, in any particular case one may be better than the other because they imply different basic block layouts, but understanding that is a whole separate issue).

comment:7 Changed 4 years ago by Joachim Breitner <mail@…>

In 92f35cd9/ghc:

cmmCreateSwitchPlan: Handle singletons up-front

and make sure these are implemented with an equality check, which is a
shorter instruction. This was suggested by rwbarton in #10677.

Differential Revision: https://phabricator.haskell.org/D1137

comment:8 Changed 4 years ago by nomeata

Resolution: fixed
Status: patchclosed

comment:9 Changed 4 years ago by nomeata

This causes a + 15.08% runtime regression in fannkuch-redux:

https://perf.haskell.org/ghc/#graph/nofib/time/fannkuch-redux;hl=92f35cd9829db7555397aa3dc8cd243d17694fee

This is a bit strange... is this just a possible reordering of basic blocks?

comment:10 Changed 4 years ago by rwbarton

I would guess so. Maybe worth further investigation though. Another possible cause is #8279.

comment:11 Changed 4 years ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.