Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#11486 closed bug (fixed)

info tables are no longer aligned

Reported by: rwbarton Owned by:
Priority: high Milestone: 8.0.1
Component: Compiler Version: 8.1
Keywords: Cc: olsner, simonmar
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D1847
Wiki Page:

Description

Take a simple program like

main = mapM_ print "Hello, world"

and build with -O.

Versions of ghc as late as 7.11.20150806 generate assembly that includes

...
_c3mn:
	movq $sat_s3m2_info,-16(%r12)
	movq %r14,(%r12)
	movq $block_c3mi_info,-16(%rbp)
	movl $GHC.Types.True_closure+2,%edi
	movq %rsi,%rax
	leaq -16(%r12),%rsi
	movl $GHC.IO.Handle.FD.stdout_closure,%r14d
	movq %rax,-8(%rbp)
	addq $-16,%rbp
	jmp GHC.IO.Handle.Text.hPutStr2_info
.text
	.align 8
	.quad	1
	.quad	32
block_c3mi_info:
_c3mi:
	movq 8(%rbp),%rbx
	addq $16,%rbp
	jmp stg_ap_v_fast
...

Note the .align 8 ensuring that the info table (and the code _c3mi itself) is 8-byte aligned.

In 8.1.20160107 we instead get

_c5gE:
	movq $sat_s5g9_info,-16(%r12)
	movq %r14,(%r12)
	movq $block_c5gz_info,-16(%rbp)
	movl $GHC.Types.True_closure+2,%edi
	movq %rsi,%rax
	leaq -16(%r12),%rsi
	movl $GHC.IO.Handle.FD.stdout_closure,%r14d
	movq %rax,-8(%rbp)
	addq $-16,%rbp
	jmp GHC.IO.Handle.Text.hPutStr2_info
_c5gF:
	movq $24,904(%r13)
_c5gC:
	movl $Main.main2_closure,%ebx
	jmp *-8(%r13)
	.quad	1
	.quad	32
block_c5gz_info:
_c5gz:
	movq 8(%rbp),%rbx
	addq $16,%rbp
	jmp stg_ap_v_fast
	.size Main.main2_info, .-Main.main2_info

There is some minor rearrangement, but more importantly there is no longer any .align 8 before .quad 1. That means the info table is not necessarily aligned, and indeed c5gz_info ended up at the address 40676f.

I'm guessing this is bad for performance, though I don't know how bad.

I'm pretty sure this is due to 4a32bf925b8aba7885d9c745769fe84a10979a53 "Implement function-sections for Haskell code" though I haven't looked at that commit in detail.

Change History (9)

comment:1 Changed 4 years ago by thomie

Type of failure: None/UnknownRuntime performance bug

comment:2 Changed 4 years ago by simonpj

Cc: simonmar added

Simon O, Simon M, it'd b good to nail this. Looks bad to me. Well spotted Reid.

comment:3 Changed 4 years ago by bgamari

I suspect the issue here is the pprSectionHeader removed from pprBasicBlock.

@@ -113,7 +109,6 @@ pprBasicBlock info_env (BasicBlock blockid instrs)
     maybe_infotable = case mapLookup blockid info_env of
        Nothing   -> empty
        Just (Statics info_lbl info) ->
-           pprSectionHeader Text $$
            infoTableLoc $$
            vcat (map pprData info) $$
            pprLabel info_lbl

comment:4 Changed 4 years ago by bgamari

Differential Rev(s): Phab:D1846
Status: newpatch

Possibly fixed in Phab:D1846.

comment:5 Changed 4 years ago by bgamari

Differential Rev(s): Phab:D1846Phab:D1847

More thorough fix in Phab:D1847.

comment:6 Changed 4 years ago by Ben Gamari <ben@…>

In 0dc7b36c/ghc:

Restore original alignment for info tables

This was broken in 4a32bf925b8aba7885d9c745769fe84a10979a53, meaning
that info tables and subsequent code are no longer guaranteed to have
the recommended alignment.  Split up the section header and section
alignment printers, and print an appropriate alignment directive before
each info table.

Fixes Trac #11486

Reviewers: austin, bgamari, rwbarton

Reviewed By: bgamari, rwbarton

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1847

GHC Trac Issues: #11486

comment:7 Changed 4 years ago by bgamari

Status: patchmerge

comment:8 Changed 4 years ago by bgamari

Resolution: fixed
Status: mergeclosed

comment:9 Changed 4 years ago by rwbarton

This had a small, but mostly beneficial effect on performance: https://perf.haskell.org/ghc/#revision/0dc7b36c3c261b3eccf8460581fcd3d71f6e6ff6

As expected, there is a code size cost, which has an indirect performance cost as well (since density of real code in the instruction cache is lower).

Not exactly germane to this ticket, but perhaps it would be worth experimenting with half-word-aligned info tables? Many of the info table fields are half-word-sized anyways.

Note: See TracTickets for help on using tickets.