Opened 21 months ago

Last modified 13 months ago

#15126 new task

Opportunity to compress common info table representation.

Reported by:       AndreasK           Owned by:
Priority:          normal             Milestone:             8.10.1
Component:         Compiler           Version:               8.2.2
Keywords:          CodeGen            Cc:
Operating System:  Unknown/Multiple   Architecture:          Unknown/Multiple
Type of failure:   None/Unknown       Test Case:
Blocked By:                           Blocking:
Related Tickets:                      Differential Rev(s):   Phab:D4632
Wiki Page:


I've looked at a lot of GHC-produced assembly recently and noticed that most info tables describing stack frames have the form:

.align 8
	.long	SDjR_srt-(block_cHmk_info)+296
	.long	0
	.quad	6151
	.quad	4294967326

I haven't managed to dig fully into the description yet, but here are some observations:

  • I noticed that the second .long directive almost always ends up being zero.
  • When figuring out what is what, I realized that the first quad (describing the pointers) is almost never fully used.
  • The last entry (closure type + ?), here 4294967326, also seems quite repetitive given the size reserved.
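
To make the observations above concrete, here is a small Python sketch that splits the two quads of the example table into fields. The 6-bit size field in the layout word and the 32/32 split of the last quad are assumptions about a 64-bit target using small bitmaps; neither is stated in this ticket, and the helper names are made up for illustration.

```python
# Assumption: on a 64-bit target the low 6 bits of the layout word hold the
# frame size in words and the remaining bits hold the pointer bitmap.
BITMAP_SIZE_BITS = 6


def decode_layout(word):
    """Split a small-bitmap layout word into (size, bitmap bits)."""
    size = word & ((1 << BITMAP_SIZE_BITS) - 1)
    bits = word >> BITMAP_SIZE_BITS
    return size, bits


def decode_last_quad(word):
    """Split the final quad into its (low, high) 32-bit halves."""
    low = word & 0xFFFFFFFF
    high = word >> 32
    return low, high


layout = decode_layout(6151)           # the ".quad 6151" line
last = decode_last_quad(4294967326)    # the ".quad 4294967326" line

print(layout)               # (7, 96), i.e. size 7 and bitmap 0b1100000
print(last)                 # (30, 1)
print((6151).bit_length())  # 13 of the 64 reserved bits are in use
```

Under these assumptions the layout word uses 13 of its 64 bits, and the last quad is the value 30 in its low half plus a 1 in its high half.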

So I looked in detail at spectral/simple:

  • There are 2012 info tables of this sort with all of them having a zero in the second long.
  • We also reserve 8 bytes for the stack layout. However, only one of these tables requires more than 4 bytes.

The compiled module has a size of 276384 bytes, with 16096 of them being redundant:

  • 4 bytes for 0
  • 4 bytes unused stack description
  • times 2012 info tables.

That is an overhead of 5.8%, which seems like quite a lot to me.
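
The estimate can be reproduced by multiplying out the figures quoted above. This is only the arithmetic from the description, written as a short Python sketch; the variable names are illustrative.

```python
tables = 2012                # info tables of this form in spectral/simple
redundant_per_table = 4 + 4  # zero .long plus the unused half of the layout quad
module_size = 276384         # size of the compiled module in bytes

redundant = tables * redundant_per_table
share = redundant / module_size

print(redundant)             # total redundant bytes
print(f"{share:.1%}")        # share of the module size
```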

Where to put that information is a different question. But looking only at the data, and not at how it is used, tagging the pointer to the SRT seems like a possibility.
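
As a rough illustration of what tagging could look like, the sketch below packs a small value into the low bits of an aligned offset and recovers both parts again. It assumes the SRT offset is always a multiple of 8, which this ticket does not confirm, and `tag_offset` and `untag_offset` are hypothetical helpers, not GHC functions.

```python
ALIGNMENT = 8             # assumption: the SRT offset is 8-byte aligned
TAG_MASK = ALIGNMENT - 1  # the low 3 bits are then always zero and can carry a tag


def tag_offset(offset, tag):
    """Store a small tag in the unused low bits of an aligned offset."""
    if offset % ALIGNMENT != 0:
        raise ValueError("offset must be a multiple of the alignment")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in the free bits")
    return offset | tag


def untag_offset(tagged):
    """Recover (offset, tag) from a tagged offset."""
    return tagged & ~TAG_MASK, tagged & TAG_MASK


tagged = tag_offset(296, 5)
print(untag_offset(tagged))   # (296, 5)
```

Only 3 bits are available this way, so this could flag a compact table variant but could not hold a whole stack description, and every reader of the offset would need one extra mask operation.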

The info table description 4294967326 also appeared over 1k times. Maybe it's possible to come up with a more efficient encoding there as well.

I haven't given it much thought yet since I don't have the time to do anything about it in the near future. But I'm putting it here in case anyone is interested or looks into this in the future.

Change History (6)

comment:1 Changed 21 months ago by simonpj

Compressing info tables looks like a good plan. But be careful about imposing a slow-down in the fast path, if extra instructions are needed to decode the info table. With a bit of luck, we could have zero overhead for the fast path.

comment:2 Changed 20 months ago by AndreasK

Differential Rev(s): Phab:D4632

comment:3 Changed 20 months ago by simonmar

Phab:D4634 reduces the size of info tables with an SRT by one word on x86_64, which implements some (but perhaps not all) of the opportunities for savings mentioned here.

comment:4 Changed 19 months ago by bgamari


comment:5 Changed 14 months ago by AndreasK

I've collected some rudimentary stats by analysing the assembly dump for nofib/spectral/Simple.

Out of 2233 info tables, the most common three make up 370, with the most common case making up 240 of them.

I think that case is the small function return (RET_SMALL) with no pointers on the stack:

.align 8
	.quad	0
	.long	30
	.long	0

But shrinking this further would mean giving up the common info table layout, so it is hard to judge the complexity and cost of this.

For now I am just documenting this.
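
For comparison, the two listings in this ticket can be sized and decoded with Python's `struct` module. The format strings below only mirror the directive widths (`.long` is 4 bytes, `.quad` is 8 bytes) on a little-endian target; the field names are guesses, and the value 30 is read as RET_SMALL only on the strength of the comment above.

```python
import struct

# "<" selects little-endian with no implicit padding.
FIRST_LISTING = "<iiQQ"   # .long, .long, .quad, .quad (from the description)
SECOND_LISTING = "<QII"   # .quad, .long, .long (from this comment)

first_size = struct.calcsize(FIRST_LISTING)
second_size = struct.calcsize(SECOND_LISTING)

# Encode the most common table from this comment and read it back.
raw = struct.pack(SECOND_LISTING, 0, 30, 0)
layout, closure_type, last_field = struct.unpack(SECOND_LISTING, raw)

print(first_size, second_size, first_size - second_size)   # 24 16 8
print(layout, closure_type, last_field)                     # 0 30 0
```

The 8-byte difference is consistent with the one-word saving mentioned in comment:3, although the ticket does not say explicitly that the second listing comes from a compiler with that change.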

comment:6 Changed 13 months ago by osa1


Bumping milestones of low-priority tickets.
