Opened 6 years ago

Closed 3 years ago

Last modified 2 years ago

#8472 closed bug (fixed)

Primitive string literals prevent optimization

Reported by: akio Owned by: gridaphobe
Priority: normal Milestone:
Component: Compiler Version: 7.6.3
Keywords: newcomer, strings Cc:
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D2554, Phab:D2605
Wiki Page:

Description

Using an Addr# literal seems to result in less aggressive optimization. If I compile the attached program like this:

ghc -O2 -fforce-recomp -ddump-simpl addr.hs

the code is optimized nicely. Everything are inlined into t, intermediate pairs are eliminated, etc.

However, when I replace the Int# literals in the code with Addr# literals by defining the ADDR macro:

ghc -O2 -fforce-recomp -ddump-simpl -DADDR addr.hs

GHC now creates 2 extra top-level bindings, each of which allocates a pair. I don't see why the presence of Addr# literals should prevent inlining, so I'm reporting a bug.

Attachments (1)

addr.hs (708 bytes) - added by akio 6 years ago.

Download all attachments as: .zip

Change History (25)

Changed 6 years ago by akio

Attachment: addr.hs added

comment:1 Changed 6 years ago by simonpj

Great example. It's an example of #2840 (which I've closed as a dup of this ticket, since this is a real report).

The problem is that we get something like this:

f = let a::Addr# = "foo"#
    in \x -> blah

g y = ...(f e)...

We can't float the binding for a to the top level because Core doesn't allow top-level bindings. But by not floating it we prevent f being inlined, which is pretty terrible.

For literals other than strings, this doesn't make any difference, because we inline them freely. But for literal strings we don't want to make lots of copies of them; on the contrary we'd like to CSE identical strings. So it'd help to be able to bind them at top level.

I think the solution is simply to allow top level bindings of form a::Addr# = "foo"#. That is:

  • The type is Addr#
  • The RHS is a string literal; in particular NOT a string computation

Things that would need doing:

  • Modify the test isUnLiftedType ty in SetLevels.lvlMFE, which stops unlifted things getting floated to top level.
  • Similarly Simplify.bindingOk.
  • Make CmmLint check the new invariant.
  • The STG->Cmm code generator would need to generate some suitable CmmData stuff.

This is a fairly easy job. Any volunteers?

Simon

Last edited 6 years ago by simonpj (previous) (diff)

comment:2 Changed 5 years ago by xnyhps

Owner: set to xnyhps

comment:3 Changed 4 years ago by simonpj

See also #11312

comment:4 Changed 3 years ago by mpickering

Keywords: newcomer added

This seems well-specified for an newcomer.

comment:5 Changed 3 years ago by gridaphobe

While working on Phab:D1259 I came across another example of this issue (NB it requires my patch to trigger). The idea in that patch is to avoid inlining String literals and share them as top-level values instead. In order to keep using the REWRITE rules, we pretend that unpackCString# is CONLIKE. This has a side-effect of making GHC float the unboxed string literal out into a separate let-binder, which then prevents us from floating the unpackCString# application to the top-level, as unboxed literals aren't allowed there.

Here's a simple example that triggers the behavior with the Phab:D1259 patch, but properly floats the Strings on master.

module Foo where

draw xs = a ++ b ++ xs
[a,b] = ["aa", "bb"]

comment:6 Changed 3 years ago by simonpj

See also #12585

comment:7 Changed 3 years ago by simonpj

More intersting string-literal stuff

comment:8 Changed 3 years ago by gridaphobe

Owner: changed from xnyhps to gridaphobe

@simonpj I'm working on this and have a patch nearly ready, but I'm a bit confused by your suggestion (in comment:1) to update CmmLint. It's not clear to me how to check this at the Cmm level, how do we determine what is top level?

Perhaps you meant to check the invariant in CoreLint? I'm having to tweak a few of the checks there anyway that expect all top-level binders to have lifted types.

comment:9 Changed 3 years ago by akio

Ack, I have also been working on this, but it seems like @gridaphobe's patches are closer to be ready (I haven't updated the core lint to check the new invariant).

Here is a link to my patch in case it contains anything useful: https://github.com/takano-akio/ghc/commit/abd74a0d1fdde0e75d46bf1ff255ed4966a020ad

comment:10 Changed 3 years ago by gridaphobe

@akio sorry about the duplicate work!

It looks like our patches are very similar, though I introduce a new StgRhs constructor rather than your StgTopBinding type. I think your approach is a bit better, as I end up having to deal with the (im)possibility of let-binding a literal anywhere (it looks like Stg uses case <lit> of <var> { __DEFAULT -> ... } to bind literals, which makes sense given that they can't be lazy).

Does your patch validate? I just noticed this afternoon that although my patch works for the example and parts of nofib, it causes a linker error when building ghc-stage2, due to undefined symbols. I'm setting the label for the string literal a bit differently from you, so your patch might be fine.

My CoreLint patch is at https://github.com/ghc/ghc/compare/master...gridaphobe:T8472#diff-9ad7456ebf7fad38de8b24ddceb9bb3c. Do you want to submit your patch + my CoreLint pass, that ought to make for a complete patch :)

(I also notice that both of our patches could use a nice Note explaining why we want to bind string literals at the top level, especially since the logic is spread across multiple phases of the compiler)

comment:11 Changed 3 years ago by simonpj

Excellent work!

Perhaps you meant to check the invariant in CoreLint?

Yes I did. Sorry for the confusion here.

I introduce a new StgRhs constructor rather than your StgTopBinding type. I think your approach is a bit better,

I have not looked at the details, but since we are only talking about introducing a new top-level form, it makes sense for the data type to reflect that if it's convenient to do so.

(I also notice that both of our patches could use a nice Note explaining why we want to bind string literals at the top level, especially since the logic is spread across multiple phases of the compiler)

YES PLEASE!

Also document invariants in CoreSyn.

comment:12 in reply to:  10 Changed 3 years ago by akio

Replying to gridaphobe:

Does your patch validate? I just noticed this afternoon that although my patch works for the example and parts of nofib, it causes a linker error when building ghc-stage2, due to undefined symbols. I'm setting the label for the string literal a bit differently from you, so your patch might be fine.

I don't get link errors, but I do get several failures when I run make slow in testsuite/. Some of the failures look harmless (the expected output has to be adjusted) but I'll need to look more carefully at other failures.

My CoreLint patch is at https://github.com/ghc/ghc/compare/master...gridaphobe:T8472#diff-9ad7456ebf7fad38de8b24ddceb9bb3c. Do you want to submit your patch + my CoreLint pass, that ought to make for a complete patch :)

Thanks, I've taken your patch to CoreLint. I'm happy to sort out the remaining issues and submit a Differential revision, but I won't have much time to work on this until this weekend. Feel free to go ahead and do it yourself if you like. In any case, my work-in-progress branch can be found at https://github.com/takano-akio/ghc/commits/top-level-string.

comment:13 Changed 3 years ago by gridaphobe

Differential Rev(s): D2554
Status: newpatch

comment:14 Changed 3 years ago by akio

Differential Rev(s): D2554Phab:D2554, Phab:D2605

comment:15 Changed 3 years ago by simonpj

Phab:D2554 is abandoned; let's remove it from the ticket sinceit is now irrelevant (I assume).

comment:16 Changed 3 years ago by akio

My patch (Phab:D2605) causes some compiler perf regressions. This happens because

f x = ... "foo"# ...

now gets transformed into

foo = "foo"#
f x = ... foo ...

and this means a larger code size.

For example, on perf/compiler/T1969, the peak_megabytes_allocated goes from 63 to 68 (a 8% increase), the code size (in terms of the Term component of CoreStats) goes from 12603 to 13271 (a 10% increase), and this is fully explained by the above effect, which increases the code size by 2 for each top-level string literal.

comment:17 Changed 3 years ago by simonpj

this means a larger code size.

Why?? In both cases I'd ultimately expect to see a top-level machine-code label for a literal string, and a reference to that label in the compiled code. I see no reason for increased code size, or increased bytes-allocated.

Can you explain?

Simon

comment:18 in reply to:  17 Changed 3 years ago by akio

Replying to simonpj:

this means a larger code size.

Why?? In both cases I'd ultimately expect to see a top-level machine-code label for a literal string, and a reference to that label in the compiled code. I see no reason for increased code size, or increased bytes-allocated.

Sorry, I meant a larger size of the core. Presumably the compiler needs to use more memory to store a larger syntax tree.

comment:19 Changed 3 years ago by simonpj

Oh OK, now I understand. I think that's fine:

  • peak allocation is very vulnerable to the moment at which major gc runs.
  • "Term" component of core-stats is fine.

Thanks

Simon

comment:20 Changed 3 years ago by Ben Gamari <ben@…>

In d49b2bb/ghc:

Allow top-level string literals in Core (#8472)

This commits relaxes the invariants of the Core syntax so that a
top-level variable can be bound to a primitive string literal of type
Addr#.

This commit:

* Relaxes the invatiants of the Core, and allows top-level bindings whose
  type is Addr# as long as their RHS is either a primitive string literal or
  another variable.

* Allows the simplifier and the full-laziness transformer to float out
  primitive string literals to the top leve.

* Introduces the new StgGenTopBinding type to accomodate top-level Addr#
  bindings.

* Introduces a new type of labels in the object code, with the suffix "_bytes",
  for exported top-level Addr# bindings.

* Makes some built-in rules more robust. This was necessary to keep them
  functional after the above changes.

This is a continuation of D2554.

Rebasing notes:
This had two slightly suspicious performance regressions:

* T12425: bytes allocated regressed by roughly 5%
* T4029: bytes allocated regressed by a bit over 1%
* T13035: bytes allocated regressed by a bit over 5%

These deserve additional investigation.

Rebased by: bgamari.

Test Plan: ./validate --slow

Reviewers: goldfire, trofi, simonmar, simonpj, austin, hvr, bgamari

Reviewed By: trofi, simonpj, bgamari

Subscribers: trofi, simonpj, gridaphobe, thomie

Differential Revision: https://phabricator.haskell.org/D2605

GHC Trac Issues: #8472

comment:21 Changed 3 years ago by mpickering

Resolution: fixed
Status: patchclosed

comment:22 Changed 3 years ago by Ben Gamari <ben@…>

In f5b275a2/ghc:

Don't tick top-level string literals

This fixes a regression due to D2605 (see #8472) wherein top-level primitive
strings would fail to be noticed by CoreToStg as they were wrapped in a
tick. This resulted in a panic in CoreToStg due to inconsistent CAF information
(or a Core Lint failure, if enabled). Here we document the invariant that
unlifted expressions can only sit at top-level if of the form `Lit (MachStr
...)` with no ticks or other embellishments. Moreover, we fix instance of
this in `Simplify.prepareRhs` and `FloatOut.wrapTick` where this
invariant was being broken.

Test Plan: Validate with `-g`. Run testsuite with `WAY=ghci`.

Reviewers: austin, simonpj

Reviewed By: simonpj

Subscribers: simonpj, akio, scpmw, thomie

Differential Revision: https://phabricator.haskell.org/D3051

comment:23 Changed 2 years ago by bgamari

Keywords: strings added

comment:24 Changed 2 years ago by bgamari

The patch described in comment:22 is in the master branch and GHC 8.2.1 and is a temporary workaround for some of the breakage caused by this change. The general theme of breakage-due-to-ticks is tracked in #14123.

Note: See TracTickets for help on using tickets.