Opened 2 years ago

Last modified 3 days ago

#14373 patch bug

Introduce PTR-tagging for big constructor families

Reported by: heisenbug
Owned by: heisenbug
Priority: normal
Milestone:
Component: Compiler
Version: 8.2.1
Keywords: CodeGen
Cc: osa1
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s): Phab:D4267
Wiki Page:

Description

Currently only small constructor families benefit from pointer tagging.

In big families, 1 is the tag that says "I am evaluated". I suggest doing best-effort pointer tagging on big families too, using this scheme:

Use pointer tags 1..6 for the first 6 constructors; tag 7 would signify "look into the info table and branch on that tag". The info table tags consulted in that case are then 6..(familySize - 1).
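The scheme above can be sketched in C. This is only an illustration, assuming a 64-bit target with 3 tag bits; `ptr_tag_for_constructor` and its parameters are hypothetical names, not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit target, so the low 3 bits of an aligned pointer are free. */
#define TAG_BITS    3u
#define MAX_PTR_TAG ((1u << TAG_BITS) - 1u)   /* 7 */

/* con_tag is the zero-based constructor index as stored in the info table.
 * Returns the pointer tag an evaluated constructor would carry under the
 * proposed scheme. (Hypothetical helper, for illustration only.) */
static inline unsigned ptr_tag_for_constructor(unsigned con_tag,
                                               unsigned family_size)
{
    assert(family_size > 0 && con_tag < family_size);
    if (family_size <= MAX_PTR_TAG) {
        /* Small family: every constructor has its own tag, 1..family_size. */
        return con_tag + 1;
    }
    /* Big family: the first 6 constructors get tags 1..6; all remaining
     * constructors share tag 7, meaning "consult the info table". */
    return con_tag + 1 < MAX_PTR_TAG ? con_tag + 1 : MAX_PTR_TAG;
}
```

For a family of 8 constructors this gives tags 1..6 to the first six constructors and tag 7 to the last two.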

I have an implementation which I'll submit to Phabricator soon.

TODOs: update wiki pages.

Change History (10)

comment:1 Changed 2 years ago by simonpj

See branch wip/T14373

comment:2 Changed 2 years ago by heisenbug

Status: new → patch

comment:3 Changed 2 years ago by heisenbug

comment:5 Changed 2 years ago by heisenbug

Owner: set to heisenbug

comment:6 Changed 2 years ago by duog

Differential Rev(s): Phab:D4267

comment:7 Changed 21 months ago by simonpj

Keywords: CodeGen added

comment:8 Changed 18 months ago by osa1

Cc: osa1 added

comment:9 Changed 7 weeks ago by Marge Bot <ben+marge-bot@…>

In 9897e8c8/ghc:

Implement pointer tagging for big families (#14373)

Formerly we punted on these and evaluated constructors always got a tag
of 1.

We now cascade switches because we have to check the tag first and when
it is MAX_PTR_TAG then get the precise tag from the info table and
switch on that. The only technically tricky part is that the default
case needs (logical) duplication. To do this we emit an extra label for
it and branch to that from the second switch. This avoids duplicated
codegen.

Here's a simple example of the new code gen:

    data D = D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8

On a 64-bit system previously all constructors would be tagged 1. With
the new code gen D7 and D8 are tagged 7:

    [Lib.D7_con_entry() {
         ...
         {offset
           c1eu: // global
               R1 = R1 + 7;
               call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
         }
     }]

    [Lib.D8_con_entry() {
         ...
         {offset
           c1ez: // global
               R1 = R1 + 7;
               call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
         }
     }]

When switching we now look at the info table only when the tag is 7. For
example, if we derive Enum for the type above, the Cmm looks like this:

    c2Le:
        _s2Js::P64 = R1;
        _c2Lq::P64 = _s2Js::P64 & 7;
        switch [1 .. 7] _c2Lq::P64 {
            case 1 : goto c2Lk;
            case 2 : goto c2Ll;
            case 3 : goto c2Lm;
            case 4 : goto c2Ln;
            case 5 : goto c2Lo;
            case 6 : goto c2Lp;
            case 7 : goto c2Lj;
        }

    // Read info table for tag
    c2Lj:
        _c2Lv::I64 = %MO_UU_Conv_W32_W64(I32[I64[_s2Js::P64 & (-8)] - 4]);
        if (_c2Lv::I64 != 6) goto c2Lu; else goto c2Lt;
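The two-level dispatch can be sketched in C as follows. The `InfoTable` and
`Closure` types here are simplified, hypothetical stand-ins (the real code
reads the tag at a fixed offset before the info pointer, as in the Cmm
above), and a 64-bit target is assumed:

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK    7u   /* low 3 bits of a pointer on a 64-bit target */
#define MAX_PTR_TAG 7u

/* Hypothetical, simplified stand-ins for the real RTS structures. */
typedef struct { uint32_t con_tag; } InfoTable;   /* zero-based constructor index */
typedef struct { _Alignas(8) const InfoTable *info; } Closure;

/* Recover the zero-based constructor index from a tagged pointer to an
 * evaluated constructor. */
static inline unsigned constructor_index(uintptr_t tagged, unsigned family_size)
{
    unsigned tag = (unsigned)(tagged & TAG_MASK);
    assert(tag != 0);   /* this sketch only handles evaluated closures */

    /* First switch: tags below MAX_PTR_TAG (and every tag of a small
     * family) identify the constructor directly. */
    if (family_size <= MAX_PTR_TAG || tag < MAX_PTR_TAG) {
        return tag - 1;
    }

    /* Second switch: tag == MAX_PTR_TAG in a big family, so strip the tag
     * and read the precise constructor tag from the info table. */
    const Closure *c = (const Closure *)(tagged & ~(uintptr_t)TAG_MASK);
    return c->info->con_tag;
}
```

Only constructors that share tag 7 pay for the extra memory reads; the first
six constructors are identified from the pointer bits alone.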

Generated Cmm sizes do not change much, but binaries are very slightly
larger because the new instructions have longer encodings. For example,
the entry code for D8 above was previously

    00000000000001c0 <Lib_D8_con_info>:
     1c0:	48 ff c3             	inc    %rbx
     1c3:	ff 65 00             	jmpq   *0x0(%rbp)

With this patch

    00000000000001d0 <Lib_D8_con_info>:
     1d0:	48 83 c3 07          	add    $0x7,%rbx
     1d4:	ff 65 00             	jmpq   *0x0(%rbp)

This is one byte longer.

Secondly, reading the info table directly and then switching is shorter

    _c1co:
            movq -1(%rbx),%rax
            movl -4(%rax),%eax
            // Switch on info table tag
            jmp *_n1d5(,%rax,8)

than switching on the pointer tag first and then, for tag 7, doing another
switch on the info table tag:

    // When tag is 7
    _c1ct:
            andq $-8,%rbx
            movq (%rbx),%rax
            movl -4(%rax),%eax
            // Switch on info table tag
            ...

Some changes of binary sizes in actual programs:

- In NoFib the worst case is a 0.1% increase in the benchmark "parser" (see
  NoFib results below). All programs get slightly larger.

- Stage 2 compiler size does not change.

- In "containers" (the library) the size of all object files increases by
  0.0005%. The size of the test program "bitqueue-properties" increases by
  0.03%.

NoFib benchmarks kindly provided by Ömer (@osa1):

NoFib Results
=============

--------------------------------------------------------------------------------
        Program           Size    Allocs    Instrs     Reads    Writes
--------------------------------------------------------------------------------
             CS          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            CSD          +0.0%      0.0%      0.0%     +0.0%     +0.0%
             FS          +0.0%      0.0%      0.0%     +0.0%      0.0%
              S          +0.0%      0.0%     -0.0%      0.0%      0.0%
             VS          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
            VSD          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
            VSM          +0.0%      0.0%      0.0%      0.0%      0.0%
           anna          +0.0%      0.0%     +0.1%     -0.9%     -0.0%
           ansi          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
           atom          +0.0%      0.0%      0.0%      0.0%      0.0%
         awards          +0.0%      0.0%     -0.0%     +0.0%      0.0%
         banner          +0.0%      0.0%     -0.0%     +0.0%      0.0%
     bernouilli          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
   binary-trees          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          boyer          +0.0%      0.0%     +0.0%      0.0%     -0.0%
         boyer2          +0.0%      0.0%     +0.0%      0.0%     -0.0%
           bspt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      cacheprof          +0.0%      0.0%     +0.1%     -0.8%      0.0%
       calendar          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
       cichelli          +0.0%      0.0%     +0.0%      0.0%      0.0%
        circsim          +0.0%      0.0%     -0.0%     -0.1%     -0.0%
       clausify          +0.0%      0.0%     +0.0%     +0.0%      0.0%
  comp_lab_zift          +0.0%      0.0%     +0.0%      0.0%     -0.0%
       compress          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      compress2          +0.0%      0.0%      0.0%      0.0%      0.0%
    constraints          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
   cryptarithm1          +0.0%      0.0%     +0.0%      0.0%      0.0%
   cryptarithm2          +0.0%      0.0%     +0.0%     -0.0%      0.0%
            cse          +0.0%      0.0%     +0.0%     +0.0%      0.0%
   digits-of-e1          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
   digits-of-e2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         dom-lt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
          eliza          +0.0%      0.0%     -0.0%     +0.0%      0.0%
          event          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
    exact-reals          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         exp3_8          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         expert          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
 fannkuch-redux          +0.0%      0.0%     +0.0%      0.0%      0.0%
          fasta          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            fem          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            fft          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           fft2          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
       fibheaps          +0.0%      0.0%     +0.0%     +0.0%      0.0%
           fish          +0.0%      0.0%     +0.0%     +0.0%      0.0%
          fluid          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         fulsom          +0.0%      0.0%     +0.0%     -0.0%     +0.0%
         gamteb          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
            gcd          +0.0%      0.0%     +0.0%     +0.0%      0.0%
    gen_regexps          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         genfft          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
             gg          +0.0%      0.0%      0.0%     -0.0%      0.0%
           grep          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         hidden          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
            hpg          +0.0%      0.0%     +0.0%     -0.1%     -0.0%
            ida          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
          infer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
        integer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
      integrate          +0.0%      0.0%      0.0%     +0.0%      0.0%
   k-nucleotide          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          kahan          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
        knights          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         lambda          +0.0%      0.0%     +1.2%     -6.1%     -0.0%
     last-piece          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           lcss          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           life          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           lift          +0.0%      0.0%     +0.0%     +0.0%      0.0%
         linear          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      listcompr          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
       listcopy          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
       maillist          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         mandel          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
        mandel2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
           mate          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
        minimax          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
        mkhprog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
     multiplier          +0.0%      0.0%      0.0%     +0.0%     -0.0%
         n-body          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
       nucleic2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
           para          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      paraffins          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         parser          +0.1%      0.0%     +0.4%     -1.7%     -0.0%
        parstof          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            pic          +0.0%      0.0%     +0.0%      0.0%     -0.0%
       pidigits          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          power          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         pretty          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         primes          +0.0%      0.0%     +0.0%      0.0%      0.0%
      primetest          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         prolog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         puzzle          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         queens          +0.0%      0.0%      0.0%     +0.0%     +0.0%
        reptile          +0.0%      0.0%     +0.0%     +0.0%      0.0%
reverse-complem          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
        rewrite          +0.0%      0.0%     +0.0%      0.0%     -0.0%
           rfib          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            rsa          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            scc          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
          sched          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            scs          +0.0%      0.0%     +0.0%     +0.0%      0.0%
         simple          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
          solid          +0.0%      0.0%     +0.0%     +0.0%      0.0%
        sorting          +0.0%      0.0%     +0.0%     -0.0%      0.0%
  spectral-norm          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         sphere          +0.0%      0.0%     +0.0%     -1.0%      0.0%
         symalg          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            tak          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      transform          +0.0%      0.0%     +0.4%     -1.3%     +0.0%
       treejoin          +0.0%      0.0%     +0.0%     -0.0%      0.0%
      typecheck          +0.0%      0.0%     -0.0%     +0.0%      0.0%
        veritas          +0.0%      0.0%     +0.0%     -0.1%     +0.0%
           wang          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      wave4main          +0.0%      0.0%     +0.0%      0.0%     -0.0%
   wheel-sieve1          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
   wheel-sieve2          +0.0%      0.0%     +0.0%     +0.0%      0.0%
           x2n1          +0.0%      0.0%     +0.0%     +0.0%      0.0%
--------------------------------------------------------------------------------
            Min          +0.0%      0.0%     -0.0%     -6.1%     -0.0%
            Max          +0.1%      0.0%     +1.2%     +0.0%     +0.0%
 Geometric Mean          +0.0%     -0.0%     +0.0%     -0.1%     -0.0%

NoFib GC Results
================

--------------------------------------------------------------------------------
        Program           Size    Allocs    Instrs     Reads    Writes
--------------------------------------------------------------------------------
        circsim          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
    constraints          +0.0%      0.0%     -0.0%      0.0%     -0.0%
       fibheaps          +0.0%      0.0%      0.0%     -0.0%     -0.0%
         fulsom          +0.0%      0.0%      0.0%     -0.6%     -0.0%
       gc_bench          +0.0%      0.0%      0.0%      0.0%     -0.0%
           hash          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
           lcss          +0.0%      0.0%      0.0%     -0.0%      0.0%
      mutstore1          +0.0%      0.0%      0.0%     -0.0%     -0.0%
      mutstore2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
          power          +0.0%      0.0%     -0.0%      0.0%     -0.0%
     spellcheck          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
--------------------------------------------------------------------------------
            Min          +0.0%      0.0%     -0.0%     -0.6%     -0.0%
            Max          +0.0%      0.0%     +0.0%      0.0%      0.0%
 Geometric Mean          +0.0%     +0.0%     +0.0%     -0.1%     +0.0%

Fixes #14373

These performance regressions appear to be a fluke in CI. See the
discussion in !1742 for details.

Metric Increase:
    T6048
    T12234
    T12425
    Naperian
    T12150
    T5837
    T13035

comment:10 Changed 3 days ago by Marge Bot <ben+marge-bot@…>

In 0e57d8a/ghc:

Fix chaining tagged and untagged ptrs in compacting GC

Currently the compacting GC has the invariant that all fields in a chain are
tagged the same. However, this does not really hold: root pointers are not
tagged, so when we thread a root we initialize a chain without a tag. When the
pointed-to object is evaluated and there are more pointers to it from the heap,
we then add *tagged* fields to the chain (because pointers to it from the heap
are tagged), ending up chaining fields with different tags (pointers from roots
are NOT tagged, pointers from the heap are). This breaks the invariant, and as
a result the compacting GC turns tagged pointers into untagged ones.

This later causes problems in the generated code, where we do reads assuming
that the pointer is aligned, e.g.

    0x7(%rax) -- assumes that pointer is tagged 1

which causes misaligned reads. This caused #17088.

We fix this using the "pointer tagging for large families" patch (#14373,
!1742):

- With the pointer tagging patch the GC can know what the tagged pointer to a
  CONSTR should be (previously we'd need to know the family size -- large
  families were always tagged 1, small families were tagged depending on the
  constructor).

- Since we now know what the tags should be we no longer need to store the
  pointer tag in the info table pointers when forming chains in the compacting
  GC.

As a result we no longer need to tag pointers in chains with 1/2 depending on
whether the field points to an info table pointer, or to another field: an info
table pointer is always tagged 0, everything else in the chain is tagged 1. The
lost tags in pointers can be retrieved by looking at the info table.

Finally, instead of using tag 1 for fields and tag 0 for info table pointers, we
use two different tags for fields:

- 1 for fields that have untagged pointers
- 2 for fields that have tagged pointers

When unchaining we then look at the pointer to a field, and depending on its tag
we either leave a tagged pointer or an untagged pointer in the field.

This allows chaining untagged and tagged fields together in compacting GC.
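
A rough C sketch of the unchaining rule described above, restricted to
constructor closures. The names `LINK_UNTAGGED_FIELD`, `LINK_TAGGED_FIELD` and
`unchained_field_value` are invented for illustration and do not come from
`Compact.c`; a 64-bit target is assumed:

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK    7u   /* low 3 bits of a pointer on a 64-bit target */
#define MAX_PTR_TAG 7u

/* Chain-link tags as described in the commit message (names are made up). */
#define LINK_UNTAGGED_FIELD 1u   /* the field originally held an untagged pointer */
#define LINK_TAGGED_FIELD   2u   /* the field originally held a tagged pointer */

/* Pointer tag of an evaluated constructor, recovered from the zero-based
 * constructor tag stored in its info table. */
static inline unsigned ptr_tag_from_con_tag(unsigned con_tag)
{
    return con_tag + 1 < MAX_PTR_TAG ? con_tag + 1 : MAX_PTR_TAG;
}

/* Value to write back into a field while unchaining, given the object's new
 * (untagged) address, the tag found on the chain link, and the constructor
 * tag read from the object's info table. */
static inline uintptr_t unchained_field_value(uintptr_t new_addr,
                                              unsigned link_tag,
                                              unsigned con_tag)
{
    assert((new_addr & TAG_MASK) == 0);
    assert(link_tag == LINK_UNTAGGED_FIELD || link_tag == LINK_TAGGED_FIELD);
    if (link_tag == LINK_TAGGED_FIELD) {
        /* The lost pointer tag is rebuilt from the info table. */
        return new_addr | ptr_tag_from_con_tag(con_tag);
    }
    /* Root-style fields stay untagged. */
    return new_addr;
}
```

The point of the sketch is that the link tag only records whether the field
was tagged; which tag to restore comes from the info table.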

Fixes #17088

Nofib results
-------------

Binaries are smaller because of smaller `Compact.c` code.

    make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" EXTRA_HC_OPTS="-with-rtsopts=-c" NoFibRuns=1

    --------------------------------------------------------------------------------
            Program           Size    Allocs    Instrs     Reads    Writes
    --------------------------------------------------------------------------------
                 CS          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                CSD          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                 FS          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                  S          -0.3%      0.0%     +5.4%     +0.8%     +3.9%
                 VS          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
                VSD          -0.3%      0.0%     -0.0%     -0.0%     -0.2%
                VSM          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
               anna          -0.1%      0.0%     +0.0%     +0.0%     +0.0%
               ansi          -0.3%      0.0%     +0.1%     +0.0%     +0.0%
               atom          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
             awards          -0.2%      0.0%     +0.0%      0.0%     -0.0%
             banner          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
         bernouilli          -0.3%      0.0%     +0.1%     +0.0%     +0.0%
       binary-trees          -0.2%      0.0%     +0.0%      0.0%     +0.0%
              boyer          -0.3%      0.0%     +0.2%     +0.0%     +0.0%
             boyer2          -0.2%      0.0%     +0.2%     +0.1%     +0.0%
               bspt          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
          cacheprof          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
           calendar          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
           cichelli          -0.3%      0.0%     +1.1%     +0.2%     +0.5%
            circsim          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
           clausify          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
      comp_lab_zift          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
           compress          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
          compress2          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
        constraints          -0.3%      0.0%     +0.2%     +0.1%     +0.1%
       cryptarithm1          -0.3%      0.0%     +0.0%     -0.0%      0.0%
       cryptarithm2          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                cse          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
       digits-of-e1          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
       digits-of-e2          -0.3%      0.0%     +0.0%     +0.0%     -0.0%
             dom-lt          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
              eliza          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
              event          -0.3%      0.0%     +0.1%     +0.0%     -0.0%
        exact-reals          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
             exp3_8          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
             expert          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
     fannkuch-redux          -0.3%      0.0%     -0.0%     -0.0%     -0.0%
              fasta          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                fem          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
                fft          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
               fft2          -0.2%      0.0%     +0.0%     -0.0%     +0.0%
           fibheaps          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
               fish          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
              fluid          -0.2%      0.0%     +0.4%     +0.1%     +0.1%
             fulsom          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
             gamteb          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
                gcd          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
        gen_regexps          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             genfft          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                 gg          -0.2%      0.0%     +0.7%     +0.3%     +0.2%
               grep          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
             hidden          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                hpg          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
                ida          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
              infer          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
            integer          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
          integrate          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
       k-nucleotide          -0.2%      0.0%     +0.0%     +0.0%     -0.0%
              kahan          -0.3%      0.0%     -0.0%     -0.0%     -0.0%
            knights          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             lambda          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
         last-piece          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
               lcss          -0.3%      0.0%     +0.0%     +0.0%      0.0%
               life          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
               lift          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
             linear          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
          listcompr          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
           listcopy          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
           maillist          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             mandel          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
            mandel2          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
               mate          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
            minimax          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
            mkhprog          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
         multiplier          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             n-body          -0.2%      0.0%     -0.0%     -0.0%     -0.0%
           nucleic2          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
               para          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
          paraffins          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             parser          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
            parstof          -0.2%      0.0%     +0.8%     +0.2%     +0.2%
                pic          -0.2%      0.0%     +0.1%     -0.1%     -0.1%
           pidigits          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
              power          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
             pretty          -0.3%      0.0%     -0.0%     -0.0%     -0.1%
             primes          -0.3%      0.0%     +0.0%     +0.0%     -0.0%
          primetest          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
             prolog          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
             puzzle          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
             queens          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
            reptile          -0.2%      0.0%     +0.2%     +0.1%     +0.0%
    reverse-complem          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
            rewrite          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
               rfib          -0.2%      0.0%     +0.0%     +0.0%     -0.0%
                rsa          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                scc          -0.3%      0.0%     -0.0%     -0.0%     -0.1%
              sched          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
                scs          -0.2%      0.0%     +0.1%     +0.0%     +0.0%
             simple          -0.2%      0.0%     +3.4%     +1.0%     +1.8%
              solid          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
            sorting          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
      spectral-norm          -0.2%      0.0%     -0.0%     -0.0%     -0.0%
             sphere          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
             symalg          -0.2%      0.0%     +0.0%     +0.0%     +0.0%
                tak          -0.3%      0.0%     +0.0%     +0.0%     -0.0%
          transform          -0.2%      0.0%     +0.2%     +0.1%     +0.1%
           treejoin          -0.3%      0.0%     +0.2%     -0.0%     -0.1%
          typecheck          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
            veritas          -0.1%      0.0%     +0.0%     +0.0%     +0.0%
               wang          -0.2%      0.0%     +0.0%     -0.0%     -0.0%
          wave4main          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
       wheel-sieve1          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
       wheel-sieve2          -0.3%      0.0%     +0.0%     -0.0%     -0.0%
               x2n1          -0.3%      0.0%     +0.0%     +0.0%     +0.0%
    --------------------------------------------------------------------------------
                Min          -0.3%      0.0%     -0.0%     -0.1%     -0.2%
                Max          -0.1%      0.0%     +5.4%     +1.0%     +3.9%
     Geometric Mean          -0.3%     -0.0%     +0.1%     +0.0%     +0.1%

    --------------------------------------------------------------------------------
            Program           Size    Allocs    Instrs     Reads    Writes
    --------------------------------------------------------------------------------
            circsim          -0.2%      0.0%     +1.6%     +0.4%     +0.7%
        constraints          -0.3%      0.0%     +4.3%     +1.5%     +2.3%
           fibheaps          -0.3%      0.0%     +3.5%     +1.2%     +1.3%
             fulsom          -0.2%      0.0%     +3.6%     +1.2%     +1.8%
           gc_bench          -0.3%      0.0%     +4.1%     +1.3%     +2.3%
               hash          -0.3%      0.0%     +6.6%     +2.2%     +3.6%
               lcss          -0.3%      0.0%     +0.7%     +0.2%     +0.7%
          mutstore1          -0.3%      0.0%     +4.8%     +1.4%     +2.8%
          mutstore2          -0.3%      0.0%     +3.4%     +1.0%     +1.7%
              power          -0.2%      0.0%     +2.7%     +0.6%     +1.9%
         spellcheck          -0.3%      0.0%     +1.1%     +0.4%     +0.4%
    --------------------------------------------------------------------------------
                Min          -0.3%      0.0%     +0.7%     +0.2%     +0.4%
                Max          -0.2%      0.0%     +6.6%     +2.2%     +3.6%
     Geometric Mean          -0.3%     +0.0%     +3.3%     +1.0%     +1.8%

Metric changes
--------------

While it sounds ridiculous, this change causes increased allocations in
the following tests. We concluded that this change can't cause a
difference in allocations and decided to land this patch. Fluctuations
in the "bytes allocated" metric are tracked in #17686.

Metric Increase:
    Naperian
    T10547
    T12150
    T12234
    T12425
    T13035
    T5837
    T6048