Opened 4 years ago

Closed 4 years ago

#10829 closed bug (fixed)

Simplification in the RHS of rules

Reported by: afarmer Owned by:
Priority: high Milestone: 7.10.3
Component: Compiler Version: 7.10.2
Keywords: Cc: RyanGlScott, conal, Roboguy
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #10528 Differential Rev(s): Phab:D1246
Wiki Page:

Description (last modified by afarmer)

HERMIT users depend on RULES to specify equational properties.

7.10.2 performed both inlining and simplification in both sides of the rules, meaning they can't really be used for this. This breaks most HERMIT use cases. Note that this behavior was a change from 7.8 and prior, due to this commit:

https://git.haskell.org/ghc.git/commitdiff/8af219adb914b292d0f8c737fe0a1e3f7fb19cf3

The following commit disables the inlining/simplification in the LHS of rules, which fixes half the problem, but it has yet to be merged and released (attached to ticket #10528):

https://git.haskell.org/ghc.git/commitdiff/bc4b64ca5b99bff6b3d5051b57cb2bc52bd4c841

This ticket is to ask that inlining/simplification also be disabled in the RHS of rules.

As an example of what is happening, we are seeing rules like this:

repH :: [a] -> [a] -> [a]

{-# RULES "repH ++" [~] forall xs ys. repH (xs ++ ys) = repH xs . repH ys #-}

get to HERMIT as:

"repH ++" forall xs ys. repH (xs ++ ys) = let f = repH xs
                                              g = repH ys 
                                          in \ z -> f (g z)

In this case it is just an unfolding of composition, but some rules get rather gross on the RHS. The extra junk makes equational reasoning with these rules very fiddly and breaks the correspondence with the source-level code. For instance, it is almost impossible to apply these right-to-left, which is a common thing to do when performing equational reasoning.

Some possible solutions:

  1. Don't inline/apply rules in the RHS at all (just like the LHS).
  1. Don't inline/apply rules in the RHS of rules marked with the NeverActive notation (rules intended for HERMIT use are generally marked NeverActive). Since NeverActive rules are not applied by GHC anyway, this should actually save compile-time work and not affect real programs/rules.

Would either of those be acceptable/possible? Option 1 would be ideal, because it would match the behavior of 7.8 (AFAICT). Option 2 would be sufficient, however.

Attachments (1)

nofib-results.txt (189.6 KB) - added by afarmer 4 years ago.

Download all attachments as: .zip

Change History (25)

comment:1 Changed 4 years ago by afarmer

Description: modified (diff)

comment:2 Changed 4 years ago by RyanGlScott

Cc: RyanGlScott added

comment:3 Changed 4 years ago by conal

Cc: conal added

comment:4 Changed 4 years ago by simonpj

I think that Option 1 would be fine. Could someone try it?

  • In Simplify.simplRule, define rhs_env = lhs_env.
  • Probably rename updModeForRuleLHS to updModeForRule
  • Add a Note with updModeForRule to explain why we also do no inlining on rule RHSs, citing the ticket.
  • Check that there is no nofib impact.

Andrew: maybe you can do this?

Simon

comment:5 Changed 4 years ago by afarmer

Yes I have a patch at home that does this. I'll nofib bench it tonight and report back, thanks!

comment:6 Changed 4 years ago by afarmer

Differential Rev(s): D1246

Alright, I used the 7.10.2 branch and cherry picked bc4b64ca5b99bff6b3d5051b57cb2bc52bd4c841 (SPJ's "Do not inline or apply rules on LHS of rules") as my nofib base. My patch is in Phab:D1246. I used the following nofib command:

make clean && make boot && make -k mode=slow 2>&1 NoFibRuns=30

(as an aside, what does the -k do? It was recommended without comment on the nofib wiki page)

Full nofib results are attached.

To summarize, there was a very slight negative effect (Geom Mean +0.3% runtime, min -1.4%, max 2.2%). See below for Runtime/Elapsed/Total Mem. Binary Sizes and Allocations were unchanged. Average compile time increased 0.8%.

Programs with more than 1% change:

--------------------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
      cacheprof           0.0%     -0.1%     +0.6%     +0.6%     +1.0%
       calendar           0.0%      0.0%     +1.4%     +1.5%      0.0%
   cryptarithm1           0.0%      0.0%     -1.4%     -1.4%      0.0%
       fibheaps           0.0%      0.0%     +1.2%     +1.1%      0.0%
       maillist           0.0%     -0.0%     0.051     0.051     -9.0%
         primes           0.0%      0.0%     +1.4%     +1.5%      0.0%
        rewrite           0.0%      0.0%     +1.9%     +1.9%      0.0%
          sched           0.0%      0.0%     +1.4%     +1.4%      0.0%
          solid           0.0%      0.0%     +1.9%     +2.0%      0.0%
      transform           0.0%      0.0%     +2.2%     +2.1%      0.0%
--------------------------------------------------------------------------------
            Min           0.0%     -0.1%     -1.4%     -1.4%     -9.0%
            Max           0.0%      0.0%     +2.2%     +2.1%     +1.0%
 Geometric Mean          -0.0%     -0.0%     +0.3%     +0.3%     -0.1%
Last edited 4 years ago by afarmer (previous) (diff)

comment:7 Changed 4 years ago by afarmer

Priority: normalhigh

Changed 4 years ago by afarmer

Attachment: nofib-results.txt added

comment:8 Changed 4 years ago by afarmer

I reran nofib on a non-laptop machine* and updated the comment above and the attached result output. Let me know what you think.

(* Turns out even a quiet laptop with wifi disabled still can't be used to benchmark due to what I am guessing are thermal throttling issues.)

Last edited 4 years ago by afarmer (previous) (diff)

comment:9 Changed 4 years ago by simonpj

Looks fine to me. I'd be happy to add it to 7.10.3. But what about the RHS (see comment:4)?

comment:10 Changed 4 years ago by afarmer

Ah, sorry I wasn't clear above. These results are for the changes you suggested in comment:4 (which are in Phab:D1246).

The comparison is between 7.10.2+the LHS patch and 7.10.2+the LHS patch+D1246.

Last edited 4 years ago by afarmer (previous) (diff)

comment:11 Changed 4 years ago by afarmer

(Edit: I originally wrote D1426, I meant D1246.)

comment:12 Changed 4 years ago by afarmer

Status: newpatch

comment:13 Changed 4 years ago by RyanGlScott

Differential Rev(s): D1246Phab:D1246

Changed the Diff and ticket information so that they'll auto-hyperlink.

comment:14 Changed 4 years ago by bgamari

Someone correct me if I'm wrong but it seems like this change carries a bit of risk: Could this not result in a failure in optimization in a library with incorrect phasing annotations? We should definitely check this against Stackage or some similar smoke-test to ensure that libraries with incorrect phasing aren't adversely affected by this change as I don't have much confidence that such a regression would appear in nofib (as we've already seen in the case of text).

Also, have we looked for Core changes that might explain the regressions seen in comment:6? The runtime regressions shown look as though they may be real; it would be nice to have at least some idea of what is happening here.

comment:15 Changed 4 years ago by simonpj

I'd be rather surprised if anything optimises worse after this change. It's only a question of whether the optimisation happens before or after the rule is fired.

I've been wrong before, and a smoke test would be good.

Simon

comment:16 Changed 4 years ago by afarmer

I'm working on updating test suite so validation passes, and I'm also compiling stackage LTS 3.6 as a smoke test. I'll report back when those things are done.

comment:17 Changed 4 years ago by Roboguy

Cc: Roboguy added

comment:18 Changed 4 years ago by bgamari

Any updates on this, Andrew?

comment:19 Changed 4 years ago by afarmer

Rebased onto most recent master and did a ./validate --slow on both master and this patch. The one additional test that fails is T7785, so I'm looking into it now.

=====> T7785(optasm) 1 of 1 [0, 0, 0]
cd ./simplCore/should_compile &&  "/data/users/anfarmer/head/inplace/bin/ghc-stage2" -c T7785.hs -fforce-recomp -dcore-lint -dcmm-lint -dno-debug-output -no-user-package-db -rtsopts -fno-warn-tabs -fno-warn-missed-specialisations -fno-ghci-history  -O -fasm -ddump-rules > T7785.comp.stderr 2>&1
Actual stderr output differs from expected:
--- ./simplCore/should_compile/T7785.stderr.normalised  2015-10-06 11:10:41.550769587 -0700
+++ ./simplCore/should_compile/T7785.comp.stderr.normalised     2015-10-06 11:10:41.550769587 -0700
@@ -4,5 +4,3 @@
     forall ($dMyFunctor :: MyFunctor []) (irred :: Domain [] Int).
       shared @ [] $dMyFunctor irred
       = bar_$sshared
-"SPEC/Foo myfmap @ []" [ALWAYS]
-    forall (tpl :: MyFunctor []). myfmap @ [] tpl = $cmyfmap
*** unexpected failure for T7785(optasm)

I did a preliminary stackage LTS 3.7 build (with 7.10.2+patch) and it looked positive (everything seemed to be building), but the VM ran out of disk and didn't build the last 300 or so packages. Going to restart that process on another machine and report back.

comment:20 Changed 4 years ago by afarmer

So it looks like the output for T7785 was changed by SPJ in 45d9a15c4b85a2ed89579106bdafd84accf2cb39, which is the commit that this issue is addressing anyway, so I just changed the output back to what it was before and the test passes.

Going to start stackage build soon.

comment:21 Changed 4 years ago by afarmer

Alright, I built 7.10.2+LHS patch+this patch and then used that to build stackage (LTS 3.8) as a smoke test. Every package built and tested (for whatever testing stackage-curator does) successfully.

I'll see if I can run benchmark suite on something like text/aeson as a further smoke test, but hopefully this is good to go.

comment:22 Changed 4 years ago by Austin Seipp <austin@…>

In dcc3428/ghc:

Don't inline/apply other rules when simplifying a rule RHS.

HERMIT users depend on RULES to specify equational properties. 7.10.2
performed both inlining and simplification in both sides of the rules, meaning
they can't really be used for this. This breaks most HERMIT use cases.  A
separate commit already disabled this for the LHS of rules. This does so for
the RHS.

See Trac #10829 for nofib results.

Reviewed By: austin, bgamari, simonpj

Differential Revision: https://phabricator.haskell.org/D1246

GHC Trac Issues: #10829

comment:23 Changed 4 years ago by thoughtpolice

Status: patchmerge

comment:24 Changed 4 years ago by bgamari

Resolution: fixed
Status: mergeclosed
Note: See TracTickets for help on using tickets.