Opened 18 months ago

Last modified 10 months ago

#15019 patch bug

Fix performance regressions from #14737

Reported by: tdammers Owned by:
Priority: high Milestone: 8.10.1
Component: Compiler Version:
Keywords: Cc: davide
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #14737 Differential Rev(s): phab:D4635
Wiki Page:

Description (last modified by simonpj)

#14737, while drastically improving performance on some edge cases, introduces performance regressions in 3 test cases:

  • perf/compiler/parsing001.run
  • perf/compiler/T9020.run
  • perf/compiler/T12707.run

Change History (28)

comment:1 Changed 17 months ago by bgamari

Owner: set to tdammers

Tobias will look into this.

comment:2 Changed 17 months ago by bgamari

Differential Rev(s): D4568phab:D4568

comment:3 Changed 17 months ago by simonpj

Description: modified (diff)

comment:4 Changed 17 months ago by tdammers

Looking at one of the regressions, T9010, I set up a trap to catch only those coercions that are found to be equal by eqType, but don't match on isReflCo. And they're all the same: (State# RealWorld -> (# State# RealWorld, () #), State# RealWorld -> (# State# RealWorld, () #)). So this one looks like it would be relatively easy to detect.

Unfortunately, the other one, T12707, paints a different picture; here, the matches look like these:

  (forall a. K1 R Bool a -> [String] -> [String],
   forall a. K1 R Bool a -> [String] -> [String])
  (forall a.
   M1
     S
     ('MetaSel
        ('Just "foo3Field08")
        'NoSourceUnpackedness
        'SourceStrict
        'DecidedStrict)
     (K1 R (Maybe Bool))
     a
   -> [String] -> [String],
   forall a.
   M1
     S
     ('MetaSel
        ('Just "foo3Field08")
        'NoSourceUnpackedness
        'SourceStrict
        'DecidedStrict)
     (Rec0 (Maybe Bool))
     a
   -> [String] -> [String])
  (forall a.
   (:*:)
     (((S1
          ('MetaSel
             ('Just "foo0Field00")
             'NoSourceUnpackedness
             'SourceStrict
             'DecidedStrict)
          (Rec0 String)
        :*: S1
              ('MetaSel
                 ('Just "foo0Field01")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedUnpack)
              (Rec0 Int))
       :*: (S1
              ('MetaSel
                 ('Just "foo0Field02")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 [Int])
            :*: (S1
                   ('MetaSel
                      ('Just "foo0Field03")
                      'NoSourceUnpackedness
                      'SourceStrict
                      'DecidedStrict)
                   (Rec0 (Maybe Bool))
                 :*: S1
                       ('MetaSel
                          ('Just "foo0Field04")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 Bool))))
      :*: ((S1
              ('MetaSel
                 ('Just "foo0Field05")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 String)
            :*: S1
                  ('MetaSel
                     ('Just "foo0Field06")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedUnpack)
                  (Rec0 Int))
           :*: (S1
                  ('MetaSel
                     ('Just "foo0Field07")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedStrict)
                  (Rec0 [Int])
                :*: (S1
                       ('MetaSel
                          ('Just "foo0Field08")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 (Maybe Bool))
                     :*: S1
                           ('MetaSel
                              ('Just "foo0Field09")
                              'NoSourceUnpackedness
                              'SourceStrict
                              'DecidedStrict)
                           (Rec0 Bool)))))
     (((S1
          ('MetaSel
             ('Just "foo0Field10")
             'NoSourceUnpackedness
             'SourceStrict
             'DecidedStrict)
          (Rec0 String)
        :*: S1
              ('MetaSel
                 ('Just "foo0Field11")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedUnpack)
              (Rec0 Int))
       :*: (S1
              ('MetaSel
                 ('Just "foo0Field12")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 [Int])
            :*: (S1
                   ('MetaSel
                      ('Just "foo0Field13")
                      'NoSourceUnpackedness
                      'SourceStrict
                      'DecidedStrict)
                   (Rec0 (Maybe Bool))
                 :*: S1
                       ('MetaSel
                          ('Just "foo0Field14")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 Bool))))
      :*: ((S1
              ('MetaSel
                 ('Just "foo0Field15")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 String)
            :*: S1
                  ('MetaSel
                     ('Just "foo0Field16")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedUnpack)
                  (Rec0 Int))
           :*: (S1
                  ('MetaSel
                     ('Just "foo0Field17")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedStrict)
                  (Rec0 [Int])
                :*: (S1
                       ('MetaSel
                          ('Just "foo0Field18")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 (Maybe Bool))
                     :*: S1
                           ('MetaSel
                              ('Just "foo0Field19")
                              'NoSourceUnpackedness
                              'SourceStrict
                              'DecidedStrict)
                           (Rec0 Bool)))))
     a
   -> [String] -> [String],
   forall a.
   (:*:)
     (((S1
          ('MetaSel
             ('Just "foo0Field00")
             'NoSourceUnpackedness
             'SourceStrict
             'DecidedStrict)
          (Rec0 String)
        :*: S1
              ('MetaSel
                 ('Just "foo0Field01")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedUnpack)
              (Rec0 Int))
       :*: (S1
              ('MetaSel
                 ('Just "foo0Field02")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 [Int])
            :*: (S1
                   ('MetaSel
                      ('Just "foo0Field03")
                      'NoSourceUnpackedness
                      'SourceStrict
                      'DecidedStrict)
                   (Rec0 (Maybe Bool))
                 :*: S1
                       ('MetaSel
                          ('Just "foo0Field04")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 Bool))))
      :*: ((S1
              ('MetaSel
                 ('Just "foo0Field05")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 String)
            :*: S1
                  ('MetaSel
                     ('Just "foo0Field06")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedUnpack)
                  (Rec0 Int))
           :*: (S1
                  ('MetaSel
                     ('Just "foo0Field07")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedStrict)
                  (Rec0 [Int])
                :*: (S1
                       ('MetaSel
                          ('Just "foo0Field08")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 (Maybe Bool))
                     :*: S1
                           ('MetaSel
                              ('Just "foo0Field09")
                              'NoSourceUnpackedness
                              'SourceStrict
                              'DecidedStrict)
                           (Rec0 Bool)))))
     (((S1
          ('MetaSel
             ('Just "foo0Field10")
             'NoSourceUnpackedness
             'SourceStrict
             'DecidedStrict)
          (Rec0 String)
        :*: S1
              ('MetaSel
                 ('Just "foo0Field11")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedUnpack)
              (Rec0 Int))
       :*: (S1
              ('MetaSel
                 ('Just "foo0Field12")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 [Int])
            :*: (S1
                   ('MetaSel
                      ('Just "foo0Field13")
                      'NoSourceUnpackedness
                      'SourceStrict
                      'DecidedStrict)
                   (Rec0 (Maybe Bool))
                 :*: S1
                       ('MetaSel
                          ('Just "foo0Field14")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 Bool))))
      :*: ((S1
              ('MetaSel
                 ('Just "foo0Field15")
                 'NoSourceUnpackedness
                 'SourceStrict
                 'DecidedStrict)
              (Rec0 String)
            :*: S1
                  ('MetaSel
                     ('Just "foo0Field16")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedUnpack)
                  (Rec0 Int))
           :*: (S1
                  ('MetaSel
                     ('Just "foo0Field17")
                     'NoSourceUnpackedness
                     'SourceStrict
                     'DecidedStrict)
                  (Rec0 [Int])
                :*: (S1
                       ('MetaSel
                          ('Just "foo0Field18")
                          'NoSourceUnpackedness
                          'SourceStrict
                          'DecidedStrict)
                       (Rec0 (Maybe Bool))
                     :*: S1
                           ('MetaSel
                              ('Just "foo0Field19")
                              'NoSourceUnpackedness
                              'SourceStrict
                              'DecidedStrict)
                           (Rec0 Bool)))))
     a
   -> [String] -> [String])

Further digging, however, reveals that these are all Sym coercions, so I think I can figure out a faster way to detect these.

comment:5 Changed 17 months ago by simonpj

Looking at one of the regressions, T9010, I set up a trap to catch only those coercions that are found to be equal by eqType, but don't match on isReflCo. And they're all the same: ...

Eh? The code you give doesn't look like a coercion. It looks like a type, or a pair of types. I must be misunderstanding.

comment:6 in reply to:  5 Changed 17 months ago by tdammers

Replying to simonpj:

Eh? The code you give doesn't look like a coercion. It looks like a type, or a pair of types. I must be misunderstanding.

Sorry about that, didn't realize that this was misleading without the relevant context.

So we have this code here:

pushCoTyArg co ty
  | isReflCo co
  = Just (ty, Nothing)

  -- The following is inefficient - don't do `eqType` here, the coercion
  -- optimizer will take care of it. See Trac #14737.
  | tyL `eqType` tyR
  = pprTrace "eqType fired: " (ppr (tyL, tyR)) $ -- <- this is where we're doing the trace logging
    Just (ty, Nothing)

  | isForAllTy tyL
  = ASSERT2( isForAllTy tyR, ppr co $$ ppr ty )
    Just (ty `mkCastTy` mkSymCo co1, Just co2)

  | otherwise
  = Nothing
  where
    Pair tyL tyR = coercionKind co
       -- co :: tyL ~ tyR
       -- tyL = forall (a1 :: k1). ty1
       -- tyR = forall (a2 :: k2). ty2

So the two types we see in this log are tyL and tyR, calculated using coercionKind co. And our goal is to find a drop-in replacement for isReflCo that detects most of the coercions that hit the eqType guard right now, but is significantly cheaper than eqType (and coercionKind / coercionKindRole).

It may become less confusing if we dump the actual coercion too:

pushCoTyArg co ty
  | isReflCo co
  = Just (ty, Nothing)

  -- The following is inefficient - don't do `eqType` here, the coercion
  -- optimizer will take care of it. See Trac #14737.
  | tyL `eqType` tyR
  = pprTrace "eqType fired: " (ppr (co, tyL, tyR)) $
    Just (ty, Nothing)

This gives us output like the following:

eqType fired: 
  (Sym (N:GQuu[0] <K1 R Bool>_N) ; N:GQuu[0] <K1 R Bool>_N,
   forall a. K1 R Bool a -> [String] -> [String],
   forall a. K1 R Bool a -> [String] -> [String])

comment:7 Changed 17 months ago by simonpj

Ah, much easier to understand. Now my next question is this: who built that coercion? It is actually reflexive, and it's trivial to see that; but it is not ReflCo. Usually we build coercions with smart constructors that rapidly smash to ReflCo.

Can you describe how to reproduce this exact example? I can't get it to trip on T9020 and T12707 seems much larger.

comment:8 Changed 17 months ago by tdammers

OK, sure.

Just check out GHC HEAD, and apply this patch:

diff --git a/compiler/coreSyn/CoreOpt.hs b/compiler/coreSyn/CoreOpt.hs
index f1ff68d133..6ddab2c94e 100644
--- a/compiler/coreSyn/CoreOpt.hs
+++ b/compiler/coreSyn/CoreOpt.hs
@@ -958,10 +958,14 @@ pushCoTyArg :: CoercionR -> Type -> Maybe (Type, Maybe CoercionR)
 -- If the returned coercion is Nothing, then it would have been reflexive;
 -- it's faster not to compute it, though.
 pushCoTyArg co ty
+  | isReflCo co
+  = Just (ty, Nothing)
+
   -- The following is inefficient - don't do `eqType` here, the coercion
   -- optimizer will take care of it. See Trac #14737.
-  -- -- | tyL `eqType` tyR
-  -- -- = Just (ty, Nothing)
+  | tyL `eqType` tyR
+  = pprTrace "eqType fired: " (ppr (co, tyL, tyR)) $
+    Just (ty, Nothing)
 
   | isForAllTy tyL
   = ASSERT2( isForAllTy tyR, ppr co $$ ppr ty )
@@ -993,10 +997,15 @@ pushCoValArg :: CoercionR -> Maybe (Coercion, Maybe Coercion)
 -- If the second returned Coercion is actually Nothing, then no cast is necessary;
 -- the returned coercion would have been reflexive.
 pushCoValArg co
+  -- The "easy" case; this doesn't interest us.
+  | isReflCo co
+  = Just (mkRepReflCo arg, Nothing)
+
   -- The following is inefficient - don't do `eqType` here, the coercion
   -- optimizer will take care of it. See Trac #14737.
-  -- -- | tyL `eqType` tyR
-  -- -- = Just (mkRepReflCo arg, Nothing)
+  | tyL `eqType` tyR
+  = pprTrace "eqType fired: " (ppr (co, tyL, tyR)) $
+    Just (mkRepReflCo arg, Nothing)
 
   | isFunTy tyL
   , (co1, co2) <- decomposeFunCo Representational co

Build with the validate flavor, and then:

./inplace/bin/ghc-stage2 -O2 testsuite/tests/perf/compiler/T9020.hs -fforce-recomp

comment:9 Changed 17 months ago by simonpj

OK, got it. The culprit is this:

        addCoerce co1 (CastIt co2 cont)
          = addCoerce (mkTransCo co1 co2) cont

It's really very common for co1 and co2 to cancel each other out, but mkTransCo doesn't spot that. Next thing to try:

        addCoerce co cont
          | isReflCo co = return cont

        addCoerce co1 (CastIt co2 cont)
          | isReflexiveCo co' = return cont
          | otherwise         = {-#SCC "addCoerce-simple-recursion" #-}
                                addCoerce co' cont
          where
            co' = mkTransCo co1 co2

And get rid of the isReflexiveCo at the end altogether.

The initial isReflCo is optional -- just to catch the case where the simplCorecion returns Refl.

comment:10 in reply to:  9 Changed 17 months ago by tdammers

Replying to simonpj:

OK, got it. The culprit is this:

        addCoerce co1 (CastIt co2 cont)
          = addCoerce (mkTransCo co1 co2) cont

It's really very common for co1 and co2 to cancel each other out, but mkTransCo doesn't spot that. Next thing to try:

        addCoerce co cont
          | isReflCo co = return cont

        addCoerce co1 (CastIt co2 cont)
          | isReflexiveCo co' = return cont
          | otherwise         = {-#SCC "addCoerce-simple-recursion" #-}
                                addCoerce co' cont
          where
            co' = mkTransCo co1 co2

And get rid of the isReflexiveCo at the end altogether.

The initial isReflCo is optional -- just to catch the case where the simplCorecion returns Refl.

Yes, this does seem to work. Running some tests to verify.

comment:11 Changed 17 months ago by simonpj

Wow. With this change I see

Unexpected stat failures:
   /tmp/ghctest-c0sci53m/test   spaces/./perf/compiler/T5837.run    T5837 [stat too good] (normal)   (-7.9%)
   /tmp/ghctest-c0sci53m/test   spaces/./perf/compiler/T5321FD.run  T5321FD [stat too good] (normal) (-12.5%)
   /tmp/ghctest-c0sci53m/test   spaces/./perf/compiler/T12425.run   T12425 [stat too good] (optasm)  (-9.5%)
   /tmp/ghctest-c0sci53m/test   spaces/./perf/compiler/T9020.run    T9020 [stat too good] (optasm)   (-32.5%)
   /tmp/ghctest-c0sci53m/test   spaces/./perf/compiler/T12707.run   T12707 [stat too good] (normal)  (-8.7%)

   /tmp/ghctest-c0sci53m/test   spaces/./perf/compiler/T12227.run   T12227 [stat not good enough] (normal) (+660%)

Clearly something is going pear shaped in T12227 but the other numbers look fantastic for such a small change! Do you see this too?

comment:12 Changed 17 months ago by tdammers

Actually, no, I only see "stat too good" in T5321FD, T9020, T12707 and T12425. I just submitted differential Phab:D4635, so I guess we should see what harbormaster has to say on the matter.

comment:13 Changed 17 months ago by simonpj

OK. Turns out that the T12227 regression was because I removed the isReflexiveCo at the end of addCoerce. This is embarrassingly delicate.

Anyway, could you try this revised patch please? I've adjusted it a bit, and added comments.

diff --git a/compiler/coreSyn/CoreOpt.hs b/compiler/coreSyn/CoreOpt.hs
index f1ff68d..552cc3b 100644
--- a/compiler/coreSyn/CoreOpt.hs
+++ b/compiler/coreSyn/CoreOpt.hs
@@ -963,6 +963,9 @@ pushCoTyArg co ty
   -- -- | tyL `eqType` tyR
   -- -- = Just (ty, Nothing)
 
+  | isReflCo co
+  = Just (ty, Nothing)
+
   | isForAllTy tyL
   = ASSERT2( isForAllTy tyR, ppr co $$ ppr ty )
     Just (ty `mkCastTy` mkSymCo co1, Just co2)
@@ -998,6 +1001,9 @@ pushCoValArg co
   -- -- | tyL `eqType` tyR
   -- -- = Just (mkRepReflCo arg, Nothing)
 
+  | isReflCo co
+  = Just (mkRepReflCo arg, Nothing)
+
   | isFunTy tyL
   , (co1, co2) <- decomposeFunCo Representational co
               -- If   co  :: (tyL1 -> tyL2) ~ (tyR1 -> tyR2)
diff --git a/compiler/simplCore/Simplify.hs b/compiler/simplCore/Simplify.hs
index d92f6d7..f6a86f3 100644
--- a/compiler/simplCore/Simplify.hs
+++ b/compiler/simplCore/Simplify.hs
@@ -1209,40 +1209,73 @@ rebuild env expr cont
 ************************************************************************
 -}
 
+{- See Note [Optimising reflexivity]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+It's important (for compiler performance) to get rid of reflexivity as soon
+as it appears.  See Trac #11735, #14737, and #15019.
+
+In particular, we want to behave well on
+
+ *  e |> co1 |> co2
+    where the two happent to cancel out entirely. That is quite common;
+    e.g. a newtype wrapping and unwrapping cancel
+
+
+ * (f |> co) @t1 @t2 ... @tn x1 .. xm
+   Here we wil use pushCoTyArg and pushCoValArg successively, which
+   build up NthCo stacks.  Silly to do that if co is reflexive.
+
+However, we don't want to call isReflexiveCo too much, because it uses
+type equality which is expensive on big types (Trac #14737 comment:7).
+
+A good compromise (determined experimentally) seems to be to call
+isReflexiveCo
+ * when composing casts, and
+ * at the end
+
+In investigating this I saw missed opportunities for on-the-fly
+coercion shrinkage. See Trac #15090.
+-}
+
+
 simplCast :: SimplEnv -> InExpr -> Coercion -> SimplCont
           -> SimplM (SimplFloats, OutExpr)
 simplCast env body co0 cont0
   = do  { co1   <- {-#SCC "simplCast-simplCoercion" #-} simplCoercion env co0
-        ; cont1 <- {-#SCC "simplCast-addCoerce" #-} addCoerce co1 cont0
+        ; cont1 <- {-#SCC "simplCast-addCoerce" #-}
+                   if isReflCo co1
+                   then return cont0  -- See Note [Optimising reflexivity]
+                   else addCoerce co1 cont0
         ; {-#SCC "simplCast-simplExprF" #-} simplExprF env body cont1 }
   where
         -- If the first parameter is Nothing, then simplifying revealed a
         -- reflexive coercion. Omit.
-        addCoerce0 :: Maybe OutCoercion -> SimplCont -> SimplM SimplCont
-        addCoerce0 Nothing   cont = return cont
-        addCoerce0 (Just co) cont = addCoerce co cont
+        addCoerceM :: Maybe OutCoercion -> SimplCont -> SimplM SimplCont
+        addCoerceM Nothing   cont = return cont
+        addCoerceM (Just co) cont = addCoerce co cont
 
         addCoerce :: OutCoercion -> SimplCont -> SimplM SimplCont
-
-        addCoerce co1 (CastIt co2 cont)
-          = {-#SCC "addCoerce-simple-recursion" #-}
-            addCoerce (mkTransCo co1 co2) cont
+        addCoerce co1 (CastIt co2 cont)  -- See Note [Optimising reflexivity]
+          | isReflexiveCo co' = return cont
+          | otherwise         = addCoerce co' cont
+          where
+            co' = mkTransCo co1 co2
 
         addCoerce co cont@(ApplyToTy { sc_arg_ty = arg_ty, sc_cont = tail })
           | Just (arg_ty', m_co') <- pushCoTyArg co arg_ty
           = {-#SCC "addCoerce-pushCoTyArg" #-}
-            do { tail' <- addCoerce0 m_co' tail
+            do { tail' <- addCoerceM m_co' tail
                ; return (cont { sc_arg_ty = arg_ty', sc_cont = tail' }) }
 
         addCoerce co cont@(ApplyToVal { sc_arg = arg, sc_env = arg_se
-                                 , sc_dup = dup, sc_cont = tail })
+                                      , sc_dup = dup, sc_cont = tail })
           | Just (co1, m_co2) <- pushCoValArg co
           , Pair _ new_ty <- coercionKind co1
-          , not (isTypeLevPoly new_ty)  -- without this check, we get a lev-poly arg
+          , not (isTypeLevPoly new_ty)  -- Without this check, we get a lev-poly arg
                                         -- See Note [Levity polymorphism invariants] in CoreSyn
                                         -- test: typecheck/should_run/EtaExpandLevPoly
           = {-#SCC "addCoerce-pushCoValArg" #-}
-            do { tail' <- addCoerce0 m_co2 tail
+            do { tail' <- addCoerceM m_co2 tail
                ; if isReflCo co1
                  then return (cont { sc_cont = tail' })
                       -- Avoid simplifying if possible;
@@ -1260,15 +1293,10 @@ simplCast env body co0 cont0
                                     , sc_cont = tail' }) } }
 
         addCoerce co cont
-          | isReflexiveCo co = {-#SCC "addCoerce-reflexive" #-}
-                               return cont
-          | otherwise        = {-#SCC "addCoerce-other" #-}
-                               return (CastIt co cont)
-                 -- It's worth checking isReflexiveCo.
-                 -- For example, in the initial form of a worker
-                 -- we may find  (coerce T (coerce S (\x.e))) y
-                 -- and we'd like it to simplify to e[y/x] in one round
-                 -- of simplification
+          | isReflexiveCo co = return cont  -- Having this at the end makes a huge
+                                            -- difference in T12227, for some reason
+                                            -- See Note [Optimising reflexivity]
+          | otherwise        = return (CastIt co cont)
 
 simplArg :: SimplEnv -> DupFlag -> StaticEnv -> CoreExpr
          -> SimplM (DupFlag, StaticEnv, OutExpr)

Last edited 17 months ago by simonpj (previous) (diff)

comment:14 Changed 17 months ago by simonpj

How did the patch in comment:13 work out?

comment:15 in reply to:  14 Changed 17 months ago by tdammers

Replying to simonpj:

How did the patch in comment:13 work out?

Absolutely gorgeous.

Phab:D4635 shows improvement on all 3 tests that previously failed, they now even perform about 10% better than pre-#14737.

comment:16 Changed 17 months ago by simonpj

Great. Just add the data here (and/or in the commit message), commit, close, and move on!

comment:17 Changed 17 months ago by tdammers

Improvements should be evident from the changes in the tests, but here they are:

  • T5321FD:
    • Before: 415136648
    • After: 367567168 (down 11.5%)
  • T9020:
    • Before: 423163832
    • #14737: 562206104 (up 32.9%)
    • After: 392559256 (down 7.2% from Before)
  • T12425:
    • Before: 134780272
    • #14737: 141952368 (up 5.3%)
    • After: 130646336 (down 3.1% from Before)

None of the other test cases show any regressions, including the test added for #14737.

comment:18 Changed 17 months ago by tdammers

Status: newpatch

comment:19 Changed 17 months ago by tdammers

https://phabricator.haskell.org/rGHC2a5bdd9adb1cb4afecd3303f0cfcc224be5ce745

Remove unnecessary check in simplCast

The coercion optimizer will take care of it anyway, and the check is
prohibitively expensive.

See Trac Trac #14737.

Reviewers: bgamari

Subscribers: simonpj, thomie, carter

Differential Revision: https://phabricator.haskell.org/D4568
Last edited 17 months ago by tdammers (previous) (diff)

comment:20 Changed 17 months ago by bgamari

Resolution: fixed
Status: patchclosed

comment:21 Changed 17 months ago by simonpj

Owner: tdammers deleted
Resolution: fixed
Status: closednew

But where is ​Phab:D4635? Did that land? comment:17 is all about that patch, but comment:19 is a different patch entirely.

Re-opening until we sort this out.

comment:22 Changed 17 months ago by tdammers

Differential Rev(s): phab:D4568phab:D4635
Status: newpatch

Figured out what went wrong.

Phab:D4568 was listed in the ticket data, but that one actually belongs to #14737, where we fixed the badness in the infamous Grammar.hs (now the T13737 test case), and doctored the performance tests to silence the resulting regressions. This one has landed.

The patch that is supposed to close this ticket is Phab:D4635, where we fix the regressions introduced earlier. This one hasn't landed, but I believe it should ASAP.

The confusion came from me looking at the status of D4568 earlier today, but thinking I was looking at D4635; seeing that the patch had landed, I concluded that this should close the ticket as well.

So, in short: this ticket isn't done yet, but all that's missing is landing Phab:D4635.

comment:23 Changed 17 months ago by Ben Gamari <ben@…>

In d92c755/ghc:

Fix performance regressions from #14737

See #15019. When removing an unnecessary type equality check in #14737,
several regression tests failed. The cause was that some coercions that
are actually Refl coercions weren't passed in as such, which made the
equality check needlessly complex (Refl coercions can be discarded in
this particular check immediately, without inspecting the types at all).

We fix that, and get additional performance improvements for free.

Reviewers: goldfire, bgamari, simonpj

Reviewed By: bgamari, simonpj

Subscribers: simonpj, thomie, carter

Differential Revision: https://phabricator.haskell.org/D4635

comment:24 Changed 17 months ago by simonpj

The patch in comment:23 (hooray) claims to fix three perf regressions, and get other improvement for free. Yet the patch only changes one perf number in perf/compiler/all.T, plus (oddly) one other comment.

What's up here?

comment:25 Changed 17 months ago by bgamari

Hmm, the comment change was due to my conflict resolution. For better or worse the numbers currently in all.T passed validation locally, so I can only guess that some of the numbers are now marginal.

comment:26 in reply to:  24 Changed 17 months ago by tdammers

T12425 at least does not look like it could possibly have become marginal. We went from 141952368 after #14747 (non-marginally up) to 130646336 in #15019 (non-marginally down), and now back up to 150743648 in this patch, which is even higher than the original value post-#14747. So this is somewhat curious IMO.

comment:27 Changed 15 months ago by bgamari

Milestone: 8.6.18.8.1

These won't be addressed by GHC 8.6.

comment:28 Changed 10 months ago by bgamari

Cc: davide added
Milestone: 8.8.18.10.1
Priority: normalhigh

We'll need to sort out what the status of this is. DavidE, can you investigate?

Note: See TracTickets for help on using tickets.