Opened 16 months ago

Closed 15 months ago

Last modified 8 months ago

#15226 closed bug (fixed)

GHC doesn't know that seq# produces something in WHNF

Reported by: dfeuer Owned by:
Priority: normal Milestone: 8.6.1
Component: Compiler Version: 8.4.3
Keywords: Exceptions, DemandAnalysis Cc: simonmar
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: perf/should_run/T15226, 15226a
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D4796
Wiki Page:

Description

data Str a = Str !a
bar :: Maybe a -> IO (Str (Maybe a))
bar x = do
  x' <- evaluate x
  pure (Str x')

This compiles to

Test.bar1
  = \ (@ a_a3Ld)
      (x_a3Ah :: Maybe a_a3Ld)
      (s_i3Nz :: GHC.Prim.State# GHC.Prim.RealWorld) ->
      case GHC.Prim.seq#
             @ (Maybe a_a3Ld) @ GHC.Prim.RealWorld x_a3Ah s_i3Nz
      of
      { (# ipv_i3NC, ipv1_i3ND #) ->
      (# ipv_i3NC, Test.$WStr @ (Maybe a_a3Ld) ipv1_i3ND #)
      }

We suspend the application of $WStr to ipv1_i3ND, when all we actually need to do is apply Str directly. We could work around this in base by defining

evaluate x = IO $ \s ->
  case seq# x s of
    (# s', !x' #) -> (# s', x' #)

but that seems more than a little bit silly.

Change History (23)

comment:1 Changed 16 months ago by dfeuer

On the other hand, if there's no way to express that in the demand signature, then I guess changing base would be a reasonable alternative....

comment:2 Changed 16 months ago by dfeuer

Differential Rev(s): Phab:D4789
Status: newpatch

I've submitted a differential to change evaluate to work around the problem, but I don't like it very much. I'd love to have a primitive type of kind Type -> TYPE UnliftedRep that only holds things in WHNF; then we could make the primop return something of that type. But we don't have such a beast right now.

comment:3 Changed 16 months ago by dfeuer

I think the right way for now is probably to stick this in simplAlt. When the scrutinee is seq# x s, we want to behave the way we would for a strict datacon field.

comment:4 Changed 16 months ago by dfeuer

Differential Rev(s): Phab:D4789Phab:D4796

Here's a differential to fix the problem properly.

comment:5 Changed 16 months ago by David Feuer <David.Feuer@…>

In d964b05/ghc:

Let the simplifier know that seq# forces

Add a special case in `simplAlt` to record that the result of
`seq#` is in WHNF.

Reviewers: simonmar, bgamari, simonpj

Reviewed By: simonpj

Subscribers: simonpj, rwbarton, thomie, carter

GHC Trac Issues: #15226

Differential Revision: https://phabricator.haskell.org/D4796

comment:6 Changed 16 months ago by dfeuer

Status: patchmerge

I think this is likely simple enough to merge.

We still don't mark the argument to seq# as evaluated in the case branch:

do
  _ <- evaluate x
  pure (Str x)

will still think it has to force x again. I suspect there are likely users doing such things. I think the fix should be really simple, but I don't know how to do it. We should (I believe) rewrite

case seq# x s of
  (# s', x' #) -> E

to

case seq# x s of
  (# s', x' #) -> E [x -> x']

In principle, we could do something like that for spark# as well, but it's probably better to let threads fizzle than to let users rely on the optimizer to make their parallel code do what they expect.

comment:7 Changed 16 months ago by simonpj

We should (I believe) rewrite

Yes -- this is a variant of the case binder-swap in OccurAnal. See Note [Binder swap] in OccurAnal. This is the place to do it.

comment:8 in reply to:  7 Changed 16 months ago by dfeuer

Replying to simonpj:

We should (I believe) rewrite

Yes -- this is a variant of the case binder-swap in OccurAnal. See Note [Binder swap] in OccurAnal. This is the place to do it.

I eventually found that, but I'm not at all sure how to deal with coercions in that context. We could, for example, have something like

case seq# x s `cast` ... of
  (# s', x' #) -> ...

in which case we have to work out how to rejigger all the coercions. I don't know enough about that machinery yet. In the current OccurAnal code, the coercion in the scrutinee is always on a variable, but here it's on a pair containing the variable, so I'm not going to be able to code monkey it.

Last edited 16 months ago by dfeuer (previous) (diff)

comment:9 Changed 16 months ago by simonpj

Good point. A cast would complicate. I suppose that in

case seq# x s |> g of
  (# s', x' #) -> ...

we'd have to have

 g :: (# State# t1, t2 #) ~ (# s1, s2 #)

where x :: t2 and x' :: s2. So you'd want to transform to

case seq# x s |> g of
  (# s', x' #) -> let x = x' |> sym (Nth 2 g)

because Nth 2 g :: t2 ~ s2. (I might not have this exactly right.)

This all seems a bit ad-hoc for seq# but I don't really see an alternative, sadly.

comment:10 Changed 16 months ago by dfeuer

The key OccAnal functions involved seem to be mkAltEnv (not sure what this does, but it seems to look for Vars when I want to look also for applications of seq#) and wrapAltRHS (which seems to actually install the Let when appropriate), but I don't see how it all fits together.

comment:11 Changed 15 months ago by simonpj

If you want to pursue, this I can advise. Whether it's worth the trouble I'm less sure. Do you have an example where it's causing trouble?

comment:12 Changed 15 months ago by dfeuer

I'm also not sure it's worth the trouble, although the current state of affairs seems a bit tricky to document in the Haddocks for evaluate. But considering Control. Parallel.Strategies.rdeepseq, I realized that even this binder swap, in combination with what I've already done, isn't really quite enough. Suppose we have

let e = x + 3 :: Int
in case seq# e s of
     (# s', e' #) -> E

We'd actually like to know that not only e and e', but also x, are evaluated in E, because e is strict in x. So if we do a binder swap, we should do it for all the variables the scrutinee is strict in that are not already known to be evaluated.

comment:13 Changed 15 months ago by simonmar

Is there anything to document? It doesn't affect the semantics of evaluate.

comment:14 Changed 15 months ago by simonpj

So if we do a binder swap, we should do it for all the variables the scrutinee is strict

This is a bridge too far! Strictness analysis will work this out, I think. Eg that let e will turn into a case.

comment:15 in reply to:  14 Changed 15 months ago by dfeuer

Replying to simonpj:

So if we do a binder swap, we should do it for all the variables the scrutinee is strict

This is a bridge too far! Strictness analysis will work this out, I think. Eg that let e will turn into a case.

I don't think so. seq# is intentionally lazy in its argument, to allow explicit ordering in an IO context. This seems pretty important in combination with spark#, for example.

comment:16 Changed 15 months ago by simonpj

seq# is intentionally lazy in its argument, to allow explicit ordering in an IO context

Hmnm. Can you give an example? Nothing in seq#'s documentation says that. It jolly well should!

comment:17 Changed 15 months ago by simonpj

Test Case: perf/should_run/T15226, 15226a

I believe this is a follow-up patch

commit 502026fc0a35460c7f04b26a11320723a7bbfdff
Author: David Feuer <david.feuer@gmail.com>
Date:   Mon Jun 11 10:32:23 2018 -0400

    Make seq# evaluatedness look through casts
    
    In d964b05, I forgot to look through casts to find the `seq#`
    identifier. Fix that.
    
    Reviewers: bgamari
    
    Reviewed By: bgamari
    
    Subscribers: rwbarton, thomie, carter
    
    Differential Revision: https://phabricator.haskell.org/D4804


>---------------------------------------------------------------

502026fc0a35460c7f04b26a11320723a7bbfdff
 compiler/coreSyn/CoreSyn.hs                               | 3 ++-
 testsuite/tests/perf/should_run/{T15226.hs => T15226a.hs} | 5 ++++-
 testsuite/tests/perf/should_run/all.T                     | 9 +++++++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/compiler/coreSyn/CoreSyn.hs b/compiler/coreSyn/CoreSyn.hs index 4dd70b0..50e40d1 100644
--- a/compiler/coreSyn/CoreSyn.hs
+++ b/compiler/coreSyn/CoreSyn.hs
@@ -2046,10 +2046,11 @@ collectArgs expr
     go e         as = (e, as)
 
 -- | Attempt to remove the last N arguments of a function call.
--- Strip off any ticks encountered along the way and any ticks
+-- Strip off any ticks or coercions encountered along the way and any
 -- at the end.
 stripNArgs :: Word -> Expr a -> Maybe (Expr a)  stripNArgs !n (Tick _ e) = stripNArgs n e
+stripNArgs n (Cast f _) = stripNArgs n f
 stripNArgs 0 e = Just e
 stripNArgs n (App f _) = stripNArgs (n - 1) f  stripNArgs _ _ = Nothing diff --git a/testsuite/tests/perf/should_run/T15226.hs b/testsuite/tests/perf/should_run/T15226a.hs
similarity index 89%
copy from testsuite/tests/perf/should_run/T15226.hs
copy to testsuite/tests/perf/should_run/T15226a.hs
index 4c09114..6e9a1db 100644
--- a/testsuite/tests/perf/should_run/T15226.hs
+++ b/testsuite/tests/perf/should_run/T15226a.hs
@@ -3,6 +3,7 @@ import Control.Exception (evaluate)
 
 -- Just in case Prelude.repeat changes for some reason.
 import Prelude hiding (repeat)
+import Data.Coerce
 
 -- We want to be sure that the compiler *doesn't* know that
 -- all the elements of the list are in WHNF, because if it @@ -12,11 +13,13 @@ repeat a = res
   where res = a : res
 {-# NOINLINE repeat #-}  -- Belt *and* suspenders
 
+newtype Foo = Foo Int
+
 silly :: [Int] -> IO ()
 silly = foldr go (pure ())
   where
     go x r = do
-      x' <- evaluate x
+      x' <- (coerce (evaluate :: Foo -> IO Foo) :: Int -> IO Int) x
       evaluate (x' + 3)  -- GHC should know that x' has been evaluated,
                          -- so this calculation will be erased entirely.
                          -- Otherwise, we'll create a thunk to pass to diff --git a/testsuite/tests/perf/should_run/all.T b/testsuite/tests/perf/should_run/all.T
index b248dd5..0e7996ef 100644
--- a/testsuite/tests/perf/should_run/all.T
+++ b/testsuite/tests/perf/should_run/all.T
@@ -584,3 +584,12 @@ test('T15226',
      only_ways(['normal'])],
     compile_and_run,
     ['-O'])
+
+test('T15226a',
+    [stats_num_field('bytes allocated',
+                    [ (wordsize(64), 41040, 5) ]),
+                    # 2018-06-06   41040  Look through casts for seq#
+                    # initial  400041040
+     only_ways(['normal'])],
+    compile_and_run,
+    ['-O'])

comment:18 Changed 15 months ago by bgamari

Resolution: fixed
Status: mergeclosed

This is in 8.6.

comment:19 in reply to:  16 Changed 14 months ago by dfeuer

Replying to simonpj:

seq# is intentionally lazy in its argument, to allow explicit ordering in an IO context

Hmnm. Can you give an example? Nothing in seq#'s documentation says that. It jolly well should!

Considering seq# strict can be rather bad, I believe. If we turn .... seq# x s into case x of x' {DEFAULT__ -> .... seq# x' s} then we'll see that x' is evaluated and erase the seq#. That sort of thing is the very sort of trouble seq# was intended to avoid.

comment:20 Changed 14 months ago by simonpj

Considering seq# strict can be rather bad, I believe.

I'm not sure about that. First, let's remember (I always forget this) that seq# has type

seq# ::a -> State# s -> (# State# s, a #)

That is, it involves the IO monad. It's used to implement

evaluate :: a -> IO a
evaluate a = IO $ \s -> seq# a s -- NB. see #2273, #5129

See Trac #5129.

Now consider

\s. let x = blah in
    let y = x+1 in
    case seq# x s of
      (# s', x' #) -> ...

It seems fine to me transform this to

\x. case blah of x ->
    let y = x+1 in
    case seq# x s of (# s', x' #) ->
      ...

What if the seq# is after some other IO operations thus:

\s. let x = blah in
    case f s of      (# s1, r #) ->
    case seq# x s of (# s2, x' #) -> ......

Now you might worry that x might be evaluated (and throw an exception) before f gets a chance to run. But it doesn't: there's a hack in the strictness analyser (see See Note [IO hack in the demand analyser] in DmdAnal) that will make x's binding lazy; in effect the strictness analyser treats the case f s of ... as if it had an extra invisible alternative not mentioning x.

It's not that important. But I think that seq# can safely be strict in x.

comment:21 Changed 14 months ago by simonpj

I'm going to re-open this because there are two loose ends:

  • Should seq# be strict? See comment:10
  • Is it worth doing the "binder-swap" thing? See comment:6 and following.

comment:22 Changed 14 months ago by simonpj

Keywords: Exceptions added

comment:23 Changed 8 months ago by simonpj

Keywords: DemandAnalysis added
Note: See TracTickets for help on using tickets.