Opened 22 months ago

Last modified 19 months ago

#14564 new bug

CAF isn't floated

Reported by: neil.mayhew Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.2.2
Keywords: FloatOut Cc: nh2
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #14519 Differential Rev(s):
Wiki Page:

Description

In the following code, pattern isn't floated, and the execution time is 20x slower than it should be:

import Text.Regex.TDFA (Regex, makeRegex, match)

main :: IO ()
main = do
    entries <- map parseFilename . lines <$> getContents
    let check (Right (_, t)) = last t == 'Z'
        check _ = False
    print $ all check entries

parseFilename :: String -> Either String (String, String)
parseFilename fn = case (pattern `match` fn :: [[String]]) of
    [[_, full, _, time]] -> Right $ (full, time)
    _ -> Left fn
  where
    pattern :: Regex
    pattern = makeRegex
        "^\\./duplicity-(full|inc|new)(-signatures)?\\.\
        \([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]T[0-9][0-9][0-9][0-9][0-9][0-9]Z)\\."

Curiously, it is floated when profiling is enabled, so the profiled program ends up being 7x faster than the unprofiled one.

I can float the code manually (by taking out where and unindenting) and the program then runs at an acceptable speed.

I get the same behaviour with 8.0.2, 8.2.1 and 8.2.2 but 7.10.3 is OK. I don't think the OS and architecture makes a difference, but for the record I'm on various flavours of 64-bit Linux.

Test input data is here

Change History (6)

comment:1 Changed 22 months ago by simonpj

I took a look. Here's what is happening. Just before the full laziness pass (which does the floating) we have

parseFilename
  = \ (fn_a35P :: String) ->
      case Text.Regex.Base.Context.$fRegexContextabAllTextMatches9
             @ [Char]
             (Text.Regex.TDFA.String.$fRegexLikeRegex[]_go
                Text.Regex.TDFA.String.$fRegexContextRegex[][]3
                fn_a35P
                (Text.Regex.TDFA.NewDFA.Engine.execMatch_$sexecMatch3
                   (case Text.Regex.TDFA.String.$w$cmakeRegexOpts
                           Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption_$cdefaultCompOpt
                           Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption_$cblankExecOpt
                           (GHC.Base.build
                              @ Char
                              (\ (@ b_a5BF) ->
                                 GHC.CString.unpackFoldrCString#
                                   @ b_a5BF
                                   "^\\./duplicity-(full|inc|new)(-signatures)?\\.([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]T[0-9][0-9][0-9][0-9][0-9][0-9]Z)\\."#))
                    of
                    { (# ww1, ww2, ww3, ww4, ww5, ww6, ww7, ww8, ww9, ww10 #) ->
                    Text.Regex.TDFA.Common.Regex ww1 ww2 ww3 ww4 ww5 ww6 ww7 ww8 ww9 ww10
                    })
                   Text.Regex.TDFA.String.$fRegexContextRegex[][]3
                   Text.Regex.TDFA.String.$fRegexContextRegex[][]2
                   fn_a35P))
      of { ... }

Somme inlining has happened, but the constant expression we want to float to top level is the argument to $sexecMatch3, namely

  (case Text.Regex.TDFA.String.$w$cmakeRegexOpts
          Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption_$cdefaultCompOpt
          Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption_$cblankExecOpt
          (GHC.Base.build
             @ Char
             (\ (@ b_a5BF) ->
                GHC.CString.unpackFoldrCString#
                  @ b_a5BF "blah blah""#))
   of
   { (# ww1, ww2, ww3, ww4, ww5, ww6, ww7, ww8, ww9, ww10 #) ->
   Text.Regex.TDFA.Common.Regex ww1 ww2 ww3 ww4 ww5 ww6 ww7 ww8 ww9 ww10
   })

Alas, if you look at SetLevels.lvlMFE, there is a special case for case expressions, discussed in Note [Case MFEs]. And, since $sexecMatch3 is strict, the special case fires, and the case-expression is not floated.

Now Note [Case MFEs] claims that we'll take a sparate decision for the scrutinee (which is where all the work is). But in this case the scrutinee is an unboxed tuple, which also can't float to top level. So we can't flat that either, and we just lose, as Neil discovered.

comment:2 Changed 22 months ago by simonpj

Keywords: FloatOut added

I can think of various ways to improve this

  • Make the ad-hoc special handling for case-expressions fire only if
    • The destination level is no top-level (then no extra thunk is created after all)
    • The scrutinee is lifted (so a separate decision can really be taken for the scrutinee

I note also another problem with Note [Case MFEs]. Consider

g y = h y (case (f 3) of
              Nothing -> True
              Just x -> expensive s)

If we don't float the whole case-expression, we can float (f 3) instead. But we won't float expensive x; so it will be computed for each call of g. Not very clever.

Maybe we should just kill off the special case, and put up with a couple of nofib regressions.

This needs a bit of careful performance comparison. Any volunteers? I can advise.

comment:3 Changed 21 months ago by ntc2

comment:4 Changed 19 months ago by nh2

Cc: nh2 added

comment:5 Changed 19 months ago by nh2

Possibly related (I didn't have time to fully dive into it):

reddit - I wrote a program that runs about 10 times faster interpreted than compiled. Anybody know what's going on here? - https://www.reddit.com/r/haskell/comments/7yh8ar/i_wrote_a_program_that_runs_about_10_times_faster/

comment:6 Changed 19 months ago by simonpj

This Reddit post is the subject of #14827

Note: See TracTickets for help on using tickets.