Opened 2 years ago

Closed 22 months ago

Last modified 22 months ago

#14519 closed bug (wontfix)

Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA

Reported by: ntc2 Owned by: tdammers
Priority: normal Milestone: 8.6.1
Component: Compiler Version: 8.2.2
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53
Blocked By: Blocking:
Related Tickets: #13745, #14564 Differential Rev(s):
Wiki Page:

Description (last modified by ntc2)

When using Text.RE.TDFA from the regex-tdfa package on GHC 8.2 to match trivial regexes against Data.Text.Lazy strings, I see exponential runtime. On GHC 8.0, or using Data.Text strict strings or String strings, the runtime is as expected.

Here's my repo with simple test code illustrating the regression and timing stats: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 .

In summary, for the problematic combination of GHC version and libraries, the run times are 3s, 10s, 22s, and 40s for counting regex matches in files with 10000, 20000, 30000, and 40000 lines, respectively. For all of the unproblematic combinations -- i.e. GHC 8.0.2, String, strict Data.Text, or building with profiling -- the run time is always about 1s!

And this is a Heisenbug, in that the performance problem goes away if I build with profiling support!

There is the separate Issue #13745 about compile-time regressions in GHC 8.2 + the regex-tdfa package, that I commented on.

Attachments (5)

Main.hs (1.2 KB) - added by ntc2 2 years ago.
Test code that exhibits exponential runtime regression
defs-30000.txt (146.5 KB) - added by ntc2 2 years ago.
Input file for test program
repro-opt.ticky (6.7 KB) - added by tdammers 22 months ago.
Minimal reproduction, unoptimized (-O0)
repro-opt.2.ticky (6.7 KB) - added by tdammers 22 months ago.
Minimal reproduction, optimized (-O)
repro-unopt.ticky (5.4 KB) - added by tdammers 22 months ago.
Minimal reproduction, unoptimized (-O0)

Download all attachments as: .zip

Change History (50)

Changed 2 years ago by ntc2

Attachment: Main.hs added

Test code that exhibits exponential runtime regression

Changed 2 years ago by ntc2

Attachment: defs-30000.txt added

Input file for test program

comment:1 Changed 2 years ago by ntc2

As suggested by @nomeata, I'm going to try bisecting GHC between 8.0.2 and 8.2.0 to discover when the performance regression was introduced.

comment:2 Changed 2 years ago by ntc2

Having "unusable due to shadowed dependencies" trouble with the GHC 8.2.1 I built from source when trying to build my test code. Links to possibly related issues and a patch:

comment:3 Changed 2 years ago by bgamari

Milestone: 8.4.1
Owner: set to tdammers

comment:4 Changed 2 years ago by ntc2

Description: modified (diff)
Test Case: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bughttps://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53

I'm having trouble with the bisection search because hashable won't build on some intermediate dev GHCs I've tried. I tried changing the test case to use regex-tdfa-text+Data.Text.Lazy directly, instead of via the common interface provided by regex, but then the problem goes away.

comment:5 Changed 2 years ago by simonpj

Don't forget to try with -fno-state-hack.

comment:6 Changed 2 years ago by ntc2

comment:7 Changed 23 months ago by tdammers

I took a stab at bisecting this myself, the problem however is that I'm getting too many false positives, even when cleaning thoroughly and starting with as clean a slate as possible, and when I do that, build times become prohibitive, so I gave up after a few night-long attempts. So I will now take a different route, trying to isolate the problem further on current GHC HEAD, see if I can pin it down.

comment:8 Changed 23 months ago by tdammers

Quick data point: in a test case using a modified Lazy.hs, the bad behavior can be reproduced when running certain regular expressions against a large test data set, but not others:

  • ^def : bad
  • ^def[^.]: bad
  • ^[^.]: good
  • ^d[^.]: good
  • ^de[^.]: good
  • ^def[^.]: bad
  • ^[^.]def: good

(I picked [^.] as a subexpression that can never match).

Particularly interesting is ^de[^.] vs. ^def[^.]: it seems that having to consume 4 or more tokens from the start of the input triggers the bad behavior.

comment:9 Changed 23 months ago by tdammers

Actually it turns out that I must have done something wrong with my benchmarking; proper testing shows that *all* the examples that start with ^[.^] (i.e. everything that fails on the first token) runs fast (~20 ms), while everything that matches on the first token at least runs slowly (~30,000 ms).

comment:10 Changed 23 months ago by simonpj

I would urge you to build the program and libraries with -ticky and see which function is getting executed a lot. The advantage of -ticky is that it's guaranteed not to affect optimisation or indeed anything. It just logs what happens.

Then you'll need a -ddump-simpl -ddump-stg of the relevant modules to match up with the ticky logs.

comment:11 Changed 23 months ago by tdammers

I would urge you to build the program and libraries with -ticky and see which function is getting executed a lot. The advantage of -ticky is that it's guaranteed not to affect optimisation or indeed anything. It just logs what happens.

That's actually exactly what I'm doing right now. A comparison of running two regular expressions such that one is fast (^[^.]d) and the other is slow (^def) shows that all the metrics are the same, or nearly the same, except for ALLOC_PRIM_gds (from 669 up to 1801526) and ALLOC_PRIM_ctr (from 371392 up to 9325581456).

I should probably dig a bit deeper from here.

comment:12 Changed 23 months ago by simonpj

Interesting. I was thinking of this part of the ticky file

    Entries      Alloc    Alloc'd  Non-void Arguments      STG Name
--------------------------------------------------------------------------------
    1800047   78722320          0   1 L                    p_term{v raIX} (main:Main) (fun)
    2160065   50401232          0   2 >L                   base:GHC.List.takeWhile{v rcW} (fun)
    1509183   43941968          0   2 LL                   base:GHC.Base.++{v 03} (fun)
    1086514   43446320          0   2 LL                   base:GHC.List.zip{v 0x} (fun)
    1088129   37257616          0   2 >L                   base:GHC.Base.map{v 01X} (fun)
     480015   24961008          0   1 L                    p2{v raIU} (main:Main) (fun)
     240001   19200080          0   2 ML                   $w$j{v raIV} (main:Main) (fun,se)
     665502   17440560          0   2 iL                   go1{v raJb} (main:Main) (fun)
     545117   17421624          0   1 M                    subterms{v r1ki} (main:Main) (fun)
     723688   15430288          0   3 LMM                  $wmatch'{v raJ5} (main:Main) (fun)
     662950   15428096          0   3 MML                  $sfind'{v raIM} (main:Main) (fun)
     363373   12627992          0   1 L                    go11{v saSX} (main:Main) (fun) in raJb
     487124   11666944          0   3 >>M                  expr_fold{v r1jH} (main:Main) (fun)
    1267348   11605760          0   2 LM                   find'{v raIN} (main:Main) (fun)

Usually when something exponential is going on you get some very bug numbers at the top. (You may need to sort first.)

comment:13 Changed 23 months ago by tdammers

Correct, but that's why I needed to dig deeper - ticky-ing only the module itself didn't reveal anything, but compiling a bunch of dependencies with -ticky as well gives me this line:

 2250377760180010799840          0   2 Mi                   $wnext2{v reC2} (regex-tdfa-text-1.0.0.3-CIfFZ6rjdCoJI5EFpqTwBO:Text.Regex.TDFA.Text.Lazy) (fun)

(note that it actually overflows the Alloc field, so that's two large numbers concatenated together there)

I can't seem to find wnext2 anywhere in the dumps though, which is a bit strange.

comment:14 Changed 23 months ago by simonpj

Compile Text.Regex.TDFA.Text.Lazy with -ddump-simpl -ddump-stg -ticky. I'd be surprised if $wnext2{v reC2} doesn't show up.

comment:15 Changed 23 months ago by tdammers

That's what I thought I was doing, but apparently I fat-fingered a cabal command. Recompiling as we speak, will report back.

comment:16 Changed 23 months ago by tdammers

Recompiled everything, with --force-reinstalls and -fforce-recomp, and -ddump-stg on everything including dependencies, however, grepping the entire project tree for wnext only matches binaries (.o, .a and the like), but none of the dumps.

So I ran GHC directly on the source file inside the cabal tree:

../ghc/inplace/bin/ghc-stage2 regex-tdfa-text-1.0.0.3/Text/Regex/TDFA/Text/Lazy.hs -fforce-recomp -package-db .cabal-sandbox/x86_64-linux-ghc-8.5.20180108-packages.conf.d -ticky -c -ddump-stg -rtsopts -XMultiParamTypeClasses

But to no avail, wnext2 does not appear in the STG dump:

==================== Pre unarise: ====================
sat_s6o2 :: GHC.Types.Int -> GHC.Int.Int64
[LclId] =
    [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64;

$cafter_r6mQ
  :: GHC.Types.Int
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u [] GHC.Base.. Data.Text.Lazy.drop sat_s6o2;

sat_s6o3 :: GHC.Types.Int -> GHC.Int.Int64
[LclId] =
    [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64;

$cbefore_r6nJ
  :: GHC.Types.Int
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u [] GHC.Base.. Data.Text.Lazy.take sat_s6o3;

Text.Regex.TDFA.Text.Lazy.$fExtractText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.Extract Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:Extract! [$cbefore_r6nJ
                                                 $cafter_r6mQ
                                                 Data.Text.Internal.Lazy.empty
                                                 $cextract_r6nK];
$cextract_r6nK
  :: (GHC.Types.Int, GHC.Types.Int)
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u []
        Text.Regex.Base.RegexLike.$dmextract
            Text.Regex.TDFA.Text.Lazy.$fExtractText;

Text.Regex.TDFA.Text.Lazy.$fUnconsText [InlPrag=INLINE (sat-args=0)]
  :: Text.Regex.TDFA.NewDFA.Uncons.Uncons
       Data.Text.Internal.Lazy.Text
[GblId[DFunId(nt)]] =
    [] \u [] Data.Text.Lazy.uncons;

$cmakeRegexOptsM_r6nL
  :: forall (m :: * -> *).
     GHC.Base.Monad m =>
     Text.Regex.TDFA.Common.CompOption
     -> Text.Regex.TDFA.Common.ExecOption
     -> Data.Text.Internal.Lazy.Text
     -> m Text.Regex.TDFA.Common.Regex
[GblId, Arity=4, Unf=OtherCon []] =
    [] \r [$dMonad_s6o4 c_s6o5 e_s6o6 source_s6o7]
        let {
          sat_s6o8 [Occ=Once] :: GHC.Base.String
          [LclId] =
              [source_s6o7] \u [] Data.Text.Lazy.unpack source_s6o7;
        } in 
          Text.Regex.Base.RegexLike.makeRegexOptsM
              Text.Regex.TDFA.String.$fRegexMakerRegexCompOptionExecOption[]
              $dMonad_s6o4
              c_s6o5
              e_s6o6
              sat_s6o8;

Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.RegexMaker
       Text.Regex.TDFA.Common.Regex
       Text.Regex.TDFA.Common.CompOption
       Text.Regex.TDFA.Common.ExecOption
       Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:RegexMaker! [Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption
                                                    $cmakeRegex_r6nN
                                                    $cmakeRegexOpts_r6nM
                                                    $cmakeRegexM_r6nO
                                                    $cmakeRegexOptsM_r6nL];
$cmakeRegexOpts_r6nM
  :: Text.Regex.TDFA.Common.CompOption
     -> Text.Regex.TDFA.Common.ExecOption
     -> Data.Text.Internal.Lazy.Text
     -> Text.Regex.TDFA.Common.Regex
[GblId] =
    [] \u []
        Text.Regex.Base.RegexLike.$dmmakeRegexOpts
            Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText;
$cmakeRegex_r6nN
  :: Data.Text.Internal.Lazy.Text -> Text.Regex.TDFA.Common.Regex
[GblId] =
    [] \u []
        Text.Regex.Base.RegexLike.$dmmakeRegex
            Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText;
$cmakeRegexM_r6nO
  :: forall (m :: * -> *).
     GHC.Base.Monad m =>
     Data.Text.Internal.Lazy.Text -> m Text.Regex.TDFA.Common.Regex
[GblId, Arity=1, Unf=OtherCon []] =
    [] \r [$dMonad_s6o9]
        Text.Regex.Base.RegexLike.$dmmakeRegexM
            Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText
            $dMonad_s6o9;

$trModule1_r6nP :: GHC.Prim.Addr#
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    "main"#;

$trModule2_r6nQ :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    NO_CCS GHC.Types.TrNameS! [$trModule1_r6nP];

$trModule3_r6nR :: GHC.Prim.Addr#
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    "Text.Regex.TDFA.Text.Lazy"#;

$trModule4_r6nS :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    NO_CCS GHC.Types.TrNameS! [$trModule3_r6nR];

Text.Regex.TDFA.Text.Lazy.$trModule :: GHC.Types.Module
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    NO_CCS GHC.Types.Module! [$trModule2_r6nQ $trModule4_r6nS];

$cmatchTest_r6nT
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> GHC.Types.Bool
[GblId] =
    [] \u []
        Text.Regex.TDFA.NewDFA.Tester.matchTest Data.Text.Lazy.uncons;

$cmatchAll_r6nU
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> [Text.Regex.Base.RegexLike.MatchArray]
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6oa s_s6ob]
        let {
          sat_s6od [Occ=Once] :: GHC.Types.Char
          [LclId] =
              NO_CCS GHC.Types.C#! ['\n'#]; } in
        let {
          sat_s6oc [Occ=Once] :: Text.Regex.TDFA.Common.Position
          [LclId] =
              NO_CCS GHC.Types.I#! [0#];
        } in 
          Text.Regex.TDFA.NewDFA.Engine.execMatch
              Data.Text.Lazy.uncons r_s6oa sat_s6oc sat_s6od s_s6ob;

Text.Regex.TDFA.Text.Lazy.compile
  :: Text.Regex.TDFA.Common.CompOption
     -> Text.Regex.TDFA.Common.ExecOption
     -> Data.Text.Internal.Lazy.Text
     -> Data.Either.Either GHC.Base.String Text.Regex.TDFA.Common.Regex
[GblId, Arity=3, Unf=OtherCon []] =
    [] \r [compOpt_s6oe execOpt_s6of txt_s6og]
        let {
          sat_s6oh [Occ=Once] :: GHC.Base.String
          [LclId] =
              [txt_s6og] \u [] Data.Text.Lazy.unpack txt_s6og;
        } in 
          case Text.Regex.TDFA.ReadRegex.parseRegex sat_s6oh of {
            Data.Either.Left err_s6oj [Occ=Once] ->
                let {
                  sat_s6om [Occ=Once] :: [GHC.Types.Char]
                  [LclId] =
                      [err_s6oj] \u []
                          let {
                            sat_s6ol [Occ=Once] :: [GHC.Types.Char]
                            [LclId] =
                                [err_s6oj] \u []
                                    GHC.Show.show Text.Parsec.Error.$fShowParseError err_s6oj; } in
                          let {
                            sat_s6ok [Occ=Once] :: [GHC.Types.Char]
                            [LclId] =
                                [] \u []
                                    GHC.CString.unpackCString#
                                        "parseRegex for Text.Regex.TDFA.Text.Lazy failed:"#;
                          } in  GHC.Base.++ sat_s6ok sat_s6ol;
                } in  Data.Either.Left [sat_s6om];
            Data.Either.Right pattern_s6on [Occ=Once] ->
                let {
                  sat_s6oo [Occ=Once] :: Text.Regex.TDFA.Common.Regex
                  [LclId] =
                      [compOpt_s6oe execOpt_s6of pattern_s6on] \u []
                          Text.Regex.TDFA.TDFA.patternToRegex
                              pattern_s6on compOpt_s6oe execOpt_s6of;
                } in  Data.Either.Right [sat_s6oo];
          };

$cmatchOnce_r6nV
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6op s_s6oq]
        let {
          sat_s6or [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray]
          [LclId] =
              [r_s6op s_s6oq] \u [] $cmatchAll_r6nU r_s6op s_s6oq;
        } in  Data.Maybe.listToMaybe sat_s6or;

$cmatchAllText_r6nW
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> [Text.Regex.Base.RegexLike.MatchText
           Data.Text.Internal.Lazy.Text]
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [regex_s6os source_s6ot]
        let {
          go_s6ou [Occ=LoopBreaker]
            :: GHC.Types.Int
               -> Data.Text.Internal.Lazy.Text
               -> [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)]
               -> [GHC.Arr.Array
                     GHC.Types.Int
                     (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))]
          [LclId, Arity=3, Unf=OtherCon []] =
              sat-only [go_s6ou] \r [i_s6ov ds_s6ow ds1_s6ox]
                  case i_s6ov of i1_s6oy {
                    GHC.Types.I# _ [Occ=Dead] ->
                        case ds1_s6ox of {
                          [] -> [] [];
                          : x_s6oB xs_s6oC [Occ=Once] ->
                              let {
                                ds2_s6oD :: (GHC.Types.Int, GHC.Types.Int)
                                [LclId] =
                                    [x_s6oB] \u []
                                        let {
                                          sat_s6oE [Occ=Once] :: GHC.Types.Int
                                          [LclId] =
                                              NO_CCS GHC.Types.I#! [0#];
                                        } in 
                                          Data.Array.Base.!
                                              Data.Array.Base.$fIArrayArraye
                                              GHC.Arr.$fIxInt
                                              x_s6oB
                                              sat_s6oE; } in
                              let {
                                off0_s6oF :: GHC.Types.Int
                                [LclId] =
                                    [ds2_s6oD] \u []
                                        case ds2_s6oD of {
                                          (,) off1_s6oH [Occ=Once] _ [Occ=Dead] -> off1_s6oH;
                                        }; } in
                              let {
                                len0_s6oJ :: GHC.Types.Int
                                [LclId] =
                                    [ds2_s6oD] \u []
                                        case ds2_s6oD of {
                                          (,) _ [Occ=Dead] len1_s6oM [Occ=Once] -> len1_s6oM;
                                        }; } in
                              let {
                                sat_s6p0 [Occ=Once]
                                  :: [GHC.Arr.Array
                                        GHC.Types.Int
                                        (Data.Text.Internal.Lazy.Text,
                                         (GHC.Types.Int, GHC.Types.Int))]
                                [LclId] =
                                    [go_s6ou ds_s6ow i1_s6oy xs_s6oC off0_s6oF len0_s6oJ] \u []
                                        let {
                                          sat_s6oX [Occ=Once] :: GHC.Types.Int
                                          [LclId] =
                                              [i1_s6oy off0_s6oF len0_s6oJ] \u []
                                                  let {
                                                    sat_s6oW [Occ=Once] :: GHC.Types.Int
                                                    [LclId] =
                                                        [i1_s6oy len0_s6oJ] \u []
                                                            GHC.Num.-
                                                                GHC.Num.$fNumInt len0_s6oJ i1_s6oy;
                                                  } in 
                                                    GHC.Num.+ GHC.Num.$fNumInt off0_s6oF sat_s6oW;
                                        } in 
                                          case $cafter_r6mQ sat_s6oX ds_s6ow of t'_s6oY {
                                            __DEFAULT ->
                                                let {
                                                  sat_s6oZ [Occ=Once] :: GHC.Types.Int
                                                  [LclId] =
                                                      [off0_s6oF len0_s6oJ] \u []
                                                          GHC.Num.+
                                                              GHC.Num.$fNumInt off0_s6oF len0_s6oJ;
                                                } in  go_s6ou sat_s6oZ t'_s6oY xs_s6oC;
                                          }; } in
                              let {
                                sat_s6oV [Occ=Once]
                                  :: GHC.Arr.Array
                                       GHC.Types.Int
                                       (Data.Text.Internal.Lazy.Text,
                                        (GHC.Types.Int, GHC.Types.Int))
                                [LclId] =
                                    [ds_s6ow i1_s6oy x_s6oB] \u []
                                        let {
                                          sat_s6oU [Occ=Once]
                                            :: (GHC.Types.Int, GHC.Types.Int)
                                               -> (Data.Text.Internal.Lazy.Text,
                                                   (GHC.Types.Int, GHC.Types.Int))
                                          [LclId] =
                                              [ds_s6ow i1_s6oy] \r [pair_s6oN]
                                                  case pair_s6oN of wild1_s6oO {
                                                    (,) off_s6oP [Occ=Once] len_s6oQ [Occ=Once] ->
                                                        let {
                                                          sat_s6oT [Occ=Once]
                                                            :: Data.Text.Internal.Lazy.Text
                                                          [LclId] =
                                                              [ds_s6ow
                                                               i1_s6oy
                                                               off_s6oP
                                                               len_s6oQ] \u []
                                                                  let {
                                                                    sat_s6oR [Occ=Once]
                                                                      :: GHC.Types.Int
                                                                    [LclId] =
                                                                        [i1_s6oy off_s6oP] \u []
                                                                            GHC.Num.-
                                                                                GHC.Num.$fNumInt
                                                                                off_s6oP
                                                                                i1_s6oy; } in
                                                                  let {
                                                                    sat_s6oS [Occ=Once]
                                                                      :: (GHC.Types.Int,
                                                                          GHC.Types.Int)
                                                                    [LclId] =
                                                                        NO_CCS (,)! [sat_s6oR
                                                                                     len_s6oQ];
                                                                  } in 
                                                                    $cextract_r6nK sat_s6oS ds_s6ow;
                                                        } in  (,) [sat_s6oT wild1_s6oO];
                                                  };
                                        } in  GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6oU x_s6oB;
                              } in  : [sat_s6oV sat_s6p0];
                        };
                  }; } in
        let {
          sat_s6p2 [Occ=Once]
            :: [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)]
          [LclId] =
              [regex_s6os source_s6ot] \u []
                  $cmatchAll_r6nU regex_s6os source_s6ot; } in
        let {
          sat_s6p1 [Occ=Once] :: GHC.Types.Int
          [LclId] =
              NO_CCS GHC.Types.I#! [0#];
        } in  go_s6ou sat_s6p1 source_s6ot sat_s6p2;

$cmatchCount_r6nX
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> GHC.Types.Int
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6p3 s_s6p4]
        let {
          sat_s6pm [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray]
          [LclId] =
              [r_s6p3 s_s6p4] \u []
                  let {
                    sat_s6pl [Occ=Once] :: GHC.Types.Char
                    [LclId] =
                        NO_CCS GHC.Types.C#! ['\n'#]; } in
                  let {
                    sat_s6pk [Occ=Once] :: Text.Regex.TDFA.Common.Position
                    [LclId] =
                        NO_CCS GHC.Types.I#! [0#]; } in
                  let {
                    sat_s6pj [Occ=Once] :: Text.Regex.TDFA.Common.Regex
                    [LclId] =
                        [r_s6p3] \u []
                            case r_s6p3 of wild_s6p5 {
                              Text.Regex.TDFA.Common.Regex ds_s6p6 [Occ=Once]
                                                           ds1_s6p7 [Occ=Once]
                                                           ds2_s6p8 [Occ=Once]
                                                           ds3_s6p9 [Occ=Once]
                                                           ds4_s6pa [Occ=Once]
                                                           ds5_s6pb [Occ=Once]
                                                           ds6_s6pc [Occ=Once]
                                                           ds7_s6pd [Occ=Once]
                                                           ds8_s6pe [Occ=Once]
                                                           _ [Occ=Dead] ->
                                  let {
                                    sat_s6pi [Occ=Once] :: Text.Regex.TDFA.Common.ExecOption
                                    [LclId] =
                                        [wild_s6p5] \u []
                                            case
                                                Text.Regex.TDFA.Common.regex_execOptions wild_s6p5
                                            of
                                            { Text.Regex.TDFA.Common.ExecOption _ [Occ=Dead] ->
                                                  Text.Regex.TDFA.Common.ExecOption [GHC.Types.False];
                                            };
                                  } in 
                                    Text.Regex.TDFA.Common.Regex [ds_s6p6
                                                                  ds1_s6p7
                                                                  ds2_s6p8
                                                                  ds3_s6p9
                                                                  ds4_s6pa
                                                                  ds5_s6pb
                                                                  ds6_s6pc
                                                                  ds7_s6pd
                                                                  ds8_s6pe
                                                                  sat_s6pi];
                            };
                  } in 
                    Text.Regex.TDFA.NewDFA.Engine.execMatch
                        Data.Text.Lazy.uncons sat_s6pj sat_s6pk sat_s6pl s_s6p4;
        } in  Data.Foldable.length Data.Foldable.$fFoldable[] sat_s6pm;

$cmatchOnceText_r6nY
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> GHC.Base.Maybe
          (Data.Text.Internal.Lazy.Text,
           Text.Regex.Base.RegexLike.MatchText Data.Text.Internal.Lazy.Text,
           Data.Text.Internal.Lazy.Text)
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [regex_s6pn source_s6po]
        let {
          sat_s6pL [Occ=Once]
            :: GHC.Base.Maybe
                 (GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int))
          [LclId] =
              [regex_s6pn source_s6po] \u []
                  let {
                    sat_s6pK [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray]
                    [LclId] =
                        [regex_s6pn source_s6po] \u []
                            let {
                              sat_s6pJ [Occ=Once] :: GHC.Types.Char
                              [LclId] =
                                  NO_CCS GHC.Types.C#! ['\n'#]; } in
                            let {
                              sat_s6pI [Occ=Once] :: Text.Regex.TDFA.Common.Position
                              [LclId] =
                                  NO_CCS GHC.Types.I#! [0#];
                            } in 
                              Text.Regex.TDFA.NewDFA.Engine.execMatch
                                  Data.Text.Lazy.uncons regex_s6pn sat_s6pI sat_s6pJ source_s6po;
                  } in  Data.Maybe.listToMaybe sat_s6pK; } in
        let {
          sat_s6pH [Occ=Once]
            :: GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)
               -> (Data.Text.Internal.Lazy.Text,
                   GHC.Arr.Array
                     GHC.Types.Int
                     (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)),
                   Data.Text.Internal.Lazy.Text)
          [LclId] =
              [source_s6po] \r [ma_s6pp]
                  let {
                    ds_s6pq :: (GHC.Types.Int, GHC.Types.Int)
                    [LclId] =
                        [ma_s6pp] \u []
                            let {
                              sat_s6pr [Occ=Once] :: GHC.Types.Int
                              [LclId] =
                                  NO_CCS GHC.Types.I#! [0#];
                            } in 
                              Data.Array.Base.!
                                  Data.Array.Base.$fIArrayArraye
                                  GHC.Arr.$fIxInt
                                  ma_s6pp
                                  sat_s6pr; } in
                  let {
                    o_s6ps :: GHC.Types.Int
                    [LclId] =
                        [ds_s6pq] \u []
                            case ds_s6pq of {
                              (,) o1_s6pu [Occ=Once] _ [Occ=Dead] -> o1_s6pu;
                            }; } in
                  let {
                    sat_s6pG [Occ=Once] :: Data.Text.Internal.Lazy.Text
                    [LclId] =
                        [source_s6po ds_s6pq o_s6ps] \u []
                            let {
                              sat_s6pF [Occ=Once] :: GHC.Types.Int
                              [LclId] =
                                  [ds_s6pq o_s6ps] \u []
                                      let {
                                        sat_s6pE [Occ=Once] :: GHC.Types.Int
                                        [LclId] =
                                            [ds_s6pq] \u []
                                                case ds_s6pq of {
                                                  (,) _ [Occ=Dead] l_s6pD [Occ=Once] -> l_s6pD;
                                                };
                                      } in  GHC.Num.+ GHC.Num.$fNumInt o_s6ps sat_s6pE;
                            } in  $cafter_r6mQ sat_s6pF source_s6po; } in
                  let {
                    sat_s6pA [Occ=Once]
                      :: GHC.Arr.Array
                           GHC.Types.Int
                           (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))
                    [LclId] =
                        [source_s6po ma_s6pp] \u []
                            let {
                              sat_s6pz [Occ=Once]
                                :: (GHC.Types.Int, GHC.Types.Int)
                                   -> (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))
                              [LclId] =
                                  [source_s6po] \r [ol_s6px]
                                      let {
                                        sat_s6py [Occ=Once] :: Data.Text.Internal.Lazy.Text
                                        [LclId] =
                                            [source_s6po ol_s6px] \u []
                                                $cextract_r6nK ol_s6px source_s6po;
                                      } in  (,) [sat_s6py ol_s6px];
                            } in  GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6pz ma_s6pp; } in
                  let {
                    sat_s6pw [Occ=Once] :: Data.Text.Internal.Lazy.Text
                    [LclId] =
                        [source_s6po o_s6ps] \u [] $cbefore_r6nJ o_s6ps source_s6po;
                  } in  (,,) [sat_s6pw sat_s6pA sat_s6pG];
        } in  GHC.Base.fmap GHC.Base.$fFunctorMaybe sat_s6pH sat_s6pL;

Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.RegexLike
       Text.Regex.TDFA.Common.Regex Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:RegexLike! [Text.Regex.TDFA.Text.Lazy.$fExtractText
                                                   $cmatchOnce_r6nV
                                                   $cmatchAll_r6nU
                                                   $cmatchCount_r6nX
                                                   $cmatchTest_r6nT
                                                   $cmatchAllText_r6nW
                                                   $cmatchOnceText_r6nY];

Text.Regex.TDFA.Text.Lazy.regexec
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> Data.Either.Either
          GHC.Base.String
          (GHC.Base.Maybe
             (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text,
              Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text]))
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6pM txt_s6pN]
        case $cmatchOnceText_r6nY r_s6pM txt_s6pN of {
          GHC.Base.Nothing -> Data.Either.Right [GHC.Base.Nothing];
          GHC.Base.Just ds_s6pP [Occ=Once!] ->
              case ds_s6pP of {
                (,,) pre_s6pR [Occ=Once] mt_s6pS post_s6pT [Occ=Once] ->
                    let {
                      sat_s6pZ [Occ=Once] :: [Data.Text.Internal.Lazy.Text]
                      [LclId] =
                          [mt_s6pS] \u []
                              let {
                                sat_s6pY [Occ=Once]
                                  :: [(Data.Text.Internal.Lazy.Text,
                                       (Text.Regex.Base.RegexLike.MatchOffset,
                                        Text.Regex.Base.RegexLike.MatchLength))]
                                [LclId] =
                                    [mt_s6pS] \u []
                                        let {
                                          sat_s6pX [Occ=Once]
                                            :: [(Data.Text.Internal.Lazy.Text,
                                                 (Text.Regex.Base.RegexLike.MatchOffset,
                                                  Text.Regex.Base.RegexLike.MatchLength))]
                                          [LclId] =
                                              [mt_s6pS] \u []
                                                  Data.Array.Base.elems
                                                      Data.Array.Base.$fIArrayArraye
                                                      GHC.Arr.$fIxInt
                                                      mt_s6pS;
                                        } in  GHC.List.tail sat_s6pX;
                              } in  GHC.Base.map Data.Tuple.fst sat_s6pY; } in
                    let {
                      sat_s6pW [Occ=Once] :: Data.Text.Internal.Lazy.Text
                      [LclId] =
                          [mt_s6pS] \u []
                              let {
                                sat_s6pV [Occ=Once]
                                  :: (Data.Text.Internal.Lazy.Text,
                                      (Text.Regex.Base.RegexLike.MatchOffset,
                                       Text.Regex.Base.RegexLike.MatchLength))
                                [LclId] =
                                    [mt_s6pS] \u []
                                        let {
                                          sat_s6pU [Occ=Once] :: GHC.Types.Int
                                          [LclId] =
                                              NO_CCS GHC.Types.I#! [0#];
                                        } in 
                                          Data.Array.Base.!
                                              Data.Array.Base.$fIArrayArraye
                                              GHC.Arr.$fIxInt
                                              mt_s6pS
                                              sat_s6pU;
                              } in  Data.Tuple.fst sat_s6pV; } in
                    let {
                      sat_s6q0 [Occ=Once]
                        :: (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text,
                            Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text])
                      [LclId] =
                          NO_CCS (,,,)! [pre_s6pR sat_s6pW post_s6pT sat_s6pZ]; } in
                    let {
                      sat_s6q1 [Occ=Once]
                        :: GHC.Base.Maybe
                             (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text,
                              Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text])
                      [LclId] =
                          NO_CCS GHC.Base.Just! [sat_s6q0];
                    } in  Data.Either.Right [sat_s6q1];
              };
        };

$cmatch_r6nZ
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u []
        Text.Regex.Base.Impl.polymatch
            Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText;

$cmatchM_r6o0
  :: forall (m :: * -> *).
     GHC.Base.Monad m =>
     Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> m Data.Text.Internal.Lazy.Text
[GblId, Arity=1, Unf=OtherCon []] =
    [] \r [$dMonad_s6q2]
        Text.Regex.Base.Impl.polymatchM
            Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText $dMonad_s6q2;

Text.Regex.TDFA.Text.Lazy.$fRegexContextRegexTextText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.RegexContext
       Text.Regex.TDFA.Common.Regex
       Data.Text.Internal.Lazy.Text
       Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:RegexContext! [Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText
                                                      $cmatch_r6nZ
                                                      $cmatchM_r6o0];

Text.Regex.TDFA.Text.Lazy.execute
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> Data.Either.Either
          GHC.Base.String
          (GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray)
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6q3 txt_s6q4]
        let {
          sat_s6q5 [Occ=Once]
            :: GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray
          [LclId] =
              [r_s6q3 txt_s6q4] \u [] $cmatchOnce_r6nV r_s6q3 txt_s6q4;
        } in  Data.Either.Right [sat_s6q5];



==================== STG syntax: ====================
sat_s6o2 :: GHC.Types.Int -> GHC.Int.Int64
[LclId] =
    [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64;

$cafter_r6mQ
  :: GHC.Types.Int
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u [] GHC.Base.. Data.Text.Lazy.drop sat_s6o2;

sat_s6o3 :: GHC.Types.Int -> GHC.Int.Int64
[LclId] =
    [] \u [] GHC.Enum.toEnum GHC.Int.$fEnumInt64;

$cbefore_r6nJ
  :: GHC.Types.Int
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u [] GHC.Base.. Data.Text.Lazy.take sat_s6o3;

Text.Regex.TDFA.Text.Lazy.$fExtractText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.Extract Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:Extract! [$cbefore_r6nJ
                                                 $cafter_r6mQ
                                                 Data.Text.Internal.Lazy.empty
                                                 $cextract_r6nK];
$cextract_r6nK
  :: (GHC.Types.Int, GHC.Types.Int)
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u []
        Text.Regex.Base.RegexLike.$dmextract
            Text.Regex.TDFA.Text.Lazy.$fExtractText;

Text.Regex.TDFA.Text.Lazy.$fUnconsText [InlPrag=INLINE (sat-args=0)]
  :: Text.Regex.TDFA.NewDFA.Uncons.Uncons
       Data.Text.Internal.Lazy.Text
[GblId[DFunId(nt)]] =
    [] \u [] Data.Text.Lazy.uncons;

$cmakeRegexOptsM_r6nL
  :: forall (m :: * -> *).
     GHC.Base.Monad m =>
     Text.Regex.TDFA.Common.CompOption
     -> Text.Regex.TDFA.Common.ExecOption
     -> Data.Text.Internal.Lazy.Text
     -> m Text.Regex.TDFA.Common.Regex
[GblId, Arity=4, Unf=OtherCon []] =
    [] \r [$dMonad_s6o4 c_s6o5 e_s6o6 source_s6o7]
        let {
          sat_s6o8 [Occ=Once] :: GHC.Base.String
          [LclId] =
              [source_s6o7] \u [] Data.Text.Lazy.unpack source_s6o7;
        } in 
          Text.Regex.Base.RegexLike.makeRegexOptsM
              Text.Regex.TDFA.String.$fRegexMakerRegexCompOptionExecOption[]
              $dMonad_s6o4
              c_s6o5
              e_s6o6
              sat_s6o8;

Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.RegexMaker
       Text.Regex.TDFA.Common.Regex
       Text.Regex.TDFA.Common.CompOption
       Text.Regex.TDFA.Common.ExecOption
       Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:RegexMaker! [Text.Regex.TDFA.Common.$fRegexOptionsRegexCompOptionExecOption
                                                    $cmakeRegex_r6nN
                                                    $cmakeRegexOpts_r6nM
                                                    $cmakeRegexM_r6nO
                                                    $cmakeRegexOptsM_r6nL];
$cmakeRegexOpts_r6nM
  :: Text.Regex.TDFA.Common.CompOption
     -> Text.Regex.TDFA.Common.ExecOption
     -> Data.Text.Internal.Lazy.Text
     -> Text.Regex.TDFA.Common.Regex
[GblId] =
    [] \u []
        Text.Regex.Base.RegexLike.$dmmakeRegexOpts
            Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText;
$cmakeRegex_r6nN
  :: Data.Text.Internal.Lazy.Text -> Text.Regex.TDFA.Common.Regex
[GblId] =
    [] \u []
        Text.Regex.Base.RegexLike.$dmmakeRegex
            Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText;
$cmakeRegexM_r6nO
  :: forall (m :: * -> *).
     GHC.Base.Monad m =>
     Data.Text.Internal.Lazy.Text -> m Text.Regex.TDFA.Common.Regex
[GblId, Arity=1, Unf=OtherCon []] =
    [] \r [$dMonad_s6o9]
        Text.Regex.Base.RegexLike.$dmmakeRegexM
            Text.Regex.TDFA.Text.Lazy.$fRegexMakerRegexCompOptionExecOptionText
            $dMonad_s6o9;

$trModule1_r6nP :: GHC.Prim.Addr#
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    "main"#;

$trModule2_r6nQ :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    NO_CCS GHC.Types.TrNameS! [$trModule1_r6nP];

$trModule3_r6nR :: GHC.Prim.Addr#
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    "Text.Regex.TDFA.Text.Lazy"#;

$trModule4_r6nS :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    NO_CCS GHC.Types.TrNameS! [$trModule3_r6nR];

Text.Regex.TDFA.Text.Lazy.$trModule :: GHC.Types.Module
[GblId, Caf=NoCafRefs, Unf=OtherCon []] =
    NO_CCS GHC.Types.Module! [$trModule2_r6nQ $trModule4_r6nS];

$cmatchTest_r6nT
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> GHC.Types.Bool
[GblId] =
    [] \u []
        Text.Regex.TDFA.NewDFA.Tester.matchTest Data.Text.Lazy.uncons;

$cmatchAll_r6nU
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> [Text.Regex.Base.RegexLike.MatchArray]
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6oa s_s6ob]
        let {
          sat_s6od [Occ=Once] :: GHC.Types.Char
          [LclId] =
              NO_CCS GHC.Types.C#! ['\n'#]; } in
        let {
          sat_s6oc [Occ=Once] :: Text.Regex.TDFA.Common.Position
          [LclId] =
              NO_CCS GHC.Types.I#! [0#];
        } in 
          Text.Regex.TDFA.NewDFA.Engine.execMatch
              Data.Text.Lazy.uncons r_s6oa sat_s6oc sat_s6od s_s6ob;

Text.Regex.TDFA.Text.Lazy.compile
  :: Text.Regex.TDFA.Common.CompOption
     -> Text.Regex.TDFA.Common.ExecOption
     -> Data.Text.Internal.Lazy.Text
     -> Data.Either.Either GHC.Base.String Text.Regex.TDFA.Common.Regex
[GblId, Arity=3, Unf=OtherCon []] =
    [] \r [compOpt_s6oe execOpt_s6of txt_s6og]
        let {
          sat_s6oh [Occ=Once] :: GHC.Base.String
          [LclId] =
              [txt_s6og] \u [] Data.Text.Lazy.unpack txt_s6og;
        } in 
          case Text.Regex.TDFA.ReadRegex.parseRegex sat_s6oh of {
            Data.Either.Left err_s6oj [Occ=Once] ->
                let {
                  sat_s6om [Occ=Once] :: [GHC.Types.Char]
                  [LclId] =
                      [err_s6oj] \u []
                          let {
                            sat_s6ol [Occ=Once] :: [GHC.Types.Char]
                            [LclId] =
                                [err_s6oj] \u []
                                    GHC.Show.show Text.Parsec.Error.$fShowParseError err_s6oj; } in
                          let {
                            sat_s6ok [Occ=Once] :: [GHC.Types.Char]
                            [LclId] =
                                [] \u []
                                    GHC.CString.unpackCString#
                                        "parseRegex for Text.Regex.TDFA.Text.Lazy failed:"#;
                          } in  GHC.Base.++ sat_s6ok sat_s6ol;
                } in  Data.Either.Left [sat_s6om];
            Data.Either.Right pattern_s6on [Occ=Once] ->
                let {
                  sat_s6oo [Occ=Once] :: Text.Regex.TDFA.Common.Regex
                  [LclId] =
                      [compOpt_s6oe execOpt_s6of pattern_s6on] \u []
                          Text.Regex.TDFA.TDFA.patternToRegex
                              pattern_s6on compOpt_s6oe execOpt_s6of;
                } in  Data.Either.Right [sat_s6oo];
          };

$cmatchOnce_r6nV
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6op s_s6oq]
        let {
          sat_s6or [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray]
          [LclId] =
              [r_s6op s_s6oq] \u [] $cmatchAll_r6nU r_s6op s_s6oq;
        } in  Data.Maybe.listToMaybe sat_s6or;

$cmatchAllText_r6nW
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> [Text.Regex.Base.RegexLike.MatchText
           Data.Text.Internal.Lazy.Text]
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [regex_s6os source_s6ot]
        let {
          go_s6ou [Occ=LoopBreaker]
            :: GHC.Types.Int
               -> Data.Text.Internal.Lazy.Text
               -> [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)]
               -> [GHC.Arr.Array
                     GHC.Types.Int
                     (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))]
          [LclId, Arity=3, Unf=OtherCon []] =
              sat-only [go_s6ou] \r [i_s6ov ds_s6ow ds1_s6ox]
                  case i_s6ov of i1_s6oy {
                    GHC.Types.I# _ [Occ=Dead] ->
                        case ds1_s6ox of {
                          [] -> [] [];
                          : x_s6oB xs_s6oC [Occ=Once] ->
                              let {
                                ds2_s6oD :: (GHC.Types.Int, GHC.Types.Int)
                                [LclId] =
                                    [x_s6oB] \u []
                                        let {
                                          sat_s6oE [Occ=Once] :: GHC.Types.Int
                                          [LclId] =
                                              NO_CCS GHC.Types.I#! [0#];
                                        } in 
                                          Data.Array.Base.!
                                              Data.Array.Base.$fIArrayArraye
                                              GHC.Arr.$fIxInt
                                              x_s6oB
                                              sat_s6oE; } in
                              let {
                                off0_s6oF :: GHC.Types.Int
                                [LclId] =
                                    [ds2_s6oD] \u []
                                        case ds2_s6oD of {
                                          (,) off1_s6oH [Occ=Once] _ [Occ=Dead] -> off1_s6oH;
                                        }; } in
                              let {
                                len0_s6oJ :: GHC.Types.Int
                                [LclId] =
                                    [ds2_s6oD] \u []
                                        case ds2_s6oD of {
                                          (,) _ [Occ=Dead] len1_s6oM [Occ=Once] -> len1_s6oM;
                                        }; } in
                              let {
                                sat_s6p0 [Occ=Once]
                                  :: [GHC.Arr.Array
                                        GHC.Types.Int
                                        (Data.Text.Internal.Lazy.Text,
                                         (GHC.Types.Int, GHC.Types.Int))]
                                [LclId] =
                                    [go_s6ou ds_s6ow i1_s6oy xs_s6oC off0_s6oF len0_s6oJ] \u []
                                        let {
                                          sat_s6oX [Occ=Once] :: GHC.Types.Int
                                          [LclId] =
                                              [i1_s6oy off0_s6oF len0_s6oJ] \u []
                                                  let {
                                                    sat_s6oW [Occ=Once] :: GHC.Types.Int
                                                    [LclId] =
                                                        [i1_s6oy len0_s6oJ] \u []
                                                            GHC.Num.-
                                                                GHC.Num.$fNumInt len0_s6oJ i1_s6oy;
                                                  } in 
                                                    GHC.Num.+ GHC.Num.$fNumInt off0_s6oF sat_s6oW;
                                        } in 
                                          case $cafter_r6mQ sat_s6oX ds_s6ow of t'_s6oY {
                                            __DEFAULT ->
                                                let {
                                                  sat_s6oZ [Occ=Once] :: GHC.Types.Int
                                                  [LclId] =
                                                      [off0_s6oF len0_s6oJ] \u []
                                                          GHC.Num.+
                                                              GHC.Num.$fNumInt off0_s6oF len0_s6oJ;
                                                } in  go_s6ou sat_s6oZ t'_s6oY xs_s6oC;
                                          }; } in
                              let {
                                sat_s6oV [Occ=Once]
                                  :: GHC.Arr.Array
                                       GHC.Types.Int
                                       (Data.Text.Internal.Lazy.Text,
                                        (GHC.Types.Int, GHC.Types.Int))
                                [LclId] =
                                    [ds_s6ow i1_s6oy x_s6oB] \u []
                                        let {
                                          sat_s6oU [Occ=Once]
                                            :: (GHC.Types.Int, GHC.Types.Int)
                                               -> (Data.Text.Internal.Lazy.Text,
                                                   (GHC.Types.Int, GHC.Types.Int))
                                          [LclId] =
                                              [ds_s6ow i1_s6oy] \r [pair_s6oN]
                                                  case pair_s6oN of wild1_s6oO {
                                                    (,) off_s6oP [Occ=Once] len_s6oQ [Occ=Once] ->
                                                        let {
                                                          sat_s6oT [Occ=Once]
                                                            :: Data.Text.Internal.Lazy.Text
                                                          [LclId] =
                                                              [ds_s6ow
                                                               i1_s6oy
                                                               off_s6oP
                                                               len_s6oQ] \u []
                                                                  let {
                                                                    sat_s6oR [Occ=Once]
                                                                      :: GHC.Types.Int
                                                                    [LclId] =
                                                                        [i1_s6oy off_s6oP] \u []
                                                                            GHC.Num.-
                                                                                GHC.Num.$fNumInt
                                                                                off_s6oP
                                                                                i1_s6oy; } in
                                                                  let {
                                                                    sat_s6oS [Occ=Once]
                                                                      :: (GHC.Types.Int,
                                                                          GHC.Types.Int)
                                                                    [LclId] =
                                                                        NO_CCS (,)! [sat_s6oR
                                                                                     len_s6oQ];
                                                                  } in 
                                                                    $cextract_r6nK sat_s6oS ds_s6ow;
                                                        } in  (,) [sat_s6oT wild1_s6oO];
                                                  };
                                        } in  GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6oU x_s6oB;
                              } in  : [sat_s6oV sat_s6p0];
                        };
                  }; } in
        let {
          sat_s6p2 [Occ=Once]
            :: [GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)]
          [LclId] =
              [regex_s6os source_s6ot] \u []
                  $cmatchAll_r6nU regex_s6os source_s6ot; } in
        let {
          sat_s6p1 [Occ=Once] :: GHC.Types.Int
          [LclId] =
              NO_CCS GHC.Types.I#! [0#];
        } in  go_s6ou sat_s6p1 source_s6ot sat_s6p2;

$cmatchCount_r6nX
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> GHC.Types.Int
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6p3 s_s6p4]
        let {
          sat_s6pm [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray]
          [LclId] =
              [r_s6p3 s_s6p4] \u []
                  let {
                    sat_s6pl [Occ=Once] :: GHC.Types.Char
                    [LclId] =
                        NO_CCS GHC.Types.C#! ['\n'#]; } in
                  let {
                    sat_s6pk [Occ=Once] :: Text.Regex.TDFA.Common.Position
                    [LclId] =
                        NO_CCS GHC.Types.I#! [0#]; } in
                  let {
                    sat_s6pj [Occ=Once] :: Text.Regex.TDFA.Common.Regex
                    [LclId] =
                        [r_s6p3] \u []
                            case r_s6p3 of wild_s6p5 {
                              Text.Regex.TDFA.Common.Regex ds_s6p6 [Occ=Once]
                                                           ds1_s6p7 [Occ=Once]
                                                           ds2_s6p8 [Occ=Once]
                                                           ds3_s6p9 [Occ=Once]
                                                           ds4_s6pa [Occ=Once]
                                                           ds5_s6pb [Occ=Once]
                                                           ds6_s6pc [Occ=Once]
                                                           ds7_s6pd [Occ=Once]
                                                           ds8_s6pe [Occ=Once]
                                                           _ [Occ=Dead] ->
                                  let {
                                    sat_s6pi [Occ=Once] :: Text.Regex.TDFA.Common.ExecOption
                                    [LclId] =
                                        [wild_s6p5] \u []
                                            case
                                                Text.Regex.TDFA.Common.regex_execOptions wild_s6p5
                                            of
                                            { Text.Regex.TDFA.Common.ExecOption _ [Occ=Dead] ->
                                                  Text.Regex.TDFA.Common.ExecOption [GHC.Types.False];
                                            };
                                  } in 
                                    Text.Regex.TDFA.Common.Regex [ds_s6p6
                                                                  ds1_s6p7
                                                                  ds2_s6p8
                                                                  ds3_s6p9
                                                                  ds4_s6pa
                                                                  ds5_s6pb
                                                                  ds6_s6pc
                                                                  ds7_s6pd
                                                                  ds8_s6pe
                                                                  sat_s6pi];
                            };
                  } in 
                    Text.Regex.TDFA.NewDFA.Engine.execMatch
                        Data.Text.Lazy.uncons sat_s6pj sat_s6pk sat_s6pl s_s6p4;
        } in  Data.Foldable.length Data.Foldable.$fFoldable[] sat_s6pm;

$cmatchOnceText_r6nY
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> GHC.Base.Maybe
          (Data.Text.Internal.Lazy.Text,
           Text.Regex.Base.RegexLike.MatchText Data.Text.Internal.Lazy.Text,
           Data.Text.Internal.Lazy.Text)
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [regex_s6pn source_s6po]
        let {
          sat_s6pL [Occ=Once]
            :: GHC.Base.Maybe
                 (GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int))
          [LclId] =
              [regex_s6pn source_s6po] \u []
                  let {
                    sat_s6pK [Occ=Once] :: [Text.Regex.Base.RegexLike.MatchArray]
                    [LclId] =
                        [regex_s6pn source_s6po] \u []
                            let {
                              sat_s6pJ [Occ=Once] :: GHC.Types.Char
                              [LclId] =
                                  NO_CCS GHC.Types.C#! ['\n'#]; } in
                            let {
                              sat_s6pI [Occ=Once] :: Text.Regex.TDFA.Common.Position
                              [LclId] =
                                  NO_CCS GHC.Types.I#! [0#];
                            } in 
                              Text.Regex.TDFA.NewDFA.Engine.execMatch
                                  Data.Text.Lazy.uncons regex_s6pn sat_s6pI sat_s6pJ source_s6po;
                  } in  Data.Maybe.listToMaybe sat_s6pK; } in
        let {
          sat_s6pH [Occ=Once]
            :: GHC.Arr.Array GHC.Types.Int (GHC.Types.Int, GHC.Types.Int)
               -> (Data.Text.Internal.Lazy.Text,
                   GHC.Arr.Array
                     GHC.Types.Int
                     (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int)),
                   Data.Text.Internal.Lazy.Text)
          [LclId] =
              [source_s6po] \r [ma_s6pp]
                  let {
                    ds_s6pq :: (GHC.Types.Int, GHC.Types.Int)
                    [LclId] =
                        [ma_s6pp] \u []
                            let {
                              sat_s6pr [Occ=Once] :: GHC.Types.Int
                              [LclId] =
                                  NO_CCS GHC.Types.I#! [0#];
                            } in 
                              Data.Array.Base.!
                                  Data.Array.Base.$fIArrayArraye
                                  GHC.Arr.$fIxInt
                                  ma_s6pp
                                  sat_s6pr; } in
                  let {
                    o_s6ps :: GHC.Types.Int
                    [LclId] =
                        [ds_s6pq] \u []
                            case ds_s6pq of {
                              (,) o1_s6pu [Occ=Once] _ [Occ=Dead] -> o1_s6pu;
                            }; } in
                  let {
                    sat_s6pG [Occ=Once] :: Data.Text.Internal.Lazy.Text
                    [LclId] =
                        [source_s6po ds_s6pq o_s6ps] \u []
                            let {
                              sat_s6pF [Occ=Once] :: GHC.Types.Int
                              [LclId] =
                                  [ds_s6pq o_s6ps] \u []
                                      let {
                                        sat_s6pE [Occ=Once] :: GHC.Types.Int
                                        [LclId] =
                                            [ds_s6pq] \u []
                                                case ds_s6pq of {
                                                  (,) _ [Occ=Dead] l_s6pD [Occ=Once] -> l_s6pD;
                                                };
                                      } in  GHC.Num.+ GHC.Num.$fNumInt o_s6ps sat_s6pE;
                            } in  $cafter_r6mQ sat_s6pF source_s6po; } in
                  let {
                    sat_s6pA [Occ=Once]
                      :: GHC.Arr.Array
                           GHC.Types.Int
                           (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))
                    [LclId] =
                        [source_s6po ma_s6pp] \u []
                            let {
                              sat_s6pz [Occ=Once]
                                :: (GHC.Types.Int, GHC.Types.Int)
                                   -> (Data.Text.Internal.Lazy.Text, (GHC.Types.Int, GHC.Types.Int))
                              [LclId] =
                                  [source_s6po] \r [ol_s6px]
                                      let {
                                        sat_s6py [Occ=Once] :: Data.Text.Internal.Lazy.Text
                                        [LclId] =
                                            [source_s6po ol_s6px] \u []
                                                $cextract_r6nK ol_s6px source_s6po;
                                      } in  (,) [sat_s6py ol_s6px];
                            } in  GHC.Base.fmap GHC.Arr.$fFunctorArray sat_s6pz ma_s6pp; } in
                  let {
                    sat_s6pw [Occ=Once] :: Data.Text.Internal.Lazy.Text
                    [LclId] =
                        [source_s6po o_s6ps] \u [] $cbefore_r6nJ o_s6ps source_s6po;
                  } in  (,,) [sat_s6pw sat_s6pA sat_s6pG];
        } in  GHC.Base.fmap GHC.Base.$fFunctorMaybe sat_s6pH sat_s6pL;

Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.RegexLike
       Text.Regex.TDFA.Common.Regex Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:RegexLike! [Text.Regex.TDFA.Text.Lazy.$fExtractText
                                                   $cmatchOnce_r6nV
                                                   $cmatchAll_r6nU
                                                   $cmatchCount_r6nX
                                                   $cmatchTest_r6nT
                                                   $cmatchAllText_r6nW
                                                   $cmatchOnceText_r6nY];

Text.Regex.TDFA.Text.Lazy.regexec
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> Data.Either.Either
          GHC.Base.String
          (GHC.Base.Maybe
             (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text,
              Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text]))
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6pM txt_s6pN]
        case $cmatchOnceText_r6nY r_s6pM txt_s6pN of {
          GHC.Base.Nothing -> Data.Either.Right [GHC.Base.Nothing];
          GHC.Base.Just ds_s6pP [Occ=Once!] ->
              case ds_s6pP of {
                (,,) pre_s6pR [Occ=Once] mt_s6pS post_s6pT [Occ=Once] ->
                    let {
                      sat_s6pZ [Occ=Once] :: [Data.Text.Internal.Lazy.Text]
                      [LclId] =
                          [mt_s6pS] \u []
                              let {
                                sat_s6pY [Occ=Once]
                                  :: [(Data.Text.Internal.Lazy.Text,
                                       (Text.Regex.Base.RegexLike.MatchOffset,
                                        Text.Regex.Base.RegexLike.MatchLength))]
                                [LclId] =
                                    [mt_s6pS] \u []
                                        let {
                                          sat_s6pX [Occ=Once]
                                            :: [(Data.Text.Internal.Lazy.Text,
                                                 (Text.Regex.Base.RegexLike.MatchOffset,
                                                  Text.Regex.Base.RegexLike.MatchLength))]
                                          [LclId] =
                                              [mt_s6pS] \u []
                                                  Data.Array.Base.elems
                                                      Data.Array.Base.$fIArrayArraye
                                                      GHC.Arr.$fIxInt
                                                      mt_s6pS;
                                        } in  GHC.List.tail sat_s6pX;
                              } in  GHC.Base.map Data.Tuple.fst sat_s6pY; } in
                    let {
                      sat_s6pW [Occ=Once] :: Data.Text.Internal.Lazy.Text
                      [LclId] =
                          [mt_s6pS] \u []
                              let {
                                sat_s6pV [Occ=Once]
                                  :: (Data.Text.Internal.Lazy.Text,
                                      (Text.Regex.Base.RegexLike.MatchOffset,
                                       Text.Regex.Base.RegexLike.MatchLength))
                                [LclId] =
                                    [mt_s6pS] \u []
                                        let {
                                          sat_s6pU [Occ=Once] :: GHC.Types.Int
                                          [LclId] =
                                              NO_CCS GHC.Types.I#! [0#];
                                        } in 
                                          Data.Array.Base.!
                                              Data.Array.Base.$fIArrayArraye
                                              GHC.Arr.$fIxInt
                                              mt_s6pS
                                              sat_s6pU;
                              } in  Data.Tuple.fst sat_s6pV; } in
                    let {
                      sat_s6q0 [Occ=Once]
                        :: (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text,
                            Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text])
                      [LclId] =
                          NO_CCS (,,,)! [pre_s6pR sat_s6pW post_s6pT sat_s6pZ]; } in
                    let {
                      sat_s6q1 [Occ=Once]
                        :: GHC.Base.Maybe
                             (Data.Text.Internal.Lazy.Text, Data.Text.Internal.Lazy.Text,
                              Data.Text.Internal.Lazy.Text, [Data.Text.Internal.Lazy.Text])
                      [LclId] =
                          NO_CCS GHC.Base.Just! [sat_s6q0];
                    } in  Data.Either.Right [sat_s6q1];
              };
        };

$cmatch_r6nZ
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> Data.Text.Internal.Lazy.Text
[GblId] =
    [] \u []
        Text.Regex.Base.Impl.polymatch
            Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText;

$cmatchM_r6o0
  :: forall (m :: * -> *).
     GHC.Base.Monad m =>
     Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text -> m Data.Text.Internal.Lazy.Text
[GblId, Arity=1, Unf=OtherCon []] =
    [] \r [$dMonad_s6q2]
        Text.Regex.Base.Impl.polymatchM
            Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText $dMonad_s6q2;

Text.Regex.TDFA.Text.Lazy.$fRegexContextRegexTextText [InlPrag=NOUSERINLINE CONLIKE]
  :: Text.Regex.Base.RegexLike.RegexContext
       Text.Regex.TDFA.Common.Regex
       Data.Text.Internal.Lazy.Text
       Data.Text.Internal.Lazy.Text
[GblId[DFunId]] =
    NO_CCS Text.Regex.Base.RegexLike.C:RegexContext! [Text.Regex.TDFA.Text.Lazy.$fRegexLikeRegexText
                                                      $cmatch_r6nZ
                                                      $cmatchM_r6o0];

Text.Regex.TDFA.Text.Lazy.execute
  :: Text.Regex.TDFA.Common.Regex
     -> Data.Text.Internal.Lazy.Text
     -> Data.Either.Either
          GHC.Base.String
          (GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray)
[GblId, Arity=2, Unf=OtherCon []] =
    [] \r [r_s6q3 txt_s6q4]
        let {
          sat_s6q5 [Occ=Once]
            :: GHC.Base.Maybe Text.Regex.Base.RegexLike.MatchArray
          [LclId] =
              [r_s6q3 txt_s6q4] \u [] $cmatchOnce_r6nV r_s6q3 txt_s6q4;
        } in  Data.Either.Right [sat_s6q5];

Which I think is strange, because wnext2 doesn't look like anything GHC would autogenerate, but rather like a name a human would choose on purpose.

comment:17 Changed 23 months ago by mpickering

There is a function called next in regex-tdfa in Text/Regex/TDFA/NewDFA which is probably the one you are looking for?

comment:18 Changed 23 months ago by tdammers

Right, didn't think to try without the w- and the -2. On my setup, there's no next in Text/Regex/TDFA/NewDFA, but four similar modules underneath it (Engine, Engine_FA, Engine_NC, and Engine_NC_FA) do have such a function defined in a let binding. It seems that the Lazy module ends up using the one in Engine (without any suffix), via execMatch.

comment:19 Changed 23 months ago by simonpj

Something is spitting out this line in the ticky dump

 2250377760180010799840          0   2 Mi                   $wnext2{v reC2} (regex-tdfa-text-1.0.0.3-CIfFZ6rjdCoJI5EFpqTwBO:Text.Regex.TDFA.Text.Lazy) (fun)

So the string "$wnext2{v reC2}" must appear in some .o file. I'd do strings *.o | grep to find it.

It it easy to repro? I don't see precise instructions above.

comment:20 Changed 23 months ago by ntc2

@simonpj, you can reproduce by checking out the repo with the slow-regex test package (https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug), building it, and then running the slow-regex executable that produces with arguments "text-lazy" and the name of one of the test files. E.g.

slow-regex text-lazy defs-10000.txt

The defs-10000.txt test file is included with the test package.

With GHC 8.2.2, using new Cabal:

git clone git@github.com:ntc2/ghc-8.2.1-regex-lazy-text-bug.git
cd ghc-8.2.1-regex-lazy-text-bug.git
cabal new-build slow-regex
./dist-newstyle/build/x86_64-linux/ghc-8.2.2/slow-regex-0.1.0.0/c/slow-regex/build/slow-regex/slow-regex text-lazy defs-10000.txt

With GHC 8.2.2, using Stack:

git clone git@github.com:ntc2/ghc-8.2.1-regex-lazy-text-bug.git
cd ghc-8.2.1-regex-lazy-text-bug.git
stack build --stack-yaml stack-ghc-8.2.2.yaml slow-regex
stack exec --stack-yaml stack-ghc-8.2.2.yaml slow-regex text-lazy defs-10000.txt

comment:21 Changed 23 months ago by tdammers

So the string "$wnext2{v reC2}" must appear in some .o file. I'd do strings *.o | grep to find it.

Using strings -f **/*.o | grep wnext2, I get:

regex-tdfa-text-1.0.0.3/dist/dist-sandbox-3f35acf6/build/Text/Regex/TDFA/Text/Lazy.o: $wnext2{v reyv} (regex-tdfa-text-1.0.0.3-CIfFZ6rjdCoJI5EFpqTwBO:Text.Regex.TDFA.Text.Lazy) (fun)

There is no identifier wnext or next in that module, but it does have this:

{-# SPECIALIZE execMatch :: Regex -> Position -> Char -> L.Text -> [MatchArray] #-}
execMatch :: Uncons text => Regex -> Position -> Char -> text -> [MatchArray]
execMatch = Engine.execMatch

execMatch imported from Text.Regex.TDFA.NewDFA.Engine, in turn, defines local functions next and next'; the alias defined here seems to mainly exist in order to add the SPECIALIZE pragma for lazy Text.

execMatch is monstrously large and quite complex, so I haven't managed to figure out entirely how it works, however there is a promising starting point for further digging:

Whether or not SPECIALIZE triggers might depend on debugging and/or profiling build flags, which would explain why the problem disappears in a profiling build. The hypothesis would be that the specialized code does something that the non-specialized code doesn't do, and that something ends up being *less* efficient rather than *more*. The execMatch function is polymorphic over, among other things, the text type, with a typeclass constraint Uncons text (documentation here), so we should expect uncons to inline for the specialized version, but not the un-specialized one (because in the specialized case we can statically resolve it to the non-polymorphic uncons from Data.Text.Lazy), and then, possibly maybe, the inlined uncons leads to a space leak.

comment:22 Changed 23 months ago by tdammers

It it easy to repro? I don't see precise instructions above.

It's a bit elaborate; what I did is roughly this:

  1. Set up a new cabal sandbox
  2. Attempt to install regex-tdfa-text and its dependencies, using GHC HEAD, and with -ticky enabled.
  3. Step 2 will fail due to several dependencies no longer compiling under GHC 8.2. Do whatever it takes to convince cabal otherwise: --allow-newer works for most, but for a few libraries, I had to check out the sources and install from the local source checkout. I started this was before the 8.2 release though, so it's possible that all the dependencies install cleanly by now.
  4. Then inside the sandbox, compile and run an example that triggers the problematic behavior. I used a trimmed-down version of https://ghc.haskell.org/trac/ghc/attachment/ticket/14519/Main.hs, feeding it the 30000-line example attached to this ticket.

comment:23 Changed 22 months ago by simonpj

There is no identifier wnext or next in that module

That's fine -- it will have come from inlining. But I'd be surprised if there was no wnext in the -ddump-simpl or -ddump-stg for that module.

comment:24 Changed 22 months ago by tdammers

That's fine -- it will have come from inlining. But I'd be surprised if there was no wnext in the -ddump-simpl or -ddump-stg for that module.

Figured it out, I typo-ed my cabal command - cabal install regex-tdfa-text will of course not touch the local checkout of regex-tdfa-text, and so -ddump-simpl -ddump-to-file doesn't put dumps where I expect them, and grepping for wnext won't produce anything useful.

comment:25 Changed 22 months ago by tdammers

  • Compile without optimizations
  • Compile without the SPECIALIZE pragma

comment:26 Changed 22 months ago by simonpj

Another line of approach....

Does the problem happen without -O?

If not (and I bet it doesn't), then which modules must be compiled without -O to make the problem go away?

And if compiling M without -O makes the problem go away, try with -O and -fno-float-in and other similar things to switch off various optimisations.

comment:27 Changed 22 months ago by bgamari

Milestone: 8.4.18.6.1

This likely won't be happening for 8.4.

comment:28 Changed 22 months ago by tdammers

Can confirm that the problem "goes away" when compiling with -O0 (100 ms execution time vs. 50,000 for 30,000 lines of input).

comment:29 Changed 22 months ago by tdammers

If not (and I bet it doesn't), then which modules must be compiled without -O to make the problem go away?

Any quick hints on how to do that? Can I somehow instruct cabal to use specific GHC flags selectively for some modules?

Setting -O0 for the main module alone doesn't make the problem go away in any case, while setting -O0 for the entire regex-tdfa-text package does. But that doesn't tell us much of course.

comment:30 Changed 22 months ago by mpickering

Use an {-# OPTIONS_GHC -O0 #-} pragma at the top of the file you want to compile without optimisation.

comment:31 Changed 22 months ago by tdammers

Use an {-# OPTIONS_GHC -O0 #-} pragma at the top of the file you want to compile without optimisation.

Hmm, I was hoping I could do it without changing the source files, that would be easier to script. But it'll have to do.

comment:32 Changed 22 months ago by mpickering

Right but you probably only need to add it to the module which creates the worker $wnext.

Does compiling with -fno-worker-wrapper fix it?

comment:33 Changed 22 months ago by tdammers

Does compiling with -fno-worker-wrapper fix it?

Haven't tried this one yet; -fno-enable-rewrite-rules seems to fix it, the other optimization flags listed as "implied by -O" on https://downloads.haskell.org/~ghc/7.8.2/docs/html/users_guide/flag-reference.html#options-f-compact don't seem to make a difference.

comment:34 Changed 22 months ago by tdammers

Specifically, running the minimal test code over defs-30000, with various flags, and an execution time limit of 1 second (enough to detect the "bad" behavior), gives us this list:

unoptimized (-O0): Time: 101 ms
no-float-in (-O -fno-float-in): Timeout exceeded
no-strictness (-O -fno-strictness): Timeout exceeded
no-full-laziness (-O -fno-full-laziness): Timeout exceeded
no-specialise (-O -fno-specialise): Timeout exceeded
no-do-eta-reduction (-O -fno-do-eta-reduction): Timeout exceeded
no-cse (-O -fno-cse): Timeout exceeded
no-case-merge (-O -fno-case-merge): Timeout exceeded
no-enable-rewrite-rules (-O -fno-enable-rewrite-rules): Time: 88 ms
no-worker-wrapper (-O -fno-worker-wrapper): Timeout exceeded

In other words, among these optimizations, enable-rewrite-rules seems to be the one that triggers the bad behavior.

comment:35 Changed 22 months ago by mpickering

You can try using -ddump-rule-firings to see which rules are firing. Be careful though, disabling rules will cripple other optimisation passes which rely on them to work properly.

comment:36 Changed 22 months ago by tdammers

Disabling all optimizations (-O0) *only* on the Text.Regex.TDFA.Text.Lazy, with -O on all other modules, also produces "good" behavior.

Removing the SPECIALIZE pragma on execMatch in that module however does *not* make a difference.

comment:37 in reply to:  35 Changed 22 months ago by tdammers

Compiling with -ddump-rule-firings gives us:

Rule fired: Class op toEnum (BUILTIN)
Rule fired: Class op toEnum (BUILTIN)
Rule fired: Class op before (BUILTIN)
Rule fired: Class op after (BUILTIN)
Rule fired: LAZY TEXT drop -> fused (Data.Text.Lazy)
Rule fired: LAZY TEXT take -> fused (Data.Text.Lazy)
Rule fired:
    LAZY STREAM stream/unstream fusion (Data.Text.Internal.Lazy.Fusion)
Rule fired: Class op makeRegexOptsM (BUILTIN)
Rule fired: Class op $p1RegexMaker (BUILTIN)
Rule fired: Class op makeRegexOptsM (BUILTIN)
Rule fired: Class op defaultCompOpt (BUILTIN)
Rule fired: Class op defaultExecOpt (BUILTIN)
Rule fired: Class op makeRegexOptsM (BUILTIN)
Rule fired: Class op $p1RegexMaker (BUILTIN)
Rule fired: Class op makeRegexOpts (BUILTIN)
Rule fired: Class op defaultCompOpt (BUILTIN)
Rule fired: Class op defaultExecOpt (BUILTIN)
Rule fired: unpack (GHC.Base)
Rule fired: Class op show (BUILTIN)
Rule fired: ++ (GHC.Base)
Rule fired: fold/build (GHC.Base)
Rule fired: Class op fmap (BUILTIN)
Rule fired: Class op matchOnce (BUILTIN)
Rule fired: Class op bounds (BUILTIN)
Rule fired: Class op unsafeAt (BUILTIN)
Rule fired: Class op numElements (BUILTIN)
Rule fired: Class op index (BUILTIN)
Rule fired: Class op before (BUILTIN)
Rule fired: LAZY TEXT take -> fused (Data.Text.Lazy)
Rule fired: Class op fmap (BUILTIN)
Rule fired: Class op extract (BUILTIN)
Rule fired: Class op after (BUILTIN)
Rule fired: Class op + (BUILTIN)
Rule fired: LAZY TEXT drop -> fused (Data.Text.Lazy)
Rule fired: Class op length (BUILTIN)
Rule fired: Class op matchAll (BUILTIN)
Rule fired: length (GHC.List)
Rule fired: unpack (GHC.Base)
Rule fired: unpack (GHC.Base)
Rule fired: unpack (GHC.Base)
Rule fired: unpack (GHC.Base)
Rule fired: Class op bounds (BUILTIN)
Rule fired: Class op unsafeAt (BUILTIN)
Rule fired: Class op numElements (BUILTIN)
Rule fired: Class op index (BUILTIN)
Rule fired: Class op fmap (BUILTIN)
Rule fired: Class op extract (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op after (BUILTIN)
Rule fired: Class op + (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: LAZY TEXT drop -> fused (Data.Text.Lazy)
Rule fired: Class op + (BUILTIN)
Rule fired: Class op matchAll (BUILTIN)
Rule fired: Class op matchAll (BUILTIN)
Rule fired: Class op matchOnceText (BUILTIN)
Rule fired: Class op bounds (BUILTIN)
Rule fired: Class op unsafeAt (BUILTIN)
Rule fired: Class op numElements (BUILTIN)
Rule fired: Class op index (BUILTIN)
Rule fired: Class op numElements (BUILTIN)
Rule fired: Class op unsafeAt (BUILTIN)
Rule fired: map (GHC.Base)
Rule fired: Class op matchOnce (BUILTIN)
Rule fired: Class op matchOnceText (BUILTIN)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op toInteger (BUILTIN)
Rule fired: smallIntegerToInt (BUILTIN)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op toInteger (BUILTIN)
Rule fired: smallIntegerToInt (BUILTIN)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op toInteger (BUILTIN)
Rule fired: smallIntegerToInt (BUILTIN)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: SPEC next @ Int64 (Text.Regex.TDFA.Text.Lazy)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op toInteger (BUILTIN)
Rule fired: smallIntegerToInt (BUILTIN)
Rule fired: Class op $p1RegexLike (BUILTIN)
Rule fired: Class op empty (BUILTIN)
Rule fired: Class op matchOnceText (BUILTIN)
Rule fired: Class op $p1RegexLike (BUILTIN)
Rule fired: Class op matchOnceText (BUILTIN)
Rule fired: Class op empty (BUILTIN)
Rule fired:
    SPEC/Text.Regex.TDFA.Text.Lazy polymatch @ Regex @ Text (Text.Regex.TDFA.Text.Lazy)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: foldr/app (GHC.Base)
Rule fired: unpack-append (GHC.Base)
Rule fired: foldr/app (GHC.Base)
Rule fired: unpack-append (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: foldr/app (GHC.Base)
Rule fired: unpack-append (GHC.Base)
Rule fired: foldr/app (GHC.Base)
Rule fired: unpack-append (GHC.Base)
Rule fired: lengthList (GHC.List)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: foldr/app (GHC.Base)
Rule fired: unpack-append (GHC.Base)
Rule fired: foldr/app (GHC.Base)
Rule fired: unpack-append (GHC.Base)
Rule fired: mapList (GHC.Base)
Rule fired: unpack-list (GHC.Base)
Rule fired: unpack-append (GHC.Base)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op toInteger (BUILTIN)
Rule fired: smallIntegerToInt (BUILTIN)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op toInteger (BUILTIN)
Rule fired: smallIntegerToInt (BUILTIN)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op $p1Integral (BUILTIN)
Rule fired: Class op $p1Real (BUILTIN)
Rule fired: Class op $p2Real (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op <= (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: Class op - (BUILTIN)
Rule fired: Class op fromInteger (BUILTIN)
Rule fired: integerToInt (BUILTIN)
Rule fired: *# (BUILTIN)
Rule fired: *# (BUILTIN)

comment:38 Changed 22 months ago by tdammers

Evidence so far:

  • We already suspected that whatever we're doing in uncons for lazy Text might break things.
  • The next function, which uses uncons a lot, is where things blow up in ticky profiles
  • Problem disappears when disabling rewrite rules.
  • Problem also disappears in profiling builds; this may however be due to rewrites and other optimizations not being fully applied in the presence of explicit cost centres.
  • Some RULES wrt Text fusion appear in the dump.

So, updated hypothesis: This may not actually be a bug in GHC itself, but in the text library, particularly the fusion-related rules. Things like this PR support the idea that text's fusion behavior may not be perfect yet.

If this is the case, then disabling the relevant RULES in text should also make the problem disappear.

comment:39 Changed 22 months ago by simonpj

This may not actually be a bug in GHC itself, but in the text library, particularly the fusion-related rules.

Sounds plausible. Since there only very few such rules triggered in the trace you sent, you could disable them individually and see what happens. Then hand it off to the text library maintainers!

comment:40 Changed 22 months ago by tdammers

Neutering the following RULES in Data.Text.Lazy (from text) makes the problem go away:

  • LAZY TEXT take -> fused
  • LAZY TEXT take -> unfused
  • LAZY TEXT drop -> fused
  • LAZY TEXT drop -> unfused

Will now try to isolate which one of these causes the trouble.

comment:41 Changed 22 months ago by tdammers

Neutering only these two rules also makes the problem go away:

  • LAZY TEXT take -> fused
  • LAZY TEXT drop -> fused

Neutering only the first one (take -> fused), the problem persists.

Neutering only the last one (drop -> fused), the problem disappears.

In other words, it seems that the culprit is this rule, somewhere around line #1119 in Data.Text.Lazy.hs:

"LAZY TEXT drop -> fused" [~1] forall n t.
    drop n t = unstream (S.drop n (stream t))

Going to run a few more tests to make sure though.

comment:42 Changed 22 months ago by tdammers

Clean build with only this one rule neutered confirms.

So this particular rule causes, or at least triggers, the bug.

comment:43 Changed 22 months ago by tdammers

A smaller reproduction case that depends only on text:

module Main
where

import qualified Data.Text.Lazy as LText
import Data.Char
import System.Environment

type LText = LText.Text

matches :: LText -> ([LText], [LText])
matches str =
  case LText.uncons str of
    Nothing ->
      ([], [])
    Just (c, str') ->
      let (upperMatches, lowerMatches) = matches $ LText.drop 1 str
          upper = isUpper c
          lower = isLower c
          match = LText.take 10 str
          upperMatches' = if upper then match:upperMatches else upperMatches
          lowerMatches' = if lower then match:lowerMatches else lowerMatches
      in (upperMatches', lowerMatches')

main = do
  (arg0:args) <- getArgs
  input <- LText.pack <$> readFile arg0
  let (upper, lower) = matches input
  putStrLn $ "Lowercase: " ++ show (take 1 lower)
  putStrLn $ "Uppercase: " ++ show (take 1 upper)
  print $ LText.take 10 input

This example program tries to roughly mimic the usage of lazy Texts in regex-tdfa-text without actually using any code from that. Specifically, it uses drop (which triggers the offending RULE), it snatches off characters from the front of the remaining string one by one, it passes the tail through a recursive loop, it holds on to chunks of text as it traverses the input, thus preventing them from being collected, and it eventually forces at least the first one of the accumulated chunks by printing them to the console.

I'm not 100% sure whether preventing collection is absolutely necessary to trigger the bug, but in any case, the above example runs slower by about a factor 2 when compiled with optimizations, the dump shows that the offending RULE is being hit, and the ticky profiles hint at the same performance issue as well.

Changed 22 months ago by tdammers

Attachment: repro-opt.ticky added

Minimal reproduction, unoptimized (-O0)

Changed 22 months ago by tdammers

Attachment: repro-opt.2.ticky added

Minimal reproduction, optimized (-O)

Changed 22 months ago by tdammers

Attachment: repro-unopt.ticky added

Minimal reproduction, unoptimized (-O0)

comment:44 Changed 22 months ago by tdammers

Filed issue with the text library: https://github.com/haskell/text/issues/216

comment:45 Changed 22 months ago by tdammers

Resolution: wontfix
Status: newclosed

Reported against text library, probably not a GHC bug. We can always reopen if GHC itself turns out to be the problem after all.

Last edited 22 months ago by tdammers (previous) (diff)
Note: See TracTickets for help on using tickets.