Opened 4 years ago

Closed 4 years ago

Last modified 6 months ago

#11303 closed bug (fixed)

Pattern matching against sets of strings sharing a prefix blows up pattern checker

Reported by: bgamari Owned by:
Priority: highest Milestone: 8.2.1
Component: Compiler Version: 7.11
Keywords: PatternMatchWarnings Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case: T11303, T11303b
Blocked By: Blocking:
Related Tickets: #11276, #11302 Differential Rev(s): Phab:D1716, Phab:D1719, Phab:1795
Wiki Page:

Description

hvr stumbled upon this issue while attempting to bootstrap GHC with GHC HEAD. In so doing he found that GHC HEAD required more than 10 GB of memory while compiling genprimopcode (and never completed).

It appears that this blow-up is due to the new pattern checker. In particular it appears that the pattern checker is affected quite adversely by sets of patterns sharing a prefix. For instance, this example,

import System.Environment

main :: IO ()
main = do
  args <- getArgs
  print $ case head args of
                      "--primop-primop-info"  -> "turtle"
                      "--primop-tag" -> "asdf"
                      "--primop-list"  -> "casdhf"
                      "--primop-vector-uniques"  -> "this"
                      "--primop-vector-tys"  -> "is"
                      "--primop-vector-tys-exports"  -> "silly"
                      "--primop-vector-tycons"  -> "hmmm"
                      "--make-haskell-wrappers" -> "123512"
                      "--make-haskell-source"  -> "as;dg"
                      "--make-latex-doc" -> "adghiw"
                      _ -> error "Should not happen, known_args out of sync?"

As written GHC requires over ten gigabytes of heap and several minutes to compile the example. If one perform s/--primop-// to this example it takes 500ms to compile. Alternatively, if on replace the first - in each of the --primop strings with a unique character (thus breaking the shared prefixes) compilation time is a bit shy of a second.

Change History (23)

comment:1 Changed 4 years ago by hvr

Priority: normalhighest

comment:2 Changed 4 years ago by RyanGlScott

Another place "in the wild" where this appears to occur is text-icu. Attempting to compile Data.Text.ICU.Regex.Internal very quickly eats up all of my RAM. I tried waiting for about half an hour for it to finish before giving up (then again, I was using a laptop with "only" 4 GB of RAM). If pattern matching is the culprit, then toURegexOpts is the probable scene of the crime.

comment:3 Changed 4 years ago by rwbarton

In a way, the fact that memory usage explodes on such small inputs is encouraging: surely some mild effort on algorithms/data structures will have a massive effect here.

comment:4 Changed 4 years ago by gkaracha

Ahhhh, I see. This has been noticed before. Actually, I have added a note in deSugar/Check.hs about it: Note [Literals in PmPat]. My primary test case for this problem has been function mkTextEncoding' from libraries/base/GHC/IO/Encoding.hs until now but I guess your example stresses the same problem even more.

The source of the problem lies in the way we translate literals:

  • In the paper, we treat them as guards: literal l gets translated to x (True <- x == l). This allows the algorithm to work uniformly on everything but is too expensive, even for mkTextEncoding'. Essentially, the covered set contains x |> {True ~ (x == l)} and the uncovered contains x |> {False ~ (x == l)}. Hence, we explode first (everything is a guard) and then prune (use the term oracle to check the constraints) which is really exponential.
  • Hence, I changed the implementation and added literals in the pattern language. This means that for literal l the covered set contains l |> {} and the uncovered set contains x |> {False ~ (x == l)} like before. Since literals are patterns, more things can be checked eagerly, like with constructors where we can see immediately that e.g. True can not match False. Hence, we generate less from the start.

The last thing to do (and I think this will address all performance issues concerning literals) is to completely add literals in the pattern language, that is, use a variant of Sestoft's negative patterns:

data PmPat :: PatTy -> * where
  PmCon  :: { ... }        -> PmPat t
  PmVar  :: Id             -> PmPat t
  PmLit  :: PmLit          -> PmPat t
  PmNLit :: Id -> [PmLit] -> PmPat VA -- add this constructor
  PmGrd  :: { ... }        -> PmPat PAT

The idea is that PmNLit x lits represents in a compact way all literals x that are not equal to any of lits. Hence, we generate much less and there is significantly less need for pruning.

There is a slight possibility that this change will affect #322 and also I expect the size of the checker to increase 5-10% (LoC) but I have no better solution at the moment. Any ideas on this approach?

comment:5 Changed 4 years ago by Ben Gamari <ben@…>

In fcc76493/ghc:

Introduce negative patterns for literals (addresses #11303)

Introduce negative patterns for literals. In addition to storing term
constraints for literals (checked at the end by the term oracle), also
check eagerly, using negative patterns. This means generation of smaller
sets (covered, uncovered, and divergent), instead of generating big sets
and pruning afterwards.

Test Plan: validate

Reviewers: austin, bgamari

Reviewed By: bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1716

GHC Trac Issues: #11303

comment:6 Changed 4 years ago by Ben Gamari <ben@…>

In adcbc98f/ghc:

Add regression test for #11303


Test Plan: Validate

Reviewers: austin

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1719

GHC Trac Issues: #11303

comment:7 Changed 4 years ago by bgamari

Differential Rev(s): Phab:D1716, Phab:D1719
Resolution: fixed
Status: newclosed
Test Case: T11303

The test case now compiles in less than a second.

comment:8 Changed 4 years ago by hvr

Owner: gkaracha deleted
Resolution: fixed
Status: closednew

I just had matrix.h.h.o thrashing due to text-icu still requiring over 10GiB with yesterdays GHC 8.0 snapshot :-(

I tried compiling text-icu-0.7.0.1:Data.Text.ICU.Regex.Internal on my workstation (with swap-space disabled) takes over 30GiB, until the OOM killer finally kicks in:

[19687.231184] Out of memory: Kill process 18811 (ghc) score 929 or sacrifice child
[19687.231186] Killed process 18811 (ghc) total-vm:1074146932kB, anon-rss:30410192kB, file-rss:0kB

comment:9 in reply to:  8 ; Changed 4 years ago by gkaracha

Replying to hvr:

I tried compiling text-icu-0.7.0.1:Data.Text.ICU.Regex.Internal on my workstation (with swap-space disabled) takes over 30GiB, until the OOM killer finally kicks in:

[19687.231184] Out of memory: Kill process 18811 (ghc) score 929 or sacrifice child
[19687.231186] Killed process 18811 (ghc) total-vm:1074146932kB, anon-rss:30410192kB, file-rss:0kB

Hmmmm, how come you consider this an instance of #11303 and not #11276?

comment:10 in reply to:  9 Changed 4 years ago by hvr

Replying to gkaracha:

Hmmmm, how come you consider this an instance of #11303 and not #11276?

Mostly guessed so, because #11276 talks about exponential time, rather than exponential memory usage :-)

Do you need me to provide you a smaller testcase?

comment:11 Changed 4 years ago by mpickering

I suspect #11276 is also a memory issue.

comment:12 Changed 4 years ago by RyanGlScott

Another reason it's probably #11276 instead is because the same workaround can be applied to text-icu:

  • Data/Text/ICU/Regex/Internal.hsc

    index 38765d4..c31f4b6 100644
    a b toURegexpOpts = foldl go (0,-1,-1) 
    174174  where
    175175    go (!flag,work,stack) opt = (flag+flag',work',stack')
    176176     where
     177      flag' :: URegexpFlag
    177178      flag' = case opt of
    178179                CaseInsensitive       -> #const UREGEX_CASE_INSENSITIVE
    179180                Comments              -> #const UREGEX_COMMENTS

Adding a type signature makes the rest of text-icu compile.

comment:13 Changed 4 years ago by gkaracha

Replying to hvr:

Mostly guessed so, because #11276 talks about exponential time, rather than exponential memory usage :-)

I see. Personally, I think that all the recent bug reports concerning the checker were not strictly speaking bugs, rather improvements needed to check realistic code. Yet, #11276 and the behaviour on text-icu-0.7.0.1 seems like an actual bug and I feel they have the same source.

Do you need me to provide you a smaller testcase?

Yes, if it is not too much trouble! Even with Matthew's useful comments I still haven't found the source of #11276, maybe one more test case can shed some more light on this. :-)

Thanks!

comment:14 in reply to:  12 Changed 4 years ago by gkaracha

Replying to RyanGlScott:

Another reason it's probably #11276 instead is because the same workaround can be applied to text-icu: {...} Adding a type signature makes the rest of text-icu compile.

Ah, great! So flag' is the source of the problem. Then yes, they are definitely connected, probably something goes wrong with the types I assign to the initial uncovered set (I get them from the signature/inferred type). Thanks!

Last edited 4 years ago by gkaracha (previous) (diff)

comment:15 Changed 4 years ago by gkaracha

Herbert, could you check again whether text-icu can be built with the current HEAD? My patch for #11276 has been merged so I expect text-icu to be built normally. If so, please also revert #11303 to closed. Thanks for the help! :-)

comment:16 Changed 4 years ago by RyanGlScott

Hm, for some reason, this program still seems to be blowing up GHC even with the latest changes:

{-# LANGUAGE TypeFamilies #-}
module Slow where

data family Letter a

data instance Letter a = A | B | C | D | E | F | G | H | I | J

f :: [Letter a] -> Int
f = foldl go 0
  where
    go n letter = n + n'
      where
        n' = case letter of
                  A -> 0
                  B -> 1
                  C -> 2
                  D -> 3
                  E -> 4
                  F -> 5
                  G -> 6
                  H -> 7
                  I -> 8
                  J -> 9

This time, adding explicit type signatures to go and/or n' does not make it work.

comment:17 Changed 4 years ago by gkaracha

Oh, yes, this is rather expected. I added a note in deSugar/Check.hs about the general case: Note [Translate CoPats]. As the note says, a CoPat is translated as follows:

  pat |> co    ===>    x (pat <- (x |> co))

In the latest commit 0acdcf24/ghc, I added two cases when translating CoPats:

  • If co is refl we can drop it and do not generate the guard
  • If co is just a hole, we can also drop it.

So, as the commit message says, we now generate less guards. For data families, this coercion is essential, because changes the representation tycon to the source tycon and I cannot drop it, that is, without changing the type of the pattern. The previous examples had CoPats due to inference but your example above uses data families directly so the workaround does not apply. I can only hope that I will be able to come up with a better translation for CoPats before the release but I do not know how to address this yet.

Btw, I think we should move this whole discussion to #11276 since #11303 is rather irrelevant.

comment:18 in reply to:  15 Changed 4 years ago by hvr

Resolution: fixed
Status: newclosed

Replying to gkaracha:

Herbert, could you check again whether text-icu can be built with the current HEAD? My patch for #11276 has been merged so I expect text-icu to be built normally. If so, please also revert #11303 to closed. Thanks for the help! :-)

I can confirm that text-icu now compiles with a more reasonable memory usage... :-)

comment:19 Changed 4 years ago by bgamari

comment:20 Changed 4 years ago by simonpj

Keywords: PatternMatchWarnings added
Milestone: 8.0.18.2.1
Resolution: fixed
Status: closednew

Re-opening because I think we can do better for 8.2. But ok for 8.0.

comment:21 Changed 4 years ago by Ben Gamari <ben@…>

In 28f951e/ghc:

Overhaul the Overhauled Pattern Match Checker

Overhaul the Overhauled Pattern Match Checker

* Changed the representation of Value Set Abstractions. Instead of
using a prefix tree, we now use a list of Value Vector Abstractions.
The set of constraints Delta for every Value Vector Abstraction is the
oracle state so that we solve everything only once.

* Instead of doing everything lazily, we prune at once (and in general
everything is much stricter). Hence, an example written with pattern
guards is checked in almost the same time as the equivalent with
pattern matching.

* Do not store the covered and the divergent sets at all. Since what we
only need is a yes/no (does this clause cover anything? Does it force
any thunk?) We just keep a boolean for each.

* Removed flags `-Wtoo-many-guards` and `-ffull-guard-reasoning`.
Replaced with `fmax-pmcheck-iterations=n`. Still debatable what should
the default `n` be.

* When a guard is for sure not going to contribute anything, we treat
it as such: The oracle is not called and cases `CGuard`, `UGuard` and
`DGuard` from the paper are not happening at all (the generation of a
fresh variable, the unfolding of the pattern list etc.). his combined
with the above seems to be enough to drop the memory increase for test
T783 down to 18.7%.

* Do not export function `dsPmWarn` (it is now called directly from
within `checkSingle` and `checkMatches`).

* Make `PmExprVar` hold a `Name` instead of an `Id`. The term oracle
does not handle type information so using `Id` was a waste of
time/space.

* Added testcases T11195, T11303b (data families) and T11374

The patch addresses at least the following:
Trac #11195, #11276, #11303, #11374, #11162

Test Plan: validate

Reviewers: goldfire, bgamari, hvr, austin

Subscribers: simonpj, thomie

Differential Revision: https://phabricator.haskell.org/D1795

comment:22 Changed 4 years ago by gkaracha

Differential Rev(s): Phab:D1716, Phab:D1719Phab:D1716, Phab:D1719, Phab:1795
Resolution: fixed
Status: newclosed
Test Case: T11303T11303, T11303b

comment:23 Changed 6 months ago by Marge Bot <ben+marge-bot@…>

In 4626cf2/ghc:

Fix Uncovered set of literal patterns

Issues #16289 and #15713 are proof that the pattern match checker did
an unsound job of estimating the value set abstraction corresponding to
the uncovered set.

The reason is that the fix from #11303 introducing `NLit` was
incomplete: The `LitCon` case desugared to `Var` rather than `LitVar`,
which would have done the necessary case splitting analogous to the
`ConVar` case.

This patch rectifies that by introducing the fresh unification variable
in `LitCon` in value abstraction position rather than pattern postition,
recording a constraint equating it to the constructor expression rather
than the literal. Fixes #16289 and #15713.
Note: See TracTickets for help on using tickets.