Opened 8 years ago

Closed 6 years ago

#5550 closed bug (fixed)

GHC infinite loop when compiling vector

Reported by: simonpj Owned by:
Priority: low Milestone: 7.6.2
Component: Compiler Version: 7.2.1
Keywords: Cc: rl, v.dijk.bas@…, hackage.haskell.org@…, tkn.akio@…, amos.robinson@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case: simplCore/should_compile/T5550
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Bas reports: When benchmarking my new vector-bytestring package I discovered that building the following program causes GHC to go into, what seems to be, an infinite loop:

import Data.Vector (Vector)
import qualified Data.Vector.Generic as VG

main = print $ VG.foldl f z (VG.fromList [] :: Vector Int)

f = flip (:)
z = []

I build it with:

$ ghc --make vectorGHCloop.hs -O2

It compiles fine without the -O2 or if you specify -fno-enable-rewrite-rules. So it's probably a loop in a rule somewhere.

Note that the program also builds fine when I change the 'f' and 'z' to:

f = (+)
z = 0

I use vector-0.9 and ghc-7.2.1.

Change History (34)

comment:1 Changed 8 years ago by simonpj

Owner: set to rleshchinskiy@…

Roman: first guess is that this is a bug in vector. Could you look?

Simon

comment:2 Changed 8 years ago by simonpj

Daniel Fischer reports:

Replicated with vector-0.7.1 and ghc-7.2.1 (Control-C'ed after six minutes). Compilation finishes (unsurprisingly) with -fno-spec-constr or with {-# NOINLINE f #-}.

It compiles fine with vector-0.7.0.1 and ghc-7.0.4.

comment:3 Changed 8 years ago by rl

Owner: changed from rleshchinskiy@… to rl

comment:4 Changed 8 years ago by rl

If -fno-spec-constr solves this then it's probably a SpecConstr bug and not related to rewrite rules. I'll take a look.

comment:5 Changed 8 years ago by rl

Owner: rl deleted

It's definitely a bug in SpecConstr. Here is a standalone example which doesn't involve vector:

import GHC.Exts ( SpecConstrAnnotation(..) )

data SPEC = SPEC | SPEC2
{-# ANN type SPEC ForceSpecConstr #-}

loop :: SPEC -> [Int] -> [Int] -> [Int]
loop SPEC z [] = z
loop SPEC z (x:xs) = loop SPEC (x:z) xs

It doesn't hang if I remove ForceSpecConstr annotation and set -fno-spec-constr-threshold -fno-spec-constr-count instead.

I'm not sure when I'll have time to investigate more so I'll unassign the ticket for now.

comment:6 Changed 8 years ago by basvandijk

Cc: v.dijk.bas@… added

comment:7 Changed 8 years ago by simonpj

Cc: rl added

Roman,

ForceSpecConstr is behaving exactly as advertised. Here are the comments:

ForceSpecConstr arguments are spotted in scExpr' and scTopBinds which then set
sc_force to True when calling specLoop. This flag does three things:
  * Ignore specConstrThreshold, to specialise functions of arbitrary size
        (see scTopBind)
  * Ignore specConstrCount, to make arbitrary numbers of specialisations
        (see specialise)
  * Specialise even for arguments that are not scrutinised in the loop
        (see argToPat; Trac #4448)

So we generate a specialisation for loop SPEC (x:z) xs. And that specialisation generates another, for loop SPEC (x':x:z) xs'. And so on. The new specialisations are generated even though no one ever scrutinises the agumrnt (condition (3)), and iwthout limit (condition (2)).

So maybe the specification of ForceSpecConstr is wrong. What would you prefer?

comment:8 Changed 8 years ago by rl

I would prefer SpecConstr to have a cleverer "does it get scrutinised" analysis which could handle #4448. But I suppose that's not easy. I'm not really sure what to do here. How about this: ForceSpecConstr turns off the scrutinise check but only for non-recursive types? It seems that would be sufficient to guarantee termination and would still handle #4448.

comment:9 Changed 8 years ago by simonpj

Actually now I look at it, the loop example says

{-# ANN type SPEC ForceSpecConstr #-}

ie it applies to SPEC, not to lists. So maybe this condition:

  * Specialise even for arguments that are not scrutinised in the loop
        (see argToPat; Trac #4448)

suould apply only for arguments of types that are specified as ForceSpecConstr. In loop, we force on SPEC but not on lists! (You implemented this, and I never looked at it in detail.)

That would cure this example. I am not sure whether it'd still make #4448 work, because there is no example code there.

Would that do the job? If so woudl you care to edit SpecConstr?

Simon

comment:10 Changed 8 years ago by rl

ForceSpecConstr applies to entire functions, not to individual arguments. That is, if a function has a ForceSpecConstr argument it will get specialised. That's the only sane way for it to work (we really want to mark functions but giving them a special argument is the least fragile way of doing so).

#4448 has example code right on top!

comment:11 Changed 8 years ago by simonpj

But #4448 has no ForceSpecConstr annotation. I don't know whether it applies to the pair or to the Left/Right type.

My proposal is precisly that condition (3) should be type-specific not function-specific. The other two can operate as now.

comment:12 Changed 8 years ago by rl

Here is what the #4448 example would look like with ForceSpecConstr:

import GHC.Exts ( SpecConstrAnnotation(..) )

data SPEC = SPEC | SPEC2
{-# ANN type SPEC ForceSpecConstr #-}

foo :: Int -> Int -> Int -> (Int,Int)
foo !a !b !c = loop SPEC (Left (a,b))
  where
    loop :: SPEC -> Either (Int,Int) (Int, (Int,Int)) -> (Int,Int)
    loop SPEC (Left (!x,!n))
      | n > 0     = loop SPEC (Right (x,(x+c,n-1)))
      | otherwise = (x,n)

    loop SPEC (Right (i,t))
      | i < c     = loop SPEC (Right (i+1,t))
      | otherwise = loop SPEC (Left t)

I think there is a misunderstanding here. Specialising on SPEC isn't the problem, that happens anyway. It's specialising on other arguments that's problematic. I also suggest a type-specific approach: only turn off condition (3) for non-recursive types.

Perhaps you mean that we should use ForceSpecConstr to mark individual arguments? That doesn't work for a variety of reasons.

comment:13 Changed 8 years ago by simonpj

I'm suggesting this:

  • everythig exactly as now except that (3) only applies to arguments whose type are ForceSpecConstr'd.

In the #4448 code you give, that'd mean you'd have to ForceSpecConstr the Left/Right type. Or, maybe there could be a different annotation for that, meaning "if you see a function applied to me, ignore the "must be scrutinised" test.

comment:14 Changed 8 years ago by rl

I fear there is still a misunderstanding. We don't mark any "real" types as ForceSpecConstr. There is exactly 1 type in vector that's marked as ForceSpecConstr, namely SPEC. We even agreed to get rid of the annotation altogether and just put that type into base but I didn't get around to doing that. We can't mark the types that we actually want to specialise on because they are supplied by the user and might be defined by the user. Nor can we mark the types that we don't want to specialise on, for the same reasons. Really, ForceSpecConstr is an annotation on functions, not on types. The fact that it is applied to a type is an artefact of how the simplifier works - we annotate one very special type and then use it as an additional function argument rather than annotating the function directly because annotating the function would be extremely fragile whereas having a special argument isn't (yet).

comment:15 Changed 8 years ago by igloo

Milestone: 7.4.1

comment:16 Changed 8 years ago by simonpj

I understand the current behaviour. I am proposing a different behaviour!

The bit I didn't realise is this: "We can't mark the types that we actually want to specialise on because they are supplied by the user and might be defined by the user". I thought that the loops we wanted to specialise were stream-fusion loops; and they are all generated by library code; and their State types are all known to the library code. So could we not annotate these types as ones we want to specialise away; as I understand it, if the loop state type is (say) Either we'd like to generate two mutually recursive functions, one for Left states and one for Right.

(Admittedly with my proposal you could not use Either; you'd have to declare a new type so that you could annotate it to be specialised away.)

Would that not work? It would have the advantage of avoiding the need for the SPEC argument altogether. I'm sure there's a fatal flaw, but I'm not seeing it yet.

Simon

comment:17 Changed 8 years ago by rl

There are two flaws :-) Firstly, consider a fold or a scan. The accumulator becomes part of the state and that accumulator is supplied by the user but we still want to specialise on it. Functions such as last also have user-defined types as part of the state.

Secondly, you are quite right that we couldn't use Either if we were to mark State types as ForceSpecConstr. Nor could we use Int, Bool, tuples and so on. We'd have to define separate versions of all those types just to be able to use them in loop states. We used to do something vaguely similar in DPH (before vector and the SpecConstr improvements) and it really didn't work well.

comment:18 Changed 8 years ago by rl

How about a very simple heuristic. When we find a specialisation, we look at each argument separately and turn on the "must be scrutinised" check for it if we find duplicate nested constructors. By this, I mean that the argument contains a subterm C @t1 ... @tm x1 ... xn where C is a constructor and one of x1 ... xn contains a subterm C @t1 ... @tm y1 ... yn which uses the same constructor at the same types. So with my example, when we get to loop SPEC (x:y:z) xs we would spot x:y:z and turn on the "must be scrutinised" check for this argument. But in foo SPEC (x,(y,z)), the nested tuple constructors would have different types and wouldn't trigger this.

Having written this down, I realise that this might not be all that simple to implement. But it does seem to avoid loops and doesn't rely on checking for recursive types.

comment:19 Changed 8 years ago by igloo

Milestone: 7.4.17.6.1
Priority: normallow

comment:20 Changed 8 years ago by liyang

Cc: hackage.haskell.org@… added

comment:21 Changed 7 years ago by igloo

Milestone: 7.6.17.6.2

comment:22 Changed 7 years ago by basvandijk

My colleague just came to me with the question why his shell-escape package didn't stop compiling when building with -O2. When he mentioned he was using vector I remembered this bug and after reading through the code I noticed he was using foldl'.

Roman, can we identify which uses of foldl' result in infinite compilation times? If so, a note about this should probably be added to the documentation to warn users not to use this function since GHC will crash on it.

Here are just some quick experiments to see when GHC crashes:

ok1 = let f x y = x+y
          z = 0
      in V.foldl' f z

ok2 = let f (i,j) y = (i+y, j+y)
          z = (0,0)
      in V.foldl' f z

ok3 = let f Nothing y = Just y
          f (Just x) y = Just (x+y)
          z = Nothing
      in V.foldl' f z

crash1 = let f x y = y:x
             z = []
         in V.foldl' f z

crash2 = let f (i,j) y = (y:i, j+y)
             z = ([],0)
         in V.foldl' f z

It seems that uses of a list anywhere in the accumulator result in crashes. But this probably also happens on other types.

comment:23 Changed 7 years ago by simonpj

difficulty: Unknown

Reading the discussion here, I think the difficulty here is that:

  • vector makes agressive use of the SpecConstr optimisation pass, and in particular the rather unsatisfactory ForceSpecConstr annotation, to achieve a particular effect
  • SpecConstr is behaving as advertised, and that leads to an infinite compile-time loop.
  • None of us knows a clean way to achieve what Roman wants

Rather than have clients of vector fail to compile (plainly bad), perhaps it'd be better to make vector a bit less ambitious in it use of SpecConstr, until someone can figure out a clean way to do what Roman wants.

One step forward would be a small complete example that crystallises the exact problem.

comment:24 Changed 7 years ago by simonpj

See also #7553, #7555.

comment:25 Changed 7 years ago by simonpj

Roman, we are stalled on this one. It is some interaction of the vector library with SpecConstr. Might you investigate, and suggest how to progress? We have three separate tickets about it, so it's clearly getting in the way of vector being widely used.

Thanks

Simon

comment:26 Changed 7 years ago by akio

Cc: tkn.akio@… added

comment:27 Changed 7 years ago by amos.robinson@…

commit 81d55a9ec28d9d7c8b1492516ebd58c5ff90c0e8

Author: Amos Robinson <amos.robinson@gmail.com>
Date:   Thu Mar 28 12:37:42 2013 +1100

    Fix non-termination of SpecConstr (see #5550).
    ForceSpecConstr will now only specialise recursive types a finite number of times.
    There is a new option -fspec-constr-recursive, with a default value of 3.

 compiler/main/DynFlags.hs          |    4 ++
 compiler/specialise/SpecConstr.lhs |   64 ++++++++++++++++++++++++++----------
 2 files changed, 50 insertions(+), 18 deletions(-)

comment:28 Changed 7 years ago by amosrobinson

Status: newmerge

I have implemented Roman's suggestion of having an upper bound for recursive types. So in this example:

loop :: SPEC -> [Int] -> [Int] -> [Int]
loop SPEC z [] = z
loop SPEC z (x:xs) = loop SPEC (x:z) xs

The only good call pattern is loop SPEC (_:_) _ but once that is specialised, there's another call pattern loop SPEC (_:(_:_)) _ and so on.

So we just find the maximum number of recursive constructors in the arguments, and if it's greater than the limit (default 3) we discard the pattern.

I only check the recursive count if ForceSpecConstr is on for the current function. This doesn't really matter, but if ForceSpecConstr isn't on it should terminate anyway because of the max count.

Does this look OK to you, Roman?

https://github.com/ghc/ghc/commit/81d55a9ec28d9d7c8b1492516ebd58c5ff90c0e8

On Fri, Mar 8, 2013 at 7:20 PM, Roman Leshchinskiy <rl@…> wrote:

If you want to prevent infinite specialisation, then IMO the way to do it would be by not specialising more than a few times on recursive types but still specialising on tuples, Maybe, Either and friends as much as possible. That would guarantee termination without imposing an artificial bound.

comment:29 Changed 7 years ago by amosrobinson

Cc: amos.robinson@… added

comment:30 Changed 7 years ago by simonpj

Test Case: simplCore/should_compile/T5550

comment:31 Changed 6 years ago by carter

I hit this bug recently,

see

https://github.com/mokus0/random-fu/issues/13#issuecomment-20546431

step to repro: ghc 7.6.3, and using cabal / cabal-install head modify cabal settings to have "optimization: 2" in the ~/.cabal/config file (so libraries are built with -O2)

then when i try to install random-fu, ghc gets stuck on specconstr as this ticket discusses

comment:32 Changed 6 years ago by carter

Is this fixed in head?

comment:33 Changed 6 years ago by amosrobinson

Sorry for the late reply, I've been distracted.

Yep, it's fixed in head. I'm guessing it will make it into the next release

comment:34 Changed 6 years ago by thoughtpolice

Resolution: fixed
Status: mergeclosed

There's not going to be a 7.6.4, so this is now closed.

Note: See TracTickets for help on using tickets.