Opened 23 months ago

Closed 23 months ago

Last modified 23 months ago

#14521 closed bug (invalid)

Infinite loop at runtime

Reported by: OlivierSohn Owned by:
Priority: high Milestone: 8.2.3
Component: Compiler Version: 8.2.2
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by OlivierSohn)

Hello,

For this program : https://github.com/OlivierSohn/hamazed/tree/repro-ghc-14521-A

The expected output is:

Before rendering animations
animation is rendered
After rendering animations

What I observe is when compiling with ghc 8.0.2 or 8.2.2, with optimizations:

Before rendering animations

I found several ways to circumvent the bug, and documented each one in the source code.

Can someone take a look? Is my program an invalid program that was not detected by the compiler?

Thank you, Olivier

Change History (23)

comment:1 Changed 23 months ago by OlivierSohn

Description: modified (diff)
Summary: Infinite loop at runtime when a given function is not marked INLINEInfinite loop at runtime when either : a given function is not marked INLINE, or functions are stored in strict field

comment:2 Changed 23 months ago by OlivierSohn

Description: modified (diff)

comment:3 Changed 23 months ago by simonpj

Yes, it could be a compiler bug, but a more self-contained repro case would be far far easier to deal with. Thank you!

Does it happen with 8.2?

comment:4 in reply to:  3 ; Changed 23 months ago by OlivierSohn

Replying to simonpj:

Yes, it could be a compiler bug, but a more self-contained repro case would be far far easier to deal with. Thank you!

Does it happen with 8.2?

Ok, I'll try to reduce the problem to a minimal amount of code. I was waiting to be sure that it was not a problem in my code before digging a little deeper.

I don't know if it's a bug fixed by 8.2, I'll try that too. First I need to understand how to build using ghc directly instead of relying on stack, because latest stack version references 8.0.2 :)

comment:5 Changed 23 months ago by bgamari

Description: modified (diff)
Milestone: 8.2.3

For what it's worth the repo build for me with 8.2.1,

$ git clone ​https://github.com/OlivierSohn/hamazed
$ cd hamazed
$ git checkout 9f25223ef0502f91cd9633654bdb172f714c3920
$ git clone https://github.com/OlivierSohn/ansi-terminal.git
$ git -C ansi-terminal checkout e6a2b1ff2e1aebd902c1791583b80ef481f370c5
$ echo "packages: ., ansi-terminal" >> cabal.project
$ cabal new-build all --allow-newer

Unfortunately I'm quite lost regarding the game itself. I found that pressing "Enter" makes numbers start flying about; however the s, e, d, f keys don't appear to do anything. This appears to be the case with both 8.2.2 and 8.0.2.

comment:6 in reply to:  5 Changed 23 months ago by OlivierSohn

Replying to bgamari:

For what it's worth the repo build for me with 8.2.1,

$ git clone ​https://github.com/OlivierSohn/hamazed
$ cd hamazed
$ git checkout 9f25223ef0502f91cd9633654bdb172f714c3920
$ git clone https://github.com/OlivierSohn/ansi-terminal.git
$ git -C ansi-terminal checkout e6a2b1ff2e1aebd902c1791583b80ef481f370c5
$ echo "packages: ., ansi-terminal" >> cabal.project
$ cabal new-build all --allow-newer

Unfortunately I'm quite lost regarding the game itself. I found that pressing "Enter" makes numbers start flying about; however the s, e, d, f keys don't appear to do anything. This appears to be the case with both 8.2.2 and 8.0.2.

@bgamari, I suppose you are on windows? On windows there is a bug which makes it impossible to timeout on IO (explains why the game is blocked), plus setting stdin buffering mode to "unbuffered" is not possible (I explain a bit more in BACKLOG.md file, in "Windows" chapter), this explains the behaviour you see: when pressing Enter, stdin is flushed and the game advances. I tried to circumvent this problem but didn't finish the fix.

In the meantime I made a program that allows to reproduce the behaviour with minimal code : it is branch https://github.com/OlivierSohn/hamazed/tree/repro-ghc-14521-A

When running the program, the expected output is:

Before rendering animations
animation is rendered
After rendering animations

What I see with 8.0.2 when compiled with optimizations is:

Before rendering animations

I also documented in the code what changes make the problem disappear

I'd be interested if someone can build with optimizations with 8.2.2 and run the program and report on the output ?

Last edited 23 months ago by OlivierSohn (previous) (diff)

comment:7 Changed 23 months ago by OlivierSohn

Description: modified (diff)

comment:8 in reply to:  4 ; Changed 23 months ago by svenpanne

Replying to OlivierSohn:

[...] First I need to understand how to build using ghc directly instead of relying on stack, because latest stack version references 8.0.2 :)

Just FYI: Stackage nightly builds use 8.2.1/8.2.2 for some time now, just use stack --resolver=nightly ... or even more specifically e.g. stack --resolver=nightly-2017-11-25 ....

comment:9 in reply to:  8 Changed 23 months ago by OlivierSohn

Replying to svenpanne:

Just FYI: Stackage nightly builds use 8.2.1/8.2.2 for some time now, just use stack --resolver=nightly ... or even more specifically e.g. stack --resolver=nightly-2017-11-25 ....

Thanks, using the nightly resolver I could verify that the behaviour is the same on 8.2.2:

2017-11-25 21:01:48.402994: [debug] Run process: /Users/Olivier/Dev/hs.maze/.stack-work/install/x86_64-osx/nightly-2017-11-25/8.2.2/bin/hamazed-exe
@(Stack/Exec.hs:67:5)
Before rendering animations

comment:10 Changed 23 months ago by OlivierSohn

Description: modified (diff)
Version: 8.0.28.2.2

comment:11 in reply to:  3 Changed 23 months ago by OlivierSohn

Replying to simonpj:

Yes, it could be a compiler bug, but a more self-contained repro case would be far far easier to deal with. Thank you!

Does it happen with 8.2?

I verified that it happens also with 8.2.2, and made a minimal repro case in this branch : https://github.com/OlivierSohn/hamazed/tree/repro-ghc-14521-A

It can be built and run with this command:

stack build --pedantic --resolver=nightly --install-ghc&& stack exec hamazed-exe --resolver=nightly -v

comment:12 Changed 23 months ago by OlivierSohn

Priority: normalhigh

comment:13 Changed 23 months ago by OlivierSohn

I raised the priority as this bug prevents me from refactoring code because I need constantly to make sure that the bug is not happening again.

Last edited 23 months ago by OlivierSohn (previous) (diff)

comment:14 Changed 23 months ago by mpickering

That looks like a good minimal example but you still have lots of unnecessary dependencies in the cabal file. Unless the project can be compiled with a single invocation of ghc or cabal then it is unlikely someone will make quick progress.

comment:15 Changed 23 months ago by OlivierSohn

@mpickering Thanks for the feedback, so now I removed all dependencies from the cabal file.

I also found out that if the Animation module is entirely exported :

module Animation where

It fixes the problem!

I added this observation in the comments.

Here is the branch : https://github.com/OlivierSohn/hamazed/tree/repro-ghc-14521-A

comment:16 Changed 23 months ago by OlivierSohn

Summary: Infinite loop at runtime when either : a given function is not marked INLINE, or functions are stored in strict fieldInfinite loop at runtime

I rename the title because there are actually more ways to circumvent the problem, which are all documented in the code.

comment:17 Changed 23 months ago by OlivierSohn

Description: modified (diff)

I update the description to reference the branch of the repro case and not a particular commit, and to be more concise.

comment:18 Changed 23 months ago by OlivierSohn

Operating System: MacOS XUnknown/Multiple

I verified I can reproduce it on OS X and Windows

comment:19 Changed 23 months ago by bgamari

For the record, I am seeing the program loop in Animation_zdwanimatedNumber_info.

Indeed looking at the simplified Core, the "bad" configuration compiles to

-- RHS size: {terms: 3, types: 1, coercions: 0, joins: 0/0}
Animation.$wanimatedNumber [InlPrag=[0], Occ=LoopBreaker]
  :: GHC.Prim.Int# -> Tree -> Animation -> IO (Maybe Animation)
[GblId, Arity=1, Caf=NoCafRefs, Str=<B,1*U>b]
Animation.$wanimatedNumber
  = \ (ww_s3v3 :: GHC.Prim.Int#) ->
      Animation.$wanimatedNumber ww_s3v3
end Rec }

Bad news bears.

comment:20 Changed 23 months ago by bgamari

Resolution: invalid
Status: newclosed

So I'm not certain why the program runs successfully without optimization, but it seems to me like the program should indeed loop. The gist of the program (eliminating some irrelevant bits) is

data Animator a = Animator {
  _animatorIO   :: !(Tree -> Animation  -> IO (Maybe Animation))
}

mkAnimator :: (t -> Tree -> Animation -> IO (Maybe Animation))
           -> t
           -> Animator a
mkAnimator io_ params = Animator (io_ params)


animatedNumber :: Int -> Tree -> Animation  -> IO (Maybe Animation)
animatedNumber n =
  animate' (mkAnimator animatedNumber n)

After inlining mkAnimator into animatedNumber and expressing the constructor's strictness explicitly we have,

animatedNumber :: Int -> Tree -> Animation  -> IO (Maybe Animation)
animatedNumber n =
  animate' (let x = animatedNumber n
            in x `seq` Animator x)

Consequently if animate' demands its argument the program will loop. This function is defined as,

animate' :: Animator a -> Tree -> Animation  -> IO (Maybe Animation)
animate' (Animator pure_ io_) = animate pure_ io_

Which indeed does demand its argument. Moreover, this is consistent with your observation that adding an INLINE on animate' eliminates the loop (since in the real program Animator contains two fields, only the first, which I elided above, is needed).

Given this, it seems like looping is a completely valid compilation of this program. I would remove the bang from the constructor if laziness is needed (which it seems it is).

Last edited 23 months ago by bgamari (previous) (diff)

comment:21 Changed 23 months ago by simonpj

I took a quick look. To me it looks as if it should definitely loop. Look at the code

animatedNumber n = animate' (mkAnimator animateNumberPure animatedNumber n)

mkAnimator pure_ io_ params = Animator (applyAnimation (pure_ params)) (io_ params)

animate' (Animator pure_ io_) = animate pure_ io_

If we just inline mkAnimator we see:

animatedNumber n = animate' (Animator blah (animatedNumber n))

Now

  • Aminator is declared to be strict in its second argument
  • animate' is certainly strict

So of course animatedNumber n will just return bottom.

The mystery is not why it diverges -- it should! -- but rather why it sometimes fails to diverge. The answer is that GHC allows itself a bit of leeway in making a divergent program into one that converges, in the interests of improving runtimes generally by allowing more optimisations. You can switch off this behaviour by using -fpedantic-bottoms, and then sure enough it always diverges.

In this case the relevant optimisation is eta-expansion. If you do manual eta-expansion on animate', then it converges, always:

animate' :: Animator a -> Tree -> Animation  -> IO (Maybe Animation)
animate' (Animator pure_ io_) t a = animate pure_ io_ t a

So I claim this is not a bug.

Last edited 23 months ago by simonpj (previous) (diff)

comment:22 in reply to:  21 Changed 23 months ago by OlivierSohn

Replying to simonpj:

In this case the relevant optimisation is eta-expansion. If you do manual eta-expansion on animate', then it converges, always:

animate' :: Animator a -> Tree -> Animation  -> IO (Maybe Animation)
animate' (Animator pure_ io_) t a = animate pure_ io_ t a

Thank you for your analysis, I think I understand why making the record field non-strict fixes the problem, but I don't understand why the eta-expansion also fixes it, could you explain me or point me to a resource explaining this point?

comment:23 in reply to:  20 Changed 23 months ago by OlivierSohn

Replying to bgamari: Thanks for your explanations! I'm not used to think in term of strict / lazy evaluation so I'll need to re-read them a few times until I understand everything :) As a default I was told to use strict fields to avoid accumulating thunks, and it's the first time I encounter a problem where I actually need lazyness.

Note: See TracTickets for help on using tickets.