Opened 6 years ago

Last modified 12 months ago

#8316 new bug

GHCi debugger panics when trying to force a certain variable

Reported by: guest Owned by:
Priority: normal Milestone:
Component: GHCi Version: 7.6.3
Keywords: debugger, newcomer Cc: hvr
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: GHCi crash Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D4535, Phab:D5179
Wiki Page:

Description (last modified by osa1)

The file Test.hs has the following definition:

foo :: [Int]
foo = [1..]

Calling ghci as:

ghci Test.hs -ignore-dot-ghci

and debugging foo like this:

*Main> :break foo
Breakpoint 0 activated at main.hs:2:7-11
*Main> foo
Stopped in, main.hs:2:7-11
_result :: [Int] = _
[main.hs:2:7-11] *Main> :print foo
foo = (_t1::[Int])
[main.hs:2:7-11] *Main> _t1

results in this panic:

<interactive>: internal error: TSO object entered!
    (GHC version 8.5.20180302 for x86_64_unknown_linux)
    Please report this as a GHC bug:
[1]    5445 abort (core dumped)  ghci Test.hs -ignore-dot-ghci

Attachments (1)

Test.hs (74 bytes) - added by guest 6 years ago.
The Test.hs file containing the definitions.

Change History (26)

Changed 6 years ago by guest

Attachment: Test.hs added

The Test.hs file containing the definitions.

comment:1 Changed 6 years ago by goldfire

I can confirm this behavior on MacOS 10.8.4 (x86_64) and with HEAD as of 27 Aug.

comment:2 Changed 5 years ago by thomie

Cc: hvr added

Confirmed on Linux x86_64 with ghc-7.9.20141115.

comment:3 Changed 4 years ago by Ben Gamari <ben@…>

In a61e717f/ghc:

testsuite: Add testcase for #8316

This is still broken but really ought to be fixed. At least now we'll
know if someone fixes it inadvertently.

Test Plan: validate

Reviewers: austin

Reviewed By: austin

Subscribers: thomie

Differential Revision:

GHC Trac Issues: #8316

comment:4 Changed 4 years ago by rwbarton

Priority: normal → high

comment:5 Changed 21 months ago by osa1

Confirmed on 8.4.1 and HEAD (8.5.20180308).

comment:6 Changed 21 months ago by RyanGlScott

Keywords: debugger added

comment:7 Changed 21 months ago by osa1

I've been looking into this, here's what I found out so far:

  • Because we start evaluating foo before hitting the breakpoint, by the time we return to the GHCi prompt foo points to a blackhole.
  • Once we stop at the breakpoint we do :print foo, pprintClosureCommand calls bindSuspensions with the id foo.
  • bindSuspensions invents a new name _t1 and binds it to the thunk that is foo, via RtClosureInspect.cvObtainTerm.
  • cvObtainTerm looks at the heap object pointed to by foo, which is a blackhole, and follows the indirectee pointer. It turns out the indirectee is a TSO object. At this point _t1 becomes bound to a TSO object, and evaluating it (e.g. with print _t1) causes this crash because TSO objects can't be entered.

I tried modifying cvObtainTerm so that it doesn't follow the indirectee pointer when it sees a blackhole. That way we bind _t1 to the blackhole object instead of the TSO object pointed to by the indirectee field, but that caused a deadlock in the scheduler. I don't understand why yet.

simonmar, could you advise? Does the story make sense so far?

comment:8 Changed 21 months ago by osa1

Simon, I implemented changes in cvObtainTerm as discussed yesterday, but I'm still getting "TSO object entered" errors.

Previously cvObtainTerm followed the indirectees of BLACKHOLEs no matter what.

With my changes I only follow the indirectees when they're not TSO or BLOCKING_QUEUE. Somehow with this I still get "TSO object entered".

If I don't follow BLACKHOLE indirectees at all (and bind a BLACKHOLE to _t1 in this reproducer) then I get a deadlock as expected. Do you have any ideas on why this may be happening?

comment:9 Changed 21 months ago by Nolan

Description: modified (diff)
Owner: set to Nolan

I simplified the example. The whnf function is not required to trigger this bug. You may also notice that the execution context (i.e. [main.hs:2:7-11]) is printed twice, which looks like a bug and probably deserves its own ticket.

comment:10 Changed 21 months ago by Nolan

Owner: Nolan deleted

comment:11 Changed 21 months ago by osa1

Thanks for the simplification. Which GHC version did you use to try this? I don't see the prompt printed twice on 8.4.1 and HEAD.

Last edited 21 months ago by osa1

comment:12 in reply to:  11 Changed 21 months ago by Nolan

I use the most recent version from the git repository (8.5.20180323). I've just built it.

Last edited 21 months ago by Nolan

comment:13 Changed 21 months ago by osa1

Thanks, it turns out I have to use the default prompt to reproduce. Reported as #14973.

comment:14 Changed 21 months ago by osa1

Differential Rev(s): Phab:D4535

I created a differential with an implementation of the idea in comment:8. The patch seems to do the right thing, but there's probably another bug somewhere else, so I'm still getting "TSO entered" errors.

comment:15 Changed 19 months ago by Nolan

Description: modified (diff)

The bug with the context being printed twice was fixed.

comment:16 Changed 15 months ago by osa1

Ah, looking at this ticket again, I can see what I missed last time (in Phab:D4535).

The problem is Phab:D4535 does not have any effect because TSO and BLOCKING_QUEUE are already not handled by cvObtainTerm.go and cvObtainTerm returns a Suspension when it finds one of those objects. So even if we follow a BLACKHOLE that points to a TSO we return a Suspension. In Phab:D4535 we returned Suspension slightly earlier (before following the indirectee), but the value we returned was identical to the value we returned without the patch.

What we should do is this: if we see a BLACKHOLE pointing to a TSO or BLOCKING_QUEUE, we should return a Suspension with the BLACKHOLE itself as the hval (currently the hval is the indirectee).
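
To make the difference between the three behaviours concrete, here is a toy model. It is only a sketch: the Closure, Term and Strategy types and the obtain function are invented for illustration and are not the real RtClosureInspect code. It shows that stopping early (as in Phab:D4535) yields the same Suspension as following the indirectee, while keeping the BLACKHOLE as the hval does not.

```haskell
import Data.Maybe (fromMaybe)

-- Toy heap model; all names here are hypothetical, not GHC's API.
type Addr = Int

data Closure
  = Thunk            -- unevaluated, nobody is working on it
  | Value Int        -- fully evaluated
  | Blackhole Addr   -- field is the indirectee
  | TSO              -- thread that owns a blackhole
  | BlockingQueue    -- threads waiting on a blackhole
  deriving (Eq, Show)

type Heap = [(Addr, Closure)]

data Term
  = Evaluated Int
  | Suspension Addr  -- hval: the address that gets entered on evaluation
  deriving (Eq, Show)

data Strategy
  = FollowAlways   -- behaviour before any patch
  | StopEarly      -- Phab:D4535 as described in this comment
  | KeepBlackhole  -- the proposal in this comment
  deriving (Eq, Show)

closureAt :: Heap -> Addr -> Closure
closureAt heap a =
  fromMaybe (error ("dangling address " ++ show a)) (lookup a heap)

isOwnerOrQueue :: Closure -> Bool
isOwnerOrQueue TSO           = True
isOwnerOrQueue BlockingQueue = True
isOwnerOrQueue _             = False

obtain :: Strategy -> Heap -> Addr -> Term
obtain s heap a = case closureAt heap a of
  Value n -> Evaluated n
  Blackhole ind
    | s /= FollowAlways, isOwnerOrQueue (closureAt heap ind) ->
        Suspension (if s == KeepBlackhole then a else ind)
    | otherwise -> obtain s heap ind
  _ -> Suspension a  -- Thunk, TSO and BlockingQueue are not inspected further

exampleHeap :: Heap
exampleHeap =
  [ (1, Blackhole 2)  -- foo, under evaluation by the stopped thread
  , (2, TSO)          -- the owner of that blackhole
  , (3, Blackhole 4)  -- an evaluation that already finished
  , (4, Value 42)
  ]

main :: IO ()
main = do
  print (obtain FollowAlways  exampleHeap 1)  -- Suspension 2 (points at the TSO)
  print (obtain StopEarly     exampleHeap 1)  -- Suspension 2 (same value, no effect)
  print (obtain KeepBlackhole exampleHeap 1)  -- Suspension 1 (the BLACKHOLE itself)
  print (obtain KeepBlackhole exampleHeap 3)  -- Evaluated 42
```

In this model only KeepBlackhole ever hands back an address that is safe to enter; the other two hand back the address of the TSO.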

However, I suspect entering the BLACKHOLE will result in a deadlock. The thread that's supposed to evaluate the expression (i.e. the owner) is blocked on an MVar (the breakpoint MVar passed to GHCi.Run.withBreakAction). When we enter the BLACKHOLE our thread gets parked, to be unparked by the owner of the BLACKHOLE, which never happens because we don't update the MVar before entering the BLACKHOLE.
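
The shape of this deadlock can be imitated with two plain MVars. This is an analogy under stated assumptions, not GHCi's code: breakMVar stands in for the breakpoint MVar, result stands in for the blackholed thunk, and demo is a hypothetical name. A timeout is used so that the example terminates instead of hanging.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar, takeMVar)
import System.Timeout (timeout)

-- Returns what the "GHCi" thread sees before and after the evaluator is resumed.
demo :: IO (Maybe Int, Maybe Int)
demo = do
  breakMVar <- newEmptyMVar :: IO (MVar ())   -- assumption: models the breakpoint MVar
  result    <- newEmptyMVar :: IO (MVar Int)  -- assumption: models the blackholed thunk

  -- "Evaluator" thread: stopped at the breakpoint until breakMVar is filled,
  -- only then does it finish the computation.
  _ <- forkIO $ do
    takeMVar breakMVar
    putMVar result 42

  -- "GHCi" thread: waiting for the value while the evaluator is parked.
  -- Without the timeout this wait would never return.
  whileStopped <- timeout 200000 (readMVar result)

  -- Resuming the evaluator lets the same wait succeed.
  putMVar breakMVar ()
  afterResume <- timeout 1000000 (readMVar result)

  pure (whileStopped, afterResume)

main :: IO ()
main = demo >>= print
```

The first component is Nothing because the only thread that could produce the value is itself waiting on breakMVar, which mirrors the situation described above.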

comment:17 Changed 15 months ago by osa1

Differential Rev(s): Phab:D4535 → Phab:D4535, Phab:D5179

I submitted a diff. As expected, it causes a deadlock in the reproducer.

Simon, any ideas on how to fix the deadlock? Would it be possible to resume the evaluator thread (by updating breakMVar) before entering a BLACKHOLE?

comment:18 Changed 15 months ago by simonmar

I don't see a good way to solve this. The thread that is evaluating foo is stopped at a breakpoint - that's what the user asked for, so it's not entirely surprising that if they evaluate something that requires foo then it deadlocks.

What would we like to happen? I can think of a couple of alternatives:

1. Just make it work

Should it automatically continue evaluation of foo? How would you know when to do that? Evaluating a BLACKHOLE doesn't necessarily mean that we're about to deadlock, we might be evaluating something that another thread is evaluating. As soon as we release the breakMVar the thread will continue evaluating foo, but I don't know of a way to tell whether/when we should do that.

Perhaps instead of the MVar, a breakpoint should be an asynchronous exception so that we end up with a thunk that we could poke to continue evaluation? That would make this work, but it would mean a big change to the way breakpoints work and I'm not sure whether it would run into other problems. One potential problem is that it's a lot more expensive than the current breakpoint mechanism, so :trace wouldn't work so well.

2. Make it an error of some kind


[main.hs:2:7-11] *Main> _t1
*** Exception: blocked on breakpoint 1 

The question is how to achieve that. Perhaps we periodically monitor the thread we just created to do the evaluation and check whether it's blocked on a blackhole, and then compare the owner of the blackhole it is blocked on against all the threads we know are currently at breakpoints? That could possibly work, but it's tricky to implement.
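
A rough sketch of the monitoring idea follows. It is not a proposed patch: Verdict, blockedOnBlackHole, evalWithWatchdog and the anyThreadAtBreakpoint argument are hypothetical names, and the poll interval is arbitrary. Only threadStatus and its BlockedOnBlackHole reason come from GHC.Conc. As far as I know, base does not expose the owner of a blackhole, so the sketch uses a coarser test than the one described above: "the evaluation thread is blocked on some blackhole while some thread is stopped at a breakpoint".

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, tryReadMVar)
import Control.Exception (SomeException, throwIO, try)
import GHC.Conc (BlockReason (BlockedOnBlackHole), ThreadStatus (ThreadBlocked), threadStatus)

data Verdict a = Done a | BlockedOnBreakpoint
  deriving (Eq, Show)

-- True when the runtime reports the thread as parked on a BLACKHOLE.
blockedOnBlackHole :: ThreadId -> IO Bool
blockedOnBlackHole tid = do
  st <- threadStatus tid
  pure $ case st of
    ThreadBlocked BlockedOnBlackHole -> True
    _                                -> False

-- Run an evaluation in its own thread and poll it. The first argument is a
-- hypothetical query answering "is any thread currently stopped at a breakpoint?".
evalWithWatchdog :: IO Bool -> IO a -> IO (Verdict a)
evalWithWatchdog anyThreadAtBreakpoint action = do
  box <- newEmptyMVar
  tid <- forkIO (try action >>= putMVar box)
  let loop = do
        finished <- tryReadMVar box
        case finished of
          Just (Right x) -> pure (Done x)
          Just (Left e)  -> throwIO (e :: SomeException)
          Nothing -> do
            stuck   <- blockedOnBlackHole tid
            stopped <- anyThreadAtBreakpoint
            if stuck && stopped
              then killThread tid >> pure BlockedOnBreakpoint
              else threadDelay 10000 >> loop
  loop

main :: IO ()
main = do
  -- An evaluation that is not blocked simply finishes.
  v <- evalWithWatchdog (pure True) (pure (sum [1 .. 10 :: Int]))
  print v  -- Done 55
```

Because the owner is not compared, this version could report BlockedOnBreakpoint for a thunk that an unrelated running thread is about to finish; that is exactly the ambiguity raised under alternative 1.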

comment:19 Changed 15 months ago by osa1

Here's another example that deadlocks even with GHC 8.6:

foo = 0 : bar
bar = 1 : foo

in GHCi:

GHCi, version 8.6.1:  :? for help
:Loaded GHCi configuration from /home/omer/rcbackup/.ghci
[1 of 1] Compiling Main             ( test.hs, interpreted )
Ok, one module loaded.
λ:1> :break foo
Breakpoint 0 activated at test.hs:1:7-13
λ:2> foo
Stopped in, test.hs:1:7-13
_result :: [Integer] = _
[test.hs:1:7-13] λ:3> :print bar
bar = (_t1::[Integer])
[test.hs:1:7-13] λ:4> _t1

The reason we don't get "TSO entered" here is that _t1 stands for bar, and bar itself is not locked by the evaluator thread. Instead, an object in bar's payload is owned by it.

I think this shows that even if we could somehow release the MVar in the original reproducer there will be deadlocks.

I think we should:

  • Merge the patch. We should never enter a TSO or BLOCKING_QUEUE.
  • Document this behavior in the user manual
  • Disallow evaluating BLACKHOLEs in GHCi

The last step would fix the original reproducer, but my example above will still deadlock and that's what you get for having lazy evaluation.

comment:20 Changed 15 months ago by osa1

Description: modified (diff)
Summary: GHCi debugger segfaults when trying to force a certain variable → GHCi debugger panics when trying to force a certain variable

comment:21 Changed 14 months ago by Ben Gamari <ben@…>

In 45ed461/ghc:

Fix BLACKHOLE inspection in RtClosureInspect

When inspecting a BLACKHOLE, if the BLACKHOLE points to a TSO or a
BLOCKING_QUEUE we should return a suspension to the BLACKHOLE itself
(instead of returning a suspension to the indirectee). The reason is
that in the debugger, when we want to evaluate this term, we need to
enter the BLACKHOLE and not the TSO or BLOCKING_QUEUE. See the
runtime panic caused by this in #8316.

Note that while with this patch we do the right thing to evaluate
thunks in GHCi, evaluating thunks that are owned by the evaluator thread
in a breakpoint will cause a deadlock as we don't release the breakMVar,
which is what blocks the evaluator thread from continuing with
evaluation. So the GHCi thread will enter the BLACKHOLE, but the owner of
the BLACKHOLE is also blocked.

Reviewers: simonmar, hvr, bgamari

Reviewed By: bgamari

Subscribers: rwbarton, carter

GHC Trac Issues: #8316

Differential Revision:

comment:22 Changed 14 months ago by bgamari

Milestone: 8.8.1
Resolution: fixed
Status: new → closed

comment:23 Changed 14 months ago by osa1

Should this really be closed? I thought we wanted to implement the rest of the ideas in comment:19 too.

comment:24 Changed 14 months ago by bgamari

Resolution: fixed deleted
Status: closed → new

I left a comment drawing attention to this on Phab:D5179 before merging, but yes, you are right: we should wait to close this until all of these tasks are done.

comment:25 Changed 12 months ago by osa1

Keywords: newcomer added
Milestone: 8.8.1 deleted
Priority: high → normal

The panic has been fixed. The rest of comment:19 still needs to be implemented, but it's not as urgent. Removing the milestone and marking the ticket as 'newcomer'.
