Opened 7 years ago

Closed 15 months ago

#7836 closed bug (fixed)

"Invalid object in processHeapClosureForDead" when profiling with -hb

Reported by: hyperthunk Owned by:
Priority: normal Milestone: 8.6.1
Component: Profiling Version: 8.5
Keywords: profiling Cc: osa1
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime crash Test Case:
Blocked By: Blocking:
Related Tickets: #15063, #15087, #15165 Differential Rev(s): Phab:D4928
Wiki Page:

Description (last modified by osa1)

I ran the attached program, compiled with -threaded -Wall -auto-all -caf-all -fforce-recomp -fprof-auto-top -fprof-auto-calls, using the following RTS flags: +RTS -hc -hbdrag,void -RTS

The output is as follows:

leaks: internal error: Invalid object in processHeapClosureForDead(): 60
    (GHC version 7.4.2 for i386_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Abort trap: 6

Attachments (1)

LeakByteStrings.hs (2.2 KB) - added by hyperthunk 7 years ago.
Program that triggers the bug


Change History (33)

Changed 7 years ago by hyperthunk

Attachment: LeakByteStrings.hs added

Program that triggers the bug

comment:1 Changed 7 years ago by ezyang

Status: new → infoneeded

Thanks! It looks like there's a pretty nontrivial dependency in the sample program; can you check if this behavior occurs in 7.6/HEAD?

comment:2 in reply to:  1 ; Changed 7 years ago by hyperthunk

Replying to ezyang:

Thanks! It looks like there's a pretty nontrivial dependency in the sample program; can you check if this behavior occurs in 7.6/HEAD?

Will do that and report back.

comment:3 in reply to:  2 Changed 7 years ago by hyperthunk

Architecture: x86 → x86_64 (amd64)
Type of failure: None/Unknown → Runtime crash

Replying to hyperthunk:

Replying to ezyang:

Thanks! It looks like there's a pretty nontrivial dependency in the sample program; can you check if this behavior occurs in 7.6/HEAD?

Will do that and report back.

Interim results with 7.6.2 look good:

t4@iske:distributed-process-platform $ ghc --version The Glorious Glasgow Haskell Compilation System, version 7.6.2 t4@iske:distributed-process-platform $ ghc-env active 7.6.2-x86_64

And the results:

t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -hc -hbdrag,void -RTS Mon Apr 15 14:00:50 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia started Mon Apr 15 14:00:57 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia blurble t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N -hc -hbdrag,void -RTS Mon Apr 15 14:01:36 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia started Mon Apr 15 14:01:42 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia blurble

However, if I supply a specific value for -N<num>, it crashes with a bus error. Perhaps that's an entirely different bug from the one I originally reported, though...

t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N2 -hc -hbdrag,void -RTS Mon Apr 15 14:01:51 UTC 2013 pid://127.0.0.1:10519:0:3Bus error: 10 t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N3 -hc -hbdrag,void -RTS Mon Apr 15 14:01:58 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia staBus error: 10 t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N4 -hc -hbdrag,void -RTS Mon Apr 15 14:02:03 UTC 2013 pid://127.0.0.1Bus error: 10

Platform details:

t4@iske:distributed-process-platform $ uname -a Darwin iske.local 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64

comment:4 Changed 7 years ago by hyperthunk

Urgh sorry, that didn't get formatted at all well:

t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -hc -hbdrag,void -RTS
Mon Apr 15 14:00:50 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia started
Mon Apr 15 14:00:57 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia blurble
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N -hc -RTS
Mon Apr 15 14:01:06 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia started
Mon Apr 15 14:01:11 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia blurble
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N -hbdrag,void -RTS
Mon Apr 15 14:01:20 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia started
Mon Apr 15 14:01:26 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia blurble
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N -hc -hbdrag,void -RTS
Mon Apr 15 14:01:36 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia started
Mon Apr 15 14:01:42 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia blurble
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N2 -hc -hbdrag,void -RTS
Mon Apr 15 14:01:51 UTC 2013 pid://127.0.0.1:10519:0:3Bus error: 10
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N2 -hc -hbdrag,void -RTS
Mon Apr 15 14:01:54 UTC 2013 pid://127.0.0.1:Bus error: 10
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N3 -hc -hbdrag,void -RTS
Mon Apr 15 14:01:58 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia staBus error: 10
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N4 -hc -hbdrag,void -RTS
Mon Apr 15 14:02:03 UTC 2013 pid://127.0.0.1Bus error: 10
t4@iske:distributed-process-platform $ ./dist/build/leaks/leaks +RTS -N -hc -hbdrag,void -RTS
Mon Apr 15 14:02:05 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia started
Mon Apr 15 14:02:12 UTC 2013 pid://127.0.0.1:10519:0:3: mnesia blurble

comment:5 Changed 6 years ago by ezyang

Bus error usually indicates that you've hit an rlimit. If you run the program inside gdb what happens?

comment:6 in reply to:  5 Changed 6 years ago by hyperthunk

Replying to ezyang:

Bus error usually indicates that you've hit an rlimit. If you run the program inside gdb what happens?

Bear with me - I'll get back to you once I've found time to do that.

comment:7 Changed 6 years ago by hyperthunk

Sorry, I've not found time to deal with this yet, but I haven't forgotten. It's just behind a bunch of other stuff on a very long back-burner.

comment:8 Changed 6 years ago by igloo

Description: modified (diff)
difficulty: Unknown

comment:9 Changed 6 years ago by crockeea

I got the following error:

> ghc -rtsopts -prof -fprof-auto -threaded A
> ./A +RTS -sstderr -hb
A: internal error: Invalid object in processHeapClosureForDead(): 0
    (GHC version 7.6.3 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted

All other heap profiling options (-hc -hm -hd -hy -hr) work without error.

Note that this occurred with GHC 7.6.3 (linux, x86_64 (amd64)). My program is too large to post, but I figured I would add this to the bug report to let you know the problem persists.

Last edited 6 years ago by crockeea (previous) (diff)

comment:10 Changed 5 years ago by thoughtpolice

Milestone: 7.10.1

Moving to 7.10.1.

comment:11 Changed 5 years ago by thomie

Architecture: x86_64 (amd64) → Unknown/Multiple
Operating System: MacOS X → Unknown/Multiple
Summary: Runtime failure profiling with +RTS -hc -hbdrag,void → "Invalid object in processHeapClosureForDead" when profiling with -hb

See #9640 for a report of this problem when profiling Yi with 7.8.3. I don't think this is an OS X only problem.

comment:12 Changed 5 years ago by thoughtpolice

Milestone: 7.10.1 → 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:13 Changed 4 years ago by thoughtpolice

Milestone: 7.12.1 → 8.0.1

Milestone renamed

comment:14 Changed 4 years ago by thomie

Milestone: 8.0.1

comment:15 Changed 3 years ago by domenkozar

Reproduced on Linux using "+RTS -N -pa -hb -T -A6G -qg -RTS":

cardano-node: internal error: Invalid object in processHeapClosureForDead(): 0
    (GHC version 8.0.1 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

comment:16 Changed 2 years ago by nad

I managed to trigger this bug using Agda version 2.6.0-439fe3e, built using GHC 8.2.1 with profiling enabled and -fprof-auto:

$ agda_p --version +RTS -hb
Agda version 2.6.0-439fe3e
agda_p: internal error: Invalid object in processHeapClosureForDead(): 7
    (GHC version 8.2.1 for i386_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted (core dumped)

comment:17 Changed 2 years ago by nad

I managed to make the test case a bit smaller:

$ cat Main.hs 
import System.Exit

main = exitSuccess
$ ghc --make -prof Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
$ ./Main +RTS -hb
Main: internal error: Invalid object in processHeapClosureForDead(): 7
    (GHC version 8.2.1 for i386_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted (core dumped)

comment:18 Changed 2 years ago by bgamari

I can reproduce this on Debian 8 using the i386 GHC 8.2.1 bindist and test program from comment:17. However, I can't reproduce this on x86-64, which should serve as a nice hint.

comment:19 Changed 19 months ago by crockeea

I just hit this bug on x86_64 with GHC-8.2.2. It's a large program, so I can't provide an example, but this is still a problem.

comment:20 Changed 19 months ago by osa1

Cc: osa1 added

comment:21 Changed 19 months ago by osa1

Description: modified (diff)
Differential Rev(s): Phab:D4567
Keywords: osx removed
Version: 7.4.2 → 8.5

comment:22 Changed 19 months ago by osa1

Status: infoneeded → patch

comment:23 Changed 19 months ago by osa1

We should either stop allocating CONSTR_NOCAF on the heap, or apply the patch I submitted. If I'm understanding the story correctly we can legitimately allocate CONSTR_NOCAF on the heap so I think the patch is correct.

comment:24 Changed 19 months ago by Ben Gamari <ben@…>

In a303584e/ghc:

Fix processHeapClosureForDead CONSTR_NOCAF case:

CONSTR_NOCAF was introduced with 55d535da10d as a replacement for
CONSTR_STATIC and CONSTR_NOCAF_STATIC, however, as explained in Note
[static constructors], we copy CONSTR_NOCAFs (which can also be seen in
evacuate) during GC, and they can become dead, like other CONSTR_X_Ys.
processHeapClosureForDead is updated to reflect this.

Reviewers: bgamari, simonmar, erikd

Subscribers: thomie, carter

GHC Trac Issues: #7836

Differential Revision: https://phabricator.haskell.org/D4567

comment:25 Changed 17 months ago by osa1

Trac didn't notice that this was reverted.

comment:26 Changed 17 months ago by osa1

comment:27 Changed 17 months ago by osa1

comment:28 Changed 16 months ago by osa1

Differential Rev(s): Phab:D4567 → Phab:D4928

comment:29 Changed 16 months ago by osa1

The bug as originally reported (Invalid object in processHeapClosureForDead(): 60) can't happen with modern GHCs as we handle closure 60 already, but the bug in comment:16 is fixed in Phab:D4928.

comment:30 Changed 15 months ago by Ömer Sinan Ağacan <omeragacan@…>

In 2625f131/ghc:

Fix processHeapClosureForDead CONSTR_NOCAF case

CONSTR_NOCAF was introduced with 55d535da10d as a replacement for
CONSTR_STATIC and CONSTR_NOCAF_STATIC, however, as explained in Note
[static constructors], we copy CONSTR_NOCAFs (which can also be seen in
evacuate) during GC, and they can become dead, like other CONSTR_X_Ys.
processHeapClosureForDead is updated to reflect this.

Test Plan: Validates on x86_64. Existing failures on i386.

Reviewers: simonmar, bgamari, erikd

Reviewed By: simonmar, bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #7836, #15063, #15087, #15165

Differential Revision: https://phabricator.haskell.org/D4928

comment:31 Changed 15 months ago by osa1

Status: patch → merge

comment:32 Changed 15 months ago by bgamari

Milestone: 8.6.1
Resolution: fixed
Status: merge → closed