Opened 13 months ago

Last modified 10 months ago

#15897 merge bug

Negative MUT time in +RTS -s -RTS when heap profiling is enabled

Reported by: maoe Owned by:
Priority: normal Milestone: 8.8.1
Component: Profiling Version: 8.6.2
Keywords: Cc: osa1, duog, bgamari
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Debugging information is incorrect Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by maoe)

Mutator run time in +RTS -s -RTS can go negative in GHC 8.6.x when heap profiling is enabled.

Here is an example taken from

{-# OPTIONS_GHC -fprof-auto #-}
import Control.Monad
import Data.IORef
import System.Environment

main = do
  [n] <- getArgs
  replicateM (read n) (newIORef [1,2,3])
% ghc-8.6.2 -prof -rtsopts -fforce-recomp repro.hs
[1 of 1] Compiling Main             ( repro.hs, repro.o )
Linking repro ...
% ./repro 10000000 +RTS -s -h                        
   3,053,080,472 bytes allocated in the heap
   5,538,404,848 bytes copied during GC
   1,013,615,328 bytes maximum residency (16 sample(s))
      36,422,944 bytes maximum slop
             966 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2600 colls,     0 par    1.375s   1.376s     0.0005s    0.0019s
  Gen  1        16 colls,     0 par    4.355s   4.355s     0.2722s    1.0136s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.000s  ( -0.833s elapsed)
  GC      time    4.016s  (  4.016s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time    1.715s  (  1.715s elapsed)
  EXIT    time    1.715s  (  1.715s elapsed)
  Total   time    6.612s  (  6.612s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    0 bytes per MUT second

  Productivity -12.6% of total user, -12.6% of total elapsed

Here is what I found so far:

  • GHC 8.6.1, 8.6.2, and recent HEAD (20181112) have the same issue. GHC 8.4.4 doesn't have this issue.
  • -hr has the same issue. -hT is fine.
  • Without heap profiling the numbers look fine

Change History (12)

comment:1 Changed 13 months ago by maoe

Description: modified (diff)

comment:2 Changed 13 months ago by maoe

Description: modified (diff)

comment:3 Changed 13 months ago by maoe

Description: modified (diff)

comment:4 Changed 13 months ago by osa1

Cc: osa1 added

comment:5 Changed 13 months ago by osa1

Cc: duog added

This is probably introduced in #14660 (Phab:D4303).

comment:6 Changed 13 months ago by osa1

Cc: bgamari added

If I run this program with debug runtime I get this assertion failure:

Main: internal error: ASSERTION FAILED: file rts/Stats.c, line 1123

    (GHC version 8.7.20181115 for x86_64_unknown_linux)
    Please report this as a GHC bug:
zsh: abort (core dumped)  ./Main 5000000 +RTS -s -h

Code that fails:

// The subdivision of runtime into INIT/EXIT/GC/MUT is just adding
// and subtracting, so the parts should add up to the total exactly.
// Note that stats->total_ns is captured a tiny bit later than
// end_exit_elapsed, so we don't use it here.
ASSERT(stats.init_elapsed_ns \
       + stats.mutator_elapsed_ns \
       + stats.gc_elapsed_ns \
       + sum.exit_elapsed_ns \
       == end_exit_elapsed - start_init_elapsed);

comment:7 Changed 12 months ago by bgamari

Milestone: 8.6.3

Ticket retargeted after milestone closed

comment:9 Changed 11 months ago by osa1

Milestone: 8.8.1
Status: patchmerge

Fixed with 19670bc397. @bgamari I'm milestoning this to 8.8.1, please revert if you disagree.

comment:10 Changed 11 months ago by Ömer Sinan Ağacan <omeragacan@…>

In 19670bc3/ghc:

Fix negative mutator time in GC stats in prof builds

Because garbage collector calls `retainerProfile()` and `heapCensus()`,
GC times normally include some of PROF times too. To fix this we have
these lines:

    // heapCensus() is called by the GC, so RP and HC time are
    // included in the GC stats.  We therefore subtract them to
    // obtain the actual GC cpu time.
    stats.gc_cpu_ns      -=  prof_cpu;
    stats.gc_elapsed_ns  -=  prof_elapsed;

These variables are later used for calculating GC time excluding the
final GC (which should be attributed to EXIT).

    exit_gc_elapsed      = stats.gc_elapsed_ns - start_exit_gc_elapsed;

The problem is if we subtract PROF times from `gc_elapsed_ns` and then
subtract `start_exit_gc_elapsed` from the result, we end up subtracting
PROF times twice, because `start_exit_gc_elapsed` also includes PROF

We now subtract PROF times from GC after the calculations for EXIT and
MUT times. The existing assertion that checks


now holds. When we subtract PROF numbers from GC, and a new assertion


also holds.

Fixes #15897. New assertions added in this commit also revealed #16102,
which is also fixed by this commit.

comment:11 Changed 10 months ago by sgraf

There was a (probably) spurious failure of this in Let's see if this happens again (

comment:12 Changed 10 months ago by osa1

The failures are due to test timing out on CI because it takes too long sometimes. I'll try to find a smaller test case. I think I've never tried making the input smaller, maybe that still reproduces the bug. I should try that first.

Note: See TracTickets for help on using tickets.