Opened 2 years ago

Closed 2 years ago

Last modified 22 months ago

#14192 closed bug (fixed)

Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs

Reported by: nh2
Owned by:
Priority: normal
Milestone: 8.4.1
Component: Runtime System
Version: 8.0.2
Keywords: gdb, debugging
Cc: nh2, simonmar, gcampax, ezyang, nicolast
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime performance bug
Test Case:
Blocked By:
Blocking:
Related Tickets: #9706
Differential Rev(s):
Wiki Page:

Description

GHC 8.0.2 on Linux changed the memory allocator to always allocate 1TB virtual memory on startup (#9706).

I now have a production Haskell program that is stuck in a loop. I would like to debug where it is stuck on another machine, so I attached with gdb -p and ran generate-core-file.

But core dumping takes forever. I Ctrl-C'd it when the core file reached 140 GB (my machine only has 64 GB of RAM, by the way); after the Ctrl-C, the file system reported the size of the core file as 1.1T (it is probably a sparse file now).

Is there a workaround for this?

For example, if I could dump only the resident or actually allocated pages, that would probably help.

Change History (20)

comment:1 Changed 2 years ago by nh2

I found some info on selective page dumping on https://stackoverflow.com/questions/11734583/why-core-file-is-more-than-virtual-memory but I'm not sure what the right dumping approach is for programs running under the GHC 8.0 RTS.

comment:2 Changed 2 years ago by nh2

Cc: simonmar gcampax ezyang added

comment:3 Changed 2 years ago by nicolast

I guess setting the MADV_DONTDUMP flag on the region using madvise(2), then resetting the flag when chunks of said memory are being used, could work. Not sure about the performance impact of setting and resetting that flag over and over...

It may make sense to do it over chunks of, say, 32MB at a time, which would still result in 'large' (though likely very compressible) core dumps for small programs, yet manageable ones.
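A minimal C sketch of the idea in this comment, assuming Linux 3.4 or later (where MADV_DONTDUMP and MADV_DODUMP are available). The names reserve_region, commit_chunk, release_chunk and CHUNK_SIZE are hypothetical and are not taken from the GHC RTS; the 32MB chunk size is simply the figure suggested above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical chunk size, taken from the 32MB suggestion above. */
#define CHUNK_SIZE ((size_t)32 * 1024 * 1024)

/* Reserve address space with no backing store and exclude it from
 * core dumps. Returns NULL on failure. */
void *reserve_region(size_t size)
{
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    if (madvise(base, size, MADV_DONTDUMP) != 0) {
        munmap(base, size);
        return NULL;
    }
    return base;
}

/* Make one chunk readable and writable, and include it in core dumps
 * again. Returns 0 on success, -1 on failure. */
int commit_chunk(void *base, size_t chunk_index)
{
    assert(base != NULL);
    char *chunk = (char *)base + chunk_index * CHUNK_SIZE;
    if (mprotect(chunk, CHUNK_SIZE, PROT_READ | PROT_WRITE) != 0)
        return -1;
    return madvise(chunk, CHUNK_SIZE, MADV_DODUMP);
}

/* Return one chunk to the reserved state: drop its pages, revoke
 * access, and exclude it from core dumps. Returns 0 on success. */
int release_chunk(void *base, size_t chunk_index)
{
    assert(base != NULL);
    char *chunk = (char *)base + chunk_index * CHUNK_SIZE;
    if (madvise(chunk, CHUNK_SIZE, MADV_DONTNEED) != 0)
        return -1;
    if (mprotect(chunk, CHUNK_SIZE, PROT_NONE) != 0)
        return -1;
    return madvise(chunk, CHUNK_SIZE, MADV_DONTDUMP);
}
```

Each commit or release costs two or three system calls per chunk here, which is the overhead this comment is worried about; a larger chunk size trades fewer calls for a larger minimum core dump.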

comment:4 Changed 2 years ago by simonmar

I'm surprised this is an issue, I'm sure I've core-dumped processes with the 1TB address space without any problems. The core files look huge, but they're sparse.
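Whether a given core file is sparse can be checked by comparing its apparent size with the space actually allocated on disk. The following is a small sketch, assuming Linux, where st_blocks is counted in 512-byte units; file_usage, get_file_usage and is_sparse are hypothetical helper names.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Apparent size of a file versus the bytes actually allocated for it. */
struct file_usage {
    off_t apparent_bytes;   /* st_size: what ls -l reports */
    off_t allocated_bytes;  /* st_blocks * 512: what is really on disk */
};

/* Fill in *out for the file at path. Returns 0 on success, -1 if
 * stat() fails (errno is left set by stat). */
int get_file_usage(const char *path, struct file_usage *out)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    out->apparent_bytes = st.st_size;
    out->allocated_bytes = (off_t)st.st_blocks * 512;
    return 0;
}

/* A file is treated as sparse when less space is allocated than its
 * apparent size suggests. */
int is_sparse(const struct file_usage *usage)
{
    return usage->allocated_bytes < usage->apparent_bytes;
}
```

Calling get_file_usage on a core file and comparing the two fields distinguishes a dump that merely looks huge from one that really occupies that much disk.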

I wonder what's being dumped.

We could easily add a flag to change the size of the region, but adding a flag to disable the region completely would add a performance overhead because we'd have to check the flag repeatedly in the inner loop of the GC, so I'd really like to avoid that if possible.

comment:5 Changed 2 years ago by nh2

@simonmar Could you try to reproduce? Try this:

import Control.Concurrent

main = threadDelay 1000000000

Compile with ghc --make thefile.hs (8.0.2), get the pid with ps, then run sudo gdb -p thepid and generate-core-file.

For me (Ubuntu 16.04) that runs forever, writing GB after GB to core.* in gdb's working directory.

comment:6 Changed 2 years ago by bgamari

Indeed I can reproduce this.

comment:7 Changed 2 years ago by nicolast

Cc: nicolast added

comment:8 Changed 2 years ago by simonmar

Ah, so this is something to do with gdb's generate-core-file. Ordinary core dumps work just fine, e.g. if I send SIGQUIT to the process by hitting ^\:

> ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.0.2
> ghc foo.hs
> ./foo
^\Quit (core dumped)
> ls -l core
-rw------- 1 smarlow smarlow 1589248 Sep  8 07:26 core
> gdb foo core
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from foo...done.
[New LWP 21001]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./foo'.
Program terminated with signal SIGQUIT, Quit.
#0  0x00007fcd7124c573 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
84	../sysdeps/unix/syscall-template.S: No such file or directory.

comment:9 Changed 2 years ago by bgamari

Ahh, so it does. I have Phab:D3929 implementing the idea in ticket:14193, but sadly it doesn't fix gdb. Quite unfortunate, but it sounds like gdb is just broken.

comment:10 Changed 2 years ago by nh2

Is there another way to take a core dump of a running program without terminating it that could be used in this situation?

Also, is it known why the gdb approach doesn't work?

comment:11 Changed 2 years ago by nicolast

comment:12 Changed 2 years ago by nh2

May also depend on GDB version: https://sourceware.org/bugzilla/show_bug.cgi?id=16092

My gdb (>= 7.11.1) certainly has the feature; I found it can be checked with show use-coredump-filter.

There must be more smarts or difference in behaviour that Linux has but GDB doesn't.
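For reference, the kernel-side filter that gdb's use-coredump-filter setting refers to can be read from /proc/<pid>/coredump_filter. The sketch below reads the calling process's own mask and tests individual bits; the bit meanings follow core(5), while the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Bit positions documented in core(5) for /proc/<pid>/coredump_filter. */
enum {
    DUMP_ANON_PRIVATE = 1 << 0,  /* anonymous private mappings */
    DUMP_ANON_SHARED  = 1 << 1,  /* anonymous shared mappings */
    DUMP_FILE_PRIVATE = 1 << 2,  /* file-backed private mappings */
    DUMP_FILE_SHARED  = 1 << 3,  /* file-backed shared mappings */
    DUMP_ELF_HEADERS  = 1 << 4   /* ELF headers */
};

/* Read the calling process's filter mask, which the kernel exposes as
 * a hexadecimal number. Returns 0 on success, -1 on failure. */
int read_coredump_filter(unsigned long *mask)
{
    FILE *f = fopen("/proc/self/coredump_filter", "r");
    if (f == NULL)
        return -1;
    int parsed = fscanf(f, "%lx", mask);
    fclose(f);
    return parsed == 1 ? 0 : -1;
}

/* Nonzero if the given mapping kind is selected by the mask. */
int includes_mapping(unsigned long mask, unsigned long kind)
{
    return (mask & kind) != 0;
}
```

With the usual default mask of 0x33, anonymous private mappings are selected, and the RTS reservation shown in the strace output of comment:14 is such a mapping. The filter setting alone therefore does not obviously account for the difference between the kernel's dump and gdb's.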

comment:13 Changed 2 years ago by bgamari

It sounds like maybe we should be setting MADV_DONTDUMP where available (and later revert it with MADV_DODUMP).

comment:14 Changed 2 years ago by bgamari

So Phab:D3929 doesn't currently address the issue. I've found that strace produces some suspicious looking output when run on a program compiled with that patch,

$ cat >hi.hs <<EOF
main = putStrLn "hello"
EOF
$ inplace/bin/ghc-stage2 hi.hs
$ strace ./hi
...
sysinfo({uptime=1041754, loads=[86240, 81760, 80384], totalram=33598783488, freeram=871940096, sharedram=3290808320, bufferram=1270706176, totalswap=0, freeswap=0, procs=703, totalhigh=0, freehigh=0, mem_unit=1}) = 0
mmap(0x4200000000, 1099512676352, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x4200000000
madvise(0x4200000000, 1099512676352, 0x14 /* MADV_??? */) = -1 EINVAL (Invalid argument)
mmap(0x4200000000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4200000000
madvise(0x4200000000, 1048576, 0x13 /* MADV_??? */) = -1 EINVAL (Invalid argument)
mmap(0x4200100000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4200100000
madvise(0x4200100000, 1048576, 0x13 /* MADV_??? */) = -1 EINVAL (Invalid argument)
timer_create(CLOCK_MONOTONIC, {sigev_signo=SIGVTALRM, sigev_notify=SIGEV_SIGNAL}, [0]) = 0
...

In particular note the madvise calls.

comment:15 Changed 2 years ago by bgamari

Ahh, I see, the advice argument of madvise isn't a bitmask; I've confirmed that this,

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <assert.h>
#include <sys/mman.h>

int main(void) {
        /* Reserve 1GB of address space with no backing store. */
        const size_t len = 1024*1024*1024;
        void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) {
                printf("uh oh: mmap failed: %s\n", strerror(errno));
                return 1;
        }
        /* Exclude the reservation from core dumps. */
        int res = madvise(p, len, MADV_DONTDUMP);
        if (res != 0) {
                printf("uh oh: madvise failed: %s\n", strerror(errno));
        }
        assert(0); // use generate-core-file in gdb when this is hit
        return 0;
}

produces a small (510kB) core dump in gdb.
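The 0x14 and 0x13 values in the strace output of comment:14 are consistent with two advice constants having been OR-ed together. On x86-64 Linux, MADV_DONTNEED is 4 and MADV_DONTDUMP is 16 (4 | 16 = 0x14), while MADV_WILLNEED is 3 and MADV_DODUMP is 17 (3 | 17 = 0x13). This is an inference from the numbers, not something stated in the patch. Since the advice values are an enumeration, each one needs its own madvise call. The sketch below shows this; madvise_each is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Apply several advice values to the same range, one madvise() call
 * per value, because the values cannot be combined with bitwise OR.
 * Stops at the first failure and returns -1 (errno is left set by
 * madvise); returns 0 if every call succeeds. */
int madvise_each(void *addr, size_t len, const int *advice, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (madvise(addr, len, advice[i]) != 0)
            return -1;
    }
    return 0;
}
```

For a freshly reserved region this would be called with an array such as { MADV_DONTNEED, MADV_DONTDUMP }, and after committing a chunk with { MADV_WILLNEED, MADV_DODUMP }.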

comment:16 Changed 2 years ago by bgamari

With Phab:D3929 appropriately updated the test from comment:5 produces a 7MByte core dump from gdb.

comment:17 Changed 2 years ago by Ben Gamari <ben@…>

In 1d1b991/ghc:

rts: Inform kernel that we won't need reserved address space

Trac #14192 points out that currently GHC's two-step allocator results
in extremely large coredumps. It seems like WebKit may have encountered
similar issues and their apparent solution uses madvise(MADV_DONTNEED)
while reserving address space to inform the kernel that the address
space we just requested needs no backing. Perhaps this is used by the
core dump logic to trim out uncommitted pages.

Test Plan: Validate, try core-dumping a compiled executable

Reviewers: austin, erikd, simonmar

Reviewed By: simonmar

Subscribers: rwbarton, thomie

GHC Trac Issues: #14192, #14193

Differential Revision: https://phabricator.haskell.org/D3929

comment:18 Changed 2 years ago by bgamari

Milestone: 8.4.1

comment:19 Changed 2 years ago by bgamari

Resolution: fixed
Status: new → closed

comment:20 Changed 22 months ago by nh2

I can confirm that with this patch, generate-core-file works fine.

For a simple 2-line application, it generates a ~8 MB core file, and I can re-load that core file into gdb and look at the backtrace successfully.

Nice work!
