#15348 closed feature request (fixed)

Enable two-step allocator on FreeBSD

Reported by: bgamari
Owned by:
Priority: normal
Milestone: 8.6.1
Component: Compiler
Version: 8.4.3
Keywords:
Cc:
Operating System: FreeBSD
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s): Phab:D4939
Wiki Page:

Description

Currently the two-step allocator is disabled on FreeBSD because the MAP_NORESERVE macro is undefined there. It seems that FreeBSD provided this macro until 2014, when it was removed because the kernel never implemented it. Regardless, Viktor Dukhovni reports empirical evidence on ghc-devs that just plain mmap does what we want.

Attachments (5)

patch-rts__posix__OSMem.c (3.3 KB) - added by vdukhovni 17 months ago.
OSMem.c patch with MAP_GUARD used alone rather than in combination with MAP_ANON|MAP_PRIVATE
patch-rts__posix__OSMem.2.c (3.3 KB) - added by vdukhovni 17 months ago.
[Updated] OSMem.c patch with MAP_GUARD used alone rather than in combination with MAP_ANON|MAP_PRIVATE
mem.pdf (9.3 KB) - added by vdukhovni 17 months ago.
Memory footprint of allocation-intensive application over 7 hours of run-time.
patch-rts__posix__OSMem.3.c (5.4 KB) - added by vdukhovni 17 months ago.
[Further code cleanup] OSMem.c patch with MAP_GUARD used alone rather than in combination with MAP_ANON|MAP_PRIVATE
0001-Enable-two-step-allocator-on-FreeBSD.patch (5.9 KB) - added by vdukhovni 16 months ago.
FreeBSD patch for master (git am format)


Change History (26)

comment:1 Changed 17 months ago by bgamari

Differential Rev(s): Phab:D4939
Status: new → patch

Could someone test Phab:D4939 on FreeBSD?

comment:2 Changed 17 months ago by vdukhovni

Specifically, one can reserve a contiguous address range with:

heap = mmap(NULL, heapmax, PROT_NONE, MAP_GUARD, -1, 0); 

and then incrementally populate segments of that range:

mmap(base, len, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE|MAP_FIXED, -1, 0);

Here base == heap for the initial block of pages, and for each subsequent increment base is the first address not yet mapped. What we don't get is protection from overcommit OOM, unless the system is configured to disallow overcommit.

comment:3 Changed 17 months ago by bgamari

vdukhovni, I've updated Phab:D4939 to account for your comment. Do you suppose you could test?

comment:4 Changed 17 months ago by vdukhovni

I can try to test, but the patch is not ready yet. FreeBSD does not support combining MAP_GUARD with other flags such as MAP_ANON or MAP_PRIVATE; it must stand alone.

comment:5 Changed 17 months ago by vdukhovni

From kern_mmap() in /usr/src/sys/vm/vm_mmap.c:

        if ((flags & MAP_GUARD) != 0 && (prot != PROT_NONE || fd != -1 ||
            pos != 0 || (flags & (MAP_SHARED | MAP_PRIVATE | MAP_PREFAULT |
            MAP_PREFAULT_READ | MAP_ANON | MAP_STACK)) != 0))
                return (EINVAL);
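Read as a rule for callers, the kernel check says that a MAP_GUARD request is valid only with PROT_NONE, fd == -1, offset 0 and no other mapping-type flags. A userspace mirror of that condition might look like the sketch below. The names `SKETCH_MAP_GUARD`, `guard_conflicts` and `guard_request_ok` are hypothetical, and the placeholder value used when the headers lack MAP_GUARD is simply the FreeBSD 11.1 definition quoted later in this ticket, included only so the predicate can be exercised elsewhere.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Use the real MAP_GUARD where available; otherwise a placeholder (assumption). */
#ifdef MAP_GUARD
#define SKETCH_MAP_GUARD MAP_GUARD
#else
#define SKETCH_MAP_GUARD 0x00002000
#endif

/* Union of the flags that may not accompany MAP_GUARD. Flags that this
 * platform's headers do not define are simply left out. */
static int guard_conflicts(void)
{
    int mask = MAP_SHARED | MAP_PRIVATE;
#ifdef MAP_ANON
    mask |= MAP_ANON;
#endif
#ifdef MAP_STACK
    mask |= MAP_STACK;
#endif
#ifdef MAP_PREFAULT
    mask |= MAP_PREFAULT;
#endif
#ifdef MAP_PREFAULT_READ
    mask |= MAP_PREFAULT_READ;
#endif
    return mask;
}

/* Returns false exactly where the quoted kernel check would return EINVAL. */
bool guard_request_ok(int prot, int flags, int fd, off_t pos)
{
    if ((flags & SKETCH_MAP_GUARD) == 0)
        return true; /* not a guard request, so this check does not apply */
    return prot == PROT_NONE && fd == -1 && pos == 0 &&
           (flags & guard_conflicts()) == 0;
}
```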

comment:6 Changed 17 months ago by Ben Gamari <ben@…>

In 87367158/ghc:

rts: Enable two-step allocator on FreeBSD

Previously we would prevent any operating system not providing the MAP_NORESERVE flag from using the two-step allocator. After all, Linux will reserve swap-space for a mapping unless this flag is given, which is most certainly not what we want.

However, it seems that FreeBSD provides the reservation-only mapping behavior that we expect despite not providing the MAP_NORESERVE macro. In fact, it provided the macro until 2014, when it was removed on account of not being implemented in the kernel. However, empirical evidence suggests that just plain mmap does what we want.

Reviewers: erikd, simonmar

Subscribers: rwbarton, thomie, erikd, carter

GHC Trac Issues: #15348

Differential Revision: https://phabricator.haskell.org/D4939

Changed 17 months ago by vdukhovni

Attachment: patch-rts__posix__OSMem.c added

OSMem.c patch with MAP_GUARD used alone rather than in combination with MAP_ANON|MAP_PRIVATE

comment:7 Changed 17 months ago by vdukhovni

I modified the proposed OSMem.c patch as attached, rebuilt GHC-8.4.3, and am now running my DANE scanner (runs for ~8 hours, using around 300MB of VM space, and doing a total of around 4TB of allocations). So far so good. And it did allocate a 1TB hole in its address space, partly replaced with actually allocated pages. Because this is a reserved hole, and not a mapping, the reported virtual size is still modest.

comment:8 Changed 17 months ago by vdukhovni

Small change for HPUX: instead of

#undef MAP_ANON
#define MAP_ANON ...

it is safer to use

#ifndef MAP_ANON
#define MAP_ANON ...
#endif

Changed 17 months ago by vdukhovni

Attachment: patch-rts__posix__OSMem.2.c added

[Updated] OSMem.c patch with MAP_GUARD used alone rather than in combination with MAP_ANON|MAP_PRIVATE

comment:9 Changed 17 months ago by vdukhovni

So far, this works just fine. The scanner ran for ~7 hours, with reasonable graphs of rss/vsz over time (attached). For the next run, I'll throw in RTS memory stats.

Changed 17 months ago by vdukhovni

Attachment: mem.pdf added

Memory footprint of allocation-intensive application over 7 hours of run-time.

Changed 17 months ago by vdukhovni

Attachment: patch-rts__posix__OSMem.3.c added

[Further code cleanup] OSMem.c patch with MAP_GUARD used alone rather than in combination with MAP_ANON|MAP_PRIVATE

comment:10 Changed 17 months ago by vdukhovni

I found the OSMem.c code a bit of an #ifdef maze. The latest patch attached should improve clarity, by better separating the logic from CPP conditionals. One side-effect is that madvise(MADV_WILLNEED) will now be called on COMMIT not only for Linux, but also for FreeBSD and other non-Darwin mmap() API platforms. Will test this shortly...

comment:11 Changed 17 months ago by vdukhovni

The latest patch runs fine... I'm done.

comment:12 Changed 16 months ago by bgamari

Thanks for finishing this up!

comment:13 Changed 16 months ago by bgamari

vdukhovni, I'm having a bit of trouble applying the patch you attached. Could you either rebase against master and upload a new patch or push a branch somewhere? Thanks!

comment:14 Changed 16 months ago by bgamari

Pinging vdukhovni.

Changed 16 months ago by vdukhovni

Attachment: 0001-Enable-two-step-allocator-on-FreeBSD.patch added

FreeBSD patch for master (git am format)

comment:15 Changed 16 months ago by vdukhovni

Updated patch attached. (Further questions/comments would be best via direct email, I don't seem to have notifications enabled here)

comment:16 Changed 16 months ago by Ben Gamari <ben@…>

In 123aeb91/ghc:

Enable two-step allocator on FreeBSD

Simplify #ifdef nesting and use MAP_GUARD on FreeBSD and similar
systems. This allows the two-step allocator to be used on FreeBSD,
fixing #15348.

comment:17 Changed 16 months ago by bgamari

Status: patch → merge

comment:18 Changed 16 months ago by bgamari

vdukhovni, the patch has been merged.

That being said, I'm a bit confused; I have tried building it under a FreeBSD 11 VM but it doesn't appear that MAP_GUARD is defined. I checked in both the mmap(2) manpage as well as <sys/mman.h> and found no mention of MAP_GUARD. What am I missing?

comment:19 Changed 16 months ago by bgamari

Ahh, the problem is that I'm on 11.0, yet apparently MAP_GUARD is only supported in 11.1 and later (judging by the manpages on freebsd.org).

comment:20 Changed 16 months ago by vdukhovni

Yes, I'm on a FreeBSD 11.1 system:

$ uname -sr
FreeBSD 11.1-RELEASE-p10
$ grep -rw MAP_GUARD /usr/include/sys
/usr/include/sys/mman.h:#define MAP_GUARD        0x00002000 /* reserve but don't map address range */
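Because the macro's presence depends on the FreeBSD release, code that wants to use it can test for it at compile time rather than assume it. The following is a minimal sketch of such a check; the function name `reservation_strategy` and the strategy labels are hypothetical, and the result depends entirely on the headers it is compiled against.

```c
#include <sys/mman.h>

/* Report which reservation mechanism the system headers make available. */
const char *reservation_strategy(void)
{
#if defined(MAP_GUARD)
    return "MAP_GUARD";      /* e.g. FreeBSD 11.1 and later */
#elif defined(MAP_NORESERVE)
    return "MAP_NORESERVE";  /* e.g. Linux */
#else
    return "plain mmap";     /* neither flag is defined */
#endif
}
```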

comment:21 Changed 16 months ago by bgamari

Resolution: fixed
Status: merge → closed

Indeed things look better with FreeBSD 11.2.

Merged to ghc-8.6 with 79e136104922aa4dcb555084731a890294cda106.
