Changes between Version 1 and Version 2 of HiddenDangersOfTouch


Ignore:
Timestamp:
Nov 28, 2018 2:09:43 PM (13 months ago)
Author:
tdammers
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • HiddenDangersOfTouch

    v1 v2  
    1818}}}
    1919
    20 If we were to implement the above naively, then the garbage collector might collect whatever `p` points to right after `poke p a`, and the subsequent `peek` ends up dereferencing a dangling pointer. This is bad.
     20If we were to implement the above, we would have to decide how to manage the lifecycle of the allocated memory. But none of the options is great:
    2121
    22 A somewhat obvious solution is to change `allocaBytes` to have a different type: instead of `allocaBytes :: Int -> IO (Ptr a)`, we make it `allocaBytes :: Int -> (Ptr a -> IO b) -> IO b`. This is basically the tried-and-tested bracket pattern, allowing us to inject some sort of magical construct after the payload code that makes sure that whatever the pointer points to is held onto until the pointer itself is garbage collected. That magical construct is the `touch#` primop, which essentially says "I will need this value to be alive until at least this point". So `allocaBytes` can be implemented as the moral equivalent of:
     22- We can allocate unmanaged memory through `malloc()` and just use that; it will stay alive until the process exits. This is undesirable because we're leaking memory.
     23- We can allocate unmanaged memory through `malloc()` and leave the responsibility for cleaning up to the caller. The existing `mallocBytes` function does exactly that, but it is often undesirable because it is brittle; we want something that automatically cleans up after itself.
     24- We can allocate managed memory and thus leave cleanup to the GC. The problem with this is that we're not retaining any references that the GC can follow, so our allocated memory becomes eligible for GC right after we return from `allocaBytes`.
     25
     26A somewhat obvious solution is to change `allocaBytes` to have a different type: instead of `allocaBytes :: Int -> IO (Ptr a)`, we make it `allocaBytes :: Int -> (Ptr a -> IO b) -> IO b`. This is the tried-and-tested bracket pattern, allowing us to run deallocation code after the payload action has finished. A simple implementation might look like this:
    2327
    2428{{{#!haskell
    2529allocaBytes n action = do
    26     p <- allocateMemory n
     30    p <- mallocBytes n
    2731    retval <- action p
    28     touch# p
     32    free p
    2933    return retval
    3034}}}
    3135
    32 Great! However, as evidenced by #14346, there is a problem. The correctness of the above code hinges on the presence of `touch#` (without it, the GC is free to pick up whatever `p` points to before `action p` finishes), but GHC's optimizer doesn't actually know this. Specifically, if `action` can be proven to never return, then the dead code elimination optimization will happily delete the `touch#` and `return` parts, and we're back at square 1.
     36And this would actually work. However, `malloc` isn't very efficient compared to the Haskell heap, so ideally, we would want to have the cake and eat it by allocating from the Haskell heap, while still keeping the memory alive until the action has finished. This means that we need a magical token that tells the GC that the memory we allocated is still alive, and which we can place in a strategic position right after the action.
     37
     38That magical construct is the `touch#` primop, which essentially says "I will need this value to be alive until at least this point". So `allocaBytes` can be implemented as the moral equivalent of:
     39
     40{{{#!haskell
     41allocaBytes n action = do
     42    a <- newByteArray n
     43    p <- addressOf a
     44    retval <- action p
     45    touch# a
     46    return retval
     47}}}
     48
     49Great! However, as evidenced by #14346, there is a problem. The correctness of the above code hinges on the presence of `touch#` (without it, the GC is free to pick up `a` before `action p` finishes), but GHC's optimizer doesn't actually know this. Specifically, if `action` can be proven to never return, then the dead code elimination optimization will happily delete the `touch#` and `return` parts, and we're back at square 1.
    3350
    3451The first workaround for this was to add a `NOINLINE` pragma on `allocaBytes` - without inlining `allocaBytes`, the dead code elimination stops at the function boundary, and cannot proceed to removing `touch#`, because: