Opened 19 months ago

Last modified 10 months ago

#14865 new bug

GHC Defeats Manual Worker Wrapper with Unboxed Sum

Reported by: andrewthad Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.5
Keywords: UnboxedSums Cc: maoe
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Here's the code in question:

{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE TypeInType #-}
{-# LANGUAGE UnboxedSums #-}
{-# LANGUAGE UnboxedTuples #-}

{-# OPTIONS_GHC -O2 #-}

module Byte.Array.Window
  ( findByte
  , boxMaybeInt
  ) where

import Data.Primitive (ByteArray)
import Data.Word (Word8)
import GHC.Types (RuntimeRep,TYPE)
import GHC.Int (Int(I#))
import GHC.Exts (Int#)
import qualified Data.Primitive as PM

type Maybe# (a :: TYPE (r :: RuntimeRep)) = (# (# #) | a #)

boxMaybeInt :: Maybe# Int# -> Maybe Int
boxMaybeInt = \case
  (# | a #) -> Just (I# a)
  (# (# #) | #) -> Nothing

unboxInt :: Int -> Int#
unboxInt (I# i) = i

-- | Finds the first occurrence of the given byte.
--   TODO: optimize this to search through a whole
--   Word64 at a time if the bytearray is pinned.
findByte :: Int -> Int -> Word8 -> ByteArray -> Maybe Int
findByte !off !len !w !arr = boxMaybeInt (go off) where
  go :: Int -> Maybe# Int#
  go !ix = if ix < len
    then if PM.indexByteArray arr ix == w
      then (# | unboxInt ix #)
      else go (ix + 1)
    else (# (# #) | #)

When compiled with GHC 8.5 with -ddump-simpl -dsuppress-all, here is the relevant part of the resulting Core:

-- RHS size: {terms: 33, types: 13, coercions: 0, joins: 1/1}
$wfindByte
$wfindByte
  = \ ww_s38C ww1_s38G ww2_s38K ww3_s38O ->
      joinrec {
        $wgo_s38v
        $wgo_s38v ww4_s38t
          = case <# ww4_s38t ww1_s38G of {
              __DEFAULT -> Nothing;
              1# ->
                case indexWord8Array# ww3_s38O ww4_s38t of wild_a36w { __DEFAULT ->
                case eqWord# wild_a36w ww2_s38K of {
                  __DEFAULT -> jump $wgo_s38v (+# ww4_s38t 1#);
                  1# -> Just (I# ww4_s38t)
                }
                }
            }; } in
      jump $wgo_s38v ww_s38C

-- RHS size: {terms: 21, types: 12, coercions: 0, joins: 0/0}
findByte
findByte
  = \ w_s38w w1_s38x w2_s38y w3_s38z ->
      case w_s38w of { I# ww1_s38C ->
      case w1_s38x of { I# ww3_s38G ->
      case w2_s38y of { W8# ww5_s38K ->
      case w3_s38z of { ByteArray ww7_s38O ->
      $wfindByte ww1_s38C ww3_s38G ww5_s38K ww7_s38O
      }
      }
      }
      }

I expected that the tail recursive go helpful function from my original code would still be a function that returns an unboxed sum when optimized and turned into Core. However, it isn't. The call to boxMaybeInt gets pushed into go. This means that when findByte is called and the result cased on, an allocation is going to happen. I think it would be preferable for boxMaybeInt to not get pushed into the worker, since boxMaybeInt (go off) could be inlined and the allocation of Maybe could be prevented (assuming that it was cased on right afterward).

Change History (4)

comment:1 Changed 19 months ago by simonpj

This is a classic difficulty. Given, say

f x = g (...x...)

perhaps we could split f into two parts

f x = g (f' x)
f' x = ...x...

and then perhaps we can inline the (now small) f, perhaps to good effect.

This is what you want to do here: you want

findByte !off !len !w !arr = boxMaybeInt (go off)

to be inlined at every call site. It's be simple to arrange, by giving len, w and arr as extra arguments to go.

But GHC just isn't clever enough to do that, unless you tell it to do so by hand, with an INLINE pragma.

I don't really know how to fix this. But it's a good example.

comment:2 Changed 19 months ago by andrewthad

Your suggested change produces the desired behavior:

findByte :: Int -> Int -> Word8 -> ByteArray -> Maybe Int
findByte !off !len0 !w0 !arr0 = boxMaybeInt (go off (len0 + off) w0 arr0) where
  go :: Int -> Int -> Word8 -> ByteArray -> Maybe# Int#
  go !ix !end !w !arr = if ix < end
    then if PM.indexByteArray arr ix == w
      then (# | unboxInt ix #)
      else go (ix + 1) end w arr
    else (# (# #) | #)

Results in the following Core:

Rec {
-- RHS size: {terms: 34, types: 21, coercions: 0, joins: 0/0}
$wgo
$wgo
  = \ ww_s3zd ww1_s3zh ww2_s3zl ww3_s3zp ->
      case tagToEnum# (<# ww_s3zd ww1_s3zh) of {
        False -> (#_|#) (##);
        True ->
          case indexWord8Array# ww3_s3zp ww_s3zd of wild1_a3vI { __DEFAULT ->
          case tagToEnum# (eqWord# wild1_a3vI ww2_s3zl) of {
            False -> $wgo (+# ww_s3zd 1#) ww1_s3zh ww2_s3zl ww3_s3zp;
            True -> (#|_#) ww_s3zd
          }
          }
      }
end Rec }

-- RHS size: {terms: 18, types: 11, coercions: 0, joins: 0/0}
$wfindByte
$wfindByte
  = \ ww_s3zy ww1_s3zC ww2_s3zG ww3_s3zK ->
      case $wgo ww_s3zy (+# ww1_s3zC ww_s3zy) ww2_s3zG ww3_s3zK of {
        (#_|#) ds_d3ui -> Nothing;
        (#|_#) a_a1Dk -> Just (I# a_a1Dk)
      }

-- RHS size: {terms: 21, types: 12, coercions: 0, joins: 0/0}
findByte
findByte
  = \ w_s3zs w1_s3zt w2_s3zu w3_s3zv ->
      case w_s3zs of { I# ww1_s3zy ->
      case w1_s3zt of { I# ww3_s3zC ->
      case w2_s3zu of { W8# ww5_s3zG ->
      case w3_s3zv of { ByteArray ww7_s3zK ->
      $wfindByte ww1_s3zy ww3_s3zC ww5_s3zG ww7_s3zK
      }
      }
      }
      }

I don't much mind having to do this by hand. It's just nice to know that there's a reliable way to coax GHC into doing it.

comment:3 Changed 19 months ago by andrewthad

It's weird that trying to manually apply the static argument transformation (which is probably not a performance win here anyway) gets in the way of this. For example, the following code produces the same core I originally was getting:

findByte :: Int -> Int -> Word8 -> ByteArray -> Maybe Int
findByte !off !len0 !w0 !arr0 = boxMaybeInt (goA off (len0 + off) w0 arr0)

goA :: Int -> Int -> Word8 -> ByteArray -> Maybe# Int#
goA !ix0 !end !w !arr = goB ix0 where
  goB :: Int -> Maybe# Int#
  goB !ix = if ix < end
    then if PM.indexByteArray arr ix == w
      then (# | unboxInt ix #)
      else goB (ix + 1)
    else (# (# #) | #)

comment:4 Changed 10 months ago by maoe

Cc: maoe added
Note: See TracTickets for help on using tickets.