Ticket #18 (closed defect: wontfix)

Opened 3 years ago

Last modified 3 years ago

folds for unboxed vectors are slow.

Reported by: Khudyakov Owned by: rl
Priority: major Milestone:
Version: 0.6 Keywords:
Cc:

Description

I found that fold based functions which operate on unboxed vectors are much slower (~7-50 times) than functions which operate on generic vectors. Benchmark is below.

{-# LANGUAGE FlexibleContexts #-}
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Generic as G
import Criterion.Main

data T = T {-# UNPACK #-}!Double {-# UNPACK #-}!Int

-- Slow function
mean :: U.Vector Double -> Double
mean = fini . U.foldl go (T 0 0)
  where
    fini (T a _) = a
    go (T m n) x = T m' n'
        where m' = m + (x - m) / fromIntegral n'
              n' = n + 1
{-# INLINE mean #-}

-- fast function
mean' :: (G.Vector v Double) => v Double -> Double
mean' = fini . G.foldl go (T 0 0)
  where
    fini (T a _) = a
    go (T m n) x = T m' n'
        where m' = m + (x - m) / fromIntegral n'
              n' = n + 1
{-# INLINE mean' #-}

vec1e5 :: U.Vector Double
vec1e5 = U.replicate (10^5) 0

main :: IO ()
main = defaultMain [ bench "mean 1e5" $ nf mean  vec1e5
                   , bench "mean'1e5" $ nf mean' vec1e5
                   ]

Here is profiling results. U column is for mean, G is for mean'.

       | U(ms) | G(ms) |
foldl  | 47.3  |  1.08 |
foldl' |  6.54 |  1.09 |
foldr  | 44.3  |  1.99 |
fodlr' |  7.44 |  1.02 |

It looks like that monomorphic variant isn't properly optimized. Results are consistent between GHC6.10.4 and GHC6.12.1.

Change History

Changed 3 years ago by rl

  • owner set to rl

Thank you for the great report. Unfortunately, this turns out to be a GHC bug. It works fine with the HEAD. For 6.10 and 6.12, compiling with -fno-method-sharing is a workaround.

Here is the problem: GHC desugars mean to

mean ... = __inline_me (...(foldl_aZs go_aMc ($WT (D# 0.0) (I# 0))) ...)

foldl_aZs :: (T -> Double -> T)
             -> T
             -> U.Vector Double
             -> T
foldl_aZs = U.foldl @ Double @ T $dUnbox_aZv

Note how the call to foldl has been floated out of the __inline_me because GHC tries to share applications of overloaded functions to known dictionaries. This is very bad. It doesn't happen with the generic version because there the dictionary is not known.

I'll submit a bug report but I don't have much hope that it will be fixed in the 6.12 branch.

There is also an unfortunate interaction between the SPEC hack vector uses for guiding SpecConstr and strictness analysis. Without it, mean would more or less accidentally optimise better. I'll open a different ticket for that, though, since it doesn't affect the root of the problem.

Changed 3 years ago by rl

It turns out that there is already a GHC ticket for this at http://hackage.haskell.org/trac/ghc/ticket/3738.

Changed 3 years ago by rl

See #19 for the SPEC problem.

Changed 3 years ago by rl

  • status changed from new to closed
  • resolution set to wontfix

I'll close this as wontfix since the problem is really with GHC.

Note: See TracTickets for help on using tickets.