Ticket #66 (closed defect: fixed)

Opened 18 months ago

Last modified 18 months ago

! allocates needlessly

Reported by: Khudyakov
Owned by:
Priority: critical
Milestone: 0.9.1
Version:
Keywords:
Cc:

Description

It looks like the ! operator allocates memory each time it is called. It probably creates a closure for the error function even when that function is never called. Here is a simple program that demonstrates the issue:

import qualified Data.Vector.Unboxed as U
import System.Environment

n :: Int
n = 137

vec :: U.Vector Int
vec = U.enumFromN 1 n

resSafe :: Int -> Int
resSafe = U.sum . U.map (\i -> (U.!) vec (i `rem` n)) . U.enumFromN 0

resUnsafe :: Int -> Int
resUnsafe = U.sum . U.map (\i -> U.unsafeIndex vec (i `rem` n)) . U.enumFromN 0

main :: IO ()
main = do
  [t,k] <- getArgs
  case t of
    "safe"   -> print $ resSafe   (read k)
    "unsafe" -> print $ resUnsafe (read k)
    _        -> error "usage: ix (safe|unsafe) N"

Here are the allocation statistics:

./ix safe 100000000 +RTS -s
   3,225,080,440 bytes allocated in the heap
          58,304 bytes copied during GC
          28,816 bytes maximum residency (1 sample(s))
          19,264 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  6151 collections,     0 parallel,  0.01s,  0.02s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    2.90s  (  2.93s elapsed)
  GC    time    0.01s  (  0.02s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    2.92s  (  2.94s elapsed)

  %GC time       0.5%  (0.6% elapsed)

  Alloc rate    1,110,733,182 bytes per MUT second

  Productivity  99.5% of total user, 98.6% of total elapsed

./ix unsafe 100000000 +RTS -s
          80,624 bytes allocated in the heap
           2,880 bytes copied during GC
          43,784 bytes maximum residency (1 sample(s))
          21,752 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:     0 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.89s  (  1.89s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.89s  (  1.89s elapsed)

  %GC time       0.0%  (0.0% elapsed)

  Alloc rate    42,732 bytes per MUT second

  Productivity 100.0% of total user, 99.8% of total elapsed

Originally discovered by klapaucius https://gist.github.com/1374044
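
To put the two runs in per-call terms, the difference in "bytes allocated in the heap" can be divided by the iteration count. The snippet below is only a back-of-the-envelope sketch; the names are illustrative and the figures are copied from the output above.

```haskell
-- Figures copied from the +RTS -s output above.
safeBytes, unsafeBytes, iterations :: Integer
safeBytes   = 3225080440  -- ./ix safe 100000000
unsafeBytes = 80624       -- ./ix unsafe 100000000
iterations  = 100000000

-- Extra heap bytes per loop iteration in the "safe" run, rounded down.
extraBytesPerIteration :: Integer
extraBytesPerIteration = (safeBytes - unsafeBytes) `div` iterations

main :: IO ()
main = print extraBytesPerIteration  -- prints 32
```

Roughly 32 bytes per iteration is consistent with one small heap object being allocated on every call to !, although the figures alone do not show what that object is.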

Change History

Changed 18 months ago by rl

  • priority changed from major to critical

This looks like a GHC bug to me, but it's fairly easy to fix in vector by manually applying the worker/wrapper transformation to the error functions.
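
As a rough illustration of what a manual worker/wrapper split of an error function can look like, here is a minimal sketch. It is not the actual patch: indexErrorMsg, indexErrorMsg# and checkedIndex are hypothetical names, and a plain list stands in for a vector so that the example only needs base.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#)

-- Hypothetical worker: takes unboxed arguments and is never inlined, so the
-- message-building code stays out of the caller's loop.
indexErrorMsg# :: Int# -> Int# -> String
indexErrorMsg# i# n# = "index out of bounds " ++ show (I# i#, I# n#)
{-# NOINLINE indexErrorMsg# #-}

-- Hypothetical wrapper: inlined at call sites; it only unboxes its arguments
-- and passes them on to the worker.
indexErrorMsg :: Int -> Int -> String
indexErrorMsg (I# i#) (I# n#) = indexErrorMsg# i# n#
{-# INLINE indexErrorMsg #-}

-- Toy bounds-checked lookup (on a list, purely for illustration).
checkedIndex :: [a] -> Int -> a
checkedIndex xs i
  | i >= 0 && i < n = xs !! i
  | otherwise       = error (indexErrorMsg i n)
  where
    n = length xs
{-# INLINE checkedIndex #-}

main :: IO ()
main = print (checkedIndex [10, 20, 30 :: Int] 1)  -- prints 20
```

The intent of the split is that the inlined wrapper exposes the unboxed Int# values to the call site, so the error branch does not need to capture boxed arguments unless it is actually taken. Whether this removes the allocation depends on the GHC version and optimisation flags, so it should be checked with +RTS -s as in the report.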

Changed 18 months ago by rl

  • status changed from new to closed
  • resolution set to fixed
  • milestone set to 0.9.1

Very well spotted. Thanks a lot for the excellent bug report, and mostly in Russian, too! Fixed by:

Sun Nov 27 15:42:30 GMT 2011  Roman Leshchinskiy <rl@cse.unsw.edu.au>
  * Manually worker/wrapper error functions (fixes #66)

Allocation stats before:

safe:
     800,056,892 bytes allocated in the heap
          11,760 bytes copied during GC
          27,248 bytes maximum residency (1 sample(s))
          17,292 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

unsafe:
          57,176 bytes allocated in the heap
           1,508 bytes copied during GC
          42,380 bytes maximum residency (1 sample(s))
          19,060 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

After:

safe:
          56,892 bytes allocated in the heap
           1,508 bytes copied during GC
          42,380 bytes maximum residency (1 sample(s))
          19,060 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

unsafe:
          57,176 bytes allocated in the heap
           1,508 bytes copied during GC
          42,380 bytes maximum residency (1 sample(s))
          19,060 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

I wish I knew why the version with bounds checking allocates a little less than the one without.
