Opened 2 years ago

Closed 9 months ago

#14494 closed bug (fixed)

TBQueue leaks space under certain workloads

Reported by: arybczak
Owned by:
Priority: normal
Milestone:
Component: libraries/stm
Version: 8.2.1
Keywords: stm
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

I'm using TBQueue and I noticed suspiciously high memory usage, so I decided to profile and it turned out that readTBQueue leaks space (see attached before.png).

On closer inspection it turned out that the writeTVar rsize (r + 1) in the definition of readTBQueue is the problem, since it stores the increment as an unevaluated thunk. After replacing it with writeTVar rsize $! r + 1 the leak is gone (see attached after.png).
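To illustrate the difference outside of TBQueue, here is a minimal sketch that uses only a plain TVar Int counter. The names bumpLazy and bumpStrict are made up for this example and are not part of stm; the imports come from GHC.Conc in base so the snippet needs no extra packages. The lazy variant mirrors the writeTVar rsize (r + 1) pattern, the strict one mirrors the proposed writeTVar rsize $! r + 1.

```haskell
import Control.Monad (replicateM_)
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)

-- Hypothetical helper: stores the unevaluated expression (r + 1).
-- If nothing forces the counter between calls, each call wraps the
-- previous thunk in another one, so the chain grows with every call.
bumpLazy :: TVar Int -> STM ()
bumpLazy v = do
  r <- readTVar v
  writeTVar v (r + 1)

-- Hypothetical helper: ($!) evaluates r + 1 before the write,
-- so the TVar always holds a plain Int.
bumpStrict :: TVar Int -> STM ()
bumpStrict v = do
  r <- readTVar v
  writeTVar v $! r + 1

main :: IO ()
main = do
  lazyV   <- newTVarIO 0
  strictV <- newTVarIO 0
  replicateM_ 100000 (atomically (bumpLazy lazyV))
  replicateM_ 100000 (atomically (bumpStrict strictV))
  -- Both counters end at the same value; only the heap shape differs
  -- until the lazy counter is finally forced by print.
  a <- readTVarIO lazyV
  b <- readTVarIO strictV
  print (a, b)
```

Both variants compute the same number, so the difference only shows up in a heap profile or in the residency figures of +RTS -s.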

Here are -s outputs:

Before:

 366,535,518,024 bytes allocated in the heap
 115,643,281,224 bytes copied during GC
     241,356,416 bytes maximum residency (1182 sample(s))
       1,516,944 bytes maximum slop
             392 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     247273 colls, 247273 par   128.854s  28.654s     0.0001s    0.0182s
  Gen  1      1182 colls,  1181 par   352.162s  87.812s     0.0743s    0.1322s

  Parallel GC work balance: 78.17% (serial 0%, perfect 100%)

  TASKS: 24 (1 bound, 16 peak workers (23 total), using -N4)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.003s  (  0.003s elapsed)
  MUT     time  581.754s  (226.191s elapsed)
  GC      time  317.130s  ( 75.533s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time  163.885s  ( 40.933s elapsed)
  EXIT    time    0.013s  (  0.011s elapsed)
  Total   time  1062.789s  (301.738s elapsed)

  Alloc rate    630,052,684 bytes per MUT second

  Productivity  54.7% of total user, 61.4% of total elapsed

gc_alloc_block_sync: 8998531
whitehole_spin: 96
gen[0].sync: 180553
gen[1].sync: 31648044

After:

 431,671,260,464 bytes allocated in the heap
  86,540,207,400 bytes copied during GC
     170,338,336 bytes maximum residency (1381 sample(s))
       1,159,472 bytes maximum slop
             260 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     290179 colls, 290179 par   148.921s  33.097s     0.0001s    0.0217s
  Gen  1      1381 colls,  1380 par   206.679s  51.492s     0.0373s    0.0528s

  Parallel GC work balance: 75.51% (serial 0%, perfect 100%)

  TASKS: 23 (1 bound, 17 peak workers (22 total), using -N4)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.005s  (  0.004s elapsed)
  MUT     time  681.718s  (241.009s elapsed)
  GC      time  258.643s  ( 60.370s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time   96.957s  ( 24.219s elapsed)
  EXIT    time    0.009s  (  0.007s elapsed)
  Total   time  1037.335s  (301.390s elapsed)

  Alloc rate    633,210,748 bytes per MUT second

  Productivity  65.7% of total user, 71.9% of total elapsed

gc_alloc_block_sync: 5494680
whitehole_spin: 184
gen[0].sync: 184109
gen[1].sync: 24223953

The attached patch fixes the problem (I made all Int increments/decrements in the module strict, as there is no need for them to be lazy).
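As a rough sketch of what "making all increments/decrements strict" can look like (this is not the attached patch, whose contents are not shown here), one option is to route every counter update through a single strict helper. The names modifyStrict, acquire and release are hypothetical and only loosely modelled on the capacity counters of a bounded queue.

```haskell
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)

-- Hypothetical helper: apply f and force the result before storing it,
-- so the TVar never holds a chain of pending updates.
modifyStrict :: TVar a -> (a -> a) -> STM ()
modifyStrict v f = do
  x <- readTVar v
  writeTVar v $! f x

-- Hypothetical counter operations built on the strict helper.
release :: TVar Int -> STM ()
release slots = modifyStrict slots (+ 1)

acquire :: TVar Int -> STM ()
acquire slots = modifyStrict slots (subtract 1)

main :: IO ()
main = do
  slots <- newTVarIO (10 :: Int)
  -- Two acquires and one release leave 9 free slots.
  atomically (acquire slots >> acquire slots >> release slots)
  readTVarIO slots >>= print
```

Funnelling the updates through one helper means the strictness decision is made in one place instead of at every call site.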

Attachments (3)

before.png (103.0 KB) - added by arybczak 2 years ago.
after.png (131.0 KB) - added by arybczak 2 years ago.
0001-Fix-space-leak-in-TBQueue-14494.patch (1.7 KB) - added by arybczak 2 years ago.

Change History (6)

Changed 2 years ago by arybczak

Attachment: before.png added

Changed 2 years ago by arybczak

Attachment: after.png added

Changed 2 years ago by arybczak

Attachment: 0001-Fix-space-leak-in-TBQueue-14494.patch added

comment:1 Changed 2 years ago by arybczak

Status: new → patch

comment:2 Changed 9 months ago by bgamari

Component: libraries (other) → libraries/stm

comment:3 Changed 9 months ago by arybczak

Resolution: fixed
Status: patch → closed

It was fixed by https://github.com/haskell/stm/pull/2 some time ago.