Opened 6 years ago

Last modified 4 years ago

#8623 new bug

Strange slowness when using async library with FFI callbacks

Reported by: JohnWiegley Owned by: simonmar
Priority: normal Milestone:
Component: Runtime System Version: 7.6.3
Keywords: Cc: michael@…, simonmar
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by hvr)

I've attached a Haskell file and a C file. Compile them as follows:

ghc -DSPEED_BUG=0 -threaded -O2 -main-is SpeedTest SpeedTest.hs SpeedTestC.c

You should find that with 7.4.2, 7.6.3 or a recent build of 7.8, building with SPEED_BUG=0 produces an executable that takes more than a second to run, while building with SPEED_BUG=1 runs very quickly. I've also attached the Core for both scenarios.

Attachments (4)

SpeedTest.hs (792 bytes) - added by JohnWiegley 6 years ago.
SpeedTestC.c (79 bytes) - added by JohnWiegley 6 years ago.
SpeedTest.slow (20.1 KB) - added by JohnWiegley 6 years ago.
SpeedTest.fast (21.7 KB) - added by JohnWiegley 6 years ago.


Change History (16)

Changed 6 years ago by JohnWiegley

Attachment: SpeedTest.hs added

Changed 6 years ago by JohnWiegley

Attachment: SpeedTestC.c added

Changed 6 years ago by JohnWiegley

Attachment: SpeedTest.slow added

Changed 6 years ago by JohnWiegley

Attachment: SpeedTest.fast added

comment:1 Changed 6 years ago by hvr

FWIW, I suspect some kind of lock contention issue: the run time drops as soon as some delay is involved (such as putStr, or even threadDelay, which causes the thread to yield), and also when using more than one HEC via +RTS -N2.

comment:2 Changed 6 years ago by hvr

Description: modified (diff)

comment:3 Changed 6 years ago by JohnWiegley

But Control.Concurrent.yield does not help, even though I would have expected it to make the thread yield as well.

comment:4 Changed 6 years ago by JohnWiegley

I tried using just forkIO and MVar, but it does not exhibit the problem:

main = do
    let test = mk'speed_test_cb (return ()) >>= speed_test
    test
    m <- newEmptyMVar
    _ <- forkIO $ test >> putMVar m ()
    takeMVar m
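
The snippet above depends on the FFI bindings from the attached SpeedTest.hs, which are not reproduced on this page. For readers who want something runnable without the attachments, here is a minimal sketch of the same forkIO and MVar shape. It is an assumption-laden stand-in, not the attached test: mkCallback, callCallback and runCallbacks are hypothetical names, and a "dynamic" import replaces the C function speed_test so that no C file is needed. It only shows the callback round trip through the RTS and makes no claim to reproduce the slowdown.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Hypothetical stand-in for mk'speed_test_cb: turns a Haskell action
-- into a C function pointer.
foreign import ccall "wrapper"
  mkCallback :: IO () -> IO (FunPtr (IO ()))

-- Hypothetical stand-in for speed_test: a safe foreign call that invokes
-- the function pointer, which re-enters Haskell once per call.
foreign import ccall safe "dynamic"
  callCallback :: FunPtr (IO ()) -> IO ()

-- Invoke the callback n times from a forked thread and return how many
-- times the Haskell side of the callback actually ran.
runCallbacks :: Int -> IO Int
runCallbacks n = do
  counter <- newIORef 0
  cb <- mkCallback (modifyIORef' counter (+ 1))
  done <- newEmptyMVar
  _ <- forkIO $ replicateM_ n (callCallback cb) >> putMVar done ()
  takeMVar done
  freeHaskellFunPtr cb
  readIORef counter

main :: IO ()
main = runCallbacks 1000 >>= print -- prints 1000
```

Building it with ghc -threaded -O2 keeps the setup close to the command in the description; timing it with and without +RTS -N2 would be one way to compare against the attached test.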

comment:5 Changed 6 years ago by carter

Does the problem go away if the program is compiled with GHC HEAD using -fno-omit-yields?

comment:6 Changed 6 years ago by JohnWiegley

No, it makes no difference.

comment:7 Changed 6 years ago by simonmar

So I understand why this happens, but I don't have a good idea for what to do about it yet.

The problem is that each call into the RTS from the C function is very short, and when it is complete, the next call creates a new thread that goes to the back of the run queue. The main thread then gets a full time slice before the new thread gets to run. You can see the effect with +RTS -C0.1 which halves the time slice length and halves the time to run the program.

The new thread goes to the back of the queue to avoid possible starvation of other threads in the queue. Somehow we want to make the new thread inherit the rest of the timeslice from the previous thread, but I need to think some more about how we can do that.
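
To make the scheduling argument concrete, here is a toy model of the behaviour described in this comment. It is a sketch under stated assumptions and not RTS code: the Thread type, the elapsed function and the abstract "ticks" are all made up for illustration. The model assumes one busy thread that always uses its whole time slice, and a callback thread that does a tiny amount of work and is re-queued at the back of the run queue for the next call.

```haskell
module Main where

-- Toy model only: a busy thread that consumes a full time slice, and a
-- short-lived thread created for each callback into the RTS.
data Thread = Busy | Callback

-- Total time (in abstract ticks) until n callbacks have completed, given
-- the time slice length and the cost of a single callback.
elapsed :: Int -> Int -> Int -> Int
elapsed slice cbCost n = go 0 n [Busy, Callback]
  where
    go t remaining queue
      | remaining <= 0 = t
      | otherwise = case queue of
          [] -> t
          -- The busy thread runs for a whole slice, then is re-queued.
          Busy : rest -> go (t + slice) remaining (rest ++ [Busy])
          -- The callback finishes quickly; the thread for the next call
          -- goes to the back of the queue, behind the busy thread.
          Callback : rest ->
            let queue' = if remaining > 1 then rest ++ [Callback] else rest
            in  go (t + cbCost) (remaining - 1) queue'

main :: IO ()
main = do
  print (elapsed 1000 1 100) -- prints 100100
  print (elapsed 500 1 100)  -- prints 50100
```

In this model the total is n * (slice + cbCost), so when the callback cost is small, halving the slice roughly halves the run time, which matches the observation about changing the time slice length with -C.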

comment:8 Changed 5 years ago by thoughtpolice

Milestone: 7.8.3 → 7.10.1

Moving to 7.10.1

comment:9 Changed 5 years ago by thomie

Operating System: MacOS X → Unknown/Multiple

comment:10 Changed 5 years ago by thoughtpolice

Milestone: 7.10.1 → 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:11 Changed 4 years ago by thoughtpolice

Milestone: 7.12.1 → 8.0.1

Milestone renamed

comment:12 Changed 4 years ago by thomie

Milestone: 8.0.1