Opened 2 years ago

Last modified 2 years ago

#14256 new bug

GHCi is faster than compiled code

Reported by: mpickering Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.2.1
Keywords: Cc: sibi
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:


A Stack Overflow user, Milad Zahedi, reports that running his code in GHCi is 50x faster than compiling it and running the compiled binary.

Today I did a little benchmarking on my local machine to compare the plain-text speed of different Haskell web frameworks, and I noticed something strange. Almost all the frameworks that I tested performed better when they were run from GHCi than when compiled. Here are my results:

|framework| GHCi rpm   | compiled rpm 
|snap     | 8000       | 150
|yesod    | 6000       | 2500
|scotty   | 22000      | 9500
|servant  | 17000      | 8500
|spock    | 3300       | 2700

I know that these numbers do not reflect these frameworks' real speed, since they are not well tuned or optimized, but my question is: why do these frameworks perform better when launched from GHCi? Am I doing something wrong?

In order to build them I simply run stack build.

The benchmarks are from

This ticket is to verify and investigate these claims.

Change History (5)

comment:1 Changed 2 years ago by sibi

Cc: sibi added

comment:2 Changed 2 years ago by bgamari

Type of failure: None/Unknown → Runtime performance bug

comment:3 Changed 2 years ago by saep

I experienced the same thing with scotty and servant with GHC 8.0.2 (Stackage snapshot lts-9.5).

I ran it on NixOS and Debian (both 64-bit Linux). The numbers I measured are very close to those above. The code was also very similar to the mentioned benchmark, so I don't think it's worth posting.

comment:4 Changed 2 years ago by saep

I think the problem is with the benchmarking tool. I used ApacheBench and got the same results as above. Just today, a colleague recommended using wrk for benchmarking, and with it the compiled binary was always faster.

I initially suspected that the threaded runtime of the compiled binary could have been responsible for the slowdown, because running the server from GHCi only occupied 1 core whereas the binary used all my 8(ish) cores. When I compiled with only "-O2" and not "-O2 -threaded -rtsopts -with-rtsopts=-N", the ApacheBench measurements of the compiled binary were, as intuitively expected, better than the GHCi ones.
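For anyone reproducing this, it may help to confirm which runtime configuration a given run is actually using before comparing numbers. The following is only a minimal sketch, not part of the original benchmark: `describeRuntime` is a hypothetical helper name, while `rtsSupportsBoundThreads` and `getNumCapabilities` are the standard `Control.Concurrent` functions from base. It reports whether the program is linked against the threaded RTS and how many capabilities (the value set by -N) are in use.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)

-- | Render the runtime configuration as a single line.
-- (Hypothetical helper, introduced only for this illustration.)
describeRuntime :: Bool -> Int -> String
describeRuntime threaded caps =
  (if threaded then "threaded" else "non-threaded")
    ++ " RTS, "
    ++ show caps
    ++ " capabilit"
    ++ (if caps == 1 then "y" else "ies")

main :: IO ()
main = do
  -- Number of capabilities, i.e. how many Haskell threads can run in parallel.
  caps <- getNumCapabilities
  -- rtsSupportsBoundThreads is True when linked with the threaded RTS.
  putStrLn (describeRuntime rtsSupportsBoundThreads caps)
```

Running the same file from GHCi, from a binary built with only "-O2", and from one built with "-O2 -threaded -rtsopts -with-rtsopts=-N" should make the difference in capability count between the setups visible, which is the figure relevant to the 1 core versus 8 cores observation.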

comment:5 Changed 2 years ago by bgamari

Thanks for looking into this, saep!
