Locating the cause of a crash

There has been a vigorous thread on error attribution ("I get a head [] error; but who called head?"). This page summarises various approaches.

Relevant tickets: #960, #1441, #3693, #9049, #10656


Here are the approaches we have under consideration

  • (PROF) Use profiling. Almost all the same issues arise in answering the question "where should I attribute this allocation cost?" as arise when answering "where did this error arise"? If you
    • Compile with profiling
    • Run with +RTS -xc

then any crash (call to error) will yield an informative backtrace. The backtrace gives you a stack of calls that looks very like what you'd get in a call-by-value language. (Lots of papers about profiling in a lazy language, dating right back to Formally based profiling for higher order functional languages give the background.)

You need to compile your whole program with profiling, and you need a profiled version of the packages you install. You can do the latter by adding --enable-profiling to Cabal, and you can put that in your ~/.cabal file.

Programs run slower, of course, but that may not matter when debugging. But a crash in a production system will generate no useful information.

  • (DYN) Walk the call stack, generating a dynamic backtrace. This is what every other language does; it works in a production system (i.e. all debug flags off); and it carries zero runtime overhead unless a crash actually happens.

One difficulty is that the backtrace is unintuitive, because of lazy evaluation, but it is still massively better than nothing. Another difficulty is that GHC shakes the program around during optimisation, so it is hard to say what code comes from where.

Addressing these challenges is the subject of Peter Wortman's PhD. He has a paper Causality of Optimized Haskell: What is burning our cycles?, and an implementation is well advanced (in GHC 7.10).

  • (NEEDLE) Finding the needle. This is a cross between (PROF) and (DYN). It transforms the program, but in a less invasive way than for full profiling. Lots more details on ExplicitCallStack/FindingTheNeedle. We don't currently plan to implement this in HEAD: it is not clear that, given (PROF) and (DYN), it's worth a third path, and one that is non-trivial to implement (as you'll see from the paper).

Other relevant writings

Last modified 2 years ago Last modified on Nov 16, 2017 9:50:52 PM