Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#8852 closed bug (fixed)

7.8.1 uses a lot of memory when compiling attoparsec programs using <|>

Reported by: joelteon
Owned by:
Priority: normal
Milestone: 7.8.4
Component: Compiler
Version: 7.8.1-rc2
Keywords:
Cc: idhameed@…
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

To reproduce, install a pre-0.11.2.1 version of attoparsec. This bug was worked around in 0.11.2.1 by removing the INLINE on plus in attoparsec.

With this test program:

{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import Data.Attoparsec.Text
import Data.Text (Text)

parser :: Parser Text
parser = string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"
     <|> string "a" <|> string "a"

main :: IO ()
main = parseTest parser "a"

and GHC 7.8.1-rc2:

Compiling using -O2, GHC tops out at ~1GB of RAM and takes 25s.

Using -O0, GHC takes 0.47s and uses <150MB of RAM.

Compare this with GHC 7.6.3:

Compiling using -O2, GHC uses <150MB and takes 3.7s. Memory usage is similar with -O0 although compile time goes down to 0.36s.

An extreme version of this bug can be found in the thyme package here: https://github.com/liyang/thyme/blob/master/src/Data/Thyme/Format.hs#L589-L693. Compiling that module with an unfixed attoparsec makes GHC use all available memory and stall out, forcing kill -9. Replacing the function body with undefined makes the package compile as expected.

Change History (10)

comment:1 Changed 5 years ago by ihameed

Cc: idhameed@… added

comment:2 Changed 5 years ago by rezb1t

I'd like to note that this problem occurs when compiling darcs 2.8.4, happy, alex, network 2.5.0.0, and quite a few other libraries. Something is very wrong here: GHC either uses a tiny amount of RAM or it uses gigabytes.

When compiling darcs, GHC used 4GB of RAM on one source file! Weirder yet, I've noticed that when GHC is used in tandem with cabal-install, the RAM isn't freed between source files. Whether this is intended or not, I couldn't tell, but running ghc in make mode doesn't do this.

The behavior is the same with GHC 7.8.2 and with the current HEAD of ghc-7.8.

comment:3 Changed 5 years ago by simonpj

This is the dreaded SpecConstr blowup again (see #8980, #8941 (possibly), #8960, #7898, #7068, #7944, #5550, #8836).

Compiling with -fno-spec-constr makes it go through fine, so that's a workaround.

Simon
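For anyone who wants to apply the workaround from comment:3 to a single module rather than a whole build, here is a minimal sketch. It assumes the same attoparsec and text dependencies as the test program in the description; the shortened parser is only for illustration. The flag is set per module with an OPTIONS_GHC pragma, so other modules keep SpecConstr when built with -O2.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Disable the SpecConstr pass for this module only (workaround from comment:3).
{-# OPTIONS_GHC -fno-spec-constr #-}

import Control.Applicative
import Data.Attoparsec.Text
import Data.Text (Text)

-- Shortened stand-in for the long <|> chain in the description.
parser :: Parser Text
parser = string "a" <|> string "a"
     <|> string "a" <|> string "a"

main :: IO ()
main = parseTest parser "a"
```

Passing -fno-spec-constr on the ghc command line alongside -O2 should have the same effect for every module being compiled.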

comment:4 Changed 5 years ago by simonpj

I made a little progress here.

First, SpecConstr is not generating lots of code! In fact it is doing no specialisation whatsoever. BUT it is nevertheless taking a very long time to do, well, nothing.

So something is afoot.

Simon

comment:5 Changed 5 years ago by rwbarton

Somehow I'm unable to reproduce this with ghc 7.8.1, 7.8.3 or HEAD and attoparsec 0.11.1.0. Anyone have specific instructions? Maybe I am using the wrong version of some other dependency?

comment:6 Changed 5 years ago by Simon Peyton Jones <simonpj@…>

In af4bc31c50c873344a2426d4be842f92edf17019/ghc:

Do not duplicate call information in SpecConstr (Trac #8852)

This long-standing and egregious bug meant that call information was
being gratuitously copied, leading to an exponential blowup in the
number of calls to be examined when function definitions are deeply
nested.  That is what has been causing the blowup in SpecConstr's
running time, not (as I had previously supposed) generating very large code.

See Note [spec_usg includes rhs_usg]
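To illustrate why copying call information at every nesting level blows up, here is a toy model. It is not GHC's SpecConstr code: the Nest type and the functions callsOnce and callsDuplicated are hypothetical names invented for this sketch, and the doubling factor is an assumption chosen to show the shape of the growth.

```haskell
-- Toy model only: NOT the SpecConstr implementation.
module Main where

-- A chain of nested definitions; each Level wraps the one below it.
data Nest = Leaf | Level Nest

-- Build a chain with n levels of nesting.
nest :: Int -> Nest
nest n = iterate Level Leaf !! n

-- Each level passes on the call records of its body once,
-- plus one record of its own: growth is linear in the depth.
callsOnce :: Nest -> Integer
callsOnce Leaf          = 1
callsOnce (Level inner) = 1 + callsOnce inner

-- Assumed bug shape: each level includes its body's records twice,
-- so the total roughly doubles per level of nesting.
callsDuplicated :: Nest -> Integer
callsDuplicated Leaf          = 1
callsDuplicated (Level inner) = 1 + 2 * callsDuplicated inner

main :: IO ()
main = mapM_ report [4, 8, 16, 32]
  where
    report n = putStrLn $
      show n ++ " levels: " ++ show (callsOnce (nest n))
             ++ " records without duplication, "
             ++ show (callsDuplicated (nest n)) ++ " with"
```

In this model the duplicated count for depth n is 2^(n+1) - 1, which is consistent with a pass that spends a very long time examining calls without generating any specialised code.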

comment:7 Changed 5 years ago by simonpj

Resolution: fixed
Status: new → closed

OK, I have finally fixed this bug. I think there is a good chance that doing so has also fixed #8852, #8980, #8941 (possibly), #8960, #7898, #7068, #7944, #8836. I have turned these tickets into info-needed status, because they are hard to reproduce, but this ticket #8852 is definitely fine now.

It's kind of hard to make a regression test without depending on (an out-of-date version of) attoparsec and its dependencies, so I'm not adding a regression test at all for now.

Phew

Simon

comment:8 Changed 5 years ago by tibbe

This might also be the cause of https://github.com/kolmodin/binary/issues/60. Could we have a 7.8.4 release?

comment:9 Changed 5 years ago by simonpj

Can you not just switch off SpecConstr? But in principle, yes, if there is user pressure, we could push out 7.8.4.

Simon

comment:10 Changed 5 years ago by tibbe

Milestone: 7.8.4