Opened 9 years ago

Closed 9 years ago

#4270 closed bug (fixed)

Out of memory when compiling Statistics.Quantile

Reported by: Itkovian
Owned by: igloo
Priority: high
Milestone: 7.2.1
Component: Compiler
Version: 6.13
Keywords: Out-of-memory
Cc: Roman.Leshchinskiy@…, rl, michal.terepeta@…
Operating System: Linux
Architecture: x86_64 (amd64)
Type of failure: Compile-time crash
Test Case: Statistics.Quantile
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

A recent build of GHC (6.13.20100823) runs out of memory when building the statistics-0.6.0.2 package while compiling the Statistics/Quantile.hs file.

Note that I am _not_ using the LLVM backend.

Info:

gengar2:~ $ uname -a
Linux gengar2 2.6.18-128.1.1.el5.perfctr.2.6.38.ug.1 #1 SMP Sun Feb 22 21:38:17 CET 2009 x86_64 x86_64 x86_64 GNU/Linux

I'm on an Intel Xeon machine:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Xeon(R) CPU           L5420  @ 2.50GHz
stepping	: 6
cpu MHz		: 2500.093
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips	: 5003.65
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual

Relevant steps:

gengar2:~/.cabal/packages/hackage.haskell.org/statistics/0.6.0.2/statistics-0.6.0.2 $ ghc --info
 [("Project name","The Glorious Glasgow Haskell Compilation System")
 ,("Project version","6.13.20100823")
 ,("Booter version","6.12.1")
 ,("Stage","2")
 ,("Build platform","x86_64-unknown-linux")
 ,("Host platform","x86_64-unknown-linux")
 ,("Target platform","x86_64-unknown-linux")
 ,("Have interpreter","YES")
 ,("Object splitting","YES")
 ,("Have native code generator","YES")
 ,("Have llvm code generator","YES")
 ,("Support SMP","YES")
 ,("Unregisterised","NO")
 ,("Tables next to code","YES")
 ,("RTS ways","l debug  thr thr_debug thr_l thr_p  dyn debug_dyn thr_dyn thr_debug_dyn")
 ,("Leading underscore","NO")
 ,("Debug on","False")
 ,("LibDir","/user/home/gent/vsc400/vsc40075/data/ghc-llvm/lib/ghc-6.13.20100823")
 ,("Global Package DB","/user/home/gent/vsc400/vsc40075/data/ghc-llvm/lib/ghc-6.13.20100823/package.conf.d")
 ]
gengar2:~/.cabal/packages/hackage.haskell.org/statistics/0.6.0.2/statistics-0.6.0.2 $ cabal configure --user -v
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc --numeric-version
looking for package tool: ghc-pkg near compiler in
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin
found package tool in
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc-pkg
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc-pkg --version
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc --supported-languages
Reading installed packages...
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc-pkg dump --global
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc-pkg dump --user
Reading available packages...
Resolving dependencies...
selecting statistics-0.6.0.2 (hackage)
selecting base-4.3.0.0 (installed)
selecting erf-1.0.0.0 (installed)
selecting ffi-1.0 (installed)
selecting ghc-prim-0.2.0.0 (installed)
selecting integer-gmp-0.2.0.0 (installed)
selecting mwc-random-0.7.0.0 (installed)
selecting Cabal-1.9.2 (installed)
selecting array-0.3.0.0 (installed)
selecting bin-package-db-0.0.0.0 (installed)
selecting binary-0.5.0.2 (installed)
selecting bytestring-0.9.1.7 (installed)
selecting containers-0.3.0.0 (installed)
selecting directory-1.0.1.2 (installed)
selecting filepath-1.2.0.0 (installed)
selecting ghc-6.13.20100823 (installed)
selecting hpc-0.5.0.5 (installed)
selecting old-locale-1.0.0.2 (installed)
selecting old-time-1.0.0.5 (installed)
selecting pretty-1.0.1.1 (installed)
selecting primitive-0.3 (installed)
selecting process-1.0.1.3 (installed)
selecting rts-1.0 (installed)
selecting template-haskell-2.4.0.0 (installed)
selecting time-1.2.0.3 (installed)
selecting unix-2.4.0.1 (installed)
selecting vector-0.6.0.2 (installed)
selecting vector-algorithms-0.3.2 (installed)
Configuring statistics-0.6.0.2...
Dependency base ==4.3.0.0: using base-4.3.0.0
Dependency erf ==1.0.0.0: using erf-1.0.0.0
Dependency mwc-random ==0.7.0.0: using mwc-random-0.7.0.0
Dependency primitive ==0.3: using primitive-0.3
Dependency time ==1.2.0.3: using time-1.2.0.3
Dependency vector ==0.6.0.2: using vector-0.6.0.2
Dependency vector-algorithms ==0.3.2: using vector-algorithms-0.3.2
Using Cabal-1.8.0.2 compiled by ghc-6.12
Using compiler: ghc-6.13.20100823
Using install prefix: /user/home/gent/vsc400/vsc40075/.cabal
Binaries installed in: /user/home/gent/vsc400/vsc40075/.cabal/bin
Libraries installed in:
/user/home/gent/vsc400/vsc40075/.cabal/lib/statistics-0.6.0.2/ghc-6.13.20100823
Private binaries installed in: /user/home/gent/vsc400/vsc40075/.cabal/libexec
Data files installed in:
/user/home/gent/vsc400/vsc40075/.cabal/share/statistics-0.6.0.2
Documentation installed in:
/user/home/gent/vsc400/vsc40075/.cabal/share/doc/statistics-0.6.0.2
Using alex version 2.3.3 found on system at:
/user/home/gent/vsc400/vsc40075/.cabal/bin/alex
Using ar found on system at: /usr/bin/ar
No c2hs found
No cpphs found
No ffihugs found
Using gcc version 4.1.2 found on system at: /usr/bin/gcc
Using ghc version 6.13.20100823 found on system at:
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc
Using ghc-pkg version 6.13.20100823 found on system at:
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc-pkg
No greencard found
Using haddock version 2.7.2 found on system at:
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/haddock
Using happy version 1.18.4 found on system at:
/user/home/gent/vsc400/vsc40075/.cabal/bin/happy
No hmake found
Using hsc2hs version 0.67 found on system at:
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/hsc2hs
No hscolour found
No hugs found
No jhc found
Using ld found on system at: /usr/bin/ld
No lhc found
No lhc-pkg found
No nhc98 found
Using pkg-config version 0.21 found on system at: /usr/bin/pkg-config
Using ranlib found on system at: /usr/bin/ranlib
Using strip found on system at: /usr/bin/strip
Using tar found on system at: /bin/tar
gengar2:~/.cabal/packages/hackage.haskell.org/statistics/0.6.0.2/statistics-0.6.0.2 $ cabal build -v
Creating dist/build (and its parents)
Creating dist/build/autogen (and its parents)
Preprocessing library statistics-0.6.0.2...
Building statistics-0.6.0.2...
Building library...
Creating dist/build (and its parents)
/user/home/gent/vsc400/vsc40075/data/ghc-llvm/bin/ghc --make -package-name statistics-0.6.0.2 -hide-all-packages -fbuilding-cabal-package -i -idist/build -i. -idist/build/autogen -Idist/build/autogen -Idist/build -optP-include -optPdist/build/autogen/cabal_macros.h -odir dist/build -hidir dist/build -stubdir dist/build -package-id base-4.3.0.0-9f7ee17cf0c893cd92f8415a454f915e -package-id erf-1.0.0.0-234201e2383c09459ab94beb0df3b3e8 -package-id mwc-random-0.7.0.0-4eee58df7544c900191a0363293b359c -package-id primitive-0.3-32b8e2f221017a439ede3f200657c4e2 -package-id time-1.2.0.3-59ab28dd5cef69306532e1ebdf2c6002 -package-id vector-0.6.0.2-5ec473c310b06377a585b57e8cdec13a -package-id vector-algorithms-0.3.2-a3c01526213cd5570b76237e0b01b6d0 -O -fwarn-tabs -Wall -funbox-strict-fields Statistics.Autocorrelation Statistics.Constants Statistics.Distribution Statistics.Distribution.Binomial Statistics.Distribution.Gamma Statistics.Distribution.Geometric Statistics.Distribution.Exponential Statistics.Distribution.Hypergeometric Statistics.Distribution.Normal Statistics.Distribution.Poisson Statistics.Function Statistics.KernelDensity Statistics.Math Statistics.Quantile Statistics.Resampling Statistics.Resampling.Bootstrap Statistics.Sample Statistics.Sample.Powers Statistics.Types Statistics.Internal
[ 1 of 20] Compiling Statistics.Internal ( Statistics/Internal.hs, dist/build/Statistics/Internal.o )
[ 2 of 20] Compiling Statistics.Function ( Statistics/Function.hs, dist/build/Statistics/Function.o )
[ 3 of 20] Compiling Statistics.Types ( Statistics/Types.hs, dist/build/Statistics/Types.o )
[ 4 of 20] Compiling Statistics.Resampling ( Statistics/Resampling.hs, dist/build/Statistics/Resampling.o )
[ 5 of 20] Compiling Statistics.Distribution ( Statistics/Distribution.hs, dist/build/Statistics/Distribution.o )
[ 6 of 20] Compiling Statistics.Distribution.Geometric ( Statistics/Distribution/Geometric.hs, dist/build/Statistics/Distribution/Geometric.o )
[ 7 of 20] Compiling Statistics.Constants ( Statistics/Constants.hs, dist/build/Statistics/Constants.o )
[ 8 of 20] Compiling Statistics.Quantile ( Statistics/Quantile.hs, dist/build/Statistics/Quantile.o )
ghc: out of memory (requested 1048576 bytes)

For comparison, it does build using ghc-6.12.1.

gengar2:~ $ ghc --info
 [("Project name","The Glorious Glasgow Haskell Compilation System")
 ,("Project version","6.12.1")
 ,("Booter version","6.8.2")
 ,("Stage","2")
 ,("Have interpreter","YES")
 ,("Object splitting","YES")
 ,("Have native code generator","YES")
 ,("Support SMP","YES")
 ,("Unregisterised","NO")
 ,("Tables next to code","YES")
 ,("Win32 DLLs","")
 ,("RTS ways","l debug  thr thr_debug thr_l thr_p  dyn debug_dyn thr_dyn thr_debug_dyn")
 ,("Leading underscore","NO")
 ,("Debug on","False")
 ,("LibDir","/user/home/gent/vsc400/vsc40075/data/ghc-release-6.12.1/lib/ghc-6.12.1")
 ]
gengar2:~ $ cabal install statistics
Resolving dependencies...
Configuring vector-0.6.0.2...
Preprocessing library vector-0.6.0.2...

<snip>

Configuring statistics-0.6.0.2...
Preprocessing library statistics-0.6.0.2...
Building statistics-0.6.0.2...
[ 1 of 20] Compiling Statistics.Internal ( Statistics/Internal.hs, dist/build/Statistics/Internal.o )
[ 2 of 20] Compiling Statistics.Function ( Statistics/Function.hs, dist/build/Statistics/Function.o )
[ 3 of 20] Compiling Statistics.Types ( Statistics/Types.hs, dist/build/Statistics/Types.o )
[ 4 of 20] Compiling Statistics.Resampling ( Statistics/Resampling.hs, dist/build/Statistics/Resampling.o )
[ 5 of 20] Compiling Statistics.Distribution ( Statistics/Distribution.hs, dist/build/Statistics/Distribution.o )
[ 6 of 20] Compiling Statistics.Distribution.Geometric ( Statistics/Distribution/Geometric.hs, dist/build/Statistics/Distribution/Geometric.o )
[ 7 of 20] Compiling Statistics.Constants ( Statistics/Constants.hs, dist/build/Statistics/Constants.o )
[ 8 of 20] Compiling Statistics.Quantile ( Statistics/Quantile.hs, dist/build/Statistics/Quantile.o )
[ 9 of 20] Compiling Statistics.Sample ( Statistics/Sample.hs, dist/build/Statistics/Sample.o )
[10 of 20] Compiling Statistics.Distribution.Normal ( Statistics/Distribution/Normal.hs, dist/build/Statistics/Distribution/Normal.o )
[11 of 20] Compiling Statistics.Math  ( Statistics/Math.hs, dist/build/Statistics/Math.o )
[12 of 20] Compiling Statistics.Distribution.Binomial ( Statistics/Distribution/Binomial.hs, dist/build/Statistics/Distribution/Binomial.o )
[13 of 20] Compiling Statistics.Distribution.Gamma ( Statistics/Distribution/Gamma.hs, dist/build/Statistics/Distribution/Gamma.o )
[14 of 20] Compiling Statistics.Distribution.Hypergeometric ( Statistics/Distribution/Hypergeometric.hs, dist/build/Statistics/Distribution/Hypergeometric.o )
[15 of 20] Compiling Statistics.Distribution.Poisson ( Statistics/Distribution/Poisson.hs, dist/build/Statistics/Distribution/Poisson.o )
[16 of 20] Compiling Statistics.Sample.Powers ( Statistics/Sample/Powers.hs, dist/build/Statistics/Sample/Powers.o )
[17 of 20] Compiling Statistics.Distribution.Exponential ( Statistics/Distribution/Exponential.hs, dist/build/Statistics/Distribution/Exponential.o )
[18 of 20] Compiling Statistics.KernelDensity ( Statistics/KernelDensity.hs, dist/build/Statistics/KernelDensity.o )
[19 of 20] Compiling Statistics.Resampling.Bootstrap ( Statistics/Resampling/Bootstrap.hs, dist/build/Statistics/Resampling/Bootstrap.o )
[20 of 20] Compiling Statistics.Autocorrelation ( Statistics/Autocorrelation.hs, dist/build/Statistics/Autocorrelation.o )
Registering statistics-0.6.0.2...
Installing library in
/user/home/gent/vsc400/vsc40075/.cabal/lib/statistics-0.6.0.2/ghc-6.12.1
Registering statistics-0.6.0.2...

Attachments (3)

4270.hs (37.2 KB) - added by igloo 9 years ago.
small.hs (5.8 KB) - added by igloo 9 years ago.
tiny.hs (2.4 KB) - added by igloo 9 years ago.


Change History (17)

comment:1 Changed 9 years ago by Itkovian

comment:2 Changed 9 years ago by simonmar

Milestone: 6.14.1
Priority: normal → highest

Regression, we need to fix this before the release.

Changed 9 years ago by igloo

Attachment: 4270.hs added

comment:3 Changed 9 years ago by igloo

I looked into a slowdown higher up in the dependency tree:

Timing compilation of the attached 4270.hs with ghc -c 4270.hs -O, with 6.12.2 and HEAD, with and without USE_PRAGMAS defined:

                  6.12.2   HEAD
USE_PRAGMAS       1.60s    5.67s
No USE_PRAGMAS    1.76s    1.84s

My hunch is that HEAD is paying more attention to INLINE pragmas, and so generating more code, and thus using more memory.
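As a rough illustration of the comparison above, here is a minimal sketch of how a module can switch its INLINE pragmas on and off through a CPP macro. The contents of 4270.hs are not shown in this ticket, so the mechanism and the function name scale are assumptions made for the example only.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Hypothetical function; not taken from 4270.hs.
scale :: Int -> Int -> Int
scale k x = k * x
-- The pragma is only seen by the compiler when the macro is defined,
-- e.g. by passing -DUSE_PRAGMAS on the command line (assumed setup).
#ifdef USE_PRAGMAS
{-# INLINE scale #-}
#endif

main :: IO ()
main = print (map (scale 3) [1, 2, 3])
```

Under this assumed setup, compiling with ghc -c -O -DUSE_PRAGMAS would correspond to the "USE_PRAGMAS" row and omitting the flag to the "No USE_PRAGMAS" row.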

Changed 9 years ago by igloo

Attachment: small.hs added

Changed 9 years ago by igloo

Attachment: tiny.hs added

comment:4 Changed 9 years ago by simonpj

Cc: Roman.Leshchinskiy@… added

The culprit is zipWithM. It generates quite a lot of code. In the examples we make lots of copies of it, which dramatically increases the size of the modules.

Moreover, the functions are all monadic, so inlining them doesn't help much unless we know which monad. And in tiny.hs and small.hs, we don't.

The HEAD is worse than 6.12 in these examples because it (HEAD) optimises an INLINE function f, and generates code for it, just in case it appears in the form map f xs. That seems ok to me.

I don't see quite what to do about this. If you want to duplicate all the code for zipWithM at every call site, you're going to get lots of code. Are you sure you want this much duplication?

I guess the offending zipWithM may originally come from vector, so we need advice from Roman.

The main thought I have is that

  • we might mark zipWithM as INLINABLE
  • then in calling modules we might auto-SPECIALISE any INLINABLE imported functions

But that still might not do the job if you want to not only specialise zipWithM but also inline the specialised function at all its call sites.
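A minimal sketch of the INLINABLE plus SPECIALISE idea, using a list-based stand-in rather than the real stream function. The name zipWithM' and the choice of the Identity monad are assumptions for illustration, and the specialisation is written in the defining module only to keep the example self-contained; the suggestion above is about specialising in calling modules.

```haskell
module Main (main) where

import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical list-based stand-in for a large monadic combinator.
-- INLINABLE keeps the unfolding available to other modules without
-- asking GHC to copy it into every call site.
zipWithM' :: Monad m => (a -> b -> m c) -> [a] -> [b] -> m [c]
zipWithM' f (x:xs) (y:ys) = do
  z  <- f x y
  zs <- zipWithM' f xs ys
  return (z : zs)
zipWithM' _ _ _ = return []
{-# INLINABLE zipWithM' #-}

-- Request a single shared copy specialised to one known monad.
{-# SPECIALISE zipWithM' :: (a -> b -> Identity c) -> [a] -> [b] -> Identity [c] #-}

main :: IO ()
main = print (runIdentity (zipWithM' (\a b -> return (a + b)) [1, 2, 3] [10, 20, 30 :: Int]))
```

The specialised copy is shared between call sites, which is why this alone may not give the fully fused code that inlining at each call site would.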

Roman?

comment:5 Changed 9 years ago by rl

Cc: rl added

Is the oom problem in Quantile caused by this as well?

As to zipWithM and other stream functions, optimising them is pointless as they will always be inlined. If GHC fails to inline them for some reason, then it really won't matter whether they are optimised or not as performance will be quite atrocious in any case. This also means that it's only worth inlining them at the final call site.

There doesn't seem to be an easy way of specifying all this. I wonder, though, what GHC would do if I rewrote, say, zipWithM as a nullary function:

zipWithM = \f (Stream stepa sa0 na) (Stream stepb sb0 nb) -> ...

GHC might realise that it will always be inlined, even in map zipWithM, and could simply leave it alone. I don't think it does that at the moment and in any case, this is quite ugly. Perhaps we need a new pragma. Simon?
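To make the two definition styles concrete, here is a small sketch with a hypothetical two-argument function in place of zipWithM. The GHC User's Guide for current compilers says an INLINE function is inlined only when applied to as many arguments as appear syntactically on the left-hand side of its definition; whether the compiler version discussed in this ticket behaved the same way is not established here.

```haskell
module Main (main) where

-- Two arguments on the left-hand side: the INLINE unfolding is
-- considered only at call sites that supply both arguments.
addMul :: Int -> Int -> Int
addMul x y = x * 2 + y
{-# INLINE addMul #-}

-- No arguments on the left-hand side (the "nullary" style): a bare
-- occurrence, such as an argument to zipWith, already counts as
-- fully applied under the documented rule.
addMul' :: Int -> Int -> Int
addMul' = \x y -> x * 2 + y
{-# INLINE addMul' #-}

main :: IO ()
main = do
  print (zipWith addMul  [1, 2, 3] [10, 20, 30])
  print (zipWith addMul' [1, 2, 3] [10, 20, 30])
```

Both definitions compute the same results; they differ only in when GHC is willing to inline them.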

comment:6 in reply to:  5 ; Changed 9 years ago by simonpj

Replying to rl:

> Is the oom problem in Quantile caused by this as well?

I'm not sure, but I think Ian believes so.

> As to zipWithM and other stream functions, optimising them is pointless as they will always be inlined.

It really is a very big function to inline at every call site!

Simon

comment:7 in reply to:  6 Changed 9 years ago by rl

Replying to simonpj:

> > As to zipWithM and other stream functions, optimising them is pointless as they will always be inlined.
>
> It really is a very big function to inline at every call site!

Don't forget that as soon as it gets inlined into a loop, the cases get eliminated and then SpecConstr removes the loop state (the tuple with the Maybe) as well. This is actually the whole point.
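The shape of the code in question can be sketched with a simplified, pure stream type. This is not the vector library's definition: the names Step, Stream, enumFromToS, zipWithS and sumS are assumptions for illustration, and the real functions are monadic. The zip keeps its state in a tuple with a Maybe; once everything is inlined into the consuming loop, the case expressions on Yield, Skip and Done scrutinise known constructors, and SpecConstr (enabled at -O2) can then specialise the loop on the state tuple.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main (main) where

-- One step of a stream: produce a value, skip, or finish.
data Step s a = Yield a s | Skip s | Done

-- A stream is a step function plus an initial state.
data Stream a = forall s. Stream (s -> Step s a) s

enumFromToS :: Int -> Int -> Stream Int
enumFromToS lo hi = Stream step lo
  where
    step i | i > hi    = Done
           | otherwise = Yield i (i + 1)
{-# INLINE enumFromToS #-}

-- The zip state is (left state, right state, pending left element).
zipWithS :: (a -> b -> c) -> Stream a -> Stream b -> Stream c
zipWithS f (Stream stepa sa0) (Stream stepb sb0) = Stream step (sa0, sb0, Nothing)
  where
    step (sa, sb, Nothing) = case stepa sa of
      Yield x sa' -> Skip (sa', sb, Just x)
      Skip sa'    -> Skip (sa', sb, Nothing)
      Done        -> Done
    step (sa, sb, Just x) = case stepb sb of
      Yield y sb' -> Yield (f x y) (sa, sb', Nothing)
      Skip sb'    -> Skip (sa, sb', Just x)
      Done        -> Done
{-# INLINE zipWithS #-}

-- The consuming loop into which the step functions get inlined.
sumS :: Stream Int -> Int
sumS (Stream step s0) = go 0 s0
  where
    go acc s = case step s of
      Yield x s' -> let acc' = acc + x in acc' `seq` go acc' s'
      Skip s'    -> go acc s'
      Done       -> acc
{-# INLINE sumS #-}

main :: IO ()
main = print (sumS (zipWithS (*) (enumFromToS 1 3) (enumFromToS 4 6)))
```

If zipWithS were left as an out-of-line call, the loop would allocate a Step value and a state tuple on every iteration, which is the poor performance case described earlier in the thread.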

comment:8 in reply to:  5 ; Changed 9 years ago by igloo

Replying to rl:

> Is the oom problem in Quantile caused by this as well?

It's not an OOM problem in Quantile as such, it's ghc --make running out of memory when compiling the whole package. If you repeat the command, with the earlier modules already built, then compilation succeeds.

So it's not easy to see exactly what pushes the memory usage over the edge. I stopped at the first issue I found.

comment:9 in reply to:  8 Changed 9 years ago by simonmar

Priority: highest → high

Replying to igloo:

> It's not an OOM problem in Quantile as such, it's ghc --make running out of memory when compiling the whole package. If you repeat the command, with the earlier modules already built, then compilation succeeds.

That definitely shouldn't happen, because after completing a compilation we should discard everything except the exported interface of the module. So let's treat this as a space leak to be investigated.

comment:10 Changed 9 years ago by igloo

Owner: set to igloo

comment:11 Changed 9 years ago by igloo

Milestone: 7.0.1 → 7.0.2

comment:12 Changed 9 years ago by igloo

Milestone: 7.0.2 → 7.2.1

comment:13 Changed 9 years ago by michalt

Cc: michal.terepeta@… added

I can't reproduce that with HEAD (7.1.20110416); building the whole package gives:

634 MB total memory in use (0 MB lost due to fragmentation)

This is still quite a lot, but nowhere near 1 GB.

comment:14 Changed 9 years ago by igloo

Resolution: fixed
Status: new → closed

We think the extra memory use is just due to keeping all the unfoldings in memory (rather than loading them from interface files on demand).
