Opened 4 years ago

Closed 19 months ago

#11285 closed feature request (wontfix)

Split objects makes static linking really slow

Reported by: ezyang
Owned by:
Priority: high
Milestone:
Component: Compiler (Linking)
Version: 7.11
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description (last modified by ezyang)

Here's a comparison of a few builds of Setup.hs using GHC 7.10.3. In the first case, I am building using a version of GHC with split objects disabled on all libraries. In the second, split objects were enabled but Cabal was compiled without split objects. In the third, Cabal was built with split objects.

[ezyang@hs01 ezyang]$ rm Setup; time ghc-7.10-nosplitobjs/inplace/bin/ghc-stage2 --make Setup.hs -O0
rm: cannot remove ‘Setup’: No such file or directory
[1 of 1] Compiling Main             ( Setup.hs, Setup.o )
Linking Setup ...

real    0m0.950s
user    0m0.757s
sys     0m0.163s
[ezyang@hs01 ezyang]$ rm Setup; time ghc --make Setup.hs -O0
Linking Setup ...

real    0m1.209s
user    0m0.973s
sys     0m0.177s
[ezyang@hs01 ezyang]$ rm Setup; time ghc -no-user-package-db --make Setup.hs -O0
[1 of 1] Compiling Main             ( Setup.hs, Setup.o ) [Distribution.Simple changed]
Linking Setup ...

real    0m3.136s
user    0m2.693s
sys     0m0.407s

In my experience, Cabal is the MOST expensive library to compile with split objects (on my laptop, this is a 2x difference in link time); among base libraries, ld.gold visibly hitches when it has to link base.
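
As a rough aid for reading the three timings above, the relative slowdowns can be worked out with a short shell sketch. The helper names `to_seconds` and `slowdown` are made up for illustration; the durations are the `real` values copied from the transcript.

```shell
#!/bin/sh
# Hypothetical helpers, not part of GHC or Cabal.

# Convert a `time`-style duration such as "0m3.136s" into seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times slower duration "$2" is than the baseline "$1".
slowdown() {
    LC_ALL=C awk -v base="$(to_seconds "$1")" -v cur="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", cur / base }'
}

# Baseline: GHC built with split objects disabled on all libraries.
slowdown 0m0.950s 0m1.209s   # split-objects libraries, Cabal without: prints 1.27
slowdown 0m0.950s 0m3.136s   # Cabal built with split objects too: prints 3.30
```

On these numbers, building Cabal with split objects accounts for most of the extra link time.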

Slow link times make for an unpleasant experience for users, especially since we don't compile executables as dynamic by default. To make matters worse, boot libraries compiled with split objects represent a mandatory tax for anyone using static linking, because it's *not possible* to swap out those static archives for non-split-objects ones.

Could we enhance GHC to support running the linker in a "fast mode", where we ask the linker to treat archives as atomic units and not try to optimize for binary size? We can keep the current slow mode for production executables that people want to ship.

Change History (8)

comment:1 Changed 4 years ago by ezyang

Component: Compiler → Compiler (Linking)
Priority: normal → high
Type of failure: None/Unknown → Compile-time performance bug
Version: 7.10.3 → 7.11

comment:2 Changed 4 years ago by ezyang

Description: modified (diff)
Summary: Static linking is really slow sometimes → Split objects makes static linking really slow
Type: bug → feature request

I've diagnosed that split objects is the problem. I've rewritten the description to reflect this.

comment:3 Changed 4 years ago by rwbarton

8.0 has the -ffunction-sections-style replacement for -split-objs, right? Is that better or worse?

comment:4 Changed 4 years ago by ezyang

EDIT: The numbers here are wrong: -ffunction-sections was not actually enabled (it is not supported on Linux), so I was just testing the difference between no split objects and split objects.

Ha! On a quick and dirty test, -ffunction-sections is FOUR times worse for compiling Setup.hs on ld.bfd. However, it is TWO times better with ld.gold. (But not using split objects with gold is still the fastest.)

[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0-nosplitobjs/inplace/bin/ghc-stage2 -no-user-package-db --make Setup.hs -O0 -optl-fuse-ld=gold
[1 of 1] Compiling Main             ( Setup.hs, Setup.o )
Linking Setup ...

real    0m1.429s
user    0m1.250s
sys     0m0.163s
sys     0m0.583s
[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0/usr/bin/ghc -no-user-package-db --make Setup.hs -O0 -optl-fuse-ld=gold
Linking Setup ...

real    0m2.537s
user    0m2.310s
sys     0m0.220s
[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0-nosplitobjs/inplace/bin/ghc-stage2 -no-user-package-db --make Setup.hs -O0 
Linking Setup ...

real    0m11.349s
user    0m10.823s
sys     0m0.553s
[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0/usr/bin/ghc -no-user-package-db --make Setup.hs -O0 
[1 of 1] Compiling Main             ( Setup.hs, Setup.o )
Linking Setup ...

real    0m3.380s
user    0m2.867s
sys     0m0.500s

I don't think we can generally assume people will be using gold, so switching this on by default is probably unacceptable.

Last edited 4 years ago by ezyang

comment:5 Changed 4 years ago by olsner

ghc could very well default to using gold if it's available, I think. There are a few reasons to explicitly need ld.bfd (e.g. when using linker scripts), but for linking normal programs it shouldn't matter either way. To support "special" use cases, we'd just need to make sure -optl-fuse-ld=bfd overrides ghc's setting.
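
The "prefer gold, but let the user override" policy can be sketched at the level of a wrapper script. This is only an illustration under stated assumptions, not how GHC itself selects a linker: the function name `pick_linker_flag` is hypothetical, and the sketch assumes that gold is installed as an `ld.gold` binary on PATH.

```shell
#!/bin/sh
# Hypothetical wrapper helper, not part of GHC.
# Prints "-optl-fuse-ld=gold" when ld.gold is on PATH, unless the caller
# already passed an explicit -optl-fuse-ld=... choice (e.g. bfd).
pick_linker_flag() {
    for arg in "$@"; do
        case "$arg" in
            -optl-fuse-ld=*) return 0 ;;   # explicit choice wins; add nothing
        esac
    done
    if command -v ld.gold >/dev/null 2>&1; then
        echo "-optl-fuse-ld=gold"
    fi
}

# Intended use inside a wrapper (left commented out; requires ghc):
#   ghc $(pick_linker_flag "$@") "$@"
```

Because the helper adds nothing when a `-optl-fuse-ld=` flag is already present, a user who needs ld.bfd for linker scripts keeps full control.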

> Could we enhance GHC to support running the linker in a "fast mode"

I think this is not entirely up to the linking stage, as both split objects and function-sections are compile-time rather than link-time settings.

Something that could be done at the linking stage is linking against the incrementally linked libraries-for-ghci - both split objects and split sections are undone by the incremental linking step. That might just run into different bottlenecks though :)

Since #8405, --gc-sections is sent to the linker too. IIRC my previous experiments didn't find that it affected link times much unless actually using -split-sections for the installed libraries, but it could be moved to an explicit flag if need be. The downside of that is that users then have to learn a new flag to get smaller binaries.

comment:6 Changed 4 years ago by ezyang

olsner: It's not, unless we install both "slow to link" and "quick to link" versions of static libraries. As you mention, this might be helpful anyway for loading statically linked libraries to GHCi.

I *believe* my experiments showed that --gc-sections didn't really cost you anything if you weren't using -split-sections. So it should be fine to continue to pass it. Disabling it didn't really help with link times anyway.

comment:7 Changed 4 years ago by ezyang

I gave some bogus numbers, because the build system did not inform me that SplitSections doesn't actually do anything on Linux yet. So someone will have to do this test on Mac OS X and tell us what the difference is.

comment:8 Changed 19 months ago by bgamari

Resolution: wontfix
Status: new → closed

Given that (a) split objects will soon be ripped out and replaced with split sections (likely for 8.6, see #13939), and (b) we now use ld.gold when possible (#13541), I think this can be safely closed.
