Opened 17 months ago

Closed 14 months ago

Last modified 13 months ago

#15051 closed bug (fixed)

-split-objs generates excessively many files on Windows

Reported by: kanetw
Owned by: Phyx-
Priority: highest
Milestone: 8.6.1
Component: Compiler (CodeGen)
Version: 8.5
Keywords:
Cc:
Operating System: Windows
Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s): Phab:D4915 Phab:D4916
Wiki Page:

Description

A BuildFlavour = perf build generated ~18k .S files for ghc-prim:GHC.Types, which caused it to basically get stuck compiling all those assembly files. Removing -split-objs (via SplitObjs = NO) significantly sped up compilation.

Change History (17)

comment:1 Changed 17 months ago by kanetw

Version: 8.2.2 → 8.5

comment:2 Changed 16 months ago by Phyx-

Priority: normal → highest

Default builds are now broken. It's now generating over 18k split markers, which means Windows builds don't finish in any reasonable amount of time.

Does ghc-prim really have over 18k symbols in a single module now? Other platforms don't notice this because they use split-sections, which we can't use on Windows due to the lack of a fast linking process and a non-existent x86 implementation.

*sigh* perhaps it's time to fix this.

comment:3 Changed 16 months ago by Phyx-

@kanetw do you happen to know which hash you noticed this on first? It seems to affect the 8.4.1 release tag already. I guess @bgamari just waited hours and hours for the builds to finish?

comment:4 Changed 16 months ago by kanetw

Um, whatever build was most recent 7 weeks ago. I can give you the hash once I'm back home, but that won't be for another week.

comment:5 Changed 15 months ago by bgamari

It looks like the source of the symbols is the Typeable KindReps generated for the tuple types. However, I'm rather surprised that we are just seeing this now; this logic has been present since 8.2. Are you sure this is a new problem?

comment:6 Changed 15 months ago by bgamari

Owner: set to Phyx-

comment:7 Changed 15 months ago by Phyx-

Architecture: x86_64 (amd64) → Unknown/Multiple
Component: Compiler → Compiler (CodeGen)
Type of failure: None/Unknown → Compile-time performance bug

This is beyond annoying. The Windows tarballs are useless after this change. A full compile takes a day and a half in validate mode. Even compiling a simple module now takes seconds.

So I'm just going to kill SPLIT_OBJS as this change has effectively made it useless. Compiling a simple Cabal module alone takes 900 fork/exec calls.

comment:8 Changed 15 months ago by Phyx-

Differential Rev(s): Phab:D4915 Phab:D4916

comment:9 Changed 15 months ago by bgamari

Yes, I think just disabling (and ultimately removing) split objects is the right idea here. This is something I've been looking to do for quite some time.

comment:10 Changed 15 months ago by Phyx-

It's unfortunate that it leaves the Windows port in a very icky situation. It either gets slower, gets much slower, or gets bigger.

comment:11 Changed 15 months ago by bgamari

I agree, but disk space is reasonably cheap now. It seems to me like accepting a temporary increase in size and focusing effort on fixing split sections is a better use of time than trying to fix what is ultimately a dead-end.

comment:12 Changed 15 months ago by Phyx-

True, from my own checks the increase in size doesn't seem to be that much, so yeah, I also think disabling it is the way forward. Also, fundamentally there's nothing that can really be done about it: split-objs just won't scale.

comment:13 Changed 14 months ago by Ben Gamari <ben@…>

In 53649947/ghc:

split-obj: disable split-objects on Windows.

A change has caused GHC to generate excessive specializations.
This is making GHC generate 1800 splits for a simple GHC.Prim module,
which means 1800 fork/exec calls.

Due to this compilation times on Windows with split-objs on take over
24 hours to complete depending on your disk speed.  Also the end
compiler compiling medium to large project is also much slower.

So I think we need to just disable split-objects. As there's nothing
that can be done about this.

Test Plan: ./validate

Reviewers: bgamari

Subscribers: tdammers, rwbarton, thomie, erikd, carter

GHC Trac Issues: #15051

Differential Revision: https://phabricator.haskell.org/D4915

comment:14 Changed 14 months ago by Zemyla

I've noticed this problem too, but I'm not sure getting rid of split-objs is the solution. The problem, it seems to me, is that it's (a) splitting with a Perl script, and (b) calling gcc with each and every single individual file produced.

Is it possible to get rid of the Perl script, at least on Windows, by specializing the split-objects procedure for Windows x86/x64 builds and then using gcc to build many assembly files at a time? Actually, now that I think about it, why is it even using gcc to convert the assembly files into object files in the first place? Doesn't it convert the code directly into an object file when split-objs isn't used?

comment:15 Changed 14 months ago by bgamari

Resolution: fixed
Status: new → closed

comment:16 Changed 14 months ago by Phyx-

> I've noticed this problem too, but I'm not sure getting rid of split-objs is the solution. The problem, it seems to me, is that it's (a) splitting with a Perl script, and (b) calling gcc with each and every single individual file produced.

(a) is not the problem. The Perl script just does a simple linear scan looking for split markers and breaks the file up at each one. The overall runtime of the split script is negligible.
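A minimal Python sketch of the kind of linear scan described in (a). It is not the actual Perl script: it only shows a single pass over the assembly text that starts a new chunk at each marker line. The marker string `__split_marker` and the function name `split_at_markers` are hypothetical, chosen purely for illustration.

```python
from typing import Iterable, List

# Hypothetical marker text; the marker GHC really emits may differ.
SPLIT_MARKER = "__split_marker"


def split_at_markers(lines: Iterable[str], marker: str = SPLIT_MARKER) -> List[List[str]]:
    """Single linear pass: start a new chunk at every line containing the marker."""
    chunks: List[List[str]] = [[]]
    for line in lines:
        if marker in line:
            # Marker lines only delimit chunks; they are not copied into the output.
            chunks.append([])
        else:
            chunks[-1].append(line)
    # Drop empty chunks (for example when a marker is the first or last line).
    return [chunk for chunk in chunks if chunk]


assembly = [
    "f_entry:",
    "  ret",
    "__split_marker_1",
    "g_entry:",
    "  ret",
    "__split_marker_2",
    "h_entry:",
    "  ret",
]
chunks = split_at_markers(assembly)
print(len(chunks))
```

Each chunk would then be written to its own .s file and assembled by a separate external process, which is where the per-file cost discussed in this ticket comes from, rather than from the scan itself.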

(b) It won't work without this. split-objs just exploits the fact that linkers pull in library code on demand: if no symbol in an object file in an archive is needed, that object file is not pulled in. split-sections uses "linker stubs" to get the same effect: the tiny object files are pre-linked together into one giant object file in which each original .s file is a new stub. Passing all the .s files to the assembler at once will cause it to merge the contents of the sections, and you'd get one linker stub. The net effect would be the same as not splitting to begin with. While you could force the creation of a linker stub with a .file directive for each part, the result still won't be the same, as your .text section header will still be merged.
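The on-demand behaviour in (b) can be illustrated with a toy model in Python. This is only a sketch of the general archive-linking idea, not how any real linker is implemented. The member names (`f.o`, `M.o`), the symbol names and the function `members_pulled_in` are made up for the example.

```python
from typing import Dict, List, Set, Tuple

# A toy archive member: (symbols it defines, symbols it references).
Member = Tuple[Set[str], Set[str]]


def members_pulled_in(archive: Dict[str, Member], roots: Set[str]) -> List[str]:
    """Return the archive members needed to resolve the root symbols."""
    # Map every defined symbol to the member that defines it.
    defined_in = {sym: name for name, (defs, _) in archive.items() for sym in defs}
    pulled: Set[str] = set()
    seen: Set[str] = set()
    pending = list(roots)
    while pending:
        sym = pending.pop()
        if sym in seen:
            continue
        seen.add(sym)
        name = defined_in.get(sym)
        if name is None or name in pulled:
            continue
        # A member is pulled in as a whole, together with everything it references.
        pulled.add(name)
        pending.extend(archive[name][1])
    return sorted(pulled)


# Split build: one tiny object file per function.
split_archive = {
    "f.o": ({"f"}, {"g"}),
    "g.o": ({"g"}, set()),
    "h.o": ({"h"}, set()),
}
# Unsplit build: the whole module is a single object file.
unsplit_archive = {"M.o": ({"f", "g", "h"}, set())}

print(members_pulled_in(split_archive, {"f"}))
print(members_pulled_in(unsplit_archive, {"f"}))
```

In the split archive a program that only needs `f` leaves `h.o` out of the link, while in the unsplit archive the single member carrying `h` comes along regardless. This is the effect that is lost if the pieces are merged back into one object file.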

> Is it possible to get rid of the Perl script, at least on Windows, by specializing the split-objects procedure for Windows x86/x64 builds and then using gcc to build many assembly files at a time?

Not really, as I've explained above.

> Actually, now that I think about it, why is it even using gcc to convert the assembly files into object files in the first place? Doesn't it convert the code directly into an object file when split-objs isn't used?

No, GHC doesn't have an assembler or static linker. It always uses an external program for both of these cases. It only has a runtime linker for GHCi and Template Haskell.

As it stands, -split-objs is simply a dead approach. It's better to invest the time in getting -split-sections working, which doesn't rely on hacks.

comment:17 Changed 13 months ago by Ben Gamari <ben@…>

In 23774c98/ghc:

function-section: enable on windows

gc-sections was once observed to be slow on Windows, which is the only
reason it's not enabled yet.  However, it seems to be better now.

Test Plan: ./validate

Reviewers: bgamari

Reviewed By: bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #15051

Differential Revision: https://phabricator.haskell.org/D4916