Opened 4 years ago

Last modified 4 years ago

#11561 new feature request

Have static ghci link against its own copy of its libraries

Reported by: rwbarton
Owned by:
Priority: low
Milestone:
Component: Runtime System (Linker)
Version: 8.1
Keywords:
Cc: trommler
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime performance bug
Test Case:
Blocked By:
Blocking:
Related Tickets: #11238
Differential Rev(s):
Wiki Page:

Description

When ghci is built dynamically, only one copy of each library that makes up ghci (the ghc library, plus base, ghc-prim and all the other dependencies of ghc or ghci) is loaded, and that copy is used both by ghc(i) itself and by the code run inside ghci. Could we do the same for statically-built ghci?

Here is a possibility.

  • Ensure through suitable linker flags that all objects in the archives for the libraries that make up ghci are linked into the ghci executable. (Or link the .o files that we also produce (except for ghc itself), instead.)
  • Through more linker flags, add dynamic symbol table entries for all the symbols in the ghci executable that might be referenced from code run inside ghci; say all the symbols that the runtime linker would add to its internal symbol table.

This is a bit weird, because the ghci executable is not relocatable, but I can't see any specific reason why it would be wrong.

  • In the runtime linker, don't load any library that was linked into ghci. Instead, trust that dlsym() (or the OS-specific equivalent) will find symbols from those libraries.
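The lookup in the last step can be sketched as follows. This is a minimal illustration, not GHC's runtime linker (which is part of the RTS and written in C). It uses Python's ctypes on a POSIX system, where `ctypes.CDLL(None)` corresponds to `dlopen(NULL)`, and the helper name `find_in_process` is hypothetical. Symbols defined in the executable itself are only visible this way if they have dynamic symbol table entries, which is what the second step is for. With GNU ld, `--export-dynamic` is one flag that does this; which flags GHC's build would actually use is an assumption here.

```python
import ctypes


def find_in_process(symbol):
    """Return the address the system dynamic loader has for `symbol`, or None.

    Hypothetical helper: it stands in for "trust dlsym() to find the symbol"
    instead of loading and relocating an object file ourselves.
    """
    # CDLL(None) is dlopen(NULL): a handle covering the main program and
    # the shared libraries already loaded into this process (POSIX only).
    process = ctypes.CDLL(None)
    try:
        # Attribute access on a CDLL handle performs the dlsym() lookup.
        func = getattr(process, symbol)
    except AttributeError:
        # dlsym() failed: nothing in the process exports this name.
        return None
    return ctypes.cast(func, ctypes.c_void_p).value


if __name__ == "__main__":
    # malloc is exported by the C library that the interpreter already uses.
    print(find_in_process("malloc") is not None)
```

A runtime linker following the proposal would consult a lookup like this for any package known to be linked into the executable, and fall back to loading object code only for other packages.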

Pros:

  • ghci startup time (and time to load other libraries linked into ghc) should be reduced since the runtime linker doesn't need to process a bunch of relocations. ghci startup time is particularly noticeable on, say, ARM. (On my tablet it takes about a second.) Not sure whether the dynamic loader has to process the executable's dynamic symbol table at startup time, but I assume that even if it does, that process is much faster.
  • We can eliminate the .o file for each library linked into ghci from the binary distribution, at the cost of having to link that entire .o file into ghci. The net savings would be whatever part of that object file would have been linked into ghci anyways. (Note that we do not currently distribute a .o file for the ghc library anyways. My estimate for the total savings is on the order of 20MB.)

Unknown:

  • The total process size of ghci might go up or down. For each library that is eventually loaded into ghci anyways, we save space equal to the amount of code that would have been linked into ghci from that library. For each library that is not eventually loaded into ghci, we use additional space equal to the amount of code in that library that would not have been linked into ghci.

We could exclude some libraries from this treatment if they make for particularly bad tradeoffs here and are not depended on by other libraries that we want to include. We might exclude the ghc package just because it is rarely loaded into ghci, even though most or all of it will be linked into ghci anyways, to leave more flexibility for its dependencies.
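The process-size tradeoff above can be written down as a small model. This is a sketch under stated assumptions: the `Lib` record, its field names and the example sizes are hypothetical, and it counts only code size, ignoring any per-library bookkeeping overhead.

```python
from dataclasses import dataclass


@dataclass
class Lib:
    name: str
    total: int    # size of all code in the library
    linked: int   # part an ordinary static link would pull into ghci
    loaded: bool  # whether the session eventually loads the library


def size_delta(lib):
    """Change in process size under the proposal (negative means smaller)."""
    if lib.loaded:
        # Today: `linked` in the executable plus a full loaded copy.
        # Proposal: one full copy in the executable. Saves `linked`.
        return -lib.linked
    # Today: only `linked`. Proposal: the full library. Costs the rest.
    return lib.total - lib.linked


def total_delta(libs):
    return sum(size_delta(lib) for lib in libs)


if __name__ == "__main__":
    # Made-up sizes, in arbitrary units.
    libs = [
        Lib("base", total=10, linked=6, loaded=True),
        Lib("ghc", total=40, linked=38, loaded=False),
    ]
    for lib in libs:
        print(lib.name, size_delta(lib))
    print("total", total_delta(libs))
```

Excluding a library from the treatment corresponds to dropping it from the list, which is worthwhile in this model when its `size_delta` is large and positive.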

Cons:

  • ?

Change History (5)

comment:1 Changed 4 years ago by rwbarton

I don't know how this would interact with remote ghci. Badly, I guess.

comment:2 Changed 4 years ago by trommler

Cc: trommler added

I proposed something similar as part of the dynamic linker redesign (#11238).

My idea in that ticket was to use the system runtime linker on architectures where we do not have RTS linker support (e.g. powerpc64[le]) even in a statically linked GHCi.

The ticket mentioned above has more information and a link to a Wiki page with the design proposal (in progress).

comment:3 in reply to:  1 ; Changed 4 years ago by trommler

Replying to rwbarton:

> I don't know how this would interact with remote ghci. Badly, I guess.

Why? Remote GHCi lives in its own process, so the way the parent process is linked does not matter at all. Am I missing something fundamental here?

comment:4 in reply to:  3 Changed 4 years ago by rwbarton

Replying to trommler:

> Replying to rwbarton:
>
> > I don't know how this would interact with remote ghci. Badly, I guess.
>
> Why? Remote GHCi lives in its own process, so the way the parent process is linked does not matter at all. Am I missing something fundamental here?

I just thought that the pros would not apply since the Haskell libraries linked into the main ghc(i) process cannot be shared with the ones used by code loaded into the remote ghci process. But I didn't think about the fact that the remote ghci process also has its own set of Haskell libraries (assuming again that it is statically linked). So in fact the same benefits apply there; maybe reduced if ghc-iserv links against fewer Haskell libraries than ghc does.

comment:5 in reply to:  2 Changed 4 years ago by rwbarton

Replying to trommler:

> I proposed something similar as part of the dynamic linker redesign (#11238).
>
> My idea in that ticket was to use the system runtime linker on architectures where we do not have RTS linker support (e.g. powerpc64[le]) even in a statically linked GHCi.
>
> The ticket mentioned above has more information and a link to a Wiki page with the design proposal (in progress).

Thanks for the link, I'll take a look at the wiki page referenced there.
