Opened 16 months ago

Closed 9 months ago

#15105 closed bug (fixed)

`typecheckModule` from GHC API crashes on MacOS for files with TH

Reported by: harpocrates
Owned by:
Priority: high
Milestone: 8.6.3
Component: GHC API
Version: 8.4.2
Keywords:
Cc: lerkok, chak
Operating System: MacOS X
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

I believe this is the same issue that causes manually built haddock and doctest to crash on MacOS when fed TH (originally reported at https://github.com/haskell/haddock/issues/767 and https://github.com/sol/doctest/issues/199).

I've attached a minimal program that uses the GHC API and exhibits the same problem.

$ ghc-8.4.2 -package ghc -package containers -package ghc-paths Prog.hs
[1 of 1] Compiling Main             ( Prog.hs, Prog.o )
Linking Prog ...
$ ./Prog Main-no-TH.hs -package template-haskell
$ ./Prog Main-TH.hs -package template-haskell
Prog:
lookupSymbol failed in relocateSection (RELOC_GOT)
/usr/local/lib/ghc-8.4.2/integer-gmp-1.0.2.0/HSinteger-gmp-1.0.2.0.o: unknown symbol `___gmp_rands'
Prog: Prog: unable to load package `integer-gmp-1.0.2.0'

In case it isn't clear, I do not expect Main-TH.hs to crash Prog.

Attachments (4)

Main-no-TH.hs (51 bytes) - added by harpocrates 16 months ago.
Main-TH.hs (62 bytes) - added by harpocrates 16 months ago.
Prog.hs (693 bytes) - added by harpocrates 16 months ago.
symboltable.xz (20.9 KB) - added by berdario 9 months ago.
Symbol table for HSinteger-gmp-1.0.2.0.o


Change History (19)

Changed 16 months ago by harpocrates

Attachment: Main-no-TH.hs added

Changed 16 months ago by harpocrates

Attachment: Main-TH.hs added

Changed 16 months ago by harpocrates

Attachment: Prog.hs added

comment:1 Changed 16 months ago by harpocrates

The problem doesn't manifest itself with an inplace GHC.

$ diff Prog.hs InplaceProg.hs
4d3
< import qualified GHC.Paths as GhcPaths
9c8
<   libDir <- pure GhcPaths.libdir
---
>   let libDir = "./inplace/lib"

Then,

$ ./inplace/bin/ghc-stage2 -package ghc -package containers InplaceProg.hs
[1 of 1] Compiling Main             ( InplaceProg.hs, InplaceProg.o )
Linking InplaceProg ...
$ ./InplaceProg  Main-no-TH.hs -package template-haskell
$ ./InplaceProg  Main-TH.hs -package template-haskell

comment:2 Changed 16 months ago by darchon

We have some code in the Clash compiler to do with dynamic linking, which I expect is TH-related: https://github.com/clash-lang/clash-compiler/blame/d7dfef9d89b30d096370887899be24b9027914ac/clash-ghc/src-ghc/Clash/GHC/LoadModules.hs#L174-L179

    let ghcDynamic = case lookup "GHC Dynamic" (DynFlags.compilerInfo dflags) of
                      Just "YES" -> True
                      _          -> False
    let dflags3 = if ghcDynamic then DynFlags.gopt_set dflags2 DynFlags.Opt_BuildDynamicToo
                                else dflags2
    _ <- GHC.setSessionDynFlags dflags3

I don't have access to an OS X machine, so could you check if setting the Opt_BuildDynamicToo flag fixes the TH issue?

comment:3 in reply to:  2 Changed 16 months ago by harpocrates

Replying to darchon:

I just tried this on OS X and it unfortunately makes no difference: in either case, lookup "GHC Dynamic" (DynFlags.compilerInfo dynflags') produced Just "NO" (and then manually enabling DynFlags.Opt_BuildDynamicToo also didn't change anything).

comment:4 Changed 15 months ago by lerkok

Cc: lerkok added

comment:5 Changed 15 months ago by bgamari

Any updates on this, harpocrates? Judging by the haddock ticket, it sounds like you are pretty close.

comment:6 Changed 15 months ago by bgamari

On https://github.com/haskell/haddock/issues/767 harpocrates said,

Alright, I think I'm close to nailing this. I think there is something suspicious about the HSinteger-gmp-1.0.2.0.o file from GHC's integer-gmp lib folder. I think all that we need to do to fix this issue is remove this file. When I do that, everything starts working again.

Note that the symbols provided by this object file are already in the static library of the same name. Furthermore, when I manually install GHC (after building from source), this troublesome file is not present.

I'm obviously still investigating, but I figured I'd ask if anyone knows any reason HSinteger-gmp-1.0.2.0.o should not be deleted...

This object is a monolithic relocatable object used by statically-linked GHCi. Are you seeing it being loaded?

comment:7 Changed 15 months ago by chak

Cc: chak added

comment:8 Changed 15 months ago by harpocrates

After removing HSinteger-gmp-1.0.2.0.o more than two weeks ago, I stopped having this issue. Since then, I haven't encountered any other issues/side-effects (isn't a statically linked GHCi the default? GHCi is still working for me...). Everything has continued to function just fine and as expected.

I confess I haven't really spent any time on this since that last comment on the Haddock ticket. I'd be glad to help (since I can replicate the bug quite easily), but I'm not sure what to investigate next.

comment:9 Changed 15 months ago by lerkok

I can confirm that the workaround @harpocrates is describing works for me. (In particular, the doctest package on Mac.)

comment:10 Changed 15 months ago by simonmic

Also confirmed: I can run doctests again and have not yet seen any ill effects from the following:

~$ locate HSinteger-gmp-1.0.2.0.o
/Users/simon/.stack/programs/x86_64-osx/ghc-8.4.2/lib/ghc-8.4.2/integer-gmp-1.0.2.0/HSinteger-gmp-1.0.2.0.o
~$ mv /Users/simon/.stack/programs/x86_64-osx/ghc-8.4.2/lib/ghc-8.4.2/integer-gmp-1.0.2.0/HSinteger-gmp-1.0.2.0.o{,_DISABLE_GHC_ISSUE_15105}

comment:11 Changed 14 months ago by bgamari

Milestone: 8.6.1 → 8.8.1

These won't be addressed for GHC 8.6.

comment:12 Changed 9 months ago by bgamari

Milestone: 8.8.1 → 8.6.3
Priority: normal → high

Apparently several others have encountered this. Bumping in priority.

Changed 9 months ago by berdario

Attachment: symboltable.xz added

Symbol table for HSinteger-gmp-1.0.2.0.o

comment:13 Changed 9 months ago by bgamari

I have been looking into this with Dario Bertini at MuniHac. It appears that ___gmp_rands is a "common" symbol. With linker debugging enabled I see the following output:

relocateSection: making jump island for ___stack_chk_guard, extern = 1, X86_64_RELOC_GOT
lookupSymbol: looking up ___stack_chk_guard
lookupSymbol: symbol not found
lookupSymbol: looking up ___stack_chk_guard with dlsym
relocateSection: looked up ___stack_chk_guard, external X86_64_RELOC_GOT or X86_64_RELOC_GOT_LOAD
               : addr = 0x7fffabebd070
relocateSection: value = 0x10e3e1d50
relocateSection: relocation 8481
               : type      = 3
               : address   = 283153
               : symbolnum = 4303
               : pcrel     = 1
               : length    = 2
               : extern    = 1
               : type      = 3
relocateSection: length = 2, thing = 0, baseValue = 0x10e3125f5
relocateSection: making jump island for ___gmp_rands, extern = 1, X86_64_RELOC_GOT
lookupSymbol: looking up ___gmp_rands
lookupSymbol: symbol not found
lookupSymbol: looking up ___gmp_rands with dlsym
relocateSection: looked up ___gmp_rands, external X86_64_RELOC_GOT or X86_64_RELOC_GOT_LOAD
               : addr = 0x0
ghc-stage2:
lookupSymbol failed in relocateSection (RELOC_GOT)
/Users/berdario/Projects/ghc/libraries/integer-gmp/dist-install/build/HSinteger-gmp-1.0.2.0.o: unknown symbol `___gmp_rands'

removeLibrarySearchPath: ptr = `0x0'

In principle common symbols do appear to be handled by ocGetNames_MachO.
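
For readers unfamiliar with the term, here is a minimal sketch of how a Mach-O symbol table entry is usually recognised as a "common" symbol: it is undefined and external, but carries a non-zero n_value that gives the size of the storage the linker is expected to allocate. Such a symbol has no address in the object itself, which would be consistent with the dlsym lookup above coming back as 0x0. The constants mirror <mach-o/nlist.h>; ToySymbol, is_common, and the example sizes mentioned afterwards are assumptions made up for illustration and are not taken from the RTS source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants mirroring <mach-o/nlist.h>, redefined here so the sketch
   compiles on any platform. */
#define N_TYPE 0x0e /* mask selecting the symbol type bits */
#define N_EXT  0x01 /* external symbol bit */
#define N_UNDF 0x00 /* undefined: not assigned to any section */
#define N_SECT 0x0e /* defined in a section of this object */

/* Simplified stand-in for struct nlist_64 (hypothetical type). */
typedef struct {
    const char *name;    /* real entries hold a string-table index instead */
    uint8_t     n_type;  /* type and flag bits */
    uint8_t     n_sect;  /* section number, 0 if none */
    uint64_t    n_value; /* address, or the size in bytes for a common symbol */
} ToySymbol;

/* A common symbol is undefined, external, and has a non-zero n_value. */
bool is_common(const ToySymbol *s)
{
    return (s->n_type & N_TYPE) == N_UNDF
        && (s->n_type & N_EXT) != 0
        && s->n_value != 0;
}

/* A plain undefined external has n_value == 0 and must be resolved
   elsewhere, for example in an already loaded library. */
bool is_plain_undefined(const ToySymbol *s)
{
    return (s->n_type & N_TYPE) == N_UNDF
        && (s->n_type & N_EXT) != 0
        && s->n_value == 0;
}
```

Under this sketch, an entry such as { "___gmp_rands", N_UNDF | N_EXT, 0, 16 } (the size 16 is invented) is classified as common, while { "___stack_chk_guard", N_UNDF | N_EXT, 0, 0 } is a plain undefined external, matching the log above where the latter is resolved through dlsym.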

comment:14 Changed 9 months ago by Ben Gamari <ben@…>

In 25489085/ghc:

rts/MachO: Iterate through N (all) symbols, not M external symbols

Fixes #15105
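
The commit title suggests a loop-bound mistake: a scan over the symbol table was limited by the count of external symbols rather than by the total number of symbols. The toy sketch below illustrates that class of bug only; it is not the RTS code, and the table contents, ToySymbol, count_commons, and count_externals are all hypothetical names and values chosen for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Constants mirroring <mach-o/nlist.h>. */
#define N_TYPE 0x0e /* mask selecting the symbol type bits */
#define N_EXT  0x01 /* external symbol bit */
#define N_UNDF 0x00 /* undefined */
#define N_SECT 0x0e /* defined in a section */

/* Simplified stand-in for a symbol table entry (hypothetical type). */
typedef struct {
    const char *name;
    uint8_t     n_type;
    uint64_t    n_value; /* address, or size for a common symbol */
} ToySymbol;

/* Toy table: two locals first, then three externals, the last of which
   is a common symbol (undefined, external, non-zero size). */
const ToySymbol toy_table[] = {
    { "_local_helper",      N_SECT,         0x100 },
    { "_local_table",       N_SECT,         0x200 },
    { "_exported_fn",       N_SECT | N_EXT, 0x300 },
    { "___stack_chk_guard", N_UNDF | N_EXT, 0     },
    { "___gmp_rands",       N_UNDF | N_EXT, 16    },
};
const size_t toy_total = sizeof toy_table / sizeof toy_table[0];

/* Number of entries with the external bit set ("M" in the commit title). */
size_t count_externals(const ToySymbol *syms, size_t total)
{
    size_t m = 0;
    for (size_t i = 0; i < total; i++)
        if (syms[i].n_type & N_EXT)
            m++;
    return m;
}

/* Count common symbols among the first `limit` entries of the table. */
size_t count_commons(const ToySymbol *syms, size_t limit)
{
    size_t found = 0;
    for (size_t i = 0; i < limit; i++) {
        int undefined = (syms[i].n_type & N_TYPE) == N_UNDF;
        int external  = (syms[i].n_type & N_EXT) != 0;
        if (undefined && external && syms[i].n_value != 0)
            found++;
    }
    return found;
}
```

With this table, count_commons(toy_table, count_externals(toy_table, toy_total)) stops after three entries and never reaches ___gmp_rands, whereas count_commons(toy_table, toy_total) visits all five entries and finds it. A symbol skipped in this way would never be given storage, so a later lookup for it would fail.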

comment:15 Changed 9 months ago by bgamari

Resolution: fixed
Status: new → closed