Opened 5 years ago

Last modified 11 months ago

#9718 patch task

Avoid TidyPgm predicting what CorePrep will do

Reported by: simonpj Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.8.3
Keywords: CodeGen, CAFs Cc: osa1, simonmar, sgraf
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D5432
Wiki Page:

Description (last modified by simonpj)

At the moment the TidyPgm pass is forced to predict, accurately but unpleasantly, some aspects of what CorePrep and Core2Stg will do. Reason:

  • Each Id in the interface file records (among other things) the arity of the Id, and whether it has CAF references
  • We really only know these two things for sure after CorePrep. The conversion from Core to STG makes no structural changes.
  • However the result of TidyPgm (which preceded CorePrep) is used to generate the interface file. So it has to predict the arity and CAF-ref status of each Id.
  • This is not good. It restricts what CorePrep can do (notably, it must not change the arity of a top-level Id), and it leads to unsavoury code (e.g. look at the call to CorePrep.cvtLitInteger in TidyPgm.cafRefsL. It's also dangerous: an inconsistency could lead to a crash.

This is a long-standing problem. My current thought for how to unravel it is this:

  • TidyPgm does not attach arity or CAF-ref info.
  • Instead, run CorePrep after TidyPgm, and generate accurate arity and CAF-ref info
  • Then use that auxiliary mapping during the conversion from tidied program to ModIface.

I don't think this would be hard. It would mean that the tidied program and the core-prep'd program would have to exist in memory at the same time.

An alternative would be to generate the ModIface from the tidied program sans arity and CAF-ref info, and then, after CorePrep run over it to add arity and CAF-ref info. (You'd have to do this before generating the fingerprints.) The advantage of this is that the ModIface can be a lot smaller than the code for the entire module.

Another alternative would be to ensure that after CoreTidy we treat each top-level binding one at a time, and pump them right down the pipeline individually, all the way through code generation. That way we would avoid creating the STG, or Cmm, for the entire program all at once.

A long-standing wart which needs some careful attention.


Change History (42)

comment:1 Changed 4 years ago by thomie

Type: bugtask

Seems like this is just a refactoring task, not a user facing bug.

comment:2 Changed 3 years ago by simonpj

See Commentary/Rts/Storage/GC/CAFs for info about CAFs.

comment:3 Changed 3 years ago by simonpj

Another complexity for prediction is that Windows DLLs turn some static constructor applications into thunks; see

  • StgSyn.isDllConApp
  • TidyPgm.hasCafRefs

comment:4 Changed 3 years ago by simonpj

really just a dup of #4121, but better explained here

comment:5 Changed 3 years ago by simonpj

Description: modified (diff)

comment:6 Changed 3 years ago by simonpj

Keywords: CodeGen added

comment:7 Changed 19 months ago by osa1

Cc: osa1 added

comment:8 Changed 19 months ago by simonpj

#15038 is a bug that would not have arisen if we'd implemented this ticket.

comment:9 Changed 19 months ago by simonpj

Keywords: CAFs added

comment:10 Changed 18 months ago by simonpj

When we fix this, let's revisit the patch in comment:36 of #15038. It implements a very special case for one error-id, solely because that error-id is introduced after the CAF computation has been done in TidyCore. When we fix this ticket, #9718, we can remove the special case and treat all error-ids uniformly.

comment:11 Changed 13 months ago by osa1

I started working on this. A few notes:

  • We want to be able to change CAFFY-ness and arity information not only in Core, but also in STG. #15038 happened because CAFFY-ness was changed in STG. In Phab:D4717 (for #15113) we change CAFs in CoreToStg.
  • I think to update an iface file after STG we could update the old ModIface (used to generate the initial file) and pass it to hscMaybeWriteIface when we're done with transformations (right before starting to generate Cmm). So the only extra thing we keep in memory would be the ModIface.

Alternatively I think we could implement a ModIfaceUpdate type that represents changes in Names in an existing interface file. Then updateIface function would read the iface file, apply the updates to the in-memory representation, then write it again. This trades performance for residency (although peak memory may stay the same).

  • ModIface is currently not suitable for updates. The field that holds iface definitions has type [(Fingerprint,IfaceDecl)], which is hard to update (need to search the entire list to find the IfaceDecl for a given Name.
  • Similarly, id details of a IfaceDecls is also hard to update. The relevant types are:
data IfaceIdInfo
  = NoInfo                      -- When writing interface file without -O
  | HasInfo [IfaceInfoItem]     -- Has info, and here it is

data IfaceInfoItem
  = HsArity         Arity
  | HsStrictness    StrictSig
  | HsInline        InlinePragma
  | HsUnfold        Bool             -- True <=> isStrongLoopBreaker is true
                    IfaceUnfolding   -- See Note [Expose recursive functions]
  | HsNoCafRefs
  | HsLevity                         -- Present <=> never levity polymorphic

So to update caf refs we need to filter the whole list. To update the arity we need to drop any existing HsArity info items and cons a new one.

Depending on how large these lists are perhaps this isn't as big of a problem as the previous item though.

  • I'm assuming that idName of a top-level STG binder is the Name used in IfaceDecl for the declaration, so we don't need to generate a map from STG binders to IfaceDecl names. In my testing I found this to be true, but I only tried tiny programs so far.

comment:12 Changed 13 months ago by simonpj

  • Yes, I think we should take the arity and CAFness from the STG code, not just after we generate it, but rather just before we code-gen it , which is the moment of truth.
  • Rather than use the IfaceIdInfo for these two fields, I think we could make them two fixed fields of each top-level Id binding in a ModIface. Every Id has CAF-ness and arity.
  • The final arity is really the representation arity, after unarisation of unboxed tuples etc. I think we should probably treat that as a separate matter to the "arity" recorded by the Simplifier. The latter really can be passed on by CoreTidy, exactly as now. The former determines calling convention etc; it's a code-gen thing. We currently fudge this issue: see idRepArity.
  • Fingerprints will indeed change; but they are in any case calculated separately; see addFingerprints. So we probably want to defer that step too.
  • For now, I'd be inclined to simply hold onto the partly-complete ModIface until codegen; then add info from the just-before-codegen STG code; and construct a final ModIface. If that gives rise to residency issues we can decide what to do then.

comment:13 Changed 13 months ago by osa1

Another thing we need to figure out is how to update use sites of binders when we update CAFness info. For example, in the example in #15113:

lvl2_r3y3 :: [Char]
lvl2_r3y3 = unpackCString# lvl1_r3y2

-- RHS size: {terms: 7, types: 6, coercions: 2, joins: 0/0}
patError :: forall a. Addr# -> a
[GblId, Arity=1, Str=<B,U>x, Unf=OtherCon []]
  = \ (@ a_a2kh) (s_a1Pi :: Addr#) ->
        @ SomeException
        @ 'LiftedRep
        @ a_a2kh
           ((untangle s_a1Pi lvl2_r3y3)
            `cast` (Sym (Control.Exception.Base.N:PatternMatchFail[0])
                    :: (String :: *) ~R# (PatternMatchFail :: *))))

We want to be able to make lvl2_r3y3 re-entrant. When we do this patError won't have any CAF refs, but to actually realize this we need to update references to lvl2_r3y3 to update the id info.

Alternatively I think we could pass an environment of updated ids to topStgBindHasCafRefs and friends.

comment:14 Changed 13 months ago by simonpj

Alternatively I think we could pass an environment of updated ids to topStgBindHasCafRefs and friends.

Yes, let's do something more like that.

But I have forgotten how SRTs are constructed. We clearly need two passes:

  1. Find out which top-level bindings are CAFFy: that is, refers (transitively) to a CAF. That simply involves a free-variable analysis and a fixpoint calculation.
  1. Build an SRT for each closure that lists the CAFFy things it mentions. It's not obvious where that is done today -- can you point to it?

For step (2) it occurs to me that we could decorate the StgRhsClosure with the list of things to put in its SRT, just as we decorate it with the list of locally-bound free variables to put in its closure. That seems so obvious, I'm not sure why we don't do that.

One other thought. Suppose we have

x = factorial 200
p = (x,True)
q = (False,x)
wubble = \x. (p,q)

Then x, y and p are all CAFFy, so currently wurble's SRT will contain both p and q. But actually we need only one of them, because the goal is solely to keep x alive. So actually wurble's SRT could list p alone, or q alone, or indeed just x. I'm not sure if this is worth exploiting; and doing so cross-module would mean we had to put more stuff in .hi files.

comment:15 Changed 13 months ago by osa1

Build an SRT for each closure that lists the CAFFy things it mentions. It's not obvious where that is done today -- can you point to it?

We don't build SRTs until Cmm (doSRTs in CmmBuildInfoTables.hs). In TidyPgm we decide on CAFFYness and record it in IdInfos of binders. As far as I can see there are no CAF or SRT related computation done in STG.

Btw, I found this commented-out code in StgSyn:

pprStgExpr (StgLet srt (StgNonRec bndr (StgRhsClosure cc bi free_vars upd_flag args rhs))
                        expr@(StgLet _ _))
  = ...

It seems like at some point someone tried to decorate StgLets with SRTs.

I don't understand how the CAFFYness info in Ids used in the code gen, but I wonder if even after this work there will still be room for bugs because idCafInfo of an id (not in binder position) may disagree with the final CAFFYness information of the id.

comment:16 Changed 13 months ago by simonpj

I don't understand how the CAFFYness info in Ids used in the code gen, but I wonder if even after this work there will still be room for bugs because idCafInfo of an id (not in binder position) may disagree with the final CAFFYness information of the id.

I think we should

  • Tag each binder with its CAFFYness, immediately before code generation, after any STG-to-STG passes (step 1 of comment:14).
  • Propagate that info into the .hi file for the module.
  • In SRT generation (setp 2 of comment:14, presumably in CmmBuildInfoTables) use a finite map of Id to CAFFYness, so that we specifically do not rely on the CAFFYness of occurrences of local Ids. (For imported Ids we should be fine.)

Sound OK?

comment:17 Changed 13 months ago by simonmar

Cc: simonmar added

comment:18 Changed 13 months ago by sgraf

Cc: sgraf added

comment:19 Changed 11 months ago by osa1

I almost finished the analysis implementation (I don't update the .hi file or ModIface yet). I currently have a sanity check that after analysis compares CafInfos computed by TidyPgm with the CafInfos the new analysis computes, and I can see cases when building stage 2 where the results differ. For example, for this definition:


  :: forall r. GHC.Base.Functor ((->) r)
[GblId[DFunId], Str=m] =
    CCS_DONT_CARE GHC.Base.C:Functor! [GHC.Base..

This definition itself is not a CAF, so we look at the free variables GHC.Base.$fFunctor->_$c<$ and GHC.Base..:

GHC.Base.. [InlPrag=INLINE (sat-args=2)]
  :: forall b c a. (b -> c) -> (a -> b) -> a -> c
 Unf=OtherCon []] =
    \r [f_s6N7 g_s6N8 x_s6N9]
        let {
          sat_s6Na [Occ=Once] :: b_a2Yb
          [LclId] =
              \u [] g_s6N8 x_s6N9;
        } in  f_s6N7 sat_s6Na;

This itself is not a CAF and has no free variables, so not CAFFY.

GHC.Base.$fFunctor->_$c<$ :: forall r a b. a -> (r -> b) -> r -> a
 Unf=OtherCon []] =
    \r [x_s6Ng eta_s6Nh eta1_s6Ni] x_s6Ng;,

This also isn't a CAF itself and has no free variables. So really fFunctor-> should not be CAFFY, but its CafInfo says it may have refs. I don't know which Core pass thinks that this function may have CAF refs, and why.

We have more info about what is CAF and what is not in STG so the new analysis may make more things non-CAFFY, which is a good thing. But I still want to understand why a Core pass things this definition is CAFFY.

comment:20 Changed 11 months ago by osa1

I now built GHC with some debug prints to see what's going on. Here are some numbers:

  • The new analysis never returns MayHaveCafRefs for a binder that previous passes assigned NoCafRefs.
  • There are 9604 binders for which the new analysis returned NoCafRefs, but they were assigned MayHaveCafRefs by previous passes.

I looked at a few of the definitions that were assigned NoCafRefs by the new analysis but were previously MayHaveCafRefs. Here are a few examples: (see also the example in comment:19)


sat_s7Je :: GHC.Types.Int
[LclId] =
    CCS_DONT_CARE GHC.Types.I#! [68#];

(I don't know why this was previously MayHaveCafRefs)


  :: forall a.
     GHC.Conc.Sync.STM a -> GHC.Conc.Sync.STM a -> GHC.Conc.Sync.STM a
[LclId] =
    \r [eta_B3 eta_B2 void_0E]
        catchRetry# [eta_B3 eta_B2 GHC.Prim.void#];

This itself is not a CAF. Only free variable is GHC.Prim.void# (we don't consider catchRetry# as a free variable because it's a primop) which is defined like this

voidPrimId  = pcMiscPrelId voidPrimIdName voidPrimTy
                (noCafIdInfo `setUnfoldingInfo` evaldUnfolding     -- Note [evaldUnfoldings]
                             `setNeverLevPoly`  voidPrimTy)

So this is not CAFFY (again, I don't know why this was previously MayHaveCafRefs).

I also checked a few other definitions, one interesting binder that was previously MayHaveCafRefs and is now NoCafRefs is this foreign import:

foreign import ccall unsafe "static stdlib.h &free" c_free_finalizer
    :: FunPtr (Ptr Word8 -> IO ())

Once again I have no idea why this was previously MayHaveCafRefs...

comment:21 Changed 11 months ago by osa1

I think one of the reasons why previosly we had so much more MayHaveCafRefs is because vanillaIdInfo conservatively assigns MayHaveCafRefs to the id, which is used in a lot of places when inventing new Ids.

comment:22 Changed 11 months ago by simonpj

In GHC today, the CafInfo for top-level bindings is set by TidyPgm.tidyTopBind, which in turn uses hasCafRefs.

Note that hasCafRefs returns true if

  • The binding mentions CAFFy things
  • The binding is a CAF

Notice the the code in hasCafRefs has all kind of wierd stuff to do with integer literals; this vanishes entirely in your new code because the integer-literal expansion is done by the time your new analysis gets to it.

But you may need to look at

  is_caf = not (arity > 0 || rhsIsStatic (targetPlatform dflags) is_dynamic_name
                                         cvt_literal expr)

The rhsIsStatic code (in CoreUtils) is basically looking for a thunk. But it has some kind of special case for DLLs -- you should ensure that your new analysis deals correctly with this.

rhsIsStatic also has code for dealing with partial applications, which I don't think we need. So I think you only need a MUCH simpler function.

comment:23 Changed 11 months ago by simonpj

Re commment:20

sat_s7Je :: GHC.Types.Int
[LclId] =
    CCS_DONT_CARE GHC.Types.I#! [68#];

This is introduced in core-to-stg, and perhaps indeed its CAFFy flag is conservative. So that may be a useful improvement.

comment:24 Changed 11 months ago by osa1

So I think you only need a MUCH simpler function.

Indeed, my function is much simpler. The whole analysis is implemented in 69 lines:

I just booted GHC with updated iface files. I'll now try to validate.

comment:25 Changed 11 months ago by simonpj

Did you check out the bit about DLLs in CoreUtils.rhsIsStatic?

comment:26 Changed 11 months ago by osa1

It validates! I'll check out rhsIsStatic business now.

One thing I realized that at least some uses of rhsIsStatic won't be necessary (e.g. the use in canFloatFromNoCaf). I'll check others now.

comment:27 Changed 11 months ago by osa1

I'm confused about what do cross-DLL references have anything to do with the task at hand. After all we only do changes in the module we're compiling, and we never make any changes in imported ids. I believe those comments about cross-DLL references are either misplaced, or the function was somehow used for other things in the past. I think we can just remove the whole function, and remove the use sites etc. simply because we can now freely float whatever we want without worrying about changing CafInfo.

comment:28 Changed 11 months ago by osa1

Almost done ... I removed the entire cross-DLL stuff, with a lot of other comments and code, and it validates. As said in comment:27 I think the DLL stuff is not relevant to the task (we don't do any cross-module stuff), and the function (rhsIsStatic) is not used elsewhere, so it's fine to remove it.

I need to do some refactoring, and ModIface updates are currently terrible because the ModIface type is not easy to update (holds everything in lists). Another problem is I introduced new Stg passes. I'll do some refactoring and then submit a diff to get some comments before doing more refactoring to fix these issues.

comment:29 Changed 11 months ago by simonpj

I think the DLL stuff is not relevant to the task (we don't do any cross-module stuff)and the function (rhsIsStatic) is not used elsewhere, so it's fine to remove it.

Probably. But be careful here! There is some platform-specific stuff in that function, so it might be fine on your platform but not on others. I see stuff like

Top-level constructor applications can usually be allocated
statically, but they can't if the constructor, or any of the
arguments, come from another DLL (because we can't refer to static
labels in other DLLs).

If this happens we simply make the RHS into an updatable thunk,
and 'execute' it rather than allocating it statically.


    go (Var f) n_val_args
        | (platformOS platform /= OSMinGW32) ||
          not (is_dynamic_name (idName f))

I don't fully understand this, but it's all tied up with isDllConApp, which in turn is used in CoreToStg. I believe that this code in CoreToStg nails it:

  | StgConApp con args _ <- unticked_rhs
  , -- Dynamic StgConApps are updatable
    not (isDllConApp dflags this_mod con args)
  = -- CorePrep does this right, but just to make sure
    ASSERT2( not (isUnboxedTupleCon con || isUnboxedSumCon con)
           , ppr bndr $$ ppr con $$ ppr args)
    ( StgRhsCon dontCareCCS con args, ccs )

so we don't need to worry in our new (late) CAF analysis. But none of this DLL special casing is properly documented so I don't really understand what is happening.

Returning to rhsIsStatic we have this:

  is_static _      (Lit (LitLabel {}))    = False
  is_static _      (Lit _)                = True
        -- A LitLabel (foreign import "&foo") in an argument
        -- prevents a constructor application from being static.  The
        -- reason is that it might give rise to unresolvable symbols
        -- in the object file: under Linux, references to "weak"
        -- symbols from the data segment give rise to "unresolvable
        -- relocation" errors at link time This might be due to a bug
        -- in the linker, but we'll work around it here anyway.
        -- SDM 24/2/2004

No idea what this is about, but I wouldn't just delete it without thought.

Finally, I see

                -- A naked un-applied variable is *not* deemed a static RHS
                -- E.g.         f = g
                -- Reason: better to update so that the indirection gets shorted
                --         out, and the true value will be seen
                -- NB: if you change this, you'll break the invariant that THUNK_STATICs
                --     are always updatable.  If you do so, make sure that non-updatable
                --     ones have enough space for their static link field!

I think this is probably again dealt with in CoreToStg. If we have a bare variable as the RHS I think we get an updatable thunk. I don't understand the bit about THUNK_STATIC though.

comment:30 Changed 11 months ago by simonpj

Re "Finally I see" Simon M pointed out the that the code generator has a special case for indirection bindings f = g:

cgTopRhsClosure dflags rec id ccs upd_flag args body =
  -- special case for a indirection (f = g).  We create an IND_STATIC
  -- closure pointing directly to the indirectee.  This is exactly
  -- what the CAF will eventually evaluate to anyway, we're just
  -- shortcutting the whole process, and generating a lot less code
  -- (#7308)
  -- Note: we omit the optimisation when this binding is part of a
  -- recursive group, because the optimisation would inhibit the black
  -- hole detection from working in that case.  Test
  -- concurrent/should_run/4030 fails, for instance.
  gen_code dflags _ closure_label
    | StgApp f [] <- body, null args, isNonRec rec
    = ...

We decided that:

  • The code generator makes an IND_STATIC for these
  • So f itself is not a CAF
  • But it might still be CAFFY if g is.

Hence the CAF analysis could (and probably should) take advantage of this special case (alas, still a form of prediction!) provided it points to a specific Note about it in the code generator.

comment:31 Changed 11 months ago by osa1

OK so it turns out I made a mistake in my testing and ended up not testing anything. I submitted Phab:D5416 to fix a confusing variable naming which is what confused me. Currently I'm stuck with a weird error caused when I add one more hscWriteIface call after code generation, without changing the interface file (so I'm just writing the same interface file again, later in compilation). The diff to reproduce is:

diff --git a/compiler/main/DriverPipeline.hs b/compiler/main/DriverPipeline.hs
index a9e486c94a..c614d7f102 100644
--- a/compiler/main/DriverPipeline.hs
+++ b/compiler/main/DriverPipeline.hs
@@ -193,7 +193,7 @@ compileOne' m_tc_result mHscMessage
             o_time <- getModificationUTCTime object_filename
             let linkable = LM o_time this_mod [DotO object_filename]
             return hmi0 { hm_linkable = Just linkable }
-        (HscRecomp cgguts summary, HscInterpreted) -> do
+        (HscRecomp cgguts summary _iface, HscInterpreted) -> do
             (hasStub, comp_bc, spt_entries) <-
                 hscInteractive hsc_env cgguts summary

@@ -214,14 +214,14 @@ compileOne' m_tc_result mHscMessage
             let linkable = LM unlinked_time (ms_mod summary)
                            (hs_unlinked ++ stub_o)
             return hmi0 { hm_linkable = Just linkable }
-        (HscRecomp cgguts summary, _) -> do
+        (HscRecomp cgguts summary iface, _) -> do
             output_fn <- getOutputFilename next_phase
                             (Temporary TFL_CurrentModule)
                             basename dflags next_phase (Just location)
             -- We're in --make mode: finish the compilation pipeline.
             _ <- runPipeline StopLn hsc_env
-                               Just (HscOut src_flavour mod_name (HscRecomp cgguts summary)))
+                               Just (HscOut src_flavour mod_name (HscRecomp cgguts summary iface)))
                               (Just basename)
                               (Just location)
@@ -1104,13 +1104,13 @@ runPhase (HscOut src_flavour mod_name result) _ dflags = do
                        basename = dropExtension input_fn
                    liftIO $ compileEmptyStub dflags hsc_env' basename location mod_name
                    return (RealPhase StopLn, o_file)
-            HscRecomp cgguts mod_summary
+            HscRecomp cgguts mod_summary iface
               -> do output_fn <- phaseOutputFilename next_phase

                     PipeState{hsc_env=hsc_env'} <- getPipeState

                     (outputFilename, mStub, foreign_files) <- liftIO $
-                      hscGenHardCode hsc_env' cgguts mod_summary output_fn
+                      hscGenHardCode hsc_env' cgguts mod_summary iface output_fn
                     stub_o <- liftIO (mapM (compileStub hsc_env') mStub)
                     foreign_os <- liftIO $
                       mapM (uncurry (compileForeign hsc_env')) foreign_files
diff --git a/compiler/main/HscMain.hs b/compiler/main/HscMain.hs
index d7cebd00fc..198865d6e0 100644
--- a/compiler/main/HscMain.hs
+++ b/compiler/main/HscMain.hs
@@ -763,7 +763,7 @@ finish summary tc_result mb_old_hash = do
             desugared_guts <- hscSimplify' plugins desugared_guts0
             (iface, changed, details, cgguts) <-
               liftIO $ hscNormalIface hsc_env desugared_guts mb_old_hash
-            return (iface, changed, details, HscRecomp cgguts summary)
+            return (iface, changed, details, HscRecomp cgguts summary iface)
       else mk_simple_iface
   liftIO $ hscMaybeWriteIface dflags iface changed summary
@@ -1292,10 +1292,10 @@ hscWriteIface dflags iface no_change mod_summary = do
         writeIfaceFile dynDflags dynIfaceFile' iface

 -- | Compile to hard-code.
-hscGenHardCode :: HscEnv -> CgGuts -> ModSummary -> FilePath
+hscGenHardCode :: HscEnv -> CgGuts -> ModSummary -> ModIface -> FilePath
                -> IO (FilePath, Maybe FilePath, [(ForeignSrcLang, FilePath)])
                -- ^ @Just f@ <=> _stub.c is f
-hscGenHardCode hsc_env cgguts mod_summary output_filename = do
+hscGenHardCode hsc_env cgguts mod_summary iface output_filename = do
         let CgGuts{ -- This is the last use of the ModGuts in a compilation.
                     -- From now on, we just use the bits we need.
                     cg_module   = this_mod,
@@ -1327,6 +1327,11 @@ hscGenHardCode hsc_env cgguts mod_summary output_filename = do
             prof_init = profilingInitCode this_mod cost_centre_info
             foreign_stubs = foreign_stubs0 `appendStubC` prof_init

+        ------ Overwrite iface file with new info ------------
+        -- Generating iface again
+        hscWriteIface dflags iface False mod_summary
         ------------------  Code generation ------------------

         -- The back-end is streamed: each top-level function goes
diff --git a/compiler/main/HscTypes.hs b/compiler/main/HscTypes.hs
index d57d69bda6..15c7b1fb03 100644
--- a/compiler/main/HscTypes.hs
+++ b/compiler/main/HscTypes.hs
@@ -222,7 +222,7 @@ data HscStatus
     | HscUpToDate
     | HscUpdateBoot
     | HscUpdateSig
-    | HscRecomp CgGuts ModSummary
+    | HscRecomp CgGuts ModSummary !ModIface

 -- -----------------------------------------------------------------------------
 -- The Hsc monad: Passing an environment and warning state

It looks large but all I'm doing is passing the ModIface to the code generator, and overwriting the interface file (without changing anything) after STG generation. If I build GHC with this patch I get weird errors like:

"inplace/bin/ghc-stage1" -hisuf hi -osuf  o -hcsuf hc -static  -O0 -H64m -Wall      -this-unit-id ghc-prim-0.5.3 -hide-all-packages -i -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build -Ilibraries/ghc-prim/dist-install/build -ilibraries/ghc-prim/dist-install/build/./autogen -Ilibraries/ghc-prim/dist-install/build/./autogen -Ilibraries/ghc-prim/.    -optP-include -optPlibraries/ghc-prim/dist-install/build/./autogen/cabal_macros.h -package-id rts -this-unit-id ghc-prim -XHaskell2010 -O  -no-user-package-db -rtsopts  -Wno-trustworthy-safe -Wno-deprecated-flags     -Wnoncanonical-monad-instances  -odir libraries/ghc-prim/dist-install/build -hidir libraries/ghc-prim/dist-install/build -stubdir libraries/ghc-prim/dist-install/build   -dynamic-too -c libraries/ghc-prim/./GHC/CString.hs -o libraries/ghc-prim/dist-install/build/GHC/CString.o -dyno libraries/ghc-prim/dist-install/build/GHC/CString.dyn_o

libraries/ghc-prim/GHC/CString.hs:23:1: error:
    Bad interface file: libraries/ghc-prim/dist-install/build/GHC/Types.hi
        mismatched interface file ways (wanted "", got "dyn")
23 | import GHC.Types
   | ^^^^^^^^^^^^^^^^

I also tried writing the interface file _twice_ when we write it for the first time, just to make sure this isn't because we see a file and take a different code path and break things etc. but that's not the case, it works fine. So somehow if I overwrite it right after writing it, it's fine. But if I overwrite it later in compilation (after STG generation) things break.

Last edited 11 months ago by osa1 (previous) (diff)

comment:32 Changed 11 months ago by osa1

Amazing.. I can build this with hadrian but not with make. I wonder what's causing this different behavior..

comment:33 Changed 11 months ago by osa1

If I add these two lines to mk/ I can build with make too:


so I somehow broke something with dyn_o/dyn_hi builds.

comment:34 Changed 11 months ago by simonmar

I was thinking about what we discussed the other day about compilations that stop after generating an interface file, such as -fno-code. Since this will now affect the ABI, we must include -fno-code when we fingerprint the flags, here:$37-64

Possible this can be refined to "-fno-code is relevant only when -fno-omit-iface-pragmas".

Are there any other flags that should be in this category?

comment:35 Changed 11 months ago by osa1

Possible this can be refined to "-fno-code is relevant only when -fno-omit-iface-pragmas".

I'm confused about this line. We currently don't have -fno-omit-iface-pragmas. Are you suggesting implementing a new flag for something and only adding that to the fingerprint? (instead of -fno-code)

comment:36 Changed 11 months ago by simonmar

Sorry, the flag is -fno-omit-interface-pragmas. Does it make sense now?

comment:37 Changed 11 months ago by osa1

What does that flag do? I think it's not documented (I checked the man page and user manual). Looking at the code, it seems like it doesn't write pragmas to the interface file (and ignores any pragmas that may make a different in the interface file, e.g. RULEs and UNPACK pragmas). I don't see how it's relevant to -fno-code though...

comment:38 Changed 11 months ago by osa1

I found the problem with dynamic interface files that I mention in comment:31. Normally when I run these:

$ ghc-stage1 --show-iface libraries/ghc-prim/dist-install/build/GHC/Types.hi
$ ghc-stage1 --show-iface libraries/ghc-prim/dist-install/build/GHC/Types.dyn_hi

The first command should show Wanted: [] Got: [] in Ways section, the second should show Wanted: [] Got: [dyn], but with my code both show Got [dyn]. So it seems like I'm generating a dyn iface file instead of a normal iface file.

This seems to be related with how the -dynamic-too flag is implemented (it runs the pipeline multiple times, but I think the repeated parts of the pipeline does not include the code generator).

comment:39 Changed 11 months ago by simonmar

@osa1 sorry I assumed too much background knowledge here. Let me fill in the blanks: -fomit-interface-pragmas is on by default with -O0, but off with -O and above. It is the flag that controls whether we emit optimisation info, including things like unfoldings, strictness signatures, arity, and CAF info into the interface file. When the flag is on, we only emit types and nothing else.

It normally isn't used explicitly on the command line. Perhaps it should be documented, I'm not sure what our policy on documenting "internal" flags like this is currently.

The relevance to -fno-code is this:

  • we were considering whether -fno-code affects the ABI or not, because if it affects the ABI, then it needs to be included in the flags we fingerprint in FlagChecher
  • Prior to your changes here, -fno-code does not affect the ABI, but now it does, because the CAF info may change if we use -fno-code.
  • But, this only applies if -fomit-interface-pragmas is off (which is also included in FlagChecker right now).
  • So, we must now include -fno-code in FlagChecker

Make sense?

comment:40 Changed 11 months ago by osa1

Aha, found the problem. We run the code generator twice with -dynamic-too, but normally we write the interface files only once (hscWriteIface generates two interfaces, a .hi and a .dyn-hi). Now if I call hscWriteIface again after STG generation I end up calling hscWriteIface two more times, one for the normal compilation and one for -dynamic-too. The first call correctly updates both interface files, then before the second call we update the DynFlags to add WayDyn way and remove Opt_BuildDynamicToo (in dynamicTooMkDynamicDynFlags), so the second call to hscWriteIface updates the normal iface file (not the dyn iface), but updates it with WayDyn.

Correct behavior: in the first codegen we should only update the normal iface file, in the second only the dyn iface file.

Not sure how to implement this yet.

(I'll look at comment:39 later)

comment:41 Changed 11 months ago by osa1

Yay, it validates!

Will do some more refactoring and submit a diff soon.

comment:42 Changed 11 months ago by osa1

Differential Rev(s): Phab:D5432
Status: newpatch

Not quite ready but submitted the patch to get comments.

Note: See TracTickets for help on using tickets.