Hoopl cleanup

This page was created in August 2013 as a temporary place to store proposals about cleaning up Hoopl library. After these changes are implemented we should replace this page with either another page or a Note in the source code that would explain current design of Hoopl.

Me (Jan Stolarek, JS) and Simon PJ recently had some discussion about cleaning up Hoopl to make its interface more consistent and less confusing. Here are some of the key proposals and ideas:

The API for forward and backward analysis

Observations about forward analysis

  • Forwards analysis starts with a list of entry labels (typically just one), and it makes sense to have an in-flowing fact for each such label. Yes, it could default to bottom, but it's probably a user error not to supply such a fact.
  • If forwards analysis is given a fact for each entry label, Hoopl never needs to know the bottom of the lattice; indeed there doesn't need to be a bottom. Hoopl treats the block in dependency order, so it always has an in-flowing fact before it starts to analyse a block. It needs still a join for the lattice, of course.
  • For some analyses, it's quite clumsy to have a bottom element. Consider constant-propagation, where we want to transform
      x := 3     ===>    x := 3
      ....               ...
      y = x+2            y = 3+2
    (We might then do constant folding and dead code elim, but ignore that for now.) If we need a bottom in the lattice, our facts look like
      data CPFact = Bot | CP (Map LocaReg Const)
    When a variable is not in the domain of the map it means it maps to top (ie the variable can hold different values on different control-flow paths). This is all fine, but the join operation needs to deal with Bot etc. And what is frustrating is that Bot is never, ever used! I don't want to define it and manipulate it when it is never used!

Conclusion: for fwd analysis we don't need a bottom in the lattice, and it's a pain for (some) clients to supply one.

Observations about backward analysis

  • Backwards analysis currently takes a list of entry points, so that it finds the reachable code and enumerates it in reverse order. But that's all the entry point list does. It'd be just fine to enumerate all the blocks in the graph in reverse order, and not supply a list of entry points.
  • Backwards analysis (for a closed-on-entry graph) takes a (Fact x f) argument, for a graph where x describes its open/closed on exit status. So if x=O we pass one fact; and that is entirely reasonable becuse it is the fact flowing backwards into the exit. But if x=C we pass a FactBase. At first I thought that was stupid, but now I see some sense in it: these are facts labels outside (downstream successors of) the graph being analysed. We'd better document this point.
  • NB: returning to the first bullet, we can't just take code reachable from downstream successors (ie behave dually to fwd anal), because tail calls, returns, and infinite loops don't have any such downstream successors, but we jolly well want to analyse them.
  • Backward analysis does need a bottom for the lattice, to initialise loops. Example:
       L1: ...blah blah...
           CondBranch e L1 L2
       L2: blah blah
    When analysing L1 (backwards) we must join the facts flowing back from L2 (which we will have analysed first) and L1; and on the first iteration, we don't have any fact from L1. You might think we could just use the fact from L2, and merely refrain from joining with L1, but that doesn't deal with the case where the CondBranch was an unconditional branch to L1, so there is no other fact to join with.

Conclusion: for backwards analysis the client really must give us a bottom element.


We could reflect these observations in the API for forwards and backward analysis, as follows.

Current situation:

data FwdPass m n f
  = FwdPass { fp_lattice  :: DataflowLattice f
            , fp_transfer :: FwdTransfer n f
            , fp_rewrite  :: FwdRewrite m n f }

   :: forall m n f e x entries. (CheckpointMonad m, NonLocal n, LabelsPtr entries)
   => FwdPass m n f
   -> MaybeC e entries
   -> Graph n e x -> Fact e f
   -> m (Graph n e x, FactBase f, MaybeO x f)

data BwdPass m n f
  = BwdPass { bp_lattice  :: DataflowLattice f
            , bp_transfer :: BwdTransfer n f
            , bp_rewrite  :: BwdRewrite m n f }

   :: (CheckpointMonad m, NonLocal n, LabelsPtr entries)
   => BwdPass m n f
   -> MaybeC e entries -> Graph n e x -> Fact x f
   -> m (Graph n e x, FactBase f, MaybeO e f)

Possible refactoring:

data FwdPass m n f
  = FwdPass { fp_join     :: JoinFun f   -- No "bottom" for fwd
            , fp_transfer :: FwdTransfer n f
            , fp_rewrite  :: FwdRewrite m n f }

   :: forall m n f e x entries. (CheckpointMonad m, NonLocal n)
   => FwdPass m n f
   -> Fact e f           -- Entry points plus a fact for each
   -> Graph n e x 
   -> m (Graph n e x, FactBase f, MaybeO x f)

data BwdPass m n f
  = BwdPass { bp_bot      :: f        -- Need "bottom" for bwd
            , bp_join     :: JoinFun f
            , bp_transfer :: BwdTransfer n f
            , bp_rewrite  :: BwdRewrite m n f }

   :: (CheckpointMonad m, NonLocal n)
   => BwdPass m n f
   -> Fact x f       -- Facts about successors
   -> Graph n e x
   -> m (Graph n e x, FactBase f, MaybeO e f)

The differences are not great. But the types are still nicely symmetrical; and they say more precisely what is actually necessary and useful.

Smaller proposals

  • (David Luposchainsky) Hoopl is the only library in GHC that defines its own <*> operation, which will clash with the AMP. Hoopl's <*> is conceptually just mappend, so if you're doing a large-scale refactoring of the module maybe consider adding a suitable Monoid instance to replace <*> with <> (or something) before it even becomes a problem.
  • Simon doesn't like the joinInFacts function, which is only called to possibly produce some debugging output from the join function.
  • (Jan Stolarek) To add more to the above point, the lattice join function is required to take a Label paramter for debugging purposes. All join functions I've seen so far simply ignore that parameter but it still leads to problems. One example of this is joinFacts function, which requires the Label of a block for which we join facts so that it can pass it to the join function. The problem is there is no easy way to recover that Label at the call site of joinFacts. The alternatives here are: a) pass a bogus Label (it can even be undefined) since it won't be used anyway; b) use joinOutFacts which does not require a Label but is marked as deprecated, which requires us to use -fno-warn-deprecation-warnings. Both solutions feel wrong and the only good one seems to be changing the type of lattice join function.
  • Jan doesn't like mess in Hoopl repo. There are unused modules (Compiler.Hoopl.OldDataflow, Compiler.Hoopl.DataflowFold), older versions of some modules (in prototypes/ directory) or private correspondence with paper reviewers and between authors.

Changing the graph type

Suppose you want to do liveness analysis, but afterwards to know the liveness at every node not just at every block Id. One way might be to allow rewriting to transform to a graph with a different node type, so that liveness analysis would have type

  Graph CmmNode -> Graph (CmmNode, Liveness)

To do this, the rewrite functions would have to be able to return graphs of a new type, something like this:

newtype BwdRewrite m n n' f 
  = BwdRewrite3 { getBRewrite3 ::
                    ( n C O -> f          -> m (Maybe (Graph n' C O, BwdRewrite m n' n' f))
                    , n O O -> f          -> m (Maybe (Graph n' O O, BwdRewrite m n' n' f))
                    , n O C -> FactBase f -> m (Maybe (Graph n' O C, BwdRewrite m n' n' f))
                    ) }

We would have to eliminate the Maybe, however, because the no-rewrite case still changes the type.

I'm not sure of the consequences of this, but it's intriguing.


Here are some Hoopl tickets:

  • #9851 - Hoopl library in GHC hides runWithFuel / version number clash
  • #9853 - Stateful transformation causes non-termination in Hoopl analysis.
  • #8315 - Improve specialized Hoopl module
Last modified 4 years ago Last modified on Jan 11, 2016 7:27:13 AM