Changes between Version 45 and Version 46 of Commentary/Compiler/NewCodeGenPipeline

Feb 1, 2014 6:43:03 PM (6 years ago)

Reverted to version 43.


  • Commentary/Compiler/NewCodeGenPipeline

    v45 v46  
    11== Historical page ==
    3 This page used to store information about Cmm Pipeline in the new code generator. This description has been updated, moved to the [wiki:Commentary/Compiler/CodeGen Code Generator] page and is maintained there. The rest of this page has a few historical notes, mostly about Adams optimisation. That optimisation is also described in Note [sharing continuations] in [[GhcFile(compiler/codeGen/StgCmmMonad.hs)]] and probably deserves its own wiki page.
     3This page stores historical information about Cmm Pipeline in the new code generator. This description has been updated and is maintained on the [wiki:Commentary/Compiler/CodeGen Code Generator] page. This page has also historical notes about Adams optimisation. That optimisation is also described in Note [sharing continuations] in [[GhcFile(compiler/codeGen/StgCmmMonad.hs)]] and probably deserves its own wiki page.
    55= Design of the new code generator =
    88See also: [wiki:Commentary/Compiler/NewCodeGenModules overview of the module structure in the new code generator].
     10== Overview ==
     12Code generation now has three stages:
     13  1. Convert STG to Cmm, with implicit stack implicit, and native Cmm calls.
     14  2. Optimise the Cmm, and CPS-convert it to have an explicit stack, and no native calls.
     15     This part of the pipeline is stitched together in `cmm/CmmPipeline.hs`.
     16  3. Feed the CPS-converted Cmm to the existing, unmodified native code generators.
     18Ultimately our plan is to expand the capability of the new pipeline so that it does native code generation too, and we can ultimately discard the existing code generators.  The design of this stage is here: [wiki:Commentary/Compiler/IntegratedCodeGen]
    1020== The Cmm pipeline ==
    12 Description of Cmm pipeline has been moved to [wiki:Commentary/Compiler/CodeGen Code Generator] page. Below are descriptions of two passes that at some point existed in the Cmm pipeline of the new code generator but were later removed and didn't make it into GHC 7.8 (first stable release with the new codegen).
     22The first two steps are described in more detail here:
     24 * '''Code generator''' converts STG to `CmmGraph`.  Implemented in `StgCmm*` modules (in directory `codeGen`).
     25   * `Cmm.CmmGraph` is pretty much a Hoopl graph of `CmmNode.CmmNode` nodes. Control transfer instructions are always the last node of a basic block.
     26   * Parameter passing is made explicit; the calling convention depends on the target architecture.  The key function is `CmmCallConv.assignArgumentsPos`.
     27     * Parameters are passed in virtual registers R1, R2 etc. [These map 1-1 to real registers.]
     28     * Overflow parameters are passed on the stack using explicit memory stores, to locations described abstractly using the [wiki:Commentary/Compiler/StackAreas ''Stack Area'' abstraction.].   
     29     * Making the calling convention explicit includes an explicit store instruction of the return address, which is stored explicitly on the stack in the same way as overflow parameters. This is done (obscurely) in `MkGraph.mkCall`.
     31 * '''Simple control flow optimisation''', implemented in `CmmContFlowOpt`.  It's called both at the beginning and end of the pipeline.
     32   * Branch chain elimination.
     33   * Remove unreachable blocks.
     34   * Block concatenation.  branch to K; and this is the only use of K. 
     36 * '''More control flow optimisations'''.
     37   * Common Block Elimination (like CSE). This essentially implements the Adams optimisation, we believe.
     38   * Consider (sometime): block duplication.  branch to K; and K is a short block.  Branch chain elimination is just a special case of this.
     40 * '''Proc-point analysis''' and '''transformation''', implemented in `CmmProcPoint`. The transformation part adds a function prologue to the front of each proc-point, following a standard entry convention.
     41    * The analysis produces a set of `BlockId` that should become proc-points
     42    * The transformation inserts a function prologue at the start of each proc-point, and a function epilogue just before each branch to a proc-point.
    1444 * '''(OUTDATED - !CmmSpillReload does not exist any more)''' '''Add spill/reload''', implemented in `CmmSpillReload`, to spill live C-- variables before a call and reload them afterwards.  The spill and reload instructions are simply memory stores and loads respectively, using symbolic stack offsets (see [wiki:Commentary/Compiler/StackAreas#Layingoutthestack stack layout]).  For example, a spill of variable 'x' would look like `Ptr32[SS(x)] = x`.
    2353   * Sink or inline assignments nearer their use points
    2454   * Do constant mach-op folding. This is done in this phase, because folded mach-ops can be inlined, and inlining exposes opportunities for mach-op folding.
     56 * '''Remove dead assignments and stores''', implemented in `CmmLive`, removes assignments to dead variables and things like ``a = a`` or ``I32[Hp] = I32[Hp]``. The latter may more appropriately be done in a general optimization pass, as it doesn't take advantage of liveness information.
     58 * '''Figure out the stack layout''', implemented in `CmmStackLayout`.
     59   * Each variable 'x', and each proc-point label 'K', has an associated ''Area'', written SS(x) and SS(k) resp, that names a contiguous portion of the stack frame. 
     60   * The stack layout pass produces a mapping of: ''(`Area` -> `StackOffset`)''. For more detail, see [wiki:Commentary/Compiler/StackAreas#Layingoutthestack the description of stack layout.]
     61   * A `StackOffset` is the byte offset of a stack slot from the old end (high address) of the frame.  It doesn't vary as the physical stack pointer moves.
     63 * '''Manifest the stack pointer''', implemented in `CmmStackLayout`.  Once the stack layout mapping has been determined, a second pass walks over the graph, making the stack pointer, `Sp` explicit. Before this pass, there is no `Sp` at all.  After this, `Sp` is completely manifest.
     64   * replacing references to `Areas` with offsets from `Sp`.
     65   * adding adjustments to `Sp`.
     67 * '''Split into multiple !CmmProcs''', implemented in `CmmProcPointZ`.  At this point we build an info-table for each of the !CmmProcs, including SRTs.  Done on the basis of the live local variables (by now mapped to stack slots) and live CAF statics.
     68   * `LastCall` and `LastReturn` nodes are replaced by `Jump`s.
     70 * '''Build info tables''', implemented in `CmmBuildInfoTables`.. 
     71   * Find each safe `MidForeignCall` node, "lowers" it into the suspend/call/resume sequence (see `Note [Foreign calls]` in `CmmNode.hs`.), and build an info table for them.
     72   * Convert the `CmmInfo` for each `CmmProc` into a `[CmmStatic]`, using the live variable information computed just before "Figure out stack layout". 
    2674=== Branches to continuations and the "Adams optimisation" ===