Opened 4 years ago

Last modified 3 years ago

#10681 new feature request

Teach GHC to interpret all hs files as two levels of hs-boot files (abstract types only/full types + values)

Reported by: ezyang Owned by: ezyang
Priority: low Milestone:
Component: Compiler Version: 7.11
Keywords: backpack Cc: skilpat
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by ezyang)

This is a new proposal for solving #1409. The big addition here is that we create two hs-boot files for each hs file: one that is a full hs-boot file to be imported by hs files to break loops, and a second which only includes abstract types for hs-boot files to import. C.f. #10679

Discussion. Here is a slightly goofy hs-boot file I've excerpted from GHC:

module Packages where

import {-# SOURCE #-} Module (PackageKey)
import {-# SOURCE #-} DynFlags (DynFlags)

packageKeyPackageIdString :: DynFlags -> PackageKey -> Maybe String

The hs-boot file must itself import hs-boot files, because this boot file is used by Module and DynFlags; without {-# SOURCE #-}, the boot file itself will participate in a cycle!

But notice that there is something very interesting: a boot file is ONLY ever interested in importing other modules to get types. Never to import constructors or functions!

We can use this observation to give us a mechanical transformation of an hs file to an hs-boot file, ASSUMING we can define a "second level" of hs-boot file to record our abstract types.

Example. In this example, we have chosen to break the loop from As import to B.

module A where
  import {-# SOURCE #-} B
  data A = A B
  f :: A -> Bool
  f (A (B (A b))) = g b
  f _ = True

module B where
  import A
  data B = B A
  g :: B -> Bool
  g (B (A (B b))) = f b
  g _ = False

The first-level hs-boots are:

module A where -- not actually used
  import {-# SOURCE 2 #-} B
  data A = A B
  f :: A -> Bool

module B where
  import {-# SOURCE 2 #-} A
  data B = B A
  g :: B -> Bool

The second-level hs-boots are:

module A where
  data A

module B where -- not actually used
  data B

Commentary. Here are some remarks:

  1. Because we have to lift the transitive dependencies of anything we {-# SOURCE #-} import, it doesn't make sense to have a pragma which explicitly says what to put in the hs-boot file; instead, we just put in everything that we *can* handle in an hs-boot file (so exclude anything with missing type signatures, type families, etc.) Ideally, these automatic hs-boot files are generated lazily, but they should be reused as necessary.
  1. This facility actually makes {-# SOURCE #-} a lot more attractive for increasing separate compilation: you can mark an import {-# SOURCE #-} to ensure that if its implementation changes, you don't have to recompile this module / you can build the module in parallel with that module. The downside is that when the imported file is modified, we have to regenerate the hs-boot stub before we conclude that the types have not changed (as opposed to with separate hs-boot files, where a modification to hs would not bump the timestamp on hs-boot.
  1. With Haskell98, you should never need more than two levels of hs-boot nesting. However, with data kind promotion, you may need arbitrarily many levels of nesting. You could simply exclude promoted data kinds ala Handling unsupported boot features; however an alternate thing to do is generalize hs-boot to arbitrarily many levels. However, this might be annoying to implement because dependency analysis needs to know how to determine universe stratification so it can tell how many levels of hs-boot are necessary.
  1. We can't force the first level of hs-boot files to be abstract types, for two reasons: (1) a source file importing the hs-boot file may really need the selector/constructor, and (2) the hs-boot files will reflect any cycles from the source files, that's no good! Rolling out to the second level breaks the cycle because abstract types never need any imports.
  1. What about type class instances? I propose that instances be lifted to the hs-boot level (so hs file usages of the instance continue to work), but not the hs-boot2 level (so that we can still "bottom out"). This can result in some slightly unintuitive behavior, however:
    module A where
      instance Eq (a -> b) where ...
    module B where
      import A
    module C where
      import {-# SOURCE #-} B
    In this case, C would NOT see the Eq instance for functions defined in A.

Handling unsupported boot features. Some type-level features in Haskell are not supported at the boot-level (type families, etc), so the automatic generation of hs-boot needs a way of transitively(!) excluding these definitions from hs-boot files. We can exclude things from the boot file in the following way:

  1. If a declaration is not liftable to the hs-boot file, we replace it with a "not bootable" declaration, which specifies that there is something with this Name, but we don't have any information about it. (This is a sort of generalized version of an abstract type).
  1. If we are type-checking a declaration and make reference to a not bootable declaration, the full declaration itself is considered not bootable.

Alternately, we can just make sure all language features are supported in boot files.

Change History (9)

comment:1 Changed 4 years ago by ezyang

Summary: Teach GHC to interpret all hs files as two types of hs-boot files (abstract types only/full types + values)Teach GHC to interpret all hs files as two levels of hs-boot files (abstract types only/full types + values)

comment:2 Changed 4 years ago by skilpat

Cc: skilpat added

comment:3 Changed 4 years ago by goldfire

ezyang says:

This seems to definitely suggest that you should never need more than two levels of hs-boot nesting, or perhaps three with kinding. (But maybe someone has a fancy type system feature for which this is not true!)

I believe I have such a fancy type system feature. My solution to #7961 (that is, my very long-running "nokinds" branch, due to be merged Any Season Now [at least it's better than Any Year Now]) merges types with kinds, allowing an arbitrary number of "levels". In other words, I think a pathological case can (if I understand your proposal here correctly) require an arbitrary number of hs-boot levels.

For example:

data A
data B :: A -> *
data C :: B a -> *
data D :: C b -> *
data E :: D c -> *

This compiles fine on my branch, but I believe would cause problems with this proposal.

comment:4 Changed 4 years ago by ezyang

Thanks Richard. Do you have any examples which would require "infinitely" many levels? Your example requires an arbitrary number of levels, but it's always finite.

comment:5 Changed 4 years ago by goldfire

I don't, at the moment. But I conjecture they're possible. The "infinite" case would correspond to programs that aren't writable in Agda/Coq/Idris due to their universe hierarchy. I hear tell of programmers running into this limitation occasionally, but I don't know if the cases are pathological or not.

comment:6 Changed 4 years ago by ezyang

Description: modified (diff)

Update with a comment about handling type class instances.

comment:7 Changed 4 years ago by ezyang

Description: modified (diff)

Updated description with some remarks about handling type system features which are not supported by boot files.

@goldfire: Intuitively, it seems like if you can figure out your universe hierarchy, you can just write as many levels of hs-boot files as you need. Unfortunately, because types and kinds are syntactically merged in your nokinds branch, it's not immediately obvious prior to typechecking what the universes are (the pain of de-stratifying!) which makes it much more difficult to plan compilation. So it seems like it would be much simpler to just not include these types of declarations in hs-boot files.

comment:8 Changed 4 years ago by ezyang

Keywords: backpack added

comment:9 Changed 3 years ago by ezyang

Priority: normallow
Note: See TracTickets for help on using tickets.