Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#11011 closed feature request (fixed)

Add type-indexed type representations (`TypeRep a`)

Reported by: bjmprice Owned by: bgamari
Priority: normal Milestone: 8.2.1
Component: Compiler Version:
Keywords: Typeable Cc: rodlogic, ekmett, core-libraries-committee@…, mboes, int-e, adamgundry, rrnewton, RyanGlScott, oerjan, baramoglo
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D2010
Wiki Page: wiki:Typeable/BenGamari

Description (last modified by simonpj)

We have a plan to move to a type-indexed form of TypeRep. This ticket serves to track progress.

The key wiki page is: Typeable.

We would like to invite comments and discussion from the community using this ticket as a more permanent home than email.

There is a also a broader question, about using the new expressiveness of TypeRep and Typeable to support static pointers. One particular point that would benefit from more eyes is the polymorphic static pointers support. But first things first!

Change History (52)

comment:1 Changed 4 years ago by bjmprice

To record a couple of my thoughts:

  • When RAE's nokinds branch merges, we can have kind-hetrogenous type equalitys, which would find good use in comparing TypeReps
  • When Injective Type Families is implemented, it may be useful for reducing passing PolyTags around (currently they are not used in the code, but needed, else GHC complains about not necessarily injective type families)

comment:2 Changed 4 years ago by rodlogic

Cc: rodlogic added

comment:3 Changed 4 years ago by goldfire

From the wiki:

To ease the transition we will provide both

  • The old API via a (deprecated) new module Data.Typeable710.
  • The old API via the exiting module Data.Typeable but with new names for (deprecated) old types and functions.

This doesn't appear to ease the transition to me. It means forcing users to make a choice: 1) be lazy and change your import, or 2) educate themselves about our new interface, which involves a non-trivial amount of whiz-bang new features. Many will choose (1), because many people are lazy, especially in the face of type theory. Then those people will have to change again.

Furthermore, this violates the "3-year no-warning maintenance window" policy being formulated on the libraries@ list.

And, with a little type-level hackery, I think we can have our cake and eat it to: use just 1 set of names for both APIs. To wit:

data TypeRep710 = forall a. TypeRep710 (TypeRep80 a)
data TypeRep80 :: k -> *

type family TypeRep :: k where
  TypeRep = TypeRep710
  TypeRep = TypeRep80

class Typeable (a :: k) where
  typeRep# :: Proxy# a -> TypeRep80 a

type family TypeRepKind res where
  TypeRepKind (proxy (a :: k) -> b) = k
  TypeRepKind (TypeRep80 (a :: k)) = k

class TypeRepResult res where
  type TypeRepIndex res :: TypeRepKind res
  typeRep :: Typeable (TypeRepIndex res) => res

instance (b ~ TypeRep710) => TypeRepResult (proxy a -> b) where
  type TypeRepIndex (proxy a -> b) = a
  typeRep (_ :: proxy a) = TypeRep710 (typeRep :: TypeRep80 a)
instance TypeRepResult (TypeRep80 a) where
  type TypeRepIndex (TypeRep80 a) = a
  typeRep = undefined

This works on my branch. The following definitions also work:

foo :: TypeRep -> ()
foo = undefined

bar :: TypeRep Int -> ()
bar = undefined

where the first gets TypeRep710 and the second gets TypeRep80. We also conveniently have typeRep :: Typeable a => proxy a -> TypeRep (where that's the TypeRep710) and typeRep :: Typeable a => TypeRep a (where that's the TypeRep80). In a few years, we just remove the compatibility shim. :)

Do I honestly think this is a good idea? I'm not sure. But I don't have a better one that will keep everyone happy.

comment:4 Changed 4 years ago by simonpj

Remember:

  • Most things that use Typeable will continue to work un-modified. The class has not changed its kind; and the functions cast, gcast etc continue to work with the same signatures.
  • I believe that only a relatively small number of packages (mostly Edward K's) decompose TypeRep; in fact, Richard I think you did a small study that discovered this.

The story is already quite complicated. Making it more complicated still with a back-compat shim is undesirable unless it is absolutely necessary.

First thing is to nail the API etc; then we must discuss transition.

None of this is going to land until after 8.0

comment:5 Changed 4 years ago by simonpj

Description: modified (diff)
Summary: Type-indexed TypeReps, Static Pointers and Distributed ClosuresAdd type-indexed type representations (`TypeRep a`)

comment:6 Changed 4 years ago by simonpj

Description: modified (diff)

comment:7 Changed 4 years ago by bgamari

One bit of trickiness that I've encountered while looking at this is the that fact Core Lint disallows unsaturated applications of unlifted types. So, for instance, if we have,

data TTTypeRep (a :: k) where
    TrTyCon :: !Fingerprint -> !(TyCon a) -> TTypeRep (a :: k)
    TrApp   :: forall k1 k2 (a :: k1 -> k2) (b :: k1).
               !Fingerprint
            -> TTypeRep (a :: k1 -> k2)
            -> TTypeRep (b :: k1)
            -> TTypeRep (a b)

And we ask for Typeable (Array# Int), we will attempt to decompose Array# Int and end up with a subterm,

array#Rep :: TTypeRep Array#
array#Rep = TrTyCon @(* -> #) @Array# fprint array#TyCon

in the resulting representation, which currently fails Core Lint due to the argument @Array#, which is prohibited by Core Lint,

  • compiler/coreSyn/CoreLint.hs

    diff --git a/compiler/coreSyn/CoreLint.hs b/compiler/coreSyn/CoreLint.hs
    index 26e7257..861a9bc 100644
    a b lintType ty@(TyConApp tc tys) 
    10421042  = lintType ty'   -- Expand type synonyms, so that we do not bogusly complain
    10431043                   --  about un-saturated type synonyms
    10441044
    1045   | isUnliftedTyCon tc || isTypeSynonymTyCon tc || isTypeFamilyTyCon tc
     1045  | isTypeSynonymTyCon tc || isTypeFamilyTyCon tc
    10461046       -- Also type synonyms and type families
    10471047  , length tys < tyConArity tc
    10481048  = failWithL (hang (text "Un-saturated type application") 2 (ppr ty))

Richard suspected that this check is likely there for a good reason and it may take a bit of Hard Thought to ensure that unsaturated unlifted type applications are handled soundly.

Last edited 4 years ago by bgamari (previous) (diff)

comment:8 Changed 4 years ago by bgamari

Another question that I stumbled upon is whether we should provide something akin to Data.Typeable.splitPolyTyConApp, which provides access to the kind arguments which a type constructor was applied to. In some cases there may be no way to extract these kinds otherwise.

Consider the following,

data Proxy (a :: k) = Proxy
proxyIntRep :: TypeRep (Proxy Int)
proxyIntRep = typeRepKind proxyIntRep

data Compose (f :: k1 -> *) (g :: k2 -> k1) (x :: k2) = Compose (f (g a))
data Id a = Id a

composeIdIdRep :: TypeRep (Compose Id Id)
composeIdIdRep = typeRep

-- The plan currently allows
data AppResult (t :: k) where
    App :: TypeRep a  TypeRep b  AppResult (a b)
splitApp :: TypeRep a  Maybe (AppResult a)
typeRepKind :: TypeRep (a :: k) -> TypeRep k

-- Allowing one to get concrete kind of Proxy
proxyIntKind :: TypeRep (* -> *)
proxyIntKind = typeRepKind proxyIntRep

-- In the case of `Proxy Int` we can extract the first kind argument, 
-- `k ~ *`, trivially via function decomposition as it is the
-- kind of the first type.
pattern TRFun :: fun ~ (arg -> res)
              => TTypeRep arg 
              -> TTypeRep res 
              -> TTypeRep fun

-- In contrast, determining the instantiation of `k1` in the case of
-- `Compose Id Id` is not possible since it is not exposed in
-- the resulting kind.
composeIdIdKind :: TypeRep (* -> *)
composeIdIdKind = kindRep composeIdIdRep

comment:9 Changed 4 years ago by goldfire

Re comment:7:

Given our ideas in #11471, it's quite possible that the saturation check has gone stale. It does take a bit of Hard Thought, but my guess is that we can lift this restriction.

Re comment:8:

I'm afraid there are a few bugs in your code, and it's hard to disentangle what you're getting at.

  • The body of proxyIntRep is not what you mean. Perhaps you just meant typeRep?
  • The type of proxyIntKind is wrong. Why should it be TypeRep (* -> *)? I would think it's just plain old *, which is the kind of Proxy Int.
  • Your TRFun pattern would seem to want pattern signature () => fun ~ (arg -> res) => TTypeRep arg -> TTypeRep res -> TTypeRep fun, no? That is, the equality is provided, not required.
  • Are you worried about decomposing Proxy Int itself? As in case splitApp proxyIntRep of App trProxy _trInt -> typeRepKind trProxy? But that would just be * -> * because the Proxy would still be specialized to *.

Does this help to clarify? We absolutely need to only build kind-monomorphic TypeReps, as described in the paper, but I don't see where things go awry under this scheme.

comment:10 Changed 4 years ago by bgamari

Re comment:9:

Oh dear, yes, that code is just non-sense. Sorry about that, it seems I should have read through it before hitting submit.

Let's try again: The principle concern was that the current Typeable scheme provides a way for users to query the kinds at which a TyCon was instantiated via Data.Typeable.splitPolyTyConApp. The current New Typeable plan doesn't seem to provide any way to accomplish this.

comment:11 Changed 4 years ago by bgamari

Also, Richard pointed out in IRC that the TyCon representations that I produce in my current implementation are too weakly typed and would mean that we #10000 (also #9858) would remain (at least if we allowed something like TyCon -> TypeRep, which the current Typeable does not).

To fix this, we need to make the user provide type representations for the concrete kinds he wishes to instantiate at before giving him a representation. For instance, Proxy's representation would look like,

data TyCon a

proxyTyCon :: TypeRep k -> TyCon (Proxy :: k -> *)
proxyTyCon = ...

However, this makes an already complex story even more complex. Even worse, it will make all module with type definitions pay even more in both compilation time and code size due to the lambdas. TyCons are currently nice pure records with quite efficient representations. It would be nice to retain this property.

In light of this, perhaps it would be reasonable to consider an alternative:

An alternative: Remove TyCon entirely

An appealing alternative here would be to remove TyCon from the user-facing interface entirely. This would mean that it can remain unindexed, TyCon is already more-or-less an implementation detail and I can't think of any reason why a TypeRep wouldn't do.

Removing TyCon from the interface would be a substantial simplification to the scheme and neither Richard nor I could think of any reason why it is strictly necessary.

Last edited 4 years ago by bgamari (previous) (diff)

comment:12 Changed 4 years ago by bgamari

Actually, I forgot about a small wrinkle. TyCon is currently the only way one can query for the names of the package and module where a type constructor is defined.

Perhaps we should just stick with the status quo: provide TyCon in unindexed form as essentially a record of basic metadata about a type constructor (perhaps even hiding the fingerprint, so users don't get any funny ideas).

The important thing is that we don't provide a way to create a TypeRep from a TyCon (nor provide any evidence of type equality on the basis of TyCon equality). So long as this is the case the weakly-typed nature of TyCon should pose no threat.

Last edited 4 years ago by bgamari (previous) (diff)

comment:13 Changed 4 years ago by simonpj

  • Yes I hypothesise that when we have #11471 done (which will be soon) we can lift the restriction on partial applications of unboxed types. I don't remember why the restriction is there anyway; it may be that we can simply lift it.
  • The entire business of "querying the kinds at which a TyCon is instantiated" seems moot once types and kinds are unified. I.e. now. Doesn't the same decompose-type-application stuff work for kinds?
  • Yes, the main reason for TyCon is to be able to ask for its name, etc. Maybe other meta-data too in due course. SO I agree with comment:12. But let's not expose the internal representation of either TypeRep or TyCon.

comment:14 Changed 4 years ago by bgamari

I have an implementation of this here. It's quite rough, there are a variety of interface questions that remain to be resolved, and I have yet to fix up a few Typeable uses in GHC itself (e.g. the Binary instances) which prevents me from finishing a stage2 build. That being said most of the groundwork has been laid,

  • TTypeRep :: k -> * has replaced TypeRep in the definition of Typeable. The constructors of TTypeRep are entirely hidden; there are pattern synonyms for querying for the applications and type constructors.
  • There exists a TypeRep which is simply a quantified TTypeRep
  • We have tTypeRepKind :: TypeRep (a :: k) -> TypeRep k
  • The solver now preserves enough information to construct these representations. This took a bit of gymnastics due to the recursive kind relations of (->), TYPE, and Levity. These representations are handled as a bit of a special case. I need to
  • The desugarer can produce new Typeable dictionaries
  • I get much of the way through the stage2 build before failing as I haven't yet updated the instances in Binary.
  • withTypeable doesn't yet exist
  • In order to test things I have lifted the unsaturated-unlifted-application Core Lint check mentioned above
  • TrTyApp includes the kind of the application, which I only realized recently isn't necessary; it can be derived from the kind of the thing being applied.

There are a number of interface questions that remain:

  • do we want to provide the continuation-style queries or only pattern synonyms?
  • what should the typical use-case of serialization look like?
  • do we want to provide "paranoid" equality checks which scrutinize the entire representation, in addition to the constant-time fingerprint-based checks?
  • do we want to provide a way to view the kind arguments a tycon is instantiated at?
  • should the pattern synonyms against TrTypeRep provide the kind of the term in addition to the type constructor/applied types?
  • the usual bike-shedding regarding naming has yet to happen

That's pretty much the current state of things after spending the better part of last weekend on the project. Further progress will likely be slow until I have time again next weekend.

Last edited 4 years ago by bgamari (previous) (diff)

comment:15 in reply to:  13 Changed 4 years ago by goldfire

Replying to simonpj:

  • The entire business of "querying the kinds at which a TyCon is instantiated" seems moot once types and kinds are unified. I.e. now. Doesn't the same decompose-type-application stuff work for kinds?

No. We're using "kind" here to refer to parameters in which the tycon's kind is dependent. That is, when we have Proxy :: forall k. k -> Type, Proxy's first argument is dependent. We can't have a TypeRep just for plain, uninstantiated Proxy because Proxy's kind mentions a forall, and we would need TypeRep (forall k. k -> Type) Proxy, which requires impredicativity. And there's the matter of writing the very intricate type for mkAppTy :: TypeRep ... -> TypeRep ... -> TypeRep ..., which would now need to perform substitution. Ick. So we just don't do any of it.

This means that we can't decompose kind parameters.

comment:16 in reply to:  14 Changed 4 years ago by goldfire

Replying to bgamari:

  • The solver now preserves enough information to construct these representations. This took a bit of gymnastics due to the recursive kind relations of (->), TYPE, and Levity. These representations are handled as a bit of a special case. I need to

[sic] Ooh. It's like a mystery novel. Perhaps having a type-safe language with both * :: * and self-reflective type representations shows that math is actually inconsistent and opened up a hole in spacetime through which bgamari has exited. :) I do hope that's not the case.

Other points:

  • I vote only pattern synonyms.
  • Serialization is used in Cloud Haskell, no?
  • I like the idea of including the paranoid checks as an option for paranoid users. Although perhaps not, given that I believe it's likelier for a cosmic ray to flip a bit than for the fingerprints to give a false positive.

As for names: I think a proper migration story needs to be articulated. In a few years, I'd hate to have all these extra Ts lying around. Use a new module name for the new features and keep the old module with the old interface around? Maybe the new module can be Data.Reflection.

comment:17 Changed 4 years ago by simonpj

Oh yes, we have section about that: 5.8 in the paper.

OK then, how do we implement spitPolyTyConApp. The old API looked like:

splitPolyTyConApp :: TypeRep -> (TyCon,[KindRep],[TypeRep])

but what should the API look like now? Presumably the type args should be done with the new type-app-decomposition stuff, and when that says "not a type application" we can ask for the TyCon and KindRep args?

comment:18 Changed 4 years ago by bgamari

An alternative to falling back to an entirely unindexed TyCon would be to augment it with a type-indexed version,

-- Metadata describing a type constructor. Same thing we currently have.
data TyCon = TyCon { mod_name, pkg_name, con_name :: String }

-- A type constructor together with the kinds it was applied to (abstract)
data TTyCon (a :: k) = TTyCon { tycon :: TyCon
                              , dep_kinds :: [TypeRep]
                              , kind :: TTypeRep k }

-- A type representation (abstract)
data TTypeRep (a :: k) where
    TrTyCon :: Fingerprint -> TyCon a -> TTypeRep a
    TrApp   :: forall k1 k2 (a :: k1 -> k2) (b :: k1). 
               Fingerprint -> TTypeRep a -> TTypeRep b -> TTypeRep (a b)

pattern TRApp :: forall k k1 (a :: k1 -> k) (b :: k1) fun.
                 (fun ~ a b) => TTypeRep a -> TTypeRep b -> TTypeRep fun

pattern TRCon :: forall k (a :: k). TTyCon a -> TTypeRep a

tyConModule   :: TTyCon a -> String
tyConPackage  :: TTyCon a -> String
tyConDepKinds :: TTyCon a -> [TypeRep]
tyConKind     :: TTyCon (a :: k) -> TTypeRep k

tTypeRepKind  :: forall k (a :: k). TTypeRep a -> TTypeRep k
Last edited 4 years ago by bgamari (previous) (diff)

comment:19 Changed 4 years ago by goldfire

If we have an un-indexed TyCon type (an idea I quite like), then we can do something like this:

splitPolyTyConApp :: TypeRep a -> (TyCon, [TypeRepX])

which gets you all the arguments. But it loses all connection with the original type.

In other news, nowhere on Hackage (as downloaded sometime in the fall) is splitPolyTyConApp used. So maybe we just drop it.

comment:20 Changed 4 years ago by bgamari

These representations are handled as a bit of a special case. I need to

[sic] Ooh. It's like a mystery novel.

:-) That was meant to say that I needed to take another careful look at the implementation surrounding the representations primitive type since it seems a bit fragile at the moment.

Other points:

  • I vote only pattern synonyms.

That sounds reasonable to me.

  • Serialization is used in Cloud Haskell, no?

Yes; it's also used in GHC (there is a Binary instance). I'm a bit unsure at the moment of what this should look like.

  • I like the idea of including the paranoid checks as an option for paranoid users. Although perhaps not, given that I believe it's likelier for a cosmic ray to flip a bit than for the fingerprints to give a false positive.

I'm not opposed to providing it, although it would be good to state clearly what our threat model is here. What guarantees are we trying to keep? This question seems to become rather hairy in the presence of serialization.

As for names: I think a proper migration story needs to be articulated. In a few years, I'd hate to have all these extra Ts lying around. Use a new module name for the new features and keep the old module with the old interface around? Maybe the new module can be Data.Reflection.

I'm not sure the Ts are necessarily bad given that the quantified TypeRep still serves a purpose. Especially in light of my proposal in comment:18 it seems like the T prefix could be a useful convention.

comment:21 Changed 4 years ago by simonpj

Cc: ekmett core-libraries-committee@… added

It's clear that we'll continue to want bot a type-indexed and non-type-indexed representation. The only question is whether they are called

  • TypeRep t: type indexed
  • TypeRepX: non-indexed

or

  • TypeRepT t or TTypeRep t: type indexed
  • TypeRep: non-indexed

The latter has the merit of being more backward compatible. The former is nicer to my mind.

Adding Edward and the Core Libraries Committee.

comment:22 Changed 4 years ago by mboes

Cc: mboes added

comment:23 Changed 4 years ago by ekmett

Up until now we seem to have been focused on getting the Typeable internals "right" rather than compatible. Fortunately, the vast majority of users of Typeable limit themselves to deriving and to cast, so whatever you do to the internals affects very few users.

If you like the TypeRep t and TypeRepX convention, then I'd say go for it.

comment:24 Changed 4 years ago by goldfire

I think we can have our cake and eat it too. Just use a new module name and then we can have Data.Typeable.TypeRep be different from Data.Reflection.TypeRep. (Of course, we can keep just one Typeable, happily.)

I vastly prefer TypeRep and TypeRepX over TTypeRep and TypeRep. But I have no great reason for doing so.

comment:25 Changed 4 years ago by int-e

Cc: int-e added

comment:26 Changed 4 years ago by bgamari

I've put up some notes collected from the course of my implementation here.

At the moment I am struggling to find a design for serialization (or, more specifically, deserialization). The current scheme is described in this section, although the need for unsafeCoerce is worrying. I'd love to hear what others have to say about this.

comment:27 Changed 4 years ago by goldfire

I think that unsafeCoerce is quite bad and does threaten safety of the whole system, because you can always unpack a TypeRep from a TypeRepX.

But I think I may have an answer -- I have written a very partial Read instance in a similar setup here (used to implement my job talk). That indexed TypeRepX by a constraint, but I don't think you need that to get this done.

A big difference between my approach and the real one will be that the real one must be open (allow datatype definitions), while my setup is closed. I don't think this is so terrible, though, as we can use a class (somewhat like my TyConAble) that provides the ability to read in a tycon. This class could be internal, perhaps protected by a name that can't be written in Haskell.

Does this get you unstuck or have I done something silly?

comment:28 Changed 4 years ago by simonpj

I don't understand your proposal about how to make it open. (I agree, closed is (relatively) easy.)

comment:29 Changed 4 years ago by goldfire

Yes, I see your point. I guess I have indeed done something silly and that unsafeCoerce may be the only way forward. This means that an invalid input file could cause a segfault. But I suppose that's just the old "garbage in, garbage out" situation.

comment:30 Changed 4 years ago by adamgundry

Cc: adamgundry added

Ben, perhaps I'm missing something (your wiki page doesn't appear to define mkTyCon or mkTRApp), but I don't see why you can't write getTypeRepX without resorting to unsafeCoerce. If TyCon is unindexed then the first case should just work without it, and in the second case, isn't it possible to do a runtime check that the kinds of rep_f and rep_x agree?

Apologies if I've overlooked something obvious by being late to the party.

comment:31 in reply to:  24 Changed 4 years ago by bgamari

If TyCon is unindexed then the first case should just work without it,

Well, not quite. The trouble here is that kinds are necessarily of type *. That is, mkTrCon has type,

mkTrCon :: forall (k :: *) (a :: k). TyCon -> TTypeRep k -> TTypeRep a

That being said, a runtime check will suffice here.

This still leaves the question of whether the given TyCon may plausibly have the requested kind, but at least this does away with the unsafeCoerces.

in the second case, isn't it possible to do a runtime check that the kinds of rep_f and rep_x agree?

Indeed, this ought to be possible. Thanks for bringing this up!

comment:32 in reply to:  24 Changed 4 years ago by bgamari

Milestone: 8.2.1

Replying to goldfire:

I think we can have our cake and eat it too. Just use a new module name and then we can have Data.Typeable.TypeRep be different from Data.Reflection.TypeRep. (Of course, we can keep just one Typeable, happily.)

This is an interesting idea; it would be great to be able to pull off this change without massive code breakage.

That being said, to pick on the particular choice of names Reflection and Typeable: My only concern is that neither of these names suggest indexed or un-indexed other than the fact that Data.Typeable is currently unindexed. It would be nice to have a justification for these names for someone coming to the language without this historical context. Richard, can you think of a rationale here?

I'm having trouble thinking of another name than Reflection that has a better story here.

I vastly prefer TypeRep and TypeRepX over TTypeRep and TypeRep. But I have no great reason for doing so.

Right, I think I've also come around this view as well.

comment:33 Changed 4 years ago by goldfire

We could have Data.Typeable.Indexed, but then when the un-indexed version is deprecated and gone, we'll be left with an awkward module name. I think it will be straightforward enough to deprecate Data.Typeable and have a comment (appearing in Haddock too, of course) that Data.Reflection is the new module name. (The need to rename aside, I think Data.Reflection is superior to Data.Typeable.)

Very sadly, Data.Reflection is taken, by the reflection package, which is in active use. GHC.Reflection? (Aside: is there a concrete rule separating out GHC. modules from Data. ones? And should my new Data.Kind really be GHC.Kind?)

Or perhaps we start a new top-level module prefix and go with Type.Reflection. Maybe someday Data.Type.Equality and friends will lose the Data part, too.

comment:34 Changed 4 years ago by simonpj

I quite like Type.Reflection. But it's not unthinkable to have a module name clash between packages, particularly if they are not widely used.

comment:35 Changed 4 years ago by bgamari

Indeed, I have also wondered about Data.Kind. I don't know of any policy on GHC's module naming, but perhaps we should consider consulting with the CLC before placing things outside of GHC.

Type.Reflection sounds quite reasonable. I'll try to put together a proposal around this idea.

comment:36 Changed 4 years ago by ekmett

We've spoken about moving reflection into GHC itself, because the core of it is well typed, even if it can't be properly expressed in the surface language, and we had planned to put it in GHC.Reflection. (Back at ICFP in Boston a couple of years ago.)

The main reason why it didn't happen at the time was that by the time Austin and I got done writing Proxy# the conference was over.

comment:37 Changed 4 years ago by rrnewton

Cc: rrnewton added

comment:38 Changed 4 years ago by bgamari

Differential Rev(s): Phab:D2010
Status: newpatch

I've uploaded the current state of my wip/ttypeable branch as Phab:D2010. It's still quite preliminary however it does give you a working stage2 compiler with functional type-indexed type representations and what I believe is a pretty reasonable set of interfaces.

I've also written up a description of the proposal as I have implemented it at Typeable/BenGamari.

There are a number of tasks outstanding:

  • Fix representation pretty-printer to correctly handle tuples, lists, arrows, and precedence
  • Look at the testsuite failures:
    • several are easy output changes
    • others are testcases which appear to use internal Typeable interfaces that have changed: T8132, dynamic002
    • others seems to involve stack overflows: T10294, plugins01, T5550, annrun01, ann01, annth_make,
    • others are Core lint violations: T11120
    • others appear to be performance changes (even some improvements, perplexingly) that need to be examined
  • Carefully check Data.Typeable for deviations from the previous interface and fill in gaps where possible
  • Evaluate serialization/deserization performance; perhaps it would be worth offering unsafe interface without runtime checks?
  • Examine whether the primitive TypeReps (e.g. trTYPE, trArrow) should be INLINEABLE; we perform a number of runtime checks during serialization/deserialization so being able to inline these fingerprints may be helpful.
  • Evaluate whether we want to try harder to preserve the re-exports that have been dropped from Data.Dynamic.
  • Some of the new naming choices should be revisited (e.g. TRFun)
Last edited 4 years ago by bgamari (previous) (diff)

comment:39 Changed 4 years ago by simonpj

I like the direction here.

On deserialisation, let's make sure we have a compelling use-case before investing much time tilting at windmills.

comment:40 Changed 4 years ago by bgamari

The Core Lint violation in T11120 mentioned in comment:38 appears to be the result of (->) overly constrained kind when used in prefix position, * -> * -> *,

    Kind application error in type ‘(->) Char#’
      Function kind = * -> * -> *
      Arg kinds = [(Char#, TYPE 'WordRep)]

Where this type appears in a use of mkTrApp,

mkTrApp @* @* @((->) Char#) @CharHash ...

which was produced as evidence for the Typeable representation for,

data CharHash = CharHash Char#

I've opened #11714 to track the overly constrained kind of (->).

Last edited 4 years ago by bgamari (previous) (diff)

comment:41 Changed 4 years ago by bgamari

I believe the failure of the TypeOf test (which claims that things of kind Constraint are instead of kind *) may be related to a latent bug that I noticed a while back when working on #11334. I've posted a description and reproduction case as #11715.

comment:42 Changed 4 years ago by bgamari

I've consolidated some notes on the status of my implementation on the Wiki page. I've included a list of related tickets for things which need to be fixed elsewhere in the compiler.

comment:43 Changed 4 years ago by Iceland_jack

Bikeshedding.

Base already has the existentials SomeSymbol and SomeNat so one wonders if SomeTypeRep is a more consistent name for TypeRepX.

comment:44 in reply to:  43 Changed 3 years ago by bgamari

Replying to Iceland_jack:

Base already has the existentials SomeSymbol and SomeNat so one wonders if SomeTypeRep is a more consistent name for TypeRepX.

Very true. I do like this proposal despite the name being a bit longer. Really, the length isn't even such a big deal given that users can always use the Data.Typeable.TypeRep synonym.

Last edited 3 years ago by bgamari (previous) (diff)

comment:45 Changed 3 years ago by RyanGlScott

Cc: RyanGlScott added

comment:46 Changed 3 years ago by oerjan

Cc: oerjan added

comment:47 Changed 3 years ago by baramoglo

Cc: baramoglo added

comment:48 Changed 3 years ago by bgamari

Owner: set to bgamari

I'm still working on this here and there but suffering from a severe lack of time.

comment:49 Changed 3 years ago by simonpj

Keywords: Typeable added

comment:50 Changed 3 years ago by Ben Gamari <ben@…>

In 8fa4bf9a/ghc:

Type-indexed Typeable

This at long last realizes the ideas for type-indexed Typeable discussed in A
Reflection on Types (#11011). The general sketch of the project is described on
the Wiki (Typeable/BenGamari). The general idea is that we are adding a type
index to `TypeRep`,

    data TypeRep (a :: k)

This index allows the typechecker to reason about the type represented by the `TypeRep`.
This index representation mechanism is exposed as `Type.Reflection`, which also provides
a number of patterns for inspecting `TypeRep`s,

```lang=haskell
pattern TRFun :: forall k (fun :: k). ()
              => forall (r1 :: RuntimeRep) (r2 :: RuntimeRep)
                        (arg :: TYPE r1) (res :: TYPE r2).
                 (k ~ Type, fun ~~ (arg -> res))
              => TypeRep arg
              -> TypeRep res
              -> TypeRep fun

pattern TRApp :: forall k2 (t :: k2). ()
              => forall k1 (a :: k1 -> k2) (b :: k1). (t ~ a b)
              => TypeRep a -> TypeRep b -> TypeRep t

-- | Pattern match on a type constructor.
pattern TRCon :: forall k (a :: k). TyCon -> TypeRep a

-- | Pattern match on a type constructor including its instantiated kind
-- variables.
pattern TRCon' :: forall k (a :: k). TyCon -> [SomeTypeRep] -> TypeRep a
```

In addition, we give the user access to the kind of a `TypeRep` (#10343),

    typeRepKind :: TypeRep (a :: k) -> TypeRep k

Moreover, all of this plays nicely with 8.2's levity polymorphism, including the
newly levity polymorphic (->) type constructor.

Library changes
---------------

The primary change here is the introduction of a Type.Reflection module to base.
This module provides access to the new type-indexed TypeRep introduced in this
patch. We also continue to provide the unindexed Data.Typeable interface, which
is simply a type synonym for the existentially quantified SomeTypeRep,

    data SomeTypeRep where SomeTypeRep :: TypeRep a -> SomeTypeRep

Naturally, this change also touched Data.Dynamic, which can now export the
Dynamic data constructor. Moreover, I removed a blanket reexport of
Data.Typeable from Data.Dynamic (which itself doesn't even import Data.Typeable
now).

We also add a kind heterogeneous type equality type, (:~~:), to
Data.Type.Equality.

Implementation
--------------

The implementation strategy is described in Note [Grand plan for Typeable] in
TcTypeable. None of it was difficult, but it did exercise a number of parts of
the new levity polymorphism story which had not yet been exercised, which took
some sorting out.

The rough idea is that we augment the TyCon produced for each type constructor
with information about the constructor's kind (which we call a KindRep). This
allows us to reconstruct the monomorphic result kind of an particular
instantiation of a type constructor given its kind arguments.

Unfortunately all of this takes a fair amount of work to generate and send
through the compilation pipeline. In particular, the KindReps can unfortunately
get quite large. Moreover, the simplifier will float out various pieces of them,
resulting in numerous top-level bindings. Consequently we mark the KindRep
bindings as noinline, ensuring that the float-outs don't make it into the
interface file. This is important since there is generally little benefit to
inlining KindReps and they would otherwise strongly affect compiler performance.

Performance
-----------

Initially I was hoping to also clear up the remaining holes in Typeable's
coverage by adding support for both unboxed tuples (#12409) and unboxed sums
(#13276). While the former was fairly straightforward, the latter ended up being
quite difficult: while the implementation can support them easily, enabling this
support causes thousands of Typeable bindings to be emitted to the GHC.Types as
each arity-N sum tycon brings with it N promoted datacons, each of which has a
KindRep whose size which itself scales with N. Doing this was simply too
expensive to be practical; consequently I've disabled support for the time
being.

Even after disabling sums this change regresses compiler performance far more
than I would like. In particular there are several testcases in the testsuite
which consist mostly of types which regress by over 30% in compiler allocations.
These include (considering the "bytes allocated" metric),

 * T1969:  +10%
 * T10858: +23%
 * T3294:  +19%
 * T5631:  +41%
 * T6048:  +23%
 * T9675:  +20%
 * T9872a: +5.2%
 * T9872d: +12%
 * T9233:  +10%
 * T10370: +34%
 * T12425: +30%
 * T12234: +16%
 * 13035:  +17%
 * T4029:  +6.1%

I've spent quite some time chasing down the source of this regression and while
I was able to make som improvements, I think this approach of generating
Typeable bindings at time of type definition is doomed to give us unnecessarily
large compile-time overhead.

In the future I think we should consider moving some of all of the Typeable
binding generation logic back to the solver (where it was prior to
91c6b1f54aea658b0056caec45655475897f1972). I've opened #13261 documenting this
proposal.

comment:51 Changed 3 years ago by bgamari

Resolution: fixed
Status: patchclosed

It's done (although at an unfortunately high cost to compiler performance, which I'll be working to reduce).

comment:52 Changed 3 years ago by bgamari

Wiki Page: wiki:Typeable/BenGamari
Note: See TracTickets for help on using tickets.