Opened 3 years ago

Last modified 2 years ago

#13327 new bug

Figure out how to make sense of Data instances for poly-kinded datatypes

Reported by: RyanGlScott Owned by:
Priority: normal Milestone:
Component: libraries/base Version: 8.0.1
Keywords: deriving Cc: dreixel
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #4028 Differential Rev(s):
Wiki Page:


Something's funny with Data.Data. In particular, what instance do you expect when you write this?

data T phantom = T                                                                        
  deriving Data

You might expect:

instance Typeable phantom => Data (T phantom) where

And indeed, it is possible to define a Data instance like this. But in practice, GHC actually derives this instance:

instance Data phantom => Data (T phantom) where
  dataCast1 f = gcast1 f

The reason is because GHC has special cases for when it derives Data instances for datatypes of kind * -> * or * -> * -> *. These special cases cause the derived Data instances to insert definitions for dataCast1 or dataCast2, respectively, and their implementations (gcast1 or gcast2, respectively) require the stronger instance context. See #4028 for a longer discussion about this.

Things get far less predictable when you throw in PolyKinds, however. If you try this instead:

data T (phantom :: k) = T
  deriving Data

Then you do not get instance Data phantom => Data (T phantom) as above. Instead, you get this:

instance (Typeable k, Typeable (phantom :: k)) => Data (T phantom) where
  -- No implementation for dataCast1

As it turns out, GHC's special-casing for deriving Data is not privy to datatypes of kind k -> * or k1 -> k2 -> *, just * -> * and * -> * -> *.

This is all a bit disconcerting because if you look at Data.Data, there are many instances of the flavor:

deriving instance Data (p :: *) => Data (U1 p) -- U1 :: k -> *

This makes sense in a pre-PolyKinds world, but really, the most general type for this Data instance would be:

deriving instance (Typeable k, Typeable (p :: k)) => Data (U1 p)

Even worse, choosing the less polymorphic instance Data (p :: *) => Data (U1 p) via StandaloneDeriving doesn't even cause GHC to emit a definition for dataCast1—it just restricts the kind with no benefit!

This whole situation doesn't sit well with me, especially since as we add more poly-kinded Data instances to Data.Data, we have to ask ourselves each time what the "right" Data instance is. It would be great if we could answer these questions:

  1. Should we expand the special-casing for datatypes of kind k -> * or k1 -> k2 -> *? Or do dataCast1 and dataCast2 even make sense for poly-kinded datatypes?
  2. Is there a way to modify DeriveDataTypeable such that emitting definitions for dataCast1 and dataCast2 don't require us to default some kind variables to *? Perhaps TypeInType could help us here somehow.

Change History (4)

comment:1 Changed 3 years ago by simonpj

The whole dataCast1, dataCast2 business is clearly a hack. It predates kind polymorphism. Could it be cleaned up now we have kind polymorphism?

comment:2 Changed 3 years ago by RyanGlScott

comment:3 Changed 3 years ago by RyanGlScott

Edward Kmett laid out an interesting (albeit not terribly practical) proposal for cleaning up the current story in Sadly, his code doesn't actually typecheck due to what I believe is a corollary of #13337.

I came up with an even grosser proposal in that does typecheck, but with emphasis on the "even grosser" part. There's certainly no way I'd want Data.Data to adopt that approach as-is.

I need to think more on how I can take the programming patterns in these two proposals and clean them up into something palatable.

comment:4 Changed 2 years ago by RyanGlScott

Keywords: deriving added
Note: See TracTickets for help on using tickets.