Opened 4 years ago

Closed 17 months ago

#10607 closed feature request (fixed)

Auto derive from top to bottom

Reported by: songzh Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.11
Keywords: deriving, typeclass, auto Cc: RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #13324 Differential Rev(s):
Wiki Page:

Description (last modified by songzh)

It is very good to see that Template Haskell types are now Generic and DeriveAnyClass have been implemented. I suppose that this auto-deriving thing can go even further. The problem is like the following:

When we manipulate a complex composite JSON object, we need to define a dozen of data types for each of the dozen we may need to derive Eq, Show, Generic etc.

Another situaition can be manipulating the AST of a language, there may be dozens of data type definitions, also, for each of them, we need to derive Show, Eq, Generic.

Take the old Template Haskell types without Generic as an example, I want to make them instances of Generic type class, I wrote:

deriving instance Generic FixityDirection
deriving instance Generic Inline
deriving instance Generic RuleBndr
deriving instance Generic Match
deriving instance Generic Name
deriving instance Generic RuleMatch
deriving instance Generic Pred
deriving instance Generic Phases
deriving instance Generic Con
deriving instance Generic Module
deriving instance Generic AnnTarget
deriving instance Generic Type
deriving instance Generic TyVarBndr
deriving instance Generic TyLit
deriving instance Generic Exp
deriving instance Generic Lit
deriving instance Generic Pat
deriving instance Generic Dec
deriving instance Generic Clause
deriving instance Generic FunDep
deriving instance Generic Foreign
deriving instance Generic Fixity
deriving instance Generic Pragma
deriving instance Generic FamFlavour
...

How can we use Generic? Say we want to serialize Template Haskell data type with Binary type class, Of course, by writing Binary empty instances or just derive it with DeriveAnyClass extension. However, the problem is that we still need to write deriving Binary dozens of times, as many as the trivial Generic deriving declaration above.

What if GHC can attemp to derive the instance from the top of the tree down to the leaf in the data type declarations? For example:

data Person = Person Names Address
            | Student Names Address
data Names  = Names String
data Address = Address Gate
type Gate = (String,Int)

deriving topdown Generic Person
deriving topdown Eq Person

It will generate Eq instances of Person, Names and Address since Name and Address are in the Person data declaration tree.

I tried to implement a prototype see https://github.com/HaskellZhangSong/derive-topdown

It is rather an experiment tryout than a formal implementation. If you think this feature is attractive, I would like to discuss it.

Change History (27)

comment:1 Changed 4 years ago by songzh

Description: modified (diff)

comment:2 Changed 4 years ago by songzh

Description: modified (diff)

comment:3 Changed 4 years ago by goldfire

To me, this looks like the perfect application of Template Haskell. Yes, this could be built into GHC, but this seems like one step too far. If you're keen on this being built-in, perhaps seek out support on mailing lists and such to see whether others agree.

comment:4 in reply to:  3 Changed 4 years ago by songzh

Yes you are right, this is can be one application of TH. However, I am only a newbie of TH, I find I cannot query whether type like Eq a => List a is an instance of Eq with TH. And there are also other problems for me to implement it, like the situation type synonym with type variables. Any Template Haskell expert would like to work this out?

Replying to goldfire:

To me, this looks like the perfect application of Template Haskell. Yes, this could be built into GHC, but this seems like one step too far. If you're keen on this being built-in, perhaps seek out support on mailing lists and such to see whether others agree.

comment:5 Changed 4 years ago by goldfire

Yes -- TH struggles a bit with instance lookup and such. Implementing new querying features shouldn't be hard though, if you can suggest an API.

Why do you need fancy querying capabilities for your use case? Where does reifyInstances fail for you? As for type synonyms, you can check out the th-expand-syns package. th-desugar also does (limited) type family expansion (among other, larger features).

comment:6 in reply to:  5 ; Changed 4 years ago by songzh

Replying to goldfire:

Yes -- TH struggles a bit with instance lookup and such. Implementing new querying features shouldn't be hard though, if you can suggest an API.

Why do you need fancy querying capabilities for your use case? Where does reifyInstances fail for you? As for type synonyms, you can check out the th-expand-syns package. th-desugar also does (limited) type family expansion (among other, larger features).

Hi Richard I tried to implement a prototype of this top down deriving scheme with standalone deriving. The project is on https://github.com/HaskellZhangSong/TopdownDerive

However, I have 3 problems about it.

  1. It forces me to use KindSignatures, ConstraintKind and other extension which should not be needed.(please see TopDownDeriveTest.hs file)
  1. For type synonym, I want to do it without using TypeSynonymInstance, but I am not sure how to get the arity of a type constructor. For example: type T a = (a,a,a,a,a) I need to generate (Eq a , Eq b, Eq c ,Eq d, Eq e) => (a,b,c,d,e). However, the type synonym can be eta reduced, what should do to handle this case?
  1. When I derive some instances that based on Generic class like Binary or FromJSON it gives me a type error:
    Could not deduce (G.Generic a)
      arising from a use of `binary-0.7.6.1:Data.Binary.Class.$gdmget'
    from the context (B.Binary a)
      bound by the instance declaration

while I think deriving instance (Binary a ,Binary b) => Binary (A a b) should work fine. Why do I have to write deriving instance (B.Binary a, G.Generic a) => (B.Binary (B a))? And could you give me some suggestions to solve it.

I am with GHC 7.10.1 by the way

Thanks

Song

Last edited 4 years ago by songzh (previous) (diff)

comment:7 in reply to:  6 ; Changed 4 years ago by goldfire

Replying to songzh:

  1. It forces me to use KindSignatures, ConstraintKind and other extension which should not be needed.(please see TopDownDeriveTest.hs file)

Fair enough. But, in truth, you might need these extensions, depending on the definitions of types you are deriving for. One way forward here is to allow Template Haskell to turn on some extensions just within a splice, instead of specifying them at the top of a file. (This actually shouldn't be hard, once #10820 is complete.)

  1. For type synonym, I want to do it without using TypeSynonymInstance, but I am not sure how to get the arity of a type constructor. For example: type T a = (a,a,a,a,a) I need to generate (Eq a , Eq b, Eq c ,Eq d, Eq e) => (a,b,c,d,e). However, the type synonym can be eta reduced, what should do to handle this case?

I'm not sure what the problem is here. Are you worried about deriving, say, Functor, for which the eta reduction is necessary? But your example actually cannot be eta-reduced, because the variable a is used multiple times in the RHS. So I'm not sure what you're getting at on this one.

  1. When I derive some instances that based on Generic class like Binary or FromJSON it gives me a type error:
    Could not deduce (G.Generic a)
      arising from a use of `binary-0.7.6.1:Data.Binary.Class.$gdmget'
    from the context (B.Binary a)
      bound by the instance declaration

while I think deriving instance (Binary a ,Binary b) => Binary (A a b) should work fine. Why do I have to write deriving instance (B.Binary a, G.Generic a) => (B.Binary (B a))? And could you give me some suggestions to solve it.

I'm a little lost here. What's the code being type-checked? What's A? What's B? Does the error happen both with and without TH? Or only when using TH?

comment:8 Changed 4 years ago by goldfire

For point 3, could #10361 be related? I have a very strong suspicion that it is.

comment:9 in reply to:  7 ; Changed 4 years ago by songzh

Replying to goldfire:

Replying to songzh:

  1. It forces me to use KindSignatures, ConstraintKind and other extension which should not be needed.(please see TopDownDeriveTest.hs file)

Fair enough. But, in truth, you might need these extensions, depending on the definitions of types you are deriving for. One way forward here is to allow Template Haskell to turn on some extensions just within a splice, instead of specifying them at the top of a file. (This actually shouldn't be hard, once #10820 is complete.)

  1. Thanks very much. I will watch the ticket.
  1. For type synonym, I want to do it without using TypeSynonymInstance, but I am not sure how to get the arity of a type constructor. For example: type T a = (a,a,a,a,a) I need to generate (Eq a , Eq b, Eq c ,Eq d, Eq e) => (a,b,c,d,e). However, the type synonym can be eta reduced, what should do to handle this case?

I'm not sure what the problem is here. Are you worried about deriving, say, Functor, for which the eta reduction is necessary? But your example actually cannot be eta-reduced, because the variable a is used multiple times in the RHS. So I'm not sure what you're getting at on this one.

  1. I will try to solve it, just not sure about how to handle TySynD case.
    getTyVarCons :: Name -> Q ([TyVarBndr], [Con])
    getTyVarCons name = do
            info <- reify name
            case info of 
                 TyConI dec ->
                    case dec of
                         DataD    _ _ tvbs cons _ -> return (tvbs,cons)
                         NewtypeD _ _ tvbs con  _ -> return (tvbs,[con])
                         TySynD   _ vars type'    -> undefined -- need to handle type eta reduction
                         _ -> error "must be data, newtype definition or type synonym!"
                 _ -> error "bad type name, quoted name is not a type!"
    

You can see that I am not sure how to handle TySynD case. If the user declares

data A a b c = A a b c
type A' a = A a String a
  1. When I derive some instances that based on Generic class like Binary or FromJSON it gives me a type error:
    Could not deduce (G.Generic a)
      arising from a use of `binary-0.7.6.1:Data.Binary.Class.$gdmget'
    from the context (B.Binary a)
      bound by the instance declaration

while I think deriving instance (Binary a ,Binary b) => Binary (A a b) should work fine. Why do I have to write deriving instance (B.Binary a, G.Generic a) => (B.Binary (B a))? And could you give me some suggestions to solve it.

I'm a little lost here. What's the code being type-checked? What's A? What's B? Does the error happen both with and without TH? Or only when using TH?

  1. Sorry for confusing you. Normmally, I derive the type classes in the following way:
    {-# LANGUAGE StandaloneDeriving #-}
    {-# LANGUAGE DeriveGeneric #-}
    {-# LANGUAGE DeriveAnyClass #-}
        
    import qualified GHC.Generics as G
    import qualified Data.Binary as B
    
    data C a b = A (B a) deriving (Eq, G.Generic, Ord)
    data B a = B a | F (D a) deriving (Eq, G.Generic, Ord)
    data D b = D b | E b deriving (Eq, G.Generic, Ord)
    
    deriving instance B.Binary b => B.Binary (D b)
    deriving instance B.Binary a => B.Binary (B a)
    deriving instance (B.Binary a , B.Binary b) => (B.Binary (C a b)) 
    

For standalone deriving (B.Binary (C a b)), I do not need to give G.Generic a and G.Generic b in its context. However, if I derive G.Generic for C by using standalone deriving, GHC complains that I did not give G.Generic in context (B.Binary a, B.Binary b)

{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DeriveAnyClass #-}
    
import qualified GHC.Generics as G
import qualified Data.Binary as B

data C a b = A (B a) deriving (Eq, Ord)
data B a = B a | F (D a) deriving (Eq, G.Generic, Ord)
data D b = D b | E b deriving (Eq, G.Generic, Ord)

deriving instance (G.Generic a, G.Generic b) => (G.Generic (C a b))

deriving instance B.Binary b => B.Binary (D b)
deriving instance B.Binary a => B.Binary (B a)
deriving instance (B.Binary a , B.Binary b) => (B.Binary (C a b)) 

This means that I have to discriminate generic standalone deriving(i.e. with G.Generic context) and non-generic standalone deriving(i.e. without G.Generic context), which is not very nice. Are there any way to solve this.

comment:10 in reply to:  9 ; Changed 4 years ago by goldfire

Replying to songzh:

  1. I will try to solve it, just not sure about how to handle TySynD case.

Have you tried th-expand-syns? The snippet you've given just retrieves the constructors (with their bound variables) of a type. The fact that your A' synonym repeats a variable a shouldn't affect anything. You should be able to just use th-expand-syns and then recur. I'm sorry -- I still don't see the problem.

This is more straightforward: you have unnecessary assumptions in your manual Generic instance. Deriving G.Generic (C a b) does not require G.Generic instances for a and b. In your standalone-deriving version, you've added these, so when the default methods in Binary use C's Generic instance, GHC also tries to get those others. Omit the context for your Generic (C a b) instance and your second example compiles.

comment:11 in reply to:  10 Changed 4 years ago by songzh

  1. I am a newbie of TH, but thanks, I will look into th-expand-syns
  1. I noticed and understood the reason why we do not need type variable Generic a in the context finally!! thanks.

I may request an API for querying whether Eq a => List a is an instance of Eq typeclass. Thanks.

comment:12 Changed 3 years ago by RyanGlScott

Cc: RyanGlScott added

I'm chiming in on this ticket pretty late, but nevertheless, here's my take on this situation.

I agree that this feature shouldn't be baked into the language, and this should ideally be something that Template Haskell could handle. And Song certainly tried to make it work in TH, but quickly discovered a problem that has bitten singletons and many other TH libraries—TH is just plain terrible at type inference. Stated in a more focused way, if you are trying to use TH to splice in something like:

deriving instance ??? => C (T a b c)

Then trying to use TH to come up with ??? is really hard in the general case. In fact, I don't think there'd be any way to Do It Right without the full power of GHC's type inference engine, which is certainly something I'm not eager to bake into TH.

I talked to Song about this at ICFP 2016, and we brainstormed some ways to get around this. I pointed out that StandaloneDeriving currently has a glaring flaw. While StandaloneDeriving is quite flexible and lets you derive instances wherever you want, it always requires that you provide the instance context in full. But deriving clauses don't have this onerous restriction, as you can just type:

data T a b c = ...
  deriving C

And GHC will figure out the instance context for you. What I feel like we need here is the ability to combine StandaloneDeriving's capability to be used anywhere with deriving clauses' type inference.

So what I will suggest (and hopefully articulate later in a proper GHC proposal) is extending StandaloneDeriving in the following way: if you type this:

deriving instance => C (T a b c)

Then GHC will try to fill in the instance context as if you had written data T a b c = ... deriving C. (Syntax bikeshedding is welcome.) Now the derive-topdown library's job is much easier: it just has to splice in a bunch of instances of the form:

deriving instance => Generic (A a b)
deriving instance => Data (B a b)
-- etc.

And GHC will do the rest! This requires very little in the way of fancy TH footwork, and moreover, it makes StandaloneDeriving more convenient to use.

Last edited 3 years ago by RyanGlScott (previous) (diff)

comment:13 Changed 3 years ago by rwbarton

Sounds reasonable.

(Syntax bikeshedding is welcome.)

Well, since you asked... how about deriving instance _ => C (T a b c), which is currently accepted by the parser (and probably won't be too taxing for other tools) but rejected later with the error

C.hs:8:19: error:
    Wildcard ‘_’ not allowed
      in In a deriving declaration for ‘C’
  |
8 | deriving instance _ => C (T a b c)
  |                   ^

comment:14 Changed 3 years ago by RyanGlScott

I'm quite reluctant to steal wildcard syntax for this particular purpose, because it's conceivable that in the future, we might want to allow GHC to fill in this wildcard, producing an error to the effect of:

• Found type wildcard ‘_’ standing for ‘(C a, C b, C c)’
  To use the inferred type, enable PartialTypeSignatures

comment:15 Changed 3 years ago by rwbarton

Why isn't that the same thing though?

comment:16 Changed 3 years ago by RyanGlScott

Hm... you know, that's a good observation. This approach does require another language extension (PartialTypeSignatures), but it certainly achieves the same end result.

My only hesitation is that I don't understand PartialTypeSignatures that well, so I haven't fully considered the ramifications of allowing this. I'm fairly sure it would require a more complication implementation, since you'd also probably want to allow things like this:

deriving instance (C a, _) => C (T a b c)

But the prospect of not needing any changes to the language is a mighty tempting one.

comment:17 Changed 3 years ago by simonpj

Smells right to me.

comment:18 Changed 3 years ago by RyanGlScott

Alright. I've opened up #13324 to track the feature request for using type wildcards in derived instance contexts, since that has applications besides derive-topdown.

Song, would this approach work for your needs?

comment:19 in reply to:  18 Changed 3 years ago by songzh

Replying to RyanGlScott:

Alright. I've opened up #13324 to track the feature request for using type wildcards in derived instance contexts, since that has applications besides derive-topdown.

Song, would this approach work for your needs?

Thanks, but I think there still is one problem.

When writing the TH code, I encountered three problems. One is the context generation problem which you are trying to solve for me. Another is type synonym. I want to expand it to data or newtype declaration to generate instance, but now I forced user to use XTypeSynonymInstances. Richard introduced me th-expand-syns which I am not quite sure how it can be used to help on my problem now.

The last problem is the most severe one. It is about isInstance function.

My process of generation standalone instances is dead simple by using type StateT [Type] Q [Dec]. The roughly sketched algorithm is the following:

For a type T, get all its arguments of data constructors and prepare to generate instance of class C. But before the generation, for each of the argument arg_n we check two things:

1, whether the type arg_n already an instance of the class which may be built in GHC. (by using isInstance function in Langauge.Haskell.TH)

2, whether it has already been generated. (by looking into the state hold by the state monad)

If neither the case, we generate an standalone declaration and add the type into the list of state monad for avoiding future duplicated generation. After that call the instance generation function recursively. The above two cases are the base cases for the recursion.

The problem is: For case 1, we cannot check a type such as [a], (a,b) is an instance of Eq or Ord with isInstance function, no matter we give the type class context or not:

{-# LANGUAGE TemplateHaskell,QuasiQuotes,ExplicitForAll #-}
import Language.Haskell.TH

char :: Q Bool
char = do
     char_t <- [t| Char |]
     isInstance ''Eq [char_t]

> $(char >>= stringE.show)
"True

poly_a :: Q Bool
poly_a = do
    poly_a_t <- [t| forall a. Eq a => [a] |]
    isInstance ''Eq [poly_a_t]

> $(poly_a >>= stringE.show)
"False"

poly_a' :: Q Bool
poly_a' = do
    poly_a_t <- [t| forall a. [a] |]
    isInstance ''Eq [poly_a_t]
-- False

pair :: Q Bool
pair = do
    pair_t <- [t| forall a b. (a,b) |]
    isInstance ''Eq [pair_t]
-- False

pair' :: Q Bool
pair' = do
    pair_t <- [t| forall a b. (Eq a, Eq b) => (a,b) |]
    isInstance ''Eq [pair_t]
-- False

comment:20 Changed 3 years ago by RyanGlScott

Ah. For your third problem, I have a suggestion which just might work. GHC.Exts exports a handy type family:

type family Any :: k where {}

which inhabits any kind k. You can use Any as a handy workaround to this isInstance limitation:

{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

import GHC.Exts

char :: Q Bool
char = do
     char_t <- [t| Char |]
     isInstance ''Eq [char_t]

poly_a :: Q Bool
poly_a = do
    poly_a_t <- [t| [Any] |]
    isInstance ''Eq [poly_a_t]

pair :: Q Bool
pair = do
    pair_t <- [t| (Any,Any) |]
    isInstance ''Eq [pair_t]

Then $(poly_a >>= stringE.show) and $(pair >>= stringE.show) both give True.

Granted, there are still some obscure corner cases for which this wouldn't work, like if you only had instance Eq [Int] but not instance Eq a => Eq [a]. But for what you're trying to do, which is checking instances of the form instance (C a_1, ..., C a_n) => C (T a_1 ... a_n), it should work great!

comment:21 Changed 23 months ago by songzh

Dear All

I am very happy to announce that I released derive-topdown-0.0.0.9. I tested with Info type in template-haskell and HsModule type in haskell-src. It can derive class instances for most common cases I think. It also can be used with other type classes with template Haskell deriving mechanism via a function with type Name -> Q [Dec]. I also considered DerivingStrategies in GHC 8.2. Please try it if you are interested.

comment:22 Changed 18 months ago by Ryan Scott <ryan.gl.scott@…>

In affdea82/ghc:

Allow PartialTypeSignatures in standalone deriving contexts

Summary:
At its core, this patch is a simple tweak that allows a user
to write:

```lang=haskell
deriving instance _ => Eq (Foo a)
```

Which is functionally equivalent to:

```lang=haskell
data Foo a = ...
  deriving Eq
```

But with the added flexibility that `StandaloneDeriving` gives you
(namely, the ability to use it anywhere, not just in the same module
that `Foo` was declared in). This fixes #13324, and should hopefully
address a use case brought up in #10607.

Currently, only the use of a single, extra-constraints wildcard is
permitted in a standalone deriving declaration. Any other wildcard
is rejected, so things like
`deriving instance (Eq a, _) => Eq (Foo a)` are currently forbidden.

There are quite a few knock-on changes brought on by this change:

* The `HsSyn` type used to represent standalone-derived instances
  was previously `LHsSigType`, which isn't sufficient to hold
  wildcard types. This needed to be changed to `LHsSigWcType` as a
  result.

* Previously, `DerivContext` was a simple type synonym for
  `Maybe ThetaType`, under the assumption that you'd only ever be in
  the `Nothing` case if you were in a `deriving` clause. After this
  patch, that assumption no longer holds true, as you can also be
  in this situation with standalone deriving when an
  extra-constraints wildcard is used.

  As a result, I changed `DerivContext` to be a proper datatype that
  reflects the new wrinkle that this patch adds, and plumbed this
  through the relevant parts of `TcDeriv` and friends.

* Relatedly, the error-reporting machinery in `TcErrors` also assumed
  that if you have any unsolved constraints in a derived instance,
  then you should be able to fix it by switching over to standalone
  deriving. This was always sound advice before, but with this new
  feature, it's possible to have unsolved constraints even when
  you're standalone-deriving something!

  To rectify this, I tweaked some constructors of `CtOrigin` a bit
  to reflect this new subtlety.

This requires updating the Haddock submodule. See my fork at
https://github.com/RyanGlScott/haddock/commit/067d52fd4be15a1842cbb05f42d9d482de0ad3a7

Test Plan: ./validate

Reviewers: simonpj, goldfire, bgamari

Reviewed By: simonpj

Subscribers: goldfire, rwbarton, thomie, mpickering, carter

GHC Trac Issues: #13324

Differential Revision: https://phabricator.haskell.org/D4383

comment:23 Changed 18 months ago by RyanGlScott

songzh, I finally delivered on my promise to implement the ability to use PartialTypeSignatures in standalone deriving declarations à la deriving _ => Eq (Foo a). Does this make things easier for you on your end?

comment:24 in reply to:  23 Changed 17 months ago by songzh

Replying to RyanGlScott:

songzh, I finally delivered on my promise to implement the ability to use PartialTypeSignatures in standalone deriving declarations à la deriving _ => Eq (Foo a). Does this make things easier for you on your end?

Yes! Your Any wrinkle has worked very well already, but this will make a lot improvement for derive-topdown package. With new PartialTypeSignatures a lot of code can be removed in the package, but for compatibility reason, I will keep them for older GHC.

Even though class instance contexts can be generated and inferred now and type role is carefully handled for now, three other packages,primitive, th-expand-syns, transformers ,still are needed. This stops current derive-topdown being used inside ghc's code, so I will write a minimal package in order to reduce the dependencies of derive-topdown.

Thanks.

comment:25 Changed 17 months ago by RyanGlScott

Great! Do you consider this issue to be resolved, then?

comment:26 in reply to:  25 Changed 17 months ago by songzh

Replying to RyanGlScott:

Great! Do you consider this issue to be resolved, then?

Yes, please close it. I will update derive-topdown a bit when I have time. After that I will have a look https://ghc.haskell.org/trac/ghc/ticket/13368 and try to make it possible with derive-topdown. I suppose that we could write an essay or paper to summarize all this deriving stuffs.

comment:27 Changed 17 months ago by songzh

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.