Opened 7 years ago

Closed 23 months ago

Last modified 16 months ago

#7401 closed feature request (fixed)

Can't derive instance for Eq when datatype has no constructor, while it is trivial do do so.

Reported by: jpbernardy Owned by:
Priority: normal Milestone: 8.4.1
Component: Compiler Version: 7.6.1
Keywords: deriving, newcomer, GHCProposal Cc: omeragacan@…, RyanGlScott, dfeuer, enolan
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: GHC rejects valid program Test Case: deriving/should_run/T7401, deriving/should_run/T7401_fail
Blocked By: Blocking:
Related Tickets: #10577, #13117 Differential Rev(s): Phab:D4047
Wiki Page:

Description (last modified by osa1)

On a "phantom datatype" D, one gets the message:

Can't make a derived instance of `Eq D':
      `D' must have at least one data constructor

However there is a trivial, correct instance of Eq D:

instance Eq D where
  (==) = undefined

Change History (55)

comment:1 Changed 7 years ago by jpbernardy

Type: bugfeature request
Type of failure: None/UnknownGHC rejects valid program

comment:2 Changed 7 years ago by jpbernardy

Summary: Can't derive instance for Eq when datatype has no constructor, when it's trivial do do so.Can't derive instance for Eq when datatype has no constructor, while it is trivial do do so.

comment:3 Changed 6 years ago by igloo

Description: modified (diff)
difficulty: Unknown

comment:4 Changed 6 years ago by igloo

Milestone: 7.8.1
Status: newinfoneeded

jpbernardy, have you got an example of why this is useful please?

Did you run into this in a real program?

comment:5 Changed 6 years ago by kosmikus

It's useful for example when you're using such a type as an argument to another datatype, which happens to place an Eq constraint on its argument.

However, it seems that this works when standalone deriving is used. That might be a viable workaround if it's somehow difficult to fix.

comment:6 in reply to:  description Changed 6 years ago by jpbernardy

Basically what kosmikus said. This occurs often when you do generic programming. ie, when you encode the type A as A + 0 where 0 is a type with no constructor and + is an either-like type.

So: yes, I have seen this in real programs. It's not too hard to write an instance manually, but the error message is confusing. I suspect that a proper fix (allowing to derive an instance) should be very easy, if it is not the case, a better error message (hinting at the user to write an instance with 'undefined') would also be an acceptable fix.

comment:7 Changed 6 years ago by monoidal

I can try to do on this, but let me ask:

Allowing "data A deriving (Eq)" is not compatible with Haskell 98 nor 2010 (point 6 at http://www.haskell.org/onlinereport/haskell2010/haskellch11.html#x18-18200011). Does this mean I should add an add a flag, say -XEmptyDeriving, or should GHC allow such declarations by default?

comment:8 Changed 6 years ago by monoidal

Status: infoneedednew

comment:9 Changed 6 years ago by monoidal

Owner: set to monoidal

comment:10 Changed 6 years ago by simonpj

Thanks for helping. Concerning your question, I think it would be fine for -XEmptyDataDecls to imply that we can derive Eq etc for them. No need for a second flag.

What would be good would be some documentation in the user manual (in the empty data decls section) that says what is derived for the standard build-in classes.

Simon

comment:11 Changed 6 years ago by simonpj

Two more thoughts

  • You point out in #7931, this already works for standalone deriving, which is a bit odd. Standalone and non-standalone deriving should behave pretty much the same.
  • When I say "works" what I mean is that, for nullary types, all methods foo get this definition:
    foo = error "Void foo"
    
    This is done in TcGenDeriv.mkRdrFunBind. Several points:
    • Is this always the right thing to do? For every method, and every class? Eg for Read it wasn't; see #7931. Needs checking.
    • We could also generate
      foo x y = error "Void foo"
      
      (ie with the error inside the lambdas) which isn't quite the same. I'm not sure which is right.
    • Including the name of the type constructor in the error message would be a lot more helpful.

Thanks!

Simon

comment:12 Changed 6 years ago by simonpj

Omer writes: I just started GHC development made a commit for ticket 7401: http://ghc.haskell.org/trac/ghc/ticket/7401

My patch is here: https://github.com/osa1/ghc/commit/3ec257ff48372de30df59cd8854ce28954c9db95

make test succeeds. My test case for this patch is something like this:

    data D deriving Eq

    main :: IO ()
    main = print ((undefined :: D) == (undefined :: D))

example:

haskell$  ./ghc/inplace/bin/ghc-stage2 --interactive derive.hs
GHCi, version 7.7.20130806: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
[1 of 1] Compiling Main             ( derive.hs, interpreted )
Ok, modules loaded: Main.
ghci> main
True
it :: ()

This behavior (returning True) is consistent with standalone deriving version:

    haskell$  cat derive_standalone.hs
    {-# LANGUAGE StandaloneDeriving #-}

    data D

    deriving instance Eq D

    main :: IO ()
    main = print ((undefined :: D) == (undefined :: D))

    haskell$  ./ghc/inplace/bin/ghc-stage2 --interactive derive_standalone.hs
    GHCi, version 7.7.20130806: http://www.haskell.org/ghc/  :? for help
    Loading package ghc-prim ... linking ... done.
    Loading package integer-gmp ... linking ... done.
    Loading package base ... linking ... done.
    [1 of 1] Compiling Main             ( derive_standalone.hs, interpreted )
    Ok, modules loaded: Main.
    ghci> main
    True
    it :: ()
    ghci>
    Leaving GHCi.

However, if you want (==) implementation in this case to be `error "Void (==)"`, I think I can also do that by first fixing the code generated by StandaloneDeriving extension and then fixing my current patch.

Any comments and reviews would be appreciated!

comment:13 Changed 6 years ago by simonpj

There was some follow-up email on ghc-devs about this. (But let's accumulate comments here.) My summary

  • If we have EmptyDataDecls then surely
    data T deriving( Eq )
    
    should work, just as well as standalone deriving
    data T
    deriving Eq T
    
    The two must behave the same.
  • Surely all the built-in classes (Eq, Show, Ord etc) should derive for with EmptyDataDecls too.
  • Haskell 2010 doesn't have EmptyDataDecls (eg the data type chapter specifies n>0 constructors). But oddly the deriving appendix clause 6 says that nothing is derived if there are no constructors. I think we should just ignore clause 6, since H2010 doesn't have empty data types.
  • It would be better if the error "Void Eq" mentioned the data type. More like error "Eq on empty data type T".

Might you do those things, Omer?

Simon

comment:14 Changed 6 years ago by simonpj

Cc: omeragacan@… added

comment:15 Changed 6 years ago by goldfire

Good point about the fact that H2010 does not have empty data decls. But, then, GHC's interpretation of H2010 is incorrect, as setting H2010 as the language turns on EmptyDataDecls. Personally, I don't love straying from the standard, but it may be best not to unnecessarily break code here by "fixing" this deviation from the standard.

comment:16 Changed 6 years ago by monoidal

Haskell 2010 clearly has empty data decls. We've got in section 4.2.1

topdecl -> data [context =>] simpletype [= constrs] [deriving]

Note that "= constrs" is in square brackets [] which denote optional part.

There's also this part: "This declaration introduces a new type constructor T with zero or more constituent data constructors". Plus, the announcement of the standard contained a list of additions, including empty data declarations. If there's any contradicting information, I would call it a typo in the standard. However, there's a quirk: the deriving clause in H2010 is allowed only on nonempty data decls.

I can't work on the ticket in this month - sorry. To the patch author: I think we should be consistent with Eq, Ord etc. One way would be removing cond_stdOK altogether. This should make conditions for standalone deriving and normal deriving completely equivalent, but it is overly permissive in regard to standard.

comment:17 Changed 6 years ago by thoughtpolice

I'm very confused here.

When Simon Marlow announced H2010, EmptyDataDecls is specifically mentioned as part of the revision:

https://groups.google.com/forum/#!topic/fa.haskell/-muu4dzUBJo

Furthermore, the haskell-prime wiki makes this mention in regards to the extension:

http://ghc.haskell.org/trac/haskell-prime/wiki/EmptyDataDecls

The only real issue is whether or not to allow the optional deriving clause after an empty declaration, and, if not, on what stage to rule them out. Clearly, as there are no data constructors over which to define functions, derived instances would (for consistency) have to be everywhere undefined. GHC seems to syntactically allow a deriving clause after an empty data declaration, but the treats it as a contextual error since no interesting instances can be defined. Presumably the reasoning was that this gives a more regular syntax and better error messages than ruling out deriving for empty declarations syntactically. But the point is that there is a choice.

For this extension, probably the best course of action is to follow GHC and allow deriving () clauses for empty data declarations. That way no deriving clause and deriving () still have the same semantics. To do this, we syntactically allow all deriving clauses for empty data, but semantically limit the clauses to deriving ().

(Just to note, this is in fact how it now works - 'data Foo deriving ()' works just fine, but 'data Foo deriving Eq' is a contextual error.)

So, I don't think we're deviating from the standard at all in fact, at least based on the proposal and what Simon said. Perhaps these parts of the standard document were simply overlooked during the editorial process, hence the inconsistency? Under my interpretation, H2010 absolutely has EmptyDataDecls, and they are "ordinary" data types.

Now we are back to the original question, which the proposal somewhat alluded to. I see this as two separate points:

  • What is more consistent, to reject the instance or create the vacuous one? My view is already clear here (that 'ordinary' data types with trivial vacuous instances - where there are no constructors - should also be derivable.)
  • The standard however, specifically denies this interpretation, as does the Haskell' proposal (delta pt 6.) There are two options:
    • Put this behavior under a new flag? Above, Krzysztof mentioned a possible -XEmptyDeriving.
    • Change the semantics of -XEmptyDataDecls and deviate specifically? This is a delicate point obviously. Retroactively changing extensions is always annoying, but we've been here before, and backwards compatibility with a manual instance is doable. This logic kind of flies in the face of doing it anyway, at face value - but this is relevant to users. Some libraries are very long-term supported, and a 'legacy style' must be dealt with. They can simply write manual instances and nothing will ever break. Other people who don't care can just use the new behavior.

Feasibly, whatever we do, this minor amendment could be proposed for Haskell 2014. If this is the case, perhaps we should use a new flag.

I'm personally in favor of doing whatever users want if we can get any consensus, because ultimately they maintain the libraries and will possibly see the consequences of this change.

comment:18 Changed 6 years ago by simonpj

OK, yes, I missed the [ = constrs ] part of the grammar. Yes, EmptyDataDecls is part of H2010.

And yes, clause (6) in the deriving appendix explicitly dis-allows deriving( Eq ), but here I think it would be OK for GHC to be a little more generous than the standard. That is, with EmptyDataDecls (or Haskell2010) we'd accept

data T deriving( Eq )

whereas H2010 would reject it. But failing to reject is very unlikely to break anyone's code.

Simon

comment:19 Changed 5 years ago by thoughtpolice

Milestone: 7.8.37.10.1

Moving to 7.10.1

comment:20 Changed 5 years ago by jstolarek

Keywords: newcomer added

comment:21 Changed 5 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:22 Changed 4 years ago by thomie

Owner: monoidal deleted

comment:23 Changed 4 years ago by osa1

I submitted a patch for this: https://phabricator.haskell.org/D978

comment:24 Changed 4 years ago by thomie

Differential Rev(s): Phab:D978
Owner: set to osa1

comment:25 Changed 4 years ago by simonpj

Responding to Phab here, because the design discussion is less likely to get lost here.

  1. Thank you for the user manual update. Can you add a paragraph or two explaining the design choice? After all, if I did deriving( Eq ) on an empty data type, or standalone deriving, I would expect a static error; the generated code is never useful. But on the contrary, we now propose to give a dynamic error, postponing a crash until runtime. Why is this good? Explain either in the user manual or on a wiki page that discusses. comment:6 seems relevant, but is cryptic.
  1. I'm certain that standalone deriving and deriving(...) on a data type declaration should behave the same. Imagine trying to explain the difference in the user manual, if they differed! Can you make it so?
  1. For Ix, Enum, Bounded, the 2010 report uses words like "derived instance declarations for the class Ix are only possible for enumerations (i.e. datatypes having only nullary constructors) and single-constructor datatypes".

It seems odd to make them fail with a static error if Eq etc fail with a dynamic error. We can interpret the words "having only nullary constructors" as true for an empty data type; after all all the constructors it has are nullary. So let's make isEnumeration true for empty types too. (Make a Note to explain; and the user manual.)

Simon

comment:26 Changed 4 years ago by osa1

(1) and (2) are done(will push commits soon). For (3), I think it may break some things. I found this NOTE in compiler/types/TyCon.hs:

Note [Enumeration types]
~~~~~~~~~~~~~~~~~~~~~~~~
We define datatypes with no constructors to *not* be
enumerations; this fixes trac #2578,  Otherwise we
end up generating an empty table for
  <mod>_<type>_closure_tbl
which is used by tagToEnum# to map Int# to constructors
in an enumeration. The empty table apparently upset
the linker.

Moreover, all the data constructor must be enumerations, meaning
they have type  (forall abc. T a b c).  GADTs are not enumerations.
For example consider
    data T a where
      T1 :: T Int
      T2 :: T Bool
      T3 :: T a
What would [T1 ..] be?  [T1,T3] :: T Int? Easiest thing is to exclude them.
See Trac #4528.

See also trac #2578.

Relevant commits:

commit 409e4d9f59dece1c21434191459c96b3e355faa9
Author: Ian Lynagh <igloo@earth.li>
Date:   Sat Feb 27 17:24:29 2010 +0000

    Add a test for #2578

commit a251cba370c9bfb291159c4deea20226a87eeac3
Author: Simon Marlow <marlowsd@gmail.com>
Date:   Mon Mar 1 09:55:25 2010 +0000

    expand comments for #2578 fix

commit 09d2b3b2e1f9d0a4c4b938dc6ff6a0b068138445
Author: Ian Lynagh <igloo@earth.li>
Date:   Sat Feb 27 17:39:51 2010 +0000

    Fix trac #2578
    We define empty datatypes as not being enumerations, which means the
    empty blocks aren't generated.

This is the fix:

-  = DataTyCon { data_cons = cons, is_enum = all isNullarySrcDataCon cons }
+  = DataTyCon {
+        data_cons = cons,
+        is_enum = -- We define datatypes with no constructors to not be
+                  -- enumerations; this fixes trac #2578
+                  not (null cons) &&
+                  all isNullarySrcDataCon cons
+    }

Note [Enumeration types] mentions some kind of linker failure related with some table(??), but I don't have a lot of ideas about what it's talking about. I don't have a Mac so I can't try with latest linker/GHC.

Any ideas about (3)? How should we proceed?

comment:27 Changed 4 years ago by osa1

I think in the worst case we should be able to consider empty types as enums and fix #2578 in a later stage -- e.g. while generating the "table".

comment:28 Changed 4 years ago by jpbernardy

I just read the proposed documentation on Phabricator. There is something that I think is dissatisfying with the implementation.

As I understand it, the contract expressed by an empty data type is that there should be no value inhabiting it. So, if a piece of program ever builds one, the blame should be assigned to that piece of code. However, it does not appear that the current implementation does this. Eg:

(error "oops" :: Foo) `compare` (error "oops" :: Foo)

should fail with "oops", not ​"compare on empty data type Foo"

With the same interpretation of type-as-contract, a Read instance makes no sense imho.

Please, consider forcing the arguments in all these instances, before failing.

Finally, I realise that my original report was misleading. I am sorry for that.

Last edited 4 years ago by jpbernardy (previous) (diff)

comment:29 Changed 4 years ago by simonpj

I'm losing the will to live on this one!

Please, consider forcing the arguments in all these instances, before failing.

Not necessarily easy, if the argument is [a], say rather than a. This is so much a corner case that what osa1 has implemented seems fine to me.

On (3) I suppose that either you can make the TcDeriv code check for isEnumerationOrEmpty (simplest), or fix the table-generation stuff.

Simon

comment:30 Changed 4 years ago by jpbernardy

simonpj: I'm not sure we're talking about the same thing. The whole point of my request was that, for any class C a where all methods have an a argument, it is trivial to implement a meaningful instance C Foo for any empty type Foo.

In fact, I am guessing that the instance generator for Eq and Ord generate case statements. So, one could just let the normal instance generator run, to get function definitions with empty case statements.

I see now that the instance for Read is fine (because failure is the only option), but that is more than I was asking.

comment:31 Changed 4 years ago by simonpj

What if the method does not have an a argument? Read is one example, but there are probably others. Ix perhaps.

Still, I really don't care myself. What I want is for someone to be able to look in the user manual and correctly predict what derived instance you'll get for an empty data type. If you and osa1 agree, I bet everyone else will too.

comment:32 Changed 4 years ago by osa1

@jpbernardy, I think there are couple of different issues involved here;

  1. This patch doesn't change anything about GHC-derived method implementations. Standalone deriving was already generating this code, but the documentation isn't talking about this explicitly. (It's saying "GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate boilerplate code for the specified class, and typechecks it" https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/deriving.html) Generated methods are always well-typed, and it's using error this way. We merely made non-standalone deriving to work similarly on empty data types, which shouldn't be a problem for anyone, IMO.

So the thing you're complaining about was already there for a long time now. We just change some things for consistency and document it.

(Also, this patch improves the error message which is always good)

  1. I feel like if we force arguments, someone will come here and complain about it too. For example, currently this code works fine: let g = g in g == g where type of g is empty. If we force it in a non-threaded runtime it'd just loop.

(I don't fully understand types-as-contract argument, can you point me to some resources about that?)

I also don't care myself, but I feel like we should make this working same as standalone deriving, and document how it is working. Changing standalone deriving for empty types can potentially break existing code, so this may be our only option forward.

comment:33 Changed 4 years ago by jpbernardy

Ok, I am just pointing out that the implementation is weird from my point of view. I don't want to make a big fuss about this, but I feel that it's something that should be considered. I suppose in a few years someone may raise the issue again :)

Regarding blame:

The point is that if you get a value of an empty type in your context, it is the caller which is to blame. The proper way to report this blame at runtime is force the caller to produce the value, by forcing it. If you raise an exception yourself you get the blame instead, and so it is harder to track which code is really wrong.

I think that a good entry point to the concept of blame is

"Well-typed programs can't be blamed"

http://homepages.inf.ed.ac.uk/wadler/topics/blame.html

comment:34 Changed 4 years ago by rwbarton

I agree with jpbernardy that generating an empty case on (or equivalently forcing) the first argument would be more correct than producing a new error. However, it seems that in this case the "blame" could be assigned to jpbernardy, who wrote the original report with instance Eq D where (==) = undefined :)

comment:35 Changed 4 years ago by osa1

So what do you think about current StandaloneDeriving implementation? It's already doing what you're complaining.

comment:36 Changed 4 years ago by rwbarton

Indeed. I for one think it should be changed to use an empty case. This ticket might not be the right place to discuss it though, since in any case standalone deriving and ordinary deriving should not produce two different instances. I only skimmed over the patch on Phab, but I assume the same code is used to actually generate the instances in those two cases, right?

comment:37 Changed 4 years ago by osa1

but I assume the same code is used to actually generate the instances in those two cases, right?

Right, the whole point is to make those work the same. Either way(e.g. empty case or error) works for me.

comment:38 Changed 4 years ago by osa1

One thing to note is we need to use current implementation at least for methods like mempty :: Monoid a => a.

comment:39 Changed 4 years ago by rwbarton

I created a separate ticket #10577 for the issue of how to define the derived instances of empty types.

comment:40 Changed 4 years ago by osa1

How should we proceed about this? Like @rwbarton mentioned in #10577, none of these is a part of the standard, so I think we can change behavior of standalone deriving to introduce an empty case, and then also change EmptyDataDecls deriving to make the behavior consistent. Should we move this discussion to the mailing list for feedbacks?

comment:41 Changed 4 years ago by simonpj

This ticket has a long thread, which is discouraging for anyone approaching it for the first time.

My suggestion: write a wiki page proposing a particular design. Advertise it on the Haskell and libraries mailing list (this is practically a core-libraries committee issue). Drive the discussion to a consensus. Implement it.

It's very much a corner case so what we really need is a well documented decision rather than a long debate. It just that it's hard to see the wood for the trees now.

Thanks!

Simon

comment:42 Changed 4 years ago by osa1

comment:43 Changed 4 years ago by RyanGlScott

Cc: RyanGlScott added

comment:44 Changed 4 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:45 Changed 4 years ago by thomie

Milestone: 8.0.1

comment:46 Changed 3 years ago by osa1

Description: modified (diff)
Owner: osa1 deleted

mpickering reminded me this ticket so before I leave the ownership of this ticket let me write down the current status:

Basically there are a couple of things we can do and we just need someone to decide which one.

  1. Implement == as _ == _ = True. Pro: this is consistent with Void's Eq instance. Con: I think this lets some bugs to sneak in. I think it doesn't make sense to compute things based on equality of empty types, so at this point you've probably made a mistake and it's better to just bail out.
  1. Implement == as x == y = case x of {}. Pro: it fixes the problem with (1) and in a good way - you've probably used something like error "..." for the empty type terms, so you get your error message. Con: this inconsistent with Void.
  1. Leave this as is - no deriving (Eq). This way we make the user decide which behaviour to have. Pro: You can have whatever you want. Con: Inconvenient.

These were also discussed at some point but IIRC most people agree that these are bad ideas:

  1. x == y = error "(==) on empty type": If it's going to fail it's better to fail like (2).
  1. _ == _ = False: Denotationally / type-theoretically bottoms are equal or something like that? So it's better to implement (1) instead.
  1. Implement (2), change Voids instance too: Can potentially break code.

comment:47 Changed 3 years ago by osa1

Description: modified (diff)

comment:48 Changed 3 years ago by RyanGlScott

comment:49 Changed 2 years ago by dfeuer

Cc: dfeuer added

comment:50 Changed 2 years ago by enolan

Cc: enolan added

comment:51 Changed 2 years ago by AntC

Now a github proposal https://github.com/ghc-proposals/ghc-proposals/pull/63, see discussion.

comment:52 Changed 2 years ago by RyanGlScott

Differential Rev(s): Phab:D978Phab:D4047
Status: newpatch

Phab:D4047 implements the aforementioned proposal.

comment:53 Changed 23 months ago by Ben Gamari <ben@…>

In 1317ba62/ghc:

Implement the EmptyDataDeriving proposal

This implements the `EmptyDataDeriving` proposal put forth in
https://github.com/ghc-proposals/ghc-proposals/blob/dbf51608/proposals/0006-deriving-empty.rst.
This has two major changes:

* The introduction of an `EmptyDataDeriving` extension, which
  permits directly deriving `Eq`, `Ord`, `Read`, and `Show` instances
  for empty data types.
* An overhaul in the code that is emitted in derived instances for
  empty data types. To see an overview of the changes brought forth,
  refer to the changes to the 8.4.1 release notes.

Test Plan: ./validate

Reviewers: bgamari, dfeuer, austin, hvr, goldfire

Reviewed By: bgamari

Subscribers: rwbarton, thomie

GHC Trac Issues: #7401, #10577, #13117

Differential Revision: https://phabricator.haskell.org/D4047

comment:54 Changed 23 months ago by RyanGlScott

Milestone: 8.4.1
Resolution: fixed
Status: patchclosed
Test Case: deriving/should_run/T7401, deriving/should_run/T7401_fail

comment:55 Changed 16 months ago by RyanGlScott

Keywords: GHCProposal added
Note: See TracTickets for help on using tickets.