Opened 3 years ago

Closed 3 years ago

Last modified 2 years ago

#12814 closed bug (fixed)

Should GND infer an instance context when deriving method-free classes?

Reported by: RyanGlScott Owned by: RyanGlScott
Priority: normal Milestone: 8.2.1
Component: Compiler (Type checker) Version: 8.0.1
Keywords: deriving Cc: goldfire, simonpj
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #11369, #12810 Differential Rev(s): Phab:D2692
Wiki Page:

Description

This is a design question that emerged from code that I originally discovered here and here. To recap, it's now possible to have code like this:

{-# LANGUAGE GeneralizedNewtypeDeriving, TypeFamilies #-}
class C a where
  type T a

newtype Identity a = Identity a deriving C

Compiling this (with -Wredundant-constraints) generates this code:

instance C a => C (Identity a) where
  type T (Identity a) = T a

But now GHC will complain:

• Redundant constraint: C a
• In the instance declaration for ‘C (Identity a)’

This warning makes sense from the perspective that the C a constraint isn't ever used by the associated type family instance. So the question arises: should GND avoid generating an instance context for the representation type in the event it's deriving a class with no methods?

Fortunately, patching GHC to do this is trivial:

  • compiler/typecheck/TcDeriv.hs

    diff --git a/compiler/typecheck/TcDeriv.hs b/compiler/typecheck/TcDeriv.hs
    index 4722f16..df2d3d5 100644
    a b mkNewTypeEqn dflags overlap_mode tvs 
    12721272            [ let (Pair t1 t2) = mkCoerceClassMethEqn cls dfun_tvs inst_tys rep_inst_ty m
    12731273              in mkPredOrigin (DerivOriginCoerce meth t1 t2) TypeLevel
    12741274                              (mkReprPrimEqPred t1 t2)
    1275             | meth <- classMethods cls ]
     1275            | meth <- meths ]
     1276        meths = classMethods cls
    12761277
    12771278                -- If there are no tyvars, there's no need
    12781279                -- to abstract over the dictionaries we need
    mkNewTypeEqn dflags overlap_mode tvs 
    12811282                --              instance C T
    12821283                -- rather than
    12831284                --              instance C Int => C T
    1284         all_preds = rep_pred_o : coercible_constraints ++ sc_theta -- NB: rep_pred comes
     1285        all_preds = if null meths then [] else [rep_pred_o]
     1286                    -- See Note [GND and method-free classes]
     1287                       ++ coercible_constraints
     1288                       ++ sc_theta
     1289                       -- NB: rep_pred_o comes first
    12851290
    12861291        -------------------------------------------------------------------
    12871292        --  Figuring out whether we can only do this newtype-deriving thing

After implementing this patch, I ran the testsuite, and there were some surprising results. One thing that shocked me was that the program reported in #6088, which had previously failed due to a patch resulting from #8984, was now passing!

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}  
{-# LANGUAGE EmptyDataDecls #-}

module T6088 where

class C a

newtype A n = A Int

type family Pos n
data True

instance (Pos n ~ True) => C (A n)

newtype B n = B (A n) deriving (C)

That is because previously, GHC was trying to generate an instance like this:

instance (Pos n ~ True) => C (B n)

And this was rejected, since we don't infer exotic equality constraints when deriving. But with this patch, it now generates:

instance {- Empty context => -} C (B n)

Which is certainly valid. But is it what a user would expect? I'm not so sure.

As hvr notes in #11369, sometimes empty classes are used to enforce invariants, like in the following case:

class Throws e

readFoo :: Throws IOError => FilePath -> IO Foo
readFoo fn = {- ...  -}

What if you wanted to have a Throws instance for a newtype? You'd probably want something like this:

newtype Id a = MkId a

instance Throws a => Throws (Id a)

Which feels like something GND should be able to take care of with ease. But to your surprise, you try this:

newtype Id a = MkId a
  deriving Throws

And now GHC generates not the instance above, but rather just:

instance Throws (Identity a)

So it's possible that we would lose some of the expressiveness of GND by implementing this change. Is that acceptable? I'm not sure, so I though I'd solicit feedback here.

Change History (17)

comment:1 Changed 3 years ago by RyanGlScott

Cc: goldfire simonpj added

comment:2 Changed 3 years ago by goldfire

Frankly, I was surprised that GND assumes any class constraint at all. Why should it be different than other forms of deriving? Just infer the class constraint based on the constraints that arise when type-checking the generated definition.

comment:3 in reply to:  2 Changed 3 years ago by RyanGlScott

Replying to goldfire:

Frankly, I was surprised that GND assumes any class constraint at all. Why should it be different than other forms of deriving? Just infer the class constraint based on the constraints that arise when type-checking the generated definition.

I'm not sure how the thing you're imagining is different from what actually happens. GND must infer a constraint (applied to the representation type for the newtype) in order to typecheck any calls to coerce that GND generates. This is quite similar to stock deriving. For example, in:

data Foo a b = Foo b
  deriving Show

You must infer Show b in order to typecheck the generated showsPrec function.

The only difference in this scenario is that there are no stock derivable classes without methods, but with GND, such a thing is possible. So it's quite a different beast.

comment:4 in reply to:  2 Changed 3 years ago by simonpj

Replying to goldfire:

Just infer the class constraint based on the constraints that arise when type-checking the generated definition.

I agree with this; and it's precisely what Ryan's patch does. mkNewTypeEqn generates the constraints that are necessary to satisfy the instance decl; they are then simplified, perhaps to nothing at all. We are going to generate

instance ?? => C (N a) where
  ...type instances...
  op = coerce (op :: rep-ty)

So for each method (if any) we need

  • C rep_ty so that we have the underlying op
  • op_ty[rep_ty] ~R op_ty[N a] to for the coercion of the op

If there are no methods we don't need any of those constraints. Ryan's notes above show that the result may perhaps not be what you want, but if not, just use a standalone deriving and specify the instance context.

So I vote for this patch!

comment:5 Changed 3 years ago by goldfire

I suppose so. But it would also seem possible to skip figuring out, a priori, what constraints should be there. Can't we just type-check the generated code and then posit whatever constraints are necessary for that code to be accepted? That's what we do for stock instances, no?

comment:6 in reply to:  5 Changed 3 years ago by RyanGlScott

Replying to goldfire:

Can't we just type-check the generated code and then posit whatever constraints are necessary for that code to be accepted? That's what we do for stock instances, no?

Currently, no. The code for this is inferConstraints in TcDerivInfer. For stock deriving, it walks over each field of each constructor for a datatype, puts the class constraint on the field's type, and then simplifies. So data Foo a b = Foo a Int deriving Show becomes (Show a, Show Int) becomes Show a.

As for why this is done this way, I'm not sure, but I do know that GHC figures out the instance context before ever generating the class method implementations (possibly for typechecking purposes).

comment:7 Changed 3 years ago by goldfire

Yes, I suppose I knew that, when I think about it. But why don't we do it the other way? (This is an honest question -- not just trying to be a gadfly.) It seems easier and more robust.

comment:8 Changed 3 years ago by simonpj

The rationale is this.

  • For 'deriving' clauses we need to figure out what the context of the derived instance declaration should be.
  • What we actually do is this: we gather together all the constraints that will be required, and simplify them to make a minimal instance context.
  • We could instead (a) generate the method bindings etc, (b) typecheck them, (c) figuring out the constraints that are needed to do so, and minimise them (d) turn all that into an instance decl (e) typecheck the instance decl with all its methods. But typechecking instance decls is already fairly complicated and (a-c) sounds tricky to me.

I still vote for Ryan's suggestion

Last edited 3 years ago by simonpj (previous) (diff)

comment:9 Changed 3 years ago by goldfire

I'm happy with Ryan's patch, too. But I'm still think my new approach would be simpler than the existing one.

(a) is already done. That's TcGenDeriv.

(b) is already done: tcInstDcls (or similar)

(c) will be the set of unsolved wanted constraints left over after solving. These will have to be checked for exoticness, but that's it!

(d) is simply taking the result of (c) and sticking it in the instance decl. There is a matter that (c) outputs Type and (d) will want HsContext -- but I think we already have a way of squirreling a Type inside of an HsType, so this shouldn't be bad.

(e) would be necessary, and it would be repeated work. So this new approach is probably slower than the existing one.

In exchange, we get to delete the first several hundred lines of TcDerivInfer (inferConstraints).

In any case, if it's not broken, perhaps we shouldn't fix it. I just find that special-casing the 0-method case to be a bit odd.

comment:10 Changed 3 years ago by rwbarton

I am sure you understand this better than I do, but what happens in a case involving non-regular recursion (e.g. deriving Eq for data X a = A a | B (X (a, a)))?

I argued for the same approach in the setting of DeriveAnyClass, and was disappointed that instead a heuristic was adopted that has no reason to be correct in general, so I think something is a bit "broken" here, in the sense of your last paragraph.

comment:11 Changed 3 years ago by simonpj

It's really not simple... one instance may need other instances, so we need to solved simultaneous equations; see Note [Simplifying the instance context] in TcDerivInfer.

Also we aren't really special-casing the 0-method case. For each method we need

  1. C rep-ty (same for each method)
  2. A coercible constraint (different for each method)

Instead of generating N copies of (1) we generate just one. But if N=0 we can generate zero of them.

Last edited 3 years ago by simonpj (previous) (diff)

comment:12 Changed 3 years ago by Simon Peyton Jones <simonpj@…>

In f05d685a/ghc:

Refactoring of mkNewTypeEqn

The refactoring here is very small.  I did it while studying
Trac #12814.

To implement the change in #12814, we can just un-comment the
lines at line 1275.  It's ready to go but I didn't want to pull
the trigger in this commit

comment:13 Changed 3 years ago by simonpj

Owner: set to RyanGlScott

Ryan, I suggest you pull the trigger on this

comment:14 Changed 3 years ago by RyanGlScott

Differential Rev(s): Phab:D2692
Status: newpatch

comment:15 Changed 3 years ago by Ben Gamari <ben@…>

In 03e8d26f/ghc:

Prevent GND from inferring an instance context for method-less classes

When `GeneralizedNewtypeDeriving` is used with a type class that has no
methods, it will generate a redundant context, and as a result, it can
trigger warnings when compiled with `-Wredundant-constraints`. This is a
simple change in behavior to check beforehand if a class has methods
when deriving it with GND, and if it has no methods, avoid inferring the
redundant context.

Beware that the test for #6088, which used to be expected to fail, now
compiles without issue since it doesn't infer a problematic instance
context.

Thanks to Simon Peyton Jones for doing the necessary refactoring in
f05d685ae05ec293083f2fa7ec7ba057fbe64869.

Fixes #12814.

Test Plan: ./validate

Reviewers: goldfire, rwbarton, simonpj, austin, bgamari

Reviewed By: simonpj

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2692

GHC Trac Issues: #12814

comment:16 Changed 3 years ago by bgamari

Resolution: fixed
Status: patchclosed

comment:17 Changed 2 years ago by RyanGlScott

Keywords: deriving added
Note: See TracTickets for help on using tickets.