#10447 closed bug (fixed)
DeriveFoldable rejects instances with constraints in last argument of data type
Reported by: | RyanGlScott | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 8.0.1 |
Component: | Compiler | Version: | 7.10.1 |
Keywords: | deriving | Cc: | core-libraries-committee@… |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | GHC rejects valid program | Test Case: | deriving/should_run/T10447 |
Blocked By: | Blocking: | ||
Related Tickets: | #8678 | Differential Rev(s): | Phab:D1031 |
Wiki Page: |
Description (last modified by )
Currently, the -XDeriveFoldable
extension will reject any derived Foldable
instance for a data type where the last argument of the type constructor is constrained. For example, using this data type from TcDeriv.hs as inspiration:
{-# LANGUAGE DeriveFoldable, GADTs, StandaloneDeriving #-} module DeriveFoldableRejected where data T a b where T4 :: Ord b => b -> T a b T5 :: b -> T b b T6 :: T a (b,b) deriving instance Foldable (T a)
Compiling DeriveFoldableRejected.hs
with GHC 7.10 will currently fail:
DeriveFoldableRejected.hs:9:1: Can't make a derived instance of ‘Foldable (T a)’: Constructor ‘T4’ must be truly polymorphic in the last argument of the data type In the stand-alone deriving instance for ‘Foldable (T a)’ Failed, modules loaded: none.
I don't think this restriction needs to apply to Foldable
instances. Unlike Functor
and Traversable
instances, which require the last argument to be truly universal, Foldable
instances can get away without this. To demonstrate, here's a slightly modified T
data type, without the constraints:
{-# LANGUAGE DeriveFoldable, GADTs, StandaloneDeriving #-} {-# OPTIONS_GHC -ddump-deriv #-} module DeriveFoldableLegal where data T a b where T45 :: b -> T a b T6 :: T a b deriving instance Foldable (T a)
The output of -ddump-deriv
is:
Derived instances: instance Data.Foldable.Foldable (DeriveFoldableLegal.T a_aDc) where Data.Foldable.foldr f_aDd z_aDe (DeriveFoldableLegal.T45 a1_aDf) = f_aDd a1_aDf z_aDe Data.Foldable.foldr f_aDg z_aDh DeriveFoldableLegal.T6 = z_aDh Data.Foldable.foldMap f_aDi (DeriveFoldableLegal.T45 a1_aDj) = f_aDi a1_aDj Data.Foldable.foldMap f_aDk DeriveFoldableLegal.T6 = GHC.Base.mempty
Copying this back into DeriveFoldableRejected.hs
(after some cleanup):
{-# LANGUAGE DeriveFoldable, GADTs, StandaloneDeriving #-} module DeriveFoldableRejected where data T a b where T4 :: Ord b => b -> T a b T5 :: b -> T b b T6 :: T a (b,b) instance Foldable (T a) where foldr f z (T4 a) = f a z foldr f z (T5 a) = f a z foldr f z T6 = z foldMap f (T4 a) = f a foldMap f (T5 a) = f a foldMap f T6 = mempty
reveals that it will compile correctly with the generated code. Therefore, it seems like the check for universality in the last type argument shouldn't be used in -XDeriveFoldable
.
Change History (26)
comment:1 Changed 4 years ago by
Description: | modified (diff) |
---|
comment:2 Changed 4 years ago by
Cc: | core-libraries-committee@… added |
---|
comment:3 Changed 4 years ago by
To be a bit more explicit, TcDeriv
specifies these rules for DeriveFunctor
, DeriveFoldable
, and DeriveTraversable
:
-- OK for Functor/Foldable/Traversable class -- Currently: (a) at least one argument -- (b) don't use argument contravariantly -- (c) don't use argument in the wrong place, e.g. data T a = T (X a a) -- (d) optionally: don't use function types -- (e) no "stupid context" on data type
These conditions are OK. I propose that these additional rules from TcDeriv
only be used for DeriveFunctor
and DeriveTraversable
, not for DeriveFoldable
:
Note [Check that the type variable is truly universal] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For Functor, Foldable, Traversable, we must check that the *last argument* of the type constructor is used truly universally quantified. Example data T a b where T1 :: a -> b -> T a b -- Fine! Vanilla H-98 T2 :: b -> c -> T a b -- Fine! Existential c, but we can still map over 'b' T3 :: b -> T Int b -- Fine! Constraint 'a', but 'b' is still polymorphic T4 :: Ord b => b -> T a b -- No! 'b' is constrained T5 :: b -> T b b -- No! 'b' is constrained T6 :: T a (b,b) -- No! 'b' is constrained Notice that only the first of these constructors is vanilla H-98. We only need to take care about the last argument (b in this case). See Trac #8678. Eg. for T1-T3 we can write fmap f (T1 a b) = T1 a (f b) fmap f (T2 b c) = T2 (f b) c fmap f (T3 x) = T3 (f x)
These rules must hold for Functor
and Traversable
instances because the type signatures of fmap :: Functor t => (a -> b) -> t a -> t b
and traverse :: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
require that the last type argument be truly polymorphic, since it must be instantiated with two arbitrary types, t a
and t b
. As evidence, if you try to make a Functor T
instance using what DeriveFunctor
would (hypothetically) generate:
{-# LANGUAGE GADTs #-} module T where data T a where MkT :: Int -> T Int instance Functor T where fmap f (MkT a) = MkT (f a)
it will be rejected:
T.hs:8:22: Could not deduce (b ~ Int) from the context (a ~ Int) bound by a pattern with constructor MkT :: Int -> T Int, in an equation for ‘fmap’ at T.hs:8:13-17 ‘b’ is a rigid type variable bound by the type signature for fmap :: (a -> b) -> T a -> T b at T.hs:8:5 Expected type: T b Actual type: T Int Relevant bindings include f :: a -> b (bound at T.hs:8:10) fmap :: (a -> b) -> T a -> T b (bound at T.hs:8:5) In the expression: MkT (f a) In an equation for ‘fmap’: fmap f (MkT a) = MkT (f a)
On the other hand, Foldable
instances do not require the last type argument to be truly polymorphic, since all of the Foldable
methods only parameterize t
over a single type variable:
class Foldable t where fold :: Monoid m => t m -> m foldMap :: Monoid m => (a -> m) -> t a -> m foldr :: (a -> b -> b) -> b -> t a -> b foldr' :: (a -> b -> b) -> b -> t a -> b foldl :: (b -> a -> b) -> b -> t a -> b foldl' :: (b -> a -> b) -> b -> t a -> b foldr1 :: (a -> a -> a) -> t a -> a foldl1 :: (a -> a -> a) -> t a -> a toList :: t a -> [a] null :: t a -> Bool length :: t a -> Int elem :: Eq a => a -> t a -> Bool maximum :: forall a. Ord a => t a -> a minimum :: forall a. Ord a => t a -> a sum :: Num a => t a -> a product :: Num a => t a -> a
As a result, the type parameter can be safely constrained. If you plug in the code that DeriveFoldable
would (hypothetically) generate for T
:
{-# LANGUAGE GADTs #-} module T where data T a where MkT :: Int -> T Int instance Foldable T where foldr f z (MkT a) = f a z foldMap f (MkT a) = f a
it compiles without problems, since pattern-matching on the GADT constructor specializes foldr
to the type (Int -> b -> b) -> b -> T Int -> b
and foldMap
to the type Monoid m => (Int -> m) -> T Int -> m
.
Again, this only works with Foldable
since the type signatures of its methods allow for t a
to be refined safely. Functor
and Traversable
do not permit this, so the universal polymorphism check would still apply for them.
comment:4 Changed 4 years ago by
Foldable
is safe in the presence of any constraint at all on a
. It doesn't ever produce one, just consumes them. All occurrences of a
in the Foldable
dictionary are in negative position, and it can safely ignore any constraints on a
you put in place.
Functor
and Traversable
need to keep the check, though.
comment:5 Changed 4 years ago by
OK, but can I ask: can you say precisely what Foldable
code you would expect to get for these two data types?
data T1 a where MkT1 :: (a~Int) => a -> T1 a data T2 a where MkT2 :: Int -> T2 Int data T3 a where MkT3 :: (a~Int) => a -> T3 Int
All three are equivalent. But I am guessing that you intend that they are not equivalent to the "deriving Foldable
" algorithm? If so, that had better be clearly stated in the user manual.
(Regrettably, the user manual is silent on how Foldable
, Traversable
, and Functor
are generated. It'd be jolly good to have a wiki page that explained the deriving algorithm; here's how it's done for Eq/Ord etc.)
The documentation is probably the hard bit. The change to lift the condition is pretty easy: look at TcDeriv.cond_functorOK
. (This same function actually handles Traversable
and Foldable
. A little refactoring would be needed to distinguish the Foldable
case, perhaps by passing the class name instead of a boolean to the function.)
I'd be happy to review a patch.
Simon
comment:6 Changed 4 years ago by
The code that I would expect DeriveFoldable
to derive for T1
, T2
, or T3
would be:
instance Foldable T where foldr f z (MkT a) = f a z foldMap f (MkT a) = f a
which is the same code that you would get for data T a = MkT a deriving Foldable
. I don't believe this would require any change to the deriving Foldable
algorithm (but I'm not intimately familiar with the implementation details).
I agree that it would be nice to have more documentation on the algorithms themselves. I gained a better intuition for the Functor, Foldable, and Traversable algorithms by reading the comments in TcGenDeriv.hs
. For example, here's a somewhat formal description of how deriving Foldable
works:
The cases are: $(foldr 'a 'b) = \x z -> z -- when b does not contain a $(foldr 'a 'a) = f $(foldr 'a '(b1,b2)) = \x z -> case x of (x1,x2) -> $(foldr 'a 'b1) x1 ( $(foldr 'a 'b2) x2 z ) $(foldr 'a '(T b1 b2)) = \x z -> foldr $(foldr 'a 'b2) z x -- when a only occurs in the last parameter, b2
comment:7 Changed 4 years ago by
You expect that the Foldable
instance for data T2 a = (a ~ Int) => MkT2 Int
give you the Int
because there happens to be an Int
equality constraint? That seems pretty unintuitive to me. What about e.g. data E = E Int
, and data A a = A E
vs. data A a = (a ~ Int) => A E
, and then inlining E?
I think Simon's examples are worth thinking about, and derived Foldable
instances in the presence of equality constraints need some more justification. It's true that Foldable
is the wild west of type classes, so parametricity doesn't give us the right answer like for Functor
, but I don't expect it to just look for types that happen to be equal.
comment:8 Changed 4 years ago by
You expect that the
Foldable
instance fordata T2 a = (a ~ Int) => MkT2 Int
give you theInt
because there happens to be anInt
equality constraint?
That is what I am expecting, yes, especially since T1
, T2
, and T3
are equivalent representations of the same thing.
That seems pretty unintuitive to me. What about e.g.
data E = E Int
, anddata A a = A E
vs.data A a = (a ~ Int) => A E
, and then inlining E?
I'm not sure how inlining would cause a derived Foldable
instance for data A a = (a ~ Int) => A E
to fail. Does inlining occur before DeriveFoldable
generates its code? If so, I don't see how GHC could confuse E
for Int
.
derived
Foldable
instances in the presence of equality constraints need some more justification.
I find myself wanting to derive Foldable
instances for deeply embedded DSLs that are represented as GADTs. Here's a simple example from hermit-ghci:
data ShellEffect :: * -> * where ShellEffect :: Value -> ShellEffect ()
(which could equivalently be represented as ShellEffect :: a ~ () => Value -> ShellEffect a
)
This isn't a Functor
or Traversable
, but it readly admits a Foldable
instance which DeriveFoldable
would be able to infer:
instance Foldable ShellEffect where foldr f z (ShellEffect a) = z foldMap f (ShellEffect a) = mempty
comment:9 Changed 4 years ago by
I'm probably making an assumption about how robust GHC's equality constraint solver is, but I would think that it'd be capable enough to infer whether an argument's type is equal to the type parameter. If it were undecidable, then wouldn't this (legal) code have to be rejected?
data S a b where MkS :: (a ~ Int) => Int -> S a b deriving instance Foldable (S a)
comment:10 follow-up: 11 Changed 4 years ago by
Also, for instance
data Refine a where RefineI :: Refine Int RefineD :: Refine Double data Foo a = Foo (Refine a) Int
Should a Foldable
instance for Foo
test the Refine a
and fold over the Int
if it's RefineI
?
comment:11 Changed 4 years ago by
Replying to dolio:
Also, for instance
data Refine a where RefineI :: Refine Int RefineD :: Refine Double data Foo a = Foo (Refine a) IntShould a
Foldable
instance forFoo
test theRefine a
and fold over theInt
if it'sRefineI
?
I assume that Refine
is also an instance of Foldable
? If so, the respective derived Foldable
instances of Refine
and Foo
would be:
instance Foldable Refine where foldr f z RefineI = z foldr f z RefineD = z foldMap f RefineI = mempty foldMap f RefineD = mempty instance Foldable Foo where foldr f z (Foo a1 a2) = foldr f z a1 foldMap f (Foo a1 a2) = mappend (foldMap f a1) mempty
So no, the derived instance wouldn't fold over the Int
, since the Foldable Foo
instance would never pattern match on a constructor of Refine
.
I don't see this as a problem, especially since this seems consistent with the behavior of deriving Generic1
statements, which also check if a field has the same type as the last type parameter. For example, if you derived a Generic1
instance for Foo
, it would give:
type Rep1 Foo = D1 D1Foo (C1 C1_0Foo (S1 NoSelector (Rec1 Refine) :*: S1 NoSelector (Rec0 Int)))
So deriving Generic1
would also consider the a
in Refine a
to be distinct from Int
, as evidenced by their differing Rec1
and Rec0
wrappers.
comment:12 Changed 4 years ago by
I'm not asking if GHC would derive the instance I specified. I'm asking whether it's an okay Foldable
instance based on the same reasoning as for:
data Bar a where Bar :: (a ~ Int) -> Int -> Bar a
and the like. "Can we automatically do this," is not the only relevant question. "Should we do this and why," also matters.
comment:13 Changed 4 years ago by
I think it's entirely reasonable for a derived Foldable Foo
instance not to fold over the Int
argument. If it did fold over the Int
argument, then the implementation of foldr
and foldMap
would have to pattern-match the Refine
argument, making it resemble this:
instance Foldable Foo where foldr f z (Foo RefineI a2) = f a2 z foldr f z (Foo RefineD a2) = z foldMap f (Foo RefineI a2) = mappend mempty (f a2) foldMap f (Foo RefineD a2) = mappend mempty mempty
At this point, though, we are no longer defining foldr
and foldMap
in terms of the Foldable Refine
. This feels off to me, since
- Almost every other
deriving
typeclass extension dispatches to arguments' implementations of that particular typeclass (e.g., indata Bar a = Bar (Maybe a) [a] deriving Show
, the derivedShow (Bar a)
instance relies on theShow
instances forMaybe a
and[a]
). - If a user were to manually define an "untraditional"
Foldable
instance forRefine
, the derivedFoldable Foo
instance wouldn't make use of it.
In addition, requiring derived instances to pattern-match constructors of GADT arguments could break other code, e.g.,
data Refine a where RefineI :: Refine Int RefineD :: Refine Double instance Functor Refine where fmap = undefined data Foo a = Foo (Refine a) Int deriving Functor
Since pattern-matching on Refine
would refine the type of fmap
in the derived Functor Foo
instance to something that isn't universally polymorphic.
comment:14 Changed 4 years ago by
Simon's examples definitely give me pause.
So putting aside type equality constraints for a moment, the other cases seem unambiguous.
This would allow things like
data Set a where Empty :: Set a NonEmpty :: Ord a => Tree a
but wouldn't get us
data Val a where Int :: Int -> Val Int Char :: Char -> Val Char
To get there, it seems to me that shachaf
's E
example seems a step too far. Given
data Bar a = Bar a data Foo a = Foo (Bar a b)
We don't attempt to walk into Bar
and recursing into something that isn't the last argument or relying on expansion seems "deeply magic", and ill specified -- where does the recursion stop?
On the other hand, if we take Simon's examples, and say it'd be nice to generate the same code for semantically equal data declarations, and to allow whatever constraints we want, then using the type equality constraint (a ~ X) as part of the check for naked a
s and accepting a naked X
as an a
seems very coherent.
Then in the presence of (a ~ X)
, folding over all X
's immediately inside the structure we're in and delegating all (f X)
's to the Foldable
for f
would be consistent with the existing behavior.
If you wanted something more exotic where you skipped over some of them, you'd have to write a hand-rolled instance.
comment:15 Changed 4 years ago by
This is interesting, but I lack the bandwidth to follow it in detail.
I urge against a design that requires the programmer to predict the behaviour of the constraint solver in order to decide what code the "deriving" will generate. I think it's be much better to say that it depends on the syntactic form of the declaration. So for MkT1
and MkT3
you'd get the first argument included in the fold, but for MkT2
you would not.
You need to think what to do about cases like this:
data S a where MKS :: b -> c -> S (b,c)
Perhaps y'all can work out a design, document it on a wiki page, and even implement it. (I can advise.)
I'd really love the same page to document the behaviour of deriving for Functor
and Traversable
too!
Many thanks
Simon
comment:16 Changed 4 years ago by
Yeah, I'm not a big fan (at the moment) of having equality constraints and the like pick up fields that aren't syntactically identical. For instance:
data Foo b a where MkFoo :: (T b ~ a) => a -> b -> Int -> Foo b a deriving Foldable
Now if T b
reduces to Int
, we fold over the first and third argument, but if it doesn't, we only fold over the first (because we can't do anything else)? I could probably agree with having it succeed and always fold over the first field, because it's syntactically an a
. But I feel like that should be the default behavior, not trying to deduce things from the constraints.
Also consider:
class (b ~ Int) => Foo a b where ... class Foo a b => Bar a b where ... data Quux a b where MkQ :: Bar a b => Int -> a -> b -> Quux a b
Does this pick up the Int
because if we trace back from Bar a b
we get b ~ Int
? What about more complicated examples?
comment:17 Changed 4 years ago by
Well, it seems like the most controversial part of this is folding over types that aren't syntactically equivalent to the return type. I'm certainly open to the idea of not folding over types if it's not immediately clear that they're equal, but I'm not sure how many levels of indirection would be considered acceptable. In this data type:
class (a ~ Int) => Foo a data A a where A1 :: Ord a => a -> T a A2 :: Int -> T Int A3 :: b ~ Int => b -> T Int A4 :: a ~ Int => Int -> T a A5 :: a ~ Int => a -> T a A6 :: (a ~ b, b ~ Int) => Int -> b -> T a A7 :: Foo a => Int -> a -> T a
How we would we define "syntactic equality" here? It seems like A1
and A5
are pretty uncontroversial here, but the other constructors require (at least) one type substitution to infer that their fields could be folded over. Which of these constructors would you all consider to be acceptable for folding over? I'd need to get an idea of this before responding to dolio's questions.
comment:18 Changed 4 years ago by
I vote for, well, syntactic equality. That means A1
, A5
, and A7
, but none of the others.
Simon
comment:20 Changed 4 years ago by
OK, I can live with that. That way, I can derive Foldable
for any GADT and still fold over all the terms that have types equal to the last type parameter (up to rewriting Con :: PlainTy -> PlainTy -> ... -> GADT PlainTy
as Con :: a ~ PlainTy => a -> a -> ... -> GADT a
).
I'm not the original author of the DeriveFunctor/Foldable/Traversable
extensions, but I'd be willing to help write a wiki article on how they currently work (and how DeriveFoldable
will eventually work). Is there a good template article from which I could adapt? All I know about the deriving
algorithms is what I gleaned from comments in TcDeriv.hs, as well as from TcGenDeriv.hs, but they may be too formal for wiki purposes.
comment:21 Changed 4 years ago by
Here's a (rough draft of) a wiki article about the DeriveFunctor
, DeriveFoldable
, and DeriveTraversable
trio, included the proposed change from this feature request. Feedback is very much appreciated.
comment:22 Changed 4 years ago by
Differential Rev(s): | → Phab:D1031 |
---|
comment:24 Changed 4 years ago by
Milestone: | → 7.12.1 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Test Case: | → deriving/should_run/T10447 |
comment:26 Changed 2 years ago by
Keywords: | deriving added |
---|
Copying the Core Libraries Committee. What conditions are ok? For example, is this OK?
or this (which is equivalent)
I think that the 'deriving' code for
Functor
,Foldable
etc behaves specially when it comes across the universally-quantified type variable (a
above), so it might get confused if it wasInt
and nota
.Disabling the test is easy: the question is whether that's the right thing to do.
Simon