Opened 3 years ago

Closed 2 years ago

#13328 closed bug (fixed)

Foldable, Functor, and Traversable deriving handle phantom types badly

Reported by: dfeuer Owned by:
Priority: normal Milestone: 8.4.1
Component: Compiler Version: 8.1
Keywords: deriving-perf Cc: RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Suppose we have

data Phantom a = Z | S (Phantom a)

We'd like to get something like

foldMap _ _ = mempty
fmap = coerce
traverse _ m = pure (coerce m)

But instead we actually pattern match all the way down! Basically, we want to treat "has a phantom role" and "does not occur" similarly.

Change History (13)

comment:1 Changed 3 years ago by RyanGlScott

That proposed foldMap implementation has different strictness properties than the current one that deriving Foldable uses. That is, this program:

{-# LANGUAGE DeriveFoldable #-}

data Phantom a = Z | S (Phantom a)
  deriving Foldable

main :: IO ()
main = print $ foldMap (const ()) (undefined :: Phantom Int)

will throw an exception, whereas this program:

data Phantom a = Z | S (Phantom a)
instance Foldable Phantom where
  foldMap _ _ = mempty

main :: IO ()
main = print $ foldMap (const ()) (undefined :: Phantom Int)

will output ().

I like the intent of this proposal, however. Can the strictness problem be fixed by changing it to foldMap _ !_ = mempty/foldr _ z !_ = z?

comment:2 Changed 3 years ago by dfeuer

No, you have to pick efficiency or strictness here. The bottom could show up anywhere down the line. I think derived instances should generally go for what someone's likely to write by hand, rather than trying to imitate the lousy results of the default. The default implementation of null will diverge on an infinite snoc-list, but that doesn't mean the derived implementation should. I think the same is true here. If the type is phantom, there's no possible reason to go there, so we shouldn't.

comment:3 Changed 3 years ago by RyanGlScott

I do agree that your proposed definition is probably the better one. I only ask that we point out this change in strictness in the users' guide (and probably the migration guide for whatever GHC release this makes it into) so that users aren't too surprised.

comment:4 Changed 3 years ago by RyanGlScott

Two more thoughts to consider.

One: how should this interact with your feature request in #13117? That is, you had previously requested that if you wrote this:

data V a deriving Functor

then GHC should generate this:

instance Functor V where
  fmap _ x = case x of {}

But V's type parameter is phantom here, so you could just as well implement it like this (as Eric Mertens originally pointed out in https://mail.haskell.org/pipermail/libraries/2017-January/027603.html):

instance Functor V where
  fmap _ = coerce

Which choice should we make here?

Two (a follow-up question to comment:1, after reading Eric's comment about role annotations in https://mail.haskell.org/pipermail/libraries/2017-January/027603.html): consider this code:

data Foo a = Foo
deriving instance Foldable Foo

In this scenario, the type parameter to Foo is phantom, so the generated Foldable instance is:

instance Foldable Foo where
  foldMap _ _ = mempty

Now what happens if you add a role annotation to Foo?

type role Foo nominal
data Foo a = Foo
deriving instance Foldable Foo

Now, since Foo's type parameter is no longer phantom, the generated Foldable instance will be:

instance Foldable Foo where
  foldMap _ Foo = mempty

That is to say, the choice of role can affect what the strictness of foldMap will be. Is this desirable?

comment:5 Changed 3 years ago by RyanGlScott

There's one more scenario that this proposed change would wreak havoc with. It's quite possible for users to define unlawful Functor instances and depend on them in other derived Functor instances:

newtype Unlawful a = Unlawful Int
instance Functor Unlawful where
  fmap f (Unlawful i) = Unlawful (i + 1)

newtype WrapUnlawful a = WrapUnlawful (Unlawful a)
  deriving Functor

But under this proposal, the generated Functor instance for WrapUnlawful would be:

instance Functor WrapUnlawful where
  fmap _ = coerce

This would be another departure in behavior, as now fmapping over a WrapUnlawful wouldn't apply (+1) to the field of the Unlawful (icky as it may be).

comment:6 Changed 3 years ago by dfeuer

The question about fmap strictness looks like a red herring to me. There's no real difference between the empty case definition and the coerce definition. The illegitimate fmap is of very little interest to me. Someone who cares about preserving their map count or whatever just won't be able to use Functor deriving for types explicitly mentioning their bogus one.

comment:7 Changed 3 years ago by RyanGlScott

The question about fmap strictness looks like a red herring to me. There's no real difference between the empty case definition and the coerce definition.

Sure, I just wanted to note the "conflict".

The illegitimate fmap is of very little interest to me. Someone who cares about preserving their map count or whatever just won't be able to use Functor deriving for types explicitly mentioning their bogus one.

That's also fine, just make sure we note this if this gets implemented. Luckily, Functor/Foldable/Traversable happen to be pretty rigid, law-abiding classes, so this restriction won't really hurt anyone in practice.

What is your opinion on the RoleAnnotations example above? This is the only thing that genuinely concerns me, since I think having the strictness of a derived Foldable instance change depending on whether a type parameter is phantom or non-phantom is quite unintuitive. I'm not sure of a good workaround, however.

comment:8 Changed 3 years ago by dfeuer

The role annotation should (I believe) only affect instances derived outside the module. So I believe the concern would be for a type that is defined using a type defined in another module that has a role annotation. But again, that matches the "what the user would likely write by hand" scenario: we can only do as well as the user could. Yes, it's true that this introduces some potential for surprises; the user won't get a nice error message when they make the change. Do you think this is a judgement call that should be made through the proposal process?

comment:9 Changed 3 years ago by RyanGlScott

Sadly, a role annotation would affect the datatype even in the module in which it is originally defined:

{-# LANGUAGE RoleAnnotations, TemplateHaskell #-}

import Language.Haskell.TH

type role Foo nominal                                                                     
data Foo a = Foo a                                                                        
                                                                                          
$(return [])                                                                              
                                                                                          
main :: IO ()                                                                             
main = putStrLn $(reifyRoles ''Foo >>= stringE . show)
$ runghc Bug.hs
[NominalR]

The issue is really that role annotations are only a crude approximation of the property we actually want here. For Functor and Traversable, we really are using coerce, so a phantom role annotation is precisely what you need. But we aren't using coerce in the proposed Foldable instance, and moreover, the property we really want to ensure is that the type parameter doesn't appear anywhere in any constructor's fields. Sadly, a phantom role does not always imply this.

I'm tempted to suggest a workaround in which we re-infer the roles for every data type, but this time, we ignore all user-supplied role annotations. That way, we would get precisely the right information about whether the last type parameter appears somewhere in the datatype's definition. But sadly, this would necessarily break abstraction in the case where a constructor's field mentions an abstract type that has been given a role annotation of representational or nominal.

Another option we could choose is to simply skip over this optimization for Foldable. That's likely not what you'd prefer, but there are a number of properties which make dealing with Foldable awkward that aren't present with Functor and Traversable.

In any event, presenting this idea via a proposal would certainly be a good thing. I'm curious to know what others think about this.

comment:10 Changed 3 years ago by dfeuer

You're right that the role is only an approximation for Foldable, but using that approximation has an important advantage: it ensures that deriving foldMap will never be worse than using foldMapDefault with a derived Traversable instance.

comment:11 Changed 3 years ago by RyanGlScott

One way we could help mitigate the effect of this is by doing an additional optimization for derived Foldable instances. First, we check each constructor, and if it does not contain an occurrence of the last type parameter, we filter out that constructor. Then we generate the foldMap cases as normal for fhe constructors that remain, and if any constructors were filtered out, we generate a catch-all foldMap _ _ = mempty case at the end.

This should at least help in the case when all the data constructors of a phantom type are in scope, but its role has been annotated as something other than phantom.

comment:12 Changed 2 years ago by David Feuer <David.Feuer@…>

In 69f070d/ghc:

Deriving for phantom and empty types

Make `Functor`, `Foldable`, and `Traversable` take advantage
of the case where the type parameter is phantom. In this case,

* `fmap _ = coerce`
* `foldMap _ _ = mempty`
* `traverse _ x = pure (coerce x)`

For the sake of consistency and especially simplicity, make other types
with no data constructors behave the same:

* `fmap _ x = case x of`
* `foldMap _ _ = mempty`
* `traverse _ x = pure (case x of)`

Similarly, for `Generic`,

* `to x = case x of`
* `from x = case x of`

Give all derived methods for types without constructors appropriate
arities. For example,

```
    compare _ _ = error ...
```

rather than

```
    compare = error ...
```

Fixes #13117 and #13328

Reviewers: austin, bgamari, RyanGlScott

Reviewed By: RyanGlScott

Subscribers: ekmett, RyanGlScott, rwbarton, thomie

Differential Revision: https://phabricator.haskell.org/D3374

comment:13 Changed 2 years ago by dfeuer

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.