Opened 3 years ago

Last modified 3 years ago

#12731 new bug

Generic type class has type family; leads to big dep_finsts

Reported by: ezyang Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.1
Keywords: Generics Cc: goldfire, simonpj, RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

While trying to determine a good explanation for simonpj's question in Phab:D2607, I noticed that dep_finsts was a lot larger than I would have ordinarily expected it to be: it included many modules that did not have the TypeFamilies extension enabled for them. For example, for one module in Cabal, here's "family instance modules":

family instance modules: Distribution.Compat.Semigroup
                         Distribution.Compiler Distribution.ModuleName Distribution.Package
                         Distribution.Simple.Compiler Distribution.System
                         Distribution.Utils.ShortText Distribution.Verbosity
                         Distribution.Version Language.Haskell.Extension Control.Applicative
                         Data.Complex Data.Either Data.Functor.Const Data.Functor.Identity
                         Data.List.NonEmpty Data.Monoid Data.Semigroup Data.Type.Equality
                         Data.Version Data.Void GHC.Exts GHC.Generics GHC.IO.Exception
                         GHC.TypeLits Data.IntMap.Base Data.IntSet.Base Data.Map.Base
                         Data.Sequence Data.Set.Base Text.PrettyPrint.Annotated.HughesPJ
                         Text.PrettyPrint.HughesPJ

Do we *really* have this many type family instances in base and Cabal? I was flummoxed, until I realized that the Generic type class defines a type family!

-- | Representable types of kind *.
-- This class is derivable in GHC with the DeriveGeneric flag on.
class Generic a where
  -- | Generic representation type
  type Rep a :: * -> *
  -- | Convert from the datatype to its representation
  from  :: a -> (Rep a) x
  -- | Convert from the representation to the datatype
  to    :: (Rep a) x -> a

The upshot is that if you derive Generic, you have agreed to a perpetual interface file size tax on every module which transitively depends on your module, as well as lots of fruitless pairwise consistency checking. Ick, especially considering that it's fairly common practice to slap a Generic on every data type you define.

This is a case where we would gain a lot if we could put a local restriction on Generic instances so that individual instances are guaranteed not to overlap, e.g., like one of the rules that Rust uses (http://smallcultfollowing.com/babysteps/blog/2015/01/14/little-orphan-impls/) Then we'd avoid balling up a big transitive closure of all modules that wrote deriving Generic. Since non-overlapness is guaranteed by construction, we'd no longer need an eager check.

Related #5224

Change History (3)

comment:1 Changed 3 years ago by goldfire

What a great blog post to link to. (Although it did remind me of reading Neal Stephenson's novel Anathem, which takes place in an alternate universe where just about all technical concepts have been renamed.) I will summarize it: The author proposes and does an in-depth comparison of several approaches that can be used to avoid incoherence of either type families or classes (it's the same problem) by doing local checks instead of a check at module-import-graph join points. All of these rules prohibit orphan instances of any kind, but are otherwise much subtler than I would have guessed.

While we could explore this possibility (does anyone use orphan type family instances?), I see a simpler approach: treat Generic like we do Typeable, by requiring that GHC provide the instances. Does anyone hand-write Generic instances? If we know that GHC has written the instances, then we can be sure of coherence.

comment:2 Changed 3 years ago by ezyang

A quick grep of Hackage suggests it happens very rarely. I found one case in module-management:

#if MIN_VERSION_Cabal(1,21,1)
instance (Ord k, Generic k, Generic v) => Generic (Map k v) where
    type Rep (Map k v) = Rep [(k, v)]
    from = from . toList
    to = fromList . to
#else

This is actually a pretty reasonable instance as far as things go, but I am sure I would not lose sleep if it got axed. However, I cannot think of a good argument why Generic should only be compiler derivable (as opposed to Typeable, where the semantic value proposition is much clearer.)

Last edited 3 years ago by ezyang (previous) (diff)

comment:3 Changed 3 years ago by RyanGlScott

Cc: RyanGlScott added
Keywords: Generics added

As you've discovered, it's possible to manually define Generic instances (as long as Safe Haskell isn't enabled, I might point out). But even if we restrict Generic instances to only be derived, I don't think that would guarantee they don't overlap. After all, you can do this (requires GHC 8.0 or later):

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneDeriving #-}

import GHC.Generics

data Foo a = Foo a
deriving instance {-# OVERLAPPABLE #-} Generic (Foo a)
deriving instance {-# OVERLAPPING  #-} Generic (Foo Int)

We used to forbid Generic instances like Generic (Foo Int) (see #5939), but reverted that decision when we found out that it wreaked havoc with sufficiently poly-kinded Generic1 instances that require kinds to be instantiated with * (see #11732).

Note: See TracTickets for help on using tickets.