Opened 2 years ago

Last modified 9 months ago

#14111 new bug

strange error when using data families with levity polymorphism and unboxed sums and data families

Reported by: carter Owned by: RyanGlScott
Priority: normal Milestone: 8.10.1
Component: Compiler Version: 8.2.1
Keywords: TypeFamilies Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: #13737 Blocking:
Related Tickets: #14457 Differential Rev(s):
Wiki Page:

Description

I've the following small example

{-# LANGUAGE MagicHash, UnboxedSums, NoImplicitPrelude #-}
{-# LANGUAGE TypeFamilies #-}
-- {-# LANGUAGE PolyKinds #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE TypeInType #-}
{-# LANGUAGE GADTs ,ExplicitNamespaces#-}
{-# LANGUAGE UnboxedTuples #-}

module Data.Unboxed.Maybe where

import GHC.Exts
import GHC.Types
import Prelude (undefined)
import Data.Void

data family Maybe(x :: TYPE (r :: RuntimeRep))

data instance Maybe (a :: * ) where
  MaybeSum :: (# a  | (# #) #) -> Maybe a
data instance Maybe (x :: TYPE 'UnliftedRep) where
  MaybeSumU :: (# x | (# #) #) -> Maybe x

and then i get the error (made much saner to read by use of printing explicit kinds)

Prelude> :r
[1 of 1] Compiling Data.Unboxed.Maybe ( src/Data/Unboxed/Maybe.hs, interpreted )

src/Data/Unboxed/Maybe.hs:22:3: error:
    • Data constructor ‘MaybeSumU’ returns type ‘Maybe 'LiftedRep x’
        instead of an instance of its parent type ‘Maybe 'UnliftedRep x’
    • In the definition of data constructor ‘MaybeSumU’
      In the data instance declaration for ‘Maybe’
   |
22 |   MaybeSumU :: (# x | (# #) #) -> Maybe x

this is

a) a case where printing runtime reps makes things easier to debug :)

b) a very confusing type error since the data instance clearly says "x :: TYPE 'UnliftedRep "

is there something i'm overlooking, or is this a bug?

Change History (28)

comment:1 Changed 2 years ago by carter

remarkably, if i add the type annotation, this works! (but it still seems like a bug in inference)


data instance Maybe (x :: TYPE 'UnliftedRep) where
  MaybeSumU :: (# x | (# #) #) -> Maybe (x :: TYPE 'UnliftedRep)

it seems kinda redundant ??

comment:2 Changed 2 years ago by RyanGlScott

Keywords: TypeFamilies added

One important thing to note is that in this data instance:

data instance Maybe (x :: TYPE 'UnliftedRep) where
  MaybeSumU :: (# x | (# #) #) -> Maybe x

The x in data instance Maybe (x :: TYPE 'UnliftedRep) does not scope over the constructors (this is a quirk of GADTs). Therefore, the kind of x in data instance Maybe (x :: TYPE 'UnliftedRep) doesn't necessarily determine the kind of x in MaybeSumU's type signature (as the x in (# x | (# #) #) -> Maybe x is bound by an implicit forall x).

That being said, I'm still surprised this doesn't typecheck. Here's a simpler example of this bug that doesn't involve levity polymorphism or TypeInType:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
module Bug where

data family   F (a :: k)
data instance F (a :: Bool) where
  MkF :: F a
$ ~/Software/ghc3/inplace/bin/ghc-stage2 --interactive Bug.hs
GHCi, version 8.3.20170807: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/rgscott/.ghci
[1 of 1] Compiling Bug              ( Bug.hs, interpreted )

Bug.hs:9:3: error:
    • Data constructor ‘MkF’ returns type ‘F a’
        instead of an instance of its parent type ‘F a’
    • In the definition of data constructor ‘MkF’
      In the data instance declaration for ‘F’
  |
9 |   MkF :: F a
  |   ^
Failed, 0 modules loaded.
λ> :set -fprint-explicit-kinds
λ> :r
[1 of 1] Compiling Bug              ( Bug.hs, interpreted )

Bug.hs:9:3: error:
    • Data constructor ‘MkF’ returns type ‘F k a’
        instead of an instance of its parent type ‘F Bool a’
    • In the definition of data constructor ‘MkF’
      In the data instance declaration for ‘F’
  |
9 |   MkF :: F a
  |   ^
Failed, 0 modules loaded.

Here's we can see that since a appears unadorned in F a, its kind seems to be inferred as k (as given from the data family definition), although we'd much rather have it inferred as Bool, which you'd expect from the instance head.

Last edited 2 years ago by RyanGlScott (previous) (diff)

comment:3 Changed 2 years ago by carter

In this case the error is made even more pernicious by the default suppression of Levity arguments, but yeah, i see how its indeed the same issue

comment:4 Changed 2 years ago by goldfire

I seem to recall this behavior being a deliberate design decision -- GADT constructor types stand alone. There was a long ticket with commentary and deliberations. Sadly, I can't find it, and 10 minutes of git blame-chasing in the area of code that should have been affected didn't find anything either. Perhaps SPJ remembers more.

This kind of problem should always succumb to a kind annotation somewhere, but I agree that it's unfortunate and repetitive to put the annotation in.

comment:5 Changed 2 years ago by RyanGlScott

I'm confused - is this GADT behavior, or data family behavior? I can't seem to trigger a similar sort of error with a non-family GADT. For instance, this works fine:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
module Foo where

data M (a :: Bool)

data F (a :: k) where
  MkF :: M a -> F a

(Granted, these examples aren't quite equivalent, but this was the most convenient way I could think of to constraint the type parameter to be of kind Bool without actually giving it an explicit kind signature.)

comment:6 Changed 2 years ago by goldfire

It's GADT data family behavior. :) The problem can really happen only with data/newtype instances. In comment:5, you're actually making a kind-indexed GADT. That program shouldn't be accepted, but is due to #13391.

comment:7 Changed 2 years ago by simonpj

Consider

data instance T a [a] where
  T1 :: Int -> T Bool [Bool]
  T2 :: Char -> T Bool [Char]  -- Wrong
  T3 :: p -> T p q             -- Wrong

We want to ensure that the result type of each data constructor is a substitution instance of the T a [a] in the head. T2 isn't, so it's rejected. Ditto T3

But this ticket points out that the user-written signature for the data constructor has invisible (and hence under-specified) kind arguments. We want those kind arguments to conform to the shape of the head too.

One way to do this might be, when kind-checking the type signature for a data constructor,

  • Instantiate the head shape with fresh unification variables
  • Unify it with the result type of the data constructor

That would force the shapes to match, and would instantiate any kind variables appropriately.

I'm not sure what we'd do with the kind coercion that this unification would return; but it smells to me like the right thing to do.

comment:8 Changed 2 years ago by goldfire

I don't think you mean

Instantiate the head shape with fresh unification variables

I think you meant

  • Instantiate the return type with fresh unification variables

That is, we don't want the unification variables in the head shape (that is, type patterns), but in the data constructor type.

Actually, we already do your version of instantiation, in kind checking. See TcTyClsDecls.kcDataDefn and TcTyClsDefls.tc_fam_ty_pats. But we don't want to do it again when checking data constructors.

If I understand correctly, this instantiate-and-unify would replace the call to tcMatchTy in rejigConRes. The kind coercion is easy to dispatch: it will always be reflexive, because the kind of a fully-applied tycon is always Type. This always-reflexive property doesn't mean the unification is useless, of course, as it will unify metavariables.

comment:9 Changed 2 years ago by simonpj

I did mean "instantiate the head shape with unification variables".

In our example, the head shape is T a [a]. So, when checking T1, I instantiate the head shape with unification varibles T alpha [alpha] and unify with the result of the data con T Bool [Bool]. Success, and the substitution for alpha tells me about the necessary equality constraints.

The payoff is that the rigid bits of the head shape get to instantiate any kind-unification-variables floating around in the result type of the data constructor. So this has to be done before we kind-generalise the data constructor's type.

comment:10 Changed 2 years ago by goldfire

Ah. Now I understand better. Instantiating the shape is just incidental, done so that unification succeeds at all. The real action is in solving for any metavars in the data constructor type, and this unification does not affect anything about the head shape.

Yes, that sounds reasonable.

comment:11 Changed 23 months ago by RyanGlScott

comment:12 Changed 20 months ago by carter

Did any action towards cleaning up the inference here make it into 8.4 ghc ?

comment:13 Changed 20 months ago by RyanGlScott

'Fraid not, as GHC 8.4.1 still rejects the original program and the program in comment:5.

comment:14 Changed 18 months ago by RyanGlScott

Last night, I tried working on this (in a sleep-deprived haze). I actually managed to come up with something that makes the original program compile:

  • compiler/typecheck/TcTyClsDecls.hs

    diff --git a/compiler/typecheck/TcTyClsDecls.hs b/compiler/typecheck/TcTyClsDecls.hs
    index 39697d6..842050f 100644
    a b import TcClassDcl 
    3636import {-# SOURCE #-} TcInstDcls( tcInstDecls1 )
    3737import TcDeriv (DerivInfo)
    3838import TcEvidence  ( tcCoercionKind, isEmptyTcEvBinds )
    39 import TcUnify     ( checkConstraints )
     39import TcUnify     ( checkConstraints, unifyType )
    4040import TcHsType
    4141import TcMType
    4242import TysWiredIn ( unitTy )
    tcConDecl rep_tycon tag_map tmpl_bndrs res_tmpl 
    18011801              do { ctxt <- tcHsMbContext cxt
    18021802                 ; btys <- tcConArgs hs_args
    18031803                 ; res_ty' <- tcHsLiftedType res_ty
     1804                 ; (meta_subst, _meta_tvs) <- newMetaTyVars $ binderVars tmpl_bndrs
     1805                 ; let head_shape_with_metas = substTy meta_subst res_tmpl
     1806                 ; _ <- unifyType Nothing res_ty' head_shape_with_metas
    18041807                 ; field_lbls <- lookupConstructorFields name
    18051808                 ; let (arg_tys, stricts) = unzip btys
    18061809                       bound_vars = allBoundVariabless ctxt `unionVarSet`

This is nowhere near correct, though, since it causes other programs, such as T13242a, to infinitely loop when compiling. What's worse, I don't understand why.

Help? Disclaimer: I have no idea what I'm doing.

comment:15 Changed 18 months ago by goldfire

That code looks to me to be in the ConDeclH98 clause, but Haskell98 constructors are not in play here.

That said, these lines might just do the trick in the equivalent place in the ConDeclGADT clause.

comment:16 Changed 18 months ago by RyanGlScott

This is in the ConDeclGADT clause.

comment:17 Changed 18 months ago by goldfire

Bizarre. My line numbers from HEAD match exactly yours -- but in the ConDeclH98 clause.

And I'm stymied about the infinite loop on T13242a. Oh, wait. Maybe I'm not.

Here is the problem case:

data T where
  A :: forall a. Eq a => a -> T

The problem is that T is knot-tied, because we're in the act of building it. So we can't then go and unify it. Instead, you should do the same basic thing as you're doing above, but to the arguments of the tycon, not the whole tyconapp itself. That is, use tcSplitTyConApp on the res_tmpl and the res_ty' and then unify the respective arguments. (You might have to explicitly deal with the possibility that the lists are of different length. I think lists of different lengths are definitely errors.)

Upon further thought, this "unify the args" approach solves this particular case, but it won't solve all cases, because it's possible to mention knot-tied types in the arguments to a GADT result type. I don't know a way out, then, other than to wait until #13737 gets rid of the whole knot-tying business.

comment:18 Changed 18 months ago by RyanGlScott

Blocked By: 13737 added

comment:19 Changed 14 months ago by RyanGlScott

Phab:D4974 purports to "get rid of the whole knot-tying business"... or so I thought. Actually, I don't think even Phab:D4974 would be sufficient to unblock this. The problem is that even with that patch, res_tmpl in comment:14 is still knot-tied (since it mentions the knot-tied rep_tycon). This means calling unifyType on it is doomed to loop infinitely.

Alas, https://phabricator.haskell.org/D4974#137148 suggests that rep_tycon is destined to be knot-tied, so I'm not sure how this bug could ever be fixed.

comment:20 Changed 14 months ago by goldfire

I think we have a way forward here, actually.

First off, the problem isn't the main type-checker knot. The problem is the littler knot in tcDataDefn which retrieves the tycon from the future in order to, well, build the tycon. This tycon is used in three places:

  1. In the res_ty passed to tcConDecls. This res_ty is, in turn, used in two places:
    1. In the call to buildDataCon, where it is used to set up the dcOrigResTy field of a DataCon.
    2. To do the result-type matching in rejigConRes.
  2. As the TyCon passed to tcConDecls. That is used only when building the DataCon, so that a DataCon knows what TyCon it has come from.
  3. In mkNewTyConRhs, so that a newtype CoAxiom knows what TyCon it has come from.

Usages 1a, 2, and 3 absolutely need the final, correct, fully zonked TyCon. However, usage 1b does not. Conveniently, usage 1b is the one causing us trouble.

The solution is to use kcLookupTcTyCon to get the TcTyCon of interest and build the res_ty with un-knot-tied types. Pass this (as a new parameter) to tcConDecls. And away you go!

Remaining to do:

  1. Send all the stuff in TyCons through Core Lint. I'm pretty sure we have some bits and pieces that are not fully zonked sneaking into TyCons. For example, the final_bndrs in tcDataDefn look not to have ever been zonked (by zonkTcTypeToType or a friend).
  2. Fix the errors that arise in (1).

comment:21 Changed 14 months ago by RyanGlScott

Aha! Thank you goldfire, that plan works beautifully. With that, I am able to write a patch (based off of comment:14, and applied on top of Phab:D4974) which fixes the original program. Woohoo!

...there's one gotcha, though. Since we're now straight-up using unifyType in tcConDecl, some of the error messages degrade in quality. One example is in the T12087 test case, which currently gives a rather informative error message:

T12087.hs:6:3: error:
    • GADT constructor type signature cannot contain nested ‘forall’s or contexts
      Suggestion: instead use this type signature:
        MkF1 :: forall a. (Ord a, Eq a) => a -> F1 a
    • In the definition of data constructor ‘MkF1’
      In the data type declaration for ‘F1’

But after my patch, this turns into:

T12087.hs:6:3:
     Couldn't match expected type ‘F1 a0’
                  with actual type ‘Eq a1 => a1 -> F1 a1’
     In the definition of data constructor ‘MkF1’

Which kind of sucks. The reason this happens is because the GADT constructor type signature cannot contain nested ‘forall’s or contexts error is emitted in checkValidDataCon, which comes after tcConDecl (currently, GHC assumes that tcConDecl will ignore ill formatted data constructors like these, and let checkValidDataCon catch them afterwards).

I'm not quite sure how to deal with this. Perhaps we should carve out that particular check from checkValidDataCon and put it before the call to unifyType in tcConDecl? It's a bit arbitrary, but I think it would address the issue. As an added bonus, it might fix #11384, since performing this check earlier might give GHC a chance to spot an improper return type before kind-checking occurs.

comment:22 Changed 14 months ago by goldfire

There is precedent with the strategy in comment:21. Note the arity check toward the top of tcFamTyPats. This check is fully redundant, but it gives a nice arity error (simple to understand!) instead of an obscure kind error.

So, I would say to go with comment:21, with a comment (not sure if it needs a Note) explaining why do the eager check outside of checkValidDataCon.

comment:23 Changed 14 months ago by goldfire

Ryan, what happens if you try this example in your patch:

type S = T
data T where
  MkT :: Int -> S

That works today. But I have a feeling it will fail with your patch. The problem is actually an oversight in Phab:D4974, but I was surprised that Phab:D4974 actually accepts the program above. However, a sneaking suspicion is that, with your patch as well, the program will be rejected. I'm wondering if I should bother fixing this (maybe it wasn't an oversight after all), and so I'm curious to know the behavior on your branch with that program.

comment:24 Changed 14 months ago by simonpj

I am intensely relaxed about this. It seems to me to be outright confusing to allow type synonyms in the result type of a data constructor signature. Let's just not allow it, and insist on T in the result type.

comment:25 Changed 14 months ago by RyanGlScott

Correct, the program in comment:23 no longer compiles:

Bug.hs:6:3: error:
    • Couldn't match expected type ‘T’ with actual type ‘S’
    • In the definition of data constructor ‘MkT’
      In the data declaration for ‘T’
  |
6 |   MkT :: Int -> S
  |   ^^^^^^^^^^^^^^^

comment:26 Changed 14 months ago by RyanGlScott

Recording some thoughts from a conversation with Richard and Simon:

  • The fact that programs like in the one in comment:23 won't compile is OK. In general, the type of a data constructor is quite restricted compared to other type signatures, and there already precedent for disallowing type synonyms in certain spots of data constructors. For instance, GHC rejects this example (from #7494):
data Steps s y where
  Yield :: y -> FK s y -> FK s y
  Done  :: Steps s y
type FK s y = s -> Steps s y

So it's perhaps not so unusual for GHC to reject the example from comment:23 as well. But this should be documented in the users' guide. (Should this bullet point have its own ticket?)

  • The code in comment:14 is slightly unusual in that we call unifyType, but immediately throw away the coercion that it returns. But this is also OK—there is an invariant that the coercion returned here will always be reflexive. As evidence for this claim, consider the following example:
type family F a
type instance F Char = [Bool]

data family T a b
data instance T a [a] where
  MkT :: T Bool (F Char)

Compiling MkT should require a non-reflexive coercion in order to cast its return type T Bool [Bool] to T Bool (F Char). But we disallow such shenanigans in checkValidDataCon, which checks that the type of a data constructor really is an instance of the parent type (without reducing any type families). Therefore, the shape of the data constructor return type must be the same as the shape of the parent type, so any coercion between the two must be reflexive.

This subtlety should at least be documented. (Ideally, we'd add an ASSERTion for it too, but this would be tricky to do in tcConDecl, and the alternative of stashing the coercion in MkDataCon only to check it later in the code if ASSERTions are enabled is rather ghastly.)

comment:27 Changed 13 months ago by RyanGlScott

Milestone: 8.8.1
Owner: set to RyanGlScott

comment:28 Changed 9 months ago by osa1

Milestone: 8.8.18.10.1

Bumping milestones of low-priority tickets.

Note: See TracTickets for help on using tickets.