Opened 8 years ago

Last modified 4 years ago

#5267 new bug

Missing type checks for arrow command combinators

Reported by: peteg Owned by: ross
Priority: low Milestone:
Component: Compiler (Type checker) Version: 7.0.3
Keywords: Arrows Cc: ross@…, peteg
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: GHC accepts invalid program Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Is this expected to work?

{-# LANGUAGE Arrows #-}
module T where

import Prelude
import Control.Arrow

t = proc () ->
     do rec x <- arr id <<< (| (arr id) (returnA -< x) |)
        returnA -< x

t' = proc x ->
     do x <- arr id <<< (| (arr id) (returnA -< x) |)
        returnA -< x

t'' = proc x ->
     do x <- arr id <<< (| (arr id) (returnA -< 3) |)
        returnA -< x

t''' = proc x -> arr id <<< (| (arr id) (returnA -< 3) |)

I get:

/tmp/T.hs:8:18:
    The type of the first argument of a command form has the wrong shape
      Argument type: t_tX
    In the command: (arr id) <<< ((|(arr id) ((returnA -< x))|))
    In a stmt of a 'do' expression:
        x <- (arr id) <<< ((|(arr id) ((returnA -< x))|))
    In a stmt of a 'do' expression:
        rec {x <- (arr id) <<< ((|(arr id) ((returnA -< x))|))}

/tmp/T.hs:12:14:
    The type of the first argument of a command form has the wrong shape
      Argument type: t_tG
    In the command: (arr id) <<< ((|(arr id) ((returnA -< x))|))
    In a stmt of a 'do' expression:
        x <- (arr id) <<< ((|(arr id) ((returnA -< x))|))
    In the expression:
      proc x -> do { x <- (arr id) <<< ((|(arr id) ((returnA -< x))|));
                     returnA -< x }

/tmp/T.hs:16:14:
    The type of the first argument of a command form has the wrong shape
      Argument type: t_tq
    In the command: (arr id) <<< ((|(arr id) ((returnA -< 3))|))
    In a stmt of a 'do' expression:
        x <- (arr id) <<< ((|(arr id) ((returnA -< 3))|))
    In the expression:
      proc x -> do { x <- (arr id) <<< ((|(arr id) ((returnA -< 3))|));
                     returnA -< x }

/tmp/T.hs:19:18:
    The type of the first argument of a command form has the wrong shape
      Argument type: t_k
    In the command: (arr id) <<< ((|(arr id) ((returnA -< 3))|))
    In the expression:
      proc x -> (arr id) <<< ((|(arr id) ((returnA -< 3))|))
    In an equation for `t'''':
        t''' = proc x -> (arr id) <<< ((|(arr id) ((returnA -< 3))|))

Change History (27)

comment:1 Changed 8 years ago by simonpj

Cc: ross@… added

Adding Ross in the hope he can answer.

comment:2 Changed 8 years ago by ross

Resolution: invalid
Status: newclosed

GHC seems to be performing properly here.

In each case GHC is objecting to (| (arr id) (returnA -< x) |), saying the type of the first argument of a command form has the wrong shape. Since you've used an operator with one argument, it's expecting the operator to have the form

forall e. a (...(e,t1), ... tn) t -> a (...(e,t1'), ... tn') t'

for some arrow a, and with e not free in the other t's (see Section 7.10.4 of the GHC User's Guide). In your case the operator, namely (arr id), has type forall a' b. Arrow a => a' b b. GHC has instantiated a' as -> to match the above, and then generalized and complained that the type of the argument is a type variable instead of an arrow type. Perhaps you were expecting it to use the type of the argument (returnA -< 3) to instantiate b to a e Int before generalizing, but this behaviour is consistent with the treatment of rank-2 types elsewhere.

I'm not sure what you were trying to do here. Maybe the mailing list would be a good place to discuss it.

comment:3 Changed 8 years ago by peteg

Cc: peteg added
Resolution: invalid
Status: closednew

Ross, thanks for the clarification. I was playing golf with code of this form:

{-# LANGUAGE Arrows #-}
module T where

import Prelude
import Control.Arrow

t = proc () ->
     do rec x <- (| (arr id) (returnA -< x) |)
        y <- arr id -< x
        returnA -< y

(for non-trivial arrows) and surprised to get this error. The question was whether one can put anything between <- and (|, as I did in the original bug report.

My operators tend to be of the form:

arrAC :: Arrow (~>) => (env ~> a) -> (env ~> a) -> (env ~> a)

which syntactically doesn't satisfy your requirement but seems to work anyway. I get confused by just where the syntactic criteria kick in and why.

In any case can we let this bug stand as a complaint that saying some type variable has the "wrong shape" is a terrible error message? GHC usually tells me what it was expecting.

cheers peter

comment:4 Changed 8 years ago by igloo

Milestone: 7.4.1

comment:5 Changed 8 years ago by peteg

Ross,

Here's another couple of bugs.

module B where

import Prelude
import Control.Arrow
import Control.Category

-- Some Arrow transformer.
newtype A (~>) b c = A { unA :: b ~> c }
  deriving (Arrow, Category)

-- Determine the condition in the underlying Arrow.
ite :: ArrowChoice (~>) => (env ~> Bool) -> A (~>) env d -> A (~>) env d -> A (~>) env d
ite iA tA eA = A $ proc env ->
  do i <- iA -< env
     if i then unA tA -< env else unA eA -< env

-- Bug: moving ite_perm out of the where block works.
{-
ite' cA tA eA = proc x ->
  do c <- cA -< x
     (| ite_perm (returnA -< c) |)
  where ite_perm i = ite i tA eA


    The type of the first argument of a command form has the wrong shape
      Argument type: t_tz
    In the command: (|ite_perm ((returnA -< c))|)
    In the expression:
      proc x -> do { c <- cA -< x;
                     (|ite_perm ((returnA -< c))|) }
    In an equation for `ite'':
        ite' cA tA eA
          = proc x -> do { c <- cA -< x;
                           (|ite_perm ((returnA -< c))|) }
-}

ite' cA tA eA = proc x ->
  do c <- cA -< x
     (| (ite_perm tA eA) (returnA -< c) |)

ite_perm tA eA i = ite i tA eA

-- Bug: we can't write the type signature we want.
-- ite' :: ArrowChoice (~>) => A (~>) env Bool -> A (~>) env d -> A (~>) env d -> A (~>) env d

{-

    Warning: Top-level binding with no type signature:
               ite' :: forall a t t1 (t2 :: * -> * -> *).
                        ArrowChoice t2 =>
                        A t2 t1 Bool -> A t2 a t -> A t2 a t -> A t2 t1 t

I want to set a = t1.

-}

The motivation is where ite is a lot more complex than it is here, and I don't want to repeat it in ite'. It is supposed to be a small optimisation, i.e. avoiding lift and (>>>) in A.

Perhaps these are fixed in a more recent GHC than 7.0.3.

cheers

peter

comment:6 Changed 8 years ago by ross

Bug: moving ite_perm out of the where block works.

This one:

ite' cA tA eA = proc x ->
  do c <- cA -< x
     (| ite_perm (returnA -< c) |)
  where ite_perm i = ite i tA eA

works for me in 7.0.1, 7.0.4 and HEAD, but not 6.12.2.

Bug: we can't write the type signature we want.

ite' :: ArrowChoice (~>) => A (~>) env Bool -> A (~>) env d -> A (~>) env d -> A (~>) env d
ite' cA tA eA = proc x ->
  do c <- cA -< x
     (| (ite_perm tA eA) (returnA -< c) |)

ite_perm :: ArrowChoice (~>) => A (~>) env d -> A (~>) env d -> (env ~> Bool) -> A (~>) env d
ite_perm tA eA i = ite i tA eA

We have ite_perm tA eA :: (env ~> Bool) -> A (~>) env d, and the compiler can't generalize env as required by the banana brackets, because it occurs in the type environment. So it's following the rules; the question is whether the rules can be safely relaxed.

comment:7 Changed 8 years ago by ross

Owner: set to ross

On second thoughts, I think 6.12 was right to reject

ite'' cA tA eA = proc x ->
  do c <- cA -< x
     (| ite_perm' (returnA -< c) |)
  where ite_perm' i = ite i tA eA

because the banana brackets require ite_perm' :: forall env. (env ~> Bool) -> (env ~> r), but generalization of env should be blocked because it occurs in the types of tA}} and {{{eA.

comment:8 Changed 8 years ago by ross

This seems to result from the deletion of

               -- Check that the polymorphic variable hasn't been unified with anything
               -- and is not free in res_ty or the cmd_stk  (i.e.  t, t1..tn)
       ; checkSigTyVarsWrt (tyVarsOfTypes (res_ty:cmd_stk)) [w_tv]

from the ArrForm case of tc_cmd in this patch:

commit d2ce0f52d42edf32bb9f13796e6ba6edba8bd516
Author: simonpj@microsoft.com <unknown>
Date:   Mon Sep 13 09:50:48 2010 +0000

    Super-monster patch implementing the new typechecker -- at last

Apparently the replacement code doesn't perform the same check.

comment:9 Changed 8 years ago by ross

Here's a simpler snippet illustrating the same bug:

should_fail v b = proc x -> (| f (returnA -< v) |)
  where f a = a <+> b

The deleted test rightly rejected this because f :: (env ~> r) -> (env ~> r) and the type variable env occurs in b :: env ~> r.

Another thing that the deleted test prevented was

should_also_fail x = proc y -> (| bad_op (returnA -< x) |)
  where
    bad_op :: Arrow a => a env b -> a env env
    bad_op = const returnA

where the type variable env occurs in the output type of the arrow constructed by bad_op.

comment:10 Changed 8 years ago by simonpj

Ah, I see. The relevant code is here (line 254 of TcArrows:

        ; (inst_binds, expr') <- checkConstraints ArrowSkol [w_tv] [] $
                                 escapeArrowScope (tcMonoExpr expr e_ty)

The test is actually still there (it's in checkConsraints in fact). But the bug is in escapeArrowScope. This function is rather drastic: it replaces the entire typechecker's environment, both local and global, with one saved from outside the arrow form. That is pretty dire, because the environment carries more than just what varialbes are in scope. In particular, it carries

  • A mutable cell to collect type checker constraints
  • The index of the youngest "touchable" type variable

These two help to implement the skolem-escape check, so replacing them wholesale amounts to jumping right outside the existential.

What is escapeArrowScope really meant to do? Here are the comments from TcRnTypes:

---------------------------
-- Arrow-notation context
---------------------------

{-
In arrow notation, a variable bound by a proc (or enclosed let/kappa)
is not in scope to the left of an arrow tail (-<) or the head of (|..|).
For example

	proc x -> (e1 -< e2)

Here, x is not in scope in e1, but it is in scope in e2.  This can get
a bit complicated:

	let x = 3 in
	proc y -> (proc z -> e1) -< e2

Here, x and z are in scope in e1, but y is not.  We implement this by
recording the environment when passing a proc (using newArrowScope),
and returning to that (using `escapeArrowScope`) on the left of -< and the
head of (|..|).

Well if that's all, the task should be handled 100% by the renamer, with no need for the type checker to do anything. Even in the renamer I'm suspicious of the lock-stock-and-barrel replacement of the environment. I think all you need do is to snapshot the state of tcl_rdr, and keep that in the ArrowCtxt, not the entire environment. Moreover, I think there is no need to call escapeArrowScope in the typechecker as well as in the renamer. The renamer alone does the job.

Have I understood aright?

There is another problem with TcArrows, revealed by removing the escapeArrowScope calls:

		-- Check that it has the right shape:
		-- 	((w,s1) .. sn)
		-- where the si do not mention w
	   ; checkTc (corner_ty `eqType` mkTyVarTy w_tv && 
		      not (w_tv `elemVarSet` tyVarsOfTypes arg_tys))
		     (badFormFun i tup_ty')

You just can't do this with the new type inference engine, because there may be many suspended unification constraints so the type might not yet have the right shape; only when constraint solving is gone will it have the right shape.

The Right Thing is to unify corner_ty with w_tv, rather than calling eqType. But I'm very dubious about the unscramble stuff, since it too seems to rely on enough unification having taken place. I don't understand this code well enough to sort it out. Ross, can you see how to rewrite it to use unification, or at least functions like matchExpectedListTy and friends in TcUinfy? Remember that they return coercions which must be wrapped around appropriate terms. I'd be happy to dissus by phone or skype if this is obscure. The basic rule is: you can't zonk and then look at the type; I guarantee that will fail as soon as things get at all complicated!

comment:11 Changed 8 years ago by ross

It does seem to make sense to handle the scope issues in the renamer, but I think there's another issue here beyond holey scopes. Here's the type rule for ArrForm, just in the unary case (fixing a typo, which wouldn't help):

     G      |-b  c : [s1 .. sm] s
     pop(G) |-   e : forall w. b ((w,s1) .. sm) s -> a ((w,t1) .. tn) t
     w \not\in (s, s1..sm, t, t1..tn)
     ----------------------------------------------
     G |-a  (| e c |)  :  [t1 .. tn] t

So e is checked in the outer scope, but you also need to be able to generalize over the type variable w, so it can't occur free in pop(G) (which it does in the first example above). This is part of ensuring that operators like e are oblivious to the way we plumb the environment through the arrows. The rest is the constraint on w in the above rule. The si and ti are checked by the fragment you mention above. The check on t seems to have disappeared in 7.0, as seen in the second example I gave. It seems the check on s was never implemented: the following is wrongly accepted even by 6.12:

should_fail_too x = proc y -> (| bad_op (returnA -< x) |)
  where
    bad_op :: Arrow a => a env env -> a env Bool
    bad_op _ = arr (const True)

comment:12 Changed 8 years ago by simonpj

I think if you could help me refactor the code

		-- Check that it has the right shape:
		-- 	((w,s1) .. sn)
		-- where the si do not mention w
	   ; checkTc (corner_ty `eqType` mkTyVarTy w_tv && 
		      not (w_tv `elemVarSet` tyVarsOfTypes arg_tys))
		     (badFormFun i tup_ty')

so that it used unifyType, and made proper use of the coerion that unifyType returns (see matchExpectedListTy and friends), we'd be in much better shape. I'm pretty sure that the stuff you describe will be fine; but I don't understand enough detail to be sure. Can you give a link to the paper with the typing rules? Are they still the ones you think are correct?

The key thing I don't know is: how long should nested tuple be. We can't discover this by zonking the tup_ty and looking at it, because that won't work with suspended unification constraints. It must surely be a lexical property of the code... ? Simon

comment:13 in reply to:  12 Changed 8 years ago by ross

Type of failure: GHC rejects valid programGHC accepts invalid program

Replying to simonpj:

The key thing I don't know is: how long should nested tuple be. We can't discover this by zonking the tup_ty and looking at it, because that won't work with suspended unification constraints. It must surely be a lexical property of the code... ?

Aye, there's the rub. It's not a lexical property; the depth of this nested pair comes only from the type of the operator expression in the ArrForm. If it helps, the only allowed possibilities at each point are a pair and the skolem type variable w. (I realize that's one too many.)

comment:14 Changed 8 years ago by simonpj

That is so weird. Can you give a couple of examples, where the very same ArrForm types in different ways, depending on the type of the function. I'm lost here.

comment:15 Changed 8 years ago by ross

Here are some examples from our desugaring discussion, which may or may not help.

add :: Arrow a => a b Int -> a b Int -> a b Int
add f g = proc z ->
        (f -< z) `bind` \ x ->
        (g -< z) `bind` \ y ->
        returnA -< x+y

bind :: Arrow a => a e b -> a (e, b) c -> a e c
u `bind` f = arr Prelude.id &&& u >>> f

Here we have infix ArrForms with bind as the function.

ite :: ArrowChoice a => a env Bool -> a env d -> a env d -> a env d
ite iA tA eA = proc env ->
        (iA -< env) `bind` \ i ->
        (| ifThenElseA (tA -< env) (eA -< env) |) i

ifThenElseA :: ArrowChoice a => a e r -> a e r -> a (e, Bool) r
ifThenElseA thenPart elsePart = arr split >>> thenPart ||| elsePart
  where
        split (e, True) = Left e
        split (e, False) = Right e

In these cases one can see the arity from the code, but one can also eta-contract the above to

ite' :: ArrowChoice a => a env Bool -> a env d -> a env d -> a env d
ite' iA tA eA = proc env ->
        (iA -< env) `bind` (| ifThenElseA (tA -< env) (eA -< env) |)

comment:16 Changed 8 years ago by simonpj

See also #5609

comment:17 Changed 8 years ago by simonpj

I'm sorry I just don't understand the typing rules for arrow syntax, especially this bit about arrow forms with a nested tuple type whose depth is unknown. Where can I find them written down? They aren't in the original paper.

Somehow we have to fix the type checking code for arrow forms; as it stands it is simply wrong and will fail unpredictably. But I don't undersand it well enough to fix it. Do you understand the problem, Ross? Can you fix it? Or should we talk by phone?

Simon

comment:18 Changed 8 years ago by ross

Component: CompilerCompiler (Type checker)
Summary: Arrow command combinatorsMissing type checks for arrow command combinators

comment:19 Changed 8 years ago by igloo

Milestone: 7.4.17.6.1
Priority: normallow

comment:20 Changed 7 years ago by igloo

Milestone: 7.6.17.6.2

comment:21 Changed 5 years ago by thoughtpolice

Milestone: 7.6.27.10.1

Moving to 7.10.1.

comment:22 Changed 5 years ago by Simon Peyton Jones <simonpj@…>

In f50d62bb6c0357991fabf938bc971d528bbf5cc4/ghc:

Fix the scope-nesting for arrows

Previously we were capturing the *entire environment* when moving under
a 'proc', for the newArrowScope/escapeArrowScope thing.  But that a blunderbuss,
and in any case isn't right (the untouchable-type-varaible invariant gets
invalidated).

So I fixed it to be much more refined: just the LocalRdrEnv and constraints are
captured.

I think this is right; but if not we should just add more fields to ArrowCtxt,
not return to the blunderbuss.

This patch fixes the ASSERT failure in Trac #5267

comment:23 Changed 5 years ago by simonpj

difficulty: Unknown

There *is* something to do in the type checker, namely to plumb the type constraints correctly. See the changes to newArrowScope and escapeArrowScope in the above patch. This fixes the assertion failure.

I'm still not confident about all this, and I'd love someone who understands arrows to review the type inference code for arrows. So I'll leave the ticket open. But it's better, I think.

Simon

comment:24 Changed 5 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:25 Changed 4 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:26 Changed 4 years ago by thomie

Keywords: Arrows added

comment:27 Changed 4 years ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.