Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#10928 closed bug (fixed)

Refine pattern synonym signatures

Reported by: mpickering Owned by:
Priority: highest Milestone: 8.0.1
Component: Compiler Version: 7.10.2
Keywords: Cc: ekmett, dreixel
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Other Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by simonpj)

I think that the current state of pattern synonym signatures is quite confusing, especially regarding the constraints. For those unfamiliar, a signature looks like the following,

pattern ExNumPat :: (Show b) => (Num a, Eq a) => b -> T a

The first constraint being the "provided constraints" and the second the "required constraints".

My main concern is when only a single constraint is specified then these are treated as the provided constraints. The manual gives the reason that this is the "more common" choice. It seems that this motivation is driven by the original ticket which had a lengthy discussion about GADTs. In my experience, the opposite is true, it is more common to have required constraints.

This is true especially in a few common cases such as `pattern Foo = 27`, where users define pattern synonyms which have (overloaded) literals on the RHS. The most general signature for such a pattern is pattern :: () => (Eq a, Num a) => a.

For this reason, I think it would be better if either

  1. Only specifying one constraint corresponded to the required constraints
  2. We required users to specify both sets of constraints, even if the provided constraints are empty.

In terms of breakage, I don't think that pattern synonym signatures are widely used yet. In both schemes it is possible to write backwards compatible code by writing both sets of constraints.

I think a lot of the confusion also arises from the unusual form of the signature, it is unusual to specify two sets of constraints and I suspect users tends to assume that a single set of constraints is either provided or required depending on what they want it to mean. Forcing the specification of both, forces the user to make the distinction earlier rather than trying to decipher error messages.

This is copied from a message sent to ghc-devs. This ticket is to track the progress. But read the email thread for other comments!

Change History (32)

comment:1 Changed 4 years ago by bgamari

Summary: Refine pattern synonym siganturesRefine pattern synonym signatures
Type of failure: None/UnknownOther

comment:2 Changed 4 years ago by simonpj

Description: modified (diff)

comment:3 Changed 4 years ago by simonpj

I'm ok with either (1) or (2). There is no technical issue here, it's just a question of what most people find most "natural".

It'd be interesting to write a clear description of the three alternatives, and get people to vote (a little online survey).

Simon

comment:4 Changed 4 years ago by simonpj

Description: modified (diff)

comment:5 Changed 4 years ago by goldfire

While we're tackling this problem, I'd like to repeat my arguments (made last round during the debate about this syntax) that we've gotten the order of the constraints wrong. Not less natural, but wrong.

Consider this:

data T a where
  MkT :: (Show a, Show b) => a -> b -> T a

pattern P :: (Show a, Show b) => (Eq a, Num a) => b -> T a
pattern P x = MkT 3 x

foo :: T Int -> Bool
foo (P _) = False

This code typechecks. But let's delve into that pattern type signature a bit. It has 4 parts:

  1. The provided constraints: (Show a, Show b)
  1. The required constraints: (Eq a, Num a)
  1. The arguments: b
  1. The result: T a

Let's look, in particular, at the type variables here. Somewhat by definition, only the universal variables are mentioned in the result. Thus a is universal. Any other type variables are existential. Thus b is existential. Here is what is in scope in the 4 regions:

  1. Provided: universals and existentials are in scope
  1. Required: only universals are in scope
  1. Arguments: universals and existentials are in scope
  1. Result: only universals are in scope

Look how the scoping rules flip-flop!

Wait. What I've written is a lie. In GHC 7.10, the existentials are in scope in the required constraints. But this is hogwash. There's nothing at all useful that can be done by having a required constraint on an existential. Required constraints must be established before the pattern is matched. But existentials are available only after the match. Indeed, putting a Read b constraint in the required set makes foo fail to type-check. b should simply not be in scope for the required constraints.

By reversing the order of provided/required, our scopes would nest.

I would also support the following, verbose but clear syntax:

pattern P :: { universals = [a]
             , existentials = [b]
             , provided = (Show a, Show b)
             , required = (Eq a, Num a)
             , arguments = [b]
             , result = T a }

The provided, required, etc, above are keywords, essentially, but they wouldn't clobber any existing syntax. We know precisely when we're parsing a pattern type signature. The only compulsory entry in there would be result. The universals and existentials are lists of type variable binders, where we could assign kinds to the variables. These would be inferred, as usual, if they are omitted.

(You might think we don't need to have binders anywhere, because we can always use a kind annotation on a usage of a variable to fix its kind. But this won't work in 8.0 for two reasons. 1) With visible type application, it would be nice to give an ordering to the variables, with universals always before existentials, and 2) when we have kind families, putting a kind family in a use site could be ambiguous, whereas it is unambiguous at a binding site.)

This version has the considerable advantage of being easier to read and search for. It also means we could deprecate the current syntax for a cycle -- nothing would change in meaning.

To be clear, I'm not 100% convinced that what I've suggested is an improvement, as it's quite verbose. But I'm reminded of one argument that came up during the role annotations debate that code is read much, much more often than it is written. Optimize for reading over writing!

comment:6 Changed 4 years ago by simonpj

  • I do agree about switching the order of provided and required constraints. Let's just do that for GHC 8.0. This isn't yet a widely-used feature, so lets' fix it asap.
  • We should definitely complain if the existentials are used by required constraints. That's a bug to fix.
  • I'm much less certain about Richard's proposed syntax. Not dead set against but we could get on with the first two while debating the third.

comment:7 Changed 4 years ago by mpickering

Cc: ekmett dreixel added

Adding Edward and Pedro as they are the only two other authors to (publicly) use them.

comment:8 Changed 4 years ago by goldfire

I won't champion the verbose syntax, as I'm deeply unsure of that myself. But I wanted to see what a verbose syntax might look like, in case someone else wants to run with it.

I'm quite happy with comment:6. We should also include point (1) or (2) from the original post. I favor (1) by a tiny bit, but perhaps only because it's sunny out this morning. It's hard for me to choose there.

comment:9 Changed 4 years ago by rwbarton

One thing that bothers me about the current syntax is that C1 a => C2 a => T a in general already has a meaning. It's the same thing as (C1 a, C2 a) => T a. I don't know if this is actually valid Haskell 98 (I suspect not), but GHC accepts it without any language flags.

Just to throw out another option, long ago user "ski" on IRC suggested a syntax for existentials-plus-constraints, dual to constrained polymorphic values. The idea is dual to

forall a. C a => T a

which is a sort of function that accepts a C a constraint and produces a value, we have

exists a. C a *> T a

which is a sort of pair of a C a constraint (dictionary) and a value. (Mnemonic: * is like a pair and > is from =>. Not sure I am convinced myself.)

I'm not sure whether this applies directly to pattern synonyms since a pattern is not really the same thing as a value. But, we could at least use the idea of two different bits of syntax for provided and required constraints, e.g.,

pattern P :: (Eq a, Num a) => (Show a, Show b) *> b -> T a

Here I am thinking of (Eq a, Num a) as in negative position and (Show a, Show b) in positive position, so tentatively using the corresponding => and *>.

Advantages:

  • Does not use syntax that already has another meaning (C1 a => C2 a => T a)
  • You can write patterns with either empty required constraints or empty provided constraints (Cr a => T a, Cp a *> T a) without having to add an empty context
  • Not extremely verbose like Richard's verbose syntax

Disadvantages:

  • Another funny bit of syntax to learn. But at least it appears in only one context.

comment:10 Changed 4 years ago by goldfire

I like the idea of using a different operator to flag required vs provided here. I'm far from convinced about *>, not in the least because that steals the type operator.

I do agree with Reid about the unfortunate fact that C a => D a => T already has a meaning. But we're not stealing syntax here, because the blah in pattern P :: blah is not a type. It's a pattern type signature, which, in its current syntax, looks rather like a type, but it's a different beast, with its own strange rules. (Rather like the fact that the blah in data X where Y :: blah is also not a type. Note that C a => D a => a -> X doesn't work there, but that record and bang syntax does.) So I'm not terribly bothered by the double => in this regard, but only a little bothered.

comment:11 Changed 4 years ago by rodlogic

Sorry to inject myself here, but what about:

pattern P :: (Required b) => b -> T a => (Provided a)

I.e. required constraints come before and provided after.

comment:12 Changed 4 years ago by rwbarton

Thinking about this some more it seems wrong to change the operator used for provided constraints, since the whole type of a constructor in a GADT declaration

data T a where
  MkT :: (Show a, Show b) => a -> b -> T a

should be a unit in the type of a pattern

pattern MkT :: (Show a, Show b) => a -> b -> T a

Or looking at it another way, the fields of type a and b, despite being provided by a match on the pattern, appear in negative position in the type; so the provided context (Show a, Show b) should also appear in negative position too, which is the normal role of =>.

So if we're going to use two different operators here, the non-=> one should mark the required constraints, like

pattern P :: (Eq a, Num a) ??> (Show a, Show b) => b -> T a

But this is somehow a bit less appealing to me, since I don't see how this other operator ??> could make sense in any other context.

We have provided constraints, provided values (the values bound by a matching pattern), required constraints, but no required values. That's PatternFamilies. However there is no proposed syntax for the type of a pattern family there. It might be worthwhile to try to solve that problem at the same time. For example a syntax that would accommodate required arguments (while being rather ugly and perhaps unparseable) could be

pattern ReqC => ReqVal1 -> ReqVal2 -> P :: ProvC => ProvVal1 -> ProvVal2 -> Res

This also happens to be backwards compatible in the case where ReqC => ReqVal1 -> ReqVal2 -> is empty.

comment:13 Changed 4 years ago by rwbarton

I'm actually coming around to the original syntax, but with the constraints reversed as in comment:5. It's not so bad and it can be extended to support required values too, using an empty provided constraint if needed.

pattern IsMember :: Ord a => a -> () => Set a
pattern IsMember val <- (member val -> True)

pattern Lookup :: Ord k => k -> () => v -> Map k v
pattern Lookup key val <- (lookup key -> Just val)

I have to say writing these pattern signatures was a bit mind-bending, but I don't think that's because of the particular concrete syntax involving constraints.

I still feel that maybe we ought to be able to do better, but I consider this solution at least satisfactory.

comment:14 Changed 4 years ago by simonpj

I agree: using the current syntax with reversed constraints seems like the best option right now.

comment:15 in reply to:  11 Changed 4 years ago by rodlogic

Replying to rodlogic:

Sorry to inject myself here, but what about:

pattern P :: (Required b) => b -> T a => (Provided a)

I.e. required constraints come before and provided after.

Feeling stupid here now... this is the original proposal!

comment:16 Changed 4 years ago by goldfire

Concur with comment:14. Keep current syntax, but with reversed ordering. Specifying only one constraint would now indicate a required constraint.

On a separate note, the parallelism with GADT syntax discussed in comment:12 is already broken. Consider

data G a where
  MkG :: G Int

pattern P :: G Int
pattern P = MkG

The pattern and the data constructor have different static semantics. Specifically,

-- this works:
ctor :: G a -> a
ctor MkG = 5

-- this doesn't:
pat :: G a -> a
pat P = 5

This is because a non-uniformity in the return type of a pattern is taken as a required equality constraint, not a provided equality constraint. The top set of definitions is equivalent to

data G a where
  MkG :: a ~ Int => G a

pattern P :: {- required -} a ~ Int => {- provided -} () => G a
pattern P = MkG

This was a design decision made in pattern synonym typing. I don't love this decision, but I can't quite argue against it either. It's one more way in which we must be honest that the things appearing after MkG :: and P :: are not types, but type-like specifications with a very specific interpretation.

comment:17 Changed 4 years ago by simonpj

If anyone feels able to improve the user-manual documentation in the light of this discussion, it would be great to do so. Thanks!

Simon

comment:18 Changed 4 years ago by rwbarton

That's kind of weird. But you can write

pattern P :: a ~ Int => () => G a   -- current syntax; (Int ~ a) is provided constraint
pattern P = MkG

and then pat is accepted...

comment:19 in reply to:  18 Changed 4 years ago by goldfire

Replying to rwbarton:

That's kind of weird. But you can write

pattern P :: a ~ Int => () => G a   -- current syntax; (Int ~ a) is provided constraint
pattern P = MkG

and then pat is accepted...

Sure you can. But such a P is different from the one I defined. Very, very subtly different, but indeed different: matching on your P tells the type checker that a ~ Int, whereas matching on my P asks the type checker to prove a ~ Int.

comment:20 in reply to:  7 Changed 4 years ago by dreixel

Replying to mpickering:

Adding Edward and Pedro as they are the only two other authors to (publicly) use them.

To be honest I don't remember where I publicly used them, but I am perfectly happy with keeping the current syntax, but with reversed ordering.

comment:21 Changed 4 years ago by simonpj

Priority: normalhighest

I think we should get on and reverse the order for 8.0. And make sure that in

pattern P :: req => prov => t1 -> ... -> tn -> T s1 .. sm

then

  • The "universal" tyvars are fv(s1..sm)
  • The "existential" tyvars are fv(t1..tn) minus the universals

and the fvs of req must be all universal!

Since this is breaking change, we'd better get it done asap.

Agreed?

comment:22 Changed 4 years ago by simonpj

PS: this would fix #11010, by rejecting both the (bogus) examples given there.

comment:23 Changed 4 years ago by mpickering

I agree with Simon. Are we changing to either (1) or (2) from the original description as well?

comment:24 Changed 4 years ago by ekmett

I'd be happy with whatever resolution you want to make here.

My code will adapt.

comment:25 Changed 4 years ago by simonpj

I think (1), that is:

  • Only specifying one constraint corresponds to the required constraints, with empty provided constraints.

comment:26 Changed 4 years ago by goldfire

While we're changing the parser, we should make sure that explicit quantification is allowed. For example,

pattern P :: forall a b. ... => forall c d. ... => ....

should introduce a and b as universals, and c and d as existentials. The existentials should scope over the provided constraints and the arguments, but not the result. The universals scope over the whole shebang.

These should also be made available as scoped type variables in the pattern definition. There are four places where these variables might be in scope, labeled below:

pattern P = (1)
pattern P <- (2)
pattern P (3) ...
 ... where P = (4)

The universals should be in scope everywhere. The existentials should be in scope only in (3) and (4), I believe, but I'm really quite unsure.

comment:27 Changed 4 years ago by Simon Peyton Jones <simonpj@…>

In 04b0a73/ghc:

Pattern synonyms: swap provided/required

This patch swaps the order of provided and required constraints in
a pattern signature, so it now goes

      pattern P :: req => prov => t1 -> ... tn -> res_ty

See the long discussion in Trac #10928.

I think I have found all the places, but I could have missed something
particularly in comments.

There is a Haddock changes; so a submodule update.

comment:28 Changed 4 years ago by simonpj

Resolution: fixed
Status: newclosed

OK I've done the provided/required swap. Please look out for any inconsistencies where I've missed something

Simon

comment:29 Changed 4 years ago by mpickering

Simon, should you maybe change the parser so that prov/req refer to the right things in parsing. I admit I had a much simpler patch in mind which just made the change in the parser and pretty printer!

comment:30 Changed 4 years ago by simonpj

Ah, you mean

            {% do { let (flag, qtvs, req, prov, ty) = snd $ unLoc $4
                  ; let sig = PatSynSig $2 (flag, mkHsQTvs qtvs) req prov ty

Ie just variable naming. Yes, good point I'll swap the names.

Simon

comment:31 Changed 4 years ago by simonpj

BTW I have not implemented comment:26:

While we're changing the parser, we should make sure that explicit quantification is allowed

But someone else is welcome to!

comment:32 Changed 4 years ago by Simon Peyton Jones <simonpj@…>

In 9b3a058/ghc:

Swap prov/req in variable naming in Parser.y

This is a follow on to the patch for Trac #10928.
It's a local renaming of variables only; no change in behaviour.
Note: See TracTickets for help on using tickets.