Opened 4 years ago

Last modified 4 years ago

#10996 new bug

family is treated as keyword in types even without TypeFamilies enabled

Reported by: oerjan Owned by:
Priority: low Milestone:
Component: Compiler Version: 7.10.2
Keywords: TypeFamilies Cc: oerjan
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: GHC rejects valid program Test Case: type Test family = family
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

The following is as far as I know correct H2010 code, but breaks in GHC:

GHCi, version 7.10.2: http://www.haskell.org/ghc/  :? for help
Prelude> type Test family = family

<interactive>:2:11: parse error on input `family'
Prelude>

Change History (8)

comment:1 Changed 4 years ago by oerjan

Just as a musing, I think it might just be possible to treat family entirely as a non-keyword like as or hiding. The only kind of ambiguity I can think of that requires lookahead is with TypeOperators enabled and

type family MyTypeFamily a b = ...

type family :~: work = ...

but even then, it is disambiguated by whether the next token is an identifier or an operator.

comment:2 Changed 4 years ago by simonpj

Indeed. In Parser.y we see:

tyvarid :: { Located RdrName }
        : VARID            { sL1 $1 $! mkUnqual tvName (getVARID $1) }
        | special_id       { sL1 $1 $! mkUnqual tvName (unLoc $1) }
        | 'unsafe'         { sL1 $1 $! mkUnqual tvName (fsLit "unsafe") }
        | 'safe'           { sL1 $1 $! mkUnqual tvName (fsLit "safe") }
        | 'interruptible'  { sL1 $1 $! mkUnqual tvName (fsLit "interruptible") }

varid :: { Located RdrName }
        : VARID            { sL1 $1 $! mkUnqual varName (getVARID $1) }
        | special_id       { sL1 $1 $! mkUnqual varName (unLoc $1) }
        | 'unsafe'         { sL1 $1 $! mkUnqual varName (fsLit "unsafe") }
        | 'safe'           { sL1 $1 $! mkUnqual varName (fsLit "safe") }
        | 'interruptible'  { sL1 $1 $! mkUnqual varName (fsLit "interruptible")}
        | 'forall'         { sL1 $1 $! mkUnqual varName (fsLit "forall") }
        | 'family'         { sL1 $1 $! mkUnqual varName (fsLit "family") }
        | 'role'           { sL1 $1 $! mkUnqual varName (fsLit "role") }

-- These special_ids are treated as keywords in various places,
-- but as ordinary ids elsewhere.   'special_id' collects all these
-- except 'unsafe', 'interruptible', 'forall', 'family', and 'role',
-- whose treatment differs depending on context
special_id :: { Located FastString }
special_id
        : 'as'                  { sL1 $1 (fsLit "as") }
        | 'qualified'           { sL1 $1 (fsLit "qualified") }
        | 'hiding'              { sL1 $1 (fsLit "hiding") }
        | 'export'              { sL1 $1 (fsLit "export") }
        | 'label'               { sL1 $1 (fsLit "label")  }
        | 'dynamic'             { sL1 $1 (fsLit "dynamic") }
        | 'stdcall'             { sL1 $1 (fsLit "stdcall") }
        | 'ccall'               { sL1 $1 (fsLit "ccall") }
        | 'capi'                { sL1 $1 (fsLit "capi") }
        | 'prim'                { sL1 $1 (fsLit "prim") }
        | 'javascript'          { sL1 $1 (fsLit "javascript") }
        | 'group'               { sL1 $1 (fsLit "group") }

The idea of special_id is a good one, because it avoids stealing keywords.

The comment claims that 'unsafe', 'safe', and 'interruptible' have treatment that varies by context; but that claim seems bogus. The ONLY uses of special_id is the code above, so adding those three to special_id and removing them from varid and tyvarid should be a useful simplification. Can someone try?

'forall' is different because in terms it is not a keyword, but in types it is. The lexer therefore only recognises it when specific flags are on.

'role' and 'family' are the warts. Can we put them in special_id? I think (but I am not certain) that the reason they are not there is to exclude them from tyvarid. Why do that? I think (but I am not sure) that the reason is to resolve the ambiguity in comment:1. If someone felt able to resolve that ambiguity some other way, it'd be great. Otherwise this really is a bug.

This isn't really hard, I think; just needs someone to to think clearly for a while.

Thanks

Simon

comment:3 Changed 4 years ago by oerjan

family and role are really the same as forall: they can currently be used as varids, even with their extensions enabled. The error is that they can never be used as tyvarids. (I think my comment:1 applies to role as well, but not to forall, because forall can appear in places where an applied type constructor variable would fit.) However, forall actually does appear to be special cased. Without any extensions enabled:

Prelude> type Test role = role

<interactive>:2:11: parse error on input `role'
Prelude> type Test forall = forall
Prelude>

comment:4 Changed 4 years ago by oerjan

As I said, in principle there may be no real ambiguity, but unfortunately it seems like the head of these constructions is parsed as a type (nonterminal) token, and the check for whether it is of the right form is only done after the Happy parsing:

Prelude> :set -XTypeFamilies
Prelude> type family' a = a

<interactive>:28:6:
    Malformed head of type or class declaration: family' a
Prelude> type family a = a

<interactive>:30:13: Malformed head of type or class declaration: a

A token that parses those and only those types that can appear as the head of these declarations (in particular, disallowing applied type variables) might solve this problem. (Unless, of course, there's some ambiguous case that I've forgotten, but that should hopefully show up as a reduce/* conflict.)

comment:5 Changed 4 years ago by simonpj

Oerjan, that's exactly right. I'm not sure how much effort it's worth devoting to this, but I agree that that the current situation is a mess:

  • Most things are dealt with as special_id; good
  • 'forall' has its own special lexer treatment, so at least it is not stolen until you switch on ExplicitForAlls
  • But 'role' and 'family' are stolen all the time, in types. Which is a Haskell 98 or 2010 violation.

It's annoying!

comment:6 Changed 4 years ago by oerjan

I think this would have been so much easier to hack around if there were a way to tell Happy that a rule should resolve all its shift/reduce conflicts by reducing, without assigning precedence to any tokens (which could mess up other shift/reduce resolutions). Then you'd just declare two tokens family_kw and role_kw, and two rules

...
family_kw : 'family' %prefer_reduction { ... }
...
role_kw : 'role' %prefer_reduction { ... }

and use those tokens instead in the keyword-using declarations.

comment:7 Changed 4 years ago by oerjan

I note this quote in lexer.x:

One might think that we wish to treat 'family' and 'role' as regular old varids whenever -XTypeFamilies and -XRoleAnnotations are off, respectively. But, there is no need to do so. These pseudo-keywords are not stolen syntax: they are only used after the keyword 'type' at the top-level, where varids are not allowed. Furthermore, checks further downstream (TcTyClsDecls) ensure that type families and role annotations are never declared without their extensions on. In fact, by unconditionally lexing these pseudo-keywords as special, we can get better error messages.

Also, note that these are included in the varid production in the parser -- a key detail to make all this work.

So clearly it was intended that they not be stolen, but something got messed up, perhaps when declaring types with TypeOperators were allowed.

comment:8 Changed 4 years ago by thomie

Keywords: TypeFamilies added
Note: See TracTickets for help on using tickets.