Opened 13 months ago

Last modified 13 months ago

## #16148 new bug

# Better type inference for Constraint vs Type

Reported by: | simonpj | Owned by: | |
---|---|---|---|

Priority: | normal | Milestone: | |

Component: | Compiler | Version: | 8.6.3 |

Keywords: | Cc: | ||

Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |

Type of failure: | None/Unknown | Test Case: | |

Blocked By: | Blocking: | ||

Related Tickets: | #11621, #11715, #13742, #16139 | Differential Rev(s): | |

Wiki Page: |

### Description

An enduring infelicity in GHC’s type inference is the treatment of tuples. Consider, this from `RaeJobTalk`

:

type family F c sch where F _ '[] = (() :: Constraint) F c (col ': cols) = (c Int, F c cols)

When kind-checking `F`

, we initially give it a return kind of kappa, a unification variable. Suppose we kind-check the second equation first. Then what kind of tuple is it, a constraint tuple, or an ordinary type tuple. We must unify kappa with `Type`

or `Constraint`

respectively – and they are different kinds. How can we choose which?
Currently we have the following unsatisfactory ruse:

- Look at the “expected kind”. If we kind-check the second equation first we learn nothing from that
- Infer kinds for all of the elements of the type. Again in this case we learn nothing.
- If we still know nothing, arbitrarily pick Type

This is horrible. If we kind check the first equation first, step 1 will discover Constraint, and kind checking succeeds. But if we put the equations in the other order, kind checking fails. Ugh!

Another example with exactly the same cause is #16139.

What to do? I can only think of three approaches:

- Add a pseudo constraint
`(TC kappa)`

meaning “kappa must eventually turn out to be either`Type`

or`Constraint`

”. When generalising, default to`Type`

.

- Same, but by having a special sort of TauTv, a TCTv, which can only unify with Type or Constraint. Again, do not generalise over such type variables.

- Define
type Type = TYPE (LiftedRep T) type Constraint = TYPE (LiftedRep C)

So now we can unify

`kappa`

with`(TYPE (LiftedRep zeta))`

where`zeta`

is a unification variable that we must eventually unify with`T`

or`C`

.

Of these, (C) seems most principled, but somehow over-elaborate for our purposes. And the others also seem to be rather too much work to solve a simple, local problem.

For reference, here are the current kinding rules

t1 : TYPE rep1 t2 : TYPE rep2 (FUN) ---------------- t1 -> t2 : Type t1 : Constraint t2 : TYPE rep (PRED1) ---------------- t1 => t2 : Type t1 : Constraint t2 : Constraint (PRED2) --------------------- t1 => t2 : Constraint ty : TYPE rep `a` is not free in rep (FORALL1) ----------------------- forall a. ty : TYPE rep ty : Constraint (FORALL2) ------------------------- forall a. ty : Constraint

### Change History (6)

### comment:1 Changed 13 months ago by

### comment:2 Changed 13 months ago by

(C) doesn't work

Are you sure? I was not expecting any quantification whatsoever over the T/C thing. You say "don't worry about levity polymorphism", but the example you give is explicitly ruled out by our lev-poly story. Can you give an example that would be allowed by lev-poly but would give rise to instantiation problems?

### comment:3 Changed 13 months ago by

Related Tickets: | → #16139 |
---|

### comment:4 Changed 13 months ago by

Related Tickets: | #16139 → #11621, #11715, #13742, #16139 |
---|

### comment:5 Changed 13 months ago by

You say "don't worry about levity polymorphism", but the example you give is explicitly ruled out by our lev-poly story

Ah, that's not true: see Section 7.3 of the levity-polymorphism paper, where we discuss functions like

absX :: forall (r :: RuntimeRep) (a :: TYPE r). NumX a => a -> a

where `NumX :: forall (r :: RuntimeRep). TYPE r -> Constraint`

is levity-polymorphic. Here we allow `absX`

to have an arrow `a -> a`

with a levity-polymorphic argument; it's just *term binders* that can't have levity-polymorphic types.

Another point is that messing with TYPE etc affects the entire compiler. A major merit of plan (A) is that it affects the type inference engine only and is then gone.

### comment:6 Changed 13 months ago by

Yes, I fully agree with comment:5. Especially the last bit, about affecting only the type inference engine.

**Note:**See TracTickets for help on using tickets.

Richard responds: This has come up before, of course. I've always had an affinity for choice (A). It's easy to describe, even to users, and I think it will lead to nice error messages (which can special-case printing unsolved TC constraints). I don't think it would take all that much code.

(C) doesn't actually work, tempting though it may be. Consider (

Don't worry about levity-polymorphism restrictions in binding positions, as that's not the issue here. The question is: does f take 1 visible argument or 0? The answer depends on the choice of r (under plan C), because r might be LiftedRep C. So have no way of reliably instantiating f's type at an occurrence and are thus dead in the water.

Plan D:

That is, make the T/C distinction totally orthogonal to representation, and simply disallow any quantification over a type whose kind has a variable in the TC position. This has the nice side effect of allowing unlifted constraints (e.g., an unboxed implicit parameter or a strict dictionary). But it has the significant drawback of polluting the type of

`(->)`

, which would now beIndeed, we have considered this route before, here.

The main drawback I see from that abandoned approach in the proposal is that it banned newtype-classes. I don't think we have a satisfactory way (yet) of eliminating newtype-classes while keeping, e.g., reflection working. (This matters: singletons uses the reflection trick to change exponentially-slow processes to linear ones.)

Summing up, I see two ways forward:

Simon's (B) would work, but it seems strictly inferior to (A), simply because a new TyVar flavor is a more obscure approach to getting the behavior we would see in (A), whereas (A) uses an abstraction (a class-like constraint) that everyone knows and loves.

Between these two approaches, (A) is at least an order of magnitude easier. So I vote for that. :) I do believe, though, that we'll have to do (D) eventually, though.