Opened 5 years ago

Last modified 16 months ago

#9334 new feature request

Implement "instance chains"

Reported by: diatchki Owned by: diatchki
Priority: normal Milestone:
Component: Compiler (Type checker) Version: 7.9
Keywords: Cc: ghc@…, garrett, jstolarek, tomberek, redneb, sighingnow
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

It would be useful to implement a version of "instance chains" [1] in GHC, which would eliminate the need for OVERLAPPING_INSTANCES in most (all practcial?) programs.

The idea is that programmers can explicitly group and order instances into an "instance chain". For example:

instance (Monad m)                  => StateM (StateT s m) s where ...
else     (MonadTrans t, StateM m s) => StateM (t m)        s where ... 

When GHC searches for instances, the instances in a chain are considered together and in order, starting with the first one:

  1. If the goal matches the current instance's head, then this instance is selected and the rest are ignored, as if they were not there;
  1. If the goal does not match the current instance's head, AND it does not unify with the current instance's head, then we skip the instance and proceed to the next member of the chain;
  1. If the goal does not match the current instance's head, but it does unify with it, then we cannot use this chain to solve the goal.

In summary: earlier instances in a chain "hide" later instances, and later instances can be reached only if we are sure that none of the previous instance will match.

[1] http://web.cecs.pdx.edu/~mpj/pubs/instancechains.pdf

Change History (19)

comment:1 Changed 5 years ago by goldfire

I agree with Iavor -- it would be great to see instance chains for real.

There are two further observations I'd like to make, though:

  1. Instance chains can, I believe, be simulated accurately with closed type families. The encoding is bulky, and I think having real instance chains is much better than what we have in 7.8, but an eager programmer can use closed type families today. For example, Iavor's example could be written
type family ChooseStateMInstance x y where
  ChooseStateMInstance (StateT s m) s = 0
  ChooseStateMInstance (t m) s = 1

class StateM' (n :: Nat) x y where ...

instance Monad m => StateM' 0 (StateT s m) s where ...
instance (MonadTrans t, StateM m s) => StateM' 1 (t m) s where ...

type StateM s m = StateM' (ChooseStateMInstance s m) s m

Like I said, it's not pretty, but I believe it works.

  1. This doesn't necessarily mean that we'll never need overlapping instances -- instance chains seem to only work when the overlap would be contained only in one module. Some programs require inter-module overlap (say, for a global "default" instance).

comment:2 Changed 5 years ago by diatchki

Actually, there is no requirement that an instance chain be closed and, in fact, I was thinking of just starting with open ones. Of course, one could do the same sort of encoding using open TFs.

However, the encoding does not fully subsume instance chains. For example, using instance chains I could write instances like these:

instance           Show (MyContainer Char) where ...
else     Show a => Show (MyContainer a)    where ...

I couldn't do this with the encoding because the Show class does not have the extra parameter that would be needed.

Not only is the encoding not pretty (imagine the type error you'd get for a missing instance), but using it requires enabling some fancy machinery (TFs, which pulls in FC, and reasoning about equality, etc.). Of course, we already have all this, but somehow it feels like implementing an easy idea, using some very advanced tools and, perhaps, it is better if we do not entangle these two.

As for point (2), instance chains pretty much cover all the situations where I have wanted to use overlap (note the emphasis on I :-). For example, I wouldn't provide a global "default" instance, because it is too error prone. For the sake of concreteness, here is what I am referring to:

class MyShow a where myShow :: a -> String

-- default instance
instance {-# OVERLAP #-} MyShow a where myShow _ = "(can't show this)"


showInParens x = "(" ++ myShow x ++ ")"

In this example, showInParens would very likely not do what we intended, because it will commit to the "default" instance prematurely. Of course, this is just one example, but I think it is fairly representative of the difficulties inherent in using cross-module overlapping instances.

comment:3 Changed 5 years ago by simonpj

Some thoughts

  • The instance chains described in the instance-chain paper are much more elaborate than your proposal here; in particular they involve backtracking search and a "fails" possibility. I imagine that is a deliberate narrowing of the specification on your part.
  • The behaviour you specify for instance chains is, I think, precisely what GHC does for overlappping instances when they are all declared in the same module. See the bullets at the end of 7.6.3.5 in the user manual. I grant that putting all the overlapping equations together in one place is clearer, just as with closed type families. But you have the behaviour you want right now, I think.
  • I think you are arguing that we should replace overlapping instances with instance chains. That would render illegal any program that uses overlaping instnaces spread across modules. I suspect that would make many people cry, so we'd end up with both.

If I have this right, its just a question of whether to support a chained syntax.

Simon

comment:4 in reply to:  3 ; Changed 5 years ago by diatchki

Replying to simonpj:

  • The instance chains described in the instance-chain paper are much more elaborate than your proposal here; in particular they involve backtracking search and a "fails" possibility. I imagine that is a deliberate narrowing of the specification on your part.

Yeah, I thought that we should start simple :) I'll try to meet with Mark to understand better how the full system should work. For now, I just wrote up the part that I think fits easily with GHC.

  • The behaviour you specify for instance chains is, I think, precisely what GHC does for overlappping instances when they are all declared in the same module. See the bullets at the end of 7.6.3.5 in the user manual. I grant that putting all the overlapping equations together in one place is clearer, just as with closed type families. But you have the behaviour you want right now, I think.

Interestingly, even in my simple version, instance chains are a bit more expressive, because of the explicit ordering of instances. So we can write things like this:

instance C Int a    where ...
else     C a   Int  where ...

I am not sure how common cases like these are, but it is worth noting.

  • I think you are arguing that we should replace overlapping instances with instance chains. That would render illegal any program that uses overlaping instnaces spread across modules. I suspect that would make many people cry, so we'd end up with both.

Ah, not yet! I think the two can coexist for a while. Once we have a working version of instance chains we can see if existing overlapping instances code can be replaced, and if not, why not.

If I have this right, its just a question of whether to support a chained syntax.

For the simple implementation, I think I'll start by adding the syntax (in all passes of the front-end), and then modifying InstEnv to keep track of instance-chains rather than individual instances.

comment:5 Changed 5 years ago by simonpj

OK, by all means. Honestly, I am not (yet) convinced that benefit is worth the extra complexity. Do try to share code with the type-family apartness stuff; the paper on closed type families would be a good reference.

Simon

comment:6 Changed 5 years ago by diatchki

I've chatted with Mark and Garrett (the authors of the "Instance Chains" papers) and we've decided that there are really three separate features here:

  1. "Instance Groups", which is what is outlined in this ticket, and enables programmers to order instances explicitly, rather than using more/less general realtions.
  1. "Fails instances", which are of the form instance Num Char fails; they enable programmers to state explicitly that an instance should never exits. Interestingly, I just found a very related ticket asking for the same sort of thing (#7775).
  1. "Using instance contexts when selecting instances (aka backtracking)": currently, if the head of an instance matches a goal, GHC commits to it and then fails if it encounters an error; an alternative design would be to back-track and try a different option (e.g., next member of an instance group, or a more general matching instance).

I think that (1) and (2) are useful and shouldn't be too hard to implement in GHC. (3), however, seems like more work. Also, there are programs that rely on GHC's current behavior.

comment:7 Changed 5 years ago by Lemming

Cc: ghc@… added

comment:8 in reply to:  4 Changed 5 years ago by AntC

Replying to diatchki:

Interestingly, even in my simple version, instance chains are a bit more expressive, because of the explicit ordering of instances. So we can write things like this:

instance C Int a    where ...
else     C a   Int  where ...

I am not sure how common cases like these are, but it is worth noting.

I suspect they're rare, but yes they are problematic. Can't you always resolve this today with an instance at the intersect?

 instance C Int Int  where ...
 instance C Int a    where ...
 instance C a   Int  where ...

(The where's body for C Int Int would be the same as C Int a to match Ivor's example.) Probably for this to work all three instances must be in the same module. The main awkwardness is that GHC still sees the partially overlapping two instances and gets upset (wants XIncoherentInstances). If only it could realise there is no incoherence!

comment:9 Changed 5 years ago by aavogt

Replying to diatchki:

  1. "Fails instances", which are of the form instance Num Char fails; they enable programmers to state explicitly that an instance should never exits. Interestingly, I just found a very related ticket asking for the same sort of thing (#7775).

In some sense this can already be done:

class FailHasNoInstances a => Fail a
class FailHasNoInstances a -- not exported to ban Fail instances

instance Fail "Char may not have a Num instance" => Num Char
main = print $ '1' + '1' 
{- has a compile failure:
    No instance for (Fail "Char may not have a Num instance")
      arising from a use of ‘+’
    In the second argument of ‘($)’, namely ‘'1' + '1'’
    In the expression: print $ '1' + '1'
    In an equation for ‘main’: main = print $ '1' + '1'
-}
  1. "Using instance contexts when selecting instances (aka backtracking)": currently, if the head of an instance matches a goal, GHC commits to it and then fails if it encounters an error; an alternative design would be to back-track and try a different option (e.g., next member of an instance group, or a more general matching instance).

Perhaps with a class like

class HasInstance (cxt :: Constraint) (b :: Bool) | cxt -> b

you can encode backtracking without too much pain, and the meaning of existing programs does not change.

comment:10 Changed 5 years ago by garrett

Cc: garrett added

comment:11 Changed 4 years ago by jstolarek

Cc: jstolarek added

comment:12 Changed 4 years ago by tomberek

Cc: tomberek added

comment:13 in reply to:  6 Changed 3 years ago by Iceland_jack

Replying to diatchki:

  1. "Fails instances", which are of the form instance Num Char fails; they enable programmers to state explicitly that an instance should never exits. Interestingly, I just found a very related ticket asking for the same sort of thing (#7775).

In case this is relevant: Using Any as a superclass we can effectively forbid all instances, of course the compiler is not privy to this.

Last edited 3 years ago by Iceland_jack (previous) (diff)

comment:14 in reply to:  9 ; Changed 3 years ago by Iceland_jack

Replying to aavogt:

In some sense this can already be done:

class FailHasNoInstances a => Fail a
class FailHasNoInstances a -- not exported to ban Fail instances

instance Fail "Char may not have a Num instance" => Num Char
main = print $ '1' + '1' 

Is there a difference between using Fail and Proposal/CustomTypeErrors? (see #11967)

instance TypeError (Text "Char may not have a Num instance") => Num Char

Another approach that uses the module system to hide this class and type:

class Fail t
data TypeExists t

and uses it as

instance Fail (TypeExists t) => HasNone t (Cons t ts)
Last edited 3 years ago by Iceland_jack (previous) (diff)

comment:15 in reply to:  1 Changed 3 years ago by Iceland_jack

Replying to goldfire:

  1. This doesn't necessarily mean that we'll never need overlapping instances -- instance chains seem to only work when the overlap would be contained only in one module. Some programs require inter-module overlap (say, for a global "default" instance).

Here's some data from Type Classes and Instance Chains: A Relational Approach

Our first question was how frequently the open-endedness of overlapping instances was necessary in practice. To answer this question, we determined whether the instances in each set were located in the same module, in different modules within the same package, or in different packages (Figure 3.1). Out of the 123 sets, 19 included overlapping instances from different modules, and 6 (of those 19) included overlapping instances from different packages. THe[sic] majority (104, or 85%) of the sets only included instances from a single module. This suggests that, while applications exist for instances overlapping across modules, most overlapping instances are defined locally.

— page 37

comment:16 in reply to:  14 Changed 3 years ago by aavogt

Replying to Iceland_jack:

Replying to aavogt: Is there a difference between using Fail and Proposal/CustomTypeErrors? (see #11967)

Not much difference. All I can think of is that TypeError requires ghc >= 8 to give prettier messages.

comment:17 in reply to:  14 Changed 3 years ago by garrett

Replying to Iceland_jack:

Another approach that uses the module system to hide this class and type:

class Fail t
data TypeExists t

and uses it as

instance Fail (TypeExists t) => HasNone t (Cons t ts)

This isn't really the goal of fails instances (or other negative information) in instance chains. Negative information is used to direct instance selection; encodings, like this one, are only useful for the same purpose if the compiler knows that the Fail (TypeExists t) predicate is unsatisfiable.

For an example that doesn't rely on fails, consider the following

class F t u | t -> u
instance F Bool Int

class C t 
  where f :: t -> t -> t
instance F t Bool => C t 
  where f x y = x
else C t 
  where f x y = y

We expect that f True False evaluates to False; but, to know that the first clause of the chain does not apply, we need to know that the constraint F Bool Bool cannot hold (not just that it does not hold at the current point). In this case, we know that because of the functional dependency on F.

This is not to say that a sufficient smart solver might not be able to determine that the Fail (...) or Any constraints are unsatisfiable. Just that a design which attempts to use these constraints in the role of the fails constraints in instance chains must presuppose such a solver.

comment:18 Changed 16 months ago by redneb

Cc: redneb added

comment:19 Changed 16 months ago by sighingnow

Cc: sighingnow added
Note: See TracTickets for help on using tickets.