Opened 3 years ago

Last modified 9 months ago

#13092 new bug

family instance consistency checks are too pessimistic

Reported by: rwbarton Owned by: rwbarton
Priority: normal Milestone: 8.10.1
Component: Compiler Version: 8.1
Keywords: Cc: RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #13719 Differential Rev(s): Phab:D2992, Phab: D3603
Wiki Page:

Description

FamInst has this logic for checking type family instance consistency among imports:

Note [Checking family instance consistency]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For any two family instance modules that we import directly or indirectly, we
check whether the instances in the two modules are consistent, *unless* we can
be certain that the instances of the two modules have already been checked for
consistency during the compilation of modules that we import.

[...]

How do we know which pairs of modules have already been checked?  Any pair of
modules where both modules occur in the `HscTypes.dep_finsts' set (of the
`HscTypes.Dependencies') of one of our directly imported modules must have
already been checked.  Everything else, we check now.  (So that we can be
certain that the modules in our `HscTypes.dep_finsts' are consistent.)

However, suppose one of the modules A we import directly is itself a type family instance module. Then it too has been checked for consistency with its dependencies B, C, etc., so we should skip checking the pairs A & B, A & C, etc.

The current behavior means that whenever we directly import a type family instance module, we still have to load the interface files for all its type family module dependencies, which largely defeats the purpose of the optimization in this case.

Change History (20)

comment:1 Changed 3 years ago by rwbarton

Actually this is wrong! I managed to write unsafeCoerce in my compiler with this change.

The reason is that type family instances defined in a module A are not actually checked for consistency with all its family instance dependencies B, C, ...; that seems to only happen when the interface file for (say) B is read for some other reason. I'm not sure yet whether that means even the existing consistency checks are inadequate; but it's at least surprising that you can write a module that has a type family instance that conflicts with an indirect dependency.

Just to say it again explicitly, because it seems strange: when compiling a module containing a type family instance, GHC does not read all its indirect dependencies which have type family instances to ensure consistency.

comment:2 Changed 3 years ago by rwbarton

I managed to break type family consistency in 8.0.1 using ghc -c alone, but apparently it was in some way that fooled the recompilation checker and it can't be reproduced from a clean build. So I don't know how I did it.

In any case, it seems more logical for the type family instances in A to be checked against B while compiling A, rather than forcing every importer of A to do it.

comment:3 Changed 3 years ago by goldfire

This sounds terrible. Would you mind sharing your test files?

comment:4 Changed 3 years ago by RyanGlScott

Cc: RyanGlScott added

comment:5 Changed 3 years ago by simonpj

Actually there is some justification for the status quo. Here it is:

  • If A imports B, and A has type instance F Int = Bool and B has type instance F Int = Char, then any attempt to use F Int, in A or in any module A imports, will get a "overlapping instance" error on lookup. I think. So the conflict would be reported, but lazily.
  • In contrast, if A imports B and C, and B and C have those instance declarations, then compiling A might never need to reduce F Int, so we must eagerly check for consistency.

I'm intrigued about how you managed to write unsafeCoerce.

But regardless, I think it'd be better and more consistent to eagerly check consistency of all the new family instances in A with all those in modules it imports. Then instance lookup should never find more than one match.

comment:6 Changed 3 years ago by rwbarton

I managed to write unsafeCoerce in 8.0.1 without involving recompilation tricks. Compile these modules in order with ghc -c -XTypeFamilies -fforce-recomp (then you can do a final link with ghc -o Main Main.hs -XTypeFamilies):

module A where
type family A a
module B (A, X) where
import A
data X
type instance A (X, b) = ()
{-# LANGUAGE RankNTypes #-}
module C (x) where
import Data.Proxy
import B
x :: Proxy b -> (forall t. Proxy t -> Bool -> A (t, b)) -> (Bool -> ())
x _ f = f (undefined :: Proxy X)
module Main where

import Data.Proxy
import A
import C

data Y
type instance A (a, Y) = Bool

y :: Proxy a -> Bool -> A (a, Y)
y _ = id

z :: Bool -> ()
z = x (undefined :: Proxy Y) y

main = print (z True)

Main has been rigged to not directly mention any names defined in B, by creating the intermediate module C. When Main is compiled the interface file for B is not read at all!

There is a kind of off-by-one error in the check. The logic in the Note [Checking family instance consistency] correctly takes into account the fact that we do not check consistency of family instances in a module with those in its dep_finsts, and so if you try to import Main in another module (even one that is otherwise empty) GHC will report the overlap. But that still leaves a window of one module in which the overlap can be exploited.

Even if not for this example I agree with

But regardless, I think it'd be better and more consistent to eagerly check consistency of all the new family instances in A with all those in modules it imports. Then instance lookup should never find more than one match.

I think it's even more efficient overall, since we currently do those checks in the importers of A instead, which usually will be at least one module. If we do them while compiling A then the optimization that this ticket was originally about will be sound and we can avoid the checks in the importers.

I also figured out what my example involving recompilation was, I'll file a separate ticket for that.

comment:7 Changed 3 years ago by rwbarton

See #13099 for the recompilation issue.

comment:8 Changed 3 years ago by simonpj

When Main is compiled the interface file for B is not read at all!

Ah, that is the bit I was missing! (It's not read, because nobody mentions X.) If B.hi was read we would (I hope) get an overlap error when solving (A (a,y)).

Well done! In fixing this, do give a careful account of the invariants etc.

Simon

comment:9 in reply to:  8 Changed 3 years ago by rwbarton

Replying to simonpj:

When Main is compiled the interface file for B is not read at all!

Ah, that is the bit I was missing! (It's not read, because nobody mentions X.) If B.hi was read we would (I hope) get an overlap error when solving (A (a,y)).

Earlier I had a simpler version (without the proxy/CPS stuff in x and y, and using visible type application in z) that was reading B.hi, because the type of x was A (X, b) -> () which mentions X, which is defined in B. But it didn't read it until it got to z, at which point it had already type checked y, it seems. So it never noticed the conflict.

comment:10 in reply to:  5 ; Changed 3 years ago by goldfire

Replying to simonpj:

Actually there is some justification for the status quo.

... except that, regardless of Reid's excellent example in comment:6, having F Int = Bool and F Int = Char in scope means that we can write a proof of Bool ~ Char in Core. Continuing the status quo means that Core Lint is much less useful.

comment:11 in reply to:  10 Changed 3 years ago by simonpj

I agree with comment:10. Let's fix this. Reid do you know how to proceed?

comment:12 Changed 3 years ago by rwbarton

I think so. My plan is to proceed as in 608cad595c033ba89e757c8f61a9b791a7085367 though of course it still needs explanation and tests. The new code is basically copied from tcRnImports where it loads orphan modules.

comment:13 Changed 3 years ago by rwbarton

Differential Rev(s): Phab:D2992

comment:14 Changed 3 years ago by simonpj

I think the Phab is fine, but here's a possible thought for improvement to checkFamInstConsistency. Currently it does an all-pairs enumeration, and filters out pairs that are already known consistent. That sounds like enumerating a large set and then filtering it.

What about this. Take just two imported modules A and B, with dep_finsts dA and dB resp. So we know that A + dA is consistent and B + dB is consistent.

What extra parirwise checks do we need to make to ensure that uAB is consistent, where

  uAB = A + dA + B + dB

(I'm using + for set union.) Well, define

   iAB = (A + dA) `intersect` (B + dB)

so that iAB is the intersection; certainly it is consistent all of uAB. Now define

  xA = (A + dA) - iAB
  xB = (B + dB) - iAB

If we check all the pairs with one from xA and one from xB we are done. So check those and form the union uAB. Now continue by adding in one more module, and keep doing that a time.

The bigger the intersection the better.

I'm not certain that is better, but it smells better.

comment:15 Changed 2 years ago by bgamari

Milestone: 8.2.18.4.1

Given that 8.2.1-rc1 is imminent, I'm bumping these off to the 8.4

comment:16 Changed 2 years ago by niteria

Differential Rev(s): Phab:D2992Phab:D2992, Phab: D3603

comment:17 Changed 20 months ago by bgamari

Milestone: 8.4.18.6.1

This ticket won't be resolved in 8.4; remilestoning for 8.6. Do holler if you are affected by this or would otherwise like to work on it.

comment:18 Changed 15 months ago by bgamari

This will not be addressed in GHC 8.6.

comment:19 Changed 15 months ago by bgamari

Milestone: 8.6.18.8.1

These will not be addressed in GHC 8.6.

comment:20 Changed 9 months ago by osa1

Milestone: 8.8.18.10.1

Bumping milestones of low-priority tickets.

Note: See TracTickets for help on using tickets.