Opened 3 years ago

Last modified 3 years ago

#13251 new bug

Must perform family consistency check on non-imported identifiers

Reported by: ezyang Owned by:
Priority: normal Milestone:
Component: Compiler (Type checker) Version: 8.1
Keywords: TypeFamilies Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: GHC accepts invalid program Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by rwbarton)

Currently, the family consistency check checks pairs of *imported* modules (and the modules they transitively import) for consistency. However, there are a number of mechanisms by which we can refer to an identifier from a module without explicitly importing it. Here is one example from Template Haskell:

-- A.hs
{-# LANGUAGE TypeFamilies #-}
module A where
type family F a

-- B.hs
{-# LANGUAGE TypeFamilies #-}
module B where
import A
type instance F Bool = String
g :: F Bool
g = "af"

-- C.hs
{-# LANGUAGE TypeFamilies #-}
module C where
import A
type instance F Bool = Int
h :: F Bool -> IO ()
h = print

-- D.hs
{-# LANGUAGE TemplateHaskell #-}
import C
import Language.Haskell.TH.Syntax

main = h $( return (VarE (Name (OccName "g") (NameG VarName (PkgName "main") (ModName "B")))) )

This does an unsafe coerce:

ezyang@sabre:~/Dev/labs/T13102$ ghc-head --make B.hs D.hs -fforce-recomp
[1 of 4] Compiling A                ( A.hs, A.o )
[2 of 4] Compiling B                ( B.hs, B.o )
[3 of 4] Compiling C                ( C.hs, C.o )
[4 of 4] Compiling Main             ( D.hs, D.o )
Linking D ...
ezyang@sabre:~/Dev/labs/T13102$ ./D
8070450533355229282

Clearly, checking consistency on imports is not enough: we must also check up on original names that come by other mechanisms. (Other ways we can end up with identifiers without imports include overloading, see ticket:13102#comment:13.

A few things to note about how to fix this:

  • Currently, type family instances are checked for consistency as we process imports. Template Haskell splices can occur much later in a Haskell file, so we must correspondingly do these checks later.
  • If we refer to an identifier by synthesizing a name manually, it is as if we imported it. This also means that a reference of this sort implies an implicit import of the defining module (#13102) and we should consider instances from it visible (at the moment, it's not considered visible.) (Actually, with TH, this is a bit tricky, because if we take these semantics, an instance might be visible below a top-level splice, but invisible above it.)
  • For wired in things, it's pretty easy to find out if we have an implicit import: if we bang on checkWiredInTyCon, that means we intended for the instance to visible; so we should collect all of the TyCons we banged on this way. For TH, this isn't exactly going to work, but maybe we can just track when NameGs get synthesized.

Change History (8)

comment:1 Changed 3 years ago by ezyang

Description: modified (diff)

comment:2 Changed 3 years ago by rwbarton

Not thoroughly thought out, but how about in checkWiredInTyCon (or something which wraps it) and in tcLookupGlobal, we check whether the thing lives in a module that was actually imported and, if not, add the relevant import information to tcg_imports, and either do the ensuing type family consistency checks immediately, or perhaps add them to tcg_pending_fam_checks?

I mean, we also have this behavior:

{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH.Syntax
main = print $( return (VarE (Name (OccName "sigFPE")
                             (NameG VarName (PkgName "unix-2.7.0.1")
                                    (ModName "System.Posix.Signals")))) )
rwbarton@morphism:~/ghc2$ ghc x/TH -fforce-recomp
[1 of 1] Compiling Main             ( x/TH.hs, x/TH.o )
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.5.0.0 ... linking ... done.
Loading package deepseq-1.3.0.2 ... linking ... done.
Loading package containers-0.5.5.1 ... linking ... done.
Loading package pretty-1.1.1.1 ... linking ... done.
Loading package template-haskell ... linking ... done.
Linking x/TH ...
x/TH.o:(.text+0x4b): undefined reference to `unixzm2zi7zi0zi1_SystemziPosixziSignals_sigFPE_closure'
x/TH.o: In function `S4hM_srt':
(.data+0x48): undefined reference to `unixzm2zi7zi0zi1_SystemziPosixziSignals_sigFPE_closure'
collect2: error: ld returned 1 exit status

If we updated the ImportAvails when we type checked the spliced thing, wouldn't that also create a dependency on the unix package at link time?

comment:3 Changed 3 years ago by rwbarton

Description: modified (diff)

comment:4 Changed 3 years ago by ezyang

Yes, I think that's along the ideas of what we ought to do, although Simon thinks it would be simpler if we didn't do the family instance checks until the very end of the entire module.

Template Haskell NameG on an external package which we don't actually depend on is an interesting problem. One possibility is that we should validate these references, and complain if the identifier comes from something we didn't depend on (I guess this is the preload set?) I haven't thought too carefully about this.

comment:5 Changed 3 years ago by simonpj

Here's a summary.

Instance providers

  • Define an instance provider thus: a module A is an instance provider for M if the class and family instances in A, and those of A's instance providers, are visible in M.
  • An orphan instance provider for M is an instance provider for M that is also an orphan module.

Instance providers come from three places:

  • M's direct imports are clearly import providers for M.
  • Less obviously, the use of a wired-in TyCon to deal with built-in syntax (e.g. overloaded lists) may add a new instance provider that was not (transitively) imported. A particular case is GHC.Exts.
  • Also less obviously, if Template Haskell splices in a definition like
    f = P.Q.g
    
    then P.Q becomes an instance provider, because the family instances available there may have been used to typecheck P.Q.g.

Consistency checking

We must do family-instance consistency checks for the transitive closure of all M's instance providers.

The easiest place to do this is right at the end of type checking, when we have a complete set of instance providers to check. The only downside is that, in the meantime, we could have inconsistent type-family instances, and we would need to account for that in type-family lookup (for example by returning not-found).

Instance lookup

Generally, we vigorously load instances into a single instance environment in the External Package State. But which ones are visible? Answer: the instance providers. We can use this to filter out the instance(s) we match on. But we need only check against the orphan instance providers because if the instance is in a no-orphan module X, then X must be in the instance providers else how could we be looking up that particular predicate?

Tracking instance providers

We can track the instance providers by accumulating

  • M's direct imports
  • Modules loaded by checkWiredInTyCon
  • Ditto for Template Haskell

comment:6 Changed 3 years ago by rwbarton

This plan sounds generally good.

Nitpick:

Less obviously, the use of a wired-in TyCon to deal with built-in syntax (e.g. overloaded lists) may add a new instance provider that was not (transitively) imported. A particular case is GHC.Exts.

The things involved in overloaded lists are fromListN and the IsList type class, both of which are known-key things, not wired-in. (The built-in list type is not the issue here.) Then "Tracking instance providers" needs to include this case as well.

comment:7 Changed 3 years ago by simonpj

Then "Tracking instance providers" needs to include this case as well

Yes, good point.

comment:8 Changed 3 years ago by rwbarton

GHCi lets you use a fully-qualified name to refer to an identifier without importing it (similar to TH), so I suppose we ought to treat a module used in this way as an instance provider, as well. Example from slyfox, involving an orphan instance.

ghci -ignore-package=regex-tdfa-rc
GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/slyfox/.ghci
Prelude C TP CM> "a" Text.Regex.TDFA.=~ "^(a)|(a)$" ::[[ String ]]

<interactive>:1:1: error:
    • No instance for (Text.Regex.Base.RegexLike.RegexMaker
                         Text.Regex.TDFA.Common.Regex
                         Text.Regex.TDFA.Common.CompOption
                         Text.Regex.TDFA.Common.ExecOption
                         [Char])
        arising from a use of ‘Text.Regex.TDFA.=~’
    • In the expression:
          "a" Text.Regex.TDFA.=~ "^(a)|(a)$" :: [[String]]
      In an equation for ‘it’:
          it = "a" Text.Regex.TDFA.=~ "^(a)|(a)$" :: [[String]]
Prelude C TP CM> :m Text.Regex.TDFA
Prelude Text.Regex.TDFA> "a" Text.Regex.TDFA.=~ "^(a)|(a)$" ::[[ String ]]
[["a","a",""]]

(Actually, we could in principle decide freely whether or not to provide the instances here in the sense of making them visible; but we definitely have to do the family instance consistency checks.)

Note: See TracTickets for help on using tickets.