Opened 4 years ago
Last modified 4 years ago
#11475 new bug
Lint should check for inexhaustive alternatives
Reported by: | simonpj | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Compiler | Version: | 7.10.3 |
Keywords: | Cc: | anselm.scholl@… | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | None/Unknown | Test Case: | |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description (last modified by )
GHC's simplifier prunes inaccessible alternatives from case expressions. E.g.
case x of A y -> e1 _ -> ....(case x of A y -> r1 B -> r2 C -> r3) ...
It'll prune the A
alternative from the inner case to get
case x of A y -> e1 _ -> ....(case x of B -> r2 C -> r3) ...
BUT, there is no independent check from Lint that this pruning does the Right Thing. Yet, lacking such a check, a program can pass Lint and then seg-fault, which is Very Bad.
Trac #11172 is an example. It turned out to be a bug in the pruning, which led to something like
case x of Left y -> e1
but in fact x
ended up being bound to (Right v)
for some value v
. But because there is only a single alternative left, GHC does not test the tag, but instead goes ahead and accesses it the Right
value as if it was a Left
value. And because of pointer-tagging, it'll assume the wrong tag on the pointer, and very soon you are in seg-fault land.
So this ticket is to suggest: make Core Lint do it's best to check that all cases are in fact exhaustive. There are two ways in which we prune cases:
- Enclosing pattern matches (as above)
- GADT pattern matches may be exhaustive even when they don't mention all constructors.
For the latter, it's impossible for Lint to reproduce all the reasoning of the type checker, but I bet that in 99.5% of the cases the existing function dataConCantMatch
will do the job.
So if Lit
- Maintains an environment saying what each variable can't match (as the simplifier does)
- Does a simple-minded
dataConCantMatch
test based on the typ of the scrutinee
I think we'd be a long way forward. It's possible that a valid program could trigger a Lint report, but let's see.
Any volunteers?
Change History (6)
comment:1 Changed 4 years ago by
Description: | modified (diff) |
---|
comment:2 Changed 4 years ago by
comment:3 Changed 4 years ago by
I suggest being less ambitious! Do the simple things first.
case (case ... of ... -> Left x) of Left y -> ...
GHC itself is unlikely to prune the alternative in the outer case, precisely because it's not so obvious that the inner case can only return Left.
bar = ... case foo a b of (:) x y -> ...
Same here.
In short, dealing with in-scope pattern matches and GADTs is worth doing first.
Then when you get a false positive (which you say you are) we can think about them.
(For foo
, the CprResult
part of the strictness signature will tell that foo
only returns (:)
but I'm a bit surprised that GHC actually uses this to prune the case alternatives.
comment:4 Changed 4 years ago by
Cc: | anselm.scholl@… added |
---|
comment:5 Changed 4 years ago by
Using the CRPResult
did indeed help, I can now compile a few more modules. However, the next false positive is a little bit more complicated as there is no evidence I could use attached to it as far as I can tell.
Suppose there is an expression
foo x = case x of A -> ... B -> ... y -> error $ "foo" ++ (case y of C -> "C" D -> "D")
This is fine and works as expected. But now GHC decides to float out the expression, yielding:
lvl y = error $ "foo" ++ (case y of C -> "C" D -> "D") foo x = case x of A -> ... B -> ... y -> lvl y
I don't see where GHC would record that y
can only match C
and D
. I think it doesn't do it anywhere, most likely because no one needed that information until today.
A possible solution might be to look for unexported local functions which are non-exhaustive and record the set of allowed constructors for their arguments. All calls to such a function would then be required to only included the correct possible constructors. Or is there a better way?
comment:6 Changed 4 years ago by
Ah yes, that is tricky. I'm really not sure how far it is profitable to go here. That is, when does keeping Lint happy become burdensome?
But let me suggest a possible approach for this particular case.
- Ids have Unfoldings inside them. One such is
OtherCon [AltCon]
(seeCoreSyn
) which says "this Id does not match any of these constructors".
- Since they are Ids even lambda-bound variables may have unfoldings. I'm in two minds about whether this is a good idea, but you could give the
\y
in the defn oflvl
above aOtherCon[A,B]
unfolding. Then you wouldn't get a Lint error from thecase
inlvl
.
- But in exchange you'd have a new thing to check: that
lvl
was indeed applied to arguments that were notA
orB
. So I'm not sure we are any further forward.
Maybe this can only be done with a better flow analysis... a "super-Lint" if you like.
Transferred from ghc-defs list I am currently giving #11475 a try. However I hit a few problems and wanted to ask for a bit of advice about how to proceed.
I already can handle the following:
My first problem then was the following construct:
Or more simple:
So I have to check which constructors the scrutinee may return. I know the simplifier has to implement such logic somewhere but did not want to use the implementation for two reasons:
So I implemented such a test myself, but now I am stuck at something like this:
I cannot see through
foo
, so I assume it may return(:)
and[]
. The simplifier on the other hand is able to see through foo and did remove the[]
case, causing my current implementation to throw an incorrect lint error.How can I get a list/set of all possible
AltCons
returned by a function/constant? Or a list of the impossible ones? I tried to get some information from the unfolding of a variable, but it seems like unfoldings are only attached to top-level names.Jonas