Opened 5 years ago

Last modified 7 months ago

#9173 new bug

Improve type mismatch error messages

Reported by: simonpj Owned by: goldfire
Priority: high Milestone:
Component: Compiler Version: 7.8.2
Keywords: TypeErrorMessages Cc: hvr
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Incorrect warning at compile-time Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Generating better type-error messages is a hoary chestnut, but Herbert brought my attention to this thread on reddit, which has the following idea.

At the moment, from

module Foo where
  addThree = \x -> x + 3 :: Int
  y = addThree $ Just 5

we get this error:

Foo.hs:2:20
    Couldn't match expected type `Int' with actual type `Maybe a0'
    In the return type of a call of `Just'
    In the second argument of `($)', namely `Just 5'
    In the expression: addThree $ Just 5

Maybe we could generate this instead

Foo.hs:2:20
  inferred: "Just 5" has type "Maybe a0" 
  expected: second argument of "($)" must have type "Int"
  in the expression: addThree $ Just 5

Change History (38)

comment:1 Changed 5 years ago by thoughtpolice

I like this, and think the second set of output reads much, much better. Bonus points: use ANSI terminal colors to highlight some key words in the output, such as "Warning" or "Error" should they exist.

Lennart also makes an observation about the way the Mu Haskell compiler handles this: https://pay.reddit.com/r/haskell/comments/26tcrk/curious_with_a_bit_of_beginner_ranting_about_some/chuwwns

comment:2 Changed 5 years ago by hvr

Cc: hvr added

comment:3 Changed 5 years ago by schyler

Here's an error that a noob making a subtle typo might get from GHC:

Prelude> map 1 [1..5]

<interactive>:2:1:
    Could not deduce (Num (a0 -> b))
      arising from the ambiguity check for ‛it’
    from the context (Num (a -> b), Num a, Enum a)
      bound by the inferred type for ‛it’:
                 (Num (a -> b), Num a, Enum a) => [b]
      at <interactive>:2:1-12
    The type variable ‛a0’ is ambiguous
    When checking that ‛it’
      has the inferred type ‛forall a b.
                             (Num (a -> b), Num a, Enum a) =>
                             [b]’
    Probable cause: the inferred type is ambiguous

I wholeheartedly agree that the information needs to be presented in a way that's easier to digest.

For what should be a really simple error, I really have to stare that down to understand what it's actually trying to say ...

Wanted: Num a => a -> b
Got: Num a => a
Last edited 5 years ago by schyler (previous) (diff)

comment:4 Changed 5 years ago by nomeata

Despite the very general ticket title, let’s not confalte multiple errors. The original ticket was a about a type mismatch (inferred vs. expected). The example by schlyer is about an ambiguous type in the GHCi prompt. Note that with some more instances added, map 1 [1..5] would type check!

comment:5 Changed 4 years ago by thomie

Keywords: newcomer added

This seems like a nice and easy ticket to handle by a newcomer.

The function to change is misMatchMsg in compiler/typecheck/TcErrors.hs.

Don't forget to update the expected test output. Use 'make accept', see Building/RunningTests/Updating. Please make sure you get all of them (grep for the old error message).

comment:6 Changed 4 years ago by thomie

It seems this ticket is not as easy to fix as I thought it would be. See: https://mail.haskell.org/pipermail/ghc-devs/2015-August/009559.html

comment:7 Changed 4 years ago by goldfire

Keywords: newcomer removed

Yes, I don't think this is easy, precisely for the reasons described in that post.

comment:8 Changed 4 years ago by thomie

Reddit user physicologist writes:

While I love the the proposal in the ticket, I feel that a purely textual change to the error message could be a great boon without changing a single line of code.

"Expected" versus "actual" don't really the nature of the problem. For example, what is the type of foo in the following

foo :: Int
foo = "Howdy!"

Is foo actually an Int, since that was declared? Did the compiler expect foo to be a String, since that's what we passed it? Or is foo actually a "String", since that's what it contained, but the compiler expected a Int, since that's what we told it would be contained?

Perhaps just changing

Couldn't match expected type 'Foo' with actual type 'Bar'

to

Couldn't match declared type 'Foo' against a value of type `Bar'

might make things clearer?

comment:9 Changed 4 years ago by chreekat

thomie, that gets to the root of my problem with the message precisely!

comment:10 Changed 4 years ago by rwbarton

The error there would not be about the type of foo, though. It would be about the type of "Howdy!".

Foo.hs:4:7:
    Couldn't match expected type ‘Int’ with actual type ‘[Char]’
    In the expression: "Howdy!"
    In an equation for ‘foo’: foo = "Howdy!"

The "actual type" is ... the actual type of the expression "Howdy!" that the compiler points out. The "expected type" is the type expected from the context, that is, the type the expression would have to have for the whole thing to type check. I know lots of people find this confusing but I have never been able to understand why...

comment:11 Changed 4 years ago by rwbarton

I wonder if it would help at all to just reverse the order of the two, that is,

    Couldn't match actual type ‘[Char]’ with expected type ‘Int’
    In the expression: "Howdy!"

Presumably the programmer has a strong association between the expression and its (actual) type, and then can work out what "expected type" refers to by elimination.

I'm also not that attached to the phrase "expected type". We could be more explicit and say something like "type ‘Int’ expected from context". I think the phrase "actual type" is quite good, though.

comment:12 Changed 4 years ago by simonpj

The trouble with "declared type" is that it often isn't declared. Consider

not 'c'

The actual type of 'c' is Char, but the type expected by the context (the call of not) is Bool. But it'd be confusing to say that Bool was the "declared" type!

Switching the order as Reid suggests in comment:11 would be easy, and I can see that it might help. Do others like that?

I'd also be ok with saying "type expected by the context" instead of just "expected". That longer phrase would also suggest putting it second.

Simon

comment:13 in reply to:  11 Changed 4 years ago by chreekat

Foo.hs:4:7:
    Couldn't match expected type ‘Int’ with actual type ‘[Char]’
    In the expression: "Howdy!"
    In an equation for ‘foo’: foo = "Howdy!"

This example reminded me that sometimes the phrase "In the expression" is confusing to me. It seems that the phrase *should* be "Of the expression". Using 'in' makes me expect that the error message is referring to some particular component of the expression, rather than the whole thing.

It's a small incongruence, but it's big enough to force me to stop thinking about code and start thinking about the possible meanings of the error message.

comment:14 in reply to:  12 Changed 4 years ago by chreekat

Replying to simonpj:

Switching the order as Reid suggests in comment:11 would be easy, and I can see that it might help. Do others like that?

I'd also be ok with saying "type expected by the context" instead of just "expected". That longer phrase would also suggest putting it second.

+1. E.g.:

foo.hs:1:11:
    Couldn't match actual type ‘Char’ with type expected by context, ‘Bool’
    In the first argument of ‘not’, namely ‘'c'’
    In the expression: not 'c'
    In an equation for ‘foo’: foo = not 'c'

(I recall now that "In the expression" is used because the message often *does* refer to a component of the whole expression. Still, I think it would be nice if it switched to "In the expression" when referring to the whole thing.)

comment:15 Changed 4 years ago by thomie

Again discussed in this thread: https://www.reddit.com/r/haskell/comments/3tm3lv/proposal_expectedactual_requiredfound/

Not much new. expected/found instead of expected/actual seems a popular choice.

I vote for switching the order as well, mentioning the word "context", and perhaps omitting the word actual. Something like this:

Couldn't match type ‘Char’ with type ‘Bool’ expected from context. 

comment:16 Changed 4 years ago by goldfire

Once Phab:D808 gets merged (could be today! but more likely in 2-3), printing out the expression that has the bad type will be much easier, as I've restructured how that information sloshes around internally. So that will be a viable improvement very soon.

comment:17 Changed 4 years ago by Richard Eisenberg <eir@…>

In 67465497/ghc:

Add kind equalities to GHC.

This implements the ideas originally put forward in
"System FC with Explicit Kind Equality" (ICFP'13).

There are several noteworthy changes with this patch:
 * We now have casts in types. These change the kind
   of a type. See new constructor `CastTy`.

 * All types and all constructors can be promoted.
   This includes GADT constructors. GADT pattern matches
   take place in type family equations. In Core,
   types can now be applied to coercions via the
   `CoercionTy` constructor.

 * Coercions can now be heterogeneous, relating types
   of different kinds. A coercion proving `t1 :: k1 ~ t2 :: k2`
   proves both that `t1` and `t2` are the same and also that
   `k1` and `k2` are the same.

 * The `Coercion` type has been significantly enhanced.
   The documentation in `docs/core-spec/core-spec.pdf` reflects
   the new reality.

 * The type of `*` is now `*`. No more `BOX`.

 * Users can write explicit kind variables in their code,
   anywhere they can write type variables. For backward compatibility,
   automatic inference of kind-variable binding is still permitted.

 * The new extension `TypeInType` turns on the new user-facing
   features.

 * Type families and synonyms are now promoted to kinds. This causes
   trouble with parsing `*`, leading to the somewhat awkward new
   `HsAppsTy` constructor for `HsType`. This is dispatched with in
   the renamer, where the kind `*` can be told apart from a
   type-level multiplication operator. Without `-XTypeInType` the
   old behavior persists. With `-XTypeInType`, you need to import
   `Data.Kind` to get `*`, also known as `Type`.

 * The kind-checking algorithms in TcHsType have been significantly
   rewritten to allow for enhanced kinds.

 * The new features are still quite experimental and may be in flux.

 * TODO: Several open tickets: #11195, #11196, #11197, #11198, #11203.

 * TODO: Update user manual.

Tickets addressed: #9017, #9173, #7961, #10524, #8566, #11142.
Updates Haddock submodule.

comment:18 Changed 4 years ago by goldfire

The information needed in TcErrors to address this ticket is now very much to hand, in the uo_thing field of a TypeEqOrigin. There is some refactoring to do to improve the plumbing, but it would now be easy for someone to take an honest stab at this ticket.

comment:19 Changed 4 years ago by thomie

Type of failure: None/UnknownIncorrect warning at compile-time

comment:20 Changed 3 years ago by sgillespie

I'd like to have a look. What is the proposed message? Should we use the original message below?

Foo.hs:2:20
  inferred: "Just 5" has type "Maybe a0" 
  expected: second argument of "($)" must have type "Int"
  in the expression: addThree $ Just 5

comment:21 in reply to:  20 Changed 3 years ago by chreekat

Replying to sgillespie:

I'd like to have a look. What is the proposed message? Should we use the original message below?

Foo.hs:2:20
  inferred: "Just 5" has type "Maybe a0" 
  expected: second argument of "($)" must have type "Int"
  in the expression: addThree $ Just 5

I vote for the simpler wording change rather than whatever acrobatics would be required to overcome the problems listed here: https://mail.haskell.org/pipermail/ghc-devs/2015-August/009559.html

I guess that the patch mentioned above makes those bigger changes possible, but still.

For the wording change, I think adding "...from context" is universally desired. See my comment:14, and the one following it, for suggested text.

Incidentally, I dislike the eldest proposal (quoted in this comment) because it splits the necessary information over a couple lines, which is not so nice for integrating with vim's :make command. ;)

comment:22 Changed 2 years ago by simonpj

Keywords: TypeErrorMessages added

comment:23 Changed 2 years ago by vanto

Replying to simonpj

Hello simonpj,
I agree with rwbarton. But instead of using the word "context" I prefer the use of the word "signature".
A signature is a description of each component of a definition of a function, or in a more general way by talking about an expression.
A Type signature defined by the programmer takes precedence over the Type of the result of the calculated function.
We could write :

Type `Int` expected by signature.

or this sentence that is simple and sounds well

The Type of the result is not in accordance with the Type of the signature

Or with the Type write inside

The Type `[Char]` of the result is not in accordance with the Type `Int` of the signature

The word "context" is quite different.
It implies a greater scope in the code. It would be the equivalent of a set.
But one thing must be noted.
This thing is not to write the signature of the function.
If I write

module Foo where
    addThree = \x -> x + 3
    y = addThree $ Just 5

GHCi sayd

Prelude> :l foo.hs
[1 of 1] Compiling Foo              ( foo.hs, interpreted )

foo.hs:2:22: error:
    * No instance for (Num (Maybe a0)) arising from a use of `+'
    * In the expression: x + 3
      In the expression: \ x -> x + 3
      In an equation for `addThree': addThree = \ x -> x + 3

foo.hs:3:25: error:
    * Ambiguous type variable `a0' arising from the literal `5'
      prevents the constraint `(Num a0)' from being solved.
      Relevant bindings include y :: Maybe a0 (bound at foo.hs:3:5)
      Probable fix: use a type annotation to specify what `a0' should be.
      These potential instances exist:
        instance Num Integer -- Defined in `GHC.Num'
        instance Num Double -- Defined in `GHC.Float'
        instance Num Float -- Defined in `GHC.Float'
        ...plus two others
        (use -fprint-potential-instances to see them all)
    * In the first argument of `Just', namely `5'
      In the second argument of `($)', namely `Just 5'
      In the expression: addThree $ Just 5
Failed, modules loaded: none.
Prelude>

Here there is no Type signature in the context.
The sentence * No instance for (Num (Maybe a0)) arising from a use of `+' is more explicit.
I hope this will help.

comment:24 Changed 2 years ago by sgillespie

I would love to look into this, but it is unclear what the new message is supposed to be. Should we go with the suggestion of the original post, or one of the other rewording suggestions?

comment:25 Changed 2 years ago by bgamari

I like the extension of rwbarton's proposal from comment:11,

   Couldn't match actual type ‘[Char]’ with type ‘Int’ expected by context
   In the expression: "Howdy!"

Seems like a simple yet clear change.

comment:26 Changed 2 years ago by vanto

I agree with bgamari too.
Well, I glanced at the Oxford Dictionary. I have compared the words "context" and "signature".
Here is the definition of the word "context":

  • The circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood.

Examples:
‘the proposals need to be considered in the context of new European directives’
‘We are going to be able, within a European context, to be in a more positive position.’
‘This is down to his determination to place current events in a historical context.’
‘For new readers this can be an advantage, but they become disadvantages in contexts of closer study.’

Synonyms of the word context: circumstances, conditions, surroundings, factors, state of affairs
frame of reference, contextual relationship.

Here is the definition of the word "signature":

  • A distinctive pattern, product, or characteristic by which someone or something can be identified.

Examples:
‘the chef produced the pâté that was his signature’
‘Changes in population size tend to leave recognizable signatures in the patterns of nucleotide diversity.’
‘This process, called intrinsic catalysis, has a characteristic signature in the exchange time measurements versus pH.’
‘Cells tugged in one direction sent biochemical signals in the opposite direction in the form of a signature pattern of fluorescent light.’

Obviously, the word signature has a more precise meaning, characterizing a thing and the word context has a broader meaning.

comment:27 Changed 2 years ago by vanto

I'm coming back about the solution provided by rwbarton. This solution is not appropriate in the case below.

let y = [True, 'a']

What is the actual type? What is the expected type?

If I change the order of the values in the list like this:

let y = ['a', True]

What is the actual type? What is the expected type?

An obvious answer is The types must be identical in a list or The types are not equal Here we see that there is no signature. If the signature had been y :: [Char] The obvious answer is the one I have given in comment 23, and which is

The Type of the result is not in accordance with the Type of the signature

So I no longer agree with rwbarton and bgamari.

comment:28 Changed 2 years ago by j.waldmann

Interesting, but I think this is a separate issue (type error message for list literals).

['a', True] gets translated to 'a' : (True : []), see language spec 3.7. https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-340003.7 So there is no special rule for typing list literals, as the compiler checks the translated code anyway.

Or so I thought, but in fact there is: after stating the translation, the standard states that in [e1, ..., ek], "the types of e1 ... ek must all be the same". Is this normative, or just a (redundant) explanation?

I do think that compiler (error) messages should use wording from the standard. (So, looking up words in OED does not really help. The committee that wrote the standard already did this.)

comment:29 Changed 17 months ago by goldfire

Owner: set to goldfire
Priority: normalhigh

Some messages were worsened by the patch for #14066 that makes this a higher priority. I don't think it will be hard to fix.

comment:30 Changed 16 months ago by goldfire

I think the time is ripe for this ticket, and it shouldn't be hard. Here are the pieces:

  1. Update TcErrors.mkExpectedActualMsg to print out the uo_thing field of the TypeEqOrigin passed in. For example, if we're reporting an actual of Maybe Int and an expected of Bool, then uo_thing might hold Just 5. This step can likely be done by changing msg1 of mkExpectedActualMsg.
  1. Rip out the special treatment of kinds in mkExpectedActualMsg. Other than swapping out the word "kind" for "type", kinds and types should be treated the same. This special treatment is currently there solely to match pre-8.0 behavior. Doing this step should improve the error messages worsened by #14066 and fix, e.g., #14887. Concretely, this step is essentially to remove msg5 from mkExpectedActualMsg.

Those two steps, by themselves, would nail this ticket. But Simon and I think we can do better. Currently, the uo_thing of a TypeEqOrigin tracks the term (or type, during kind-checking) that has the "actual" type. (It's the Just 5 mentioned in step 1.) However, we can also track the aspect of the context that leads us to expect the "expected" type. For example, this string could be something like "required as the condition of an if" or "of the second argument of ($)" or "of the type used in a type signature". Here are example error messages:

Code: if 'x' then foo else bar

  Type mismatch: `Char' /= `Bool'
  Actual:   `Char' is the type of `'x''
  Expected: `Bool` is the type required as the condition of an `if'

Code: not $ 'x'

  Type mismatch: `Char' /= `Bool'
  Actual:   `Char` is the type of `'x''
  Expected: `Bool` is the type of the second argument of ($)

Code: foo :: Maybe

  Kind mismatch: `* -> *' /= `*'
  Actual:   `* -> *' is the kind of `Maybe'
  Expected: `*` is the kind of the type used in a type signature

Aren't these just lovely?

To do this:

  1. Add an SDoc field to the Check constructor of an ExpType (in TcType). Then, in every mkCheckExpType, supply an appropriate message. Ideally, this message will not mention any types, because there will be no chance to zonk them later.
  1. Add a new field uo_context to TypeEqOrigin (of type Maybe SDoc) that will get the doc from the Check. This seems like it will happen in TcUnify.tcSubTypeDS_NC_O, but perhaps in other places, too. It's worth checking every place we make a TypeEqOrigin.
  1. Teach TcErrors.mkExpectedActualMsg to use this new info.

We could do steps 1-2 separately from 3-5, but given the very large number of error messages one would have to sweep through to do so, it seems best to common these changes up.

comment:31 Changed 16 months ago by simonpj

I'd love us to try this. It could be a very helpful and (I think) insightful warm-up project for someone wanting to get to know GHC.

comment:32 Changed 12 months ago by nineonine

I will give this one a whirl.

comment:34 Changed 11 months ago by nineonine

I was able to make some progress on this ticket (screenshot with new error messages: https://prnt.sc/ktm8u2). I will keep working on 3-5. Shouldn't take long to finish this part!

comment:35 Changed 11 months ago by simonpj

Great stuff! I think it would be Really Helpful to write down a Note explaining how all the moving parts work. Then we can review that against your code. It always does my head in!

comment:36 Changed 11 months ago by nineonine

Update:

I was able to locate all the places where we make TypeEqOrigin. We do this with a help of a smart constructor mkCheckExpType. There are around 80-90 places (across 12 modules) where we use it. There is another function synKnownType that uses mkCheckExpType which has another 25 occurrences.

For now I just used dummy text everywhere to be able to compile it.

Here you can see more detailed search results for these functions:

I have already changed some of the messages where I was able to infer the context and come up with meaningful explanations. Here is a screenshot of examples from the ticket with new messages - https://prnt.sc/kycm4c, however, some places are tricky so I will probably need more time to spend on them.

Right now my plan is the following - I want to come up with all the test cases(programs) that would call all the code parts where we use the above mentioned functions. By doing that, I will slowly weed out all the dummy messages and replace with meaningful ones.

As suggested, I am collecting all the notes of my changes.

Perhaps, it is a good time to submit a patch to Phabricator? We could proceed with a discussion there and it would be much easier to discuss all the new messages (with line comments). By the way, If someone could code review and help with new messages - that would be greatly appreciated! Thank you.

comment:37 Changed 11 months ago by simonpj

This sounds great.

I think it would be very helpful if you wrote a wiki page (on the Trac wiki) explaining (a) the problem you are trying to solve (being more specific than "better error messages"), and (b) how you are solving it.

Concerning (b) there are quite a few moving parts, and it really helps to explain how they work together. It may well be that we iterate the design a bit, and it's best to do that before you have invested a great deal of effort in it.

Thanks!

comment:38 Changed 7 months ago by nineonine

Summary: Better type error messagesImprove type mismatch error messages

Sorry for the delay - had to face life!

I opened MR here: https://gitlab.haskell.org/ghc/ghc/merge_requests/223

I also added a note there that will be converted to a wiki page in the future.

Suggestions and comments are welcome!

Note: See TracTickets for help on using tickets.