Opened 8 years ago

Last modified 18 months ago

#5041 new bug

Incorrect Read deriving for MagicHash constructors

Reported by: dolio Owned by:
Priority: low Milestone:
Component: Compiler Version: 7.0.2
Keywords: deriving Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case: deriving/should_run/T5041
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Greetings,

A fellow in #haskell on freenode just discovered the following bug:

{-# LANGUAGE MagicHash #-}

data Note = A# deriving (Show, Read)

show works fine, but read doesn't:

*Main> show A#
"A#"
*Main> read (show A#)
*** Exception: Prelude.read: no parse

Probably not a surprising omission, and hopefully not difficult to fix.

Attachments (1)

5041.dpatch (4.6 KB) - added by batterseapower 8 years ago.

Download all attachments as: .zip

Change History (12)

comment:1 Changed 8 years ago by maeder

ghci needs to be called with "-XMagicHash" for "show A#" to work and read needs to know the result type. Yet:

*Main> read "A#" :: Note
*** Exception: Prelude.read: no parse

comment:2 Changed 8 years ago by simonpj

Milestone: 7.2.1

The difficulty here is that lexId in the library Text.Read.Lex doesn't recognise A# as a lexeme.

I can't see an obviously good workaround for this. The library can't change with different GHC flags. Alternatives:

  • When GHC does 'Show' for a constructor A#, make it spit out some funny escape sequence like A_hash. (And dually for reading.) But that is no fun. The ASCII in the file would look odd to a human, and human-written stuff like A# would fail.
  • Simply fix Text.Reader.Lex to recognise a postfix hash as a legal lexeme. The downside here is that lex ought to parse a+x#c as five single-character lexemes, so we'd be changing the behaviour away from H98.
  • Like the previous idea, but make a copy of Text.Reader.Lex and invoke that when building derived Read instances. The original lex would stay unchanged. In effect we'd just have a new function lexWithTrailingHashes. Actually you would not need to copy all of it, by a long chalk; but there'd be a little duplication.

My suggestion would be to follow the third plan. Anyone who cares, feel free to express an opinion.

Once we've decided, executing the fix should be easy, so I'll milestone for 7.2

Simon

comment:3 Changed 8 years ago by simonmar

Generate a parser that recognises "A" followed by "#"? It would be too liberal (extra spaces allowed), but it would work, no?

comment:4 Changed 8 years ago by simonpj

Options 2 and 3 above are precisely that.

comment:5 in reply to:  4 Changed 8 years ago by simonmar

Replying to simonpj:

Options 2 and 3 above are precisely that.

No, do it without modifying Text.Reader.Lex. Just generate a Read instance that recognises the two lexemes "A" followed by "#".

Changed 8 years ago by batterseapower

Attachment: 5041.dpatch added

comment:6 Changed 8 years ago by batterseapower

Status: newpatch

I've attached a patch that fixes this bug following Simon Marlow's suggestion.

comment:7 Changed 8 years ago by simonpj

Resolution: fixed
Status: patchclosed
Test Case: deriving/should_run/T5041

Fixed by

commit bf0d3df4d011bc93af28b195a97abfcd24b9e7d6
Author: simonpj <simonpj@cam-04-unx.europe.corp.microsoft.com>
Date:   Tue Apr 19 13:31:53 2011 +0100

    Fix Trac #5041: parse the trailing '#'
    
    This matters for constructors and field labels that
    have a trailing hash (MagicHash language extension).

 compiler/typecheck/TcGenDeriv.lhs |   31 ++++++++++++++++++-------------
 1 files changed, 18 insertions(+), 13 deletions(-)

comment:8 Changed 8 years ago by igloo

Resolution: fixed
Status: closednew

This doesn't look right to me:

{-# LANGUAGE MagicHash #-}

data Foo = Foo#
    deriving (Show, Read)

foo :: Foo
foo = read "Foo #" -- NOTE: Space between "Foo" and "#"

main :: IO ()
main = print foo
$ ./q
Foo#

comment:9 Changed 8 years ago by simonpj

Milestone: 7.2.1_|_
Priority: normallow

That's right. If you look at the discussion thread above we decided that allowing the space fixed your problem without making an invasive change to the libraries. It's too liberal as you note, but it's not getting in anyone's way (as far as I know). Until it does it's not worth the effort to fix it properly (unless someone wants to volunteer).

Still, fair enough, it's still wrong so I'll leave it open with milestone bottom

Simon

comment:10 Changed 2 years ago by RyanGlScott

Keywords: deriving added

comment:11 Changed 18 months ago by Ryan Scott <ryan.gl.scott@…>

In d5577f44/ghc:

Special-case record fields ending with hash when deriving Read

Summary:
In commit dbd81f7e86514498218572b9d978373b1699cc5b, a
regression was inadvertently introduced which caused derived `Read`
instances for record data types with fields ending in a `#` symbol
(using `MagicHash`) would no longer parse on valid output. This
is ultimately due to the same reasons as #5041, as we cannot parse
a field name like `foo#` as a single identifier. We fix this issue
by employing the same workaround as in #5041: first parse the
identifier name `foo`, then then symbol `#`.

This is accomplished by the new `readFieldHash` function in
`GHC.Read`. This will likely warrant a `base-4.11.1.0` release.

Test Plan: make test TEST=T14918

Reviewers: tdammers, hvr, bgamari

Reviewed By: bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #14918

Differential Revision: https://phabricator.haskell.org/D4502
Note: See TracTickets for help on using tickets.