Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#5257 closed bug (fixed)

Calling fail on a UTF-8 encoded string (in file) causes garbage to be printed

Reported by: anthony.de.almeida.lopes
Owned by:
Priority: normal
Milestone: 7.2.1
Component: Runtime System
Version: 7.0.2
Keywords:
Cc:
Operating System: Linux
Architecture: x86_64 (amd64)
Type of failure: Incorrect result at runtime
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description (last modified by simonmar)

For example,

guerrilla@delta:/tmp/foo$ cat Test.hs 
module Main where

main :: IO ()
main =
    do
        putStrLn "μ"
        fail "μ"
guerrilla@delta:/tmp/foo$ ./Test 
μ
Test: user error (�)
guerrilla@delta:/tmp/foo$ ./Test 2>&1 | xxd
0000000: cebc 0a54 6573 743a 2075 7365 7220 6572  ...Test: user er
0000010: 726f 7220 28bc 290a                      ror (.).

Either passing the string through encodeString or writing it with escaped hexadecimal bytes does work.

Change History (4)

comment:1 Changed 9 years ago by simonmar

Description: modified (diff)
Resolution: fixed
Status: new → closed

Fixed in 7.2.1, thanks to Max's work on Unicode. The problem was that the main exception handler converts the exception to a String and then passes it to the RTS error function using withCString, which until recently performed no encoding of non-ASCII characters.
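
The effect described here can be reproduced on a current GHC with a small sketch. It assumes that the old withCString behaved like the char8 encoding, which keeps only the low 8 bits of each Char; that assumption matches the lone 0xbc byte in the hex dump, since μ is U+03BC. The helper cStringBytes is hypothetical and not part of base.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray0)
import Foreign.Ptr (castPtr)
import qualified GHC.Foreign as GF
import System.IO (TextEncoding, char8, utf8)

-- Hypothetical helper: marshal a String to a C string with the given
-- encoding and read back the bytes up to (not including) the NUL terminator.
cStringBytes :: TextEncoding -> String -> IO [Word8]
cStringBytes enc s = GF.withCString enc s (peekArray0 0 . castPtr)
```

In GHCi, cStringBytes char8 "\x3BC" gives [188] (0xbc, the garbage byte from the report), while cStringBytes utf8 "\x3BC" gives [206,188] (0xce 0xbc, the bytes putStrLn wrote).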

comment:2 Changed 9 years ago by simonmar

Milestone: 7.2.1

comment:3 Changed 9 years ago by anthony.de.almeida.lopes

Does anyone know if the encodeString workaround will start to fail when I upgrade? Thanks.

comment:4 Changed 9 years ago by batterseapower

I'm pretty sure your workaround will start to fail for non-ASCII strings. The UTF-8 encoded bytes that encodeString injects back into Chars will contain some bytes > 127 and so will be subject to another round of UTF-8 encoding when GHC encodes the String for the console.

ASCII strings should work fine either way.
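
The double encoding can be illustrated with a pure sketch. Here utf8Bytes is a minimal hand-written UTF-8 encoder (it does not reject surrogates), and encodeStringSketch is a hypothetical stand-in for encodeString, assumed to store each UTF-8 byte as a separate Char. Neither is the actual library or GHC code.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Minimal UTF-8 encoder for a String (no surrogate checks).
utf8Bytes :: String -> [Word8]
utf8Bytes = concatMap (map fromIntegral . encodeChar . ord)
  where
    encodeChar :: Int -> [Int]
    encodeChar c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
    -- Continuation byte: 10xxxxxx carrying the low 6 bits of x.
    cont :: Int -> Int
    cont x = 0x80 .|. (x .&. 0x3F)

-- Hypothetical stand-in for encodeString: UTF-8 bytes, one per Char.
encodeStringSketch :: String -> String
encodeStringSketch = map (chr . fromIntegral) . utf8Bytes

-- What reaches the console if the workaround's output is encoded again.
doubleEncoded :: String -> [Word8]
doubleEncoded = utf8Bytes . encodeStringSketch
```

For "\x3BC" (μ), utf8Bytes gives the bytes ce bc, but doubleEncoded gives c3 8e c2 bc, which a UTF-8 terminal would render as two unrelated characters. For an ASCII-only string such as "abc", doubleEncoded and utf8Bytes agree.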
