Ticket #26 (new defect)

Opened 2 years ago

Last modified 18 months ago

ThreadScope shows garbled characters when using Unicode characters in user event

Reported by: shelarcy
Owned by:
Priority: major
Component: ThreadScope
Version:
Keywords:
Cc: shelarcy@…

Description

Debug.Trace.traceEventIO writes Unicode characters to the eventlog as a UTF-8 encoded string.

"ghc-events show" displays user events containing Unicode characters correctly.

import Debug.Trace
main = do
    traceEventIO "テスト"
    putStrLn "test"

$ ghc -O2 -threaded -eventlog Test.hs
[1 of 1] Compiling Main             ( Test.hs, Test.o )
Linking Test.exe ...

$ .\Test.exe +RTS -l
"test"
$ ghc-events show Test.eventlog > test.log
Event Types:
  (snip)

Events:
  (snip)
  1531003: cap 0: running thread 3
  1594090: cap 0: テスト
  1805086: cap 0: stopping thread 3 (making a foreign call)
  (snip)

But ThreadScope can't display user events containing Unicode characters correctly. ThreadScope shows garbled characters.

This is a problem for me: Unicode characters make it easier to find my user events.

I'm using GHC 7.4.1 with threadscope 0.2.1 and gtk 0.12.3, on Windows.

Attachments

garbling.png (90.2 kB) - added by shelarcy 2 years ago.
ghc-events-unicode.dpatch (18.0 kB) - added by duncan 18 months ago.

Change History

  Changed 2 years ago by shelarcy

I think thread labels containing Unicode characters cause the same problem. But the GHC.Conc.labelThread function has another problem of its own.

So I am reporting only the user event case here.

follow-up: ↓ 4   Changed 2 years ago by MikolajKonarski

Thank you for the report. We've had problems with non-ASCII characters in a few places and we have a few (quite ugly) workarounds active. Some are only needed with older gtk2hs, but others seem to be required regardless. Most probably the workarounds are not applied in the places where the problems you describe appear. Which version of gtk2hs do you use?

  Changed 2 years ago by MikolajKonarski

Some help from an expert:

20:40 <@dcoutts> mikolaj: for #26, the fix is to do utf8 decode in ghc-events for the string payload of the user message event
20:40 <@dcoutts> mikolaj: ironically it's ghc-events show that is doing the wrong thing
20:40 <@dcoutts> which happens to work when your console is in utf8 mode
20:41 <@mikolaj> dcoutts: oh, great, so the problem is already in ghc-events? should be much easier than our hacks with setting up pango fonts
20:41 <@dcoutts> and ThreadScope and Gtk+ are actually Unicode aware
20:41 <@dcoutts> Gtk2Hs already converts Haskell String to utf8 when passing it to Gtk+
20:42 <@dcoutts> so we simply need to decode when parsing the eventlog
20:42 <@mikolaj> dcoutts: aware, but buggy --- remember the hacks to get \mu on the Y scale?
20:42 <@mikolaj> but let's hope it works in this case
20:42 <@dcoutts> that was a font problem iirc, not an encoding issue
20:43 <@mikolaj> dcoutts: yes
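
To illustrate the point made in the log above, here is a minimal sketch of the difference between reading the payload one Char per byte and decoding it as UTF-8. It is not the actual ghc-events patch: the names payload, naiveDecode and utf8Decode are hypothetical, and it assumes the bytestring and text packages are available.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical stand-in for the raw user event payload in the eventlog:
-- the UTF-8 encoding of the three characters U+30C6 U+30B9 U+30C8.
payload :: B.ByteString
payload = B.pack [0xE3, 0x83, 0x86, 0xE3, 0x82, 0xB9, 0xE3, 0x83, 0x88]

-- Treats every byte as one Char. A Unicode-aware consumer then sees
-- nine unrelated characters, which is the garbling described here.
naiveDecode :: B.ByteString -> String
naiveDecode = BC.unpack

-- Decodes the bytes as UTF-8 while parsing. Invalid sequences become
-- U+FFFD instead of raising an exception.
utf8Decode :: B.ByteString -> String
utf8Decode = T.unpack . decodeUtf8With lenientDecode

main :: IO ()
main = do
  print (length (naiveDecode payload))                -- 9
  print (length (utf8Decode payload))                 -- 3
  print (utf8Decode payload == "\x30C6\x30B9\x30C8")  -- True
```

The naive version only appears to work in "ghc-events show" because the bytes are written back out unchanged to a console that happens to be in UTF-8 mode.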

in reply to: ↑ 2   Changed 2 years ago by shelarcy

Replying to MikolajKonarski:

Which version of gtk2hs do you use?

I'm using gtk2hs 0.12.3 with gtk+-bundle_2.22.1-20101227_win32.

  Changed 18 months ago by duncan

I have a patch for this for ghc-events, but I'm not sure I want to apply it immediately before this release, as it's not had much testing.

Changed 18 months ago by duncan
