Ticket #54 (closed defect: fixed)

Opened 9 years ago

Last modified 9 years ago

getInputChar doesn't handle Unicode

Reported by: judah
Owned by:
Priority: minor
Milestone: 0.6.0
Version:
Keywords:


Currently, 'getInputChar' doesn't handle Unicode, because utf8-string does not provide a Unicode-aware equivalent of 'getChar'.

This issue will probably be fixed once GHC provides Unicode support for text I/O in 6.12.

Change History

Changed 9 years ago by judah

  • milestone set to 0.6.0

It's annoying, but we should be able to get most of the way without support from GHC:

  • use iconv's EINVAL (incomplete input) to decide whether more bytes need to be read.
  • use Win32 API functions to correctly translate non-ASCII, single-byte characters from the code page to Unicode.
  • use Win32's IsDBCSLeadByteEx to determine whether a second byte needs to be read. (Note this won't work for UTF-8 on Windows; if someone complains, we could special-case UTF-8 or allow them to use iconv instead.)
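
For illustration, here is a rough sketch of the "keep reading until the decoder stops reporting incomplete input" loop from the first bullet. It is not the code that went into the fix. To stay self-contained it swaps iconv for a small hand-written UTF-8 decoder, and the names 'Decoded', 'decodeUtf8Prefix', 'readChar' and 'byteSource' are made up for this example. The 'Incomplete' result plays the role of iconv's EINVAL, and 'Invalid' the role of EILSEQ.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- | Outcome of trying to decode the bytes read so far (hypothetical type).
data Decoded
  = Complete Char -- the bytes form exactly one character
  | Incomplete    -- valid so far, more bytes needed (iconv would report EINVAL)
  | Invalid       -- can never become a character (iconv would report EILSEQ)
  deriving (Eq, Show)

-- | Stand-in for an iconv call: decode a buffer holding at most one
-- UTF-8 encoded character.
decodeUtf8Prefix :: [Word8] -> Decoded
decodeUtf8Prefix [] = Incomplete
decodeUtf8Prefix (b : rest)
  | b < 0x80 = if null rest then Complete (chr (fromIntegral b)) else Invalid
  | b >= 0xC2 && b < 0xE0 = go 1 0x80 (lead 0x1F) rest
  | b >= 0xE0 && b < 0xF0 = go 2 0x800 (lead 0x0F) rest
  | b >= 0xF0 && b < 0xF5 = go 3 0x10000 (lead 0x07) rest
  | otherwise = Invalid
  where
    -- Payload bits of the lead byte.
    lead :: Word8 -> Int
    lead mask = fromIntegral (b .&. mask)

    -- Arguments: continuation bytes still expected, smallest code point
    -- allowed for this length, accumulated value, remaining bytes.
    go :: Int -> Int -> Int -> [Word8] -> Decoded
    go 0 lo acc [] = finish lo acc
    go 0 _ _ _ = Invalid -- trailing bytes after a complete character
    go _ _ _ [] = Incomplete
    go n lo acc (c : cs)
      | c .&. 0xC0 == 0x80 =
          go (n - 1) lo ((acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)) cs
      | otherwise = Invalid

    -- Reject overlong encodings, surrogates and out-of-range values.
    finish :: Int -> Int -> Decoded
    finish lo acc
      | acc < lo || acc > 0x10FFFF = Invalid
      | acc >= 0xD800 && acc <= 0xDFFF = Invalid
      | otherwise = Complete (chr acc)

-- | Read bytes one at a time until they decode to a character.
-- Returns Nothing at end of input.
readChar :: Monad m => m (Maybe Word8) -> m (Maybe Char)
readChar nextByte = loop []
  where
    loop acc = do
      mb <- nextByte
      case mb of
        Nothing -> return Nothing
        Just b ->
          let bytes = acc ++ [b]
           in case decodeUtf8Prefix bytes of
                Complete c -> return (Just c)
                Incomplete -> loop bytes -- the "EINVAL" case: read more
                Invalid -> return (Just '\xFFFD') -- assumption: substitute U+FFFD

-- | Demo byte source backed by a list; stands in for the terminal.
byteSource :: [Word8] -> IO (IO (Maybe Word8))
byteSource bytes = do
  ref <- newIORef bytes
  return $ do
    remaining <- readIORef ref
    case remaining of
      [] -> return Nothing
      (x : xs) -> writeIORef ref xs >> return (Just x)
```

With 'next <- byteSource [0xC3, 0xA9]', calling 'readChar next' consumes both bytes before returning a single character. Note that this sketch discards the byte that makes a sequence invalid; a real implementation would probably want to push it back and retry it as the start of the next character.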

Changed 9 years ago by judah

  • status changed from new to closed
  • resolution set to fixed