Ticket #81 (reopened defect)

Opened 6 years ago

Last modified 4 years ago

haskeline assumes all characters have width = 1

Reported by: guest Owned by:
Priority: major Milestone: 0.6.*
Version: 0.6 Keywords:
Cc:

Description

haskeline's input editing assumes all characters have a width of 1. The display and cursor positioning code messes up if that assumption doesn't hold, e.g. for double-wide (width = 2) or combining characters (width = 0).
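
As a rough illustration of what a per-character width calculation could look like, here is a minimal sketch. It is not Haskeline's code: the names charWidth and stringWidth are hypothetical, and the table of wide ranges is a deliberately incomplete subset of the East Asian Wide blocks, chosen only to cover the examples in this ticket.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, ord)

-- | Approximate number of terminal cells a character occupies.
-- Assumption: zero-width characters are detected by general category only,
-- and the wide ranges below are an incomplete, illustrative subset.
charWidth :: Char -> Int
charWidth c
  | generalCategory c `elem` [NonSpacingMark, EnclosingMark] = 0
  | any inRange wideRanges = 2
  | otherwise = 1
  where
    n = ord c
    inRange (lo, hi) = lo <= n && n <= hi
    wideRanges =
      [ (0x1100, 0x115F) -- Hangul Jamo initial consonants
      , (0x2E80, 0x303E) -- CJK radicals and punctuation
      , (0x3041, 0x33FF) -- Hiragana, Katakana, CJK compatibility
      , (0x3400, 0x4DBF) -- CJK unified ideographs extension A
      , (0x4E00, 0x9FFF) -- CJK unified ideographs
      , (0xAC00, 0xD7A3) -- Hangul syllables
      , (0xF900, 0xFAFF) -- CJK compatibility ideographs
      , (0xFF00, 0xFF60) -- Fullwidth forms
      ]

-- | Total cell width of a string: the sum of its characters' widths.
stringWidth :: String -> Int
stringWidth = sum . map charWidth
```

With these ranges, stringWidth "안기영" is 6 while length "안기영" is 3, which is exactly the mismatch described above.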

Attachments

haskeline_wc.tgz (52.9 kB) - added by guest 5 years ago.
wide char support
Terminfo.hs (12.4 kB) - added by guest 5 years ago.
Multi line wide char support.

Change History

  Changed 6 years ago by judah

Can you give some examples of double-wide or combining characters? Any suggestions on how Haskeline could determine the width of a character?

  Changed 6 years ago by judah

  • milestone set to 0.6.2

I should be able to make some progress on this by the next release.

  Changed 6 years ago by judah

Status update: Haskeline now works fine with combining characters whose generalCategory is NonSpacingMark, with the caveat that unattached combining characters at the start of the line are dropped.
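
The behaviour described in this status update can be sketched roughly as follows. This is only an illustration under stated assumptions: the Grapheme type and the toGraphemes function here are hypothetical stand-ins, and Haskeline's actual representation may differ.

```haskell
import Data.Char (GeneralCategory (NonSpacingMark), generalCategory)

-- | A base character together with the combining marks attached to it.
-- Hypothetical type for this sketch only.
data Grapheme = Grapheme Char [Char]
  deriving (Eq, Show)

-- | True for characters whose general category is NonSpacingMark.
isCombining :: Char -> Bool
isCombining c = generalCategory c == NonSpacingMark

-- | Group a string into graphemes. Combining marks that have no preceding
-- base character (that is, at the start of the input) are dropped.
toGraphemes :: String -> [Grapheme]
toGraphemes = go . dropWhile isCombining
  where
    go [] = []
    go (c : cs) =
      let (marks, rest) = span isCombining cs
       in Grapheme c marks : go rest
```

Each Grapheme then occupies the cells of its base character only, so cursor movement can step over a base character and its marks as one unit.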

That's probably all I'm going to do for the next release unless someone hollers with more specifics.

  Changed 5 years ago by judah

  • priority changed from major to minor

follow-up: ↓ 6   Changed 5 years ago by judah

  • priority changed from minor to major
  • milestone changed from 0.6.2 to 0.6.*

I got a report from Ahn Ki Yung that Haskeline doesn't handle double-wide characters. With his help, I was able to reproduce the issue:

I ran the example program in the Haddock documentation, and typed three Korean characters

안기영

I first typed in my name above and pressed return, and as you see the three characters got into the buffer. Then I typed my name again and tried to delete it with three backspaces, and here's what happened:

 kyagrd@kyavaio:~/tmp$ ghc-pkg list|grep haskeline
    haskeline-0.6.1.6, haskell-src-1.0.1.3, (haskell-src-exts-1.0.1),
    hashed-storage-0.3.6, haskeline-0.6.2, haskell-src-exts-0.3.12,
 kyagrd@kyavaio:~/tmp$ runghc haskelinetest.hs
 % 안기영
 Input was: 안기영
 % 안기
 Input was:
 %

I strongly suspect that Chinese and Japanese input will have the same problem as well.

Changed 5 years ago by guest

wide char support

in reply to: ↑ 5   Changed 5 years ago by guest

I have uploaded an attachment, based on 0.6.2.1, that fixes this problem under Linux with the Terminfo and Dumbterm backends. Tested under Ubuntu with gnome-terminal.

It uses the wcwidth.c from GNU readline; the length of every [Grapheme] value is now calculated by graphemeWidth :: [Grapheme] -> Int.

I don't have a Windows box, so I don't know of any problems under Windows.

Changed 5 years ago by guest

Multi line wide char support.

  Changed 5 years ago by guest

I have uploaded a multi-line wide char support fix, for Linux with Terminfo.

It is based on the haskeline_wc.tgz I previously uploaded to this ticket.

Because it needs to detect the end of every line, this fix has some performance impact. In GNU readline, such wide char support is optional, so the user can choose whether to accept that impact. Shared memory may make the impact smaller.

  Changed 4 years ago by judah

  • status changed from new to closed
  • resolution set to fixed

Fixed in 0.6.2.3 (to be released next week).

  Changed 4 years ago by soiamso

  • status changed from closed to reopened
  • resolution fixed deleted

The problem still exists on Windows. As the documentation for SetConsoleCursorPosition() linked below describes, SetConsoleCursorPosition() and the COORD structure use the same "cell" mechanism as terminals on *nix systems, so the "cell width" of every character needs to be calculated there too.

http://msdn.microsoft.com/en-us/library/ms686025%28VS.85%29.aspx

follow-up: ↓ 11   Changed 4 years ago by judah

Can you please give some specific steps which cause a problem with Haskeline's output in the Windows console? (For example, the comment from 09/17/09 demonstrated a problem in Linux with deleting Korean characters.)

in reply to: ↑ 10   Changed 4 years ago by soiamso

Replying to judah:

Can you please give some specific steps which cause a problem with Haskeline's output in the Windows console? (For example, the comment from 09/17/09 demonstrated a problem in Linux with deleting Korean characters.)

I hope your Windows system has CJK support or a CJK font set. Here is how to reproduce it.

Tested under WinXP with ghci from GHC 7.0.1.

It is the same as deleting the Korean characters: CJK and other wide chars have the same problem while navigating the string.

reproduce:

Prelude > 输入法

and then navigate (left arrow key 3 times) or delete (backspace 3 times)

Prelude > 输入

Two characters are still displayed, or "3" cells are still left on the console.

follow-up: ↓ 13   Changed 4 years ago by judah

Thanks for your help. I have not gotten CJK working in my console, but I may be able to get access to another machine that has it set up.

In the meantime, I pushed a patch to the repository; would you be able to test it on your machine? You can get the development version with darcs (http://darcs.net) using:

darcs get http://code.haskell.org/haskeline

Then cd into that directory and type cabal install --user to install it. Once it's installed, try building and running the program in examples/Test.hs. Let me know if you have problems getting those steps to work.

in reply to: ↑ 12   Changed 4 years ago by soiamso

Replying to judah:

Thanks for your help. I have not gotten CJK working in my console, but I may be able to get access to another machine that has it set up. In the meantime, I pushed a patch to the repository; would you be able to test it on your machine? You can get the development version with darcs (http://darcs.net) using: darcs get http://code.haskell.org/haskeline Then cd into that directory and type cabal install --user to install it. Once it's installed, try building and running the program in examples/Test.hs. Let me know if you have problems getting those steps to work.

I tested the patch "Attempt to fix #81 on Windows".

Left-arrow navigation and delete both work correctly in the single-line case.

Some problems:

1st: The right arrow key fails and causes an exception, even with a plain ASCII string. The error is: SetConsoleCursorPosition argument Invalid

2nd: A multi-line problem (maybe this is not the major problem, and there is no end-of-line tracking in the patch). With multi-line ASCII input, the counting is wrong. For example, type enough "a" characters to cross one line:

0:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

After pressing the left arrow key many times, the blinking cursor ends up on ":". If the input crosses two lines (3 lines in total), it ends up on "0".

If there are CJK chars at the end of the line, the counting is also wrong. For example, type enough "a" characters that the CJK chars just cross the line and leave one blank "cell" at the end of the first line:

0:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa输入法

After pressing the left arrow key many times, the blinking cursor ends up on ":", a wrong position.
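
The multi-line case reported above can be modelled with a small sketch. It is only an illustration, not the patch's code: the names advance and cursorAfter are hypothetical, the per-character cell widths are assumed to come from some width function, and the wrapping rule (a wide character that does not fit moves to the next row and leaves a blank cell) is taken from the behaviour described in this comment.

```haskell
import Data.List (foldl')

-- | Cursor position as (row, column), both zero-based, in terminal cells.
-- A column equal to the terminal width means "just past the last cell".
type Pos = (Int, Int)

-- | Advance the cursor past one character of cell width w on a terminal
-- that is cols cells wide.
advance :: Int -> Pos -> Int -> Pos
advance cols (row, col) w
  | w == 0 = (row, col)             -- zero-width mark: no movement
  | col + w > cols = (row + 1, w)   -- does not fit: wrap to the next row first
  | otherwise = (row, col + w)

-- | Cursor position after printing characters with the given cell widths,
-- starting from the top-left cell.
cursorAfter :: Int -> [Int] -> Pos
cursorAfter cols = foldl' (advance cols) (0, 0)
```

For example, on a 10-cell terminal, nine width-1 characters followed by one width-2 character give cursorAfter 10 (replicate 9 1 ++ [2]) == (1, 2): the wide character starts on the second row and the last cell of the first row stays blank. Counting characters instead of cells puts the cursor in a different place, which matches the misplaced cursor in the report.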
