Ticket #89 (closed defect: fixed)

Opened 6 years ago

Last modified 9 months ago

character references are not recognized in emphasized text

Reported by: g9ks157k@…
Owned by: SimonHengel
Priority: major
Milestone:
Version: 2.13.1
Keywords: lexer
Cc: g9ks157k@…

Description

If I write /Hello&#A0;world!/ in a doc string, I don’t get a non-breaking space between “Hello” and “world!” in the output but I get “&#A0;” instead.

Change History

Changed 6 years ago by SamB

  • keywords lexer added

This appears to be an infelicity in the lexer (GHC's compiler/parser/HaddockLex.x):

<string,def> {
  $special                      { strtoken $ \s -> TokSpecial (head s) }
  \<\<.*\>\>                    { strtoken $ \s -> TokPic (init $ init $ tail $ tail s) }
  \<.*\>                        { strtoken $ \s -> TokURL (init (tail s)) }
  \#.*\#                        { strtoken $ \s -> TokAName (init (tail s)) }
  \/ [^\/]* \/                  { strtoken $ \s -> TokEmphasis (init (tail s)) }
  [\'\`] $ident+ [\'\`]         { ident }
  \\ .                          { strtoken (TokString . tail) }
  "&#" $digit+ \;               { strtoken $ \s -> TokString [chr (read (init (drop 2 s)))] }
  "&#" [xX] $hexdigit+ \;       { strtoken $ \s -> case readHex (init (drop 3 s)) of [(n,_)] -> TokString [chr n] }
  -- allow special characters through if they don't fit one of the previous
  -- patterns.
  [\/\'\`\<\#\&\\]                      { strtoken TokString }
  [^ $special \/ \< \# \n \'\` \& \\ \]]* \n { strtoken TokString `andBegin` line }
  [^ $special \/ \< \# \n \'\` \& \\ \]]+    { strtoken TokString }
}

Really, I guess the problem is that this is done in the lexer at all -- it should probably be done in HaddockParse.y instead.

Changed 6 years ago by simonmar

One possible reason for doing this in the lexer is so that lone '\' characters don't cause a parse error. That used to be one of the most common causes of accidental parse errors in Haddock docs. If you look back through the logs you'll probably see when that change was made.

Changed 2 years ago by anonymous

  • milestone 2.5.0 deleted

Changed 2 years ago by SimonHengel

  • owner set to SimonHengel
  • status changed from new to assigned
  • version changed from 2.4.1 to 2.13.1
  • milestone set to 2.13.2

Changed 11 months ago by anonymous

  • milestone 2.13.2 deleted

Changed 9 months ago by Fūzetsu

  • status changed from assigned to closed
  • resolution set to fixed

Must have missed this ticket. This now works as long as you use the syntax properly: A0 is a hexadecimal value, so you need to write &#x rather than &#.

/Hello&#xA0;world!/ renders with a space as of 2.14.x
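
To make the decimal versus hexadecimal distinction concrete, here is a minimal sketch of a decoder for numeric character references. It follows the two lexer rules quoted above (`"&#" $digit+ \;` and `"&#" [xX] $hexdigit+ \;`) but is not Haddock's actual code; the names `toChar`, `decodeRef`, and `decodeRefs` are made up for illustration.

```haskell
import Data.Char (chr, isDigit, isHexDigit)
import Numeric (readHex)

-- Convert a code point to a Char, rejecting values outside the Unicode range.
-- (Hypothetical helper, not part of Haddock.)
toChar :: Integer -> Maybe Char
toChar n
  | n >= 0 && n <= 0x10FFFF = Just (chr (fromInteger n))
  | otherwise               = Nothing

-- Try to decode one numeric character reference at the start of the input.
-- On success, return the decoded character and the remaining input.
decodeRef :: String -> Maybe (Char, String)
decodeRef ('&' : '#' : x : rest)
  -- Hexadecimal form: "&#x" or "&#X", one or more hex digits, then ';'.
  | x `elem` "xX"
  , (ds@(_ : _), ';' : rest') <- span isHexDigit rest
  , [(n, "")] <- readHex ds
  = fmap (\c -> (c, rest')) (toChar n)
decodeRef ('&' : '#' : rest)
  -- Decimal form: "&#", one or more decimal digits, then ';'.
  | (ds@(_ : _), ';' : rest') <- span isDigit rest
  = fmap (\c -> (c, rest')) (toChar (read ds))
decodeRef _ = Nothing

-- Replace every well-formed numeric reference in a string and leave
-- everything else, including malformed references, untouched.
decodeRefs :: String -> String
decodeRefs [] = []
decodeRefs s@(c : cs) =
  case decodeRef s of
    Just (ch, rest) -> ch : decodeRefs rest
    Nothing         -> c : decodeRefs cs
```

With this sketch, `decodeRefs "Hello&#xA0;world!"` and `decodeRefs "Hello&#160;world!"` both yield a string containing U+00A0, while `decodeRefs "Hello&#A0;world!"` returns its input unchanged, because `A0` is not a run of decimal digits and the `x` marker is missing.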
