Ticket #250 (closed defect: fixed)

Opened 14 months ago

Last modified 8 months ago

haddock fails to parse U+00A0 (aka c2 a0 NO-BREAK SPACE) in @...@ block, but works in '>' (bird track)

Reported by: slyfox Owned by:
Priority: major Milestone:
Version: 2.13.1 Keywords:
Cc:

Description

Bad.hs:

-- | test
--
-- @
--     nbsp_c2_a0 = " "
-- @
module Bad where

Good.hs:

-- | test
--
-- > nbsp_c2_a0 = " "
module Good where

Be careful: in both cases the character between the quotes is not "\x20" (a plain space) but "\xC2\xA0" (U+00A0 NO-BREAK SPACE encoded as UTF-8).
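Since the two characters look identical in most editors, it can help to scan a source file for U+00A0 before blaming the parser. Below is a minimal sketch, assuming the input has already been decoded as UTF-8 text; `nbspPositions` is a hypothetical helper written for illustration and is not part of Haddock.

```haskell
module Main where

-- | U+00A0 NO-BREAK SPACE; its UTF-8 encoding is the two bytes C2 A0.
nbsp :: Char
nbsp = '\xA0'

-- | 1-based (line, column) positions of every NO-BREAK SPACE in the input.
-- Hypothetical helper for illustration only.
nbspPositions :: String -> [(Int, Int)]
nbspPositions src =
  [ (ln, col)
  | (ln, l) <- zip [1 ..] (lines src)
  , (col, c) <- zip [1 ..] l
  , c == nbsp
  ]

main :: IO ()
main = do
  -- Same shape as the doc comment in Bad.hs, with the NBSP written as an escape.
  let sample = "-- @\n--     nbsp_c2_a0 = \"\xA0\"\n-- @\n"
  print (nbspPositions sample)
```

To check a real file, replace `sample` with the result of `readFile "src/Bad.hs"`, provided the locale encoding is UTF-8.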

Haddock fails to parse Bad.hs, but works on Good.hs:

[sf] /tmp/y:haddock --html src/Good.hs src/Bad.hs
Haddock coverage:
haddock module header parse failed: Cannot parse header documentation paragraphs
   0% (  0 /  1) in 'Bad'
 100% (  1 /  1) in 'Good'

Found in haddock-2.13.2.1.

Originally it was found in the directory-layout source. Hackage parses it, but a local haddock does not:

http://hackage.haskell.org/packages/archive/directory-layout/0.4.0.0/doc/html/System-Directory-Layout-Traverse.html

Thanks!

Attachments

nbsp-bug.tar.gz (242 bytes) - added by slyfox 14 months ago.
nbsp-bug.tar.gz - compressed sources

Change History

Changed 14 months ago by slyfox

nbsp-bug.tar.gz - compressed sources

Changed 14 months ago by Fūzetsu

Short version: You'll have to wait. It has been fixed but changes aren't in HEAD yet. See bottom of the post for workaround info.

Long version: The difference between bird-track code and an @...@ code block is that text between the @ markers is still parsed as markup. That is, " " is read as a reference to a module whose name is just a space. In a bird track, markup is not accepted, so it's just a literal. What's happening is that Haddock fails to parse text containing such a module name (I can't locate exactly why it does this, though; I believe it should just take whatever is given). You're getting this particular error because you're putting docs where the module header normally goes (see http://www.haskell.org/haskellwiki/Programming_guidelines#File_Format).

In fact, on further investigation, it seems to fail whenever it encounters this character anywhere that valid markup can occur. Anything after a bird track is taken literally, which is why it doesn't happen there.

For the desperate: There's actually work being done on Haddock right now and this issue is no longer present. The changes are not in HEAD yet, but if you're really desperate, you can pull and build from this branch of my repo: https://github.com/Fuuzetsu/haddock/tree/pullbase

Mind that this branch is very often buggy as changes are being made. I confirmed that your example parses with commit https://github.com/Fuuzetsu/haddock/tree/b466f95c1576784e6905703a32c5b7195ceaa72e and I believe that this commit doesn't have major regressions.

Changed 14 months ago by slyfox

Thanks for the detailed info!

We are using this workaround for now: https://github.com/supki/directory-layout/commit/8c07408eab3b8d46ddacc7eb2d448b8772fb19a3

Changed 8 months ago by Fūzetsu

  • status changed from new to closed
  • resolution set to fixed

Forgot to close this a few days ago; the changes are now in HEAD and the original code parses fine.
