Parser
The parser has two main goals: to build an AST for the compiler, and to give good syntax errors to the compiler user.
Avoid using the try combinator
To generate good parser error messages, the use of Parsec.try should be avoided. Parsec.try is problematic because:
- The process of starting at the try, parsing forward, failing and then backtracking is slower than parsing with only a single token lookahead.
- Using try means that any failure backtracks to the innermost try, which is not where the parse actually failed. As a result, Parsec cannot produce good parser error messages. For instance, if the parser uses Parsec.try to parse a faulty list comprehension like this:
    x = [ x + 2, y | x <- [ 1, 2, 3, 4 ]]
the parser would get to the '|' token, fail and backtrack to the '=' token. When all other parser combinators also fail, the parser would flag an error at the '=' token. The try-less parser, on the other hand, would correctly flag the error at the '|' token.
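The effect on error positions can be reproduced on a much smaller grammar. The following is only a sketch: the grammar ("abc" or "az") and the names withTry, factored and errorColumn are made up for illustration and are not part of the compiler. It compares a parser that relies on try with a left-factored one that needs only one character of lookahead.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical toy grammar, not taken from the compiler:
--   item ::= 'a' 'b' 'c'  |  'a' 'z'

-- The first alternative is wrapped in try, so when it fails part way
-- through, the parser backtracks and attempts the second alternative.
withTry :: Parser Char
withTry = try (string "ab" *> char 'c') <|> (char 'a' *> char 'z')

-- Left-factored version: the shared 'a' is parsed once, and the next
-- character alone decides which alternative to commit to.
factored :: Parser Char
factored = char 'a' *> ((char 'b' *> char 'c') <|> char 'z')

-- Column at which a parser reports its error, or Nothing on success.
errorColumn :: Parser a -> String -> Maybe Int
errorColumn p input =
  case parse p "" input of
    Left err -> Just (sourceColumn (errorPos err))
    Right _  -> Nothing
```

On the input "abx" the real problem is the 'x' in column 3. Here errorColumn factored "abx" gives Just 3, while errorColumn withTry "abx" gives Just 2, because after backtracking the error comes from the second alternative, which gave up earlier in the input.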
In general, try-less parsing should not make the parser any more difficult to modify or extend.
As an example, a part of the expression parser (src/Source/Parser/Exp.hs) used to look like this:
    <|> -- \ case { ALT .. }
        -- overlaps with the next lambda form
        (Parsec.try $ do
            tok  <- pTok K.BackSlash
            pTok K.Case
            alts <- pCParen (Parsec.sepEndBy1 pCaseAlt pSemis)
            return $ XLambdaCase (spTP tok) alts)
    <|> -- \ PAT .. -> EXP
        do  tok1 <- pTok K.BackSlash
            pats <- Parsec.many1 pPat1
            _    <- pTok K.RightArrow
            exp  <- pExp
            return $ XLambdaPats (spTP tok1) pats exp
Both of these combinators start with a K.BackSlash token, and the first is wrapped in a Parsec.try. If the first one fails, the parser backtracks and tries the second combinator, which also starts with a K.BackSlash token.
Removing the Parsec.try in this instance involves replacing the above two combinators with:
    <|> do  tok <- pTok K.BackSlash
            exp <- pBackslashExp (spTP tok)
            return $ exp
which uses a new combinator pBackslashExp defined as:
    pBackslashExp :: SP -> Parser (Exp SP)
    pBackslashExp startPos =
        -- case { ALT .. }
        do  pTok K.Case
            alts <- pCParen (Parsec.sepEndBy1 pCaseAlt pSemis)
            return $ XLambdaCase startPos alts
        <|> -- PAT .. -> EXP
        do  pats <- Parsec.many1 pPat1
            _    <- pTok K.RightArrow
            exp  <- pExp
            return $ XLambdaPats startPos pats exp
Notice that the call to pBackslashExp passes in the position of the K.BackSlash token, so that it can be used in the return value.
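For experimenting with this refactoring outside the compiler, the same shape can be written as a small standalone program. This is a sketch under several assumptions: the Tok type, the simplified Exp type (alternatives and bodies are reduced to variable names, and the start position is just a column number), and the helpers pTokMaybe, pVar and parseToks are all hypothetical stand-ins rather than the definitions used in src/Source/Parser/Exp.hs.

```haskell
import Text.Parsec
import Text.Parsec.Pos (incSourceColumn)

-- Hypothetical token and expression types standing in for the real ones.
data Tok = TBackSlash | TCase | TArrow | TBra | TKet | TVar String
  deriving (Eq, Show)

data Exp
  = XLambdaCase Int [String]         -- start column, alternatives
  | XLambdaPats Int [String] String  -- start column, patterns, body
  deriving (Eq, Show)

type Parser = Parsec [Tok] ()

-- Accept one token if the function returns Just; each token is one column.
pTokMaybe :: (Tok -> Maybe a) -> Parser a
pTokMaybe = tokenPrim show (\pos _ _ -> incSourceColumn pos 1)

pTok :: Tok -> Parser ()
pTok t = pTokMaybe (\t' -> if t == t' then Just () else Nothing)

pVar :: Parser String
pVar = pTokMaybe (\t -> case t of { TVar v -> Just v; _ -> Nothing })

-- The backslash is parsed exactly once, then the rest is delegated.
pExp :: Parser Exp
pExp = do
  start <- sourceColumn <$> getPosition
  pTok TBackSlash
  pBackslashExp start

-- The token after the backslash (TCase or a TVar) selects the branch,
-- so no try is needed.
pBackslashExp :: Int -> Parser Exp
pBackslashExp start =
      do alts <- pTok TCase *> between (pTok TBra) (pTok TKet) (many1 pVar)
         return (XLambdaCase start alts)
  <|> do pats <- many1 pVar
         pTok TArrow
         body <- pVar
         return (XLambdaPats start pats body)

-- Run the parser, reporting only the error column on failure.
parseToks :: [Tok] -> Either Int Exp
parseToks ts = either (Left . sourceColumn . errorPos) Right (parse pExp "" ts)
```

With these definitions, parseToks [TBackSlash, TVar "x", TArrow, TVar "x"] yields Right (XLambdaPats 1 ["x"] "x"), and the malformed input [TBackSlash, TCase, TVar "a"] yields Left 3, the column of the token that follows TCase, rather than an error back at the backslash.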
