Version 5 (modified by erikd, 5 years ago)

Parser

The parser has two main goals: to build an AST for the compiler, and to give good syntax error messages to the compiler user.

To meet the second goal, good parser error messages, the use of Parsec.try should be avoided. Parsec.try is problematic for two reasons:

  • The process of starting at the try, parsing forward, failing and then backtracking is slower than parsing with only a single token lookahead.

  • Using try means that any failure backtracks to the innermost try, which is not where the parse actually failed. As a result, Parsec cannot produce good parser error messages. For instance, with the old version of the parser, parsing:
          x = [ x + 2, y | x <- [ 1, 2, 3, 4 ]]

would result in the list parser reaching the '|' token, failing, and backtracking to the '=' token. When all the other parser combinators also fail, the parser flags an error at the '=' token. The try-less parser, on the other hand, correctly flags the error at the '|' token.
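
The effect on error positions can be reproduced outside the compiler. The following is a minimal sketch using Text.Parsec from the parsec package on a plain String input. The toy grammar and the names lexeme, symbol, number, list, lhs, bindTry, bindFactored and errorColumn are invented for this illustration and are not part of the compiler's parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helpers for a toy grammar:  name '=' (list | number)

-- Run a parser, then skip trailing white space.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: Char -> Parser Char
symbol = lexeme . char

number :: Parser Int
number = lexeme (read <$> many1 digit)

-- A bracketed, comma separated list of numbers.
list :: Parser [Int]
list = between (symbol '[') (symbol ']') (number `sepBy` symbol ',')

-- The prefix shared by both alternatives: a one-letter name and '='.
lhs :: Parser ()
lhs = lexeme letter *> symbol '=' *> pure ()

-- Style to avoid: each alternative re-parses the shared prefix, so the
-- first one has to be wrapped in try.
bindTry :: Parser [Int]
bindTry = try (lhs *> list) <|> (lhs *> fmap pure number)

-- Try-less style: parse the shared prefix once, then choose the branch
-- with a single character of lookahead.
bindFactored :: Parser [Int]
bindFactored = lhs *> (list <|> fmap pure number)

-- Column at which a parser reports its error, or Nothing on success.
errorColumn :: Parser a -> String -> Maybe Int
errorColumn p s =
    either (Just . sourceColumn . errorPos) (const Nothing) (parse p "" s)
```

On the input "x = [1, 2 | 3]", errorColumn bindTry should give column 5 (the '[', where the second alternative gave up after the backtrack), while errorColumn bindFactored should give column 11 (the '|'), assuming a reasonably recent parsec release.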

Furthermore, try-less parsing should not make the parser any more difficult to modify or extend. Quite the contrary: the try-less parser is easier to debug and easier to extend once the concepts of try-less parsing are understood.

As an example, a part of the expression parser (src/Source/Parser/Exp.hs) used to look like this:

  <|>   -- \ case { ALT .. }
  	-- overlaps with the next lambda form
	(Parsec.try $ do
		tok	<- pTok K.BackSlash
		pTok K.Case
		alts	<- pCParen (Parsec.sepEndBy1 pCaseAlt pSemis)
		return	$ XLambdaCase (spTP tok) alts)

  <|>	-- \ PAT .. -> EXP
	do	tok1	<- pTok K.BackSlash
		pats	<- Parsec.many1 pPat1
		pTok K.RightArrow
		exp	<- pExp
		return	$ XLambdaPats (spTP tok1) pats exp

Both of these combinators start with a K.BackSlash token, and the first is wrapped in a Parsec.try. If the first one fails, the parser backtracks and tries the second combinator, which parses the same K.BackSlash token again.

Removing the Parsec.try in this instance involves replacing the above two combinators with:

  <|>	do	tok	<- pTok K.BackSlash
  		exp	<- pBackslashExp (spTP tok)
                return	$ exp

which uses a new combinator pBackslashExp defined as:

pBackslashExp :: SP -> Parser (Exp SP)
pBackslashExp startPos =
	do	pTok K.Case
		alts	<- pCParen (Parsec.sepEndBy1 pCaseAlt pSemis)
		return	$ XLambdaCase startPos alts

  <|>	do	pats	<- Parsec.many1 pPat1
		pTok K.RightArrow
		exp	<- pExp
		return	$ XLambdaPats startPos pats exp

Notice that the call to pBackslashExp passes in the position of the K.BackSlash token, so that it can be used in the returned expression.
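
For experimenting with this pattern in isolation, here is a self-contained sketch of the same factoring over a token stream, using Text.Parsec from the parsec package. The Tok and Exp types, satisfyTok, pVar, and the simplified case alternatives (plain variable names) are hypothetical stand-ins for the compiler's own definitions, and source columns simply count tokens.

```haskell
import Text.Parsec

-- Hypothetical token and AST types, standing in for the compiler's own.
data Tok = BackSlash | Case | Arrow | Var String
    deriving (Eq, Show)

data Exp
    = LambdaCase SourcePos [String]      -- \ case v1 v2 ..  (alternatives simplified)
    | Lambda     SourcePos [String] Exp  -- \ v1 v2 .. -> exp
    | EVar String
    deriving (Eq, Show)

type Parser = Parsec [Tok] ()

-- Accept one token when the function returns Just.
-- The column is advanced by one per token.
satisfyTok :: (Tok -> Maybe a) -> Parser a
satisfyTok = tokenPrim show (\pos _ _ -> incSourceColumn pos 1)

pTok :: Tok -> Parser ()
pTok t = satisfyTok (\t' -> if t' == t then Just () else Nothing)

pVar :: Parser String
pVar = satisfyTok (\t -> case t of { Var v -> Just v; _ -> Nothing })

pExp :: Parser Exp
pExp
    =   do  pos <- getPosition      -- position of the backslash
            pTok BackSlash
            pBackslashExp pos
    <|> (EVar <$> pVar)

-- Everything after the backslash. The next token alone selects the
-- branch, so no try is needed.
pBackslashExp :: SourcePos -> Parser Exp
pBackslashExp pos
    =   do  pTok Case
            LambdaCase pos <$> many1 pVar
    <|> do  vars <- many1 pVar
            pTok Arrow
            Lambda pos vars <$> pExp
```

With this sketch, parse pExp "" [BackSlash, Var "x", Case] should fail at the third token, the unexpected Case, rather than back at the backslash.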