Ticket #186 (closed bug report: fixed)

Opened 5 years ago

Last modified 5 years ago

parse . prettyPrint /= id

Reported by: dom Owned by: nibro
Priority: major Milestone: Blue Sky
Component: prettyprinter Version: 1.1.4
Keywords: Cc:

Description

I've been generating Haskell using haskell-src-exts but the prettyprinter isn't producing what I would expect.

I would expect parse . prettyPrint == id i.e. the AST should be unchanged if you prettyprint it then parse it.

Here's an example generated expression:

App (App (Var (UnQual? (Ident "pay"))) (Var (UnQual? (Ident "tPD")))) (App (Var (UnQual? (Ident "a"))) (InfixApp? (App (Var (UnQual? (Ident "length"))) (Var (UnQual? (Ident "tOD")))) (QVarOp (UnQual? (Symbol "+"))) (Lit (Int (-1)))))

Here's what prettyPrint produces:

pay tPD a length tOD + -1

Parsing it gives this (i.e. not the expression I first thought of):

InfixApp? (App (App (App (App (Var (UnQual? (Ident "pay"))) (Var (UnQual? (Ident "tPD")))) (Var (UnQual? (Ident "a")))) (Var (UnQual? (Ident "length")))) (Var (UnQual? (Ident "tOD")))) (QVarOp (UnQual? (Symbol "+"))) (NegApp? (Lit (Int 1)))

I would have expected the prettyprinter to produce this:

pay tPD (a (length tOD + -1))

Do I have to write my own prettyprinter? Do I have to put in explicit parentheses? The latter seems unsatisfactory as my generated AST is unambiguous and bracketing ought to be part of the prettyprinter. The former would be quite a lot of code as there are many cases to consider.

Change History

Changed 5 years ago by dom

Niklas Brober: Looking at your example, what you want is brackets to be inserted whenever the right subexpression in an application is non-atomic. That certainly seems reasonable to me.

Changed 5 years ago by dom

Neil Mitchell: I wanted that once, then I realised I was wrong

Should you insert brackets everywhere it's ambiguous? What about operators - you don't know the fixities, so can't tell if you should insert a bracket or not. I can't see any way for HSE to implement this feature correctly, and as such, it's really not a good feature to push down in to the pretty printer.

In HLint I have a bracketing module, which has served me well. Please take any ideas you need from it - http://community.haskell.org/~ndm/darcs/hlint/src/HSE/Bracket.hs . In particular, given a fully bracketed expression, I can call transformBracket to transform the expression, not caring about brackets, in a way that guarantees the right brackets are put back. There is also needBracket and isAtom which are very useful. If you call descendBi (transformBracket Just) it will automatically bracket your term as much as is necessary.

If you don't understand any of the ideas in Bracket then look at Uniplate (http://community.haskell.org/~ndm/uniplate) - it's where a lot of the ideas came from. If you're working with a HSE source tree without using Uniplate (or a competitor) then you're doing it wrong.

Changed 5 years ago by dom

Duncan Coutts: Surely you do want this. It's the biggest problem with the original haskell-src package, that it cannot print out any useful Haskell code obtained from the parser, because it forgets all the brackets.

Should you insert brackets everywhere it's ambiguous?

The minimal number that are necessary. The Show class manages to do this ok.

What about operators - you don't know the fixities, so can't tell if you should insert a bracket or not.

You can at least remember enough in the parser to print it out the same way. If we do not have fixity info available (eg because it's from some other module) we just keep the operators and expressions together in a list (such that they could be fully resolved if we applied fixity info)

eg

e1 %% e2 ?? e3

as (using made up AST names)

(e1, [(Op "%%", e2), (Op "??", e3)])

We can then print it out the same way. We could also resolve the bracketing if we had fixity info and in that case we could print out the expression again with the minimal number of brackets necessary.

I did it like this for an undergrad compiler, the round trip parsing and pretty printing works fine, as does getting from the plain syntax level to the fully resolved level by using fixity info.

I can't see any way for HSE to implement this feature correctly, and as such, it's really not a good feature to push down in to the pretty printer.

It probably wants to be a combination of the parser, AST and pretty printer.

Changed 5 years ago by dom

Niklas Broberg:

Surely you do want this. It's the biggest problem with the original haskell-src package, that it cannot print out any useful Haskell code obtained from the parser, because it forgets all the brackets.

I should point out that haskell-src-exts already fixes this for code obtained from the parser, by making the parser and AST remember the brackets. Or as you put it:

It probably wants to be a combination of the parser, AST and pretty printer.

Yes indeed.

But the problem at hand here is auto-generated AST code, where we cannot rely on the parser to do the right thing. There's help in the AST such that it's possible to explicitly insert brackets where needed, but I agree with Dominic that it shouldn't really be necessary in his case. Neil's point is well taken though - to do it correctly (or rather, minimally) for infix application, the pretty printer would need to be aware of the respective fixities involved.

However, that doesn't mean we can't do better than what it is now, but be conservative about it. Only insert brackets where it's clear that brackets must be inserted, which would be the case for Dominic's example. If the argument to an application is non-atomic, it needs brackets, there's nothing ambiguous about that. Nothing can be said so categorically for infix applications, so there we should assume that the fixities are already done in the correct way, or that brackets are inserted manually where needed.

Does that sound reasonable?

Changed 5 years ago by dom

Duncan Coutts:

Surely you do want this. It's the biggest problem with the original haskell-src package, that it cannot print out any useful Haskell code obtained from the parser, because it forgets all the brackets.

I should point out that haskell-src-exts already fixes this for code obtained from the parser, by making the parser and AST remember the brackets. Or as you put it:

It probably wants to be a combination of the parser, AST and pretty printer.

Yes indeed.

Ok, I misunderstood.

But the problem at hand here is auto-generated AST code, where we cannot rely on the parser to do the right thing. There's help in the AST such that it's possible to explicitly insert brackets where needed, but I agree with Dominic that it shouldn't really be necessary in his case.

Neil's point is well taken though - to do it correctly (or rather, minimally) for infix application, the pretty printer would need to be aware of the respective fixities involved.

To do it minimally yes, but correctly? In the AST you've got

InfixApp? Exp QOp Exp

so we know the tree structure, we just can't insert minimal brackets without knowing the fixity.

However, that doesn't mean we can't do better than what it is now, but be conservative about it. Only insert brackets where it's clear that brackets must be inserted, which would be the case for Dominic's example. If the argument to an application is non-atomic, it needs brackets, there's nothing ambiguous about that. Nothing can be said so categorically for infix applications, so there we should assume that the fixities are already done in the correct way, or that brackets are inserted manually where needed. Does that sound reasonable?

So to be clear, currently the printing behaviour is that no brackets are inserted for infix expressions which means the results are "wrong by default" (for ASTs constructed manually, not via the parser) but on the other hand this lets you insert the minimal (or pleasing) number of brackets.

The suggestion is to move to a "safe/correct by default" where brackets are inserted to preserve the tree structure of infix expressions. The problem then becomes, what if we want to have the minimal (or pleasing not-quite-minimal) number of brackets.

Right?

If I've understood right, then yes I think making the pretty printing right by default is a good idea, and then for the users/applications where optimising for fewer brackets is important, it should be a little extra work to supply the necessary information.

Perhaps like the ParseMode? has fixities :: [Fixity], give the PPHsMode the same (partial) fixities environment. For operators not in the environment we fall back to using brackets all the time, but for known operators we can the use minimal bracketing.

Another option I suppose would be to annotate the QOp used in InfixApp? with a Maybe fixity. The parser would annotate these when it knows them from its fixities env in the ParseMode?. For ASTs constructed manually the user would add them if they know or care. If not, they just get slightly more brackets than might strictly be necessary if the fixity were known.

Changed 5 years ago by dom

Neil Mitchell:

Adding brackets that MUST have been there, by default, sounds like a great idea. The alternative is getting it wrong, so I think that's very safe.

Adding brackets that MIGHT have been there is a lot less clear cut. One important consideration is that the fixities you parse/pretty-print with might be wrong, so it has to be sensitive to that. You have the options:

* Always do it (but then you get way too many brackets, and in the case where you mis-guess the fixities, you break the code) * Do it based on a table of fixities (might work if the parser fixities match the pretty-printer fixities, but could go wrong) * Annotate operators with fixities (this seems really wrong, and suffers from incorrect guessed fixities very badly) * Never do it

My preference would be:

-- put in enough brackets based on a fixities ensureEnoughBrackets :: [Fixities] -> a -> a

prettyPrint = show . ensureEnoughBrackets []

Always do the safe brackets, if people want to do a table-of-fixities approach they can easily do so. Also by putting this code in the pretty printer it's harder to reuse if you want to write a custom pretty print or similar - ensureEnoughBrackets may be independently useful.

Thanks

Changed 5 years ago by dom

Sebastian Fischer:

But the problem at hand here is auto-generated AST code, where we cannot rely on the parser to do the right thing. There's help in the AST such that it's possible to explicitly insert brackets where needed, but I agree with Dominic that it shouldn't really be necessary in his case. Neil's point is well taken though - to do it correctly (or rather, minimally) for infix application, the pretty printer would need to be aware of the respective fixities involved.

If I was planning to write a Haskell program that generates Haskell code, should I use HSE? Or is it more for generating nice looking code than correct code? Is there an alternative package that is more suitable for generating code that is meant to be executed rather than being looked at?

However, that doesn't mean we can't do better than what it is now, but be conservative about it. Only insert brackets where it's clear that brackets must be inserted, which would be the case for Dominic's example. If the argument to an application is non-atomic, it needs brackets, there's nothing ambiguous about that. Nothing can be said so categorically for infix applications, so there we should assume that the fixities are already done in the correct way, or that brackets are inserted manually where needed. Does that sound reasonable?

Personnaly, I would prefer Duncans approach to produce correct output by default and require additional fixity information if the output should contain fewer parens. (The reason for my preference is that I think it is quite annoying to insert parens manually into auto generated infix applications only to get correct output.)

In order to help reducing the amount of annoying parentheses in printed code, would it be sufficient - at least for the common case - to do something along the lines of

prettyPrint = prettyPrintWithFixities preludeFixities

and provide make prettyPrint, prettyPrintWithFixities, and preludeFixities public?

Changed 5 years ago by dom

Daniel Schussler:

In HLint I have a bracketing module, which has served me well. Please take any ideas you need from it - http://community.haskell.org/~ndm/darcs/hlint/src/HSE/Bracket.hs . In particular, given a fully bracketed expression, I can call transformBracket to transform the expression, not caring about brackets, in a way that guarantees the right brackets are put back. There is also needBracket and isAtom which are very useful. If you call descendBi (transformBracket Just) it will automatically bracket your term as much as is necessary.

Funny, I did the opposite approach the other day (not saying either is better :)); that is: parenthesize everything while building the AST (with a wrapper for App) and then:

deparenthesize :: (Data a) => a -> a deparenthesize = everywhereBut isString (mkT goE extT goT)

where

isString x = typeOf x == typeOf (undefined
String)

goE (App (Paren (App e1 e2)) e3) =

(App (App e1 e2) e3)

goE (Paren (Paren e)) = Paren e

goE (InfixApp? e1 op (Paren (InfixApp? e2 op' e3)))

| op == op' , knownAssociative op

= InfixApp? e1 op (InfixApp? e2 op' e3)

goE (InfixApp? (Paren (InfixApp? e1 op e2)) op' e3)

| op == op' , knownAssociative op

= InfixApp? (InfixApp? e1 op e2) op' e3

goE x = x

goT (TyApp? (TyParen? (TyApp? t1 t2)) t3) =

(TyApp? (TyApp? t1 t2) t3)

-- add rule for function types too

goT (TyParen? (TyParen? t)) = TyParen? t

goT x = x

knownAssociative x = x elem [QVarOp (UnQual? (Symbol "."))]

Though the infix thing doesn't quite work; apparently they still get printed with parens even if there are no parens in the AST? Or the rule just didn't match for some reason...

Changed 5 years ago by dom

Funny, I did the opposite approach the other day (not saying either is better :)); that is: parenthesize everything while building the AST (with a wrapper for App) and then:

I have utilities in HLint for that too - but I don't want to remove users brackets automatically

Btw, if you use uniplate you might find your code goes faster, and is simpler:

deparenthesize :: (Data a) => a -> a deparenthesize = transformBi goT . transformBi goT

where

(the rest exactly as before, but skipping isString)

I always use Uniplate when working with HSE - they go great together (do import Data.Generics.PlateData?, and you don't need any extra instances or anything)

Changed 5 years ago by dom

Also view patterns don't prettyprint as expected e.g.

(i:is) -> [(PViewPat (Var $ UnQual? $ Ident $ name ++ "View") (PTuple [PVar (Ident (fromJust i)),PApp (UnQual? (Ident "False")) []]))]

prints as

a aView -> (i, False) = (b) (i + -1)

not

a (aView -> (i, False)) = (b) (i + -1)

Of course (i + -1) is itself problematical.

Changed 5 years ago by nibro

  • status changed from new to closed
  • resolution set to fixed

I went for the minimalist approach, in that I only introduce parentheses for non-atomic right subexpressions of applications. Infix application is left untouched, and it is thus assumed that fixities are already handled correctly.

Language.Haskell.Exts> let x = App (App (Var (UnQual (Ident "pay"))) (Var (UnQual (Ident "tPD")))) (App (Var (UnQual (Ident "a"))) (InfixApp (App (Var (UnQual (Ident "length"))) (Var (UnQual (Ident "tOD")))) (QVarOp (UnQual (Symbol "+"))) (Lit (Int (-1)))))
Language.Haskell.Exts> prettyPrint x
"pay tPD (a (length tOD + -1))"

On a separate note, I've also fixed view patterns (and N+K patterns) to introduce the proper parentheses as per Dominic's last comment.

Case closed.

Note: See TracTickets for help on using tickets.