I really have to back up the recommendation for Wadler's "prettier printer" paper. I'm working on a Haskell parser and pretty-printer for Rust and I had pretty much given up on ever finding a lightweight pretty printer that would be able to choose whether to put something on one line or to split it up into multiple lines all by itself. For anyone needing this in Haskell, I strongly urge you to consider the modern rewrite of the initial library [1].

Yeah, I can second that recommendation on reading the paper [0]. It's one of the nicest I've read. There is also a Rust crate for it[1] - though it's not as... 'pretty'... as the Haskell EDSL, obviously.

I think there's beauty in the best you can do with a very simple implementation.

Even if you can improve "prettiness" with heroic technical efforts, I think the end result would be uglier. There's something special about simple rules coming together to address just the most important needs.

In the tech world, we tend to put more finesse into things... because we can. But as a result, most software is over-finessed and therefore not finessed at all. A great example is how tortured we let our CSS rules get, rather than allowing a design to relax a little into the constraints of the layout engine.

A hand-set newspaper isn't beautiful because it overcame every constraint, it's beautiful because it accepted its constraints, and made hard tradeoffs in service of a goal.

> If computers are good at anything, they are good at parsing code and analyzing it. So I set out to make this work, and prettier was born. I didn't want to start from scratch, so it's a fork of recast's printer with the internals rewritten to use Wadler's algorithm from "A prettier printer".

Bob Nystrom (munificent) disagrees[0] after writing one himself:

> The search space we have to cover is exponentially large, and even ranking different solutions is a subtle problem.

I will note that one could design a pretty printer that reliably had better performance if the formatting style they chose was simpler and more amenable to it.

My (mostly self-imposed) task was more difficult because I was trying to follow the existing style that humans were hand-applying to their Dart code, and that had a lot of tricky non-local cases that look nice but are hard to automate.

I have to say Dartfmt is one of my favorite things about using Dart. It works amazingly well and the fact that it's not configurable is a great decision [0]. I'm never afraid to peek into unfamiliar code because of that. So thank you.

You're welcome! That means a lot to me. I tried really really hard to make it produce output I and others would like.

> I'm never afraid to peek into unfamiliar code because of that.

This is exactly why dartfmt exists. It's not about making your code more readable to you, it's about making strangers' code more readable, because that lowers the bar to contribution between people in the ecosystem.

This is a great read! From having dipped my toes into this, I'd say a large part of the problem (from the library designer's point of view) is deciding what interface to expose to the library user and making that as language agnostic as possible. The question becomes: what are the fundamental things that usually characterize pretty-printing? Obviously (or maybe not), you'll want indentation, hard-breaks, soft-breaks, ... but where does it end? Less obviously, you may want to have alignment features. Figuring out the right set of building blocks is what I think Wadler did well.

Also, describing how these building blocks interact using laws is really incredibly useful! For that, credit goes (I think) to Hughes [1].
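To make the "building blocks" idea concrete, here is a minimal sketch of a Wadler-style document algebra in Python. It is an eager, stack-based approximation rather than the lazy Haskell formulation from the paper, and every name in it (`Text`, `Line`, `Nest`, `Concat`, `Group`, `pretty`) is my own choice for illustration, not any library's API. The only decision it makes is the one discussed above: print a group on one line if it fits in the remaining width, otherwise turn its breaks into newlines.

```python
from dataclasses import dataclass

@dataclass
class Text:
    s: str

@dataclass
class Line:
    flat: str = " "  # what the break renders as when its group stays on one line

@dataclass
class Nest:
    indent: int
    doc: object

@dataclass
class Concat:
    docs: tuple

@dataclass
class Group:
    doc: object

FLAT, BREAK = "flat", "break"

def fits(remaining, items):
    """True if the pending items fit in `remaining` columns up to the next newline."""
    stack = list(items)  # (indent, mode, doc) triples, top of stack at the end
    while remaining >= 0:
        if not stack:
            return True
        indent, mode, doc = stack.pop()
        if isinstance(doc, Text):
            remaining -= len(doc.s)
        elif isinstance(doc, Line):
            if mode == BREAK:
                return True  # a real newline ends the line we are measuring
            remaining -= len(doc.flat)
        elif isinstance(doc, Nest):
            stack.append((indent + doc.indent, mode, doc.doc))
        elif isinstance(doc, Concat):
            stack.extend((indent, mode, d) for d in reversed(doc.docs))
        elif isinstance(doc, Group):
            stack.append((indent, mode, doc.doc))
    return False

def pretty(width, doc):
    """Lay out `doc`, breaking a group only when it does not fit flat."""
    out, column = [], 0
    stack = [(0, BREAK, doc)]
    while stack:
        indent, mode, d = stack.pop()
        if isinstance(d, Text):
            out.append(d.s)
            column += len(d.s)
        elif isinstance(d, Line):
            if mode == FLAT:
                out.append(d.flat)
                column += len(d.flat)
            else:
                out.append("\n" + " " * indent)
                column = indent
        elif isinstance(d, Nest):
            stack.append((indent + d.indent, mode, d.doc))
        elif isinstance(d, Concat):
            stack.extend((indent, mode, x) for x in reversed(d.docs))
        elif isinstance(d, Group):
            flat_item = (indent, FLAT, d.doc)
            if fits(width - column, stack + [flat_item]):
                stack.append(flat_item)
            else:
                stack.append((indent, BREAK, d.doc))
    return "".join(out)

# A call expression described once, rendered at whatever width is available.
doc = Group(Concat((
    Text("foo("),
    Nest(2, Concat((Line(""), Text("aaa,"), Line(), Text("bbb")))),
    Line(""),
    Text(")"),
)))
```

With this description, `pretty(80, doc)` keeps the call on one line and `pretty(10, doc)` puts each argument on its own indented line. The `fits` check copies the pending stack on every group, which is fine for a sketch but not how you would write a production printer.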

This looks really nice. One of my favourite things about Golang is the use of gofmt and the resulting consistency of all Go programs. The idea of parsing the code to an AST representation and then printing it back as source is brilliant, although it can mess stuff up, like the chaining of `.then` mentioned in the post.

It also helps if you want to write code conversion tools, like gofix. I once stopped writing such a tool for a Python project because it was a pain trying to change only the affected lines, while keeping the idiosyncratic syntax of the rest of the files unaffected.

gofmt doesn't solve the hard problem that this blog post (and Wadler's paper) is about. From the article: "There's an extremely important piece missing from existing styling tools: the maximum line length."

gofmt's decision to not bother to require a line length may work somewhat for Go (which tends to encourage short lines by virtue of lacking expressiveness), but it doesn't work for lots of other languages, including JavaScript.

The fact that gofmt doesn't enforce some arbitrary line length is a blessing. I have a modern widescreen monitor and I use a maximized window text editor. I have no problem with 200 char length lines. I'm more easily annoyed by 80 (or 60, like in this case) char limits, as if I was viewing this code on punch cards.

It can be configurable. The Rust community, for example, sticks to 100 or 120 columns (and rustfmt lets you choose).

I have big screens but side by side diffs on unformatted code on GitHub are always wrapped. Plus I often split my screen. Not everyone uses maximized windows all the time, and in a shared codebase sometimes you want to try and cater to everyone.

You don't usually code for one screen since you're not generally coding for yourself alone.
IMHO unlimited line length + editor soft-wrap is the best way to deal with the problem, since it automatically adjusts the size to the reader's screen.

Same for tabs for indentation[1], which allow readers to set their own indentation width according to their own preferences.

I don't really understand why people keep using spaces + fixed line length. Old habits die hard I think.

That's exactly how I feel about it. Line length is something I couldn't care less about. If it makes sense to be on one line, then it should be. Otherwise, create the new line where it does make sense.

Theoretically, an editor could. I have yet to see one that does this well. Most do dumb wrapping, with no language sensitivity.

I think it's just a matter of programmer habit now -- folks expect their editor to mirror the code without formatting tweaks.

It gets confusing when you realize that programmers may insert their own indentation and wrapping, and sometimes you want to make this part of the actual file, and sometimes you want to remove it before serializing.

Yeah, that's a good point. I was referring more to the general idea of enforcing style by parsing the code and reconstructing the source from the AST, though. This is a better method than the linter approach that you get with eslint, for example.

My team recently ganged up on me to tell me my line spacing was f*cked. Upon reflection, it seems that as I'm writing code I group lines into 'working well - one line', 'not sure - two lines', 'probably will change - lots of space', 'hey look at me! - off by itself'. Apparently, this unconscious invisible system drives others crazy. Developers are so picky! ;-)

We've actually got a bunch of automatic eslint rules, but it doesn't correct for this. Looking at the various beautifiers, I couldn't find anything that really cleaned up line spacing very well, and ended up running a bunch of sed commands to remove lines and then spacing out things manually in a "sane" way (so picky!!). Having a formatter that actually parsed the AST and rewrote the code like gofmt will be very handy if it works - I'll have to try it out.

After using Elm and the elm-format package in Emacs for the last few weeks I can say that not having to format the source and just letting your editor do it for you on save or via a command is so very nice.

This is a trend that I feel is going to catch on across any language that can support it. It makes trivial decisions and arguments about formatting a thing of the past.

This has been pretty standard practice in the boring ol' enterprise for at least 10 or 15 years.

Java has Checkstyle, with format-on-save support for IntelliJ, Eclipse, and NetBeans at the very least. Visual Studio supports this for all of the .NET languages. Obviously, Golang has `go fmt`. Etc.

Not to be snarky, but whenever a bold new trend seems to be really "catching on"... it's almost always a rehash of something that the enterprise was doing back in the 90's, or academic researchers were writing papers about back in the 70's. The only things that ever really change in this industry are: (1) solutions that were once impractical on old hardware become practical on newer hardware, and (2) solutions that were over-engineered in their original form come back in more user-friendly simplified forms.

Many communities have been using their own solutions for it for a while (e.g. pylint / pep8 if you've used Python, clang-format for C++, js-beautify for JavaScript, etc.) and many companies use "linter as style enforcer", but you still need to watch out for some things that linters don't catch in most languages.

I wonder why this is suddenly accepted. Having very rigid coding standards used to be a no-go for projects you were either not paid for or in an industry where "creative expression" mattered more (i.e. not J2EE).

There were coding styles, but actually enforcing them wasn't something regularly done (linting is okay), for fear of "bondage & discipline" complaints.

...? The first makes me shudder with revulsion every time I see it --- that's not where commas go, dammit --- and there must be a reason, which I've never been able to figure out.

The Elm style guide mentions in passing that, with trailing commas, a diff that adds a field has to modify two lines instead of one, but that applies to leading commas too if you add the field at the beginning of the list.

I get why this bothers a lot of people, but OTOH I get why some people like it. I think the reason for the former in the mind of the user is that the comma clearly signifies that you are adding another value after it, and you save valuable key strokes not having to add a comma to the end of the preceding line just to add another k:v on the line you're concerned about.

- It allows you to easily delete a line without needing to also modify the line before.
- It also makes your diffs cleaner. With the latter, if you add a line, you'll also have to inspect the `radius: Float` line when doing a code review, even though it only added a comma.
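The diff argument is easy to check mechanically. Below is a small sketch using Python's standard `difflib`; the record contents (apart from the `radius: Float` field mentioned above) and the helper name `changed_lines` are made up for illustration. It counts how many lines a line-based diff reports as added or removed when a field is appended or prepended.

```python
import difflib

def changed_lines(before, after):
    """Count the added and removed lines in a line-based diff of two line lists."""
    diff = list(difflib.unified_diff(before, after, lineterm="", n=0))
    body = diff[2:]  # drop the '---' / '+++' file headers
    return sum(1 for line in body if line.startswith(("+", "-")))

# Comma-last style without a trailing comma: appending a field touches `radius` too.
comma_last_before = ["{ x: Float,", "  y: Float,", "  radius: Float", "}"]
comma_last_after = ["{ x: Float,", "  y: Float,", "  radius: Float,", "  color: String", "}"]

# Comma-first style: appending a field only adds one line...
comma_first_before = ["{ x : Float", ", y : Float", ", radius : Float", "}"]
comma_first_append = ["{ x : Float", ", y : Float", ", radius : Float", ", color : String", "}"]

# ...but prepending a field has the same problem in mirror image.
comma_first_prepend = ["{ color : String", ", x : Float", ", y : Float", ", radius : Float", "}"]

print(changed_lines(comma_last_before, comma_last_after))
print(changed_lines(comma_first_before, comma_first_append))
print(changed_lines(comma_first_before, comma_first_prepend))
```

So comma-first only wins if fields are usually added at the end, which is the caveat raised a few comments up.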

I've looked at pfff and couldn't really figure out what's supposed to work and what isn't. Would love to pretty print python with it. (I've been looking at the many different pretty printers in the python world and haven't found anything that I feel works well enough). Did you use it for pretty printing python?

This has been the single biggest missing tool in the JS ecosystem. I never even realized until I experienced the magic of gofmt. It should honestly be a standard tool for new languages going forward. Kudos to the creator. I'll try it out later today - hopefully it works well for us.

Looking at both GitHub repos, jsbeautifier looks more complicated (as a project), and I don't think it uses an AST, or at least I did not see anything that looked like it would create one. Looking at some of the code I saw `string.replace(...)`, so it seems to be string manipulation. It also seems to cover HTML and CSS, while the project discussed here is just JS, so more targeted, and the approach of going through the AST seems intriguing to me.

(100 characters wide, double indent on the continuation line. This is pretty standard for Java formatting.)

Stated differently, is this meant to be very opinionated in hopes that all JS would follow a uniform style? I think that can work for Go since it was like that from Day 1. For JS, though, it has been out for so long that many teams have developed their own preferred style. I suspect that most of them would avoid an opinionated formatter that differs in a few small ways from their in-house format, if only because it seems silly to break diffs and git blame for a sweeping formatting change.

Configuration has a cost, and the product will be better and more reliable if it's limited. As it stands right now, the configuration isn't ideal for my preferences, but I'm fine with that; either I'll

* use the tool and adopt new patterns (ones which, frankly, have very little impact on anything),
* keep doing this stuff manually (it's gotten me this far!), or
* fork the project, and bear the cost of maintaining the updated configuration myself.

If you're using typescript and emacs, https://github.com/ananthakumaran/tide is a very good integration with the built-in language server for semantic auto-completion and "jump-to" support. I use it every day on large projects.

I certainly think that language definitions should include a standard parsed format in addition to their human-friendly syntax. This can just be an s-expression representation of the normal, concrete syntax, without any transformations applied. This way, we don't have to build language-specific rules for precedence, offside/significant-whitespace, fixity, etc. into every tool for that language.

Implementations (compilers and interpreters) should include tools/modes for converting between these two representations, and should support running programs written in the parsed format in addition to those written in the human-friendly syntax. This isn't asking much: the difficult part is parsing the human-friendly syntax, which implementations must already do.

The benefit is that we don't end up with a mismatch between what the compilers/interpreters accept, and what the other tooling accepts (linters, formatters, doc generators, static analysers, search engines, syntax highlighters, use-finders, go-to-definition, refactoring tools, etc.). This would, incidentally, allow people to read and write code as s-expressions, but that's not the point; it's just about representing concrete syntax as closely as possible (plus arbitrary annotations, e.g. for line numbers, etc.), whilst exposing the structure in a machine-friendly way.

This would also make it much easier to make new tools, extend existing ones, and improve practices, e.g. like syntax-aware diffing, standard code formatting (i.e. tools like this one), version-control-friendly representations, tree-based editors, more powerful navigation in editors and IDEs, etc.
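As a rough illustration of what such a "standard parsed format" buys you, here is a sketch that leans on Python's own `ast` module and prints the tree as an s-expression. Two caveats: `ast` is an abstract syntax tree, so it drops comments and layout and is only an approximation of the concrete-syntax format described above, and `to_sexpr` is a hypothetical helper of mine, not something Python ships.

```python
import ast

def to_sexpr(node):
    """Render a Python AST node (or list of nodes) as an s-expression string."""
    if isinstance(node, ast.AST):
        parts = [type(node).__name__]
        for _name, value in ast.iter_fields(node):
            if value is None or value == []:
                continue  # skip absent optional fields
            if isinstance(value, ast.expr_context):
                continue  # Load/Store markers are noise for this purpose
            parts.append(to_sexpr(value))
        return "(" + " ".join(parts) + ")"
    if isinstance(node, list):
        return " ".join(to_sexpr(item) for item in node)
    return repr(node)  # plain values such as numbers, strings, identifiers

# Precedence is resolved once, by the language's own parser;
# a downstream tool only has to walk the nesting.
print(to_sexpr(ast.parse("1 + 2 * 3")))
print(to_sexpr(ast.parse("(1 + 2) * 3")))
```

The point is that the tool never has to know that `*` binds tighter than `+`; the nesting in the output already says so.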

Much better than "Standard JS" and its choice of no semicolons. I know about "Semi Standard JS", but I refuse to accept that such a small difference, which could be a configuration option, has to live in an entirely new project.

Semicolons are not optional in JavaScript: ASI (http://www.ecma-international.org/ecma-262/6.0/#sec-automati...) is an error correction scheme for novice programmers. The spec's parsing rules call the token that follows a missing semicolon the "offending token". There is no leeway here for style or preference.

“You can use standard --fix to fix most issues automatically.

standard --fix is built into standard (since v8.0.0) for maximum convenience. Lots of problems are fixable, but some errors, like forgetting to handle the error in node-style callbacks, must be fixed manually.”

"There's an extremely important piece missing from existing styling tools: the maximum line length. Sure, you can tell eslint to warn you when you have a line that's too long, but that's an after-thought (eslint never knows how to fix it). The maximum line length is a critical piece the formatter needs for laying out and wrapping code."

How does this handle tabs vs spaces? I looked at the post, the repo and the issues page and I didn't find anything. I'm especially concerned about not seeing this in the options of the API, since there are strong opinions on both sides.

It always prints spaces and you can configure the number of spaces with `--tab-width`.

For tabs/spaces and semicolons, I'm considering whether we should support those options. We need to figure out the goal of the project: is it to converge on a single general format, or is it to provide formatting options for a few large groups of people that have different opinions? There are issues on the project discussing this right now.

Whilst it's easy to flame about tabs vs spaces and other bike-sheddy issues online, I wonder how much people really care about it? After all, many people just get on with life when it comes to non-optional syntax, e.g. whilst many people flame about Python's use of significant whitespace, I doubt that's been a major reason for many people to actually use a different language, or some semicolon-and-braces-to-indentation preprocessor.

Perhaps stick to just one, as seems to be the case right now (whichever it is), and mention in the documentation that those wanting the other option are free to e.g. write a patch/fork on GitHub/postprocessor/etc., with the understanding that such support will eventually get merged in iff it's actively maintained for some amount of time, its author/maintainer is active in general development and maintenance for the project, and there's significant community adoption of such a patch.

Whilst not perfect, this sort of approach might better determine who actually cares enough about this to offset the community-splitting effects of allowing both; compared to a general accumulation of online grumbling.

Fantastic. One thing I've learned that's incredibly important is code style has to be uniform across all engineers. Seems like an unimportant point to devs that haven't worked with teams that enforce it, but it really is.

As was once told to me, my code should look like your code and yours like mine.

I attempted to use clang-format on a mixed typescript/tsx codebase and found that it mangled jsx expressions. This wasn't surprising but I suspect that typescript users probably overlap heavily with react users so it's not a great solution as is.

Yes, that's a good point. If you ever get a chance to compare to clang-format I'd be interested to read your results. My vague understanding is that clang-format has some sophisticated-ish algorithms for intelligently wrapping complex expressions. However, I also think its output is inferior on code like your .then() chain example.

The entire comment was patronizing. "Nice attempt" minimizes the effort involved and implies the author is a novice. "but currently wrong" categorizes the entire project as a failure because of a single bug. The followup "From my own experience writing a formatter" really drives the whole thing home.

Also, “currently wrong” is a really sweeping statement to make based on a single problem written in a fairly uncommon style. Yes, there are things which need to be fixed but that doesn't mean we have to be so dismissive of someone else's work which they've given away for free to the community.

I think what the GP is saying is to do the re-parsing before/after every time you reformat (and then presumably error exit if the ASTs don't match). When/if you no longer have any known bugs like that you might want to remove it, but it does seem like a good idea until you get there.
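A minimal sketch of that safety net, using Python's `ast` module as a stand-in for whatever parser the formatter already has. `checked_format` and its `formatter` argument are hypothetical names; the idea is only that the formatter's output is re-parsed and rejected if the tree differs from the input's tree.

```python
import ast

def same_ast(before, after):
    """Compare two sources structurally, ignoring layout and line numbers."""
    # ast.dump omits position attributes by default, so only structure is compared.
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))

def checked_format(source, formatter):
    """Run `formatter` and refuse its output if it changed the program's AST."""
    formatted = formatter(source)
    if not same_ast(source, formatted):
        raise ValueError("formatter output does not parse to the same AST")
    return formatted

# A formatter that only touches whitespace passes the check...
print(checked_format("x = ( 1+2 )\n", lambda s: "x = 1 + 2\n"))

# ...while one that accidentally regroups an expression is caught.
try:
    checked_format("x = 1 + 2 * 3\n", lambda s: "x = (1 + 2) * 3\n")
except ValueError as err:
    print("rejected:", err)
```

The check costs a second parse per file, which seems cheap next to silently changing what the code does.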

I used Recast on a codebase at work to migrate to a new code style, and I ran into a long tail of issues with preserving comments that looked like they would require some significant work to fix. I know the author of Recast (Hi Ben) and corresponded with him a little bit about it at the time.

I know this isn't really practical, but as a rule you shouldn't mix comments and code. Comments should be above any function definitions for this and many other obvious reasons. Removing inline comments seems like a reasonable default for a project like this, since those adopting it have already ceded some amount of control over the way their code is structured.

This will add semicolons all over your code depending on some rules. For example, if you write a return and put the value you want to return on the next line, it adds a semicolon after the return AND after the value, so the value is effectively ignored.

return
myValue;

becomes

return;
myValue;

It's a bit meh, because you have to remember these rules even if you use semicolons.

It seems a pretty format can be subjective, depending on personal preference and code complexity. I'm surprised that there aren't any formatters that are customizable, in that you break down the format of a function into a few sections and give the user options on how they would prefer each to look. (At least there's no formatter like that which I'm aware of.)

This is awesome - this reminds me of Google's clang-format which they use to format Angular automatically, but sounds like it's less sucky. The idea behind it is that developers tend to be opinionated about code style, so having a tool do it for you eliminates those arguments, which can be a hold up for PRs getting merged into a codebase.

This is awesome, but one request: can you please extend it into a real (ie, AST based) diffing tool? Current diffing tools all compare the text (which is basically an input format) rather than the AST, and hence end up focusing on formatting rather than actual changes.
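One cheap approximation of AST-based diffing is to print both versions through the same canonical printer and diff the results, so layout-only changes disappear. The sketch below does that for Python source with the standard `ast` and `difflib` modules (`ast.unparse` needs Python 3.9 or later); `ast_diff` is a made-up name, and since `ast` drops comments, comment edits are invisible to it.

```python
import ast
import difflib

def canonical_lines(source):
    """Parse the source and print it back in one canonical layout."""
    return ast.unparse(ast.parse(source)).splitlines()

def ast_diff(before, after):
    """Return only the added/removed lines that survive canonical printing."""
    diff = list(difflib.unified_diff(
        canonical_lines(before), canonical_lines(after), lineterm="", n=0))
    # Drop the two file headers and the '@@' hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]

# A pure reformat produces no diff at all.
print(ast_diff("x = [1,2,\n     3]\n", "x = [1, 2, 3]\n"))

# A real change shows up even when the spacing also changed.
print(ast_diff("x = 1\ny = 2\n", "x = 1\ny  =  3\n"))
```

A true tree diff would report moved or renamed nodes rather than lines, but this already separates "someone ran the formatter" from "someone changed the code".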

> Many of you know that I usually don't use JSX when writing React code. Over a month ago I wanted to try it out, and I realized one of the things holding me back was poor JSX support in Emacs.

huh? I've been using emacs with WebMode + tide + FlyCheck and it supports jsx just fine. Moreover, tide[1] provides great support for plain javascript (I've used it extensively on an ES6 codebase with great results).

I was actually just looking for something like this the other day because I'm forced to use actual JS like a json config, so it can use Require to import XML data and sprite sheet location right into a config my application loads. Problem with all the other formatters is they put everything on new lines, which would take up a giant amount of space.

Would it be reasonable to have an editor that ran this every time you resize your editor window?

I like to have multiple windows on screen at once, and will often adjust them to make best use of screen real estate. I tend to let it soft-wrap the text, but I'd love it if it would do soft-wrapping... prettier.

That sounds like an awesome idea (given that display is totally separated from the format-on-save anyways).

I got a quick feel for it with this in a terminal:

watch -n 0.1 'prettier --print-width $COLUMNS index.js'

... and unfortunately didn't really like it. Maybe in a pinch, but it still ends up feeling really cramped trying to read narrow columns in low-width terminals. At the other extreme (having the terminal be the only window on a 21:9 ultrawide), some of the lines get so long that you'd want a max anyways.

I'm playing with this in CodeMirror (browser based code editor) which talks to a local node.js server, and I think it looks really good, as long as you have lower and upper limits as you suggest. Biggest concern is stuff like the undo history. I think it just needs a "formatting" button or even background process when you leave it idle for a bit. Might be better to run on the browser but haven't bothered trying yet.
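The "lower and upper limits" part is simple to sketch. The snippet below clamps the detected width before building the same `prettier --print-width` invocation used in the `watch` experiment above; the bounds of 60 and 120 and the helper names are arbitrary choices of mine, and the command is only constructed here, not executed.

```python
import shutil

def clamped_width(columns, lower=60, upper=120):
    """Keep the print width within a readable range regardless of window size."""
    return max(lower, min(columns, upper))

def prettier_command(path, columns=None, lower=60, upper=120):
    """Build the argument list for a width-aware prettier run (not executed here)."""
    if columns is None:
        # Falls back to a default size when no terminal is attached.
        columns = shutil.get_terminal_size().columns
    width = clamped_width(columns, lower, upper)
    return ["prettier", "--print-width", str(width), path]

print(prettier_command("index.js", columns=30))
print(prettier_command("index.js", columns=300))
```

An editor plugin would call something like this from its resize hook and feed the result to the formatter for display only, leaving the on-disk file at the project's agreed width.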

In the section of 'respecting patterns', I was really expecting that it'd maintain my style of curly braces if they're already there, which has same indentation level (and on new line) for a function's opening and closing braces.

i guess i'm relatively new to javascript (can count years on one hand), but like many devs, i'm OCD about formatting and i'm a huge fan of eslint. i have found that curating a list of rules can require cycles (not to mention differences in opinion). i recently came across this project which had some allure, and i made note to try when i had a moment: https://github.com/sindresorhus/xo, how does "prettier" compare to this?

I'm personally not a fan of alignment because of its impact on the project's commit history and diffs. Not saying it's not nice to look at, I just prefer not changing many unrelated lines because a new variable name is one character longer than the names that surround it.

This tool looks amazing! Would you consider adding a flag for whether to terminate statements with a semicolon, or consider accepting a PR that added such an option? (My team is all-in on `standard`, so I'd love to be able to execute this in a way that would be an easy fixer for our linting rules.)

gofmt — the official formatter of the Go language — made a very wise choice for its formatting strategy. I think it's something other tools can learn from.

gofmt will fix indentation, spacing, and so on, but it will generally preserve structure. For example, this:

a:=Foo{value:42}

becomes, of course:

a := Foo{value: 42}

But! This:

a:=Foo{
value:42,
}

becomes:

a := Foo{
	value: 42,
}

That's because gofmt can't really pretend to know better than the developer here. Sometimes code does need to be loose (like in a DSL, a big declaration, a test (which must be readable), or similar). Sometimes it should be compact.

This means you never have to fight gofmt. Never once have I disagreed with its decisions.

gofmt works this way because its ruleset isn't exhaustive; it says that, yes, an indent must happen after a hanging "{", but the ruleset doesn't say that a line break must happen. If there's a line break, let's stick with it.
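The "fix spacing, keep the author's line breaks" strategy can be shown with a deliberately tiny toy. This is not how gofmt works internally (gofmt parses the code properly); it is a regex-based sketch that only understands `:=`, `key:value` pairs and brace nesting, with function names I made up, and it would mangle colons inside string literals. What matters is that it never adds or removes a newline.

```python
import re

def normalize_line(line):
    """Fix spacing inside one line without deciding where lines begin or end."""
    line = line.strip()
    line = re.sub(r"\s*:=\s*", " := ", line)     # short variable declaration
    line = re.sub(r"\s*:(?!=)\s*", ": ", line)   # key: value, but leave ':=' alone
    return line.rstrip()

def format_preserving_breaks(source):
    """Re-indent by brace depth, keeping every line break the author wrote."""
    out, depth = [], 0
    for raw in source.splitlines():
        line = normalize_line(raw)
        if line.startswith("}"):
            depth = max(0, depth - 1)
        out.append("\t" * depth + line)
        if line.endswith("{"):
            depth += 1
    return "\n".join(out)

# One line in, one line out; three lines in, three lines out.
print(format_preserving_breaks("a:=Foo{value:42}"))
print(format_preserving_breaks("a:=Foo{\nvalue:42,\n}"))
```

Both inputs from the examples above keep the shape their author chose, which is exactly the property a canonicalizing printer gives up.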

Prettier seems to have a strict normalization approach: AST goes in, canonical form goes out. For example, I tried this fictional piece of code:

Unfortunately, in every instance where I indent the code in this manner, it's for a specific reason. I wanted it on separate lines; the fact that it happens to fit on a single line doesn't matter at all. Prettier overruled my carefully indented code.

I often format code in a specific way for regularity: Every chunk in a block should have the same format, because each chunk is an instance of the same pattern. For example, I might have something like:

Maybe one could use some advanced heuristics to find an optimal balance between width vs. indentation vs. compactness; for example, in the above array, a clever formatter could see that it's an array of object literals, which means that it should prioritize regularity over compactness. If it's an array of something simple (like numbers, but not numbers with trailing comments), it can go compact. Maybe.

I don't use Prettier at the moment, but I know the strictness would drive me nuts. I predict that Prettier is going to cause a lot of frustration and heated discussion as a result of the one-size-fits-all approach. I don't think a canonical form for everything even makes sense; people need regularity (and no surprises), but not at the cost of readability.

I am trying to write my own formatter at the moment, and this is exactly the property I want, also. I'm working in Haskell, and one example is case statements. There are two ways I'd like to format them:

case foo of
  A -> ...
  B -> ...

or

case foo of
  A ->
    ...

  B ->
    ...

The first is when every case fits on one line, but the second is when at least one case needs to span multiple lines. I don't want this:

case foo of
  A ->
    ...

  B -> ...

I haven't yet come up with a good way to do that heuristically and efficiently, but maybe just looking at how the original code is formatted would be enough! Thanks for providing some food for thought :)
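The all-or-nothing rule itself is straightforward once each alternative's body is already rendered; the hard part described above is doing it efficiently inside a general printer. Here is a sketch in Python (not Haskell, and not tied to any real formatter) that takes hypothetical `(pattern, body)` string pairs and picks one layout for the whole `case`: compact if every alternative fits on one line, expanded for all of them otherwise.

```python
def format_case(scrutinee, alternatives, width=40, indent="  "):
    """Lay out every alternative the same way: all compact or all expanded."""
    header = f"case {scrutinee} of"
    compact = [f"{indent}{pattern} -> {body}" for pattern, body in alternatives]
    # Compact layout is allowed only if *every* alternative is a single short line.
    all_fit = all("\n" not in line and len(line) <= width for line in compact)
    if all_fit:
        return "\n".join([header] + compact)
    blocks = []
    for pattern, body in alternatives:
        body_lines = [indent * 2 + line for line in body.splitlines()]
        blocks.append("\n".join([f"{indent}{pattern} ->"] + body_lines))
    # Expanded alternatives are separated by a blank line, as in the example above.
    return header + "\n" + "\n\n".join(blocks)

print(format_case("foo", [("A", "1"), ("B", "2")]))
print(format_case("foo", [("A", "someVeryLongExpression arg1 arg2"), ("B", "2")], width=20))
```

In Wadler-style terms this is one group around the whole set of alternatives, so a break in any of them forces the break in all of them, and the mixed layout can never come out.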