A case for automatic code formatting

A case for automatic code formatting

August 2, 2020 (1169 words)

As you might have seen on twitter, I’m pretty much of one opinion where it comes to code formatting and enforcing a common rule in your CI. Here is a sample selection of me screaming into the night, with all the hard to understand brevity that comes with twitter.

Reminder to end the discussion before it starts:

Enforce code formatting with a "clang-format -style=WebKit" in your CI as part of every single PR. Fail every single build that doesn't pass the formatting pass and prevent merge. Then move on to actual programming!

— Jim Tilander July 4, 2020
If your code review involves talking about formatting or lintable warnings, you are doing it wrong! Go and enforce formatting (any formatting) with automatic CI rules, as well as enforcing linting and warning free code.
— Jim Tilander September 27, 2019
A coding style guide should always be accompanied with infrastructure and configs (e.g. clang-format configs) to minimize adherence burden!
— Jim Tilander February 9, 2017
Trust programmers to have opinions about code formatting. In the end it doesn't matter. Understanding code is deeper than just formatting.
— Jim Tilander January 5, 2012

Since code formatting and whitespace style is something that all programmers can have opinions on, of course everyone does. This is why sometimes we jokingly refer to the braces war.

Programmers can change! 10 years of braces war and I’m now typing opening braces at the same line…

— Jim Tilander October 8, 2018

Yet, it is important in a professional setting to realize that formatting matters less than code structure, and the code itself. The formatting of the code should in fact rarely be the subject of any discussion.

Yet it is very common to see too senior people discuss code formatting. Which to me seems like a giant waste of time and money. Discussing where the whitespaces go in a PR by people with capacity to solve much harder problems is a waste of talent. Yet the people who tend to be good programmers tend also have a phase where they meticulously align all their names, indent the keywords just so and have tables and formulas just right in the code. Most get over this phase and then focus on the code, but some don’t.

What if you have no standard?

What if you had a policy that “anything goes”? You can check in any type of formatting? Well, a couple of natural consequences follows:

  • Code formatting and style will vary across the codebase
  • It’s easier to have little islands of code and systems
  • Poking into other people’s code will be a thing
  • People will spend time and energy formatting code manually because they are in other people’s code.
  • Heaven forbid if you accidentally forgot to format the code as a guest coder.
  • Formatting on save is a deadly sin.
  • You will have tab v/s spaces discussions.
  • Code within a single .cpp file will look wildly different as you page through it.
  • Merging branches causes a lot of “false” conflicts purely due to formatting.

In a group of 20 programmers I could also tell without bringing up the source control history, who had written the section of code. Which meant that we had 20 distinct code styles and formatting.

If that last paragraph didn’t scare you, I’ll also mention this: this is not a hypothetical case.

Iron fist checks before trunk

One can remove all of these issues by simply enforcing that all code run through an automatic code formatter. And then you simply fail CI if the code doesn’t match the formatter.

The last point is important. There are a couple of ways that you can enforce that everyone has formatted the code properly. One of the more flexible ways for people to work is to just check if the code is formatted correctly, just before you merge to trunk. If it’s not, then don’t allow the merge.

This allows people to locally use any formatter, editor, experimental code snippets etc – without running afoul of the enforcement. The only time where the code has to be formatted correctly is when you want to promote your code to trunk.

Recipe

  1. First settle on a major, out of the box, style. I usually pick WebKit.
  2. Format all the code in the repository to this style. Really. Without this it doesn’t work.
  3. Stick a rule in your CI that fails the build if you don’t have proper formatting.
  4. Require successful CI builds before merging to trunk.
  5. Provide a 1 step automatic formatting script that developers can use for formatting.

Typical workflow

So when I sit down and code, I normally do this:

$ git checkout master
$ git pull
$ git checkout -b feature/new-thing
$ git commit --allow-empty -m "Starting on task xxx, checking if CI works" && git push

This will kick of a baseline CI build for me, make a PR to comment on and generally make sure that I start from the most recent in master. Now I can start working.

Once I’ve done a bunch of work, I generally run this:

$ git diff
$ git add .
$ make format
$ git diff
$ git commit -m "My feature done"
$ git push

So the first diff just checks my work that I’ve done and makes sure that it’s the expected work. After that I stage my files. And after that I run a make target that will format all the files. Notice that the second diff will just show the differences in formatting! I’ve now got the option to unstage if the formatting caused some catastrophic changes.

Benefits

Having an uniformly formatted codebase brings together the the necessary conditions for rapid trunk based team collaboration. It doesn’t produce this rapid team collaboration, but without a uniformly formatted codebase it’s going to be very difficult to arrive at a highly functional team practicing trunk based development.

You remove most of the formatting only merge conflicts, making life easier for your fellow developers. This in turn empowers you to make quick semantic changes in code that is in flight by your team mates, without worrying about causing headaches for them overly and thus minimizing the chances that they will lynch you.

You will also have 100% less chance of talking about formatting in your PRs, which increases the chances that you will talk about productive things in the PR.

In closing

Everyone will have an opinion about how / where the braces and spaces go. How to format the code. In the end, this doesn’t matter overly, except that it should all be formatted the same way. Pick a way and then enforce it. The benefits can then be acted upon by having more frequent checkins, merges and tighter collaboration. Overall just enacting most of the promise in trunk based development and minimizing risk in development and also surface the important code first, while promoting collaboration and reducing the noise in discussing whitespace. Who doesn’t want this?

Resources