Suppose a large project (say ~2,000 Python files and ~30 people, each working on multiple features on multiple branches) does not adhere to a specific Python formatting standard, and you want to reformat all of its Python code with the formatter black. There may also be several semi-frozen "master" branches that are occasionally merged with the newer code.
Running black on the entire code base would cause massive conflicts with every open branch. How would you go about this process while minimizing conflicts?
I just had to do something similar, so I thought I'd share what worked for me.
Say you have a master branch and a feature branch that split off from master many commits ago. At some point you configured pre-commit on the master branch with black, isort, and other auto-formatters. Now you want to rebase or merge the feature branch into master and avoid all the formatting conflicts.
The main idea here is to rewrite commits in the feature branch with the new formatters, so that any differences between the branches will not be format-related.
First, we need to find the point where feature was split from master:
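A minimal sketch of this step, assuming the two branches are literally named `master` and `feature`. `git merge-base` prints the hash of their best common ancestor; the `split_point` wrapper is a hypothetical helper, only there so the branch names are easy to swap:

```shell
# Print the commit where two branches diverged.
# The default names "master" and "feature" are assumptions; pass your own.
split_point() {
    git merge-base "${1:-master}" "${2:-feature}"
}

# Usage, from inside the repository:
#   split_point                  # master vs. feature
#   split_point main my-branch   # any other pair of branches
```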
We can give it a name with some shell trickery:
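One way this could look, as a sketch: command substitution feeds the hash from `git merge-base` straight into `git checkout -b`, so the split point gets the branch name feature_base used in the rest of this answer. The wrapper name `make_feature_base` and the default branch names are assumptions for illustration:

```shell
# Create (and switch to) a branch called feature_base at the split point.
# Defaults assume the branches are named "master" and "feature".
make_feature_base() {
    git checkout -b feature_base "$(git merge-base "${1:-master}" "${2:-feature}")"
}
```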
Now let's apply the formatters to feature_base. For me, this simply involved copying .pre-commit-config.yaml and pyproject.toml from the master branch.
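Roughly, the step could be sketched like this. It assumes you are on feature_base, that pre-commit is installed, and that master carries both config files; the function name `format_feature_base` is hypothetical, and the commit message is chosen to match the "Applied formatting" commit mentioned later:

```shell
# Run while on the feature_base branch.
format_feature_base() {
    # Take the formatter configuration from master; this also stages the files.
    git checkout master -- .pre-commit-config.yaml pyproject.toml
    # Reformat every file. pre-commit exits non-zero when a hook modifies
    # files, which is expected here, so the exit status is ignored.
    pre-commit run --all-files || true
    # Stage the reformatted files and commit; -n skips the pre-commit hook.
    git add -u
    git commit -n -m "Applied formatting"
}
```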
That pre-commit step might take a while, as it will auto-format every single file in your repository. Notice that I added the `-n` option to the `git commit` command, which skips running the pre-commit hook again. This is just to save some time.

Now we are ready for the rebase. For convenience, let's give the rebased feature branch a new name, so we'll have the old one easily available in case something goes wrong:
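As a sketch, assuming the original branch is named `feature`: `git checkout -b` creates feature_formatted at the same commit and switches to it, so the old branch stays untouched as a backup. The wrapper name `make_formatted_copy` is hypothetical:

```shell
# Create feature_formatted as a copy of the feature branch and switch to it.
# The original branch keeps pointing at the unmodified commits.
make_formatted_copy() {
    git checkout -b feature_formatted "${1:-feature}"
}
```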
Now to rewrite the feature_formatted branch:
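Pieced together from the parts discussed in this answer, the command could look like the sketch below. It assumes the branch names feature_base and feature_formatted from the previous steps and that pre-commit is installed; the function wrapper `rewrite_with_formatting` is hypothetical and only keeps the long command in one place:

```shell
# Replay feature_formatted on top of feature_base, reformatting each commit.
rewrite_with_formatting() {
    # -Xtheirs : resolve conflicting hunks in favor of the replayed commit
    # --exec   : after each replayed commit, undo it, run the formatters on
    #            the staged files, and commit again with the same details
    git rebase -Xtheirs \
        --exec 'git reset --soft HEAD^; pre-commit run; git add -u; git commit -C HEAD@{1} --no-edit' \
        feature_base feature_formatted
}
```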
Let's go over the details of this command:
`git rebase feature_base feature_formatted` tells git to rewrite the commits in the feature_formatted branch to be on top of the feature_base branch.

Normally, this would result in many merge conflicts, all caused by the formatting changes we performed above. The `-Xtheirs` option tells git to resolve every conflicting hunk in favor of the commit being replayed, so the conflicting parts of each file end up as they are in the original feature branch.

The `--exec` option tells git to run a specific shell command after every commit in the rebased branch. What we want to do here is to apply the formatting again, since the `-Xtheirs` option left us with unformatted files. In order to do that, we "undo" the last commit, apply pre-commit, and then commit again.

- `git reset --soft HEAD^` - You might have used `git reset` before to undo a commit. The `--soft` option means that the code changes remain in the staging area, ready to be committed again.
- `pre-commit run` - Since everything is staged for commit, running the pre-commit hook just applies the formatting to all the changed files.
- `git add -u` - This command adds whatever changes the pre-commit run made into the staging area.
- `git commit -C HEAD@{1} --no-edit` - The `-C` parameter tells git to copy the commit message and authorship from another commit. That other commit is `HEAD@{1}`, which means "the last place HEAD was". In our case that's the commit we were just on before the `git reset` command.

That's it! We're done. Now the feature_formatted branch can be merged into master without any format-related conflicts. If you want to rebase onto master, you should avoid the "Applied formatting" commit from feature_base, so you should use this command:
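A sketch of such a command, assuming the branch names used throughout this answer: `git rebase --onto` replays only the commits that come after feature_base, so the "Applied formatting" commit itself is left behind. The wrapper name `rebase_onto_master` is hypothetical:

```shell
# Replay the commits in feature_base..feature_formatted on top of master,
# skipping the "Applied formatting" commit that sits on feature_base.
rebase_onto_master() {
    git rebase --onto master feature_base feature_formatted
}
```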