Let's say we have a pull request merging the commits from branch A into branch B, and we can perform the merge either as a normal merge or as a squash merge. If we first perform a squash merge (all the commits are combined into a single commit) and then submit another, similar PR from branch A to branch B, why does git still allow a normal merge (all the commits are kept)? The changes have already been merged into branch B by the squash merge, so why doesn't the second, normal merge cause any conflict?
How does git handle squash merge vs normal merge?
118 views. Asked by Jason Yu. There are 3 answers.
...if we first perform the merge with squash merge (all the commits will be combined into only one commit) and then submit another similar PR ... why does git still allow the merge in the normal way (all the commits will be kept)?
To restate the problem: first you merge a branch with squash, and then you merge the same branch again with a normal merge.
After the first PR with the squash merge, you should observe that the second PR brings in a bunch of commits but no file changes. This is why there aren't any conflicts: you can't have conflicts if there is no change in state. Git "allows" this because a merge brings in the new commits, and sometimes that makes sense even if there isn't a change in state. A common scenario is when you cherry-pick some commits from a development branch into a release branch so they can be deployed sooner. After that you may merge the release branch back down to the development branch to make sure it stays in sync, but since those changes are already in both branches, the merge only brings in the new commit IDs without any actual changes.
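The cherry-pick scenario can be tried out in a throwaway repository. This is only a sketch: the branch names (development, release), the file names and the commit identity are made up for the example. It records the tree of development before and after merging release back down, and counts the parents of the resulting commit.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/development   # branch names are made up for this sketch
git config user.name "Example"                 # placeholder identity
git config user.email "example@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "base"
git branch release

# Development moves on: some feature work, then an urgent fix
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"
echo fix >> app.txt
git commit -q -am "urgent fix"

# Ship the fix early by cherry-picking only that commit onto release
git checkout -q release
git cherry-pick development > /dev/null

# Sync release back down into development
git checkout -q development
tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q --no-edit release
tree_after=$(git rev-parse 'HEAD^{tree}')
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')

echo "tree before merge: $tree_before"
echo "tree after merge:  $tree_after"
echo "parents of the new commit: $parent_count"
```

If the two tree IDs match and the new commit has two parents, the merge recorded the cherry-picked commit in the history of development without changing any file.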
BTW, intending to squash merge followed by a non-squash merge of the same branch is pretty pointless. Instead, before you've done either option, decide if you want the granularity of the commits (regular merge) or you don't (squash merge). Then pick just one. Perhaps the only time it would make sense to perform a regular merge of an identical branch after a squash merge, would be if you already did the squash merge and regretted it, and then realized you wanted to keep the granularity of the commits. Note the reverse is not true; it would never make sense to purposely squash merge the same branch after a regular merge, since the squash merge will add literally zero value, with there being no new content and no existing commits to merge.
The squash merge brings in the changes. The second true merge[1] brings in no changes but connects the two histories with a parent pointer to each.
Given this state:
cd /tmp
dir=$(mktemp -d)
cd "$dir"
git init
# The rest assumes the initial branch is called main
git symbolic-ref HEAD refs/heads/main
touch readme.md
git add readme.md
git commit -m readme
git checkout -b other
printf "change 1\n" >> a.txt
git add a.txt
git commit -m 'a first'
printf "change 2" >> a.txt
git add a.txt
git commit -m 'a second'
You now have:
a second (other)
a first
/
main
You do a squash merge:
git checkout main
git merge --squash other
# --squash only stages the result and does not commit, so finalize it like this
git commit --no-edit
and get:
squash
| a second
| a first
| /
main
The states of main and other are identical:
$ git diff main other
[empty]
But you can still do a true merge:
git merge --no-edit other
And you have:
merge -
squash \
| a second
| a first
| /
main
Why does git allow you to do a true merge? Because you are telling it that you want to connect these two histories. And they haven’t been connected yet; the squash merge has no relation to other since it just takes the changes from other and makes a new commit, not related to other (as you can see in the diagram). It doesn’t matter that main and other have the exact same tree;[2] the histories still need to be connected.
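One way to check this is to count parent pointers and ask git whether other is reachable from main, before and after the true merge. The following is a sketch that rebuilds the same state in a fresh temporary repository; the commit identity is a placeholder and the variable names are only for illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"   # placeholder identity
git config user.email "example@example.com"

touch readme.md && git add readme.md && git commit -q -m readme
git checkout -q -b other
printf 'change 1\n' >> a.txt && git add a.txt && git commit -q -m 'a first'
printf 'change 2\n' >> a.txt && git add a.txt && git commit -q -m 'a second'

git checkout -q main
git merge -q --squash other
git commit -q --no-edit

# The squash commit has a single parent and does not make other reachable
squash_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
if git merge-base --is-ancestor other main; then connected_before=yes; else connected_before=no; fi

tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q --no-edit other
tree_after=$(git rev-parse 'HEAD^{tree}')

# The true merge has two parents and connects the histories
merge_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
if git merge-base --is-ancestor other main; then connected_after=yes; else connected_after=no; fi

echo "squash commit parents: $squash_parents, other reachable from main: $connected_before"
echo "merge commit parents:  $merge_parents, other reachable from main: $connected_after"
```

Here git merge-base --is-ancestor exits with 0 when the first commit is reachable from the second, which is what "connected" means in the diagrams.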
The squash merge might as well have been done by a different person who came along and did a commit with the same contents on top of the initial main commit:
unrelated
| a second
| a first
| /
main
Maybe this person had the same idea as you and happened to implement the same thing. And if you did a true merge then you would get the same result:
merge ---
unrelated \
| a second
| a first
| /
main
What git-merge(1) does when the tree contents are the same
Say you have main and other and they have the same tree (empty diff). By default it will:
1. If other is ahead of main and main has no commits that other does not have:[3] do a fast-forward
2. If other is ahead of main and main has commits which are not reachable from other:[4] do a true merge
The merging of the contents of these two will be a no-op since there is nothing to merge. The only work that needs to be done is to make parent pointers in the case of (2).
And (2) is always the case if you first did a squash merge of other into main. Because main will have at least one commit which is not reachable from other, namely the squash merge.
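Both cases can be reproduced with empty commits, which keep the trees identical while still moving the branches. This is a sketch under that assumption (empty commits are just a convenient stand-in); the commit messages and the identity are made up.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"   # placeholder identity
git config user.email "example@example.com"

echo hello > readme.md && git add readme.md && git commit -q -m base

# Case 1: other is ahead of main and main has nothing extra
git checkout -q -b other
git commit -q --allow-empty -m "other: empty commit 1"
git checkout -q main
git merge -q --no-edit other
if [ "$(git rev-parse main)" = "$(git rev-parse other)" ]; then
  case1="fast-forward"
else
  case1="true merge"
fi

# Case 2: each branch has a commit the other one lacks
git commit -q --allow-empty -m "main: empty commit"
git checkout -q other
git commit -q --allow-empty -m "other: empty commit 2"
git checkout -q main
git merge -q --no-edit other
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
if [ "$parents" -eq 2 ]; then
  case2="true merge"
else
  case2="fast-forward"
fi

echo "case 1: $case1"
echo "case 2: $case2"
git diff --quiet main other && echo "trees are still identical"
```

In case 1 main simply moves to the commit other points at; in case 2 a new commit with two parents is created even though there is nothing to merge content-wise.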
Notes
[1] I’ll refer to a merge which creates a new commit which points to both parents as a true merge since I think the git(1) documentation does that.
[2] The contents are identical: files readme.md and a.txt with the same file contents. That’s what we found when we did the diff.
[3] For example: main - a - b - c (other)
[4] For example: main - a - b; a - b2 (other)
When you squash, git does not keep any information (other than, perhaps, the commit message) about the commits that were merged, so, unlike with a real merge, git cannot know that the original branch was already merged. That's why using squashes is discouraged when you are dealing with long-running branches.
After a real merge, the common ancestor of the two branches moves forward. After a squash-merge it does not move, so later merges between the two branches will easily produce conflicts: either conflicts that were already taken care of in previous squash-merges, or new ones.
To explain it graphically, suppose you have this for starters:
At this point, what is the latest common ancestor? AAA, right?
Now, suppose you do a real merge. We get something like this:
Good. What is the latest common ancestor? The answer: it's GGG. Make sure you digest that before moving on.
Now, suppose you keep on working on both branches and you end up with this:
If you tried to merge again, git would need to consider the changes after the last common ancestor, which we already know is GGG, right? So, git would need to consider this for the merge:
Now, let's go back to see how it would be if we had squash-merged instead. After the first squash-merge, we would get:
And now, what is the latest common ancestor? It's still AAA. On both branches you now have a lot of common code, plus not-so-common code that might have been adjusted during conflict resolution for the squash-merge in HHH. How would it look if you had continued working on both branches?
If you tried to merge, git would have to start over and consider the changes from AAA, not GGG as before. Given that you have a lot of common code coming from the squash, and that it's very likely both branches have touched those sections of code (which makes them different from git's POV), you will get a bunch of conflicts. It's actually very likely you will get the same conflicts you got when you did the first squash merge (the content on each branch will be a little bit different from the original conflict, but it will be the same section of code), plus a few more, just for the fun of it.
So, all in all, it's OK to squash, but it should be done for short-lived branches, like feature branches that you work on and kill once they are merged. If you are dealing with long-running branches, make sure to use real merges, unless you would like to take a peek at what hell looks like.
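The difference in the common ancestor can be checked with git merge-base. The sketch below does not reproduce the exact graphs described above; it uses a made-up minimal history (commits AAA, BBB, CCC and the helper branches real and squashed are names chosen for this example) and merges the same feature branch once normally and once with --squash.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"   # placeholder identity
git config user.email "example@example.com"

echo a > file.txt && git add file.txt && git commit -q -m AAA
base=$(git rev-parse HEAD)

git checkout -q -b feature
echo b >> file.txt && git commit -q -am BBB
feature_tip=$(git rev-parse HEAD)

# main gets its own commit so that neither merge can fast-forward
git checkout -q main
echo c > other.txt && git add other.txt && git commit -q -m CCC

# Variant 1: real merge of feature
git checkout -q -b real main
git merge -q --no-edit feature

# Variant 2: squash merge of feature
git checkout -q -b squashed main
git merge -q --squash feature
git commit -q -m "squash of feature"

real_base=$(git merge-base real feature)
squash_base=$(git merge-base squashed feature)

echo "AAA:                         $base"
echo "tip of feature:              $feature_tip"
echo "merge base after real merge: $real_base"
echo "merge base after squash:     $squash_base"
```

After the real merge the merge base is the tip of feature, so a later merge only has to look at what came afterwards. After the squash it is still AAA, so a later merge starts over from there.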
Now, about there not being any conflicts: git will not produce a conflict if exactly the same change is coming from both of the branches being merged. If you squashed and then try to merge the real branch (without additional changes), then to git the same thing is coming from both branches, so it's OK. There are scenarios (like cherry-picking) where git complains that the operation introduces no real change, and then you need to decide what to do (skip it, or create an empty commit). This is a scenario where I'd like to see whether git complains or lets the merge go through just like that.
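That last point can be tried out directly. The sketch below assumes a minimal made-up history with one commit on other that has already been squash-merged into main, then attempts a cherry-pick of that commit followed by a normal merge of the branch, recording whether git accepts each operation.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"   # placeholder identity
git config user.email "example@example.com"

echo a > file.txt && git add file.txt && git commit -q -m base
git checkout -q -b other
echo b >> file.txt && git commit -q -am "change on other"

git checkout -q main
git merge -q --squash other
git commit -q -m "squash of other"

# Cherry-pick a change that is already there: the result would be an empty commit
if git cherry-pick other > /dev/null 2>&1; then
  cherry_pick=accepted
else
  cherry_pick=refused
  git cherry-pick --abort   # clean up the stopped cherry-pick
fi

# Normal merge of the same branch
if git merge -q --no-edit other; then
  merge=accepted
else
  merge=refused
fi

echo "cherry-pick: $cherry_pick"
echo "merge:       $merge"
```

With default settings the cherry-pick stops because it would produce an empty commit, while the merge goes through and only records the second parent.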