The merge operation doesn't really affect any branch, in one fundamental sense. (It does of course make a new commit, which affects the current branch in the usual way.) The trick with Git is to keep the following five ideas in your head simultaneously:
1. What matters in Git are commits, and their parent links. Branch names are mostly just distractions (but see points 2 and 3).
2. A branch name is just the name for a particular commit, which we call the tip commit of that branch.
3. When making a new commit, Git writes the new commit with the current commit as its parent.[1] If the new commit has multiple parents (see next point), the current commit becomes its first parent. In any case Git then updates the branch name to point to the new commit. This is how branches "grow".
4. A merge commit is a commit with two (or more) parent commits. This is "merge as a noun", as it were.
5. The act of making a merge (by which I mean a merge commit) involves doing the three-way-merge action, then making a new commit as usual, except that the new commit has two (or more) parents. The "extra" parents are the merged-in commit(s).[2]
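The five points can be sketched as a toy model. This is only an illustration under simplifying assumptions: the `Commit` and `Repo` classes, and the use of small integers in place of hash IDs, are made up here and are not how Git stores anything.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # stand-in for real hash IDs (an assumption of this sketch)

@dataclass(frozen=True)
class Commit:
    ident: int
    parents: tuple  # parent Commit objects, first parent first

class Repo:
    def __init__(self):
        root = Commit(next(_ids), ())
        self.branches = {"branch1": root}  # branch name -> tip commit
        self.current = "branch1"           # the branch we are "on"

    def head(self):
        return self.branches[self.current]

    def commit(self, extra_parents=()):
        # The current commit always becomes the first parent.
        new = Commit(next(_ids), (self.head(),) + tuple(extra_parents))
        # Updating the name is what makes the branch "grow".
        self.branches[self.current] = new
        return new

repo = Repo()
repo.branches["branch2"] = repo.head()  # a second name for the same commit
repo.current = "branch2"
theirs = repo.commit()                  # ordinary commit on branch2
repo.current = "branch1"
ours = repo.commit()                    # ordinary commit on branch1
merge = repo.commit(extra_parents=[theirs])  # "merge as a noun": two parents
```

Only `branch1` moves when the merge commit is made; `branch2` still names the commit it named before.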
The merge action—"merge as a verb"—uses the history built up through the five points above. Git finds three commits:
- The current commit, aka `HEAD`. (This is easy.)
- The commit(s) to be merged: whatever ID(s) `git rev-parse` comes up with for the argument(s) you pass to `git merge`. A branch name just finds the branch-tip commit.
- The merge base. This is where the commit history comes in, and this is why you need to draw graph fragments.
The merge base of any two commits is loosely defined as "the (first) point where the graph comes back together":
    ...--o--*--o--o--o   <-- branch1
             \
              o--o--o--o   <-- branch2
The name `branch1` points to the tip (rightmost) commit on the top line. The name `branch2` points to the tip commit on the bottom line. The merge base of these two commits is the one marked `*`.[3]
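As a rough sketch of "the point where the graph comes back together", the merge base can be computed by intersecting the two ancestor sets and keeping only the common ancestors that no other common ancestor descends from. The single-letter commit names and the `parents` table are hypothetical, and this brute-force approach is not how Git itself does the search.

```python
# Hypothetical graph matching the drawing: commit name -> list of parent names.
parents = {
    "A": [],
    "B": ["A"],                                       # the commit marked *
    "C": ["B"], "D": ["C"], "E": ["D"],               # E is the tip of branch1
    "F": ["B"], "G": ["F"], "H": ["G"], "I": ["H"],   # I is the tip of branch2
}

def ancestors(commit):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def merge_bases(a, b):
    """Common ancestors that are not ancestors of any other common ancestor."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(c in ancestors(other) for other in common if other != c)}

print(merge_bases("E", "I"))  # {'B'}
```

The function returns a set because, as the footnote says, complex graphs can have more than one merge base.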
To perform the merge action, Git then diffs (as in `git diff`) the merge base commit `*` against the two tips, giving two diffs. Git then combines the diffs, taking just one copy of each change: if both you (on `branch1`) and they (on `branch2`) changed the word `color` to `colour` in `README`, Git just takes the change once.
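The "combine two diffs, take each change once" idea can be sketched line by line. This is a deliberately crude toy: it assumes all three versions have the same number of lines (no insertions or deletions), and the function name `three_way_merge` and the sample `README` contents are invented for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Merge three equal-length lists of lines; raise on a conflicting line."""
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree: same change, or no change at all
            merged.append(o)
        elif o == b:      # only they changed this line
            merged.append(t)
        elif t == b:      # only we changed this line
            merged.append(o)
        else:             # both changed it, differently
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return merged

# Hypothetical README contents at the merge base and at the two tips.
base   = ["The color is red.",  "Version 1"]
ours   = ["The colour is red.", "Version 1"]   # we fixed the spelling
theirs = ["The colour is red.", "Version 2"]   # they fixed it too, and bumped the version

merged = three_way_merge(base, ours, theirs)
```

Swapping `ours` and `theirs` gives the same merged lines, which is why the direction of the merge does not matter at this stage.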
The resulting source, as stored in the work-tree, becomes the tree for the merge commit. Note that up until this point, it does not matter whether we are merging `branch2` into `branch1`, or `branch1` into `branch2`: we will get the same merge base, and have the same two tip commits, and hence get the same two diffs and combine those two diffs in the same way to arrive at the same work-tree. But now we make the actual merge commit, with its two parents, and now it matters which branch we're on. If we are on `branch1`, we make the new commit on `branch1`, and advance the name `branch1`:
    ...--o--o--o--o--o---M   <-- branch1
             \          /
              o--o--o--o   <-- branch2
The new merge commit has two parents: one is the old tip of `branch1` and the other is the tip of `branch2`.
Because we now have a new graph, a later `git merge` will find a new merge base. Let's say that we make several more commits on both branches:
    ...--o--o--o--o--o---M--o--o   <-- branch1
             \          /
              o--o--o--*---o--o--o   <-- branch2
If we now ask to merge the two branches, Git first finds the base. That's a commit that's on both branches, nearest to the two tips. I've identified this commit as `*` again, and look where it is: it's the commit that used to be the tip of `branch2`, back when we did the merge.
Note that this is still the case regardless of which way we do the merge.
It is, however, critical that we make an actual merge commit. If we use `git merge --squash`, which does not make a merge commit, we will not get this kind of graph. It's also important that neither branch gets "rebased" after merging, since `git rebase` works by copying commits, and `git merge` works on the basis of commit identities and following parent pointers. Any copies are different commits, so any old commits will not point into the new copied commits. (It's OK to rebase commits after the merge point, that is, to the right in these drawings; what's not OK is copying commits that are to the left.)
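To see why the second parent matters, here is a small sketch comparing the merge base found after a real merge commit with the one found after a squash. The commit letters and the `history` helper are hypothetical; the only difference between the two graphs is whether `M` records the old tip of `branch2` as a parent.

```python
def ancestors(parents, commit):
    """Return the commit plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def merge_bases(parents, a, b):
    """Common ancestors that are not ancestors of any other common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c in ancestors(parents, o) for o in common if o != c)}

def history(merge_parents):
    """Hypothetical history: the branches fork at B; M is made on branch1."""
    return {
        "A": [], "B": ["A"],
        "C": ["B"], "D": ["C"], "E": ["D"],               # branch1 before M
        "F": ["B"], "G": ["F"], "H": ["G"], "I": ["H"],   # branch2 before M
        "M": merge_parents,
        "N": ["M"], "O": ["N"],                           # later work on branch1
        "J": ["I"], "K": ["J"], "L": ["K"],               # later work on branch2
    }

real = history(["E", "I"])   # true merge commit: two parents
squash = history(["E"])      # squash result: an ordinary one-parent commit

base_after_merge = merge_bases(real, "O", "L")     # old tip of branch2
base_after_squash = merge_bases(squash, "O", "L")  # still the original fork point
```

With the real merge the next base is `I`, the old tip of `branch2`; with the squash it stays at `B`, so a later merge has to reconsider everything since the original fork.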
If you do not prohibit `git merge` from doing a "fast forward" operation, it's also possible for `git merge` to skip making a merge commit, and instead just move the branch label. In this case the two branch labels (the one you just moved, and the one you asked to merge) wind up pointing to the same commit. Once this happens, there's no way to "untangle" the two branches except by moving the label back. To prevent `git merge` from doing this fast-forward instead of actually merging, use `--no-ff`.
Here is an example of a fast-forward "merge" (in quotes because there is no actual merge). We start with two branches, but this time they have not truly diverged: there are no commits on the current branch, `branch1`, that are not already also on the other branch, `branch2`:
    ...--o--*   <-- branch1
             \
              o--o--o   <-- branch2
If, while sitting on `branch1`, we run `git merge branch2` (note the lack of `--no-ff`), Git notices that no actual merging is required. Instead, it does a label "fast forward" operation, sliding the name `branch1` forward until it meets the tip commit on `branch2`:
    ...--o--o
             \
              o--o--o   <-- branch1, branch2
This graph has nowhere to record any "separateness" between the two branches, so we might as well straighten out the kink:
    ...--o--o--o--o--o   <-- branch1, branch2
until we make new commits on `branch2`:
    ...--o--o--o--o--*   <-- branch1
                      \
                       o   <-- branch2
There's nothing wrong with this, but note how it is now impossible to tell that the three commits we "moved up" to the first row were merged.
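The fast-forward decision can be sketched as an ancestry test: if the current tip is already an ancestor of the other tip, the label can simply slide forward. The `merge` function, its `no_ff` parameter, and the commit letters below are illustrative stand-ins for `git merge` and `--no-ff`, not Git's code.

```python
# Hypothetical graph: branch1 sits at B, branch2 is three commits ahead at E.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}
branches = {"branch1": "B", "branch2": "E"}

def is_ancestor(a, b):
    """True if commit a is b itself or reachable from b via parent links."""
    stack, seen = [b], set()
    while stack:
        c = stack.pop()
        if c == a:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return False

def merge(current, other, no_ff=False):
    ours, theirs = branches[current], branches[other]
    if is_ancestor(theirs, ours):
        return "already up to date"      # nothing new to bring in
    if is_ancestor(ours, theirs) and not no_ff:
        branches[current] = theirs       # just slide the label forward
        return "fast-forward"
    new = f"M({ours},{theirs})"          # made-up name for the merge commit
    parents[new] = [ours, theirs]        # current tip is the first parent
    branches[current] = new
    return "merge commit"

result_ff = merge("branch1", "branch2")
tips_after_ff = dict(branches)           # both names now point to E

branches["branch1"] = "B"                # move the label back and try again
result_no_ff = merge("branch1", "branch2", no_ff=True)
```

After the fast-forward the graph holds no trace of the operation, while the `no_ff=True` run leaves a two-parent commit that records which commits were merged.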
[1] This is true for regular `git commit` and for `git merge`, but not for `git commit --amend`. The "amend" variant of a commit makes a new commit as usual, but instead of setting the current commit as the new commit's parent, it sets the current commit's parents (as many of them as there are, which may even be no parents at all) as the new commit's parents. The effect is to shove the current commit aside, making it seem as though the commit has changed, when in fact the old commit is still in the repository.
[2] The more-than-two-parents case is called an "octopus merge" and we can ignore it here. (It does nothing you cannot do with repeated pairwise merges.)
[3] In complex graphs there may be more than one such "first point". In this case, all lowest-common-ancestor nodes are merge bases, and for Git, the `-s strategy` (merge strategy) argument decides how to handle this case. (Of course, there is also the `-s ours` strategy, which ignores all the other commits, and simply bypasses the three-way merge code entirely. But I'm assuming normal merge here.)