How to list branches that contain an equivalent commit


In a prior question someone provided an answer for finding branches that contained an EXACT commit:

How to list branches that contain a given commit

The accepted answer pointed out that this only works for an EXACT commit ID, and not for an equivalent commit (for example, a cherry-picked copy). It further stated that git cherry can be used to solve this.

git cherry SEEMS to be geared toward the reverse: finding commits that have NOT been applied upstream. That is useless if I don't know which branch the commit was created on or what is upstream of what, so I don't see how it is going to help solve this problem.

Can someone explain / provide an example of how to use git cherry to find all branches that contain the 'equivalent' of a specific commit?

4

There are 4 answers

1
Andrew C On BEST ANSWER

Before you can answer the question of which branches contain an equivalent commit you have to determine "which commits are equivalent". Once you have that, you simply use git branch --contains on each of the commits.

Unfortunately, there is no 100% reliable way to determine equivalent commits.

The most reliable method is to check the patch ID of the changeset introduced by the commit. This is what git cherry, git log --cherry, and git log --cherry-mark rely on; internally they use the same patch ID computation that git patch-id exposes. A patch ID is essentially a SHA-1 hash of the normalized diff. Any commit that introduces identical changes will have the same patch ID, as will any commit whose changes differ only in whitespace or in the line numbers where they apply in the file.

If two commits have the same patch ID, it is almost guaranteed that they are equivalent: you will virtually never get a false positive via the patch ID. False negatives occur frequently, though. Any time you git cherry-pick and have to resolve merge conflicts manually, you have probably introduced differences in the changeset, and even a one-character change produces a different patch ID.
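To make the patch ID comparison concrete, here is a minimal sketch. The helper names patch_id_of and same_patch are made up for illustration; the only Git commands used are git show and git patch-id.

```shell
# Print the patch ID of a single commit (hypothetical helper name).
patch_id_of() {
    git show "$1" | git patch-id | awk '{print $1}'
}

# Succeed if both commits produce the same, non-empty patch ID.
same_patch() {
    local a b
    a=$(patch_id_of "$1")
    b=$(patch_id_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Something like `same_patch ORIG_COMMIT OTHER_COMMIT && echo equivalent` should then report a clean cherry-pick as equivalent even though the two commit IDs differ.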

Checking patch IDs requires scripting, as Chronial advises. First, calculate the patch ID of the original commit with something like

(note - scripts not tested, should be reasonably close to working though)

origCommitPatchId=$(git diff ORIG_COMMIT^! | git patch-id | awk '{print $1}')

Now you are going to have to search through all the other commits in your history and calculate the Patch IDs for them, and see if any of them are the same.

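# Walk every non-merge commit reachable from any ref.
for rev in $(git rev-list --all --no-merges)
do
   # Root commits have no parent, so the diff fails and yields an empty patch ID.
   testPatchId=$(git diff "${rev}^1" "${rev}" 2>/dev/null | git patch-id | awk '{print $1}')
   if [ -n "${testPatchId}" ] && [ "${origCommitPatchId}" = "${testPatchId}" ]; then
      echo "${rev}"
   fi
done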

Now you have the list of SHAs, and you can pass those to git branch -a --contains
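As a rough sketch of that last step, the function below reads SHAs on standard input and prints each branch that contains at least one of them. The name branches_containing_any is made up for illustration, and the --format option of git branch assumes a reasonably recent Git (2.13 or later, as far as I know).

```shell
# Read commit SHAs on stdin, one per line, and print every local or
# remote-tracking branch that contains at least one of them, without duplicates.
branches_containing_any() {
    local rev
    while IFS= read -r rev; do
        [ -n "$rev" ] || continue
        git branch -a --contains "$rev" --format='%(refname:short)'
    done | sort -u
}
```

The output of the search loop can be piped straight into it, for example by saving the loop in a script and running `./find-equivalents.sh | branches_containing_any` (the script name is only a placeholder).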

What if the above doesn't work for you though, because of merge conflicts?

Well, there are a few other things you can try. Typically when you cherry-pick a commit the original author name, email, and date fields in the commit are preserved. So you will get a new commit, but the authorship information will be identical.

So you could get this info from your original commit with

git log -1 --pretty="%an %ae %ad" ORIG_COMMIT

Then as before you would have to go through every commit in your history, print that same information out and compare. That might give you some additional matches.
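One possible way to script that comparison is sketched below (untested against large histories). The function name commits_with_same_author_info is made up; it uses the same %an %ae %ad fields as above and the default date format on both sides, so the strings compare equal only when name, email, and author date all match.

```shell
# Print every commit reachable from any ref whose author name, email, and
# author date match those of the given commit (hypothetical helper name).
commits_with_same_author_info() {
    local key line
    key=$(git log -1 --format='%an %ae %ad' "$1") || return 1
    git log --all --format='%H %an %ae %ad' |
        while IFS= read -r line; do
            # Everything after the first space is the author information.
            if [ "${line#* }" = "$key" ]; then
                printf '%s\n' "${line%% *}"
            fi
        done
}
```

The original commit itself is always part of the output. Expect false positives from scripted or rebased commits that share an author and timestamp.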

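You could also use git log --grep=ORIG_COMMIT, which would find any commits that reference ORIG_COMMIT in their commit message (for example, cherry-picks made with git cherry-pick -x, which records the original commit ID in the message).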
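A small sketch of the message search follows. It only finds copies whose message actually contains the full commit ID, which is what git cherry-pick -x appends as "(cherry picked from commit ...)". The function name commits_mentioning is made up for illustration.

```shell
# Print commits reachable from any ref whose message mentions the full ID
# of the given commit (hypothetical helper name).
commits_mentioning() {
    local full
    # Expand abbreviations, branch names, etc. to the full commit ID first.
    full=$(git rev-parse --verify "$1^{commit}") || return 1
    git log --all --format='%H' --grep="$full"
}
```

Cherry-picks made without -x leave no such trace, so an empty result does not prove the commit was never copied.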

If none of those work, you could use the pickaxe (git log -S) to look for a particular line that the commit introduced, or use git log --grep to search for something else that might have been unique in the commit message.
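For the pickaxe, a minimal sketch might look like this. The function name commits_touching_string is made up; -S lists commits whose diff changes the number of occurrences of the given string, so it catches both the original commit and any copy that adds the same text.

```shell
# Print commits reachable from any ref that add or remove the given string
# (hypothetical helper name; the string is matched literally, not as a regex).
commits_touching_string() {
    git log --all --format='%H' -S"$1"
}
```

Choose a string that is as specific to the change as possible, since every commit that adds or deletes it anywhere in the tree will be listed.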

If this all sounds complicated, well, it is. That's why I tell people to avoid using cherry-pick whenever possible. git branch --contains is incredibly valuable and easy to use and 100% reliable. None of the other solutions even come close.

1
Christoph Zauner On

Command

Use the following Bash command (replace <COMMIT HASH> with the commit hash you are searching for):

PATCH_ID=$(git show <COMMIT HASH> | git patch-id | cut -d' ' -f1) \
&& ALL_MATCHING_COMMIT_HASHES=$(git log --all -p | git patch-id | grep $PATCH_ID | cut -d' ' -f2) \
&& for HASH in $ALL_MATCHING_COMMIT_HASHES; do echo "$(git branch -a --contains $HASH) (commit $HASH)"; done 

Example output

user@host test_cherry_picking $ PATCH_ID=$(git show 59faabb91cfc8e449737f93be8c7df3825491674 | git patch-id | cut -d' ' -f1) \
&& ALL_MATCHING_COMMIT_HASHES=$(git log --all -p | git patch-id | grep $PATCH_ID | cut -d' ' -f2) \
&& for HASH in $ALL_MATCHING_COMMIT_HASHES; do echo "$(git branch -a --contains $HASH) (commit $HASH)"; done

* hotfix (commit 59faabb91cfc8e449737f93be8c7df3825491674)
master (commit bb5fa0d16931fa1d5fa9f5e9ee5c27634fad7da8)

user@host test_cherry_picking $

Description

Calculates the PATCH ID for a given GIT REVISION PARAMETER (e.g. the hash of a commit). Then finds all commits with the calculated PATCH ID. Finally all branch names which contain these commits are printed to the console.

This of course only works as long as the PATCH ID is the same for all (cherry-picked) commits. Any time you cherry-pick and have to manually resolve merge-conflicts you probably introduce differences in the changeset. This will lead to different PATCH IDs.

2
John Mellor On

The following seems to work (but hasn't been tested much). It runs git cherry for each local git branch, and prints the branch name if git cherry doesn't list the commit as missing from the branch.

# USAGE: git-cherry-contains <commit> [refs]
# Prints each local branch containing an equivalent commit.
git-cherry-contains() {
    local sha; sha=$(git rev-parse --verify "$1") || return 1
    local refs; refs=${2:-refs/heads/}
    local branch
    while IFS= read -r branch; do
        if ! git cherry "$branch" "$sha" "$sha^" | grep -qE "^\+ $sha"; then
            echo "$branch"
        fi
    done < <(git for-each-ref --format='%(refname:short)' $refs)
}

See Andrew C's post for a great explanation of how git cherry actually works (using git patch-id).

0
solstice333 On
$ for i in $(git rev-list --all --grep="something unique in the commit message"); do git branch --all --contains "$i"; done | sort -u