Find the first or last commit a patch applies to

Suppose a patch was created from a specific commit in the past and no longer applies to HEAD.

How can I find the first or, better, the last commit in the history of HEAD where this patch applies with "git apply"? Maybe something with git bisect? But which command will tell me whether a patch applies?

Ideally I want to move back to that commit, apply the patch, then rebase on or merge with the original HEAD, and diff again to create a new patch, unless there was a conflict. After this I would like to go back to the original HEAD, so I can continue with more patches.

Background: There are a number of patches that need to be rerolled (and yes, there are ecosystems where patches are still a thing).

There are 2 answers

6
AudioBubble On

This answer assumes that the patches were created with git diff rather than git format-patch, and that the pager for git log is less.

Here is an example of a patch created from git diff <sha1> <sha2>:

diff --git a/osx/.bash_profile b/osx/.bash_profile
index c7b41df..fb80367 100644
--- a/osx/.bash_profile
+++ b/osx/.bash_profile
@@ -3,6 +3,10 @@
 # Setup PATH for Homebrew packages
 export PATH=/usr/local/bin:$PATH

+# Setup Scala variables
+export SCALA_HOME=/usr/local/Frameworks/scala # Symlinked directory
+export PATH=$PATH:$SCALA_HOME/bin
+
 # Initialize rbenv,
 # https://github.com/sstephenson/rbenv#homebrew-on-mac-os-x
 eval "$(rbenv init -)"

Take this line:

+export SCALA_HOME=/usr/local/Frameworks/scala # Symlinked directory

and search for it in git log --patch or git log -p. Type / when in less, then enter the regex you want to search for:

/\+export SCALA_HOME=/usr/local/Frameworks/scala # Symlinked directory

The + is escaped with \ here, because it's a special character in regexes. Hit enter to find the first match, and n to bring up the next match, or N to go to the previous match.

This will help you find commits that are possible candidates for where the patch came from. You can also use the spacebar in less to page down, and b to page up.
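As a non-interactive alternative to paging through less, git's "pickaxe" option (git log -S) can do a similar search. This is a different technique from the one described above, shown only as a minimal sketch: the function name find_commits_touching is hypothetical, and -S searches file content rather than diff text, so the leading + of the patch line is dropped and nothing needs escaping.

```shell
#!/bin/sh
# find_commits_touching STRING [git-log args...]
# Hypothetical helper: list commits whose diff changes how many times STRING
# occurs in a file (git's "pickaxe" search). STRING is matched literally.
find_commits_touching() {
    needle=$1
    shift
    # Any remaining arguments (a branch name, "-- path", ...) go to git log
    git log -S"$needle" --format='%h %s' "$@"
}
```

For the example patch this could be called as find_commits_touching 'export SCALA_HOME=/usr/local/Frameworks/scala' -- osx/.bash_profile, which should list the commits that added or removed that line.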

1
JustinB On

The way git wants you to do this

git apply --3way should locate the base version of each file using the blob hashes recorded in the patch and merge forward, all in one step, assuming those blobs exist somewhere in your repository and you can deal with any merge conflicts. That's possibly an easier solution for many people.
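A minimal sketch of how that might look in practice, assuming the patch was produced by git diff, the command is run from the top level of the repository, and the working tree is clean. The helper name apply_or_merge is hypothetical; git apply --check only tests whether the patch applies and changes nothing.

```shell
#!/bin/sh
# apply_or_merge PATCH
# Hypothetical helper: try a plain apply first, and fall back to a three-way
# merge only when the patch does not apply as-is.
apply_or_merge() {
    patch_file=$1
    if git apply --check "$patch_file" 2>/dev/null; then
        git apply "$patch_file"
    else
        # Needs the blobs named on the patch's "index" lines to exist locally.
        # On conflict, git leaves conflict markers in the files and exits non-zero.
        git apply --3way "$patch_file"
    fi
}
```

Note that the --3way path also updates the index, while the plain apply only touches the working tree.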

The way to do what you asked for

If you still really want to know a historical commit that contains the base files a diff came from, my script below extends one of the existing solutions for locating commits that contain a single blob hash, so that it tries to find commits containing the whole group of blob hashes pulled from a patch file.

#!/bin/sh
# git-find-patch-base takes a patch produced by "git diff" and tries to locate commit(s)
# containing all source blobs

# The first parameter is the name of the patch file to examine
patch_file="$1"
# Any remaining parameters are passed as a group to the git log command using $@ below
shift

# Make a temporary file and capture a list of all the starting
# file blob hashes that the patch used in it. Note: Adding a file shows
# an all-zero starting hash (abbreviated to the same length as the other
# hashes, often 7 digits), so we filter that one out...
tmp_blob_file=$(mktemp)
echo "Examining patch file \"$patch_file\"..." 1>&2
sed -n 's/^index \([0-9a-f]*\)\.\..*/\1/p' "$patch_file" | grep -v '^0*$' | sort -u > "$tmp_blob_file"

# Count how many unique blob hashes we identified
blobcount=$(cat "$tmp_blob_file" | wc -l)
echo "Found $blobcount unique blob hashes in patch..." 1>&2

# Use git log to get a list of commits to check against. Then, for
# each of those commits, count how many of the distinct blob hashes
# that we wanted appear in its tree, and output the commit hash if it's
# at least the ideal blob count. tformat: (rather than format:) ends the
# last line with a newline, so "read" also sees the oldest commit.
# Note: this is an imperfect searching method, since there is a chance
# for hash collision, exacerbated since the grep is not forcing the
# short hashes to only match the beginning of the long hashes.
echo "Searching log/tree history of git..." 1>&2
git log "$@" --pretty=tformat:'%T %h %s' \
| while read -r tree commit subject ; do
    if test $(git ls-tree -r "$tree" | grep -o -f "$tmp_blob_file" | sort -u | wc -l) -ge "$blobcount" ; then
        echo "$commit" "$subject"
        break
    fi
done

# Clean up the temporary file we made...
rm "$tmp_blob_file"

The first parameter is the name of the patch file to analyze, and any remaining parameters are passed to git log to help expand or restrict the list of commits to check against. If you want the first match on a specific branch (git log lists the newest commits first, so this is the most recent matching commit), you can run git-find-patch-base foo.patch branchname. If you're completely lost as to where something is from, you can run git-find-patch-base foo.patch --all and go get some coffee while it does its thing. There are a lot of useful limiters on git log, like --grep or --author, that can speed up this process.

The script as shown stops on the first match with the break out of the while loop. You can remove that and it will exhaustively search all the way back, spitting out all candidate commits.
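The blob-hash search finds commits that contain the base files, which is not quite the same as asking whether the patch applies. For that more literal question, here is a rough sketch that walks the history newest-first and tests the patch against each commit's tree with git apply --cached --check, using a scratch index so that the working tree and the real index are never touched. The function name find_last_applicable is hypothetical, and the sketch assumes it is run from the top level of the repository. A linear walk is used rather than git bisect because a patch can stop applying and start applying again several times along the history.

```shell
#!/bin/sh
# find_last_applicable PATCH [REV]
# Hypothetical helper: print the most recent commit reachable from REV
# (default HEAD) whose tree the patch applies to cleanly.
find_last_applicable() {
    patch_file=$1
    start=${2:-HEAD}
    # Use a path inside a fresh directory: git rejects an empty, pre-existing index file
    tmp_dir=$(mktemp -d)
    tmp_index=$tmp_dir/index
    status=1
    for commit in $(git rev-list "$start"); do
        # Load the commit's tree into the scratch index, then check the patch against it
        if GIT_INDEX_FILE=$tmp_index git read-tree "$commit" &&
           GIT_INDEX_FILE=$tmp_index git apply --cached --check "$patch_file" 2>/dev/null
        then
            echo "$commit"
            status=0
            break
        fi
    done
    rm -rf "$tmp_dir"
    return $status
}
```

From there, something like git checkout on the printed commit followed by git apply would be the starting point for rerolling the patch. Removing the break lists every commit where the patch applies, so the last line printed would be the oldest one.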