Python: Finding a mapping of all commits to their diffs for a given git repository

I am creating a tool for analysing a Git repository, but have stumbled on something that should (seemingly) be quite simple.

I want to create a mapping of commits to diffs (i.e. actual blob changes line by line for the given commit); I have tried using GitPython but haven't had any success. I would like to achieve something like this:

from git import Repo

def get_all_commits(REPO_URL):
    chromium_repo = Repo(REPO_URL)
    commits = list(chromium_repo.iter_commits())
    commit_diffs = {}
    for commit in commits:
        diff = ...  # placeholder: get all blob changes for commit
        commit_diffs[commit.hexsha] = diff
    return commit_diffs

but am not sure how to get all blob changes for a given commit. commit_diffs would be in the form:

{ '232d8f39bedc0fb64d15eed4f46d6202c75066b6': '<String detailing all blob changes for given commit>' }

Any help would be great.

There is 1 answer

Fraser Price

I was unaware of the git diff <commit_a> <commit_b> command. The following (I think!) solves the issue:

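from git import Repo  # GitPython

def get_all_commits(REPO_URL):
    # Repo() expects a path to a local clone
    repo = Repo(REPO_URL)
    commits = list(repo.iter_commits())  # newest first
    commit_diffs = {}
    for index, commit in enumerate(commits):
        next_index = index + 1
        # The last (oldest) commit has no successor in the list, so it is skipped
        if next_index < len(commits):
            # Diff from the next-older commit in the list to this one
            commit_diffs[commit.hexsha] = repo.git.diff(commits[next_index], commit)
    return commit_diffs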