How does git create a file blame?

I recently learned about git blame and what it does. I want to know how git finds when each line was changed in a file, even across file renames. In other words, I want to know how the blame algorithm works.

Answer by Obsidian (best answer)

First of all, the blame feature exists in almost all other SCMs too, including CVS, so the exact algorithm varies with the tool you're using.

Basically, however, the simplest way to achieve this is to start from the most recent state of your file, then walk the history backwards (toward the past), applying the reverse of each changeset.

Every row affected by the latest changeset is marked as belonging to the last commit (n); all other rows are provisionally assigned to the previous one (n-1). You also keep a count of these provisionally assigned rows. Then you repeat the process with commits n-1 and n-2. Rows that are not currently assigned to n-1 are ignored, because they were altered by some more recent commit (the reverse changeset is still applied, but their commit number is not updated). For the remaining rows you apply the same computation: rows touched by n-1 stay attributed to it, and the untouched ones are reassigned to n-2.

You then iterate this all the way down to the initial commit if needed. If the row count mentioned above (rows not yet attributed to a definite commit) reaches zero, you can stop early: every row has been altered since that point in history, so there is no need to go any further.
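
To make the idea concrete, here is a minimal Python sketch of that backward walk. It is only an illustration of the simple approach described above, not Git's actual implementation. It assumes the history is linear and is given as a list of file snapshots, oldest first, and it ignores renames entirely. The function name `blame` and the `pending` bookkeeping are made up for this example. The line matching comes from the standard `difflib` module.

```python
from difflib import SequenceMatcher


def blame(revisions):
    """Return, for each line of the newest revision, the index of the
    revision that introduced it. `revisions` is oldest-first; each entry
    is the file content as a list of lines (hypothetical input format)."""
    newest = len(revisions) - 1
    result = [None] * len(revisions[newest])
    # pending maps a line's position in the revision being examined
    # to its position in the newest revision.
    pending = {i: i for i in range(len(result))}

    for k in range(newest, 0, -1):
        if not pending:
            break  # every line is attributed; no need to go further back
        older, newer = revisions[k - 1], revisions[k]
        matcher = SequenceMatcher(a=older, b=newer, autojunk=False)
        carried = {}
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                pos = block.b + offset
                if pos in pending:
                    # Unchanged line: keep following it in the older revision.
                    carried[block.a + offset] = pending.pop(pos)
        # Whatever is left in pending has no counterpart in the older
        # revision, so revision k introduced it.
        for final_pos in pending.values():
            result[final_pos] = k
        pending = carried

    # Lines that survived all the way back belong to the first revision.
    for final_pos in pending.values():
        result[final_pos] = 0
    return result


history = [
    ["a", "b", "c"],       # revision 0
    ["a", "B", "c"],       # revision 1 changes the second line
    ["a", "B", "c", "d"],  # revision 2 appends a line
]
print(blame(history))
```

The `pending` dictionary plays the role of the row count in the answer: once it is empty, every line has been attributed and the loop stops without visiting older revisions.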