Access git log data using ruby rugged gem?

2.1k views Asked by At

For a given file in a git repo, I'd like to look up the SHA of the last commit in which the file was modified, along with the timestamp.

At the command line, this data is visible with git log for a particular file path, e.g.

git log -n 1 path/to/file

Using the "git" gem for ruby I can also do this:

require 'git'
g = Git.open("/path/to/repo")
modified = g.log(1).object(relative/path/to/file).first.date
sha = g.log(1).object(relative/path/to/file).first.sha

Which is great, but is running too slowly for me when looping through lots of paths. As Rugged uses C libraries instead, I was hoping it would be faster but cannot see how to construct the right query in the rugged syntax. Any suggestions?

1

There are 1 answers

4
Arthur Schreiber On

This should work:

repo = Rugged::Repository.new("/path/to/repo")
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE)
walker.push(repo.head.target)
commit = walker.find do |commit|
  commit.parents.size == 1 && commit.diff(paths: ["relative/path/to/file"]).size > 0
end
sha = commit.oid

Taken and adapted from https://github.com/libgit2/pygit2/issues/200#issuecomment-15899713

As an aside: Just because rugged is written in C does not mean that costly operations suddenly become cheap and quick. Obviously, you save a lot of string parsing and stuff like that, but this is not always the bottleneck.

As you're not interested in the actual textual diff here, the libgit2 GIT_DIFF_FORCE_BINARY might be something that could also help in increasing the performance of this lookup - unfortunately this is not yet available in Rugged (but will be, soon).

Testing this with the Rugged repo itself, it works correctly:

repo = Rugged::Repository.new(".")
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE)
walker.push(repo.head.target)
commit = walker.find do |commit|
  commit.parents.size == 1 && commit.diff(paths: ["Gemfile"]).size > 0
end
sha = commit.oid # => "8f5c763377f5bf0fb88d196b7c45a7d715264ad4"

walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE)
walker.push(repo.head.target)
commit = walker.find do |commit|
  commit.parents.size == 1 && commit.diff(paths: [".travis.yml"]).size > 0
end
sha = commit.oid # => "4e18e05944daa2ba8d63a2c6b149900e3b93a88f"