Git analog to Hg's Bigfiles Extension?


I want something in git that is similar to Mercurial's Bigfiles Extension (note: I know of git-bigfiles, but that is unrelated).

Basically, I want to store large binaries in my git repository, but I don't want to download every historical version of each large binary when I clone. I only want to download the large binaries when I check out a specific revision that contains them.

Best answer, by Mark Longair:

Here are a few options to consider:

Shallow clones: You can pass --depth <depth> to git clone to get a shallow clone of the repository. For example, if <depth> is 1, the clone fetches only the objects needed for the most recent commit, not the earlier versions of your large files. However, such repositories have awkward restrictions on what you can do with them, as outlined in the git clone man page:

        --depth <depth>
           Create a shallow clone with a history truncated to the specified
           number of revisions. A shallow repository has a number of
           limitations (you cannot clone or fetch from it, nor push from nor
           into it), but is adequate if you are only interested in the recent
           history of a large project with a long history, and would want to
           send in fixes as patches.

In fact, as discussed in this thread, that's something of an overstatement: there are useful situations where pushing from a shallow clone will still work, and it's possible that will fit your workflow.
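As a rough illustration of the shallow-clone option, here is a minimal sketch that builds a throwaway repository with three versions of a binary file and then clones it with --depth 1. The scratch directory, the file name big.bin and the demo identity are all made up for the example; in practice you would point git clone at your real repository URL.

```shell
set -e
work=$(mktemp -d)   # scratch directory; every path below is made up for the demo

# Build a throwaway "upstream" repository holding three versions of a binary.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "demo@example.com"
git config user.name "Demo"
for i in 1 2 3; do
    head -c 1024 /dev/urandom > big.bin
    git add big.bin
    git commit -q -m "big.bin version $i"
done

# Clone only the newest commit. The file:// URL matters here:
# git ignores --depth when cloning from a plain local path.
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"

# Count the commits reachable from HEAD in each repository.
full_commits=$(git rev-list --count HEAD)
shallow_commits=$(git -C "$work/shallow" rev-list --count HEAD)
echo "upstream has $full_commits commits, shallow clone has $shallow_commits"
```

The shallow clone contains only the latest big.bin, so the two older versions are never transferred.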

Scott Chacon's "git media" extension: this is designed to store large media files outside the git repository itself. The author describes it in an answer to this similar question and in the README on GitHub: http://github.com/schacon/git-media

Shallow submodules: You could keep all your large files in a separate git repository and add that as a shallow submodule to your main repository. This has the advantage that the restrictions of shallow clones apply only to the repository with the large files, not to your code.
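The shallow-submodule idea might look something like the sketch below. It assumes two hypothetical local repositories, "assets" for the large files and "main" for the code, and uses the --depth option of git submodule add; the names, paths and demo identity are invented for illustration.

```shell
set -e
work=$(mktemp -d)   # scratch directory; repository names below are hypothetical

# Helper: create a repository with a local identity so commits succeed.
new_repo() {
    git init -q "$1"
    git -C "$1" config user.email "demo@example.com"
    git -C "$1" config user.name "Demo"
}

# "assets" holds the large binaries and accumulates history.
new_repo "$work/assets"
for i in 1 2 3; do
    head -c 1024 /dev/urandom > "$work/assets/big.bin"
    git -C "$work/assets" add big.bin
    git -C "$work/assets" commit -q -m "big.bin version $i"
done

# "main" holds the code and pulls in assets as a depth-1 submodule.
new_repo "$work/main"
cd "$work/main"
echo 'int main(void) { return 0; }' > main.c
git add main.c
git commit -q -m "Add code"

# protocol.file.allow=always is only needed because this demo uses a local
# file:// URL; recent git versions block that transport for submodules.
git -c protocol.file.allow=always submodule --quiet add --depth 1 \
    "file://$work/assets" assets
git commit -q -m "Add assets as a shallow submodule"

# Only the submodule is shallow; the main repository keeps its full history.
asset_commits=$(git -C assets rev-list --count HEAD)
echo "submodule history depth: $asset_commits"
```

Someone cloning the main repository later can probably get the same effect with git submodule update --init --depth 1, though whether that suits your setup depends on your git version and on how the assets repository is hosted.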

There are also any number of ways of doing this with git hooks that (for example) rsync your large files into place, but I assume that there are good reasons that you want to keep these files under git's control in the first place.

I hope that's of some help.