Remove all but selected files in all Git commits


I have several "interesting" files (ones I have touched) among all the other files in a Git history. I want to publish only these "interesting" files, with their history, as a Git repo, without any other files being present anywhere in the history of that repo.

How can I write a script for git filter-branch --index-filter that does this? (Or at least for git filter-branch --tree-filter, though that is undesirable: it is slower, and my saved trees are huge.)

Note that my question is a bit different from the most common similar one people ask: how to remove a specific ("sensitive") file from the Git history? I need the complement: remove everything else and keep the specific files.

1 answer

imz -- Ivan Zakharyaschev (best answer)

So, the tricky part in such a script for git filter-branch --index-filter is to get the list of files from the index, drop the files to keep from that list, and then remove everything left on the list from the index.

I have implemented this as a separate executable script git-update-index-keeping-only; here is the rough implementation:

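#!/bin/bash
# Keep-list arrives on stdin; "$@" goes to git rm; -r (GNU xargs) skips git rm on empty input.
FILELIST="$(cat)"
git ls-files --full-name \
| grep -F -v -x -f <(echo "$FILELIST") \
| xargs -r git rm --cached "$@" --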

Here I haven't thought much about what would happen to newlines and spaces in the file names. Spaces must be a problem for xargs, which by default splits its input on whitespace, unless it is told to invoke the command again for each input line, which I didn't do for efficiency.
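One way the whitespace problem might be sidestepped is to pass NUL-terminated paths between the tools. The sketch below isolates just the filtering step so it can be tried without a repository. It assumes GNU grep (for -z) and an xargs that understands -0 and -r. The names paths_to_remove and KEEP_LIST are hypothetical and not part of the script above. File names containing newlines still cannot be put on the keep-list, since that list holds one path per line.

```shell
#!/bin/bash
# Sketch only: NUL-delimited variant of the filtering step.
# Assumes GNU grep; paths_to_remove is a hypothetical helper name.

# Reads NUL-terminated paths on stdin and prints, NUL-terminated, those
# that are NOT listed in the keep-list file "$1" (one exact path per line).
paths_to_remove() {
    local status=0
    grep -z -F -v -x -f "$1" || status=$?
    # grep exits 0 if something was selected, 1 if nothing was;
    # only higher codes indicate a real error.
    [ "$status" -le 1 ]
}

# Inside an index filter it could be wired up like this (not run here;
# KEEP_LIST is a hypothetical variable holding the keep-list file name):
#   git ls-files -z --full-name \
#     | paths_to_remove "$KEEP_LIST" \
#     | xargs -0 -r git rm --cached -q --
```

With xargs -0 each path reaches git rm as a single argument, and the command is still invoked in batches rather than once per file.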

A sample usage is written down in another script useful for my use case: it takes as the interesting files those modified or added in the diff between two commits (say, an "upstream" commit and your last commit on top of that).

It's git-filter-only-files-modified-since; its essence is like this:

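# $SINCE is the "upstream" commit; "<commit>:" names that commit's root tree.
# MACRT = Modified, Added, Copied, Renamed, Type-changed.
FILES="$(git diff-tree -r --name-only --diff-filter=MACRT \
   "$SINCE": HEAD:)"
export FILES
# The exported $FILES is expanded by the shell in which filter-branch runs the filter.
git filter-branch \
   --index-filter 'echo "$FILES" | git-update-index-keeping-only -q'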
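To try the whole procedure without risking a real repository, here is a throwaway demo. It is a sketch under stated assumptions, not the actual git-filter-only-files-modified-since. The scratch repository and its file names (lib.c, big.bin, notes.txt) are made up. The --prune-empty option and the FILTER_BRANCH_SQUELCH_WARNING variable are additions not present in the commands above. GNU xargs is assumed for -r and -d. The filter is inlined instead of calling the helper script, and since filter-branch is itself a shell script that evals the filter, it avoids the bash-only <(...) and passes the list with grep -e.

```shell
#!/bin/bash
# Throwaway demo in a scratch repo; every file and directory name is made up.
set -euo pipefail
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and delay

cd "$(mktemp -d)"
git init -q .
git config user.name  "Demo User"
git config user.email "demo@example.com"

# "Upstream" commit: one file we will touch later, one we will not.
echo 'int x;'  > lib.c
echo 'payload' > big.bin
git add .
git commit -q -m 'upstream import'
SINCE=$(git rev-parse HEAD)

# Our own work on top: modify lib.c, add notes.txt.
echo 'int y;' >> lib.c
git commit -q -am 'tweak lib.c'
echo 'todo' > notes.txt
git add notes.txt
git commit -q -m 'add notes'

# Interesting files = modified/added between the upstream tree and HEAD's tree.
FILES="$(git diff-tree -r --name-only --diff-filter=MACRT "$SINCE": HEAD:)"
export FILES

# Inlined stand-in for the helper script, written in plain sh.
# grep treats the newline-separated $FILES as a list of fixed-string patterns.
git filter-branch --prune-empty --index-filter '
    git ls-files --full-name |
    grep -F -v -x -e "$FILES" |
    xargs -r -d "\n" git rm --cached -q --
' >/dev/null

# Inspect what survived at the tip and in the root commit.
kept_at_head=$(git ls-tree -r --name-only HEAD)
kept_at_root=$(git ls-tree -r --name-only "$(git rev-list --max-parents=0 HEAD)")
printf 'HEAD:\n%s\nroot commit:\n%s\n' "$kept_at_head" "$kept_at_root"
```

Note that filter-branch keeps the old history reachable under refs/original/, so the unwanted files are only gone from the rewritten branch until those backup refs are deleted or the result is pushed or cloned into a fresh repository.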