The scenario
I'm trying to remove some files from the entire history of a git repository. They all share a couple of common criteria:
- They have "settings" in the file name. They may also have various prefixes and suffixes, though.
- They will be two levels deep inside a certain directory in the file tree. The names of the second level directories vary. There are settings files deeper in the file tree that should not be removed.
Here, then, is an example of the file tree:
root-directory/
|-> apples/
|   |-> bad-settings-alpha.txt
|   |-> bad-settings-beta.txt
|
|-> oranges/
|   |-> bad-settings-gamma.txt
|   |-> bad-settings-delta.txt
|   |-> navels/
|       |-> good-settings.txt
|
|-> good-settings.txt
I need to filter out all the bad-settings files while keeping the good-settings files.
My approach
So, using a tutorial provided by GitHub, in conjunction with the manpage for git-rm, I crafted this command (split across two lines):
git filter-branch -f --index-filter 'git rm --dry-run --cached \
--ignore-unmatch root-directory/*/*settings*.txt' --prune-empty -- --all
Of particular note here is the file glob I used: root-directory/*/*settings*.txt. If I use that file glob with ls, then I get exactly the list of files I want to remove. So, it should work, right?
Apparently not. If I run my command with that glob, it also takes out all the settings files at levels deeper than two. In the file tree example above, that means that root-directory/oranges/navels/good-settings.txt would get nuked.
I've tried to solve this on my own, trying variations on the file glob and using the wonderful --dry-run option for git-rm. Nothing seemed to work: all I could figure out how to do was change the file tree depth at which I started removing settings files.
I did find one thing that seemed extremely relevant to my problem. In the manpage for git-rm, there is this example:
git rm Documentation/\*.txt
Removes all *.txt files from the index that are under the Documentation
directory and any of its subdirectories.
Note that the asterisk * is quoted from the shell in this example; this
lets git, and not the shell, expand the pathnames of files and
subdirectories under the Documentation/ directory.
"Removes all...files from the index that are under the...directory and any of its subdirectories" is consistent with what's actually happening. What's really interesting is the mention of the quoted asterisk. I understand that this lets git-rm handle the file glob expansion as opposed to bash. Okay. But that leaves these questions:
- Why would I want to do that?
- I'm not quoting my asterisks, so bash should be doing the expansion. If that's true, and my file glob works with ls, then why isn't it working with git-rm?
I've also seen the example directly beneath the one above, and it seems to do what I'm trying to do. And yet, that does not happen for me, or else I would not be here. It does seem to confirm that I want to do file expansion with bash, though.
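To see the three matching behaviours side by side, here is a minimal sketch that rebuilds the example tree in a throwaway repository and compares shell expansion, git's default pathspec matching, and the :(glob) pathspec magic. It uses git ls-files instead of git rm so nothing is removed. The temporary directory and the variable names are only illustrative assumptions.

```shell
#!/bin/sh
# Throwaway repository mirroring the example tree from the question.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
mkdir -p root-directory/apples root-directory/oranges/navels
for f in apples/bad-settings-alpha.txt apples/bad-settings-beta.txt \
         oranges/bad-settings-gamma.txt oranges/bad-settings-delta.txt \
         oranges/navels/good-settings.txt good-settings.txt; do
    echo x > "root-directory/$f"
done
git add .

# 1. Unquoted: the shell expands the glob, and * never crosses a "/".
shell_matches=$(ls root-directory/*/*settings*.txt)

# 2. Quoted: git receives the raw pattern as a pathspec, where * also matches "/".
git_matches=$(git ls-files 'root-directory/*/*settings*.txt')

# 3. Quoted with the :(glob) magic: git does the matching, but * stops at "/".
glob_matches=$(git ls-files ':(glob)root-directory/*/*settings*.txt')

printf 'shell:\n%s\n\ngit pathspec:\n%s\n\n:(glob) pathspec:\n%s\n' \
    "$shell_matches" "$git_matches" "$glob_matches"
```

On the example tree, the first and third listings should contain only the four bad-settings files, while the second also picks up root-directory/oranges/navels/good-settings.txt. That would be consistent with the behaviour described above if the pattern reaches git unexpanded. It sits inside the single-quoted --index-filter string, so the invoking shell does not expand it, and --index-filter does not check out the tree, so a later expansion may have nothing to match.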
Why not use find to show only the files that are two levels deep:
This will give you the list of bad-settings files from only the two-level-deep directories. You can then pipe them to git rm via xargs:
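The original commands are not preserved here, so the following is only a sketch of what such a find invocation might look like, not the answerer's exact command. The function name list_two_level_settings is hypothetical, and -mindepth/-maxdepth are assumed to be available (they are GNU and BSD find extensions).

```shell
#!/bin/sh
# Hypothetical helper: list regular files exactly two levels below the given
# directory whose names contain "settings".
list_two_level_settings() {
    find "$1" -mindepth 2 -maxdepth 2 -type f -name '*settings*'
}

# Preview the candidates first; run from the repository root with files checked out.
if [ -d root-directory ]; then
    list_two_level_settings root-directory
fi

# To unstage them, NUL-delimited names can be piped to git rm
# (assumption: find supports -print0 and xargs supports -0):
# find root-directory -mindepth 2 -maxdepth 2 -type f -name '*settings*' -print0 |
#     xargs -0 git rm --cached --ignore-unmatch --
```

Note that find inspects the working tree, and git filter-branch --index-filter does not check the tree out, so inside a history rewrite this would probably need --tree-filter instead, or a pathspec that git itself matches.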