Despite the seemingly straightforward and readable documentation of git-filter-repo
(https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html), I cannot get it to work.
The use case: given a repo with submodules (which themselves have submodules), keep only a single file or directory from a sub-submodule together with all of its git history, and delete all other files and submodules, including their commits. To make this easy to test, here is an example repo built from empty files:
mkdir main_repo
cd main_repo
git init
touch main.txt
git add main.txt
git commit -m "Initial commit in main repo"
mkdir submodule1
mkdir submodule2
mkdir submodule3
cd submodule1
# Initialize the submodule repository
git init
# Create and commit files within the submodule
touch file1.txt
git add file1.txt
git commit -m "Initial commit in submodule1"
cd ../submodule2
# Initialize the submodule repository
git init
# Create and commit files within the submodule
touch file2.txt
git add file2.txt
git commit -m "Initial commit in submodule2"
cd ../submodule3
# Initialize the submodule repository
git init
# Create and commit files within the submodule
touch file3.txt
git add file3.txt
git commit -m "Initial commit in submodule3"
mkdir subsubmodule1
mkdir subsubmodule2
mkdir subsubmodule3
cd subsubmodule1
git init
# Create random files and commits
touch subsubfile1.txt
git add subsubfile1.txt
git commit -m "Initial commit in subsubmodule1"
cd ../subsubmodule2
git init
# Create random files and commits
touch subsubfile2.txt
git add subsubfile2.txt
git commit -m "Initial commit in subsubmodule2"
cd ../subsubmodule3
git init
# Create random files and commits
touch subsubfile3.txt
git add subsubfile3.txt
git commit -m "Initial commit in subsubmodule3"
cd ..
git submodule add ./subsubmodule1 subsubmodule1
git submodule add ./subsubmodule2 subsubmodule2
git submodule add ./subsubmodule3 subsubmodule3
git add .
git commit -m "Added sub-submodules in submodule3"
cd ..
git submodule add ./submodule1 submodule1
git submodule add ./submodule2 submodule2
git submodule add ./submodule3 submodule3
git add .
git commit -m "Added submodules to main repo"
The structure is then:
main_repo
---submodule1
------file1.txt
---submodule2
------file2.txt
---submodule3
------file3.txt
------subsubmodule1
---------subsubfile1.txt
------subsubmodule2
---------subsubfile2.txt
------subsubmodule3
---------subsubfile3.txt
Say I want to keep only the main_repo/submodule3/subsubmodule3 directory (it could contain more files) and its related commits. All other files and submodules must be deleted, along with the corresponding parts of the history.
I tried:
- Using git-filter-repo on main_repo, without success: it does not seem to "see" inside the submodules. Moreover, it does not delete the other submodules even when I run git-filter-repo --path submodule3/subsubmodule3 --force. Even submodule1 and submodule2 remain in this case, although they are directly visible to main_repo.
- Working from the bottom up: applying git-filter-repo --path subsubmodule3 --force to submodule3, then git-filter-repo --path submodule3 --force to main_repo. This seemed a bit better, but the other submodules, such as submodule1 and submodule2, still could not be deleted.
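One possible explanation for the first observation, as a minimal sketch rather than a confirmed diagnosis: a parent repository records a nested repository only as a gitlink entry (mode 160000) pointing at a commit. The nested files never appear in the parent's own trees, so a path filter run on the parent has nothing inside the submodule to match. The names demo, sub and inner.txt below are made up for illustration and are not part of the example repo above.

```shell
#!/bin/sh
# Sketch: show that a parent repo tracks a nested repo as a single gitlink
# (mode 160000), not as individual files. The names demo, sub and inner.txt
# are hypothetical and unrelated to main_repo above.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Parent repository.
git init -q demo
cd demo

# Nested repository with one committed (empty) file.
git init -q sub
touch sub/inner.txt
git -C sub add inner.txt
git -C sub -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "inner commit"

# Adding a directory that is itself a repository stores a gitlink entry;
# git prints an "embedded repository" warning, silenced here.
git add sub 2>/dev/null
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "add sub as gitlink"

# List every path tracked by the parent's HEAD commit.
tracked=$(git ls-tree -r HEAD)
echo "$tracked"
```

The listing should contain a single entry of type commit with mode 160000 for sub, and no entry for sub/inner.txt. If that also holds for main_repo, it would be consistent with a filter on submodule3/subsubmodule3 finding nothing to keep in the parent's history.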
Questions:
- Is git-filter-repo the right tool and I am just using it poorly, or is it unsuitable for this use case?
- Is there a readily available tool for this use case, or do I have to end up writing complicated extraction scripts (which is probably beyond my competence, to be honest)?