git-filter-repo: how to filter with submodules present


Despite the seemingly straightforward and readable documentation of git-filter-repo (https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html), I cannot get it to work.

The use case: given a repo with submodules (which themselves have submodules), keep only a single file or directory from a sub-submodule together with all of its git history, and delete all other files and submodules (including their commits). To make this easy to test, here is an example repo built from empty files:

mkdir main_repo
cd main_repo
git init
touch main.txt
git add main.txt
git commit -m "Initial commit in main repo"
mkdir submodule1
mkdir submodule2
mkdir submodule3
cd submodule1

# Initialize the submodule repository
git init

# Create and commit files within the submodule
touch file1.txt
git add file1.txt
git commit -m "Initial commit in submodule1"

cd ../submodule2

# Initialize the submodule repository
git init

# Create and commit files within the submodule
touch file2.txt
git add file2.txt
git commit -m "Initial commit in submodule2"

cd ../submodule3

# Initialize the submodule repository
git init

# Create and commit files within the submodule
touch file3.txt
git add file3.txt
git commit -m "Initial commit in submodule3" 

mkdir subsubmodule1
mkdir subsubmodule2
mkdir subsubmodule3

cd subsubmodule1
git init
# Create and commit a file within the sub-submodule
touch subsubfile1.txt
git add subsubfile1.txt
git commit -m "Initial commit in subsubmodule1"

cd ../subsubmodule2
git init
# Create and commit a file within the sub-submodule
touch subsubfile2.txt
git add subsubfile2.txt
git commit -m "Initial commit in subsubmodule2"

cd ../subsubmodule3
git init
# Create and commit a file within the sub-submodule
touch subsubfile3.txt
git add subsubfile3.txt
git commit -m "Initial commit in subsubmodule3"

cd ..

git submodule add ./subsubmodule1 subsubmodule1
git submodule add ./subsubmodule2 subsubmodule2
git submodule add ./subsubmodule3 subsubmodule3
git add .
git commit -m "Added sub-submodules in submodule3"

cd ..

git submodule add ./submodule1 submodule1
git submodule add ./submodule2 submodule2
git submodule add ./submodule3 submodule3
git add .
git commit -m "Added submodules to main repo"

The structure is then:

main_repo
---submodule1
------file1.txt
---submodule2
------file2.txt
---submodule3
------file3.txt
------subsubmodule1
---------subsubfile1.txt
------subsubmodule2
---------subsubfile2.txt
------subsubmodule3
---------subsubfile3.txt

Say I want to keep only the main_repo/submodule3/subsubmodule3 directory (which could contain more files) and its related commits. All other files and submodules must be deleted, along with their parts of the history.

I tried:

  • Running git-filter-repo on main_repo, without success: it does not seem to "see" inside the submodules. It also does not delete the other submodules when I run git-filter-repo --path submodule3/subsubmodule3 --force; even submodule1 and submodule2 remain, although they are directly visible to main_repo.
  • Working bottom-up: applying git-filter-repo --path subsubmodule3 --force to submodule3, then git-filter-repo --path submodule3 --force to main_repo. This looked slightly better, but the other submodules, such as submodule1 and submodule2, still were not deleted.
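
A possible explanation for the first result, offered as an assumption rather than a confirmed diagnosis: a superproject does not store a submodule's files in its own history. Its tree holds only a gitlink entry (mode 160000) pointing at a commit in the submodule's separate repository, plus a top-level .gitmodules file. A path such as submodule3/subsubmodule3 may therefore not exist anywhere in main_repo's own trees. The sketch below checks this on a throwaway repository; the scratch directory, the names demo_main and demo_sub, and the demo identity are hypothetical and not part of the setup above.

```shell
#!/bin/sh
set -e

# Throwaway identity so commits work without any git config
# (hypothetical values, for illustration only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Scratch directory; demo_main and demo_sub are hypothetical names.
work=$(mktemp -d)
cd "$work"
git init -q demo_main
cd demo_main
touch main.txt
git add main.txt
git commit -q -m "Initial commit in demo_main"

# Nested repository created in place, mirroring the setup script above.
git init -q demo_sub
touch demo_sub/file.txt
git -C demo_sub add file.txt
git -C demo_sub commit -q -m "Initial commit in demo_sub"

git submodule --quiet add ./demo_sub demo_sub
git commit -q -m "Add demo_sub as a submodule"

# Mode and object type that demo_main records for the submodule path.
entry=$(git ls-tree HEAD demo_sub)
mode=$(printf '%s\n' "$entry" | cut -d' ' -f1)
type=$(printf '%s\n' "$entry" | cut -d' ' -f2)
echo "demo_sub is stored as: $mode $type"

# Every path tracked by demo_main, listed recursively.
tracked=$(git ls-tree -r --name-only HEAD)
echo "$tracked"
```

The recursive listing should contain demo_sub itself but not demo_sub/file.txt, since that file belongs to the nested repository's history. The equivalent check on the example repo would be git ls-tree HEAD submodule3 run inside main_repo.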

Questions:

  • Is git-filter-repo the right tool and I am just using it incorrectly, or is it unsuitable for this use case?
  • Is there a readily available tool for this use case, or do I have to end up writing complicated extraction scripts (which is probably beyond my competence, to be honest)?