I work with a code base shared by a number of people, and I have a branch that is for my personal use only. About 10 commits ago I made a big mistake: I removed a particular file type from .gitignore and then pushed a number of files of that type (which add up to a lot of disk space) to the remote of my personal branch. Now our repo is far larger than it should be, and I need to remove these files from the past commits. I have already added the file type back to .gitignore and removed the files for future commits, so only the past commits still contain them.

I have done some research and know I have options such as git filter-branch and git filter-repo, while other solutions seem to apply only to modifying the most recent commit. I prefer not to use git filter-branch, as I have seen many warnings about it. I have tried git filter-repo, but I think I am misunderstanding how to push my changes only to the branch I want to change (without merging them into or affecting the master branch, which I am not able to push to). If there is a simple solution that involves editing a single commit, that would also work. Also note that it is not one large file but many (10-20) files that I need to remove, and they live in a few different paths.

I am hoping the solution will look something like this: check out the branch whose history I want to change > run some sort of git command to remove the files from that branch or to change particular commits (it would be easy if I could just edit the .gitignore of those commits) > push those changes to the remote version of that specific branch.

The result being that I have successfully removed files from old commits on the remote version of this one branch thus reducing the overall size of our repository.

This was a huge screw up on my part, any and all advice is appreciated!

Best answer, by TTT

Given that the problem is only in commits on your own branch and hasn't been merged anywhere else yet, I believe re-writing your own branch is the simplest course of action. Suppose your git log --oneline looks similar to this:

db759dd Refactor ...
ed7e3cc Update ...
fff1234 Fix .gitignore and delete unwanted files # The FIX
314fc46 Update ...
fea0230 Refactor ...
d9fdf5d Update ...
74249a2 Increase ...
c985a1c Fix ...
d94122f Add ...
bad9999 Update .gitignore and add big files # The ISSUE
3a284fb Increase ...
abc1234 Merge PR 1234: Increase number of threads # BRANCH START
...

In the above example, commit ID abc1234 represents the commit you branched off from when creating your branch. Commit bad9999 represents the problem commit, and fff1234 represents the commit with the fix. With your branch checked out, you want to do an interactive rebase, specifying the commit you branched off from (or any later commit that occurred before the problem commit), like this:

git rebase -i abc1234
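If you are not sure which commit you branched off from, git merge-base can look it up for you. The following is only a sketch in a throwaway repository; it assumes the shared branch is called master, and my-branch is a made-up name for your personal branch.

```shell
set -eu
# Throwaway repository so the example is safe to run anywhere.
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo base > README
git add README
git commit -q -m "BRANCH START"
branch_start=$(git rev-parse HEAD)

# Hypothetical personal branch with two commits on it.
git checkout -q -b my-branch
echo one > a.txt
git add a.txt
git commit -q -m "Increase ..."
echo two > b.txt
git add b.txt
git commit -q -m "Update ..."

# The commit where my-branch forked from master.
base=$(git merge-base master HEAD)
echo "branch point: $base"

# The commits an interactive rebase onto $base would list, oldest first.
todo=$(git log --reverse --oneline "$base"..HEAD)
echo "$todo"
```

In your real repository you would then pass that commit to the rebase, for example git rebase -i "$base".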

Now you will be presented with a TODO list, which displays all (non-merge) commits after the one you specified, oldest first (the reverse of the git log order), like this:

pick 3a284fb Increase ...
pick bad9999 Update .gitignore and add big files
pick d94122f Add ...
pick c985a1c Fix ...
pick 74249a2 Increase ...
pick d9fdf5d Update ...
pick fea0230 Refactor ...
pick 314fc46 Update ...
pick fff1234 Fix .gitignore and delete unwanted files
pick ed7e3cc Update ...
pick db759dd Refactor ...

Tip: while doing an interactive rebase, if you change your mind and decide to cancel it, you must first delete all the lines in the file (or at least all the ones with instructions such as "pick"), and then save and exit. If you simply save and exit without first deleting the lines, you will still proceed with the rebase (which might end up having no effect if your branch was linear, but it's not worth hoping for this if you really intend to cancel it).

Now you have a decision to make. Can you simply delete the ISSUE commit, and if so would it make sense to also delete the FIX commit? If you can do this, change the word "pick" to "d" (for "drop", or you could simply delete the line completely!) on the ISSUE and perhaps also the FIX lines, like this:

pick 3a284fb Increase ...
d bad9999 Update .gitignore and add big files
pick d94122f Add ...
pick c985a1c Fix ...
pick 74249a2 Increase ...
pick d9fdf5d Update ...
pick fea0230 Refactor ...
pick 314fc46 Update ...
d fff1234 Fix .gitignore and delete unwanted files
pick ed7e3cc Update ...
pick db759dd Refactor ...
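If you want to rehearse the drop before touching your real branch, the same TODO edit can be scripted in a scratch repository. This is a rough sketch, not the only way to do it: Git reads the GIT_SEQUENCE_EDITOR environment variable to decide which program edits the TODO list, and here that program is a small sed script. The file names (big.bin, notes.txt), the branch name and the "ISSUE:"/"FIX:" commit subjects are all made up for the demo.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo '*.bin' > .gitignore
git add .gitignore
git commit -q -m "BRANCH START"
base=$(git rev-parse HEAD)
git checkout -q -b my-branch

# The ISSUE: stop ignoring *.bin and commit a "big" file.
: > .gitignore
echo "pretend this is huge" > big.bin
git add .gitignore big.bin
git commit -q -m "ISSUE: update .gitignore and add big files"

# Unrelated work that must survive the rewrite.
echo notes > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# The FIX: ignore *.bin again and delete the file.
echo '*.bin' > .gitignore
git rm -q big.bin
git add .gitignore
git commit -q -m "FIX: fix .gitignore and delete unwanted files"

# Stand-in for the manual edit: turn "pick" into "drop" on both lines.
cat > "$tmp/drop.sh" <<'EOF'
#!/bin/sh
sed -e '/ISSUE:/s/^pick/drop/' -e '/FIX:/s/^pick/drop/' "$1" > "$1.tmp"
mv "$1.tmp" "$1"
EOF
chmod +x "$tmp/drop.sh"

GIT_SEQUENCE_EDITOR="$tmp/drop.sh" git rebase -q -i "$base"

# Commits on the branch that still touch a .bin file (should be none).
remaining=$(git log --oneline "$base"..HEAD -- '*.bin')
git log --oneline "$base"..HEAD
```

After the rebase only the "Add notes" commit is left on the branch, and no commit in the range touches a .bin file.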

If the ISSUE commit also had other changes in it that you want to keep, then instead of completely dropping it, the change you can (probably; see note 1) make to fix the bad commit is to simply move up the FIX commit and squash it into the ISSUE commit, like this:

pick 3a284fb Increase ...
pick bad9999 Update .gitignore and add big files
s fff1234 Fix .gitignore and delete unwanted files
pick d94122f Add ...
pick c985a1c Fix ...
pick 74249a2 Increase ...
pick d9fdf5d Update ...
pick fea0230 Refactor ...
pick 314fc46 Update ...
pick ed7e3cc Update ...
pick db759dd Refactor ...

Note "s" is the same as "squash", and you always squash "up" to the previous picked commit, so this will squash those 2 commits together. If the FIX commit reverses some of the changes made by the ISSUE commit, those changes will cancel out and won't be present in the newly created commit (see note 2). Now save and exit, and your rebase will begin. As the rebase progresses, it will pause after the squash and prompt you to modify the commit message of the new squashed commit. After you write the new commit message and save and exit, the rebase will continue.
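To see the cancelling-out effect for yourself, here is a hedged sketch of the squash in a scratch repository. The ISSUE commit also adds a file worth keeping (keep.txt), so it cannot simply be dropped. The script set as GIT_SEQUENCE_EDITOR stands in for moving the FIX line up by hand, and GIT_EDITOR=true just accepts the default combined commit message. All file names, the branch name and the commit subjects are invented for the demo.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo '*.bin' > .gitignore
git add .gitignore
git commit -q -m "BRANCH START"
base=$(git rev-parse HEAD)
git checkout -q -b my-branch

# The ISSUE commit: a big file plus a change we want to keep.
: > .gitignore
echo "pretend this is huge" > big.bin
echo "useful work" > keep.txt
git add .gitignore big.bin keep.txt
git commit -q -m "ISSUE: update .gitignore and add big files"

echo notes > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# The FIX commit undoes only the unwanted part of ISSUE.
echo '*.bin' > .gitignore
git rm -q big.bin
git add .gitignore
git commit -q -m "FIX: fix .gitignore and delete unwanted files"

# Stand-in for the manual edit: move FIX directly below ISSUE as "squash".
cat > "$tmp/squash.sh" <<'EOF'
#!/bin/sh
fix=$(grep '^pick .*FIX:' "$1" | sed 's/^pick/squash/')
awk -v fix="$fix" '
  /^pick .*FIX:/   { next }        # remove FIX from its old position
  { print }
  /^pick .*ISSUE:/ { print fix }   # re-insert it right after ISSUE
' "$1" > "$1.tmp"
mv "$1.tmp" "$1"
EOF
chmod +x "$tmp/squash.sh"

GIT_SEQUENCE_EDITOR="$tmp/squash.sh" GIT_EDITOR=true git rebase -q -i "$base"

# Commits on the branch that still touch a .bin file (should be none).
remaining=$(git log --oneline "$base"..HEAD -- '*.bin')
git log --oneline --stat "$base"..HEAD
```

The squashed commit ends up containing only keep.txt: the big file was added and deleted within the same commit, so it never appears in the rewritten history.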

Once the rebase is finished, look over the commits to make sure you are happy with them, and then force push out your branch:

git push --force-with-lease
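Since you only want to touch your own branch, you can also name the remote and branch explicitly, so that nothing else (master included) is pushed. Below is a small sketch using a local bare repository as a stand-in for the real remote; origin.git and my-branch are made-up names, and the "rewrite" here is just a reset for brevity.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A local bare repository plays the role of the shared remote.
git init -q --bare origin.git
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git remote add origin "$tmp/origin.git"

echo base > README
git add README
git commit -q -m "BRANCH START"
git push -q origin master

git checkout -q -b my-branch
echo "pretend this is huge" > big.bin
git add big.bin
git commit -q -m "ISSUE: add big files"
git push -q -u origin my-branch
master_before=$(git ls-remote origin refs/heads/master | cut -f1)

# Rewrite the branch locally (stand-in for the interactive rebase).
git reset -q --hard master
echo notes > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Push only my-branch. The lease makes the push fail if origin/my-branch
# has moved since we last saw it, instead of silently overwriting it.
git push -q --force-with-lease origin my-branch

remote_branch=$(git ls-remote origin refs/heads/my-branch | cut -f1)
master_after=$(git ls-remote origin refs/heads/master | cut -f1)
echo "remote my-branch: $remote_branch"
```

The remote my-branch now points at the rewritten commit, while the remote master is exactly where it was.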

Now your big bad commits should be gone from the repo history, and new clones will not contain them. They may still remain indefinitely on the server, though, perhaps even reachable via the web UI if you know the commit ID. If that is a problem for you, two typical options are to ask the repo host's support team whether they can purge the orphaned commits on the server for you, or to delete and re-upload the repo, which may have minimal impact since only about 10 commit IDs are being purged. Before deleting and re-pushing the repo, I would confirm with support that you won't have to redo all of your security settings.
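If you want to check what is still reachable from your own branch, one option is to list the largest blobs in the commits since the branch point. This is only a sketch: it builds a scratch repository in which a 5000-byte big.bin (a made-up file) is still in the branch history, so you can see what a hit looks like. On a cleaned branch the big files should no longer show up in this list.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo base > README
git add README
git commit -q -m "BRANCH START"
base=$(git rev-parse HEAD)
git checkout -q -b my-branch

# A branch that still carries a large file in its history.
head -c 5000 /dev/zero > big.bin
git add big.bin
git commit -q -m "ISSUE: add big files"
echo notes > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# List every object introduced since $base with its type, size and path,
# keep only blobs, and sort by size, largest first.
largest=$(git rev-list --objects "$base"..HEAD |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk '$1 == "blob"' |
  sort -k2,2nr |
  head -n 1)
echo "largest blob on the branch: $largest"
```

In your real repository you would replace $base with the commit you branched off from and look at more than the first line of the sorted output.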

Notes:

1 If the FIX commit contained more than just the fix, and if that other stuff depends on commits that came between ISSUE and FIX, then you're going to have conflicts here. Conflicts are a fact of life and you'll just have to resolve them. (Fortunately they are often straightforward to resolve, especially in your case where you wrote both sides of the conflicting code.)

2 If the FIX commit is the exact opposite of the ISSUE commit (for example if you created FIX by reverting ISSUE), then a new squashed commit won't even be created; it will just be skipped, and your new branch history will have 2 fewer commits, as if neither existed. If this applies to you, you could also have simply dropped both commits in the rebase TODO list.