So, I've been searching all morning for the correct way to do this, and I'm just not command-line savvy enough to figure it out.
I have a git repo with a ton of assets in it. It's like the cardinal sin, I know.
The repo has grown far too large. I'd like to clean it up by programmatically removing, from the entire history of the repo, every file that no longer exists in HEAD. I've seen ways to do this where you specify the file paths, but really, I am talking about 1000+ files that have been removed from our final product and that I really don't care to have in my repo anymore.
UPDATE:
I've cleaned the repo of all the assets that shouldn't have been there in the first place. I really just have source code in there now and a few assets that SHOULD be there. I'd really love to keep all the history of all the source code... so I'm really looking to scrap the deleted files from history while preserving the history of what currently exists. That's the goal. I am pretty sure it can be done using git filter-branch, but I just don't understand it well enough.
Use the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for removing unwanted files from Git history.
By default, the BFG 'protects' all files in your HEAD commit, but will delete other files that match your criteria.
You should carefully follow the usage instructions, but the core part is just this:
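A sketch of that step, matching the 1MB threshold described in this answer. The jar name `bfg.jar` and the repo name `my-repo.git` are placeholders (assumptions, not fixed names); substitute wherever you saved the BFG jar and the clone you are cleaning.

```shell
# Placeholder names: bfg.jar is the downloaded BFG jar,
# my-repo.git is the clone of the repository you want to clean.
# Strips blobs larger than 1MB from history, except those in the HEAD commit.
java -jar bfg.jar --strip-blobs-bigger-than 1M my-repo.git
```

The BFG rewrites the commits but does not physically discard the old data on its own. Its usage instructions follow the run with `git reflog expire --expire=now --all && git gc --prune=now --aggressive` inside the repo, which is worth doing on a fresh clone you can afford to throw away.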
Any files over 1MB in size that aren't in your latest commit will be removed from your Git repository's history. If you have normal, smaller-than-1MB source files that you still want to remove, you can specify them with the --delete-files or --delete-folders options.
The BFG is typically at least 10-50x faster than running git-filter-branch, and generally easier to use.
Full disclosure: I'm the author of the BFG Repo-Cleaner.
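For the smaller files, it may help to first see which paths exist somewhere in history but not in HEAD, so you know what to hand to --delete-files or --delete-folders. The following is only a rough sketch using plain git, not part of the BFG; the function name `list_paths_not_in_head` is made up for illustration, and it assumes you run it from inside the repository.

```shell
# Hypothetical helper: print every path that appears anywhere in history
# (on any ref) but is absent from the HEAD commit.
list_paths_not_in_head() {
  all_paths=$(mktemp) && head_paths=$(mktemp) || return 1

  # Every path touched by any commit on any ref, de-duplicated.
  git -c core.quotePath=false log --all --pretty=format: --name-only \
    | sed '/^$/d' | LC_ALL=C sort -u > "$all_paths"

  # Every path present in the HEAD commit.
  git -c core.quotePath=false ls-tree -r --name-only HEAD \
    | LC_ALL=C sort -u > "$head_paths"

  # Lines only in the first list: in history, but not in HEAD.
  LC_ALL=C comm -23 "$all_paths" "$head_paths"

  rm -f "$all_paths" "$head_paths"
}
```

Calling `list_paths_not_in_head` prints one path per line. Note that --delete-files matches on file name rather than on the full path within the repo, so treat the output as a checklist for choosing names or globs, not as something to feed to the BFG verbatim.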