How to remove a dangling commit from GitHub?


Yesterday, I pushed to my fork of ConnectBot on GitHub. I pushed once, realized that I hadn't made the change the way I wanted, redid the commit and pushed again.

Now, GitHub has both commits:

My master branch is only tracking the second commit, but the first commit is still available and is still in my activity feed. How can I remove it to make sure no one accidentally pulls that commit instead of the corrected version?

3

There are 3 answers

1
Ciro Santilli OurBigBook.com On BEST ANSWER

Delete the repo or contact GitHub

Deleting the repo and recreating it without the bad commit seems to work, if you can afford to lose all issues. The data also disappears from the commit API (although push events are still visible). See also: https://stackoverflow.com/a/32840254/895245

If you can't afford to lose issue data, GitHub Support can manually delete dangling commits. For example, when I uploaded all GitHub commit emails to a repo, they asked me to take it down, so I did, and they ran a gc. However, pull requests that contain the data also have to be deleted: because of this, that repo's data remained accessible for up to one year after the initial takedown.

Their current help page says:

you can permanently remove all of your repository's cached views and pull requests on GitHub by contacting GitHub Support.

Making the repo private might also keep the issues around while getting rid of the commit, but I'm not sure. You definitely lose stars and forks, though. I'm also not sure whether the commits will be gone after a restore. But at least you might be able to keep a private backup of the issues.

10
Emil Sit On

If you really need it to be removed immediately, you would probably have to contact GitHub Support.

Pulling should generate a pack that contains only referenced objects, so no one should get that commit as a result of a clone or a pull. For example:

$ git clone git://github.com/nylen/connectbot.git
Cloning into connectbot...
remote: Counting objects: 6261, done.
remote: Compressing objects: 100% (1900/1900), done.
remote: Total 6261 (delta 3739), reused 5980 (delta 3520)
Receiving objects: 100% (6261/6261), 3.04 MiB | 3.40 MiB/s, done.
Resolving deltas: 100% (3739/3739), done.
$ git cat-file -t 1cd775d
fatal: Not a valid object name 1cd775d
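The same behaviour can be reproduced without touching GitHub. The following is only a local sketch under stated assumptions: the directory names `origin-repo` and `fresh-clone` are made up for illustration, and the clone goes through a `file://` URL so that Git builds a pack instead of copying the object directory (a plain local-path clone would copy unreachable objects too). It says nothing about what GitHub keeps on its own servers.

```shell
set -eu

# Throwaway working area; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"

git init -q origin-repo
cd origin-repo
git config user.email "you@example.com"
git config user.name "Example User"

# First attempt, later replaced by an amended commit.
echo first > file.txt
git add file.txt
git commit -q -m "first attempt"
old=$(git rev-parse HEAD)

echo second > file.txt
git commit -q -a --amend -m "corrected attempt"
new=$(git rev-parse HEAD)

# Clone over file:// so objects are transferred as a pack.
cd "$work"
git clone -q "file://$work/origin-repo" fresh-clone
cd fresh-clone

# The corrected commit is present; the amended-away one should not be.
git cat-file -t "$new"
if git cat-file -e "$old" 2>/dev/null; then
  echo "old commit was transferred"
else
  echo "old commit was not transferred"
fi
```
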
0
cjs On

How can I remove it to make sure no one accidentally pulls that commit instead of the corrected version?

There's no need to do this: anybody using the master branch on your repo will get the correct commit (i.e., whatever commit the master branch points to at the time).

The other commit hasn't been garbage-collected because there's still a reference to it somewhere.

In local repos this is usually the reflog. The commit will be GC'd once the reflog entry recording that HEAD and/or master pointed to it at some time in the past either ages out or is explicitly deleted.
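For the local case, here is a rough sketch of what "explicitly deleted" can look like. It assumes a throwaway repo created just for the demonstration, and it expires every reflog entry, which is destructive: you would not normally do this in a repo where you might still want to recover old commits. It has no effect on anything GitHub stores.

```shell
set -eu

# Throwaway repo so nothing real is at risk.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo first > file.txt
git add file.txt
git commit -q -m "first attempt"
old=$(git rev-parse HEAD)

echo second > file.txt
git commit -q -a --amend -m "corrected attempt"

# Still present: the reflog remembers that HEAD pointed here.
git cat-file -t "$old"

# Drop all reflog entries now, then prune unreachable objects now.
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$old" 2>/dev/null; then
  echo "old commit still present"
else
  echo "old commit pruned"
fi
```

Without the `--expire=now` and `--prune=now` overrides, Git's default expiry windows apply and the commit simply disappears some time later on its own.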

GitHub is a bit more complex because there are plenty of things outside of a particular repo that can reference commits in a repo. This includes PRs, issues, and apparently even references in other repos, as the current message at the top of https://github.com/nylen/connectbot/commit/1cd775d indicates:

This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

I tried to track down where this reference might be by checking issues and PRs in the upstream repo and another fork repo mentioned in a PR for the "good" commit above. I didn't manage to track it down, but the upstream repo currently has 624 active forks, each with its own set of commits, PRs and issues (and whatever else GitHub has that references commits), so the reference is no doubt in there somewhere.

But again, there's no need to worry about this. Anybody looking at your master branch will always get the "correct" commit as of the last time they fetched your repo. If they happen to have a local branch built on the older version of that commit, they'll have to resolve things in the usual way. (In situations like this one, usually a simple git rebase will "replace" the old commit with the new one on the local branch.)
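As an illustration of that last point, here is one way it might play out for someone who branched off the old commit. This is a sketch, not the only approach: the repo names `upstream` and `downstream` and the branch name `topic` are invented, and it uses the explicit `git rebase --onto` form rather than a bare `git rebase`, because naming the old commit makes it unambiguous which commit gets left behind.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the repo whose commit was redone.
git init -q upstream
cd upstream
git config user.email "you@example.com"
git config user.name "Example User"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on config

echo base > file.txt
git add file.txt
git commit -q -m "base"
echo "first attempt" > change.txt
git add change.txt
git commit -q -m "first attempt"
old=$(git rev-parse HEAD)

# Someone clones and starts work on top of the first attempt.
cd "$work"
git clone -q upstream downstream
cd downstream
git config user.email "other@example.com"
git config user.name "Other User"
git checkout -q -b topic
echo local > local.txt
git add local.txt
git commit -q -m "local work"

# The original author redoes the commit.
cd "$work/upstream"
echo "corrected" > change.txt
git commit -q -a --amend -m "corrected attempt"

# Downstream fetches and moves only its own work onto the new commit.
cd "$work/downstream"
git fetch -q origin
git rebase -q --onto "origin/$branch" "$old" topic

git log --format=%s
```

After the rebase, `topic` contains "base", "corrected attempt" and "local work", and the first attempt is no longer in its history.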