How to fix corrupted git repository - "git fsck" reports "warning in tree [hash]: contains entries pointing to null sha1"


Overview:

I am unable to successfully pull changes in our repo to our production server.

Running "git fsck" on my repo returned 5 instances of the same error:

warning in tree [hash]: contains entries pointing to a null sha1

The error exists in all versions of our repo, including the version hosted on Bitbucket.

My colleague and I both have unpushed and uncommitted changes in our local versions of the repo that we'd very much like to preserve.

I've tried to google, stackoverflow, and man page my way out of this but I can't find a good guide that explains what's going on or how to fix the problem.

My colleague and I are relative noobs when it comes to Git. We've got the basics down but we haven't spent any time in the low-level commands yet.

I'd appreciate any and all help to restore the integrity of my repo.

Detailed description:

My problems started when I tried to pull a remote branch to my production server. It should have been a simple update to the working directory but I got some vague error that I can't remember and found my working directory was corrupted.

Git status reported tons of untracked and modified files after the failed merge. I couldn't figure out how to fix the problem with git commands so I manually manipulated the file system to remove the files (but I didn't touch anything in the .git directory) and got my working directory back to a state where my production server would serve my website without errors.

Running "git fsck" on my repo returned 5 instances of the same error:

warning in tree [hash]: contains entries pointing to a null sha1

I ran git fsck on:

  • my repo on my dev machine
  • my colleague's dev machine
  • a freshly cloned copy of the repo from Bitbucket on both dev and prod

Everything I tried shows the same warnings. So whatever the problem is, it's in all versions of our repo.

Calling "git ls-tree [tree hash reporting an error]" shows a normal directory listing along with the bad entry:

160000 commit 0000000000000000000000000000000000000000 [name of repo]

The closest thing to a solution I found is this Stack Overflow post: "How to remove an entry with null sha1 in a Git tree". However, I couldn't really follow the steps, and copying and pasting the commands failed to resolve my problems.

My questions:

  • What do these errors really mean? How serious are they?
  • How do we repair our repo (if possible please go step-by-step for us noobs)?
  • Should we be committing and pushing all our changes to the repo before we repair it or after?
  • What are the implications of repairing the repo? How do we distribute the repair to all versions of the repo (e.g. to dev machines and the production server)?
  • What causes this error and how do we prevent it from reoccurring?

Answer by AnoE:

This question is very old; still, maybe it helps someone else.

  • This error probably means that you had submodules once, later got rid of them, and something went wrong.
  • How to repair it very much depends on what exactly the involved trees look like. The key is to understand the git internals well enough to figure out what to do. This is not actually very hard, since git has only a handful of concepts at its root. Check out https://git-scm.com/book/en/v2/Git-Internals-Git-Objects . The question you linked has an answer with very good pointers.
  • I would do it like this:
    • Make a local copy (using your OS file tools, not git) of your repository, then do a git reset --hard in the copy, then fix everything.
    • Send this fixed repository to a new remote so your friend can pick it up.
    • Both of you copy your local changes over from your original repositories, again with cp, not git.
    • Commit, push, pull as usual until those three new repositories are fine.
    • Throw away your old local repositories, and replace your old remote with git push --all --force.
  • The implications depend on how complicated the actual repairs were. But it will likely be comparable to a huge rebase, i.e. new commit hashes everywhere and git giving you the "diverged branches" message with hundreds of commits in between. You should not lose history. I would advise against pushing/pulling/merging between the old and new repositories.
  • The cause will likely have been some submodule shenanigans, since submodules are the one thing that causes commit hashes to appear in trees. To avoid... avoid submodules? Hard to say.
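
To make the first steps more concrete, here is a minimal shell sketch of the "copy first, then investigate" part. It only copies and inspects; it does not repair anything. The function names (copy_repo, find_null_entries, scan_history) and the example paths are hypothetical and not part of git. It assumes the broken entries show up in "git ls-tree" the way the question describes, as a line whose object id is all zeros.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these names come from git.

# Copy a repository (working tree plus .git) with plain file tools,
# refusing to overwrite an existing destination.
copy_repo() {
  src=$1
  dst=$2
  [ -d "$src/.git" ] || { echo "not a repository: $src" >&2; return 1; }
  [ ! -e "$dst" ] || { echo "destination exists: $dst" >&2; return 1; }
  cp -Rp "$src" "$dst"
}

# Filter `git ls-tree` output ("<mode> <type> <oid><TAB><path>") down to
# the paths whose object id consists only of zeros.
find_null_entries() {
  awk -F'\t' '{ split($1, meta, " "); if (meta[3] ~ /^0+$/) print $2 }'
}

# For every commit reachable from any ref, print "<commit> <path>" for
# each all-zero entry found in that commit's tree.
scan_history() {
  git rev-list --all | while read -r commit; do
    git ls-tree -r "$commit" | find_null_entries | sed "s|^|$commit |"
  done
}

# Example usage (paths are placeholders):
#   copy_repo "$HOME/work/site" "$HOME/work/site-fix"
#   cd "$HOME/work/site-fix" && git reset --hard && scan_history
```

The list printed by scan_history tells you which paths are affected and how far back in history the problem goes, which is what you need to know before choosing a repair. Note that git reset --hard discards uncommitted changes, so run it only in the copy; your original working directories stay untouched.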