Rehash A Corrupt Git Repo

Long story short, we are moving everything from an old repo to a new repo due to a major restructure in terms of both the product and workflow, and due to several severe data issues. I.e. kill old repo, take all files, dump into new repo, start from scratch as far as Git is concerned.

Now, I did this, but I am running into the same issues as we had in the old repo. There is certainly no shortage of problems, either:

error: inflate: data stream error (incorrect data check)
error: sha1 mismatch 3bf84448dc14c5773dcaaea2e5d28c099fe6cc32
error: 3bf84448dc14c5773dcaaea2e5d28c099fe6cc32: object corrupt or missing
error: sha1 mismatch 3d9d3715b55262d61a11b0b7fa9b01b3c9a6beaa
error: 3d9d3715b55262d61a11b0b7fa9b01b3c9a6beaa: object corrupt or missing
error: sha1 mismatch 525f1182a21a8e7b7d65062effe0d89c3937a2e8
error: 525f1182a21a8e7b7d65062effe0d89c3937a2e8: object corrupt or missing
error: inflate: data stream error (incorrect data check)
error: sha1 mismatch 53ad3219a54af10015ba006a895f67a29bb262e1
error: 53ad3219a54af10015ba006a895f67a29bb262e1: object corrupt or missing

Manually locating the files behind these blobs shows that they are totally fine and uncorrupted. Rehashing them manually has seemingly fixed the targeted files, but there are still many mismatches and other issues.
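
For readers unsure what "rehashing manually" can look like, here is a minimal sketch using `git hash-object -w`, which recomputes a file's blob ID and writes the object into the object store. The throwaway repository and `file.txt` are hypothetical, and the damage is simulated by deleting a loose object; this is one plausible way to do it, not necessarily what was done above.

```shell
set -eu

# Throwaway repository; every path and file name here is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q .
printf 'hello\n' > file.txt
git add file.txt

# Blob ID that the index records for file.txt.
blob=$(git rev-parse :file.txt)

# Simulate damage by deleting the loose object behind that ID.
prefix=$(printf '%s' "$blob" | cut -c1-2)
rest=$(printf '%s' "$blob" | cut -c3-)
rm -f ".git/objects/$prefix/$rest"

# Rehash the intact working-tree file and write the object back (-w).
rehashed=$(git hash-object -w file.txt)

# The IDs match only if the file content is byte-identical to what was stored.
echo "index expects: $blob"
echo "rehashed as:   $rehashed"
```

This only repairs a blob whose working-tree copy still hashes to the same ID; files that changed since they were committed produce a different ID and leave the old object missing.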

All of my working source is present and the data is fine. Is there a way to entirely reconstruct the repository filestructure from my working source, i.e. entirely clean Git and have it rehash everything, or is there some other suggested approach?

There is 1 answer

Mathieu J. On

I have had issues with corrupted Git repos more than a few times. Doing any manual operations inside the project/.git folder is strongly discouraged. Sometimes I have no idea what I did wrong; sometimes I ran something like grep -rl pattern | xargs sed -i 's/foo/bar/' and had obviously forgotten to exclude the files under .git.

In most cases I had to give up on the repo, usually keeping a copy in a graveyard-corrupted folder just in case, although I never ended up going back to it. Creating a new clone is usually fine, as I push often and keep everything of value on the remote.

But if you have local branches or stashes you want to restore, fixing your repo is worth the effort.

Today I found the solution.

# step out of the corrupted clone
cd ..
# back up your local clone before touching it
cp -a project project-corrupted
# clone a fresh copy from the remote (remote_url is a placeholder)
git clone remote_url project-fresh
# pack every ref of the fresh clone into a bundle file
(cd project-fresh && git bundle create b-all --all)
# unpack the bundle's objects on top of your corrupted repo
cd project
git bundle unbundle ../project-fresh/b-all
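
To check whether the unbundle actually repaired anything, `git fsck --full` can be run before and after. Below is a self-contained sketch of the whole sequence on a throwaway setup. The `origin-repo` directory standing in for the remote, the `a.txt` file, and the demo identity are all hypothetical, and the damage is simulated by deleting a loose object, which may behave differently from an object that is present but unreadable.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the remote, holding a single commit.
git init -q origin-repo
(cd origin-repo &&
  printf 'data\n' > a.txt &&
  git add a.txt &&
  git -c user.name=demo -c user.email=demo@example.com commit -q -m init)

# The clone that will play the corrupted repo.
git clone -q origin-repo project

# Simulate damage by deleting the loose object of a committed blob.
blob=$(git -C project rev-parse HEAD:a.txt)
prefix=$(printf '%s' "$blob" | cut -c1-2)
rest=$(printf '%s' "$blob" | cut -c3-)
rm -f "project/.git/objects/$prefix/$rest"

# fsck should now complain about the missing object.
git -C project fsck --full || echo "fsck reports damage, as expected"

# Same steps as above: fresh clone, bundle, unbundle.
git clone -q origin-repo project-fresh
(cd project-fresh && git bundle create b-all --all)
(cd project && git bundle unbundle ../project-fresh/b-all)

# After unbundling, the object is available again and fsck passes.
git -C project fsck --full
```

Only objects that exist on the remote come back this way; anything that was never pushed still has to be recovered from the corrupted copy.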

Hope this helps someone.