git rebase a blob after a reset lost the changes

I created a Git repository with GitHub for Windows and did a reset before committing my files and now I've lost all my projects/files.

I can find the files with:

$ git fsck --lost-found

My files are in .git\lost-found\other, and I can view them with $ git show SHA, but when I run $ git rebase SHA I get this:

error: Object SHA is a blob, not a commit
fatal: Needed a single revision
invalid upstream SHA

What can I do to recover my files?

Alternatively, can I read the files with another program or language?

1 answer

Answered by yanhan

I'm not too familiar with git recovery in general, but you might want to take a look at this Stack Overflow question:

Accidentally reverted to master, lost uncommitted changes

I am assuming you did not git add, git stash, or git commit your changes, so I think none of the methods outlined in the accepted answer to the above question will work.

Based on what you said, after executing git fsck --lost-found, your files are now in the .git\lost-found\other folder, and you can look at the blobs using git show <SHA1>. git rebase only works for commits and not blobs, which is why it does not work.

I think the safest way is to write a script, or manually run git show for each blob, and redirect the output to a file. Suppose there is a blob called 4156fb7a, which was originally the README.txt file in your repository. You would do:

git show 4156fb7a > README.txt

So that the contents of that blob now get output to README.txt. The process will be tedious if you do it manually, but this is probably a safe way to do it.
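If there are many blobs, the same loop can be scripted. Below is a minimal Python sketch, not a tested recovery tool. It assumes that every file name in .git\lost-found\other is the SHA of a dangling blob, as the question describes, and it writes each blob to <SHA>.txt in an output folder. The names dump_blobs, git_show and read_blob are my own and not part of git; read_blob exists only so the loop can be tried without a real repository.

```python
import subprocess
from pathlib import Path


def git_show(sha: str, repo: Path) -> bytes:
    """Return the raw output of `git show <sha>` run inside repo."""
    result = subprocess.run(
        ["git", "show", sha], cwd=repo, capture_output=True, check=True
    )
    return result.stdout


def dump_blobs(repo: Path, out_dir: Path, read_blob=git_show) -> list:
    """Write every object named in .git/lost-found/other to out_dir/<sha>.txt."""
    lost = repo / ".git" / "lost-found" / "other"
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for entry in sorted(lost.iterdir()):
        if not entry.is_file():
            continue
        # Assumption: the file name is the object's SHA.
        target = out_dir / f"{entry.name}.txt"
        target.write_bytes(read_blob(entry.name, repo))
        written.append(target)
    return written
```

Calling dump_blobs(Path("."), Path("recovered")) from the repository root would then produce one file per blob, which you still have to rename by hand. Also, according to the git fsck documentation, --lost-found writes a blob's contents (not just its name) into the lost-found file, so copying those files directly may work as well.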

I think that anything not present in the .git\lost-found\other folder can be assumed to be lost for good. Remember to commit early and often. Hope that helps.

EDIT to update answer:

@Malaka, if there are not a lot of blobs, you could inspect them yourself. Otherwise, and this is just a guess, you probably didn't modify all 900 files. If you remember what you modified, the restoration job is a lot easier. If not, you might want to look at my suggestion below.

What I'm going to suggest may or may not work out. It assumes you have another folder containing the original files, whether that folder is a git repo or not. If you do not have such a pristine folder/repo, then my idea below is useless.

STEP 1:

Write a computer program that computes a checksum, say SHA1 (using the sha1sum utility; if that is not present, use md5sum), for every single file in the pristine folder. The program should ultimately generate a text file in the following format:

fullFileName<SP>SHA1 checksum

Where <SP> stands for a space. So let's say I have the following files in the original folder/repo:

app/controllers/ApplicationController.rb
assets/utilities.js
README.markdown

Then your program will generate a file that looks something like the following. The checksums are all made up of course:

app/controllers/ApplicationController.rb 01dfea13
assets/utilities.js 55a31aae
README.markdown 9671c6a0

We shall refer to the above file as checksumfile from now on.

I hope that your original filenames do not contain spaces. If they do, just use some other delimiter to separate the filename and its checksum.
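Step 1 could look roughly like the Python sketch below. It uses Python's hashlib instead of the sha1sum utility, which is my substitution; the digests should be the same SHA1 values, but treat that as something to verify on a couple of files first. The names write_checksumfile and sha1_of and the sep parameter are made up for illustration. Paths are written relative to the pristine folder with forward slashes, and any .git directory is skipped.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksumfile(pristine: Path, checksumfile: Path, sep: str = " ") -> int:
    """Write one 'relative/name<sep>checksum' line per file; return the count."""
    # Collect entries first, so the output file is never hashed half-written.
    entries = []
    for path in sorted(pristine.rglob("*")):
        relative = path.relative_to(pristine)
        if not path.is_file() or ".git" in relative.parts:
            continue
        entries.append((relative.as_posix(), sha1_of(path)))
    with checksumfile.open("w", encoding="utf-8") as out:
        for name, checksum in entries:
            out.write(f"{name}{sep}{checksum}\n")
    return len(entries)
```

If your filenames contain spaces, passing something like sep="\t" is one way to pick a different delimiter.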

STEP 2:

Now, write another computer program (we'll call it blobCat) that applies git show to every single blob in the .git\lost-found\other folder and writes each one to an arbitrarily named file in another folder. We shall call that folder unknownBlobs. For simplicity, you could just name the files 0.txt, 1.txt, 2.txt, and so on; in other words, use an integer counter to name them.

Once done, write another program that reads in checksumfile and uses a dictionary to map each checksum to its original filename. In other words, given the checksumfile above, we would generate the following dictionary/hash:

"01dfea13" => "app/controllers/ApplicationController.rb"
"55a31aae" => "assets/utilities.js"
"9671c6a0" => "README.markdown"

Now, traverse the unknownBlobs folder generated by the blobCat program (the folder containing all the 0.txt, 1.txt, 2.txt files), and apply the same checksum utility that you used in the first step to generate the checksumfile. If a blob's checksum is in the dictionary, the blob was very likely that file originally, and you can safely restore it to its place in the hierarchy in another folder. You might want to check whether another blob has the same checksum in case of checksum collisions, although this is very unlikely to happen.

For any blob in the unknownBlobs folder that has a checksum not found in the dictionary, this can mean any of the following:

  1. It was modified from some file existing in the original repository
  2. It was not part of the repository
  3. It was a newly created file that was never tracked

In any case, just keep track of those blobs not found in the dictionary, and output their names at the end of everything. This should be a rather small set of blobs, so you can inspect them manually and determine if they are part of your original repository, then copy them over to their destination if they are.
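Here is a rough Python sketch of this matching step, under a few assumptions of mine: the checksumfile has one "filename checksum" line per file, the checksums are hex SHA1 digests of the file contents, and the blobs have already been dumped into a folder such as unknownBlobs. The function names load_checksums and restore_matches are invented for this example.

```python
import hashlib
import shutil
from pathlib import Path


def load_checksums(checksumfile: Path, sep: str = " ") -> dict:
    """Map each checksum to its original filename, as in the dictionary above."""
    index = {}
    for line in checksumfile.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Split on the LAST separator, so a filename may itself contain spaces.
        name, checksum = line.rsplit(sep, 1)
        # Files with identical contents share a checksum; the last name wins.
        index[checksum] = name
    return index


def restore_matches(blob_dir: Path, index: dict, restore_dir: Path) -> list:
    """Copy matched blobs into restore_dir; return the names of unmatched blobs."""
    unmatched = []
    for blob in sorted(blob_dir.iterdir()):
        if not blob.is_file():
            continue
        checksum = hashlib.sha1(blob.read_bytes()).hexdigest()
        name = index.get(checksum)
        if name is None:
            unmatched.append(blob.name)  # modified, new, or unrelated file
            continue
        target = restore_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(blob, target)
    return unmatched
```

The returned list is the small set of blobs to inspect by hand. One caveat I am fairly sure of but have not tested on your setup: if the pristine folder was checked out on Windows with line-ending conversion enabled, its files may have CRLF endings while the blobs have LF, and then the checksums will not match even for unmodified files.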

The above sounds tedious but I think it is much faster than trying to inspect all the blobs by yourself, assuming there's a lot of them.