I found many interesting posts about git fsck, so I wanted to experiment a little with it. First of all, the sources I read before asking this question:
How can I find an unreachable commit hash in a GIT repository by keywords?
git fsck: how --dangling vs. --unreachable vs. --lost-found differ?
I started with this repo:
* 9c7d1ea (HEAD -> test) f
* cd28884 e
| * 7b7bac0 (master) d
| * cab074f c
|/
* d35af2c b
| * f907f39 r # unreferenced commit
|/
* 81d6675 a
Where r has been created from a detached HEAD at a.
Then I wanted to rebase master on test, but I had some unstaged changes, so I did:
git rebase --autostash test
Obtaining (I am not showing r, but it is still there):
* caee68c (HEAD -> master) d
* 2e1cb7d c
* 9c7d1ea (test) f
* cd28884 e
* d35af2c b
* 81d6675 a
Next I ran:
$ git fsck
#...
dangling commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
#...
$ git fsck --unreachable
#...
unreachable commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
unreachable commit d8bb677ce0f6602f4ccad46123ee50f2bf6b5819 # stash index
#...
$ git fsck --lost-found
#...
dangling commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
dangling commit f907f39d41763accf6d64f4c736642c0120d5ae2 # r
#...
First question
Why does only the --lost-found version return the r commit? And why are the c and d commits from before the rebase not shown among the unreachables? I thought I understood the difference from reading the linked questions, but I am clearly missing something. I still have the complete reflog, but I guess you do not need it, since all commits (except those related to the stash) are referenced.
I know I should create another post but the second question is partially related. I tried out of curiosity:
$ git fsck --lost-found --unreachable
#...
unreachable commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
unreachable commit d8bb677ce0f6602f4ccad46123ee50f2bf6b5819 # stash index
unreachable commit f907f39d41763accf6d64f4c736642c0120d5ae2 # r
unreachable commit 7b7bac0608936a0bcc29267f68091de3466de1cf # d before rebase
unreachable commit cab074f2c9d63919c3fa59a2dd63ec874b0f0891 # c before rebase
#...
Second question
Combining both options I get all the unreachable commits (and not just the union of --lost-found and --unreachable), which is very unexpected. Why does it behave like this?
Some of this is indeed puzzling, and appears not to be properly documented, but a quick look at builtin/fsck.c shows that using --lost-found:
1. implies --full;
2. implies --no-reflogs.
Item 1 isn't particularly interesting since --full is now on by default anyway, but the documentation really should call out that --lost-found disables --no-full. Item 2 explains most of the rest; I have a guess at the last part. [Edit: item 2 explains the rest.]
Note that when you ran:
git rebase --autostash test
this made Git run git stash push, which made a new stash consisting of two new commits. Git then did the rebase as usual, which copied the cab074f and 7b7bac0 commits, visible in the original git log --all --decorate --oneline --graph output, to the new 2e1cb7d and caee68c commits visible in the second output.
Presumably the r commit is still in the HEAD reflog. That makes it reachable from a reference, but since --lost-found implies --no-reflogs, it becomes unreachable this time. The same goes for the originals of c and d: they're reachable via multiple reflog entries, from both HEAD's reflog and master's.
The behavior in your second question is more puzzling. [Edit: solved; see below.] Let's run these in order of your git fsck
commands:
fsck 1 and fsck 2: Both discover the autostash commits. That's because git stash push copied the original refs/stash to the stash reflog, so that refs/stash could point to the autostash w (working tree) commit. Then the implied git stash apply && git stash drop (git stash pop) applied the stash and dropped it, moving the stash@{1} entry back to refs/stash and deleting the stash reflog. So the w commit from the autostash is truly "dangling". It's not in refs/stash and it's not even in the stash reflog, because git stash (ab)uses this reflog as the "stash stack". It does, however, point to the i commit from the autostash.
The first fsck, then, prints 6387b70fe14f1ecb90e650faba5270128694613d and calls it "dangling". That's the w commit that was dropped. The second fsck, with --unreachable, adds d8bb677ce0f6602f4ccad46123ee50f2bf6b5819: the corresponding i
commit that was dropped.
fsck 3: The r and rebased commits remained invisible under git fsck --unreachable because they're referenced from the reflogs. But now, with --lost-found, fsck does not look at the reflogs. We should expect to see the autostash w commit, the r commit, and the pre-rebase d, all as dangling. [Edit: as per comment, this is wrong: w links back to i and d, so this will hide d.]
We actually see the w and r commits but not the d commit. Why not? This is my guess, but it's easy to test if you still have the setup around: when you use git rebase successfully, Git creates or updates the pseudo-ref named ORIG_HEAD to remember the hash ID of the tip commit before the rebase completes. Note that this same name is used to remember the previous value of a ref after a successful git reset that moves one, and after any other operation that might move a branch name some distance (fast-forward merge, for instance). It's pretty obvious that git fsck must consider all of the various *_HEAD pseudo-refs as starting points for reachability. This, too, is not documented (and it's not even completely clear it's intentional here; the ref code has been under some fairly heavy rework lately, to support alternative ref backends).
fsck 4, just before your SECOND QUESTION section:
either --unreachable turned off the pseudo-ref inclusion, or (I think this is more likely) you did something in between that touched ORIG_HEAD so that it no longer selected the original, pre-rebase d commit. [Edit] Since --unreachable lists all unreachable commits, the fact that d is reachable indirectly from the autostash w commit is irrelevant, and we see everything.
If you would like to report a Git documentation bug, namely that the fsck documentation does not note that --lost-found implies --no-reflogs, you should do that.
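If you no longer have the original setup around, the two effects described above (reflogs hiding commits, and "dangling" versus "unreachable") can be reproduced in a throwaway repository. This is only a minimal sketch, not your exact history: the file name and the commit names base, mid and tip are made up for illustration, with tip pointing back to mid in the same way the autostash w commit points back to d.

```shell
#!/bin/sh
# Sketch: compare git fsck modes in a scratch repository.
# All names here (file, base, mid, tip) are illustrative assumptions.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
# Set an identity locally so commits work without global configuration.
git config user.name "Example User"
git config user.email "user@example.com"

# One commit on the initial branch, whatever its default name is.
echo "base" > file
git add file
git commit -q -m base
branch=$(git symbolic-ref --short HEAD)

# Two commits on a detached HEAD; tip has mid as its parent.
git checkout -q --detach
echo "mid" > file
git commit -q -a -m mid
mid=$(git rev-parse HEAD)
echo "the tip commit" > file
git commit -q -a -m tip
tip=$(git rev-parse HEAD)

# Back on the branch, mid and tip are referenced only by the HEAD reflog.
git checkout -q "$branch"

default_out=$(git fsck)                                # reflogs count as roots
lost_found_out=$(git fsck --lost-found)                # reflogs ignored
no_reflogs_out=$(git fsck --no-reflogs)                # reflogs ignored
unreachable_out=$(git fsck --no-reflogs --unreachable) # list everything unreachable

printf '== git fsck ==\n%s\n' "$default_out"
printf '== git fsck --lost-found ==\n%s\n' "$lost_found_out"
printf '== git fsck --no-reflogs ==\n%s\n' "$no_reflogs_out"
printf '== git fsck --no-reflogs --unreachable ==\n%s\n' "$unreachable_out"
```

Plain git fsck should not mention tip at all, since the HEAD reflog still reaches it. Both --lost-found and --no-reflogs should report tip as dangling, while mid stays hidden because tip points to it. Adding --unreachable should list both commits (along with their trees and blobs).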