I'm working on a pre-commit hook to reformat code, and in general it works: it reformats and runs git add on any staged files, and the resulting commit contains the reformatted code as desired.
However, it doesn't play nicely with git commit --only (which is the variant used by JetBrains IDEs), and I'm trying to understand why. The combination of git commit --only and the pre-commit hook results in an undesirable index/working tree state, as described in the following sequence of events:
If I make a small change with a formatting error to a file and then run git status, this is what I see:
    On branch master
    Your branch is up-to-date with 'origin/master'.

    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   file.php

    no changes added to commit (use "git add" and/or "git commit -a")
If I then commit using git commit --only -- file.php, the pre-commit hook runs, and the changed and reformatted file.php is committed.
However, if I then run git status again, this is the result (arrow annotations mine):
    On branch master
    Your branch is ahead of 'origin/master' by 1 commit.
      (use "git push" to publish your local commits)

    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)

        modified:   file.php    <-- contains original change, improperly formatted

    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   file.php    <-- contains original change, properly formatted (per the most recent commit)
Where are the new staged change and the change in the working tree coming from?
Can someone explain exactly how git commit --only interacts with the index to produce the result shown above and, even better, whether there's a way to have my pre-commit hook play nicely with it?
My understanding is that git commit --only works with the version of the file in the working tree, so I tried removing the git add step from the pre-commit hook to see what would happen. The result was that the improperly formatted version of the file was committed and the properly formatted one was left in the working tree (which matches my expectations for a standard git commit, but I wasn't sure what to expect in the context of git commit --only).
I'm aware of the possibility of using a clean filter to reformat the code, rather than a pre-commit hook, but that approach introduces a few situational complications that would be nice to avoid if possible.
Note: This question is related to "Phpstorm and pre commit hooks that modify files" but is focused on addressing the problem in the context of git commit --only. Moreover, the problem doesn't seem to have been addressed by JetBrains, as was suggested in the accepted answer to that question.
The precise details vary from one version of Git to another, and some people (I'm not saying the JetBrains folks are among them, as I have no idea) have tried to bypass the way Git does things and, in the process, screwed things up such that either the problems can't be worked around, or the workaround is Git-version-dependent. However, the main idea in these Git hooks is all the same:
These two need not be in sync when you first run git commit, and if you add files to the git commit command, with either --only or --include, Git must then make a new index, which may differ from the regular ordinary index. So now we wind up with an environment variable, GIT_INDEX_FILE, set to the path of a new, temporary index.[1] Since all Git commands automatically respect the environment variables, the pre-commit hook will use the temporary index's files, and git write-tree will use the temporary index's files.
Of course, anything that fails to respect the temporary index (or, potentially, depending on --include vs --only, just uses the contents of the work-tree) will get the wrong answer.
There is still a problem, though, even with programs that do respect the environment variables. Suppose we have a file (let's call it test, since that's its purpose) that initially contains "headvers", and matches the current (HEAD) commit. Now we modify it in the work-tree to contain "indexvers" and run git add test. The index version of test thus reads "indexvers". Now we modify it again in the work-tree, to contain "workvers", and run either git commit --only test or git commit --include test.
We know for sure what should go into the new commit: it should be the version of test containing workvers, because we specifically told Git to commit the work-tree version. But what should be left in the index and work-tree afterward? Does this depend on whether we used --include vs --only? I don't know what to consider the "right" answer here! All I can tell you is that when I experimented with Git before, it tended to contain workvers afterward (both in the index, and in the work-tree). That is, the temporary index's version became the normal index's version, and the work-tree file was untouched.
(If you have Git hooks that manipulate the index and/or work-tree, you will be able to pry open the difference between "copying index to saved-index, then copying back" vs "copying index to temp-index, then using temp-index".)
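
The headvers / indexvers / workvers experiment can be tried in a throwaway repository. What follows is only a sketch: the temporary directory, the committer identity, and the commit messages are placeholders made up for illustration, and the value reported for the index may differ depending on your Git version.

```shell
#!/bin/sh
# Sketch: reproduce the headvers / indexvers / workvers experiment.
# The repo location, identity, and messages are illustrative assumptions.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

# HEAD version
echo headvers > test
git add test
git commit -q -m "initial: headvers"

# Index version: staged, but not committed
echo indexvers > test
git add test

# Work-tree version: modified again, not staged
echo workvers > test

# Commit only the named path, taking its content from the work-tree
git commit -q --only -m "commit work-tree version" -- test

committed=$(git show HEAD:test)   # content recorded in the new commit
indexed=$(git show :test)         # content held by the normal index afterward
worktree=$(cat test)              # content of the work-tree file

echo "commit:    $committed"
echo "index:     $indexed"
echo "work-tree: $worktree"
```

Swapping --only for --include in the same script is a quick way to compare the two options on one Git version.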
[1] This was the actual implementation at one time, when I was testing various behaviors, but it's possible that the actual implementation has changed a bit. For instance, Git could save the "normal" index in a temporary file and then replace the normal index, so that GIT_INDEX_FILE is not set after all. And, again, it may depend on --include vs --only.
Note that git commit -a may also use a temporary index, or not. I believe this behavior has changed between Git 1.7 and Git 2.10, based on the result of running git status in another window while still editing the commit message in the window that was running git commit -a.
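
Since the implementation may differ between versions, one way to see what your own Git does is a diagnostic pre-commit hook that logs GIT_INDEX_FILE. This is a rough sketch rather than a recommended hook: the throwaway repository, the hook.log file name, and the identity settings are all assumptions made up for the demonstration.

```shell
#!/bin/sh
# Sketch: log which index file Git exposes to a pre-commit hook.
# Repo location, log file name, and identity are illustrative assumptions.
set -e

repo=$(mktemp -d)
log="$repo/hook.log"
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
# Point at this repo's own hooks in case a global core.hooksPath is set.
git config core.hooksPath "$repo/.git/hooks"

mkdir -p .git/hooks
# Unquoted EOF: $log is expanded now, \${...} is left for the hook to expand.
cat > .git/hooks/pre-commit <<EOF
#!/bin/sh
# GIT_INDEX_FILE may or may not be set, depending on Git version and options.
echo "GIT_INDEX_FILE=\${GIT_INDEX_FILE:-<unset>}" >> "$log"
EOF
chmod +x .git/hooks/pre-commit

# Plain commit: the hook runs once.
echo headvers > test
git add test
git commit -q -m "plain commit"

# Commit with --only: the hook runs again, possibly with a different index.
echo workvers > test
git commit -q --only -m "only commit" -- test

cat "$log"
```

If the two logged lines name different files, the --only commit was built from a temporary index, and a hook that runs git add during that commit is staging into that temporary index rather than the normal one.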