Understanding git commit --only and pre-commit hooks

I'm working on a pre-commit hook to reformat code, and in general, it works; it reformats and git adds any staged files, and the resulting commit contains the reformatted code as desired.

However, it doesn't play nicely with git commit --only (which is the variant used by JetBrains IDEs), and I'm trying to understand why. The combination of git commit --only and the pre-commit hook results in an undesirable index/working tree state, as described in the following sequence of events:

If I make a small change with a formatting error to a file and then run git status, this is what I see:

On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   file.php

no changes added to commit (use "git add" and/or "git commit -a")

If I then commit using git commit --only -- file.php, the pre-commit hook runs, and the changed and reformatted file.php is committed.

However, if I then run git status again, this is the result (arrow annotations mine):

On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    modified:   file.php <-- contains original change, improperly formatted

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   file.php <-- contains original change, properly formatted (per the most recent commit)

Where are the new staged change and the change in the working tree coming from?

Can someone explain exactly how git commit --only interacts with the index to produce the result shown above — and even better, whether there's a way to have my pre-commit hook play nicely with it?

My understanding is that git commit --only works with the version of the file in the working tree, so I tried removing the git add step from the pre-commit hook to see what would happen. The result was that the improperly formatted version of the file was committed and the properly formatted one was left in the working tree. That matches my expectations for a standard git commit, but I wasn't sure what to expect in the context of git commit --only.

I'm aware of the possibility of using a clean filter to reformat the code, rather than a pre-commit hook, but there are a few situational complications introduced by that approach that would be nice to avoid if possible.

Note: This question is related to "Phpstorm and pre commit hooks that modify files" but focuses on the problem in the context of git commit --only. Moreover, JetBrains does not seem to have addressed the problem, contrary to what the accepted answer to that question suggested.

There are 2 answers

Answer by torek (best answer)

The precise details vary from one version of Git to another. Some people (I'm not saying the JetBrains folks are among them, as I have no idea) have tried to bypass the way Git does things and, in the process, broken things so that either no workaround exists or the workaround depends on the Git version. However, the main idea behind these Git hooks is always the same:

  • the index contains the commit-to-make, and
  • the work-tree contains the work-tree.

These two need not be in sync when you first run git commit, and if you add files to the git commit command, with either --only or --include, Git must then make a new index, which may differ from the regular ordinary index. So now we wind up with an environment variable, GIT_INDEX_FILE, set to the path of a new, temporary index. [1] Since all Git commands automatically respect the environment variables, the pre-commit hook will use the temporary index's files, and git write-tree will use the temporary index's files.
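To check which index a hook is actually running against, it can help to look at GIT_INDEX_FILE from inside the hook. The following is a minimal Ruby sketch, not part of the original answer: the helper name index_in_use and the default '.git' directory are assumptions made for illustration, and whether Git sets the variable at all may depend on the Git version.

```ruby
# Sketch: report which index file a hook appears to be operating on.
# Assumption: if GIT_INDEX_FILE is unset or empty, the default
# <git_dir>/index is in use. `index_in_use` is a hypothetical helper name.
def index_in_use(env = ENV, git_dir = '.git')
  default_index = File.join(git_dir, 'index')
  path = env['GIT_INDEX_FILE']
  path = default_index if path.nil? || path.empty?

  # Compare absolute paths so './.git/index' and '.git/index' count as equal.
  temporary = File.expand_path(path) != File.expand_path(default_index)
  { path: path, temporary: temporary }
end
```

Calling index_in_use at the top of a pre-commit hook and writing the result to stderr with warn should show whether a plain git commit and git commit --only hand the hook different index files on a given Git version.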

Of course, anything that fails to respect the temporary index (or that, depending on --include vs --only, just uses the contents of the work-tree) will get the wrong answer.

There is still a problem, though, even with programs that do respect the environment variables. Suppose we have a file—let's call it test since that's its purpose—that initially contains "headvers", and matches the current (HEAD) commit. Now we modify it in the work-tree to contain "indexvers" and run git add test. The index version of test thus reads "indexvers". Now we modify it again in the work-tree, to contain "workvers", and run either git commit --only test or git commit --include test.

We know for sure what should go into the new commit: it should be the version of test containing workvers, because we specifically told Git to commit the work-tree version. But what should be left in the index and work-tree afterward? Does this depend on whether we used --include vs --only? I don't know what to consider the "right" answer here! All I can tell you is that when I experimented with Git before, both the index and the work-tree tended to contain workvers afterward. That is, the temporary index's version became the normal index's version, and the work-tree file was untouched.

(If you have Git hooks that manipulate the index and/or work-tree, you will be able to pry open the difference between "copying index to saved-index, then copying back" vs "copying index to temp-index, then using temp-index".)


[1] This was the actual implementation at one time, when I was testing various behaviors, but it's possible that the actual implementation has changed a bit. For instance, Git could save the "normal" index in a temporary file and then replace the normal index, so that GIT_INDEX_FILE is not set after all. And, again, it may depend on --include vs --only.

Note that git commit -a may also use a temporary index, or not. I believe this behavior has changed between Git 1.7 and Git 2.10, based on the result of running git status in another window while still editing the commit message in the window that was running git commit -a.

Answer by Winter Young

I came across the same problem. Here is the solution I got from a JetBrains developer, Dmitriy Smirnov.


git commit --only is used for several reasons:

  1. Git stage is not supported - https://youtrack.jetbrains.com/issue/IDEA-63391
  2. It allows partial commits, that is, committing individual files. This is crucial for supporting changelists in the IDE.

It is not possible to change the behavior at the moment.

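Given the pre-commit hook as follows (Ruby):

# Reformat every changed .java file reported by git status and re-stage it.
`git status --porcelain`.each_line do |line|
    changed_file = line.split(' ', 2)[1].strip
    next unless File.extname(changed_file).downcase == '.java'
    # Pass arguments as a list so paths with spaces or shell metacharacters are safe.
    system('java', '-jar', 'bin/google-java-format-1.4-all-deps.jar', '--aosp', '--replace', changed_file)
    system('git', 'add', changed_file)
end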

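Add a post-commit hook that re-runs the index update for paths whose index entries still differ from HEAD (-g is short for --again):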

git update-index -g

See

https://youtrack.jetbrains.com/issue/IDEA-81139#comment=27-295117
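
One thing to watch for in the pre-commit hook above: git status --porcelain also lists unstaged and untracked files, so the hook reformats and adds those as well. A possible refinement, sketched here and not taken from the JetBrains reply, is to keep only paths whose index column shows a staged change. It assumes the short (v1) porcelain format; the helper name staged_java_paths is hypothetical, and quoted paths (those containing unusual characters) are not handled.

```ruby
# Sketch: pick staged .java paths out of `git status --porcelain` (v1) output.
# Each line is "XY PATH": X is the index status, Y is the work-tree status.
# `staged_java_paths` is a hypothetical helper name used for illustration.
def staged_java_paths(porcelain_output)
  paths = []
  porcelain_output.each_line do |line|
    line = line.chomp
    next if line.length < 4

    index_status = line[0]
    # Skip entries with nothing staged (' '), untracked ('?'), ignored ('!'),
    # and staged deletions ('D'), which leave no file to reformat.
    next if [' ', '?', '!', 'D'].include?(index_status)

    path = line[3..-1]
    # Renames and copies are reported as "OLD -> NEW"; keep the new path.
    path = path.split(' -> ', 2).last if %w[R C].include?(index_status)

    paths << path if File.extname(path).downcase == '.java'
  end
  paths
end
```

The hook could then iterate over staged_java_paths(`git status --porcelain`) in place of the raw status lines, leaving the formatter and git add calls unchanged.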