Git: How to re-stage the staged files in a pre-commit hook


I'm writing a git pre-commit hook.
The script could reformat some code, so it could modify the staged files.

How can I re-stage all files that are already staged?

There are 4 answers

tzi On BEST ANSWER

Without the pre-commit hook context, you can get a list of the staged files with the following command:

git diff --name-only --cached

So if you want to re-index the staged files, you can use:

git diff --name-only --cached | xargs -l git add
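A slightly more defensive variant of that one-liner, as a sketch only. The function name restage_staged is made up for illustration. It assumes your xargs supports -0 (GNU and BSD both do), skips staged deletions with --diff-filter=d (there is nothing on disk to add for those), and passes NUL-delimited names so paths with spaces survive:

```shell
#!/bin/sh
# restage_staged is a hypothetical helper name, not a git command.
# It re-adds every path that currently has staged changes.
restage_staged() {
    # --cached          list paths that differ between HEAD and the index
    # --diff-filter=d   exclude staged deletions
    # -z / -0           NUL-delimited names, so odd filenames stay intact
    git diff --name-only --cached --diff-filter=d -z | xargs -0 git add --
}
```

Note that this still stages the whole file, so any unstaged edits in a partially staged file get staged too.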

In the pre-commit hook context, you should follow David Winterbottom's advice and stash unstaged changes before anything else.

This way you don't have to worry about staging, or altering, a change that was not staged. It also means you don't have to re-stage all the staged files, only the files that were updated:

# Stash unstaged changes
git stash -q --keep-index

# Edit your project files here
...

# Stage updated files
git add -u

# Re-apply original unstaged changes
git stash pop -q
vtwaldo21 On

I had the same issues as the others above: files left with conflict markers, or changes not applied when I felt they should have been. So I tweaked @NearHuscarl's answer just a bit. I admit I don't know much about git, so I don't really understand why this appears to work for me, and it isn't very efficient, but so far the end result has fit my personal use case better (small-ish repo, infrequent-ish commits, super computer development environment). YMMV.

I also liked @Filipe's stance: just abort when there are unstaged changes. That is probably better practice than complicated hook scripts.

In any case, my pre-commit ended up like this:

#!/bin/bash
#
# Allow disabling feature
suppresscheck=$(git config hooks.suppresschecks)

if [ "$suppresscheck" == "true" ]; then
   exit 0
fi

# Redirect output to stderr.
exec 1>&2

#use the commit hash as a shared variable with the post-commit
tfile="/tmp/$(git rev-parse HEAD)"
touch "$tfile"

git diff --diff-filter=ACMR --name-only > "${tfile}.unstaged"
git diff --diff-filter=ACMR --cached --name-only > "${tfile}.staged"

#do nothing if no changes to process; remove the marker files first
#so the post-commit hook does not try to restore a stash that was never made
if [ ! -s "${tfile}.staged" ]; then
   rm -f "${tfile}"*
   exit 0
fi

#stash the working files and leave the staged file
git stash save -q --keep-index "current wd"

./your_script.sh < ${tfile}.staged
RESULT=$?

#check for errors, if found reset everything
if [ $RESULT -ne 0 ]; then
   echo
   echo "pre-commit: you can disable checks with 'git config hooks.suppresschecks true' or 'git commit --no-verify'"
   echo
   git stash save -q "original index"
   git stash apply -q --index stash@{1}
   git stash drop -q; git stash drop -q
   rm -f ${tfile}*
   exit 1
fi

#get the list of files altered
git diff --diff-filter=M --name-only | sort > ${tfile}.cleaned

#save the updates so they are committed
git add -u

#the unstaged changes are placed back during post-commit hook

That left the post-commit as:

#!/bin/bash
#

#pull the prior commit hash as pre-commit used
#that to save off data
tfile=/tmp/$(git log -2 --pretty="%H" | tail -1)

#pre-commit was skipped
if [ ! -f ${tfile} ]; then
   exit 0
fi

#re-apply the stash to the working. had to use apply/drop
#because the pop would leave items in the stash on conflicts
git stash apply -q
git stash drop -q

#for any file that was fully staged prior to commit
#force the working file to match the committed file
while IFS= read -r line; do
   #-x matches the whole line, -F treats the path as a literal string
   if ! grep -qxF -- "$line" "${tfile}.unstaged" ; then
      git reset --quiet -- "$line"
      git checkout -- "$line"
   fi
done < <(git diff --diff-filter=M --name-only )

#generate a list of files that had unstaged changes and were modified
while IFS= read -r line; do
   #-x matches the whole line, -F treats the path as a literal string
   if grep -qxF -- "$line" "${tfile}.staged" ; then
      echo "$line"
   fi
done < <(git diff --diff-filter=M --name-only) | sort > "${tfile}.conflicts"

#remove all files that are conflicting from the list of
#files that were altered, just so each file is only in list A or list B
#(sponge is part of the moreutils package)
comm -23 "${tfile}.cleaned" "${tfile}.conflicts" | sponge "${tfile}.cleaned"

#tell the user which files are automatically altered
if [[ -s ${tfile}.cleaned ]]; then
  tput setaf 3
  echo
  echo "Following files were auto cleaned"
  tput sgr0
  cat ${tfile}.cleaned
  echo
fi

#tell the user which files may require more work
#the file should have the standard git conflict markers
if [[ -s ${tfile}.conflicts ]]; then
  tput setaf 3
  echo
  echo "Following files may need manual resolution (git mergetool -y)"
  tput sgr0
  cat ${tfile}.conflicts
  echo
fi

rm -f ${tfile}*
Filipe On

Sadly, I don't think @NearHuscarl's answer above quite cuts it. It is the closest I've seen, but when you pop the stash in your post-commit hook you will still introduce a merge conflict. That's because what is stashed (even with the --keep-index flag) is both the unstaged changes (which we want) and the staged changes as they were before the auto-formatter ran on them (which we don't want). Popping it creates a merge conflict between the committed, auto-formatted changes and the original, not-yet-formatted stashed changes.

As I understand it, there's no easy way to tell git to stash ONLY the unstaged changes. The --keep-index flag stashes both the unstaged and the staged changes, while leaving the staged changes in place. That differs from the default behavior only in that, by default, the staged changes are also removed from the index and working tree when they are stashed. It does not stash the unstaged changes exclusively, which is really what we'd need.
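If you want to check this on your own repo, here is a small sketch. The helper name stash_versions is made up. It relies on how git records a stash entry: the entry itself is a commit holding the working-tree state, and its second parent is a commit holding the index at stash time, so both versions of a file can be read back:

```shell
#!/bin/sh
# stash_versions is a hypothetical helper name.
# Usage: stash_versions <path>   (run after `git stash --keep-index`)
stash_versions() {
    path=$1
    echo "index (staged) version saved in the stash:"
    # stash@{0}^2 is the commit that recorded the index
    git show "stash@{0}^2:$path"
    echo "working-tree version saved in the stash:"
    # stash@{0} itself recorded the working tree
    git show "stash@{0}:$path"
}
```

If the first part prints your pre-formatting staged content, that is the copy that later collides with the formatted commit when the stash is popped.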

I'd love to be wrong about this, but I'm pretty sure there is no quick-to-implement solution in bash. Like any problem, it is of course solvable, but it takes quite a bit of legwork. lint-staged actually handles this quite gracefully, but not without putting in the work. Here's the PR where they introduced this feature and here's the corresponding discussion on the issue. Even with all that work, there remain edge cases where they explicitly fail the hook and reset the WD to its original state. They simply can't always guarantee that they aren't introducing conflicts.

My takeaway is: if you're dealing with a JavaScript project, use lint-staged. If you're like me and you really want to stick to a simple bash script, it might be worth just checking whether there are any partially staged files before doing anything else, and aborting with a message telling the user to fix them. Until something like lefthook introduces this feature (issue here), your other options are all pretty heinous.
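The "check and abort" idea could look roughly like this. It is a minimal sketch, not a tested hook: the function names are made up, and "partially staged" is taken to mean a path that shows up both in the staged diff and in the unstaged diff.

```shell
#!/bin/sh
# Hypothetical helper: print paths that have both staged and unstaged changes.
partially_staged_files() {
    staged=$(mktemp)
    unstaged=$(mktemp)
    git diff --name-only --cached | sort > "$staged"
    git diff --name-only | sort > "$unstaged"
    # comm -12 keeps only the lines present in both sorted lists
    comm -12 "$staged" "$unstaged"
    rm -f "$staged" "$unstaged"
}

# Hypothetical helper: return non-zero (with a message on stderr)
# when at least one file is partially staged.
abort_if_partially_staged() {
    files=$(partially_staged_files)
    if [ -n "$files" ]; then
        echo "pre-commit: these files have both staged and unstaged changes:" >&2
        echo "$files" >&2
        echo "Stage or stash the remaining changes, then commit again." >&2
        return 1
    fi
}
```

In a pre-commit hook you would call it before touching anything, e.g. `abort_if_partially_staged || exit 1`, and only then run the formatter followed by `git add -u`.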

But I'd love for someone to prove me wrong.

Novice C On

I liked @tzi's answer; however, in the comments on the David Winterbottom article it cites, an edge case is raised in which you will lose some commit history. It's not as doom and gloom as the commenter makes it sound, though, and again it is an edge case for people with problematic practices. It happens when:

  1. You stage a file (version A)
  2. You edit the same file before committing (version B)
  3. You wish to commit the originally staged file (version A) and not the modified one (version B)

If the hook fails, or succeeds and pops the stash before the commit is made, you lose your originally staged file (v. A), as it was never committed and is overwritten (with v. B). This is obviously not catastrophic, and you still have the latest edit (v. B), but it might hamper some people's workflows and (suboptimal) committing practices. To avoid this, you just check the exit status of your script and work some stashing tricks to revert to the original state (index has v. A and WD has v. B).

pre-commit

#!/bin/sh

... # other pre-commit tasks

## Stash unstaged changes, but keep the current index
### Modified files in WD should be those of INDEX (v. A), everything else HEAD
### Stashed was the WD of the original state (v. B)

git stash save -q --keep-index "current wd"

## script for editing project files
### This is editing your original staged files version (v. A), since this is your WD 
### (call changed files v. A')

./your_script.sh

## Check for exit errors of your_script.sh; on errors revert to original state 
## (index has v. A and WD has v. B)

RESULT=$?
if [ $RESULT -ne 0 ]; then
    git stash save -q "original index"
    git stash apply -q --index stash@{1}
    git stash drop -q; git stash drop -q
    exit 1
fi

## Stage your_script.sh modified files (v. A')

git add -u

You should also move the git stash pop to the post-commit hook, as the pop is what overwrites the staged file (v. A) with the modified file (v. B) prior to committing. In practice, your script most likely doesn't fail, but even so, the git stash pop in the pre-commit hook creates a merge conflict between your script-modified files (v. A') and your unstaged modifications (v. B). This prevents the file from being committed at all, but you do still have your script-modified, originally staged file (v. A') and your unstaged, post-staging modified file (v. B). Arguably no significant history is lost, assuming your_script.sh only does things such as indenting, so that v. A and v. A' are pretty much the same.
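For the post-commit side, something along these lines might do. This is only a sketch under assumptions: the function name is made up, and it recognizes the pre-commit stash by the "current wd" message used above, so it leaves any other stash alone. The pop can still end in conflicts if v. A' and v. B touch the same lines.

```shell
#!/bin/sh
# restore_unstaged_after_commit is a hypothetical helper name.
# It pops the top stash only if the pre-commit hook created it.
restore_unstaged_after_commit() {
    # %gs prints the reflog subject of the stash entry,
    # e.g. "On main: current wd"
    top=$(git stash list -n 1 --format=%gs)
    case $top in
        *": current wd") git stash pop -q ;;
        *) return 0 ;;  # no stash from the pre-commit hook, nothing to do
    esac
}
```

The post-commit hook would then just call `restore_unstaged_after_commit`.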

Summary: If you use best practices and commit staged files before modifying them again, the original answer is the easiest and works great. If you have what are, in my opinion, bad habits of not doing so, and want both versions (staged and modified) in your history, you need to be careful (which is itself an argument for why this is a bad practice)! In any case, this could be a possible safety net.