git rename vs delete duplicate heuristic


Suppose I rename a file from a.txt to b.txt. Git detects this properly; very nice. Here's a trickier case. I have two IDENTICAL files, x.txt and y.txt. I want to de-dupe them and call the result z.txt. Git reports one rename and one deletion. The question is, WHICH file gets renamed, and which gets deleted? Who cares, they're the same, right? But only the one that's renamed gets its history preserved, and I do have preferences there.


There are 2 answers

2
Mureinik On

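If you have a preference, don't rely on Git's rename detection to pick for you. Explicitly rename the one you want with git mv x.txt z.txt and delete the other with git rm y.txt. Git does not record renames; it infers them when comparing commits, so commit the rename first and the deletion in a separate commit. That way x.txt is the only possible source for z.txt. Note: this assumes you want to preserve x.txt's history. If you want to preserve y.txt's history, swap x.txt and y.txt in the commands above.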
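A minimal sketch of this approach, run in a throwaway repository. The temporary directory, file contents, identity settings, and commit messages are illustrative assumptions. Splitting the rename and the deletion into two commits is also an assumption added here: Git infers renames when diffing rather than recording them, so keeping y.txt in place during the first commit leaves x.txt as the only deleted path that z.txt can be matched against.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Two identical files, as in the question.
printf 'same content\n' > x.txt
cp x.txt y.txt
git add x.txt y.txt
git commit -q -m "Add two identical files"

# Commit 1: only the rename. y.txt still exists, so x.txt is the
# sole deleted path and the only rename source Git can pair with z.txt.
git mv x.txt z.txt
git commit -q -m "Rename x.txt to z.txt"

# Commit 2: drop the duplicate.
git rm -q y.txt
git commit -q -m "Remove duplicate y.txt"

# Ask Git how it classifies each commit (-M enables rename detection).
rename_status=$(git show --format= --name-status -M HEAD~1)
delete_status=$(git show --format= --name-status -M HEAD)
echo "$rename_status"
echo "$delete_status"
```

The first commit should be reported as a rename (R100) from x.txt to z.txt and the second as a plain deletion (D) of y.txt. To keep y.txt's history instead, swap the two file names.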

1
eftshift0 On

This is hackish, but you can trick Git by doing the rename on a separate branch and then forcing Git to keep both files when merging that branch back.

git checkout -b rename-branch
git mv a.txt b.txt
git commit -m "Renaming file"
# if you did a git blame of b.txt, it would _follow_ a.txt history, right?
git checkout main
git merge --no-ff --no-commit rename-branch
git checkout HEAD -- a.txt # get the file back
git commit -m "Not really renaming file"

With a straight copy (here hello_world.txt copied to new_file.txt), you get this:

$ git log --graph --oneline --name-status
* 70f03aa (HEAD -> master) COpying file straight
| A     new_file.txt
* efc04f3 (first) First commit for file
  A     hello_world.txt
$ git blame -s new_file.txt
70f03aab 1) I am here
70f03aab 2) 
70f03aab 3) Yes I am
$ git blame -s hello_world.txt
^efc04f3 1) I am here
^efc04f3 2) 
^efc04f3 3) Yes I am

With the rename done on a side branch and the original file restored during the merge, you get:

$ git log --oneline --graph master2 --name-status
*   30b76ab (HEAD, master2) Not really renaming
|\  
| * 652921f Renaming file
|/  
|   R100        hello_world.txt new_file.txt
* efc04f3 (first) First commit for file
  A     hello_world.txt
$ git blame -s new_file.txt
^efc04f3 hello_world.txt 1) I am here
^efc04f3 hello_world.txt 2) 
^efc04f3 hello_world.txt 3) Yes I am
$ git blame -s hello_world.txt
^efc04f3 1) I am here
^efc04f3 2) 
^efc04f3 3) Yes I am

The rationale: if you ask for the history of the original file, Git follows it without issue. If you ask for the history of the copy, Git walks into the side branch where the rename was committed, and from there it can jump to the original file, simply because the rename happened on that branch.
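The whole trick can be reproduced end to end with a sketch like the following, which rebuilds the a.txt/b.txt example in a throwaway repository and then checks which commit git blame attributes the copy to. The temporary directory, identity settings, and the variables base, first, and blamed are illustrative assumptions, and the starting branch name is read from the repository rather than assumed to be main.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

printf 'I am here\n\nYes I am\n' > a.txt
git add a.txt
git commit -q -m "First commit for file"
first=$(git rev-parse HEAD)            # commit that introduced the content
base=$(git symbolic-ref --short HEAD)  # whatever the default branch is called

# Rename on a side branch.
git checkout -q -b rename-branch
git mv a.txt b.txt
git commit -q -m "Renaming file"

# Merge it back without committing, restore the original, then commit.
git checkout -q "$base"
git merge -q --no-ff --no-commit rename-branch
git checkout HEAD -- a.txt
git commit -q -m "Not really renaming file"

# Porcelain output starts with the full hash blamed for line 1 of b.txt.
blamed=$(git blame --porcelain b.txt | head -n 1 | cut -d' ' -f1)
echo "first commit: $first"
echo "b.txt line 1: $blamed"
```

If the two printed hashes match, blame on the copy has followed the rename on the side branch back to the commit that first added the original file, while both files remain in the working tree.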