Git manual merge with binary files using a proprietary merge tool. How does git mergetool work?

I have a Git repo containing a CodeSys 2.3 source code file. This is a proprietary binary file that can't be merged easily using the tool built into Git. When I attempt a merge and the Auto-merge fails, I need to manually resolve the conflicts. It's best to have a copy of the local file and a copy of the file with the proposed changes and use the compare tool built into CodeSys to merge the two.

Similar question: How do I merge a binary file?

The answer to that question recommends using something like git checkout --ours binary.dat to choose one version of the file. Typically, I don't want to select one version of the file over the other, but I would like to combine the two using the CodeSys IDE.

The Git book seems to have an answer: https://git-scm.com/book/en/v2/Git-Tools-Advanced-Merging in the section: Manual File Re-merging. The git show command is used to grab the "ours" and "theirs" copies of the file and create a file with an identifier appended to the name.

The example from the book:

$ git show :1:hello.rb > hello.common.rb
$ git show :2:hello.rb > hello.ours.rb
$ git show :3:hello.rb > hello.theirs.rb

I was having issues with this method, as it produced a corrupted file that couldn't be opened in the CodeSys IDE. It turns out this is caused by how Windows PowerShell handles binary data in redirected output streams. Running the command through cmd.exe worked:

cmd /c "git show :3:hello.rb" > hello.theirs.rb

If I use git mergetool and set mergetool.keepBackup = true, a set of files is created, one of which is filename_REMOTE_number.pro. This is the correct file. Unfortunately, mergetool fails and complains because vimdiff can't open the files.

Question 1: How could I automate the process of generating the additional copies of the binary files? Ideally, I would create a git alias command that searches the index for all files in merge conflict and outputs file-name.INCOMING.pro into the working folder.

Question 1.2: How could I create the REMOTE and LOCAL files that mergetool creates without using mergetool?

Question 2: How does the :1:hello.rb > hello.common.rb path work? I can't add wildcards like '*' without getting an error.

Bonus Question: Why are these files in the index? I thought that the index was another name for the staging area, but these files aren't "staged", or even in the working tree at all. Can someone point me towards an explanation of what is in the index? Is it the same as the staging area?

There is 1 answer

BluePhish

I have found a solution to question 1:

Get-ChildItem -Include *.pro -Exclude REMOTE.* -Name | foreach ($_) {"git show `":3:{0}`" > `"REMOTE.{0}`"" -f $_ | cmd.exe } | Select-String -Pattern "fatal*"

Step 1 is to find the binary files that need to be processed. Thankfully, all the files have the .pro extension and typically only one is present in each repo, so we can use the Get-ChildItem cmdlet to find the name of the file. The -Exclude REMOTE.* filter skips any files created by a previous run of this command.

Get-ChildItem -Include *.pro -Exclude REMOTE.* -Name

Now for each name returned by "Get-ChildItem" we want to format the correct git command. The git documentation describes how to select the correct version during a merge: https://git-scm.com/docs/gitrevisions.html#_specifying_revisions. Stage 3 (denoted by the :3:) is the version from the branch that is to be merged.

foreach ($_) {"git show `":3:{0}`" > `"REMOTE.{0}`"" -f $_ | cmd.exe }
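
To check which stages actually exist for a conflicted file before extracting one, the unmerged index entries can be listed with git ls-files -u. The following is a minimal sketch for a POSIX shell such as Git Bash, separate from the PowerShell command. The function name list_conflict_stages and the BASE/LOCAL/REMOTE labels are illustrative (the labels borrow git mergetool's naming), and it assumes simple file names that git does not need to quote.

```shell
# Sketch: print "<label><TAB><path>" for every unmerged index entry matching a pathspec.
# list_conflict_stages is a hypothetical helper name, not a git command.
list_conflict_stages() {
    # git ls-files -u prints one line per stage: "<mode> <object> <stage><TAB><path>"
    git ls-files -u -- "${1:-*.pro}" |
    while read -r mode object stage path; do
        case $stage in
            1) label=BASE ;;    # common ancestor
            2) label=LOCAL ;;   # "ours": the branch being merged into
            3) label=REMOTE ;;  # "theirs": the branch being merged
            *) continue ;;
        esac
        printf '%s\t%s\n' "$label" "$path"
    done
}
```

Running list_conflict_stages '*.pro' during a conflicted merge prints one line per stage that is present. A file that was added independently on both branches, for example, has no BASE line because there is no common ancestor version.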

The formatted command strings are then piped to cmd.exe to be executed, because PowerShell's pipeline can corrupt binary data. (See an investigation here: https://brianreiter.org/2010/01/29/powershells-object-pipeline-corrupts-piped-binary-data/).
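
Another way to avoid the redirection problem is to have Git write the blob to disk itself instead of sending it through a pipe. git checkout-index --temp --stage=N writes the requested stage to a temporary file and prints the temporary name, a tab, and the original path. As far as I can tell, git mergetool builds its LOCAL and REMOTE files in a similar way. A minimal sketch in POSIX shell (for example Git Bash), where extract_stage is a hypothetical helper name and the script is assumed to run from the top level of the repository:

```shell
# Sketch: copy one index stage of a conflicted file to a destination without using ">".
# extract_stage is a hypothetical helper; usage: extract_stage <stage 1|2|3> <path> <dest>
extract_stage() {
    stage=$1
    path=$2
    dest=$3
    # checkout-index prints "<tempname><TAB><path>"; keep only the temp name
    tmp=$(git checkout-index --temp --stage="$stage" -- "$path" 2>/dev/null | cut -f1)
    # empty output means that stage does not exist for this path
    [ -n "$tmp" ] || return 1
    # the temp file name is relative to the top of the working tree
    mv -- "$(git rev-parse --show-cdup)$tmp" "$dest"
}
```

For example, extract_stage 3 plc.pro REMOTE.plc.pro would write the incoming version, with plc.pro standing in for the real file name. Since only a file name passes through the pipe, the underlying git checkout-index call should also be usable from PowerShell, though that is an assumption that has not been verified here.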

The output of cmd.exe is then filtered with Select-String so that only lines matching the "fatal" pattern, i.e. git errors, are displayed.
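
For comparison, the whole workflow (find every conflicted file, then write its stages next to it) can be sketched in a POSIX shell such as Git Bash, where output redirection is binary-safe. This is a sketch under assumptions rather than a tested replacement for the PowerShell command: export_conflict_stages is a hypothetical function name, the BASE/LOCAL/REMOTE prefixes are illustrative, file names are assumed to contain no characters that git would quote, and it is assumed to run from the top level of the repository.

```shell
# Sketch: write BASE.<name>, LOCAL.<name> and REMOTE.<name> beside every conflicted file.
# export_conflict_stages is a hypothetical helper; the optional argument is a git pathspec.
export_conflict_stages() {
    pattern=${1:-*.pro}
    # --diff-filter=U restricts the listing to unmerged (conflicted) paths
    git diff --name-only --diff-filter=U -- "$pattern" |
    while IFS= read -r path; do
        dir=$(dirname -- "$path")
        name=$(basename -- "$path")
        for pair in 1:BASE 2:LOCAL 3:REMOTE; do
            stage=${pair%%:*}   # index stage number
            label=${pair#*:}    # prefix for the output file
            # skip stages that are missing, e.g. no stage 1 when both sides added the file
            if git cat-file -e ":$stage:$path" 2>/dev/null; then
                git show ":$stage:$path" > "$dir/$label.$name"
            fi
        done
    done
}
```

Calling export_conflict_stages '*.pro' after a failed merge leaves the three copies beside each conflicted file, ready to be opened in the CodeSys compare tool. The function only looks at unmerged index entries, so files produced by an earlier run are not picked up again. It could be saved as a script and invoked through a git alias that starts with "!", since such aliases are executed from the top-level directory of the repository.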