version control for binary files

I'm in the process of writing a program that will use its own binary format for storing data.

I would like to make it possible to have some form of version control: at the very least, two different people should be able to make changes to a file and then merge those changes.

As git seems to be a very common and very powerful version control system, I am wondering if it is possible to use it. I have only basic knowledge of git (pull add commit push), but I understand it has more advanced features. I want to understand if I can implement some basic functionality and get all the advanced features for free.

So the use case is that each project would consist of a sole binary file, and git would have to be able to work with it.

From what I have found so far, I understand that I would have to write a custom mergetool. Is my understanding correct?

I also understand that a three-way merge is done by a diff3 program, or at least that git has some of its functionality embedded. Would I have to write a custom version of it? Would it even be possible to use it with git? Would it be necessary to recompile git?

git also stores commits as changes in order to save space. Does it use diff for that? Would it be possible to replace it? Would git need to be recompiled?
Is there any other kind of functionality that I would have to implement?
My initial plan was to use a single file for the project, but each project is made of independent sub-projects that would be merged independently. Would I gain anything by storing each sub-project in its own file?

Is there good documentation anywhere on the interface that a diff, diff3 and mergetool must conform to? In which languages could these be written?

I'm quite confused because everyone seems to be interested in eliminating binary files from version control and apparently nobody wants to use git on them. Is it a bad idea? I feel like any kind of data for which merging makes sense in some way should be version controlled.

There is 1 answer

CodeWizard

So the use case is that each project would consist of a sole binary file, and git would have to be able to work with it.

There are 3rd party tools for handling binary files.


but each project is made of independent sub-projects

Git has submodules and subtrees for this purpose.

Note that git submodule add pins the added repository at a single commit (the latest one at the time it is added).


Is there good documentation anywhere on the interface that a diff, diff3 and mergetool must conform to?

In which languages could these be written?

Read this question, which has references to the diff2 and diff3 algorithms:
What is the diff version git use? diff2 or diff3?
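
As for the interface itself: git does not need to be recompiled. A custom merge driver is just an external command, written in any language, that git calls with the paths of the base version (%O), your version (%A) and the other version (%B). It has to write the merged result over the %A file and exit with 0 on a clean merge or non-zero on a conflict. Below is a rough sketch of such a driver in Python. The file layout (length-prefixed sub-project records), the .proj extension and the driver name projmerge are all made up for illustration, so treat it as a starting point under those assumptions and not as a finished tool.

```python
#!/usr/bin/env python3
"""Sketch of a git merge driver for a HYPOTHETICAL binary project format.

Assumed setup (names are illustrative, not from the question):
  .gitattributes:   *.proj merge=projmerge
  git config merge.projmerge.driver "python3 merge_driver.py %O %A %B"
"""
import struct
import sys


def load(path):
    """Parse the assumed layout: repeated (name_len, name, data_len, data)."""
    with open(path, "rb") as f:
        blob = f.read()
    records, pos = {}, 0
    while pos < len(blob):
        (name_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (data_len,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        records[name] = blob[pos:pos + data_len]
        pos += data_len
    return records


def dump(records, path):
    """Write records back in the same assumed layout, sorted by name."""
    with open(path, "wb") as f:
        for name in sorted(records):
            raw = name.encode("utf-8")
            f.write(struct.pack(">H", len(raw)) + raw)
            f.write(struct.pack(">I", len(records[name])) + records[name])


def three_way_merge(base, ours, theirs):
    """Merge per sub-project; a missing key (None) means 'deleted/absent'."""
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:        # both sides agree (or made the same change)
            winner = o
        elif o == b:      # only their side changed
            winner = t
        elif t == b:      # only our side changed
            winner = o
        else:             # both changed differently: keep ours, report it
            conflicts.append(name)
            winner = o
        if winner is not None:
            merged[name] = winner
    return merged, conflicts


def main(argv):
    base_path, ours_path, theirs_path = argv[1:4]
    merged, conflicts = three_way_merge(
        load(base_path), load(ours_path), load(theirs_path))
    dump(merged, ours_path)          # git reads the result from the %A file
    return 1 if conflicts else 0     # non-zero tells git the merge conflicted


if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv))
```

Because the merge is done per record, two people editing different sub-projects merge cleanly even though everything lives in one file; only edits to the same sub-project are reported as a conflict. How a conflict inside a single sub-project should be resolved depends on your format and is left open here.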