I have a project with more than 3 years of history in an SVN repository. It was migrated to git, but the person who did the migration just took the latest version and threw away all 3 years of history.
Now the project has the last 3-4 months of history in one repository, and I've imported the other 3 years of svn history into a new git repository.
Is there some way to connect the root commit of the second repository to the last commit of the first one?
It is something like this:
* 2017-04-21 - last commit on master
|
* 2017-03-20 - merge branch Y into master
|\
| * 2017-03-19 - commit on branch Y
| |
* | 2017-03-18 - merge branch X into master
/| * 2017-02-17 - commit on another new branch Y
* |/ 2017-02-16 - commit on branch X
| * 2017-02-15 - commit on master branch
* | 2017-01-14 - commit on new branch X
\|
* 2017-01-13 - first commit on new repository
|
* 2017-01-12 - init new git project with the last version of the code in svn repository
.
.
There is no relationship between the two different repositories yet, this is what I wanna
do. I want to connect the root commit of 2nd repository with the last commit of the first
one.
.
.
* 2017-01-09 - commit
|
* 2017-01-08 - commit
|
* 2017-01-07 - merge
/|
* | 2016-01-06 - 2nd commit the other branch
| * 2016-01-05 - commit on trunk
* | 2016-01-04 - commit on new branch
\|
* 2015-01-03 - first commit
|
* 2015-01-02 - beginning of the project
Update:
I just learned that I need to do a git rebase, but how? Please treat the commit dates as if they were the SHA-1 hashes...
The answer was to use git filter-branch with the --parent-filter option, not a git rebase.
Update 2:
I tried the command git filter-branch --parent-filter 'test $GIT_COMMIT = 443aec8880e898710796a1c4fb4decea1ca5ff66 && echo "-p 98e2b95e07b84ad1e40c3231e66840ea910e9d66" || cat' HEAD
and it didn't work:
PS D:\git\rebase-test\rep2cc> git filter-branch --parent-filter 'test $GIT_COMMIT = 443aec8880e898710796a1c4fb4decea1ca5ff66 && echo "-p 98e2b95e07b84ad1e40c3231e66840ea910e9d66" || cat' HEAD
fatal: ambiguous argument '98e2b95e07b84ad1e40c3231e66840ea910e9d66 || cat': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Update 3:
It didn't work in Windows CMD or PowerShell, but it did work in Git Bash on Windows.
First things first: you need a single repo that has all the available history.
Make a clone of the repo with the recent history, and add the repo with the old history as a remote. I recommend making this clone a "mirror" (git clone --mirror) and finishing by replacing your origin repo with it. Alternatively, you can leave --mirror off, and you'll finish by pushing (possibly force-pushing, depending on which approach you use) all refs back to origin.
The next thing you need to do is figure out where you'll be splicing the history. The terminology for this is a bit fuzzy, I think... what you want is to find the two commits that correspond to the most recent SVN revision for which both histories have a commit. For example, say your SVN repo contained versions 1, 2, 3, and 4. Now you have an old history A, B, C', D' and a recent history C, D, E, F, where A represents version 1, B represents version 2, C and C' represent version 3, and D and D' represent version 4. E and F are work created after the original migration. So you want to splice the commits whose parent is D (E in this example) onto D'.
Now, I can think of two approaches, each with pros and cons.
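To make the setup concrete before picking an approach, here is a minimal sketch that builds two throwaway repos standing in for the old and recent histories, combines them in a mirror clone, and locates the splice points by comparing tree hashes. The directory names, the remote name old-history, the make_repo helper, and the assumption that the tip of the old history is the last shared SVN revision are all illustrative; in real use you would clone and fetch from your actual repo URLs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway sandbox so nothing real is touched.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create a repo at $1 with one commit per remaining argument.
make_repo() {
  local dir=$1; shift
  git init -q "$dir"
  git -C "$dir" symbolic-ref HEAD refs/heads/master
  for v in "$@"; do
    echo "$v" > "$dir/file.txt"
    git -C "$dir" add file.txt
    git -C "$dir" commit -q -m "$v"
  done
}

make_repo old    v1 v2 v3 v4   # A, B, C', D' (imported SVN history)
make_repo recent v3 v4 v5 v6   # C, D, E, F   (post-migration repo)

# Mirror-clone the recent repo, then add the old history as a second remote.
git clone -q --mirror recent combined.git
cd combined.git
git remote add old-history ../old
git fetch -q old-history

# Assumption: the tip of the old history is D'. Find D as the commit in the
# recent history that has exactly the same tree.
old_tip=$(git rev-parse old-history/master)
old_tree=$(git rev-parse "$old_tip^{tree}")
splice_from=$(git log --format='%H %T' master |
  awk -v t="$old_tree" '$2 == t && !found { print $1; found = 1 }')

echo "D' (old history):    $old_tip"
echo "D  (recent history): $splice_from"
```

If no commit in the recent history shares a tree with the old tip, splice_from comes back empty, which is a sign the two histories do not line up the way the example assumes.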
Rewriting The Recent History
IMO the best way, if you can coordinate a cut-over of all developers to a new repo (meaning you arrange a time when they all agree that all outstanding work is pushed, so they discard their clones; then you do the conversion; then they all re-clone), is to (effectively) rebase the recent history onto the old history.
If there is really just a single branch, then you can literally use rebase, along the lines of git rebase --onto D' D (where D and D' are replaced with the SHA IDs of the commits).
More likely you have some branches and merges in the recent history; in that case a rebase operation will start becoming a problem very quickly. On the other hand, you can take advantage of the fact that D has the same tree as D', so a rebase and a re-parent are more or less equivalent.
So you can use git filter-branch with a --parent-filter to do the rewrite. Based on the examples in the docs at https://git-scm.com/docs/git-filter-branch you would do something like replacing D with D' in the parent list of each rewritten commit (where again D and D' are replaced with the SHA IDs of the commits).
This creates "backup" refs that you'll need to clean up. In the end you'll get a single history running A, B, C', D', E', F'.
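The re-parenting rewrite can be sketched end to end on throwaway repos, including a merge in the recent history. This is one way to write the parent filter, assuming a POSIX shell such as Git Bash; the repo layout, the remote name old-history, and the commit and new_repo helpers are hypothetical, and the sed expression is an illustration rather than the exact command from the docs.

```shell
#!/usr/bin/env bash
set -euo pipefail

work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause on newer git

# Hypothetical helpers for building the demo histories.
new_repo() { git init -q "$1"; git -C "$1" symbolic-ref HEAD refs/heads/master; }
commit()   { echo "$2" > "$1/file.txt"; git -C "$1" add file.txt; git -C "$1" commit -q -m "$2"; }

new_repo old
for v in v1 v2 v3 v4; do commit old "$v"; done        # A, B, C', D'
D_PRIME=$(git -C old rev-parse HEAD)

new_repo recent
commit recent v3; commit recent v4                     # C, D
D=$(git -C recent rev-parse HEAD)
commit recent v5                                       # E
git -C recent checkout -q -b topic
echo side > recent/side.txt
git -C recent add side.txt
git -C recent commit -q -m "topic work"
git -C recent checkout -q master
commit recent v6                                       # F
git -C recent merge -q --no-ff -m "merge topic" topic

# Combine the histories in one repo (here: directly in the recent repo).
cd recent
git remote add old-history ../old
git fetch -q old-history

# Sanity check: the splice only makes sense if D and D' have the same tree.
[ "$(git rev-parse "$D^{tree}")" = "$(git rev-parse "$D_PRIME^{tree}")" ]

# Any commit whose parent is D gets D' as its parent instead.
git filter-branch --parent-filter "sed 's/$D/$D_PRIME/'" -- --all

# Remove the refs/original/* backup refs once the result looks right.
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d

git log --oneline --graph master
```

The filter is written in double quotes so the outer shell expands $D and $D_PRIME before git runs it; that mix of quote styles is the part that tends to break in CMD or PowerShell.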
It's the fact that F was replaced by F' which creates the need for a hard cut-over (more or less).
Now if you made a mirror clone back at step 1, you can consider wiping the reflog, dropping the remotes, and running gc, and then this is a new ready-to-use origin repo.
If you made a regular clone, then you'll need to push -f all the refs to the origin, and this will likely leave behind some clutter on the origin repo.
Using a "replacement commit"
The other option doesn't create a hard cut-over, but it leaves you with small headaches to deal with forever. You can use git replace. In your combined repo, you tell git to replace D with D'.
By default, when generating log output or whatever, if git finds D, it will substitute D' (and its history) in the output.
There are some known glitches. There may be unknown glitches. And by default the "replacement refs" that make this all work aren't shared, so you have to push and fetch them deliberately.
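A minimal sketch of the replacement approach on throwaway repos, showing that history walks see the old commits only while replacements are honored, and that the replacement refs have to be pushed explicitly. The repo names, the remote name old-history, the bare repo shared.git standing in for an origin, and the make_repo helper are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

work=$(mktemp -d); cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create a repo at $1 with one commit per remaining argument.
make_repo() {
  local dir=$1; shift
  git init -q "$dir"
  git -C "$dir" symbolic-ref HEAD refs/heads/master
  for v in "$@"; do
    echo "$v" > "$dir/file.txt"
    git -C "$dir" add file.txt
    git -C "$dir" commit -q -m "$v"
  done
}

make_repo old    v1 v2 v3 v4   # A, B, C', D'
make_repo recent v3 v4 v5 v6   # C, D, E, F

D_PRIME=$(git -C old rev-parse HEAD)
D=$(git -C recent rev-parse HEAD~2)

cd recent
git remote add old-history ../old
git fetch -q old-history

# Whenever git would read D, it now reads D' (and therefore D's old ancestry).
git replace "$D" "$D_PRIME"

echo "with replacement:    $(git rev-list --count master) commits"
echo "without replacement: $(git --no-replace-objects rev-list --count master) commits"

# Replacement refs live under refs/replace/ and are not pushed by default.
git init -q --bare ../shared.git
git push -q ../shared.git 'refs/replace/*:refs/replace/*'
# A collaborator would fetch them with a matching refspec, for example:
#   git fetch <remote> 'refs/replace/*:refs/replace/*'
```

No commit IDs change here: F keeps its SHA, which is why this route avoids the hard cut-over.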