Migrating from CVS to git while preserving history for multiple modules?


I'd like to migrate an old CVS repo to git, while preserving history.

Here's how I check out the 5 branches that need to be migrated in CVS:

branch1:
cvs co -r branch1 sim Login SystemMonitor archiver

branch2:
cvs co -r branch2 par sim Login archiver
cvs co -r branch2_redux editor

branch3:
cvs co -r GEN3 par sim SystemMonitor archiver
cvs co -r GEN3_update1 tms editor
cvs co -r GEN3Sim
...

So the issue here is that a single branch specifies:

  1. multiple revisions
  2. different modules within a single checkout
  3. multiple checkout commands

I'm no expert but it seems like something far too flexible to fit into git where you're limited to checking out one branch/tag at a time.

What I would like to have is:

  1. a git repo which contains branch1 through branch5
  2. 'git checkout branch3' needs to create the same files on the filesystem as the 3 cvs checkout commands shown above. It's OK if there's extra modules/directories that remain unused if that makes things easier, but I need the same 3 filesets as CVS to also be there so the project builds.
  3. preserve commit history of each branch
  4. for commits that are the same between different branches, they should also be the same in git. For example the first 75% of commits for the Login module are identical across all branches, so I'd expect them to have the same SHA in git.

Is this achievable, and if so, how?

What I already tried: I successfully ran cvs2git on the repo (which took a very long 10 hours). I ended up with branches like branch1, branch2, branch2_redux, GEN3, GEN3_update1, GEN3Sim (the names passed to the -r flags). However, I'm not sure how I'm supposed to combine the histories and files of GEN3+GEN3_update1+GEN3Sim into a single branch3.

This is the command I used to convert the repo:

cd ~
rsync -avz myusername@cvsserver:/path/to/cvsrepo/ ./cvsrepo_local_copy/
cvs2git --blobfile="$HOME/git-blob.dat" --dumpfile="$HOME/git-dump.dat" --username=cvs2git ~/cvsrepo_local_copy
mkdir new_git_repo && cd new_git_repo && git init
cat ~/git-{blob,dump}.dat | git fast-import

There is 1 answer

Mort On

I'd suggest modifying (a copy of) the CVS repo first and then running cvs2git.

The ,v files are simply text files with no checksums or anything in them, so you can just edit them.
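Because the ,v files are plain text, it is easy to look at which branch and tag names a module actually defines before changing anything. The following is a minimal sketch, assuming GNU find and xargs and the usual RCS header layout (a "symbols" list of NAME:REVISION entries ending at the first ";"). The function name list_cvs_symbols is made up for this example.

```shell
# Print every symbol (tag or branch name) defined in the ,v files under a
# module directory, preceded by the number of files that define it.
# list_cvs_symbols is a hypothetical helper name, not part of CVS or cvs2git.
list_cvs_symbols() {
    find "$1" -type f -name '*,v' -print0 |
        xargs -0 -r awk '
            FNR == 1    { in_symbols = 0 }      # new file: reset state
            /^symbols/  { in_symbols = 1 }      # start of the symbols list
            in_symbols  {
                n = split($0, words, /[ \t;]+/)
                for (i = 1; i <= n; i++)
                    if (words[i] ~ /:/) {       # entries look like NAME:REVISION
                        sub(/:.*/, "", words[i])
                        print words[i]
                    }
                if ($0 ~ /;/) in_symbols = 0    # the list ends at the first ";"
            }' |
        sort | uniq -c
}
```

Running it on the editor module of the copied repo shows whether a name such as branch2 is already in use there. If it is, renaming another symbol to that name would create a duplicate entry, so that case needs checking by hand.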

Do something like this for every branch name you want folded into a base branch name:

find editor -type f -name '*,v' -print0 | xargs -0 sed -i -e 's/branch2_redux/branch2/g'

Once you've done that, all your "branch2*" branches should just be called "branch2", and when you run cvs2git all the files should be created together on the same branch, as you want.
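A slightly more careful variant of the rename is sketched below. It only rewrites the symbol name inside the RCS header (everything up to the "locks" line), so log messages or file contents that happen to mention the old branch name are left alone. This assumes GNU find, xargs and sed, and that branch names contain only letters, digits, "-" and "_". The function name rename_cvs_symbol is hypothetical.

```shell
# Rename one CVS branch/tag symbol in every ,v file under a module directory.
# Only the header (line 1 through the "locks" line) is edited, and the name
# must be preceded by whitespace and followed by ":" to match, so longer
# names that merely contain the old name are not touched.
# rename_cvs_symbol is a hypothetical helper name for this sketch.
rename_cvs_symbol() {
    module_dir=$1
    old=$2
    new=$3
    find "$module_dir" -type f -name '*,v' -print0 |
        xargs -0 -r sed -i -e "1,/^locks/ s/\([[:space:]]\)${old}:/\1${new}:/"
}
```

For the layout in the question this might be called as rename_cvs_symbol ~/cvsrepo_local_copy/editor branch2_redux branch2, assuming the editor module sits directly under the repository root, and always on a copy of the repository rather than the original.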