Does a bisect in version control benefit from using a rebaseif workflow?


The rebaseif Mercurial extension automates the process, when pulling, of doing a rebase only if the merge can be done automatically with no conflicts. (If there are conflicts to resolve manually, it does not rebase, leaving you ready to do a manual merge of the two branches.) This simplifies and linearizes the history when developers are working in different parts of the code, although any rebase does throw away some information about the state of the world when a developer was doing work. I tend to agree with arguments like this and this that in the general case, rebasing is not a good idea, but I find the rebase-if philosophy appealing for the non-conflict case. I’m on the fence about it: I understand that there are still risks of logic errors when changes happen in different parts of the code, and the author of the rebaseif extension has come to feel it’s a bad idea.

I recently went through a complicated and painful bisect, and I think that having a large number of merges of short branches in our repository was the main reason the bisect did not live up to its implied O(lg n) promise.  I found myself needing to run "bisect --extend" many times, to stretch the range beyond the merge, going by a couple of changesets at a time, essentially making bisect O(n).  I also found it very complicated to keep track of how the bisect was going and to understand what information I'd gained so far, because I couldn't follow the branching when looking at graphs of the repository.
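To make the O(lg n) versus O(n) gap concrete, here is a minimal sketch that counts test runs on a purely linear history. It is only an illustration: `bisect_linear`, `scan_linear` and the integer "changesets" are made-up names, and the code does not model how Mercurial walks a branchy graph or what `--extend` does.

```python
def bisect_linear(history, is_bad):
    """Binary-search a linear history (oldest first) for the first bad changeset.

    Assumes history[0] is known good and history[-1] is known bad.
    Returns (first_bad_changeset, number_of_test_runs).
    """
    good, bad = 0, len(history) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(history[mid]):
            bad = mid
        else:
            good = mid
    return history[bad], tests


def scan_linear(history, is_bad):
    """Step forward one changeset at a time, which is what O(n) looks like."""
    tests = 0
    for rev in history[1:]:
        tests += 1
        if is_bad(rev):
            return rev, tests
    raise ValueError("no bad changeset found")


# Hypothetical history: changesets 0..1024, with the bug introduced in 700.
history = list(range(1025))
introduced_in = 700
is_bad = lambda rev: rev >= introduced_in

print(bisect_linear(history, is_bad))  # (700, 10)
print(scan_linear(history, is_bad))    # (700, 700)
```

On this toy history the binary search needs 10 test runs where the one-at-a-time walk needs 700, which is the kind of difference I felt once I had to keep extending the range by hand.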

Are there better ways to use bisect (and to look at and understand the revision history), or am I right that the process would have been smoother if we had used rebaseif more in development? Alternatively, can you help me understand more concretely what may go wrong using rebase in the non-conflict case: is it likely enough to cause problems that it should be avoided?

I’m tagging this more generally (not just mercurial) since I think rebaseif matches a more typical git workflow: git users may have seen the gotchas.


There are 2 answers

Answer by Oben Sonne (best answer)

I think the answer is simple: you have to decide between hard bisects and risky rebasing.

Or, something in between: only rebase if it is very unlikely that the rebase silently breaks things. If a rebase involves only a few changesets which additionally are semantically distant from the changes they are rebased on, it's usually safe to rebase.

Here's an example, where a conflict-free merge breaks things:

Suppose two branches start from a file with this content:

def foo(a):
    # do
    # something
    # with a (an integer)

...

foo(4)

In branch A, this is changed to:

def foo(a):
    # now this function is 10 times faster, but only works with positive integers
    assert a > 0
    # do
    # something
    # with a

...

foo(4)

In branch B, it is changed to:

def foo(a):
    # do
    # something
    # with a (an integer)

...

foo(4)

...

foo(-1) # now we have a use case where we need to call foo with -1

Semantically, the two edits conflict. However, Mercurial happily merges them without reporting a conflict, whether you rebase or do a regular merge:

def foo(a):
    # now this function is 10 times faster, but only works with positive integers
    assert a > 0
    # do
    # something
    # with a

...

foo(4)

...

foo(-1) # now we have a use case where we need to call foo with -1

The advantage of a merge is that the merge changeset keeps both parents in the history, so at some later point you can still see what went wrong and fix things accordingly. A rebase might throw away information you need to understand bugs caused by automatic merges.
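For what it's worth, the merged result above can be turned into something runnable to show that only executing the code exposes the problem. This is a sketch: the body of `foo` (`return a * 2`) is a hypothetical stand-in for "do something with a", and `run_call_sites` and `merged_tree_is_healthy` are names made up for the illustration.

```python
def foo(a):
    # Branch A's change: faster, but only valid for positive integers.
    assert a > 0
    return a * 2  # hypothetical stand-in for "do something with a"


def run_call_sites():
    foo(4)   # call present in both branches
    foo(-1)  # Branch B's new call site, merged in without a textual conflict


def merged_tree_is_healthy():
    """Return False if the merged code trips over branch A's new precondition."""
    try:
        run_call_sites()
    except AssertionError:
        return False
    return True


print(merged_tree_is_healthy())  # False
```

The textual merge is clean, yet the check fails as soon as the merged call sites are run, so the absence of conflicts says nothing about whether the result works.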

Answer by tc.

The main argument against git rebase seems to be a philosophical one around "losing history", but if I really cared about that I'd make the final build step a checkin (or the first build step to track all the failed builds too!).

I'm not particularly familiar with Mercurial or bisecting (except that it's a bit like git), but in my month-and-a-bit with git I exclusively stuck to rebase. I also use git rebase -i --autosquash and git add -p a lot.

IME, there's also not that much difference between a rebase and a merge when it comes to fixing conflicts. The answer you linked to suggests "rebaseif" is bad because the "if" is conditioned on whether the merge proceeded without conflict, whereas it should be conditioned on whether the codebase builds and tests pass.
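A rough sketch of what "conditioned on whether tests pass" could look like, assuming git rather than the Mercurial rebaseif extension. The function name `rebase_if_tests_pass`, the injectable `run` parameter and the test command are all made up for the illustration; only the git subcommands themselves (`rev-parse`, `rebase`, `rebase --abort`, `reset --hard`) are real.

```python
import subprocess
from typing import Callable, List, Tuple

Runner = Callable[[List[str]], Tuple[int, str]]


def run_command(args: List[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, stripped stdout)."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def rebase_if_tests_pass(upstream: str, test_cmd: List[str],
                         run: Runner = run_command) -> str:
    """Keep a rebase only if it applies cleanly AND the tests pass afterwards.

    Returns "rebased", "rebase-failed" or "tests-failed".
    """
    code, original_tip = run(["git", "rev-parse", "HEAD"])
    if code != 0:
        raise RuntimeError("could not read HEAD; is this a git repository?")

    code, _ = run(["git", "rebase", upstream])
    if code != 0:
        # Conflicts (or another failure): back out and leave it to a human.
        run(["git", "rebase", "--abort"])
        return "rebase-failed"

    code, _ = run(test_cmd)
    if code != 0:
        # Clean textual rebase, but broken result: restore the old tip.
        run(["git", "reset", "--hard", original_tip])
        return "tests-failed"

    return "rebased"
```

Something like `rebase_if_tests_pass("origin/master", ["make", "test"])` would be the intended use, with both arguments depending on the project. The `run` parameter exists only so the logic can be tried against a fake runner without touching a real repository.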

Perhaps my thinking is skewed by an inherent weakness in git's design (it doesn't explicitly keep track of the history of a branch, i.e. the subset of commits that it's actually pointed to), or perhaps it's just how I work (check that the diff is sane and that it builds, although admittedly after a rebase I don't check that intermediate commits build).

(Aside: For personal projects I often would like to keep track of each build output and corresponding source snapshot, but I've yet to find anything which is good at doing so.)