I want to do a rebase to remove a certain commit from my history. I know how to do that. However, if I do it, the commit timestamp is set to the moment I completed the rebase. I want the commits to keep their timestamps.
I saw the last answer here: https://stackoverflow.com/a/19522951/3995351 , however it didn't work.
The last important command just showed a new line with a `>` prompt.
So I am opening a new question.
The setup
Let's say this is the history around the commit you want to remove
where:
`bad` is the commit you want to remove;
`start` is the parent of the commit you want to remove;
`next` is the next commit after `bad`; it is good, you want to keep it and all the timeline after it; it will replace `bad` after the rebase.
Prerequisites
In order to be able to safely remove `bad`, it's important that no other branch existing at the time when `bad` was created was merged into the main timeline after `bad`. I.e. by removing `bad` and its connections with its parent and child commits from the history graph, you get two disconnected timeline pieces.
It is probably possible to remove `bad` even if another existing branch was merged after `bad`. I didn't check this situation, but I expect some impediments because of the merge commit.
The idea
Each `git` commit is identified by a hash that is computed from the commit's properties: content, parent commits, message, and the author and committer names, emails and dates.
A rebase changes the committer date of every commit it rewrites. It can also change the committer email, the commit message and the content.
In order to restore the original committer dates after a rebase we need to save them together with some information that can identify each commit after the rebase.
Because you want to remove a commit, the contents change during the rebase: adding or removing files or commits changes the contents of all the commits that follow.
This leaves us without a single property that uniquely identifies the commits and does not change during the desired rebase. We can try to use two or more properties that do not change during the rebase.
The emails (author and committer) are of almost no use. If a single person worked on the project, they are the same for all commits and cannot be used. The properties that remain (they differ on most commits and are not affected by the rebase) are the author date and the commit message (its first line).
If the pair (author date, commit message) provides unique values for all the commits affected by the rebase then we can restore the commit dates afterwards without errors.
Verify if it can be done safely
There is a simple way to verify if the (author date, commit message) pairs are unique for the affected commits.
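Run two commands: one that counts the distinct (author date, commit message) pairs of the affected commits and one that counts the affected commits themselves.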
If they display the same number then you are lucky: the pair (author date, commit message) can be used to uniquely identify the commits. Read on.
If the numbers are different (the first command will always produce a number smaller than or equal to the one produced by the second command) then you are out of luck.
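As a rough sketch, the two counts could be obtained as below. The range `start..master` and the helper name `keys_are_unique` are assumptions made for illustration; `%aI` is the author date in strict ISO 8601 format and `%s` is the first line of the commit message.

```shell
# Hypothetical helper: succeeds only if every commit in the given range
# has a distinct (author date, first message line) pair.
keys_are_unique() {
    range=$1
    # First number: distinct (author date, subject) pairs.
    unique=$(git log --format='%aI %s' "$range" | sort -u | wc -l | tr -d ' ')
    # Second number: all the commits in the range.
    total=$(git log --format='%aI %s' "$range" | wc -l | tr -d ' ')
    echo "unique pairs: $unique, commits: $total"
    [ "$unique" -eq "$total" ]
}
```

For the history described in this answer it would be called as `keys_are_unique start..master`.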
Extract the information needed to fix the commit dates after the rebase
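A minimal sketch of this extraction, assuming the affected range is `start..master` and the list goes to `/tmp/hashlist` (the file name used later in this answer); `save_commit_dates` is a hypothetical helper name:

```shell
# Hypothetical helper: save one line per commit of the given range:
# <hash> <committer date> <author date> <first message line>
save_commit_dates() {
    git log --format='%H %cI %aI %s' "$1" > "$2"
}
# Example call (run before the rebase):
#   save_commit_dates start..master /tmp/hashlist
```

The committer date is deliberately the second space-separated field, which is what `cut -d" " -f2` picks up later.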
This step extracts the commit hash, committer date (the payload), author date and commit message (the key) for all the commits that follow `start` and stores them in a file (`/tmp/hashlist` below).
Backup the current master
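A minimal sketch of the backup, using the branch names `master` and `old_master` from this section; the wrapper `backup_branch` is a hypothetical name introduced only for illustration:

```shell
# Hypothetical helper: create branch "$2" pointing at the current tip of "$1".
# Nothing is checked out or modified; only a new branch ref is created.
backup_branch() {
    git branch "$2" "$1"
}
# Example call (run before the rebase):
#   backup_branch master old_master
```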
Although it is commonly said that `git` "rewrites history", it actually just generates an alternative history line and decides it is the correct history. It does not change or remove the "rewritten" commits; they are still present for some time in its database and can be restored in case the operation fails.
We can proactively back up the current history line to easily restore it if needed. All we have to do is create a new branch that points to `master`. This way, when `git rebase` moves `master` to the new timeline, the old one is still accessible using the new branch.
The backup branch is named `old_master` here; it keeps the current timeline in focus until we complete all the changes and are satisfied with the new world order.
Do the rebase
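One way to do the rebase, sketched here under the assumption that `start` is the direct parent of `bad` and that the branch is `master`, is `git rebase --onto`. The wrapper name `drop_commit` is hypothetical:

```shell
# Hypothetical helper: replay the commits that follow "$2" on branch "$3"
# on top of "$1". When "$1" is the parent of "$2", only "$2" is left out.
drop_commit() {
    git rebase --onto "$1" "$2" "$3"
}
# Example call:
#   drop_commit start bad master
```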
Removing the commit `bad` from the history is as simple as rebasing everything that follows `bad` onto `start`.
Fix the commit dates
The following command "rewrites" the history and changes the committer date using the values we saved before:
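A sketch of what such a command might look like, pieced together from the fragments quoted in the explanation that follows. It assumes the saved list has the layout `<hash> <committer date> <author date> <subject>`; `restore_committer_dates`, `HASHLIST`, `saved_key` and `saved_date` are names introduced for illustration, and `grep -F` stands in for `fgrep`:

```shell
# Hypothetical helper: for every commit in range "$1", look up its
# (author date, subject) key in the saved list "$2" and reuse the
# committer date stored on the matching line.
restore_committer_dates() {
    range=$1
    HASHLIST=$2 git filter-branch --env-filter '
        # Key of the commit being rewritten: author date + first message line.
        saved_key=$(git log -1 --format="%aI %s" "$GIT_COMMIT")
        # Field 2 of the first matching saved line is the original committer date.
        saved_date=$(grep -F -m 1 -- "$saved_key" "$HASHLIST" | cut -d" " -f2)
        # Assumption of this sketch: leave the date alone when nothing matches.
        if [ -n "$saved_date" ]; then
            export GIT_COMMITTER_DATE="$saved_date"
        fi
    ' -- "$range"
}
```

It would be called as `restore_committer_dates start..master /tmp/hashlist`. The path to the list should be absolute, because `git filter-branch` runs the filter from a temporary directory.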
How it works:
`git` walks the history between the commits labelled `start` and `master` and, for each commit, it runs the command provided as argument to `--env-filter` before rewriting the commit. It sets the environment variable `GIT_COMMIT` to the hash of the commit being rewritten.
Since we already did a `rebase` that modified the hashes of all the commits, we cannot use `$GIT_COMMIT` directly to identify the original commit date of the commit (because `$GIT_COMMIT` is a commit generated by `git rebase` and we are not interested in its committer date).
The command we provide to `--env-filter` runs `git log -1 --format="%aI %s" $GIT_COMMIT` to generate the key pair (author date, commit message) discussed above. Its output is passed as argument to the command `fgrep -m 1 "..." /tmp/hashlist | cut -d" " -f2`, which finds the pair in the list of previously saved hashes (`fgrep`) and extracts the original commit date from the saved line (`cut`). Finally, the value of the commit date is stored in the environment variable `GIT_COMMITTER_DATE`, which `git` uses to rewrite the commit.
Verification
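As one possible check, sketched under the assumption that the list saved earlier has the layout `<hash> <committer date> <author date> <subject>`, each rewritten commit can be looked up in that list. The helper name `report_date_mismatches` is hypothetical:

```shell
# Hypothetical helper: print every commit in range "$1" whose
# "<committer date> <author date> <subject>" text is not found in list "$2".
# No output means every rewritten commit matches a saved line.
report_date_mismatches() {
    git log --format='%cI %aI %s' "$1" |
    while IFS= read -r line; do
        grep -F -q -- " $line" "$2" || echo "mismatch: $line"
    done
}
# Example call:
#   report_date_mismatches start..master /tmp/hashlist
```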
Using the `git log` command again, you can verify that the rewritten history matches the original history. If you use a graphical `git` client, you can check the results more easily by visual inspection. The branch `old_master` keeps the old history line visible in the client, and you can easily compare the dates of each commit of the `old_master` branch with the corresponding one of the `master` branch.
If something didn't go well or you need to modify the procedure, you can easily start over by resetting `master` to `old_master`.
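A minimal sketch of starting over, assuming the backup branch `old_master` still points at the original history; `start_over` is a hypothetical helper name:

```shell
# Hypothetical helper: check out branch "$1" and move it back to where "$2" points.
# Warning: --hard also discards uncommitted changes in the working tree.
start_over() {
    git checkout "$1" && git reset --hard "$2"
}
# Example call:
#   start_over master old_master
```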
Cleanup
When you are satisfied with the result, you can remove the backup branch and the file used to store the original commit dates.
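A minimal sketch of the cleanup, assuming the backup branch is `old_master` and the saved list is `/tmp/hashlist` as in the rest of this answer; `cleanup_backup` is a hypothetical helper name:

```shell
# Hypothetical helper: delete the backup branch "$1" and the saved list "$2".
# -D forces the deletion: the old history is not merged into the rewritten
# branch, so a plain "git branch -d" would refuse to delete it.
cleanup_backup() {
    git branch -D "$1" && rm -f "$2"
}
# Example call:
#   cleanup_backup old_master /tmp/hashlist
```

`git filter-branch` also keeps its own backup of the rewritten branch under `refs/original/`, which this sketch leaves alone.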
That's all!