git pull / stash conflicts with a git filter


This question is about using git with the nbstripout filter, which removes some fields from a Jupyter notebook (a JSON file) before it is stored in git. The stripout filter is used to minimize conflicts when several developers work on the same notebook.

So the repo configuration to start with:

$ cat .git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[remote "origin"]
        url = git@github.com:stas00/fastai_v1.git
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master
[include]
        path = ../.gitconfig

$ cat .gitconfig
[filter "nbstripout"]
        clean = nbstripout
        smudge = cat
        required = true
[diff "ipynb"]
        textconv = nbstripout -t

$ cat .gitattributes
*.ipynb filter=nbstripout

*.ipynb diff=ipynb

With this configuration, during git diff or git commit the notebook is run through a filter that removes JSON fields that are local (like a cell's execution_count) and will vary from developer to developer.
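To make the effect of a clean filter concrete, here is a minimal sketch in a throwaway repo. The filter name toystrip and its sed command are hypothetical stand-ins for nbstripout, chosen so the example runs without nbstripout installed. It compares the blob id of the bytes on disk with the blob id git would store after cleaning.

```shell
# Throwaway repo; "toystrip" is a hypothetical stand-in for nbstripout.
demo=$(mktemp -d)
cd "$demo"
git init -q .

# Toy clean filter: replace the first run of digits on each line with null.
git config filter.toystrip.clean "sed 's/[0-9][0-9]*/null/'"
git config filter.toystrip.smudge cat
echo '*.ipynb filter=toystrip' > .gitattributes

printf '{"execution_count": 7}\n' > nb.ipynb

# Blob id of the bytes on disk, bypassing the filter.
raw_hash=$(git hash-object --no-filters nb.ipynb)
# Blob id of what git would store, after the clean filter has run.
clean_hash=$(git hash-object nb.ipynb)

echo "on disk: $raw_hash"
echo "cleaned: $clean_hash"
```

The two ids differ, which is the point: git compares the cleaned content, not the raw file, against what is recorded in the index.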

Now consider a normal situation where the same notebook changes upstream and locally. Trying to sync the local repo with the upstream fails:

$ git pull origin master
From github.com:stas00/fastai_v1
 * branch            master     -> FETCH_HEAD
Updating 1ea49ad..ae0ba93
error: Your local changes to the following files would be overwritten by merge:
        dev_nb/004_callbacks.ipynb
Please commit your changes or stash them before you merge.
Aborting

$ git diff dev_nb/004_callbacks.ipynb | wc -l
60

$ git stash
Saved working directory and index state WIP on pull-merge: 1ea49ad Console progress bar

$ git diff dev_nb/004_callbacks.ipynb | wc -l
0

$ git pull origin master
From github.com:stas00/fastai_v1
 * branch            master     -> FETCH_HEAD
Updating 1ea49ad..ae0ba93
error: Your local changes to the following files would be overwritten by merge:
        dev_nb/004_callbacks.ipynb
Please commit your changes or stash them before you merge.
Aborting

This shouldn't happen, since git stash should have stashed away all local changes. I'm not sure exactly what happens, but I think git stash also runs through the filter and stashes only the changes that show through nbstripout. So perhaps git stash doesn't fully restore the files to their pre-modified state? Yet git diff shows nothing, both with the filter enabled and after I disable it.

In other words, why does git pull see a potential conflict and refuse to merge, even though git diff shows no local changes? (They do exist in reality; they are just the changes that get stripped by the filter.)

At the very least, I would expect git diff to show the changes once the stripout filter is disabled, but it doesn't.
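One way to check which bytes git is comparing is to print the three blob ids involved for the notebook. This is a diagnostic sketch only; nb_state is a hypothetical helper name, and it relies on nothing beyond git ls-files -s and git hash-object.

```shell
# nb_state is a hypothetical helper name; it only calls stock git commands.
nb_state() {
    path=$1
    # Blob id recorded in the index (second field of `git ls-files -s`).
    printf 'index   %s\n' "$(git ls-files -s -- "$path" | awk '{print $2}')"
    # Blob id of the bytes on disk, with no filter applied.
    printf 'on-disk %s\n' "$(git hash-object --no-filters "$path")"
    # Blob id after the configured clean filter has run.
    printf 'cleaned %s\n' "$(git hash-object "$path")"
}
```

Run as nb_state dev_nb/004_callbacks.ipynb from the repo root. If the cleaned id differs from the index id, git has reason to treat the file as modified regardless of what git diff prints. Since diff=ipynb passes both sides through nbstripout -t, a difference that lives only in stripped fields may well produce no diff text, so git diff --no-textconv and git status --short may be worth comparing as well.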

To make git stash; git pull work, I have to disable the filter before running git stash:

$ nbstripout --uninstall

$ git stash
Saved working directory and index state WIP on pull-merge: 1ea49ad Console progress bar

$ git pull origin master
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 3 (delta 2), reused 3 (delta 2), pack-reused 0
Unpacking objects: 100% (3/3), done.
From github.com:stas00/fastai_v1
 * branch            master     -> FETCH_HEAD
   1ea49ad..ae0ba93  master     -> origin/master
Updating 1ea49ad..ae0ba93
Fast-forward
 dev_nb/004_callbacks.ipynb | 1268 ----------------------------------------------------------------------------------------------------------------------------------------
 1 file changed, 1268 deletions(-)

And now I have to remember to re-enable the filter:

$ nbstripout --install

Is there a better workflow that doesn't require disabling/enabling the filter for this to work?
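One possibility that may be worth trying is to override the filter for a single command with git -c, rather than uninstalling and reinstalling it. The sketch below assumes that replacing the clean command with cat for one invocation has the same effect as nbstripout --uninstall for that stash. This has not been verified against the repo above, and stash_unfiltered is a hypothetical helper name.

```shell
# stash_unfiltered is a hypothetical helper name.
# The -c overrides apply to this one git invocation only;
# the repo configuration and .gitattributes are left untouched.
stash_unfiltered() {
    git -c filter.nbstripout.clean=cat \
        -c filter.nbstripout.smudge=cat \
        stash "$@"
}
```

Usage would be stash_unfiltered followed by git pull origin master. Any later git stash pop runs with the normal filter configuration, since nothing persistent was changed.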
