One-line code change staged by Dulwich shows every line different


I have a file with a one-line change. git status reports:

S:\mydir\AEL>git status CodingTools_SourceControl.ael
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   CodingTools_SourceControl.ael

no changes added to commit (use "git add" and/or "git commit -a")

And here is the change that diff reports

S:\mydir\AEL>git diff CodingTools_SourceControl.ael
diff --git a/AEL/CodingTools_SourceControl.ael b/AEL/CodingTools_SourceControl.ael
index 7ae86d7..fd53caa 100644
--- a/AEL/CodingTools_SourceControl.ael
+++ b/AEL/CodingTools_SourceControl.ael
@@ -22,7 +22,7 @@ import ael
 import acm
 is_64_bit = True

-# Special-purpose overrides
+# Special-purpose overrides. These deliberately require minor code changes.
 #CodingTools_PyLint.VERBOSE = True
 #CodingTools_PyLint.PYLINTRC = "default.pylintrc"

Now I stage my change:

S:\mydir\AEL>git add CodingTools_SourceControl.ael

S:\mydir\AEL>git status CodingTools_SourceControl.ael
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   CodingTools_SourceControl.ael

And if I ask for a report on the staged change I see the same one-line change:

S:\mydir\AEL>git diff --cached CodingTools_SourceControl.ael
diff --git a/AEL/CodingTools_SourceControl.ael b/AEL/CodingTools_SourceControl.ael
index 7ae86d7..fd53caa 100644
--- a/AEL/CodingTools_SourceControl.ael
+++ b/AEL/CodingTools_SourceControl.ael
@@ -22,7 +22,7 @@ import ael
 import acm
 is_64_bit = True

-# Special-purpose overrides
+# Special-purpose overrides. These deliberately require minor code changes.
 #CodingTools_PyLint.VERBOSE = True
 #CodingTools_PyLint.PYLINTRC = "default.pylintrc"

Now I unstage the change

S:\PrimeObjects\ADSO71\KEATING\AEL>git reset CodingTools_SourceControl.ael
Unstaged changes after reset:
M       AEL/ATS_SourceControl.ael
...several other unstaged changes...

I want to be able to use Dulwich to manage staging and commits. So inside Idle, after the reset, I do this:

>>> from dulwich.repo import Repo
>>> repo = Repo(br"S:\mydir")
>>> repo.stage([br"AEL\CodingTools_SourceControl.ael"])

After that, git status shows the change as staged, just like before

S:\mydir\AEL>git status CodingTools_SourceControl.ael
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   CodingTools_SourceControl.ael

But if I now issue a git diff command, I get a diff report that shows all 1500+ lines of the file as changed:

S:\mydir\AEL>git diff --cached --stat CodingTools_SourceControl.ael
 AEL/CodingTools_SourceControl.ael | 3082 ++++++++++++++++++-------------------
 1 file changed, 1541 insertions(+), 1541 deletions(-)

Edit: Following up on @RomainVALERI's helpful comment, I tried this command

S:\mydir\AEL>git diff --cached --stat --ignore-cr-at-eol CodingTools_SourceControl.ael
 AEL/CodingTools_SourceControl.ael | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

and it reports one line changed. So it is a line-ending problem. But I need Dulwich operations to be interchangeable with command-line operations. How do I tell Dulwich Repo.stage() to treat line endings the way git add does?

I tried using porcelain.add() instead of Repo.stage()

porcelain.add(repo, r"S:\mydir\AEL\CodingTools_SourceControl.ael")

but it didn't help any.
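The line-ending diagnosis can also be checked directly from Python, without going through git. The following is a minimal sketch using only the standard library; the function name `count_line_endings` and the sample byte strings are made up for illustration, and the assumption that the working copy holds CRLF while the committed blob holds LF is a guess consistent with the `--ignore-cr-at-eol` result, not something the output above proves.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF and bare-LF line endings in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to get the bare LFs.
    lf_only = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf_only}


# Hypothetical sample contents standing in for the two versions of the file.
working_copy = b"import acm\r\nis_64_bit = True\r\n"
committed = b"import acm\nis_64_bit = True\n"

print(count_line_endings(working_copy))
print(count_line_endings(committed))
```

To run it against the real file, read the bytes with `open(path, "rb").read()` so that Python does not translate the line endings itself.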


There are 2 answers

BoarGules (best answer)

From the code in dulwich.index.blob_from_path_and_stat() it appears that Dulwich ignores both the core.autocrlf setting and the .gitattributes file, and simply writes a byte-for-byte copy of the working directory file to the Git database.
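To see why a byte-for-byte copy matters, here is a small sketch of how Git derives a blob's object id from its content. It uses only `hashlib`, not Dulwich; the helper name `git_blob_id` and the sample line are illustrative. The same text stored with CRLF and with LF endings produces two different blobs, which is why a diff between them touches every line.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object id Git assigns to a blob with this content."""
    # Git hashes a header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# The same line of text with Windows and Unix line endings.
crlf_bytes = b"# Special-purpose overrides\r\n"
lf_bytes = crlf_bytes.replace(b"\r\n", b"\n")

# The ids differ because the stored bytes differ.
print(git_blob_id(crlf_bytes))
print(git_blob_id(lf_bytes))
```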

So Dulwich 0.19.5 and Windows are not a good match if your team will also use other tools that are aware of line-ending policies and apply them in the way that Git does. A later version may well address this, but for now it's oil and water.
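One possible workaround is to normalize the bytes yourself before they reach the object database. The sketch below is only a rough approximation of the CRLF-to-LF conversion that git add performs when core.autocrlf is true: the function name `normalize_for_staging` is hypothetical, the NUL-byte test is a simplification of Git's text detection, and wiring the result into Dulwich's index is not shown because that depends on the Dulwich version in use.

```python
def normalize_for_staging(data: bytes) -> bytes:
    """Convert CRLF to LF, leaving content that looks binary untouched."""
    # Simplified binary check (an assumption): treat any NUL byte as binary.
    if b"\0" in data:
        return data
    return data.replace(b"\r\n", b"\n")


# Hypothetical working-copy content with Windows line endings.
working_copy = b"import acm\r\nis_64_bit = True\r\n"
print(normalize_for_staging(working_copy))
```

Content that already uses LF passes through unchanged, so applying the function repeatedly is harmless.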

As a Git beginner, I found "Mind the end of your line" by Tim Clem at GitHub to be the clearest explanation of the dozen or so I read while trying to understand the issue and solve the problem.

Petr Kozelka

You can control line endings by specifying rules in a .gitattributes file; see https://git-scm.com/docs/gitattributes for details.

I always add one with the following content whenever I create a new source repository:

* text=auto

Introducing it later is painful, especially when the repository is shared with a team: it changes all (or most) files, and the change only shows up after a new checkout, not in the current working copy.

To minimize this pain, you can restrict the rule to just your file extension, for example *.ael text=auto.