git-grep not using multiple threads

I am trying to use git grep to search all revisions of a very large repository. The command I am using is:

$ git rev-list --all | xargs git grep -I --threads 10 --line-number \
  --only-matching "SomeString"

I am using the latest official version of git on mac:

$ git --version
git version 2.19.1

It's taking a very long time, and Activity Monitor shows git using only one thread. However, the docs say it should use 8 by default. It uses only one thread with or without the --threads <num> option. I don't have any other config set that would override this setting either:

$ git config --list
credential.helper=osxkeychain
user.name=****
user.email=****

Any ideas what I'm missing? Can anybody else use git-grep and confirm that they see multiple threads?

Thanks for any help

There are 2 answers

msanford (best answer)

I wonder if it's because you're using | xargs, which waits for input on stdin. Since the output from git rev-list is a single stream, xargs will by default use only one process:

-P max-procs, --max-procs=max-procs
              Run up to max-procs processes at a time; **the default is 1**.  If
              max-procs is 0, xargs will run as many processes as possible
              at a time.

So try increasing it using the above flag:

git rev-list --all | xargs -P 10 git grep -I --threads 1 --line-number \
    --only-matching "SomeString"

This will spawn multiple git grep processes rather than enabling git grep to use multiple threads, so it is only a sort-of-functional answer.
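One caveat: xargs packs as many revisions as fit onto each command line, so on a smaller repository -P alone may still produce only one or two git grep processes. A minimal sketch that also caps the batch size with -n follows. The function name grep_all_revs and the default values (4 jobs, 100 revisions per batch) are illustrative assumptions, not tuned numbers, and -P is a GNU/BSD xargs extension rather than POSIX.

```shell
# Hypothetical helper: search every revision for a pattern by running
# several single-threaded git grep processes side by side.
grep_all_revs() {
  pattern=$1
  jobs=${2:-4}      # parallel git grep processes (assumed default)
  batch=${3:-100}   # revisions passed to each git grep call (assumed default)

  # xargs appends each batch of revisions after the pattern, giving
  #   git grep <options> -e <pattern> <rev> <rev> ...
  git rev-list --all |
    xargs -n "$batch" -P "$jobs" \
      git grep -I --threads 1 --line-number --only-matching -e "$pattern"
}
```

Called as grep_all_revs "SomeString" 10 200 from inside the repository. Output from the parallel processes can arrive in any order, and xargs exits non-zero when any batch finds no match, so the exit status is not a reliable "found / not found" signal here.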

VonC

The number of processes to give xargs depends on the number of threads used by each git grep.

It used to be 8 by default for git grep.

But:

With Git 2.26 (Q1 2020), this is now the number of cores.

See commit f1928f0, commit 70a9fef, commit 1184a95, commit 6c30762, commit c441ea4, commit d799242, commit 1d1729c, commit 31877c9, commit b1fc9da, commit d5b0bac, commit faf123c, commit c3a5bb3 (16 Jan 2020) by Matheus Tavares (matheustavares).
(Merged by Junio C Hamano -- gitster -- in commit 56ceb64, 14 Feb 2020)

grep: use no. of cores as the default no. of threads

Signed-off-by: Matheus Tavares

When --threads is not specified, git grep will use 8 threads by default.

This fixed number may be too many for machines with fewer cores and too little for machines with more cores.
So, instead, use the number of logical cores available in the machine, which seems to result in the best overall performance.

The following measurements correspond to the mean elapsed times for 30 git grep executions in chromium's repository with a 95% confidence interval (each set of 30 were performed after 2 warmup runs).
Regex 1 is 'abcd[02]' and Regex 2 is '(static|extern) (int|double) \*'.

(chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution.)

      |          Working tree         |           Object Store
------|-------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
------|---------------|---------------|----------------|---------------
  32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
  16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
   8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
   4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
   2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
   1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01

The above tests were performed in a desktop running Debian 10.0 with Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of RAM and a 7200 rpm, SATA 3.1 HDD.
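To read the table as speedups rather than raw times, divide the 1-thread time by the time at a given thread count. A small sketch using awk for the floating-point division (the function name speedup is just an illustrative choice):

```shell
# speedup BASE_SECONDS PARALLEL_SECONDS
# Prints BASE / PARALLEL rounded to two decimals.
speedup() {
  LC_ALL=C awk -v base="$1" -v par="$2" 'BEGIN { printf "%.2f\n", base / par }'
}

speedup 14.10 4.86   # object store, Regex 1: 1 thread vs 8 threads
speedup 6.16 2.53    # working tree, Regex 1: 1 thread vs 8 threads
```

On this machine, 8 threads give roughly a 2.9x speedup in the object store and about 2.4x in the working tree for Regex 1, and going to 16 or 32 threads makes every column slower again.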

Below, the tests were repeated for a machine with SSD: a Manjaro laptop with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:

      |          Working tree          |           Object Store
------|--------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
------|---------------|----------------|----------------|---------------
  32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
  16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
   8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
   4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
   2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
   1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08
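On a Git older than 2.26, the thread count can be pinned to the core count by hand through the grep.threads setting. The sketch below assumes getconf _NPROCESSORS_ONLN is available (it is a common extension on Linux and macOS, not a POSIX guarantee) and falls back to 1 otherwise; the function names are illustrative. As far as I can tell from the commit series above, versions before 2.26 also disabled threading whenever revisions were being searched, which would explain the single thread seen in the question, so on those versions this setting should only matter for working-tree searches.

```shell
# Print the number of logical cores, or 1 if it cannot be determined.
logical_cores() {
  getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Hypothetical wrapper: run git grep with grep.threads set to the core
# count for this one invocation only (-c does not touch any config file).
grep_with_core_threads() {
  git -c grep.threads="$(logical_cores)" grep -I --line-number "$@"
}
```

For example, grep_with_core_threads "SomeString" searches the working tree. To make the setting permanent instead, git config --global grep.threads <num> stores it in the user's config.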