We have a single repo of our source code which, if downloaded, is around 2.8 GB. We have 4 self-hosted agents and over 100 build pipelines. With that, it is not feasible to download the entire source code for each build/agent.
The approach I have gone with is to disable the checkout for these pipelines and then run a command-line script to perform a Git sparse checkout. However, this is taking around 15 minutes to get ~100 MB worth of source code.
We are using self-hosted Linux agents.
steps:
- checkout: none
- task: CmdLine@2
  displayName: "Project Specific Checkout"
  inputs:
    script: |
      cd $(Build.SourcesDirectory)
      git init
      git config --global user.email ""
      git config --global user.name ""
      git config --global core.sparsecheckout true
      echo STARS/Source/A/ >> .git/info/sparse-checkout
      echo STARS/Source/B/ >> .git/info/sparse-checkout
      echo STARS/Source/C/ >> .git/info/sparse-checkout
      git remote rm origin
      git remote add origin https://service:$(Service.Account.Personal.Access.Token)@dev.azure.com/Organization/Project/_git/STARS
      git reset --hard
      git pull origin $(Build.SourceBranch)
Is there anything I'm doing wrong here that is causing it to take so long to pull this data?
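For reference, the script above can be exercised end to end against a local throwaway repository. This is a sketch, not the original pipeline: the file:// origin stands in for the Azure DevOps remote, the config is scoped locally instead of --global, and --depth=1 on the pull is an added suggestion so only the tip commit's objects are transferred (a plain git pull fetches the full history of the branch, which is often the slow part).

```shell
#!/bin/sh
# Sketch: the question's sparse-checkout flow against a local stand-in
# origin. The file:// URL and --depth=1 are illustrative additions;
# the paths mirror the question (STARS/Source/A, STARS/Source/B).
set -e
tmp=$(mktemp -d)

# Build a stand-in "origin" with two project folders.
git init -q "$tmp/origin"
mkdir -p "$tmp/origin/STARS/Source/A" "$tmp/origin/STARS/Source/B"
echo a > "$tmp/origin/STARS/Source/A/a.txt"
echo b > "$tmp/origin/STARS/Source/B/b.txt"
git -C "$tmp/origin" add .
git -C "$tmp/origin" -c user.email=ci@example.com -c user.name=ci commit -qm init
branch=$(git -C "$tmp/origin" symbolic-ref --short HEAD)

# The pipeline's steps, scoped to the local repo (no --global config).
mkdir "$tmp/work" && cd "$tmp/work"
git init -q
git config core.sparsecheckout true
echo STARS/Source/A/ >> .git/info/sparse-checkout
git remote add origin "file://$tmp/origin"
# Shallow pull: fetch only the tip commit, then check out the sparse paths.
git pull -q --depth=1 origin "$branch"

ls STARS/Source   # only A is materialized; B is excluded by the sparse-checkout file
```

Running the same commands by hand on the agent (per the answer below) with and without --depth would show whether history transfer, rather than the sparse checkout itself, accounts for the 15 minutes.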
1. Since you use a self-hosted agent, you could go to the agent machine and run the git commands manually, to see whether you get the same performance.
2. Set the variable system.debug to true, to check which command costs more time.
3. Instead of a Git sparse checkout, you may specify path in the checkout step: https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema?view=azure-devops&tabs=schema%2Cparameter-schema#checkout
4. Since you run a pipeline on a self-hosted agent, by default none of the subdirectories are cleaned between two consecutive runs. As a result, you can do incremental builds and deployments, provided that your tasks are implemented to take advantage of that. So you can set the Clean option to false.
https://learn.microsoft.com/en-us/azure/devops/pipelines/process/phases?view=azure-devops&tabs=yaml#workspace
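Suggestions 2-4 above can be sketched together in pipeline YAML. The job layout, the stars path, and the fetchDepth value are illustrative assumptions, not part of the original answer; fetchDepth is an extra shallow-fetch idea on top of the listed suggestions:

```yaml
# Sketch combining the suggestions above (values are illustrative):
variables:
  system.debug: true    # suggestion 2: verbose logs show per-command timing

steps:
- checkout: self        # suggestion 3: use the built-in checkout step
  clean: false          # suggestion 4: keep the working copy between runs
  path: stars           # repo is placed under $(Agent.BuildDirectory)/stars
  fetchDepth: 1         # extra: shallow fetch of only the tip commit
```

With clean: false on a self-hosted agent, subsequent runs only fetch new commits into the existing working copy, which is usually far faster than re-initializing the repo in a script.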