Avoid git clean with Azure DevOps self-hosted build agent

I have a YAML build pipeline in an Azure-hosted Git repository that is triggered across 7 build agents running on a local VM. Every run performs a git clean, which takes a significant amount of time because of a large node_modules folder.

The MSDN page here seems to suggest this is configurable but shows no detail of how to configure it. I can't tell whether this is a setting that should be specified on the agent, the YAML script, within DevOps on the pipeline, or where.

Is there any other documentation I'm missing or is this not possible?

Update: The start of the YAML file is here:

variables:
  BUILD_VERSION: 1.0.0.$(Build.BuildId)
  buildConfiguration: 'Release'
  process.clean: false

jobs:
#############################################################
###### 1 - Build and publish .NET
#############################################################

- job: net_build_publish
  displayName: .NET build and publish
  pool:
    name: default
  steps:
  - script: echo $(BUILD_VERSION)

  - task: DotNetCoreCLI@2
    displayName: dotnet build $(buildConfiguration)
    inputs:
      command: 'build'
      projects: |
        myrepo/**/API/*.csproj
      arguments: '-c $(buildConfiguration) /p:Version=$(BUILD_VERSION)'

The complete YAML is a lot longer, but the log of the Checkout task in the first job includes this output:

Starting: Checkout myrepo@master to s
==============================================================================
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
==============================================================================
Syncing repository: myrepo (Git)
Prepending Path environment variable with directory containing 'git.exe'.
git version
git version 2.26.2.windows.1
git lfs version
git-lfs/2.11.0 (GitHub; windows amd64; go 1.14.2; git 48b28d97)
git config --get remote.origin.url
git clean -ffdx
Removing myrepo/Data/Core/API/bin/
Removing myrepo/Data/Core/API/customersettings.json
Removing myrepo/Data/Core/API/obj/
Removing myrepo/Data/Core/Shared/bin/
Removing myrepo/Data/Core/Shared/obj/
....

We have another job further down which runs npm install and npm build for an Angular project. Every build in the pipeline takes 5 minutes on the npm install step, possibly because of this git clean when retrieving the repository?

There are 4 answers

Answer by Cece Dong - MSFT

git clean -ffdx removes every file in the sources folder that is not tracked by source control, including ignored files such as node_modules. You may try Pipeline caching, which can reduce build time by letting the outputs or downloaded dependencies from one run be reused in later runs, so the same files do not have to be recreated or downloaded again. Check the following link:

https://learn.microsoft.com/en-us/azure/devops/pipelines/release/caching?view=azure-devops#nodejsnpm

variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  inputs:
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    restoreKeys: |
       npm | "$(Agent.OS)"
    path: $(npm_config_cache)
  displayName: Cache npm
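
The snippet above only restores npm's download cache, so an install step still has to run after it. As a rough sketch of a variation, assuming package.json and package-lock.json sit at the repository root, the node_modules folder itself can be cached and the install skipped on an exact cache hit. The variable name CACHE_RESTORED and the display names are illustrative choices, not something from the question:

```yaml
steps:
- task: Cache@2
  displayName: Cache node_modules
  inputs:
    key: 'node_modules | "$(Agent.OS)" | package-lock.json'
    # Assumption: the Angular project lives at the repository root
    path: $(Build.SourcesDirectory)/node_modules
    # Set to 'true' only when the exact key was found
    cacheHitVar: CACHE_RESTORED

- script: npm ci
  displayName: Install dependencies
  # Skip the install when the cache step restored node_modules
  condition: ne(variables.CACHE_RESTORED, 'true')
```

Whether this beats a plain npm install depends on how long the cache download takes on your agents, so it is worth timing both.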

Answer by Krzysztof Madej

You need to calculate the hash of package-lock.json before you run npm install. If it matches the hash stored alongside node_modules, you can skip installing dependencies. This may help you achieve this:

steps:
- task: PowerShell@2
  displayName: 'Calculate and save package-lock.json hash'
  inputs:
    targetType: 'inline'
    pwsh: true
    script: |
      # Hash package-lock.json; keep only the hash string so it can be compared as text
      $newHash = (Get-FileHash -Algorithm MD5 -Path package-lock.json).Hash
      $hashPath = "$(System.DefaultWorkingDirectory)/cache-npm/hash.txt"
      if ((Test-Path -Path $hashPath) -and ((Get-Content $hashPath -TotalCount 1) -eq $newHash)) {
        # files are the same: tell the next step to skip npm install
        Write-Host "no need to install node_modules"
        Write-Host "##vso[task.setvariable variable=NodeModulesAreUpToDate;]true"
      } else {
        # first run, or package-lock.json changed: store the new hash
        Set-Content -Path $hashPath -Value $newHash
        Write-Host ("Hash File saved to " + $hashPath)
      }

      $storedHash = Get-Content $hashPath
      Write-Host $storedHash
    workingDirectory: '$(System.DefaultWorkingDirectory)/cache-npm'

- script: npm install
  workingDirectory: '$(Build.SourcesDirectory)/cache-npm'
  # Runs unless the step above marked node_modules as up to date
  condition: ne(variables['NodeModulesAreUpToDate'], 'true')

Answer by Vasantha Ganesh

The checkout step accepts a boolean option, clean, which can be set to true or false. The default is true, so git clean runs unless you turn it off.

Below is a minimal example with clean set to false.

jobs:
- job: Build_Job
  timeoutInMinutes: 0
  pool: 'PoolOne'

  steps:
  - checkout: self
    clean: false
    submodules: recursive

  - task: PowerShell@2
    displayName: Make build
    inputs:
      targetType: 'inline'
      script: |
        bash -c 'make'
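
Applied to the first job from the question, this might look like the sketch below. It assumes the rest of the job stays as posted. The process.clean variable from the question is left out here because the posted log still shows git clean -ffdx running, so it does not appear to control the checkout:

```yaml
jobs:
- job: net_build_publish
  displayName: .NET build and publish
  pool:
    name: default
  steps:
  # An explicit checkout step replaces the implicit one;
  # clean: false asks the agent not to run git clean
  - checkout: self
    clean: false

  - script: echo $(BUILD_VERSION)
```

Each job in the pipeline gets its own checkout, so the Angular job further down would need the same step.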

More documentation and related options can be found here

Answer by Matt

  1. Click on your pipeline to show the run history
  2. Click Edit
  3. Click the 3 dot kebab menu
  4. Click Triggers
  5. Click YAML
  6. Click Get Sources
  7. Set Clean to False and Save

To say this is obfuscated is an understatement!

I can't say what effect this will have, though. I think the agent reuses the same folder each time a pipeline runs, and I'm not a Node.js developer, so I don't know what leaving old node_modules hanging around will do!
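
If leaving everything uncleaned feels risky, one possible middle ground is to switch off the built-in clean and run git clean yourself with node_modules excluded. This is only a sketch: it assumes git is on the agent's PATH, the display name is illustrative, and unlike the built-in clean it runs after the fetch and does not reset tracked files:

```yaml
steps:
- checkout: self
  clean: false   # skip the agent's built-in git clean

# Remove untracked and ignored files manually, but keep node_modules.
# -e adds an exclude pattern that still applies when -x is given.
- script: git clean -ffdx -e node_modules
  displayName: Clean sources except node_modules
  workingDirectory: $(Build.SourcesDirectory)
```

Build outputs such as bin/ and obj/ are still removed, while every node_modules folder in the repository survives between runs.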

P.S. I don't think pipeline caching, which other people suggested, is what you were asking about. Pipeline caching also zips up the cached folder, uploads it to your artifacts storage, and downloads it again on each run. If you only have 1 build agent, skipping the git clean might actually be more efficient, though I'm not 100% sure.