Error handling in cloud init scripts within azurerm_linux_virtual_machine

1.1k views Asked by At

I run a custom shell script when I deploy my virtual machine with terraform, which can throw errors.

My question is, how do you handle these errors, because regardless of the return code of the script, terraform always reports the deployment was successful, which leads to confusion when the VM does not what it’s supposed to do.

Here a snippet of the terraform file for context:

data "template_file" "setup_script" {
  count    = var.agent_count
  template = file("scripts/setup.sh")
  vars = {
    POOL_NAME        = var.pool_name
    AGENT            = "agent-${count.index}"
    ORGANIZATION_URL = var.organization_url
    TOKEN            = var.token
    TERRAFORM_VERSION = var.terraform_version
    VSTS_AGENT_VERSION = var.vsts_agent_version
  }
}

resource "azurerm_linux_virtual_machine" "vmachine" {
  count               = length(module.network.network_interfaces)
  name                = "agent-${count.index}"
  resource_group_name = azurerm_resource_group.deployment-agents.name
  location            = azurerm_resource_group.deployment-agents.location
  size                = "Standard_B1ms"
  admin_username      = "adminuser"

  network_interface_ids = [
    module.network.network_interfaces[count.index].id,
  ]

  admin_ssh_key {
    username   = "adminuser"
    public_key = var.ssh_public_key
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }

  boot_diagnostics {
    storage_account_uri = azurerm_storage_account.boot.primary_blob_endpoint
  }
  custom_data = base64encode(data.template_file.setup_script.*.rendered[count.index])
}

And the setup.sh shell script:

# --- snip ----
apt-get install azure-cli
if [ $? -gt 0 ]; then
  echo "Cannot install azure cli!"
  exit 1
fi

# Test
exit 1

Thanks for the help.

2

There are 2 answers

2
Charles Xu On BEST ANSWER

Terraform only return the error about itself, not the script execute inside the VM. You can find the error message inside the VM.

And to install the Azure CLI via the cloud-init with a shell script, you need to add #!/bin/bash at the beginning of the shell script, see the note:

enter image description here

And install the Azure CLI, I think there are more things you need to do than what you have tried, take a look at the steps that install the Azure CLI in Ubuntu. Or use the existing shell script here.

0
Ilon Sjögren On

The important piece here is "a cloud-init image deployment will NOT fail if the script fails" from the Azure cloud-init docs. This mean that Azure think the deployment was successful, which is what Terraform will also show.

In order to fail the Terraform deployment, you would have to use cloud-init to install the cli using cloud-init itself, which should make the deployment fail:

data "template_cloudinit_config" "config" {
  gzip          = true
  base64_encode = true

  part {
    content_type = "text/cloud-config"
    content      = "packages: ['azure-cli']"
  }
}

More examples on what could be done using cloud-init in the docs.