Namespace must be defined to use Custom Metrics

I am trying to create an alert with Terraform on a memory metric, which is one of the guest metrics for a VM. I have already specified the namespace for the metric, but it throws the error below.

Error creating or updating metric alert "Memory Usage Alert" (resource group "MyTemp"): insights.MetricAlertsClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadRequest" Message="The following metric name(s) were not found: Memory\% Committed bytes in use. Please note that for custom metrics, the relevant metric namespace must be specified.

resource "azurerm_monitor_metric_alert" "myalert" {
  name                      = "Memory Usage Alert"
  resource_group_name       = var.rg_name // resource group to which you want to deploy this alert
  scopes                    = [var.virtual_machine_id]
  description               = "Action will be triggered when Memory Utilization count is greater than 85."
  target_resource_type      = "Microsoft.Compute/virtualMachines"
  target_resource_location  = "centralindia"
  frequency                 = "PT30M"
  window_size               = "P1D"
  severity                  = "2"
  enabled                   = "true"

  criteria {
    metric_namespace = "azure.vm.windows.guestmetrics"
    metric_name      = "Memory\\% Committed bytes in use"
    aggregation      = "Average"
    operator         = "GreaterThanOrEqual"
    threshold        = 85
  }

  action {
    action_group_id = var.action_name
  }
}

Answer by Jim Xu (best answer)

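The alert can only be created once the VM is actually sending the guest metric to Azure Monitor. To resolve the issue, you can use the following script, which first configures the diagnostics extension and then creates the alert.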

  1. Send the VM's guest metrics to Azure Monitor with the Azure Diagnostics VM extension. For more details, see the Azure documentation on sending guest OS metrics to the Azure Monitor metrics store.
resource "random_string" "password" {
  length  = 16
  special = false
}
data "azurerm_resource_group" "mygroup" {
  name     = var.resource_group_name
}
resource "azurerm_storage_account" "account" {
  name                     = "myaccount1458975"
  resource_group_name      = data.azurerm_resource_group.mygroup.name
  location                 = data.azurerm_resource_group.mygroup.location
  account_tier             = "Standard"
  account_replication_type = "GRS"
}
resource "azurerm_virtual_machine_extension" "vmextension" {
  name                       = random_string.password.result
  virtual_machine_id         = var.virtual_machine_id # resource ID of your VM
  publisher                  = "Microsoft.Azure.Diagnostics"
  type                       = "IaaSDiagnostics"
  type_handler_version       = "1.11"
  auto_upgrade_minor_version = true

  depends_on = [
    azurerm_storage_account.account
  ]
  settings = <<SETTINGS
    {
       "StorageAccount": "${azurerm_storage_account.account.name}",
          "WadCfg": {
            "SinksConfig": {
              "Sink": [
                {
                  "name": "AzMonSink",
                  "AzureMonitor": {}
                }
              ]
            },
            "DiagnosticMonitorConfiguration": {
              "overallQuotaInMB": 5120,
              "Metrics": {
                "resourceId": "${var.virtual_machine_id}",
                "MetricAggregation": [
                  {
                    "scheduledTransferPeriod": "PT1H"
                  },
                  {
                    "scheduledTransferPeriod": "PT1M"
                  }
                ]
              },
              "DiagnosticInfrastructureLogs": {
                "scheduledTransferLogLevelFilter": "Error"
              },
              "PerformanceCounters": {
                "sinks": "AzMonSink",
                "scheduledTransferPeriod": "PT1M",
                "PerformanceCounterConfiguration": [
                  {
                    "counterSpecifier": "\\Processor Information(_Total)\\% Processor Time",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Processor Information(_Total)\\% Privileged Time",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Processor Information(_Total)\\% User Time",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Processor Information(_Total)\\Processor Frequency",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\System\\Processes",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Process(_Total)\\Thread Count",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Process(_Total)\\Handle Count",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\System\\System Up Time",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\System\\Context Switches/sec",
                    "unit": "CountPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\System\\Processor Queue Length",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\% Committed Bytes In Use",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\Available Bytes",
                    "unit": "Bytes",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\Committed Bytes",
                    "unit": "Bytes",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\Cache Bytes",
                    "unit": "Bytes",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\Pool Paged Bytes",
                    "unit": "Bytes",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\Pool Nonpaged Bytes",
                    "unit": "Bytes",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\Pages/sec",
                    "unit": "CountPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Memory\\Page Faults/sec",
                    "unit": "CountPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Process(_Total)\\Working Set",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Process(_Total)\\Working Set - Private",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\% Disk Time",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\% Disk Read Time",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\% Disk Write Time",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\% Idle Time",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Disk Bytes/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Disk Read Bytes/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Disk Write Bytes/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Disk Transfers/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Disk Reads/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Disk Writes/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Avg. Disk sec/Transfer",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Avg. Disk sec/Read",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Avg. Disk sec/Write",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Avg. Disk Queue Length",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Avg. Disk Read Queue Length",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Avg. Disk Write Queue Length",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\% Free Space",
                    "unit": "Percent",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\LogicalDisk(_Total)\\Free Megabytes",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Bytes Total/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Bytes Sent/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Bytes Received/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Packets/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Packets Sent/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Packets Received/sec",
                    "unit": "BytesPerSecond",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Packets Outbound Errors",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  },
                  {
                    "counterSpecifier": "\\Network Interface(*)\\Packets Received Errors",
                    "unit": "Count",
                    "sampleRate": "PT60S"
                  }
                ]
              },
              "WindowsEventLog": {
                "scheduledTransferPeriod": "PT1M",
                "DataSource": [
                  {
                    "name": "Application!*[System[(Level = 1 or Level = 2 or Level = 3)]]"
                  },
                  {
                    "name": "Security!*[System[band(Keywords,4503599627370496)]]"
                  },
                  {
                    "name": "System!*[System[(Level = 1 or Level = 2 or Level = 3)]]"
                  }
                ]
              }
            }
          }
    }
SETTINGS
 
  protected_settings = <<SETTINGS
    {
      "storageAccountName": "${azurerm_storage_account.account.name}",
      "storageAccountKey": "${azurerm_storage_account.account.primary_access_key}",
      "storageAccountEndPoint": "https://core.windows.net/"
    }
SETTINGS
}

Please note that before running the script, you need to enable a managed identity (Azure MSI) and boot_diagnostics on your VM.
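If the VM itself is managed by Terraform, the two prerequisites in the note above could be set on the VM resource. The following is only a sketch under assumptions: it uses the azurerm_windows_virtual_machine resource, and the VM name, size, image, admin user, and the two variables are hypothetical placeholders that are not part of the original script. It reuses the resource group data source and the storage account from step 1.

```hcl
# Hypothetical inputs; these names are illustrative, not from the original script.
variable "vm_admin_password" {
  type      = string
  sensitive = true
}

variable "network_interface_id" {
  type = string
}

resource "azurerm_windows_virtual_machine" "vm" {
  name                  = "example-vm" # placeholder name
  resource_group_name   = data.azurerm_resource_group.mygroup.name
  location              = data.azurerm_resource_group.mygroup.location
  size                  = "Standard_B2s" # placeholder size
  admin_username        = "azureadmin"   # placeholder user
  admin_password        = var.vm_admin_password
  network_interface_ids = [var.network_interface_id]

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2019-Datacenter"
    version   = "latest"
  }

  # System-assigned managed identity (MSI); the diagnostics extension uses it
  # to authenticate when it sends metrics to Azure Monitor.
  identity {
    type = "SystemAssigned"
  }

  # Boot diagnostics, written to the storage account created in step 1.
  boot_diagnostics {
    storage_account_uri = azurerm_storage_account.account.primary_blob_endpoint
  }
}
```

With a VM defined this way, azurerm_windows_virtual_machine.vm.id could be used wherever the script expects the VM ID.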

  2. Create the metric alert
resource "azurerm_monitor_metric_alert" "myalert" {
  name                     = "myalert"
  resource_group_name      = var.resource_group_name
  scopes                   = [var.virtual_machine_id]
  description              = "Action will be triggered when average memory utilization is greater than or equal to 85%."
  severity                 = 2
  enabled                  = true
  frequency                = "PT1M"
  window_size              = "PT5M"
  target_resource_type     = "Microsoft.Compute/virtualMachines"
  target_resource_location = "japaneast"

  criteria {
    metric_namespace = "azure.vm.windows.guestmetrics"
    metric_name      = "Memory\\% Committed bytes in use"
    aggregation      = "Average"
    operator         = "GreaterThanOrEqual"
    threshold        = 85
  }

  action {
    # Resource ID of the action group to notify
    action_group_id = var.action_group_name
  }
}
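If both steps are applied in a single run, the alert may still be created before the extension has emitted its first data point, which can produce the same "metric name(s) were not found" error. One possible workaround, sketched here as a variant to use in place of the alert above, is to make the alert depend on the extension and to set skip_metric_validation in the criteria block. This assumes an azurerm provider version that supports that argument; please check the provider documentation for your version.

```hcl
# Variant of the alert above; use it in place of that resource, not alongside it.
resource "azurerm_monitor_metric_alert" "myalert" {
  name                = "myalert"
  resource_group_name = var.resource_group_name
  scopes              = [var.virtual_machine_id]
  description         = "Action will be triggered when average memory utilization is greater than or equal to 85%."
  severity            = 2
  frequency           = "PT1M"
  window_size         = "PT5M"

  criteria {
    metric_namespace = "azure.vm.windows.guestmetrics"
    metric_name      = "Memory\\% Committed bytes in use"
    aggregation      = "Average"
    operator         = "GreaterThanOrEqual"
    threshold        = 85

    # Allow the rule to be created even if the custom metric
    # has not been emitted yet.
    skip_metric_validation = true
  }

  action {
    action_group_id = var.action_group_name
  }

  # Create the alert only after the diagnostics extension is configured.
  depends_on = [azurerm_virtual_machine_extension.vmextension]
}
```

Skipping validation also hides genuine typos in the metric name or namespace, so it is worth confirming in the portal that the metric appears under the guest metrics namespace once the VM has been running for a few minutes.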

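The scripts above reference input variables without declaring them, and action_group_id expects the resource ID of an action group rather than its name. A minimal sketch of the missing pieces might look like the following; the action group name, short name, and e-mail address are hypothetical placeholders.

```hcl
variable "resource_group_name" {
  type        = string
  description = "Existing resource group that contains the VM"
}

variable "virtual_machine_id" {
  type        = string
  description = "Resource ID of the VM to monitor"
}

# Hypothetical action group that sends an e-mail when the alert fires.
resource "azurerm_monitor_action_group" "example" {
  name                = "example-action-group"
  resource_group_name = var.resource_group_name
  short_name          = "example" # short names are limited to 12 characters

  email_receiver {
    name          = "ops"
    email_address = "ops@example.com" # placeholder address
  }
}
```

The alert's action block could then use action_group_id = azurerm_monitor_action_group.example.id instead of a variable.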