I'm trying to auto-instrument my NestJS project with OpenTelemetry using the nestjs-otel package. I followed the instructions and applied the corrections suggested in one of its open issues.
This is my main configuration of the otelSDK:
export const otelSDK = new NodeSDK({
  metricReader: new PrometheusExporter({
    port: 8125,
  }),
  contextManager: new AsyncLocalStorageContextManager(),
  instrumentations: [
    new PinoInstrumentation(),
    new HttpInstrumentation(),
    new NestInstrumentation(),
    getNodeAutoInstrumentations(),
  ],
});
When running the service locally, the metrics are up and running. Visiting http://localhost:8125/metrics shows them coming in:
...
# HELP http_server_duration Measures the duration of inbound HTTP requests.
# UNIT http_server_duration ms
# TYPE http_server_duration histogram
http_server_duration_count{http_scheme="http",http_method="GET",net_host_name="localhost",http_flavor="1.1",http_status_code="200",net_host_port="8125"} 3
http_server_duration_sum{http_scheme="http",http_method="GET",net_host_name="localhost",http_flavor="1.1",http_status_code="200",net_host_port="8125"} 933.854501
http_server_duration_bucket{http_scheme="http",http_method="GET",net_host_name="localhost",http_flavor="1.1",http_status_code="200",net_host_port="8125",le="0"} 0
http_server_duration_bucket{http_scheme="http",http_method="GET",net_host_name="localhost",http_flavor="1.1",http_status_code="200",net_host_port="8125",le="5"} 0
http_server_duration_bucket{http_scheme="http",http_method="GET",net_host_name="localhost",http_flavor="1.1",http_status_code="200",net_host_port="8125",le="10"} 0
...
I'm deploying my service on Kubernetes and using telegraf-operator to inject a Telegraf sidecar that collects my metrics. I've provided the following annotations on my Deployment resource:
telegraf.influxdata.com/class: influxdb
telegraf.influxdata.com/inputs: |+
  [[inputs.prometheus]]
    urls = ["http://localhost:{{ .Values.deployment.metrics.port }}{{ .Values.deployment.metrics.route }}"]
    metric_version = 1
However, when running the service over Kubernetes, I'm getting the following error:
[inputs.prometheus] Error in plugin: error reading metrics for http://localhost:8125/metrics: reading text format failed: text format parsing error in line X: second HELP line for metric name "http_server_duration"
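The error says the scraped text contains two "# HELP" lines for the same metric name. A minimal sketch for checking a scraped payload for that condition follows; the function name findDuplicateHelp is hypothetical and not part of Telegraf or OpenTelemetry, and it only looks at HELP lines, not at the rest of the exposition format.

```typescript
// Hypothetical helper: list metric names that have more than one
// "# HELP" line in a Prometheus text-format payload.
function findDuplicateHelp(body: string): string[] {
  const counts: Record<string, number> = {};
  for (const line of body.split("\n")) {
    // A HELP line looks like: "# HELP <metric_name> <description>"
    const name = /^# HELP (\S+)/.exec(line.trim())?.[1];
    if (name !== undefined) {
      counts[name] = (counts[name] ?? 0) + 1;
    }
  }
  // Keep only the names that were declared more than once.
  return Object.keys(counts).filter((n) => (counts[n] ?? 0) > 1);
}
```

Passing it the body returned by http://localhost:8125/metrics (for example, saved with curl) should return the names of any metrics that are declared more than once.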
To my understanding, there's a mismatch between the metrics format and what the Telegraf input plugin expects. I'm not sure which plugin I should use, or whether I need to make any configuration changes for this to work.
Your help will be appreciated.
I found out that the issue was that the http_server_duration metric was being sent twice. I had to remove the new HttpInstrumentation() and getNodeAutoInstrumentations() entries for the duplicate to be gone. Then, the issue was solved.
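For reference, a sketch of what the configuration looks like with those two entries removed. The import paths are assumptions inferred from the class names and may differ from the ones in your project; everything else is taken from the configuration in the question.

```typescript
// Import paths are assumed from the class names; check them against your package.json.
import { NodeSDK } from '@opentelemetry/sdk-node';
import { PrometheusExporter } from '@opentelemetry/exporter-prometheus';
import { AsyncLocalStorageContextManager } from '@opentelemetry/context-async-hooks';
import { PinoInstrumentation } from '@opentelemetry/instrumentation-pino';
import { NestInstrumentation } from '@opentelemetry/instrumentation-nestjs-core';

export const otelSDK = new NodeSDK({
  metricReader: new PrometheusExporter({
    port: 8125,
  }),
  contextManager: new AsyncLocalStorageContextManager(),
  instrumentations: [
    // HttpInstrumentation and getNodeAutoInstrumentations() are removed,
    // since together they caused http_server_duration to be reported twice.
    new PinoInstrumentation(),
    new NestInstrumentation(),
  ],
});
```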