Exporting Prometheus Alertmanager Alerts to CSV Using Python, Filtering by Specific Timestamps


I'm working with a monitoring platform provisioned on AWS, utilising EKS, and configured with a Prometheus data source and Alertmanager. Currently, alerts triggered by Alertmanager are sent to a Slack channel. I have used the Alertmanager API to fetch all the alerts triggered within a specific period of time.

Below is the Python code I've written for this purpose. Please note that I'm not very experienced with Python. The code successfully exports alerts to an Excel sheet, but it includes random alerts from various dates (e.g., 20th, 21st, 26th, 27th of February 2024), rather than only those triggered on a specific day (e.g., 27th of February).

I'm seeking assistance to modify the code so that it exports only alerts triggered within the specified date range. At the moment, the Excel sheet it downloads to my PC contains numerous alerts from outside the dates provided in the script.

Any help or suggestions would be greatly appreciated. Thank you.

import requests
import pandas as pd
from datetime import datetime
import os  # Import the os module

def export_alerts(prometheus_url, start_time, end_time):
    # Construct the URL to query the alerts
    alerts_url = f"{prometheus_url}/api/v1/alerts"

    # Initialize an empty list to store all alerts
    all_alerts = []

    # Define query parameters for the initial request
    params = {
        "start": start_time,
        "end": end_time,
        "limit": 100  # Adjust as needed based on your expected number of alerts per query
    }

    try:
        while True:
            # Send a GET request to fetch alerts with pagination
            response = requests.get(alerts_url, params=params)
            response.raise_for_status()  # Raise an exception for any error response

            # Extract alerts from the response
            alerts_data = response.json()["data"]["alerts"]

            # Append alerts to the list of all alerts
            all_alerts.extend(alerts_data)

            # Check if there are more alerts to fetch
            if "next" in response.json()["data"]:
                params["start"] = response.json()["data"]["next"]
            else:
                break  # Exit loop if no more alerts to fetch

        # Return all alerts
        return all_alerts

    except requests.RequestException as e:
        # Handle any HTTP request errors
        print(f"Error fetching alerts: {e}")
        return None

# Example usage
prometheus_url = "https://prometheus.dev.spencecom.eu.spence.cloud"
start_time = "2024-02-26T00:00:00Z"
end_time = "2024-02-27T23:59:59Z"

alerts = export_alerts(prometheus_url, start_time, end_time)
if alerts:

    # Convert alerts to a DataFrame
    df = pd.DataFrame(alerts)

    # Print column names for verification
    print("Column names in DataFrame:")
    print(df.columns)

    # Generate timestamp for filename
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

    # dir to save excel
    # Example: "/Users/your_username/Documents/alerts/"
    output_directory = "/Users/spenceebuka/Documents/PYTHON_SCRIPT"

    # Ensure that the output directory exists
    os.makedirs(output_directory, exist_ok=True)

    # Export DataFrame to Excel file with specified column names and timestamp in filename
    excel_file_path = os.path.join(output_directory, f"alerts_{timestamp}.xlsx")
    df.to_excel(excel_file_path, index=False)
    print(f"Alerts exported to {excel_file_path} successfully.")
else:
    print("Failed to fetch alerts.")

1 Answer

Answered by DazWilkin

One way to parse the RFC3339 strings in Python is to use dateutil:

from dateutil.parser import parse


# Parse the Prometheus RFC3339 date strings as datetime's
START_TIME = parse("2024-02-26T00:00:00Z")
END_TIME = parse("2024-02-27T23:59:59Z")

# A function that determines whether a date is between START_TIME and END_TIME
def is_alert_in_range(date_str: str) -> bool:
    d = parse(date_str)
    return START_TIME <= d <= END_TIME
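One thing to be aware of: Prometheus serialises `activeAt` with nanosecond-precision fractions (e.g. `2024-02-26T06:25:20.538669094Z`), which `dateutil` still parses, keeping Python's microsecond precision. A quick check of the helper against such a timestamp:

```python
from dateutil.parser import parse

START_TIME = parse("2024-02-26T00:00:00Z")
END_TIME = parse("2024-02-27T23:59:59Z")

def is_alert_in_range(date_str: str) -> bool:
    return START_TIME <= parse(date_str) <= END_TIME

# activeAt values from Prometheus carry nanosecond fractions; dateutil
# accepts them, and all parsed values are timezone-aware so they compare
# cleanly against the UTC bounds above.
print(is_alert_in_range("2024-02-26T06:25:20.538669094Z"))  # inside the window
print(is_alert_in_range("2024-02-20T06:25:20.538669094Z"))  # before the window
```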

I think you can drop the while loop (but retain the try/except): the `/api/v1/alerts` endpoint doesn't paginate, so it ignores your `start`, `end`, and `limit` parameters and returns every currently active alert in a single response. That's also why your export contains alerts from outside the requested window.

Then you can:

    try:
        # Send a single GET request; the endpoint returns all alerts at once
        response = requests.get(alerts_url)
        response.raise_for_status()  # Raise an exception for any error response

        # Extract alerts from the response
        alerts_data = response.json()["data"]["alerts"]

        # Filter alerts by activeAt between start_time|end_time
        filtered_alerts_data = [alert for alert in alerts_data if is_alert_in_range(alert["activeAt"])]

        # Return all alerts
        return filtered_alerts_data

    except requests.RequestException as e:
        # Handle any HTTP request errors
        print(f"Error fetching alerts: {e}")
        return None
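Putting the pieces together, your `export_alerts` collapses to one request plus a client-side filter. This is a sketch under the same assumptions as your script (the URL points at your Prometheus endpoint, and each alert in the response carries an `activeAt` field):

```python
import requests
from dateutil.parser import parse

START_TIME = parse("2024-02-26T00:00:00Z")
END_TIME = parse("2024-02-27T23:59:59Z")

def is_alert_in_range(date_str: str) -> bool:
    return START_TIME <= parse(date_str) <= END_TIME

def export_alerts(prometheus_url):
    # /api/v1/alerts takes no query parameters, so filter client-side
    alerts_url = f"{prometheus_url}/api/v1/alerts"
    try:
        response = requests.get(alerts_url)
        response.raise_for_status()
        alerts_data = response.json()["data"]["alerts"]
        # Keep only alerts that became active inside the window
        return [a for a in alerts_data if is_alert_in_range(a["activeAt"])]
    except requests.RequestException as e:
        print(f"Error fetching alerts: {e}")
        return None
```

The rest of your script (building the DataFrame and calling `to_excel`) can stay as it is.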

NOTE: There's a Prometheus Python client library, but I don't think it supports querying alerts.