Kubernetes Pods Seem to be Running Twice in Different Namespaces with Similar Logs


I've noticed that my Kubernetes pods appear to be running simultaneously in two different namespaces (airflow and scheduler). The pods are strikingly similar in their logs, age, and number of restarts.

However, there are minor discrepancies between the logs that make me believe the same workload really is running twice rather than me looking at one pod from two angles: the line from the scheduler namespace carries an Airflow pod_manager prefix whose timestamp is a couple of milliseconds later, while the payload itself is identical.

# Logs from the 'airflow' namespace
➜  ~ kubectl logs skills-extraction-5l07iznv -n airflow
...
2023-10-16 14:23:14,120| [INFO] Entries that do not match language [en]: 2 of 237

# Logs from the 'scheduler' namespace
➜  ~ kubectl logs da-boards-pipeline-skills-extraction-task-87jnihxn -n scheduler
...
[2023-10-16T14:23:14.122+0000] {pod_manager.py:418} INFO - [base] 2023-10-16 14:23:14,120| [INFO] Entries that do not match language [en]: 2 of 237
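
Beyond eyeballing the logs, I plan to compare the two pods' metadata; labels and ownerReferences should show what actually created each pod. A minimal check, using the pod names from above:

# Inspect both pods' labels and ownerReferences
➜  ~ kubectl get pod skills-extraction-5l07iznv -n airflow \
        -o jsonpath='{.metadata.labels}{"\n"}{.metadata.ownerReferences}{"\n"}'
➜  ~ kubectl get pod da-boards-pipeline-skills-extraction-task-87jnihxn -n scheduler \
        -o jsonpath='{.metadata.labels}{"\n"}{.metadata.ownerReferences}{"\n"}'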

Below is a sample of the Airflow DAG code that launches these pods:

# [Airflow code snippet]
...
from datetime import datetime, timedelta

from kubernetes import client  # needed for V1ResourceRequirements below
from kubernetes.client import models as k8s  # needed for V1LocalObjectReference / V1Toleration

from airflow.models import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
from config import default_args, IMAGE_SCHEDULER_IMAGES

# DAG configuration
DAG_ID = "boards_pipeline"
DAG_DESCRIPTION = "[DA Team] Field identification and extraction"
DAG_IMAGE_PRE_DEDUPLICATION_TAG = "scheduler__da_data_deduplication__latest"

RESOURCES = {
    # ... 
    'medium': client.V1ResourceRequirements(
        requests={"cpu": "2000m", "memory": "2Gi"},
        limits={"cpu": "2000m", "memory": "2Gi"}
    ),
}

pod_args = {
    'namespace': "airflow",  # pods are expected to be created here
    'service_account_name': "airflow",
    'image_pull_secrets': [k8s.V1LocalObjectReference("docker-registry")],
    'env_vars': {
        "EXECUTION_DATE": "{{ execution_date }}",  # templated at run time
    },
    'in_cluster': True,
    'get_logs': True,  # stream the pod's stdout into the Airflow task log
    'on_finish_action': 'delete_succeeded_pod',
    'trigger_rule': 'always',
}

on_demand_pod_args = {
    **pod_args,  # inherits namespace="airflow", service account, pull secrets, etc.
    'node_selector': {"abcd.com/tenant": "scheduler"},  # run on the "scheduler" tenant nodes
    'tolerations': [k8s.V1Toleration(key="abcd.com/tenant", operator="Equal", value="scheduler")],
    'on_finish_action': 'delete_pod',  # overrides delete_succeeded_pod from pod_args
}

with DAG(
        DAG_ID,
        default_args=default_args,
        start_date=datetime(2021, 5, 11),
        schedule_interval="30 0 * * *",
        max_active_runs=10,
        max_active_tasks=14,
        catchup=False,
        description=DAG_DESCRIPTION,
        tags=["Data-analytics", "boards", "Pipeline"],
) as dag:
    pre_deduplication = KubernetesPodOperator(
        cmds=["python3", "run_deduplication.py", "--item", "boards"],
        name="deduplication",
        task_id="deduplication-task",
        max_active_tis_per_dag=1,  # at most one running instance of this task across DAG runs
        image=IMAGE_SCHEDULER_IMAGES.format(DAG_IMAGE=DAG_IMAGE_PRE_DEDUPLICATION_TAG),
        container_resources=RESOURCES['medium'],
        **on_demand_pod_args
    )

    pre_deduplication  # single-task DAG, so no dependencies to declare

Given the above, I have a few questions:

  1. Is it possible that the DAG is being triggered twice, leading to the creation of pods in both namespaces?
  2. How can I ensure that the DAG runs only once and in the intended namespace? (A sketch of what I'm considering follows this list.)
  3. Are there any configurations or settings in Airflow or Kubernetes that I might be overlooking which could cause this behavior?
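
For question 2 specifically, here is what I'm considering: an untested sketch that reuses the imports and definitions from the snippet above, pins the namespace explicitly on the pod arguments, and serializes DAG runs so duplicates cannot come from overlapping runs. I'm not sure this would address the scheduler-namespace pod at all, hence the question:

# [Sketch, not yet tested] Reuses the imports and definitions from the snippet above.
single_ns_pod_args = {
    **on_demand_pod_args,
    'namespace': "airflow",  # pin explicitly rather than inheriting via pod_args
}

with DAG(
        DAG_ID,
        default_args=default_args,
        start_date=datetime(2021, 5, 11),
        schedule_interval="30 0 * * *",
        max_active_runs=1,  # at most one active DAG run at a time
        catchup=False,
        description=DAG_DESCRIPTION,
) as dag:
    pre_deduplication = KubernetesPodOperator(
        cmds=["python3", "run_deduplication.py", "--item", "boards"],
        name="deduplication",
        task_id="deduplication-task",
        max_active_tis_per_dag=1,  # at most one running instance of this task
        image=IMAGE_SCHEDULER_IMAGES.format(DAG_IMAGE=DAG_IMAGE_PRE_DEDUPLICATION_TAG),
        container_resources=RESOURCES['medium'],
        **single_ns_pod_args,
    )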

Any insights or suggestions would be greatly appreciated!
