EMR Serverless is allocating about half of the memory to the executors compared to what we actually define in the Spark job

When I set the Spark executor memory to 12 GB, it actually allocates almost half of that, around 6.7 GB.

I tried setting 20 GB as well; then it allocates close to 11 GB, again roughly half.

I have defined sufficient application limits (cores, memory, and disk space), but I am still facing this strange issue.

It's not just with EMR Serverless; the same issue is there with regular EMR too.

The job is submitted from an Airflow DAG as follows:

from datetime import timedelta

from airflow.providers.amazon.aws.operators.emr import EmrServerlessStartJobOperator

# Submits the Spark job to the EMR Serverless application
job_task = EmrServerlessStartJobOperator(
            task_id=dag_id,
            application_id=emrAppId,
            execution_role_arn=roleId,
            execution_timeout=timedelta(hours=8),
            config={"name": dag_id, "executionTimeoutMinutes": 3600},
            job_driver={
                "sparkSubmit": {
                    "entryPoint": configuration['S3_DAGS_HDFS_JARS_PATH'] + "/app-jobs.jar",
                    "entryPointArguments": job_args,
                    "sparkSubmitParameters": "--class com.edifecs.em.cloud.ecf.MyMainClass" +
                    " --conf spark.jars=" + configuration['S3_DAGS_HDFS_JARS_PATH'] + "/hudi/hudi-utilities-bundle_*.jar" +
                    " --conf spark.executor.cores=4" +
                    " --conf spark.driver.cores=4" +
                    " --conf spark.executor.memory=12G" +
                    " --conf spark.driver.memory=8G" +
                    " --conf spark.dynamicAllocation.initialExecutors=4" +
                    " --conf spark.dynamicAllocation.minExecutors=2" +
                    " --conf spark.dynamicAllocation.maxExecutors=30"
                }
            },
        )
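
For reference, the sparkSubmitParameters string above evaluates to the following spark-submit flags (with the S3 jars path from configuration['S3_DAGS_HDFS_JARS_PATH'] shown as a placeholder):

    --class com.edifecs.em.cloud.ecf.MyMainClass
    --conf spark.jars=<S3_DAGS_HDFS_JARS_PATH>/hudi/hudi-utilities-bundle_*.jar
    --conf spark.executor.cores=4
    --conf spark.driver.cores=4
    --conf spark.executor.memory=12G
    --conf spark.driver.memory=8G
    --conf spark.dynamicAllocation.initialExecutors=4
    --conf spark.dynamicAllocation.minExecutors=2
    --conf spark.dynamicAllocation.maxExecutors=30

So the job explicitly requests 12G per executor, yet each executor comes up with roughly half of that.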

My Serverless application limits are: vCPUs: 130, maximum memory: 450 GB, disk: 650 GB.
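
As a sanity check on those limits (ignoring the per-executor memory overhead that EMR Serverless adds on top of spark.executor.memory), even the worst case with the maximum of 30 executors stays inside them:

    executors: 30 x 4 = 120 vCPUs, 30 x 12 GB = 360 GB memory
    driver:    4 vCPUs, 8 GB memory
    total:     124 vCPUs (< 130 limit), 368 GB (< 450 GB limit)

So the application-level limits should not be what is cutting the executor memory in half.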

Is there any specific reason for this?
