I have a question about EMR Serverless. I want to create a script that reads data from S3 and then uploads it to a DynamoDB table using EMR Serverless.
As I would on a normal EMR cluster, I want to use the package com.audienceproject:spark-dynamodb_2.12:1.1.1.
But when I set this package in the Spark properties, the step never finishes, and when I stop it manually no error appears; it looks like the package is never actually loaded. The role I'm using has dynamodb:* on * resources, and the Spark part of my code is:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("EMR_SERVERLESS") \
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
    .config("spark.executor.extraJavaOptions", "-XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=35") \
    .config("spark.sql.parquet.datetimeRebaseModeInRead", "CORRECTED") \
    .config("spark.sql.avro.datetimeRebaseModeInWrite", "CORRECTED") \
    .config("spark.jars.packages", "com.audienceproject:spark-dynamodb_2.12:1.1.1") \
    .config("yarn.nodemanager.vmem-check-enabled", "false") \
    .config("yarn.nodemanager.pmem-check-enabled", "false") \
    .getOrCreate()

# Read the CSV from S3 (the path is a placeholder)
df = spark.read.format("csv").option("header", "true").load("MYS3")

## TODO: code to process the file

# Write to DynamoDB through the spark-dynamodb data source
df.write.mode("append").option("tableName", "MYTABLE") \
    .option("targetCapacity", "0.99").option("region", "MYREGION") \
    .format("dynamodb").save()
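For reference, I start the job with boto3 along these lines (the application ID, role ARN, and S3 path below are placeholders, not my real values):

import boto3

emr = boto3.client("emr-serverless")
emr.start_job_run(
    applicationId="MY_APPLICATION_ID",
    executionRoleArn="MY_JOB_ROLE_ARN",
    jobDriver={
        "sparkSubmit": {
            # Entry point script in S3; the Spark properties above are set in the script itself
            "entryPoint": "s3://MYBUCKET/my_script.py"
        }
    },
)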
Can someone help me, please?
While it's impossible to diagnose this without access to your job logs, one thing worth checking is the package download itself: spark.jars.packages makes the driver fetch the artifact from Maven Central at startup, and an EMR Serverless application has no outbound internet access unless it is configured for VPC connectivity, so a job can hang indefinitely at exactly that point.
Independently of that, I would suggest using an alternative package.
com.audienceproject:spark-dynamodb_2.12:1.1.1
is archived and has not been updated in several years; EMR Serverless did not even exist at the time of its last release. My suggestion is to use the official AWS connector for DynamoDB and Spark, which is actively maintained:
https://github.com/awslabs/emr-dynamodb-connector
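Note that this connector works through the Hadoop InputFormat/OutputFormat API rather than a DataFrame source, so the code looks different from spark-dynamodb. A minimal read sketch, reusing your spark session (the table name and region are placeholders, and the connector jar must already be on the classpath, e.g. shipped with --jars at submit time):

# Connector configuration; table name and region are placeholders
conf = {
    "dynamodb.servicename": "dynamodb",
    "dynamodb.input.tableName": "MYTABLE",
    "dynamodb.regionid": "MYREGION",
    "mapred.input.format.class": "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat",
    "mapred.output.format.class": "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat",
}

# Each record comes back as a (Text, DynamoDBItemWritable) pair; the value is
# exposed to Python as a JSON-like string describing the item's attributes
items = spark.sparkContext.hadoopRDD(
    inputFormatClass="org.apache.hadoop.dynamodb.read.DynamoDBInputFormat",
    keyClass="org.apache.hadoop.io.Text",
    valueClass="org.apache.hadoop.dynamodb.DynamoDBItemWritable",
    conf=conf,
)

Writing goes through org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat, which is straightforward from Scala or Java but clumsy from PySpark. If you want to stay in pure Python, a common workaround that needs no extra jar at all is writing each partition with boto3's batch writer (again, table and region are placeholders; note that boto3 rejects Python floats, so numeric columns must be converted to Decimal first):

def write_partition(rows):
    import boto3  # create the client per partition; boto3 clients aren't serializable
    table = boto3.resource("dynamodb", region_name="MYREGION").Table("MYTABLE")
    with table.batch_writer() as batch:  # batches puts and retries unprocessed items
        for row in rows:
            batch.put_item(Item=row.asDict())

df.foreachPartition(write_partition)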