Dockerfile for Spark/Java application to execute via Spark Operator


I am trying to run a Spark/Java application on Kubernetes (via minikube) using the spark-operator. I am a bit confused about what I should put in the Dockerfile so that the application can be built into an image and executed via the spark-operator.

Sample spark-operator.yaml:

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: my-spark-app
  namespace: default
spec:
  type: Java
  mode: cluster
  image: docker/repo/my-spark-app-image
  mainApplicationFile: local:///opt/app/my-spark-app.jar
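
(For completeness: a manifest the operator will actually schedule typically also declares sparkVersion, a mainClass, and driver/executor sections. A sketch with illustrative values:)

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: my-spark-app
  namespace: default
spec:
  type: Java
  mode: cluster
  image: docker/repo/my-spark-app-image
  mainClass: com.example.MySparkApp        # illustrative entry-point class
  mainApplicationFile: local:///opt/app/my-spark-app.jar
  sparkVersion: "3.1.1"
  restartPolicy:
    type: Never
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark                   # needs rights to create executor pods
  executor:
    instances: 1
    cores: 1
    memory: 512m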

As shown above, the spark-operator YAML only requires the jar and the image location. So, do I just need the lines below in my Dockerfile? Is there a sample Dockerfile available that I can refer to?

Dockerfile:

FROM openjdk:11-jre-alpine

COPY target/*.jar /opt/app/csp_auxdb_refresh.jar
COPY src/main/resources/* /opt/app/

1 Answer

Answered by Sadik Bakiu:

In the Dockerfile you have provided, neither Spark nor any other dependencies are installed. To get started quickly, use gcr.io/spark-operator/spark:v3.1.1 as the base for your image, i.e. change the FROM statement to FROM gcr.io/spark-operator/spark:v3.1.1 and build again.
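
For example, a minimal Dockerfile along those lines might look like the sketch below (the jar name and resource paths are taken from the question; adjust the base-image tag to the Spark version you target):

FROM gcr.io/spark-operator/spark:v3.1.1

# The base image already ships Spark 3.1.1 and the operator's entrypoint,
# so only the application artifacts need to be added.
COPY target/*.jar /opt/app/csp_auxdb_refresh.jar
COPY src/main/resources/* /opt/app/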

There is a great guide on how to get started with the spark-operator in their GitHub repo.
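
Once the image is built and pushed, the application can be submitted and inspected with standard docker/kubectl commands (the image name below is the placeholder from the question):

# Build and push the application image
docker build -t docker/repo/my-spark-app-image .
docker push docker/repo/my-spark-app-image

# Submit the SparkApplication and check its status
kubectl apply -f spark-operator.yaml
kubectl get sparkapplication my-spark-app -n default
kubectl describe sparkapplication my-spark-app -n default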