I am trying to connect to Azure Cache for Redis from Databricks.
I have installed the package com.redislabs:spark-redis:2.3.0 from Maven in Databricks and created a Spark session with the code below:
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("myApp") \
    .config("spark.redis.host", "my host") \
    .config("spark.redis.port", "6379") \
    .config("spark.redis.auth", "passwd") \
    .getOrCreate()
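One thing worth checking for Azure Cache for Redis specifically: by default the cache accepts only TLS connections on port 6380, and the non-TLS port 6379 is disabled unless you explicitly enable it in the Azure portal. If that is the case for your cache, the spark-redis settings would look roughly like the sketch below (the host name and key are placeholders; spark.redis.ssl is the spark-redis property that turns on TLS):

```
spark.redis.host  mycache.redis.cache.windows.net   # placeholder host name
spark.redis.port  6380                              # Azure TLS port (6379 is non-TLS)
spark.redis.auth  <access-key>                      # your cache access key
spark.redis.ssl   true                              # required when using port 6380
```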
But when I run df.write.format("org.apache.spark.sql.redis").option("table", "people").option("key.column", "name").save(), I get the error below.
Py4JJavaError: An error occurred while calling o390.save.
: java.lang.ClassNotFoundException:
Failed to find data source: org.apache.spark.sql.redis. Please find packages at
http://spark.apache.org/third-party-projects.html
Could you please let me know the detailed steps to install all the necessary libraries/JARs to access Redis from Databricks?
I have seen the command below in the spark-redis Python docs, but I don't know how to run it in Databricks:
$ ./bin/pyspark --jars <path-to>/spark-redis-<version>-jar-with-dependencies.jar
Also, please let me know what the latest spark-redis version is.
The following notebook shows how to use Redis with Apache Spark in Azure Databricks.
For more details, refer to Azure Databricks - Redis.
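The ClassNotFoundException above usually means the spark-redis JAR is not on the classpath of the cluster running the notebook: the library must show as "Installed" on that specific cluster, and the cluster may need a restart after attaching it. Databricks has no --jars flag like the pyspark command shown in the question; the equivalent is attaching the library under the cluster's Libraries tab (Install New > Maven) with the spark-redis coordinate. One common cause of this particular error is a Scala/Spark version mismatch: the 2.x artifacts (such as the 2.3.0 coordinate installed here) were built for Scala 2.11 / Spark 2.4, while newer Databricks runtimes use Scala 2.12 / Spark 3.x. Worth verifying against the spark-redis releases page, but at the time of writing the coordinate for a Spark 3.x cluster would be something like:

```
com.redislabs:spark-redis_2.12:3.1.0
```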