How to connect to my Cloud Spanner database via JDBC in PySpark


I am trying to connect to my Spanner database. The connection succeeds, but reading from or writing to a table fails with a string-literal error.

As far as I can tell, the problem is "column_name" vs `column_name`: Spark quotes identifiers with double quotes, while Spanner expects backticks.

My question is: how can I create a custom JDBC dialect in PySpark?

The connection itself works; only reading and writing fail.
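If I understand the error correctly, the mismatch looks like this (a rough illustration I wrote myself, not Spark's actual code):

```python
# Sketch of the identifier-quoting mismatch (my assumption about the cause).
# Spark's default JDBC dialect wraps identifiers in double quotes, but
# Cloud Spanner's GoogleSQL only accepts backticks around identifiers.

def quote_default(name):
    # What Spark's generic dialect emits: "author"
    return '"' + name + '"'

def quote_spanner(name):
    # What Spanner expects: `author`
    return '`' + name + '`'

cols = ["author", "s_by", "dead"]
spark_sql = "INSERT INTO stories (%s)" % ", ".join(quote_default(c) for c in cols)
spanner_sql = "INSERT INTO stories (%s)" % ", ".join(quote_spanner(c) for c in cols)
print(spark_sql)    # INSERT INTO stories ("author", "s_by", "dead")  <- Spanner rejects this
print(spanner_sql)  # INSERT INTO stories (`author`, `s_by`, `dead`)  <- what Spanner wants
```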

So the issue is "" vs `` quoting. The code is:

from pyspark.sql import SparkSession
from google.cloud import spanner
from pyspark.sql.types import StructType, StructField, StringType, IntegerType
import os


OPERATION_TIMEOUT_SECONDS = 240
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "./credentials_dev.json"
credentials = "./credentials.json"
spark = SparkSession.builder.appName(
    "Submit local CSV to Spanner").getOrCreate()

spark.sparkContext.addFile("google-cloud-spanner-jdbc-2.8.0.jar")

jdbc_url = "jdbc:cloudspanner:/projects/dev-data/" + \
    "instances/spanner-test/databases/testdb?" + \
    "credentials=credentials_dev.json;autocommit=false"

driverClass = "com.google.cloud.spanner.jdbc.JdbcDriver"
connection_properties = {
    'driver': 'com.google.cloud.spanner.jdbc.JdbcDriver'
}


# ---------read operation-----------------------------------------------
# df = spark.read.jdbc(url=jdbc_url,table='stories', properties=connection_properties)
# print(df)
# df.show()


schema = StructType([
    StructField("author", StringType(), True),
    StructField("s_by", StringType(), True),
    StructField("dead", StringType(), True),
])

csv = 'smalldata.csv'

#reading local csv file--------

# inferSchema is ignored when an explicit schema is supplied
df1 = spark.read.schema(schema).csv("smalldata.csv", header=True)
# df1.show()

#write operation------------

# write.jdbc() already executes the write; chaining .save() after it raises
# AttributeError because write.jdbc() returns None
df1.write.jdbc(url=jdbc_url, table="stories", mode="append", properties=connection_properties)

PySpark version = 3.3.2

Cloud Spanner JDBC version = 2.8.0
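From what I understand, a Spark JdbcDialect has to be implemented on the JVM side (e.g. a small Scala class whose quoteIdentifier returns backticks), and the only PySpark-side step is registering a precompiled dialect through the py4j gateway. A rough sketch of what I think that step would look like, assuming a hypothetical compiled class com.example.SpannerDialect shipped in a jar on the driver classpath:

```python
# Hypothetical sketch: register a precompiled Spanner dialect from PySpark.
# Assumes a jar containing com.example.SpannerDialect (a Scala class extending
# org.apache.spark.sql.jdbc.JdbcDialect, overriding quoteIdentifier to use
# backticks) has been added via spark.jars / --jars. The class name is my
# placeholder, not a real library class.

def register_spanner_dialect(spark):
    """Register the JVM-side dialect through the py4j gateway."""
    jvm = spark.sparkContext._jvm
    jvm.org.apache.spark.sql.jdbc.JdbcDialects.registerDialect(
        jvm.com.example.SpannerDialect()  # hypothetical class name
    )
```

This would be called once, right after building the SparkSession and before any read/write JDBC call. Is that the right approach, or is there a pure-Python way?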
