count() function fails when reading data from Cassandra into pyspark dataframe


I am reading data from Cassandra as:

df = spark.read\
    .format("org.apache.spark.sql.cassandra")\
    .options(**configs)\
    .options(table=tablename, keyspace=keyspace)\
    .option("ssl", True)\
    .option("sslmode", "require")\
    .load()

Now this df is a pyspark DataFrame. I am able to call show() and printSchema() on it, but when I run

df.count()

it throws this error:

An error was encountered:
An error occurred while calling o1394.count.
: org.apache.spark.SparkException: Job aborted due to stage failure:
Task 19 in stage 48.0 failed 4 times, most recent failure: Lost task 19.3 in stage 48.0
(TID 2053, js-56258-63801-i-32-w-1.net, executor 9): java.lang.IllegalArgumentException:
requirement failed: Column not found in Java driver Row: count

How can I resolve this issue? Thanks in advance.


1 Answer

Answered by stevenlacerda:

I'm assuming it's not failing at the same stage every time. If that's the case, then you can try tuning the read/write parameters:

https://github.com/datastax/spark-cassandra-connector/blob/b2.4/doc/reference.md#read-tuning-parameters

https://github.com/datastax/spark-cassandra-connector/blob/b2.4/doc/reference.md#write-tuning-parameters

When you start pyspark, you'll need to pass these in as --conf spark.cassandra.<option> arguments.
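For example, here is a minimal sketch of setting a couple of read-tuning options, either on the pyspark command line or via the SparkSession builder. The option names follow the b2.4 reference page linked above (newer connector versions use different spellings), and the values, keyspace, and table names are illustrative placeholders, not recommendations:

    # A minimal sketch, assuming the spark-cassandra-connector 2.4 option names.
    # Values below are examples only -- tune them for your cluster.
    #
    # Equivalent command-line form when launching pyspark:
    #   pyspark --conf spark.cassandra.input.split.size_in_mb=32 \
    #           --conf spark.cassandra.input.fetch.size_in_rows=500
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("cassandra-count")
        # Smaller splits -> more, smaller Spark partitions per Cassandra token range
        .config("spark.cassandra.input.split.size_in_mb", "32")
        # Fewer rows fetched per round trip, reducing the load of each read
        .config("spark.cassandra.input.fetch.size_in_rows", "500")
        .getOrCreate()
    )

    # Placeholder keyspace/table -- substitute your own, plus the SSL options
    # from the question if your cluster requires them.
    df = (
        spark.read
        .format("org.apache.spark.sql.cassandra")
        .options(table="mytable", keyspace="mykeyspace")
        .load()
    )

    print(df.count())

Lowering the split size and fetch size makes each task read less data per round trip, which can help when count() fails on a few heavy partitions; check the linked reference page for the exact option names in your connector version.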