I know that there is some issue in the nth row of my Spark Scala DataFrame (say, the data type is not proper). When I try to write this DataFrame to Cassandra using Spark Structured Streaming, the write fails and the whole process stops there. In such a scenario, I want the erroneous record to be filtered out and inserted into some other DB, while the write to Cassandra continues for the rest of the records. I need this because until we identify and remove the erroneous record, the process doesn't move forward, which creates a huge lag on the Kafka producer side. Is it possible to identify and filter such a record? I am not able to find any solution for this on the internet.
Thanks,
So far I haven't found anything useful to try.
Found the solution for this. We can use foreachPartition with a Cassandra connection to write each record individually (and faster, since the connection is reused per partition) and separate out the erroneous records.
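For reference, here is a minimal sketch of that approach, assuming the DataStax spark-cassandra-connector is on the classpath. The Kafka broker/topic, the table my_keyspace.my_table(id, value), and the error handling are hypothetical placeholders to adapt to your own schema and side channel:

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import com.datastax.spark.connector.cql.CassandraConnector
import scala.util.{Failure, Success, Try}

val spark = SparkSession.builder().appName("cassandra-sink").getOrCreate()

// Serializable connection factory from the DataStax spark-cassandra-connector
val connector = CassandraConnector(spark.sparkContext.getConf)

// Placeholder streaming source -- replace with your actual Kafka stream and parsing
val inputStream: DataFrame = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // hypothetical broker
  .option("subscribe", "my_topic")                     // hypothetical topic
  .load()
  .selectExpr("CAST(key AS STRING) AS id", "CAST(value AS STRING) AS value")

def writeBatch(batch: DataFrame, batchId: Long): Unit = {
  batch.foreachPartition { (rows: Iterator[Row]) =>
    // One Cassandra session per partition, reused for every record in it
    connector.withSessionDo { session =>
      val stmt = session.prepare(
        "INSERT INTO my_keyspace.my_table (id, value) VALUES (?, ?)") // hypothetical table
      rows.foreach { row =>
        Try(session.execute(stmt.bind(row.getAs[AnyRef]("id"), row.getAs[AnyRef]("value")))) match {
          case Success(_) => // written to Cassandra
          case Failure(e) =>
            // Erroneous record: route it to your side channel here (another DB,
            // a dead-letter topic, ...) instead of failing the whole batch
            println(s"Skipping bad record $row: ${e.getMessage}")
        }
      }
    }
  }
}

// foreachBatch hands us a plain DataFrame per micro-batch, so the per-record
// try/catch above filters bad rows without aborting the streaming query
val query = inputStream.writeStream
  .foreachBatch(writeBatch _)
  .start()

query.awaitTermination()
```

The key point is that the per-record execute is wrapped individually, so one bad row only triggers the failure branch instead of killing the task, and preparing the statement once per partition keeps the writes fast.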