I have a status dataset like below:
I want to select all the rows from this dataset which have "FAILURE" in any of these 5 status columns.
So the result should contain only IDs 1, 2, and 4, as they have FAILURE in at least one of the Status columns.
In SQL, I think we could do something like this:
SELECT * FROM status WHERE 'FAILURE' IN (Status1, Status2, Status3, Status4, Status5);
In Spark, I know I can filter by comparing each Status column with "FAILURE":
status.filter(s => s.Status1 == "FAILURE" || s.Status2 == "FAILURE" /* ... and so on */)
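For illustration, the per-column comparison above can be collapsed by gathering the five fields into a sequence and checking membership once. This is only a minimal sketch in plain Scala: the `Status` case class, the `hasFailure` helper, the `StatusFilter` object, and the sample rows are all hypothetical and made up for the example, with field names taken from the column names mentioned above.

```scala
object StatusFilter {
  // Hypothetical row type; field names follow the columns named in the question.
  case class Status(id: Int, Status1: String, Status2: String, Status3: String,
                    Status4: String, Status5: String)

  // True when any of the five status fields equals "FAILURE".
  def hasFailure(s: Status): Boolean =
    Seq(s.Status1, s.Status2, s.Status3, s.Status4, s.Status5).contains("FAILURE")

  // Made-up sample rows, for illustration only (not the real dataset).
  val sample: Seq[Status] = Seq(
    Status(1, "SUCCESS", "FAILURE", "SUCCESS", "SUCCESS", "SUCCESS"),
    Status(2, "FAILURE", "SUCCESS", "SUCCESS", "SUCCESS", "SUCCESS"),
    Status(3, "SUCCESS", "SUCCESS", "SUCCESS", "SUCCESS", "SUCCESS"),
    Status(4, "SUCCESS", "SUCCESS", "SUCCESS", "SUCCESS", "FAILURE")
  )

  // Keep only the rows with at least one FAILURE and collect their IDs.
  val failedIds: Seq[Int] = sample.filter(hasFailure).map(_.id)
}
```

Assuming status is a typed Dataset of such rows, the same predicate could presumably be passed as status.filter(hasFailure _). With untyped columns, a comparable form would be to build one condition per column and combine them, e.g. Seq("Status1", "Status2", "Status3", "Status4", "Status5").map(c => col(c) === "FAILURE").reduce(_ || _), where col comes from org.apache.spark.sql.functions.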
But I would like to know if there is a smarter way of doing this in Spark SQL.
Thanks in advance!