Multiple filter conditions in Scala, with IN and NOT IN clauses

I am trying to apply a filter similar to the one below using Scala:

where col1 = 'abc' and col2 not in (0,4) and col3 in (1,2,3,4)

I tried writing something like this

val finalDf: DataFrame = 
    initDf.filter(col("col1") ="abc")
          .filter(col("col2") <> 0)
          .filter(col("col2") <> 4)
          .filter(col("col3") = 1 ||col("col3") = 2 ||col("col3") = 3 ||col("col3") = 4)

or

val finalDf: DataFrame = 
     initDf.filter(col("col1") ="abc") 
     && col("col2") != 0 && col("col2") != 4 
     && (col("col3") = 1 
     || col("col3") = 2 
     || col("col3") = 3 
     || col("col3") = 4))

Neither seems to work. Can anyone help me with this?

There is 1 answer

Answer by M_S (best answer)

The comparison operators for Column objects (what col returns) are a little different from plain Scala or SQL.

For equality use ===

For inequality use =!= (Column has no <> operator, and != returns a plain Boolean rather than a Column)

If you want to compare against literal values, you can wrap them with the lit function

Your example may look like this

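import org.apache.spark.sql.functions.{col, lit}

// initDf is the DataFrame from the question
val finalDf = initDf
  .filter(col("col1") === lit("abc"))
  .filter(col("col2") =!= lit(0))
  .filter(col("col2") =!= lit(4))
  .filter(col("col3") === lit(1) || col("col3") === lit(2) || col("col3") === lit(3) || col("col3") === lit(4))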

You can also use isin instead of the filter with multiple ORs
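
As a rough sketch of the isin approach, the whole WHERE clause from the question could be written as a single filter. The object name, the keepMatching helper, the local SparkSession and the sample rows are all made up for illustration; only the column names come from the question. Negating isin with ! gives the NOT IN part.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object IsinFilterExample {
  // Hypothetical helper, roughly equivalent to:
  //   where col1 = 'abc' and col2 not in (0,4) and col3 in (1,2,3,4)
  def keepMatching(df: DataFrame): DataFrame =
    df.filter(
      col("col1") === "abc" &&          // === accepts a plain value, lit is optional here
        !col("col2").isin(0, 4) &&      // NOT IN (0, 4)
        col("col3").isin(1, 2, 3, 4)    // IN (1, 2, 3, 4)
    )

  def main(args: Array[String]): Unit = {
    // Local session, assumed only for the sake of a runnable example
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("isin-filter-example")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data using the question's column names
    val initDf: DataFrame = Seq(
      ("abc", 1, 1), // kept
      ("abc", 0, 2), // dropped: col2 is 0
      ("abc", 4, 3), // dropped: col2 is 4
      ("abc", 2, 5), // dropped: col3 is not in (1,2,3,4)
      ("xyz", 1, 1)  // dropped: col1 is not 'abc'
    ).toDF("col1", "col2", "col3")

    val finalDf: DataFrame = keepMatching(initDf)
    finalDf.show()

    spark.stop()
  }
}
```

If the allowed values already live in a Scala collection, they can be passed as varargs, for example col("col3").isin(values: _*).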

If you want to find out more about operators for columns, you can read these:

Medium blog post part1

Medium blog post part2