Unable to pass variable to Deequ Checks


I am trying to implement a Deequ check: the number of distinct date_start values should match the number of days between 2018-01-01 and $runDate.

Here is what I do. First, calculate the date diff:

val min_dt = LocalDate.of(2018, 1, 1)
// Adjust max_dt to account for the Airflow daily DAG run_date vs. the hourly run_date
// Also account for Days.between being exclusive of the end date
val max_dt: LocalDate = if (check_sched == "daily") runDate.plusDays(2) else runDate.plusDays(1)
val expected_count: Long = min_dt.toEpochDay() - max_dt.toEpochDay()

Then add a Check for hasSize:

val assert_func = (size:Long) => size == expected_count

val basic_checks: Check =
        Check(CheckLevel.Error, s"date_start distinct values should match number of days between 2018-01-01 and $runDate")
            .hasSize(assert_func,
                Some(s"date_start distinct values should match number of days between 2018-01-01 and $runDate")
            )

But the check fails.

Now, if I just use a hard-coded value in place of expected_count, the check passes:

val assert_func = (size:Long) => size == 1949
val basic_checks: Check =
        Check(CheckLevel.Error, s"date_start distinct values should match number of days between 2018-01-01 and $runDate")
            .hasSize(assert_func,
                Some(s"date_start distinct values should match number of days between 2018-01-01 and $runDate")
            )

I am not sure why the value of expected_count is not getting resolved here. The Deequ hasSize check is defined as follows: https://github.com/awslabs/deequ/blob/ea52006fa7c8754459afedeec65ebae6c0074018/src/main/scala/com/amazon/deequ/checks/Check.scala#L112
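
One thing that may be worth ruling out is the sign of expected_count rather than how the closure captures it. The snippet above subtracts the later date from the earlier one (min_dt minus max_dt), which gives a negative number, and a row count can never equal a negative value. The following is only a minimal sketch of that arithmetic, assuming runDate is a java.time.LocalDate; the run date used here is hypothetical, picked so that the positive difference happens to equal the hard-coded 1949, and the object name is made up for illustration. It does not call Deequ at all.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

object ExpectedCountSketch {
  // Hypothetical run date, chosen only for illustration (not taken from the question)
  val runDate: LocalDate = LocalDate.of(2023, 5, 3)

  val minDt: LocalDate = LocalDate.of(2018, 1, 1)
  val maxDt: LocalDate = runDate.plusDays(1)

  // Same operand order as in the question: earlier minus later, so the result is negative
  val asWritten: Long = minDt.toEpochDay - maxDt.toEpochDay

  // Later minus earlier gives the positive number of days
  val reversed: Long = maxDt.toEpochDay - minDt.toEpochDay

  // Equivalent positive result; the end date is exclusive
  val viaChronoUnit: Long = ChronoUnit.DAYS.between(minDt, maxDt)

  // Same shape as the assertion passed to hasSize in the question
  val assertFunc: Long => Boolean = (size: Long) => size == asWritten

  def main(args: Array[String]): Unit = {
    println(s"asWritten=$asWritten reversed=$reversed viaChronoUnit=$viaChronoUnit")
    // A non-negative row count can never match a negative expected value
    println(s"assertFunc(reversed) = ${assertFunc(reversed)}")
  }
}
```

If printing expected_count in the real job shows a negative value, swapping the operands (or using ChronoUnit.DAYS.between) would be the first thing to try; if it is already positive, the cause lies elsewhere.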
