I have a where clause in Spark SQL that for some reason doesn't return any records. I think it doesn't work and so I would like to ask what would be equivalent to it?
SELECT
c.client_id,
current_date() as insert_date
FROM
CLIENT_SUB c
WHERE
(c.client_id, insert_date) not in (SELECT client_id, insert_date from CLIENT_SUBSCRIBER_CONTRACT)
I heard I could do it with a join
I would recommend
not exists
: it isnull
-safe, whilenot it
isn't - and it usually scales better, too.I am also suspiscious about the reference to
insert_date
: do you really mean that, or you actually wantcurrent_date()
?For performance, consider an index on
client_subscriber_contract(client_id, insert_date)
.