I'm working with an incremental dataset. Sometimes my input switches from an update transaction to a snapshot transaction. I want to know the type of my input transaction for my build because my transformation depends on the input type.
I used count() on the dataframe, which works, but it's not optimized and it's not the best way. ctx.is_incremental is about the current job and not the input.
You should use the is_incremental boolean. Something like:
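A minimal sketch of that branching, not tested on Foundry. Assumptions on my side: run_kind is a hypothetical helper name, SimpleNamespace only stands in for the real transform context so the snippet runs anywhere, and the dataset paths in the comment are placeholders.

```python
from types import SimpleNamespace


def run_kind(ctx):
    """Return a label for how the current build is running.

    `ctx` is any object exposing a boolean `is_incremental` attribute,
    as the transform context does inside an incremental transform.
    """
    return "incremental" if ctx.is_incremental else "snapshot"


# Inside a Foundry transform the wiring would look roughly like this
# (not executed here; the paths are placeholders):
#
#   @incremental()
#   @transform(out=Output("/path/to/output"), src=Input("/path/to/input"))
#   def compute(ctx, src, out):
#       if ctx.is_incremental:
#           ...  # logic for an incremental run
#       else:
#           ...  # logic for a full (snapshot) run

# Stand-ins for the real context, so the sketch runs outside Foundry.
print(run_kind(SimpleNamespace(is_incremental=True)))
print(run_kind(SimpleNamespace(is_incremental=False)))
```

Keep in mind this tells you how the job itself runs, which is the limitation raised in the question.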
See https://www.palantir.com/docs/foundry/transforms-python/incremental-reference/index.html#important-information
--- To answer the comment:
Your current transform will snapshot if your upstream dataset snapshots, regardless of schema changes. Your current transform won't snapshot if this particular input/upstream is whitelisted in the "snapshot_inputs" of your incremental decorator.
That being said, your question is valid if your input is in the snapshot_inputs list AND it sometimes runs as a "snapshot" and sometimes as an "update" AND you need to differentiate in the code what happens with each type.
In Java transforms, you can use modificationType(), but I don't think there is an equivalent method in Python.
You can however read your dataframe "as you want" (only the incremental part, or the whole dataset) with the read modes, which should be enough to cover most (all?) use cases.
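To illustrate what the read modes give you, here is a small stand-alone sketch. FakeIncrementalInput is a hypothetical stand-in I made up so the example runs outside Foundry (it is not the real input class, and plain lists replace dataframes); the mode names "added", "current" and "previous" are the ones described in the incremental reference.

```python
class FakeIncrementalInput:
    """Hypothetical stand-in for an incremental transform input."""

    def __init__(self, previous_rows, added_rows):
        self._previous = list(previous_rows)
        self._added = list(added_rows)

    def dataframe(self, mode="added"):
        # "previous": rows the last build already saw
        # "added":    rows appended since the last build
        # "current":  the full input as of this build
        if mode == "previous":
            return list(self._previous)
        if mode == "added":
            return list(self._added)
        if mode == "current":
            return self._previous + self._added
        raise ValueError(f"unknown read mode: {mode!r}")


src = FakeIncrementalInput(previous_rows=[1, 2], added_rows=[3])
new_rows = src.dataframe("added")    # only the incremental part
all_rows = src.dataframe("current")  # the whole dataset
print(new_rows, all_rows)
```

In a real transform the call would have the same shape, e.g. src.dataframe("current") when you want the whole dataset regardless of how the build runs.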
Could you describe what you are trying to achieve, and why, so I can help you further?