Get type of transaction for input dataset

I'm working with an incremental dataset. Sometimes my input switches from an update transaction to a snapshot transaction. I want to know the type of my input transaction for my build because my transformation depends on the input type.

I currently call count() on the dataframe to detect this. It works, but it is inefficient and not a clean solution. ctx.is_incremental describes the current job, not the input.

1 Answer

ZettaP

You should use the is_incremental boolean.

Something like:

def compute(ctx, ...):
    if ctx.is_incremental:
        # Your logic if the build runs incrementally
        pass
    else:
        # Your logic if the build runs as a snapshot
        pass

See https://www.palantir.com/docs/foundry/transforms-python/incremental-reference/index.html#important-information

--- In answer to the comment

Your current transform will run as a snapshot whenever your upstream dataset snapshots, regardless of schema changes. It won't snapshot if that particular upstream input is listed in the "snapshot_inputs" of your incremental decorator.

That being said, your question is valid if your input is in the snapshot_inputs list AND it sometimes runs as a snapshot and sometimes as an update AND you need to differentiate in the code what happens with each type.

In Java transforms, you can use the modificationType() but I don't think there is an equivalent method in python.

You can, however, read your dataframe whichever way you need (only the incremental part, or the whole dataset) using the read modes, which should be enough to cover most (all?) use cases.
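As a rough sketch of the read-mode approach, the helper below picks which rows to read based on how the current build runs. It assumes `source` is an incremental transform input whose `dataframe()` accepts a read mode such as "added" or "current", and that `ctx` exposes `is_incremental`; `read_input` is a hypothetical name, not part of the Foundry API. Note that it still reacts to how the current job runs, not to the transaction type of the input itself.

```python
def read_input(ctx, source):
    """Return the rows this build should process.

    Assumption: `source.dataframe(mode)` accepts the read modes
    "added" and "current", and `ctx.is_incremental` is a boolean.
    """
    if ctx.is_incremental:
        # Incremental run: only the rows added since the last build.
        return source.dataframe("added")
    # Snapshot run: the full current view of the input.
    return source.dataframe("current")
```

Inside a transform you would call something like `df = read_input(ctx, my_input)` at the top of the compute function, where `my_input` is whatever name your input parameter has.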

Could you describe what you are trying to achieve, and why, so that we can help further?