My current company is ingesting data from S3 to Snowflake using Informatica. After the data ingestion job is completed, we are adding a new entry to table called "INFORMATICA_JOBS" that contains the schema and table name, the number of rows processed and the timestamp.
INFORMATICA_JOBS table in snowflake:
| table | rows | timestamp |
|---|---|---|
| schema_A.table1 | 123 | 2023-06-06 13:02 |
I need to monitor this table and if the "rows" column is 0 for any of the tables then I need to login to Informatica and manually run the mapping task and see if the table has been populated. I have been trying to automate this and what I came up with is that - I can create a TASK in snowflake that returns the count of rows where the "rows" column is 0 i.e.
CREATE OR REPLACE TASK CheckTableForZeroRowsTask
WAREHOUSE = warehouse_name
SCHEDULE = 'DAILY'
AS
USE ROLE dev_role;
USE WAREHOUSE warehouse_name;
SELECT COUNT(*) FROM schema_A.table_name WHERE rows=0;
But I am not sure how to trigger the Informatica mapping task to run if the above Snowflake task says that the "rows" column does have 0.
This task you've created runs
DAILY.It's not possible to make API calls from within Snowflake Task - as per the Snowflake docs they are used to manipulate data:
You should consider creating Informatica Taskflow, schedule it to run daily, run the query
SELECT COUNT(*) FROM schema_A.table_name WHERE rows=0;and for each resulting row call appropriate taskflow API to run them.