I want to bulk load data into a Snowflake warehouse using pandas. The full requirement is below:
- I have source data in a Snowflake table, which I read into a DataFrame.
- After loading the data into the DataFrame, I transform it with some pandas functions.
- After these changes I need to load the data back into Snowflake.
Data size: ~200k records
Things I have tried:
- First I wrote a for loop that built and executed an INSERT statement for each row. This script ran for ~4 hours and loaded only ~9k records, so it is not a viable option.
- Then I built the whole INSERT query up front, before executing it on the database. This approach also failed and took about as long as the first.
- I tried parallel processing and also batched the data. Still no luck.
- Finally I tried the COPY INTO approach, and it works.
But I do not want to use COPY INTO, as it is Snowflake-specific.
Please help me bulk load the data using Python.
Try the snowflake-connector-python library. It provides the write_pandas helper (in snowflake.connector.pandas_tools) to efficiently bulk load a DataFrame: it stages the frame and loads it server-side in chunks, so you never hand-build INSERT statements.
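A minimal sketch of that path. The connection parameters and table name are placeholders you must fill in, and the imports are deferred into the function so the snippet stands alone even where the connector is not installed:

```python
import pandas as pd

def bulk_load_to_snowflake(df: pd.DataFrame, table_name: str, **connect_kwargs):
    """Bulk load df into table_name via write_pandas.

    connect_kwargs are the usual account/user/password/database/schema
    parameters -- placeholders here, supply your own.
    """
    # Imported inside the function so the sketch can be read/run even
    # where snowflake-connector-python is not installed.
    import snowflake.connector
    from snowflake.connector.pandas_tools import write_pandas

    conn = snowflake.connector.connect(**connect_kwargs)
    try:
        # write_pandas stages the frame and loads it server-side in
        # chunks -- far faster than row-by-row INSERT statements.
        success, num_chunks, num_rows, _ = write_pandas(
            conn, df, table_name=table_name
        )
        return success, num_rows
    finally:
        conn.close()
```

Call it as `bulk_load_to_snowflake(df, "TARGET_TABLE", account="...", user="...", ...)`. Note that write_pandas treats the table name as case-sensitive, so pass it exactly as it exists in Snowflake.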
This uses the connector's built-in bulk-loading path and should be dramatically faster than inserting rows one at a time. One caveat: under the hood it still issues a COPY INTO, so it is Snowflake-specific. If you genuinely need a portable pattern, batch your rows through the standard DB-API executemany call instead.
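A sketch of the portable batched-insert pattern, using sqlite3 purely as a stand-in connection (the table, columns, and batch size are made up for illustration). The same cursor.executemany call works against a Snowflake cursor; with snowflake-connector-python you would set snowflake.connector.paramstyle = "qmark" to keep the ? placeholders, or switch them to %s for the default pyformat style:

```python
import sqlite3
import pandas as pd

# Small frame standing in for the transformed ~200k-row DataFrame
df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# sqlite3 stands in for snowflake.connector.connect(...); everything
# below is plain DB-API and therefore portable across databases.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE target (id INTEGER, name TEXT)")

# Convert the frame to plain tuples once, then insert in batches so
# each round trip carries many rows instead of one.
rows = list(df.itertuples(index=False, name=None))
batch_size = 10_000  # tune to your warehouse / network round-trip cost
for start in range(0, len(rows), batch_size):
    cur.executemany(
        "INSERT INTO target (id, name) VALUES (?, ?)",
        rows[start:start + batch_size],
    )
conn.commit()

print(cur.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # → 3
```

Batching is what the row-by-row loop in the question was missing: one statement executed with thousands of parameter sets amortizes the per-statement overhead that made the 4-hour run so slow.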