Snowflake warehouse cache


I cannot find this information anywhere in the documentation. I would like to understand how the warehouse cache behaves in the following cases:

  • The warehouse scales out: does a query running on the newly added server use the cache from the first server? When scaling, is the cache duplicated and then kept in sync?
  • The warehouse scales down while the cache holds a lot of data: is the cache partially truncated (because of the smaller hardware)?
  • The warehouse is turned off and back on: can the cache be restored? If so, is it possible to estimate the chance?

Thanks in advance for the information.

There are 2 answers

Mike Walton (best answer)

To answer the questions asked directly:

  • No. Each cluster in a multi-cluster warehouse maintains its own cache. When more than one cluster is running, however, the Snowflake services will attempt to execute the query on the cluster that holds the best cache for that query.
  • When scaling down, you lose nodes of the warehouse, and the cache for those nodes will also be lost.
  • If you suspend a warehouse, you lose the cache.
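
The first point can be pictured with a toy model. This is only a sketch of the idea, not how Snowflake is implemented: the `Cluster` class, the `pick_cluster` and `run_query` helpers, and the "largest overlap wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One cluster of a multi-cluster warehouse with its own private cache."""
    name: str
    cache: set = field(default_factory=set)  # IDs of cached data files


def pick_cluster(clusters, files_needed):
    """Assumed routing rule: prefer the cluster caching the most needed files."""
    return max(clusters, key=lambda c: len(c.cache & files_needed))


def run_query(clusters, files_needed):
    """Run on the chosen cluster; only that cluster's cache gets warmed."""
    target = pick_cluster(clusters, files_needed)
    hits = len(target.cache & files_needed)
    target.cache |= files_needed  # misses are fetched and cached locally
    return target.name, hits


# c1 is warm, c2 was just added by scaling out and starts empty.
clusters = [Cluster("c1", {"f1", "f2", "f3"}), Cluster("c2")]
result = run_query(clusters, {"f2", "f3", "f4"})
print(result)             # ('c1', 2)
print(clusters[1].cache)  # set()
```

In this model nothing is ever copied between clusters: the new cluster stays cold until queries are routed to it.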
Rajib Deb

I think you are talking about the data cache (also called the SSD cache or local disk cache). Think about it like this: a warehouse is a cluster of nodes, and these nodes are simply compute instances of the underlying cloud provider. On AWS, for example, the nodes are EC2 instances. Each instance has an SSD attached, and the SSD caches some or all of the table data when a query retrieves it from remote storage (S3 in the case of AWS). This cache is available for as long as the warehouse is active, and during that time any query that needs data from the same table can read it from the SSD cache. If the warehouse suspends, however, you may not get the same compute nodes when it resumes, so you may lose the data cache completely.
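
To make the read path concrete, here is a minimal sketch of a read-through cache in front of remote storage. The `Warehouse` class and its method names are hypothetical, and the model assumes the worst case on resume (fresh nodes, so an empty cache); it is not Snowflake's actual code.

```python
class Warehouse:
    """Toy warehouse whose local SSD cache sits in front of remote storage."""

    def __init__(self, remote):
        self.remote = remote     # stand-in for S3: file ID -> contents
        self.ssd_cache = {}      # local cache, only useful while running
        self.remote_reads = 0    # counts trips to remote storage

    def read(self, file_id):
        if file_id not in self.ssd_cache:
            # Cache miss: fetch from remote storage and keep a local copy.
            self.remote_reads += 1
            self.ssd_cache[file_id] = self.remote[file_id]
        return self.ssd_cache[file_id]

    def suspend_and_resume(self):
        # Assumption: the resumed warehouse gets different nodes, so it starts cold.
        self.ssd_cache = {}


wh = Warehouse({"f1": "rows-a", "f2": "rows-b"})
wh.read("f1")
wh.read("f1")            # second read is served from the SSD cache
print(wh.remote_reads)   # 1
wh.suspend_and_resume()
wh.read("f1")            # cold again, so this goes back to remote storage
print(wh.remote_reads)   # 2
```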