How does run queue work in Snowflake? Is there a concept timeslice at all?

2.9k views Asked by At

I am a newbie to Snowflake and documentation is not clear.

  1. Say I use a Large warehouse with 5 max concurrent queries
  2. There are 5 users who fire heavy duty queries which may take many minutes to finish
  3. The 6th user has a simple query to execute
  4. Does the processes running those 5 queries yield at any point in time or do they run to completion?
  5. Will the 6th user have to wait till the timeout limit is reached and attempt using a different Virtual Warehouse

Thanks!

2

There are 2 answers

0
Simon D On

The queue is a first-in-first-out queue like most (all?) other databases. If a query is queued because other queries are consuming all resources of the cluster then it'll have to wait until the other queries are finished (or timeout) before it can run. Snowflake won't pause a query that is running to "sneak in" a smaller query.

You could always resize the warehouse though to push through the query. Here is a good line from the documentation:

Single-cluster or multi-cluster (in Maximized mode): Statements are queued until already-allocated resources are freed or additional resources are provisioned, which can be accomplished by increasing the size of the warehouse.

1
Rajib Deb On

This is actually a good question to ask and understanding how this works in snowflake will help you use snowflake more optimally. As you already know snowflake uses virtual warehouses for compute which are nothing but cluster of compute nodes. Each node has 8 cores. So, when you submit a query to a virtual warehouse, each query is being processed by one or more core(based on if the query can be parallelized). So, if the virtual warehouse does not have any core to execute the 6th query, it will queue up. If you logon to snowflake UI and click on the warehouse tab, you will see this queueing through the yellow color on the bars. You can also see it under 'QUEUED_OVERLOAD_TIME' if you query the QUERY_HISTORY view.

Now, this is not a good thing for queries to queue up consistently. So, the best practice is to have a multi warehouse strategy. Give every unique group of workload a dedicated warehouse so that you can scale them horizontally/vertically based on the query load of the given workload.