ColumnStore index benefits on Azure?

We are currently running on Azure and have a table with hundreds of millions of rows. The table is static and will be refreshed weekly. We've looked at the ColumnStore index, but unfortunately it is not available on Azure yet, so here are my questions:

  • Will ColumnStore index be available in Azure?
  • If not, what other technology can we use to get the same performance benefits that a ColumnStore index would provide?
  • Can we get the same query performance by using Azure Table Storage?

I'm a newbie to both Azure and columnar databases, so please bear with me if these questions are basic. :)

There are 2 answers

0
Amar Palsapure On
  • About ColumnStore: if you have bought the license, you can check with the development team or ask on blogs such as ScottGu's Blog. That is where you will find out about any feature release.
  • Azure Table Storage is designed for scalability. You will need to choose the Partition Key very carefully. The Partition Key is like the index of a book: if you want to find something in the book, you look it up in the index and go straight to the page. In other words, you can group data by some criterion and store each group in a single partition, so that whenever a query filters on that criterion it hits only one partition. A table can have any number of partitions, but the partitions are not guaranteed to reside on the same machine or even in the same farm. So when you run a query against a badly designed Azure Table, it can hit more than one server, and performance suffers. Read about Real World: Designing a Scalable Partitioning Strategy for Windows Azure Table Storage
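
To make the partition idea concrete, here is a minimal in-memory sketch in Python. It does not use the Azure SDK; `ToyTable`, the week-based partition keys, and the `amount` field are all made up for illustration. It only shows how a query that names its partition touches one group of rows, while a query without a partition key has to visit every partition.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a partitioned table (hypothetical, not the Azure SDK)."""

    def __init__(self):
        # PartitionKey -> {RowKey: entity}
        self.partitions = defaultdict(dict)

    def insert(self, partition_key, row_key, entity):
        self.partitions[partition_key][row_key] = entity

    def query_partition(self, partition_key, predicate):
        """Filter rows in one partition; returns (hits, partitions touched)."""
        rows = self.partitions.get(partition_key, {})
        return [e for e in rows.values() if predicate(e)], 1

    def query_all(self, predicate):
        """Filter rows across every partition; returns (hits, partitions touched)."""
        hits = [
            e
            for rows in self.partitions.values()
            for e in rows.values()
            if predicate(e)
        ]
        return hits, len(self.partitions)


table = ToyTable()
# Assumed layout: one partition per weekly refresh, four rows each.
for week in ("2012-W01", "2012-W02", "2012-W03"):
    for i in range(4):
        table.insert(week, f"row-{i}", {"week": week, "amount": i * 10})


def is_large(entity):
    return entity["amount"] >= 20


hits_one, touched_one = table.query_partition("2012-W02", is_large)
hits_all, touched_all = table.query_all(is_large)
print(len(hits_one), "hits from", touched_one, "partition")
print(len(hits_all), "hits from", touched_all, "partitions")
```

In a real table the partitions may sit on different servers, so the second query is the one that gets slow as the data grows.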

Hope you get what you are looking for.

0
BrentDaCodeMonkey On

As Amar pointed out, keep an eye on the team blogs for the latest new feature announcements. The goal for SQL Azure is for it to eventually be where new features appear first. However, it will still take a while for things to get there.

As for your performance question, there's no simple answer. Windows Azure resources are designed for scale, not necessarily high performance, so it's important to take your scale/capacity targets into account when designing solutions. For your situation, I would encourage you to consider table storage, but this will depend on how frequently the data is accessed and the types of queries you need to run against it. Just do not be surprised if you have to keep redundant copies of your data that are modelled differently, or possibly even run parallel queries and aggregate the results. This is the way table storage was designed to be used. It's cheaper than SQL Azure, and it's this price difference that makes redundant, specialized data models possible.
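
As a rough illustration of redundant copies and parallel queries, here is a small Python sketch. It is an in-memory toy, not Azure SDK code; the `records`, the `region`/`week` fields, and the helper names are all assumptions for the example. The same rows are stored twice, keyed two different ways, and a cross-partition total is computed by querying partitions in parallel and summing the partial results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data standing in for the real table.
records = [
    {"region": "EU", "week": "W01", "amount": 10},
    {"region": "EU", "week": "W02", "amount": 20},
    {"region": "US", "week": "W01", "amount": 30},
    {"region": "US", "week": "W02", "amount": 40},
]


def build_copy(rows, key_field):
    """Store the rows partitioned by key_field (one redundant copy per query shape)."""
    copy = defaultdict(list)
    for row in rows:
        copy[row[key_field]].append(row)
    return copy


# Two copies of the same data, each modelled for a different kind of query.
by_region = build_copy(records, "region")
by_week = build_copy(records, "week")


def partition_total(partition_rows):
    """Aggregate within a single partition."""
    return sum(r["amount"] for r in partition_rows)


def parallel_total(copy, partition_keys):
    """Query several partitions concurrently, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = pool.map(partition_total, (copy[k] for k in partition_keys))
        return sum(partials)


eu_total = partition_total(by_region["EU"])    # served by the region copy
w01_total = partition_total(by_week["W01"])    # served by the week copy
grand_total = parallel_total(by_week, ["W01", "W02"])
print(eu_total, w01_total, grand_total)
```

Each single-partition query goes to the copy whose key matches the filter; only the grand total needs the fan-out and aggregate step.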

This approach also has to be weighed against the cost of retraining your developers to stop thinking in RDBMS terms. :)