Why are Solr's logs time series stored in different collections based on time instead of different shards based on time

Question

Asked by Shishir Choudhary on 21 December 2016 at 07:10

If you look at Lucidworks' Time Based Partitioning or Large Scale Log Analytics with Solr, multiple Solr "collections" are created, partitioned by time.

My questions are:

Why not just create multiple shards based on time in such cases?
With multiple collections, how would a query spanning several collections (time ranges) be done?

There is 1 answer

**Toke Eskildsen** · Accepted Answer · 2016-12-21T11:32:37+00:00

There is not much difference between multiple shards with implicit routing and multiple collections. When you issue a query, you can (optionally) specify which shards or which collections to search.
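As a rough illustration of restricting a search to specific collections or shards, the sketch below only builds the request URL and sends nothing. It assumes a SolrCloud node at `localhost:8983` and hypothetical monthly collections named `logs_2016_11` and `logs_2016_12`. It relies on the `collection` and `shards` request parameters of the select handler. The helper name `build_select_url` is made up for this example.

```python
from urllib.parse import urlencode

# Assumed base URL of a SolrCloud node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(entry_collection, query, collections=None, shards=None):
    """Build a select URL limited to the given collections and/or shards.

    entry_collection: collection that receives the request.
    collections: optional list of collection names to search.
    shards: optional list of logical shard names (implicit routing).
    """
    params = {"q": query}
    if collections:
        params["collection"] = ",".join(collections)
    if shards:
        params["shards"] = ",".join(shards)
    # Keep ',' and ':' unescaped so the URL stays readable.
    return f"{SOLR_BASE}/{entry_collection}/select?{urlencode(params, safe=',:')}"


# Hypothetical example: errors from November and December 2016.
url = build_select_url(
    "logs_2016_12",
    "level:ERROR",
    collections=["logs_2016_11", "logs_2016_12"],
)
print(url)
```

The same helper covers the time-sharded layout: pass `shards=[...]` with the shard names of a single collection instead of `collections`.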

Alternatively, you can set up an alias containing multiple collections, thus hiding the logistics from the search client. This makes it easy to create custom views over the full data set, such as an alias for each year, one for everything and one for the last quarter. If you later decide to slice your data differently, e.g. make a collection for each week instead of each month, this change will be transparent to the client application. Aliases do not work for shards, so that is one reason to prefer collections.
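A minimal sketch of the alias approach, under the same assumptions (a node at `localhost:8983`, hypothetical monthly collections): it builds a Collections API `CREATEALIAS` request URL for a "last quarter" view and, again, sends nothing. The helper name and the alias name `logs_2016_q4` are illustrative only.

```python
from urllib.parse import urlencode

# Assumed base URL of a SolrCloud node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def build_create_alias_url(alias, collections):
    """Build a Collections API URL that points an alias at several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    # Keep ',' unescaped so the collection list stays readable.
    return f"{SOLR_BASE}/admin/collections?{urlencode(params, safe=',')}"


# Hypothetical "last quarter" view over three monthly collections.
print(build_create_alias_url(
    "logs_2016_q4",
    ["logs_2016_10", "logs_2016_11", "logs_2016_12"],
))
```

Clients would then query `/solr/logs_2016_q4/select` as if it were a single collection. Re-issuing `CREATEALIAS` with the same name and a different collection list repoints the alias, so client code does not need to change.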
