Set the right partitions for Crate Database

Question

Set the right partitions for Crate Database

263 views Asked by duichwer At 13 December 2016 at 15:22

I am modelling for the Database CrateDB.
I have an avg. of 400 customers and the produce different amounts of time-series data every day. (Between 5K and 500K; avg. ~15K)

Later I should be able to query per customer_year_month and per customer_year_calendar_week. That means that I will only query for the intervals:

week
and month

Now I'am asking myself how to partition this table?

I would partion per customer and year.
Does this make sense?
Or would it be better to partion by customer, year and month?

Original Q&A

There are 1 answers

**claus** · Accepted Answer · 2016-12-14T15:35:25+00:00

so the question of partitioning a table is quite complex and should consider a lot of things. Among others:

What queries should be run?
The way the data is inserted
Available hardware resources
Cluster size

Essentially, each partition also creates overhead by multiplying the shard count (a partition can be considered a "sub-table" based on a column value), which - if chosen improperly - can hinder performance a lot. So in your case 15k inserts a day is not too much, however the distribution of inserts might cause problems, a customer's partition that grows with 500k inserts a day will run into performance problems earlier than the 5k person. As a consequence I would use weekly partitioning only.

create table "customer-logging" (
    customer_id long, 
    log string, 
    ts timestamp, 
    week as date_trunc('week', ts)
) partitioned by (week) into 8 shards

Please only use 8 shards if you have an appropriate amount of CPU cores ;)

Docs: date_trunc(), partitioned tables

Ideally you try out a few different combinations and find what works best for you. Insights into shard sizes and locations are provided by our sys tables, so you can see if there's a particularly fat shard that overloads a node ;)

Cheers, Claus

TechQA.

Set the right partitions for Crate Database

There are 1 answers

Related Questions in DATABASE

Related Questions in PARTITION

Related Questions in CRATE

Popular Questions

Popular Tags

Trending Questions