We are running Cassandra in production with the following configuration:
- 16 core CPU
- 56 GB Memory
- 4 Nodes
- Secondary disk type: SSD
Driver versions:
- Spring Data Cassandra: 1.5.1
- Cassandra driver core (bundled with Spring Data Cassandra): 3.1.4
- Native protocol: v4
Cassandra version: DSE 5.0.5, which uses Cassandra 3.0.11.1485
CPU consumption:
- Node 1 : 80%
- Node 2 : 30%
- Node 3 : 20%
- Node 4 : 40%
The Cassandra connection configuration is as follows:
spring.data.cassandra.cluster-name= Test Cluster
spring.data.cassandra.compression= none
spring.data.cassandra.connect-timeout-millis= 5000
spring.data.cassandra.keyspace-name= event
spring.data.cassandra.contact-points=host1,host2,host3,host4
spring.data.cassandra.port= 9042
spring.data.cassandra.max.request.per.connection.local=32768
spring.data.cassandra.max.request.per.connection.remote=2000
spring.data.cassandra.max.connection.local=8
spring.data.cassandra.max.connection.remote=2
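For reference, here is a rough sketch of what these pooling values allow per host. It is plain arithmetic, not driver code: the class and method names are made up for illustration, and it assumes the per-host ceiling is simply connections multiplied by requests per connection. Note that 32768 is the largest value native protocol v3 and later permits per connection, so the LOCAL setting is already at its maximum.

```java
// Hypothetical helper: estimates the ceiling on concurrent in-flight requests
// per host implied by the pooling settings (connections x requests per connection).
final class PoolCeiling {

    /** Upper bound on simultaneous requests the pool will send to one host. */
    static long maxInFlightPerHost(int maxConnectionsPerHost, int maxRequestsPerConnection) {
        return (long) maxConnectionsPerHost * maxRequestsPerConnection;
    }

    public static void main(String[] args) {
        // Values taken from the properties above.
        long local = maxInFlightPerHost(8, 32768);
        long remote = maxInFlightPerHost(2, 2000);
        System.out.println("LOCAL ceiling per host:  " + local);   // 262144
        System.out.println("REMOTE ceiling per host: " + remote);  // 4000
    }
}
```

With a LOCAL ceiling this high, the client-side pool is unlikely to be what holds requests back during a traffic spike, so the requests would reach the nodes essentially unthrottled.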
Cassandra configuration class:
@Configuration
public class CassandraConfiguration {

    @Value("${spring.data.cassandra.cluster-name}")
    public String clusterName;

    @Value("${spring.data.cassandra.keyspace-name}")
    public String keySpace;

    @Value("${spring.data.cassandra.contact-points}")
    private String contactpoints;

    @Value("${spring.data.cassandra.port}")
    public int port;

    @Value("${spring.data.cassandra.max.request.per.connection.local}")
    public int localRequestConnection;

    @Value("${spring.data.cassandra.max.request.per.connection.remote}")
    public int remoteRequestConnection;

    @Value("${spring.data.cassandra.max.connection.local}")
    public int localConnection;

    @Value("${spring.data.cassandra.max.connection.remote}")
    public int remoteConnection;

    // Note: the connect timeout property is reused below as the pool timeout.
    @Value("${spring.data.cassandra.connect-timeout-millis}")
    public int poolTimeoutMillis;

    @Bean
    public CassandraClusterFactoryBean cluster() {
        CassandraClusterFactoryBean cluster = new CassandraClusterFactoryBean();
        cluster.setContactPoints(contactpoints);
        cluster.setPort(port);
        cluster.setClusterName(clusterName);
        cluster.setCompressionType(CompressionType.NONE);
        cluster.setPoolingOptions(getCassandraPool());
        return cluster;
    }

    @Bean
    public PoolingOptions getCassandraPool() {
        PoolingOptions poolingOptions = new PoolingOptions();
        // Core and max connections per host are hard-coded here (8 local, 2 remote);
        // the localConnection and remoteConnection fields are not used.
        poolingOptions
                .setMaxRequestsPerConnection(HostDistance.LOCAL, localRequestConnection)
                .setMaxRequestsPerConnection(HostDistance.REMOTE, remoteRequestConnection)
                .setCoreConnectionsPerHost(HostDistance.LOCAL, 8)
                .setCoreConnectionsPerHost(HostDistance.REMOTE, 2)
                .setMaxConnectionsPerHost(HostDistance.LOCAL, 8)
                .setMaxConnectionsPerHost(HostDistance.REMOTE, 2)
                .setPoolTimeoutMillis(poolTimeoutMillis);
        return poolingOptions;
    }

    @Bean
    public CassandraMappingContext mappingContext() {
        return new BasicCassandraMappingContext();
    }

    @Bean
    public CassandraConverter converter() {
        return new MappingCassandraConverter(mappingContext());
    }

    @Bean
    public CassandraSessionFactoryBean session() throws Exception {
        CassandraSessionFactoryBean session = new CassandraSessionFactoryBean();
        session.setCluster(cluster().getObject());
        session.setKeyspaceName(keySpace);
        session.setConverter(converter());
        return session;
    }

    @Bean
    public CassandraOperations cassandraTemplate() throws Exception {
        return new CassandraTemplate(session().getObject());
    }
}
The problem we are facing:
Under normal traffic everything works well, but as soon as traffic increases we get the following error from the Cassandra driver:
com.datastax.driver.core.exceptions.OperationTimedOutException: [/105.12.14.114:9042] Timed out waiting for server response
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
at com.datastax.driver.core.DefaultResultSetFuture.onException(DefaultResultSetFuture.java:202)
at com.datastax.driver.core.RequestHandler.setFinalException(RequestHandler.java:201)
at com.datastax.driver.core.RequestHandler.access$2400(RequestHandler.java:46)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalException(RequestHandler.java:795)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.processRetryDecision(RequestHandler.java:421)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onTimeout(RequestHandler.java:778)
at com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:1374)
at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:655)
at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException: [/105.12.14.114:9042] Timed out waiting for server response
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onTimeout(RequestHandler.java:772)
... 6 more
Edit 1:
I have around 60 tables similar to the table below, which are used for different use cases.
Cassandra table:
CREATE TABLE IF NOT EXISTS kspace.count_table (
    source_id bigint,
    name varchar,
    date text,
    pname varchar,
    ptype varchar,
    pvalue blob,
    count counter,
    unique_count counter,
    PRIMARY KEY ((source_id, name, pname, ptype, date), pvalue)
);
Cassandra queries:
We have a variety of queries for our different use cases. The queries against the example table fall into the following categories, which resemble our original queries:
- Single queries (around 12)
- Batch queries within the same partition key (around 30 queries)
- Batch queries across different partition keys (around 20 queries)
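To make the difference between the two batch categories concrete, here is a minimal sketch in plain Java with no driver calls. It buckets counter increments by the partition key of count_table (source_id, name, pname, ptype, date), so each bucket corresponds to what a same-partition batch would contain. The Increment class, the method names, and the sample values are hypothetical and are not our real code; pvalue is shown as a String for brevity although the column is a blob.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartitionGrouping {

    /** Hypothetical counter increment destined for count_table. */
    static final class Increment {
        final long sourceId;
        final String name;
        final String pname;
        final String ptype;
        final String date;
        final String pvalue; // clustering column, not part of the partition key

        Increment(long sourceId, String name, String pname, String ptype, String date, String pvalue) {
            this.sourceId = sourceId;
            this.name = name;
            this.pname = pname;
            this.ptype = ptype;
            this.date = date;
            this.pvalue = pvalue;
        }

        /** Partition key columns in the order declared in the PRIMARY KEY. */
        List<Object> partitionKey() {
            return Arrays.<Object>asList(sourceId, name, pname, ptype, date);
        }
    }

    /** Groups increments so that every bucket targets exactly one partition. */
    static Map<List<Object>, List<Increment>> groupByPartition(List<Increment> increments) {
        Map<List<Object>, List<Increment>> groups = new LinkedHashMap<>();
        for (Increment inc : increments) {
            groups.computeIfAbsent(inc.partitionKey(), k -> new ArrayList<>()).add(inc);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Increment> increments = Arrays.asList(
                new Increment(1L, "n1", "p1", "t1", "2017-01-01", "a"),
                new Increment(1L, "n1", "p1", "t1", "2017-01-01", "b"),
                new Increment(2L, "n1", "p1", "t1", "2017-01-01", "a"));
        // The first two increments share a partition; the third does not.
        System.out.println("partitions touched: " + groupByPartition(increments).size());
    }
}
```

A batch whose statements all land in one bucket is the "same partition key" case; a batch that spans several buckets is the "different partition key" case.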
Edit 2: We have been facing this problem since migrating from Apache Cassandra 2.1.8 to DSE 5.0.5.