Using GemFire in a high-volume transaction system


I would like to know if there are any good resources to advise me on best practices for a high-transaction (2000 TPS), high-volume system (millions and millions of records) that uses GemFire as its main database.

I ask because I have been advised to skip queries that use "LIKE", or any other search that is not a key fetch, in GemFire, and to use the region directly in Java memory wherever possible (if the JVM can handle the size of the data). That would reduce GemFire to little more than a huge HashMap with no functionality beyond Map.get().
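To illustrate, the two access styles I'm comparing look roughly like this (a minimal sketch using the Apache Geode client API; the /Customers region, key, and locator address are made up, and a running cluster is assumed):

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.geode.cache.query.SelectResults;

public class AccessPatterns {
    public static void main(String[] args) throws Exception {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)
                .create();
        Region<String, Object> customers = cache
                .<String, Object>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("Customers");

        // Style 1 - key fetch: a direct hash lookup, the "huge HashMap" usage.
        Object customer = customers.get("cust-42");

        // Style 2 - OQL with LIKE: evaluated server-side, scanning the region
        // (or using an index on the field, if one exists).
        SelectResults<?> matches = (SelectResults<?>) cache.getQueryService()
                .newQuery("SELECT * FROM /Customers c WHERE c.name LIKE '%mith'")
                .execute();

        cache.close();
    }
}
```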

Is there any basis on arguments like above?

Aren't Gemfire clusters handling zillions of transactions per second around the world everyday?

Thanks


There are 3 answers

John Blum (Best Answer)

So, I don't know about "zillions" of transactions per day :-), but certainly customers use GemFire to process millions of transactions a day and store billions of records (objects).

You can see more details in the case studies (China Railway, India Railways & Newedge) on Pivotal's website (https://pivotal.io/big-data/pivotal-gemfire).

While it is generally better to perform a direct lookup via a key (even in an OQL statement, not necessarily with Map.get(key)), it is not impossible to use the LIKE operator in an OQL predicate in the presence of an Index (http://gemfire.docs.pivotal.io/latest/userguide/index.html#developing/query_select/the_where_clause.html#the_where_clause__section_D91E0B06FFF6431490CC0BFA369425AD).
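For illustration, a LIKE query backed by a server-side index might look like the following sketch (region, field, and index names are hypothetical; it assumes a running cluster reachable through a locator, and uses the Apache Geode client API):

```java
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.query.QueryService;
import org.apache.geode.cache.query.SelectResults;

public class LikeQueryExample {
    public static void main(String[] args) throws Exception {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)
                .create();

        QueryService qs = cache.getQueryService();

        // The index is typically created on the servers, e.g. via gfsh:
        //   gfsh> create index --name=nameIdx --expression=name --region=/Customers
        // A trailing-wildcard predicate such as 'Smi%' can take advantage of
        // that index; a leading wildcard ('%mith') generally cannot.
        SelectResults<?> results = (SelectResults<?>) qs
                .newQuery("SELECT * FROM /Customers c WHERE c.name LIKE 'Smi%'")
                .execute();

        System.out.println("Matches: " + results.size());
        cache.close();
    }
}
```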

The important thing to remember is that Indexes incur a cost to maintain and to store in memory, so it is important to get them right. See here for more tips on indexing (http://gemfire.docs.pivotal.io/latest/userguide/index.html#developing/query_index/indexing_guidelines.html).

Regarding best practices, our EA team would be better able to advise you on your particular use cases and functional requirements.

William Markito

That's definitely not true. We have many customers using OQL and other advanced features of the product dealing with thousands of concurrent clients/queries.

It's hard to give any specific advice without knowing the object sizes, the queries, and the indexes being used. There are scenarios where it makes sense to use the QueryService (firing a query from the client), and there are others where it's better to use data-aware Functions in order to better distribute the query execution.
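As a sketch of the data-aware approach, a server-side function that touches only the data hosted locally on each member might look like this (the function id is hypothetical; it uses the Apache Geode function execution API and would be deployed to the servers):

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.execute.RegionFunctionContext;
import org.apache.geode.cache.partition.PartitionRegionHelper;

// Counts the entries of a partitioned region hosted on this member.
// Each member returns only its local count, so the work is spread
// across the cluster instead of being pulled to one node.
public class LocalCountFunction implements Function {

    @Override
    public void execute(FunctionContext context) {
        RegionFunctionContext rfc = (RegionFunctionContext) context;
        // Restrict the view to the primary data on this member.
        Region<?, ?> localData = PartitionRegionHelper.getLocalDataForContext(rfc);
        context.getResultSender().lastResult(localData.size());
    }

    @Override
    public String getId() {
        return "LocalCountFunction";
    }
}
```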

Take a look at Querying Partitioned Regions, specifically Optimizing Queries on Data Partitioned by a Key or Field Value, for some examples and ideas.

Hope that helps

joshuad2

I have been on a few projects using GemFire, and yes, it can be used to query across a large set of data. As William and John stated, it really comes down to how your GemFire cluster is designed to work with your data: partitioning, replication, and so on. IMHO you should avoid indexes where you can, and use the GemFire cluster as a data grid instead. With data-aware functions you can have your queries running in parallel across your cluster, which increases your speed and flexibility. Take a look at the Geode function best practices.
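For example, invoking a deployed function in parallel across a partitioned region from a client could look like this sketch (region name, function id, and locator address are all hypothetical, and the function is assumed to already be registered on the servers):

```java
import java.util.List;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.geode.cache.execute.FunctionService;
import org.apache.geode.cache.execute.ResultCollector;

public class ParallelQueryClient {
    public static void main(String[] args) {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)
                .create();
        Region<String, Object> orders = cache
                .<String, Object>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("Orders");

        // onRegion routes the call to every member hosting part of the
        // partitioned region; each member works only on its local data.
        ResultCollector<?, ?> rc = FunctionService
                .onRegion(orders)
                .execute("LocalCountFunction");

        // The default collector returns one partial result per member;
        // aggregate them on the client.
        List<?> partials = (List<?>) rc.getResult();
        long total = partials.stream()
                .mapToLong(p -> ((Number) p).longValue())
                .sum();
        System.out.println("Total entries: " + total);
        cache.close();
    }
}
```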