Web analytics customer segmentation data modeling with Cassandra?

348 views Asked by At

I would like to use Cassandra for my website analytics which is geared especially towards customers segmentation. What this analytics will do is gather Page View data for every customers/visitor which includes:

userName, country, city, gender, timeStamp, source, campaign, pageUrl, timeOnPage

This data will have to be sliced by all the dimensions for easy customer segmentation. For example:

SELECT * Users WHERE Country = USA AND pageUrl = "http://mystore.com/bestsettingproduct" AND timeStamp < DateTime.Now AND timeStamp > DateTime.Now - Days.30

OR

SELECT * Users WHERE Campaign = "Last Email Campaign" AND AND timeStamp < DateTime.Now AND timeStamp > DateTime.Now - Days.30

As I understand in Cassandra you can query only by key. Given these sort of dynamic queries where one or more dimension can be included in the where clause, what would be a good data model?

I am thinking of having several tables with the following keys:

table 1 (userName, country, city, gender, timeStampWeek, timeStamp source, campaign, pageUrl, timeOnPage,

primary key((timeStampWeek, country), city, gender, timeStamp source, campaign, pageUrl, timeOnPage));


table 2 (userName, country, city, gender, timeStampWeek, timeStamp source, campaign, pageUrl, timeOnPage,

primary key((timeStampWeek, city), country, gender, timeStamp, source, campaign, pageUrl, timeOnPage));


table 3 (userName, country, city, gender, timeStampWeek, timeStamp, source, campaign, pageUrl, timeOnPage,

primary key((timeStampWeek, campaign), country, city, gender, timeStamp, source, pageUrl, timeOnPage));


And so on for all combinations of the dimensions? But this seems wild? Could there be a smarter way to model around these queries?

0

There are 0 answers