Dimensional Modeling: app session or activity measures

214 views Asked by At

I am trying to answer the below question given by the business (The business generates revenue from multiple apps through customer pay model) The business is interested in the below questions

  1. new users (trend with respect to previous months)
  2. daily active users
  3. Day 1 retention I came up with the below DM

Dimension: users, app, deviceid, useractions, plan, date

Fact: fact_activity(userid, appid,deviceid, actionid)

Actions could be: app installed, app launch, registered, completed purchase, postedcomments, playgame etc

The questions I have is

  1. should the fact table contain action_type instead of the actionid into the fact (to avoid join with useractions)
  2. Definition of day 1 retention: No of apps installed/ app launches next day how do to avoid multiple counting of single user using multiple devices
  3. Would it be advisable to have device details in the user dimension or separate.
  4. If I need to measure average session duration, should I use another fact at session level or tweak the activity fact?
1

There are 1 answers

0
NickW On

your questions are really unanswerable without significant more information about your business processes, data definitions, etc. In effect, you are asking someone to design a dimensional model for you before they can answer your questions - which is obviously not going to happen.

However, I can give you some very generic pointers that may help you:

Dimensions

A Dimension describes an entity so if attributes can't be described as belonging to the same entity then they shouldn't be in the same dimension. In your case, I assume a Device and a User are not the same thing and therefore they need to be separate dimensions

Facts

You need to define your measures i.e. what precisely are the things you are going to want to aggregate (count, sum, avg, etc) and how are they defined/calculated.

For each measure, you also need to define its grain i.e. what is the minimum set of dimensions that uniquely identify it. Once you have the grain defined, if multiple measures have the same grain then they can be held in the same fact table and if they don't then they can't