I would like to calculate the mode/median or better, most frequent observation of a categorical variable within my query. E.g, if the variable has the following string values: dog, dog, dog, cat, cat and I want to get dog since its 3 vs 2. Is there any function that does that? I tried APPX_MEDIAN() but it only returns the first 10 characters as median and I do not want that. Also, I would like to get the most frequent observation with respect to date if there is a tie-break.
Thank you!
the most frequent observation is
mode
and you can calculate it like this.Single value mode can be calculated like this on a value column. Get the count and pick up row with max count.
now, in case you have multiple modes, you need to join back to the main table to find all matches.
This will get all count of values and then match them based on count. Now, if one value of count matches to max count, you will get 1 row, if you have two value counts matches to max count, you will get 2 rows and so on.
Calculation of
median
is little tricky - and it will give you middle value. And its not most frequent one.