I have integrated Pentaho 5 EE with Impala. In my schema, dimension values are not read from the fact table, because it is a huge table and computing them from it takes too long. Since the dimension values come from dimension tables, Mondrian generates a query that joins the dimension table with the fact table in that order (i.e. dimension table on the left). This query is slow, and I read on the Cloudera website that when you do a join in Impala, the bigger table (the fact table) should be on the left.
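To make the two join orders concrete, here is a minimal sketch. The table and column names (`dim_product`, `fact_sales`, `product_id`, `amount`) are hypothetical, not taken from my actual schema, and the script uses an in-memory SQLite database only so that it can be run anywhere. It shows that both orderings return the same rows, so the difference is purely one of performance; it says nothing about Impala's timings.

```python
import sqlite3

# Hypothetical star schema: all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'apple'), (2, 'pear');
    INSERT INTO fact_sales VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

# Join order as described above: dimension table on the left.
dim_first = """
    SELECT d.product_name, SUM(f.amount) AS total
    FROM dim_product d
    JOIN fact_sales f ON f.product_id = d.product_id
    GROUP BY d.product_name
    ORDER BY d.product_name
"""

# Hand-edited variant: fact table on the left.
fact_first = """
    SELECT d.product_name, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product d ON f.product_id = d.product_id
    GROUP BY d.product_name
    ORDER BY d.product_name
"""

rows_dim_first = conn.execute(dim_first).fetchall()
rows_fact_first = conn.execute(fact_first).fetchall()

# Both orderings are logically equivalent for an inner join.
print(rows_dim_first == rows_fact_first)
```

The second form is the shape I would like Mondrian to generate.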
I ran the query generated by Mondrian directly in Impala and compared it with a hand-edited version: when I put the fact table on the left in the join, it is much faster. My question is: is there a Mondrian/Analyzer property I can set to enable this behavior? Currently Mondrian always joins with the dimension table on the left. Also, is there a Hadoop plugin for Pentaho that you would recommend to improve the performance of Pentaho with Impala?