Translating a T-SQL query to MDX

124 views Asked by At

I am a new MDX user.

I can easily get what I need using T-SQL but getting the equivalent using MDX has proven difficult.

use [AdventureWorksDW2012]

------------------------------------------------------------
--Select customers that purchased specific items during specific time period
------------------------------------------------------------ 
drop table #Customers_Purchased_SelectedProduct
select
distinct 
    a.CustomerKey
into #Customers_Purchased_SelectedProduct
from [dbo].[FactInternetSales] a
    inner join [dbo].[DimProduct] b on a.ProductKey = b.ProductKey
    inner join [dbo].[DimProductSubcategory] c on b.ProductSubcategoryKey = c.ProductSubcategoryKey
where
     a.ShipDateKey between 20050101 and 20081215
    and c.ProductSubcategoryKey in (1 , 2)

------------------------------------------------------------
--Get sales metrics for customers identified above
------------------------------------------------------------ 
select
    c.ProductSubcategoryKey
    , b.ProductKey
    , sum(a.SalesAmount) as SalesAmount
    , count(distinct a.CustomerKey) as 'CustomerDistinct_withPurchases'
from [dbo].[FactInternetSales] a
    inner join [dbo].[DimProduct] b on a.ProductKey = b.ProductKey
    inner join [dbo].[DimProductSubcategory] c on b.ProductSubcategoryKey = c.ProductSubcategoryKey
    inner join #Customers_Purchased_SelectedProduct bb on a.CustomerKey = bb.CustomerKey
where
    a.ShipDateKey between 20050101 and 20081215
    and c.ProductSubcategoryKey not in (1 , 2)
group by 
    c.ProductSubcategoryKey
    , b.ProductKey

The code below is what I came up with. Seems extremely clunky and after 2 minutes it returns data and isn't correct.

use [AdventureWorksDW2012]

------------------------------------------------------------
--Select customers that purchased specific items during specific time period
------------------------------------------------------------ 
drop table #Customers_Purchased_SelectedProduct
select
distinct 
    a.CustomerKey
into #Customers_Purchased_SelectedProduct
from [dbo].[FactInternetSales] a
    inner join [dbo].[DimProduct] b on a.ProductKey = b.ProductKey
    inner join [dbo].[DimProductSubcategory] c on b.ProductSubcategoryKey = c.ProductSubcategoryKey
where
     a.ShipDateKey between 20050101 and 20081215
    and c.ProductSubcategoryKey in (1 , 2)

------------------------------------------------------------
--Get sales metrics for customers identified above
------------------------------------------------------------ 
select
    c.ProductSubcategoryKey
    , b.ProductKey
    , sum(a.SalesAmount) as SalesAmount
    , count(distinct a.CustomerKey) as 'CustomerDistinct_withPurchases'
from [dbo].[FactInternetSales] a
    inner join [dbo].[DimProduct] b on a.ProductKey = b.ProductKey
    inner join [dbo].[DimProductSubcategory] c on b.ProductSubcategoryKey = c.ProductSubcategoryKey
    inner join #Customers_Purchased_SelectedProduct bb on a.CustomerKey = bb.CustomerKey
where
    a.ShipDateKey between 20050101 and 20081215
    and c.ProductSubcategoryKey not in (1 , 2)
group by 
    c.ProductSubcategoryKey
    , b.ProductKey
The code below is what I came up with.  Seems extremely clunky and after 2 minutes it returns data and isn't correct.

with

------------------------------------------------------------
----Select customers that purchased specific items during specific time period
------------------------------------------------------------ 
set [Cust] as
nonempty(
            [Dim Customer].[Customer Key].[Customer Key].members ,
            (
                ({[Dim Product].[Product Subcategory Key].&[1] ,[Dim Product].[Product Subcategory Key].&[2]}) ,
                ({[Ship Date].[Date Key].&[20050101]: [Ship Date].[Date Key].&[20081215]}) ,
                [Measures].[Sales Amount]
            )
        )

------------------------------------------------------------
--Create list of subcategories excluding the ones from above
------------------------------------------------------------ 

set [SubCategory Other] as
    except (
                [Dim Product].[Product Subcategory Key].[Product Subcategory Key]
            , ({[Dim Product].[Product Subcategory Key].&[1] ,[Dim Product].[Product Subcategory Key].&[2]})
            )

member [Sales Amount Selected Customers] as sum([Cust] , [Measures].[Sales Amount])
member [Customer Count] as count(nonempty([Cust],[Sales Amount Selected Customers]))

select 
{[Sales Amount Selected Customers] , [Customer Count]} on 0
, ([SubCategory Other] * [Dim Product].[Product Key].[Product Key]) on 1
 from [Adventure Works DW2012]

Incorrect results set:

enter image description here

The T-SQL query runs in less than 1 second. I'm clearly messing something up.

1

There are 1 answers

0
MoazRub On

My understanding is that you want to have the sales Amount and distinct customer count for products and their subcategories within a data range. These products and their subcategories were bought by the customers who bought items from sub category 1 and 2 during the same period. For this you first took the list of customers in a temp table, then for all products bought by these customers you grouped product sales and distinct customer count.

Now there is couple of issues. 1) MDX does' not support sub query. So there is no straight forward way for you to collect the list of customers that you did in SQL.

2) In MDX you cannot put a single dimension attribute on two axes. Translating this to your problem, you want to cross analyze all product sales against, two products (market basket analysis). So Ideally in MDX the solution should be along the lines of putting the same dimension attribute on both axis, but this is not supported.

3) In your SQL query you are only using fact internet sales, in MDX you are also using [Measures].[Sales Amount] which is not from internet sales

4) The reason why 18484 are repeated is that the set is unaware of the query context. In simple terms for row 1 the set does not know it is executing for combination 3, 560