Python API call sampling issue


I am trying to programmatically (in Python) recreate the Conversions > Ecommerce > Product Performance report, with product as the primary dimension and campaign as the secondary.

I appear to be hitting a massive sampling issue when pulling the ga:campaign dimension for a client, even when restricting the date range to a single day. The property I am pulling data for is a G360 account, and regardless of what I set samplingLevel to, the sampling appears to be the same: 'samplesReadCounts': ['999984'], 'samplingSpaceSizes': ['3980975'].

I am able to get unsampled data if product is the only dimension, but the data is sampled with campaign alone or with the two together.

I have tried setting samplingLevel to SMALL, LARGE, and DEFAULT to see whether samplesReadCounts and samplingSpaceSizes would change, but they do not.

return analytics.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': startdt, 'endDate': enddt}],
                'metrics': [
                    {'expression': 'ga:itemRevenue'},
                    {'expression': 'ga:uniquePurchases'},
                    {'expression': 'ga:itemQuantity'},
                    {'expression': 'ga:revenuePerItem'},
                    {'expression': 'ga:itemsPerPurchase'},
                    {'expression': 'ga:productRefundAmount'},
                ],
                'dimensions': [
                    {'name': 'ga:productName'},
                    {'name': 'ga:campaign'},
                ],
                'samplingLevel': 'LARGE',
            }
        ]
    }
).execute()
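
For reference, this is roughly how the sampling can be read back out of the response. It is only a sketch: `sampling_summary` is a hypothetical helper name, and it assumes the Reporting API v4 response shape, where `samplesReadCounts` and `samplingSpaceSizes` appear under each report's `data` only when that report is sampled.

```python
def sampling_summary(response):
    """Return one dict per report saying whether it was sampled and at what rate."""
    summaries = []
    for report in response.get('reports', []):
        data = report.get('data', {})
        # Both fields are absent when the report is unsampled.
        read_counts = [int(n) for n in data.get('samplesReadCounts', [])]
        space_sizes = [int(n) for n in data.get('samplingSpaceSizes', [])]
        # One entry per date range: fraction of sessions actually read.
        rates = [read / size
                 for read, size in zip(read_counts, space_sizes) if size]
        summaries.append({'sampled': bool(read_counts), 'rates': rates})
    return summaries


# Values taken from the response quoted above.
example = {'reports': [{'data': {'samplesReadCounts': ['999984'],
                                 'samplingSpaceSizes': ['3980975']}}]}
print(sampling_summary(example))
```

With the numbers above, the report is based on about 25% of the sessions in the sampling space, whatever samplingLevel is set to.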

There is 1 answer

A Wafers On

Are there a lot of unique campaign values? If so, you may be running into a high-cardinality issue. That would explain why product alone is fine but the report is sampled once you add the campaign dimension: the number of possible result rows grows to (number of product values × number of campaign values), which is a significant increase.
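
To get a feel for how many distinct values each dimension has, something like the following could be run over a report you have already pulled. This is only a sketch: `dimension_cardinality` is a hypothetical helper, the sample rows are made up, and it assumes the Reporting API v4 layout (`columnHeader.dimensions` plus `data.rows[].dimensions`). On a sampled or paginated response the counts are a lower bound.

```python
def dimension_cardinality(report):
    """Count distinct values per dimension in a single v4 report."""
    names = report.get('columnHeader', {}).get('dimensions', [])
    seen = [set() for _ in names]
    for row in report.get('data', {}).get('rows', []):
        # Row dimension values are in the same order as the header names.
        for values, value in zip(seen, row.get('dimensions', [])):
            values.add(value)
    return {name: len(values) for name, values in zip(names, seen)}


# Made-up rows, for illustration only.
report = {
    'columnHeader': {'dimensions': ['ga:productName', 'ga:campaign']},
    'data': {'rows': [
        {'dimensions': ['Product A', 'spring_sale']},
        {'dimensions': ['Product A', 'summer_sale']},
        {'dimensions': ['Product B', 'spring_sale']},
        {'dimensions': ['Product B', 'brand']},
    ]},
}
counts = dimension_cardinality(report)
print(counts)
```

Multiplying the per-dimension counts gives the upper bound on result rows mentioned above.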

Unfortunately, in my experience there's not much you can do about this for historical data through API requests. Since you have 360, you can still request unsampled data manually through the UI. Going forward, you can also reduce the number of campaign values by setting up filters in your view settings or by creating new views entirely. However, this only applies to data collected after the change.

You may also want to look into BigQuery, which lets you query unsampled data (at a cost). However, you can't access historical data that way, only data stored after you've set up the export.
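
If you do go the BigQuery route, a product-by-campaign query might look roughly like the one built below. Treat it as a sketch under assumptions: `build_product_campaign_query` is a hypothetical helper, the table name is a placeholder, and the field names assume the standard GA360 export schema (`ga_sessions_*` tables with `hits.product` and `trafficSource.campaign`). The session-level campaign in the export may not match the attribution used in the UI report exactly.

```python
def build_product_campaign_query(table, start, end):
    """Build a Standard SQL query for product revenue by campaign.

    `table` is a wildcard table such as 'my-project.my_dataset.ga_sessions_*'
    (placeholder names); `start` and `end` are YYYYMMDD strings.
    """
    for day in (start, end):
        # Dates are interpolated into the SQL text, so validate them strictly.
        if not (len(day) == 8 and day.isdigit()):
            raise ValueError(f'expected YYYYMMDD, got {day!r}')
    return f"""
SELECT
  product.v2ProductName AS product_name,
  trafficSource.campaign AS campaign,
  SUM(product.productRevenue) / 1e6 AS item_revenue,  -- stored in micros
  SUM(product.productQuantity) AS item_quantity
FROM `{table}`,
  UNNEST(hits) AS hit,
  UNNEST(hit.product) AS product
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
  AND hit.eCommerceAction.action_type = '6'  -- completed purchase
GROUP BY product_name, campaign
ORDER BY item_revenue DESC
""".strip()


sql = build_product_campaign_query(
    'my-project.my_dataset.ga_sessions_*', '20200101', '20200101')
print(sql)
```

The resulting string could then be run from Python with the google-cloud-bigquery client, for example via `bigquery.Client().query(sql)`.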