Linked Questions

Popular Questions

I am trying to get the daily growth rate of users per merchant and add it as another column to my dataframe. I was able to come up with the code already; however I think the issue is because of certain merchants that may have consecutive days of 0 sales. 0-> 0 doesn't have a growth rate so I believe that's why its messing up my code. Any recommendations?

Here is the code I am using: Apologies for the length

# First I filter our the dates and order category
# week
df_wk = df.loc[(df['Booked at'] >= '2023-01-01') & (df['Booked at'] <= '2023-05-31')]

#placed eats orders
df_eats = df_wk.loc[df_wk['Job Type'] == 'Eats']

# Create a dataframe with all merchants and dates within the date range
dates = pd.date_range('2023-01-01', '2023-05-31', freq='D')
merchants = df_eats['Merchant Name'].unique()
index = pd.MultiIndex.from_product([dates, merchants], names=['Booked at', 'Merchant Name'])
all_sales = pd.DataFrame(index=index).reset_index()

#including all merchants even on zero sales days and getting number of unique users per merchant and day
eats_users = df_eats.groupby(['Booked at', 'Merchant Name']).agg({'Customer Name': lambda x: x.nunique()})
eats_users.rename(columns={'Customer Name': 'Total Unique Users'}, inplace=True)

# Sort the dataframe by 'Booked at' column to ensure chronological order
eats_users.sort_values('Booked at', inplace=True)

#merge all sales with new df
unique_users = pd.merge(all_sales, eats_users, on=['Booked at', 'Merchant Name'], how='left')
unique_users.fillna(0, inplace=True)

# Sort the dataframe by 'Booked at' column to ensure chronological order
unique_users.sort_values('Booked at', inplace=True)

# Add new column that calculates the daily growth rate per merchant
previous_users = {}
for index, row in unique_users.iterrows():
    current_users = row['Total Unique Users']
    merchant = index[1]  # Get the merchant name from the index tuple
    if pd.isna(current_users):
        unique_users.at[index, 'Daily Growth Rate for Unique Users'] = pd.NA
    else:
        if merchant in previous_users:
            previous_users_count = previous_users[merchant]
            growth_rate = (current_users / previous_users_count - 1) * 100
            unique_users.at[index, 'Daily Growth Rate for Unique Users'] = growth_rate
        else:
            unique_users.at[index, 'Daily Growth Rate for Unique Users'] = pd.NA
        previous_users[merchant] = current_users

Expected Output is something like this:

    Booked at Merchant Name  Total Unique Users  \
0   2023-01-01    Merchant A                  38   
1   2023-01-01    Merchant B                  75   
2   2023-01-01    Merchant C                  74   
3   2023-01-02    Merchant A                  33   
4   2023-01-02    Merchant B                  71   
..         ...           ...                 ...   
447 2023-05-30    Merchant A                  75   
449 2023-05-30    Merchant C                  13   
451 2023-05-31    Merchant B                  76   
450 2023-05-31    Merchant A                  52   
452 2023-05-31    Merchant C                   2   

     Daily Growth Rate for Unique Users  
0                                   NaN  
1                                   NaN  
2                                   NaN  
3                            -13.157895  
4                             -5.333333  
..                                  ...  
447                            5.633803  
449                          -45.833333  
451                           13.432836  
450                          -30.666667  
452                          -84.615385  

[453 rows x 4 columns]

Related Questions