I have a series of proc sql statements which pull data for Active, Inactive and Lapsed customers. I end up with 3 tables.
- *Customers_Active
- *Customers_InActive
- *Customers_Lapsed
Active: 0-12M purchaser
Inactive: 13-24M purchaser, did not purchase 0-12M (active day range minus 12 months)
Lapsed: 25-36M purchaser, did not purchase 0-24M (inactive day range minus 24 months)
Again, In order to do this I have 3 separate proc sql statements and I start out by defining 6 macro variables.
I want to apply macro language so I can uses just 1 query to produce the 3 tables.
%let actstart = '24Feb2013'd;
%let actend = '22Feb2014'd;
%let iactstart = '26Feb2012'd;
%let iactend= '23Feb2013'd;
%let lapstart = '27Feb2011'd;
%let lapend = '25Feb2012'd;*
Here are the three statements
/Active Table/
proc sql;
create table Customers_Active as
select
household_id,
customer_id,
web_order_id,
transaction_date,
sku,
quantity,
original_price
from transaction_vw
where transaction_date >= &actstart and transaction_date <= &actend
order by customer_id;
quit;
/Inactive Table/
Notice I am not bringing in the customer_id which is already in the Customers_Active
table.
proc sql;
create table Customers_Inactive as
select
household_id,
customer_id,
web_order_id,
transaction_date,
sku,
quantity,
original_price
from transaction_vw
where transaction_date >= & iactstart and transaction_date <= &iactend
and customer_id not in (select distinct customer_id from Customers_Active)
order by customer_id;
quit;
/Lapsed Table/
Notice I am not bringing in the customer_id which is not already in the Customers_Active and the Customers_Inactive table.
proc sql;
create table Customers_Lapsed as
select
household_id,
customer_id,
web_order_id,
transaction_date,
sku,
quantity,
original_price
from transaction_vw
where transaction_date >= & lapstart and transaction_date <= & lapend
and customer_id not in (select distinct customer_id from Customers_Active)
and customer_id not in (select distinct customer_id from Customers_Inactive)
order by customer_id;
quit;
To recap: the end result is 3 tables.
- *Customers_Active
- *Customers_InActive
- *Customers_Lapsed
•I'm pulling in the same vars for all three tables •Just the date range changes •In addition I don't want to include customer_id in the Customers_Inactive table which is already in the Customers_Active table •and customer_ids in the lapsed table which are already in the Customer_Active and Customer_inactive tables
Again, since I'm pulling in the same vars I don't want to have three separate queries to do this just one.
I would do this in a bit different manner. You can do it in a few ways, but maybe one SQL step and one datastep would be easiest.
intck
with thec
modifier at the end tells number of periods between two dates, so 1 year means between 1 year and 1 year 364 days.The point of doing it this way is to use fewer passes through the data - so, first calculate how long last purchase was, then output their data. It could be even more efficient than this in a DoW loop (which does 1 real pass and 2 virtual passes, assuming data fits in memory). So we use a format to get it out (you could also join the tables if you like joins better, formats are just faster). Then we compare that date using
intck
to see how many years have passed, and output to the appropriate file.As a side note, there is a preference in SAS for not putting 3 files like this, but having 1 file and then adding a flag indicating which group it belongs to; then using
by
statements to do analyses on each group separately. This simplifies the code further. To do that, just change theif
statements to set flags instead of outputting - or skip theif
and just assign the year count directly.