Survival Analysis for Telecom Churn using R


I am working on a telecom churn problem, and here is my dataset:

http://www.sgi.com/tech/mlc/db/churn.data

Names - http://www.sgi.com/tech/mlc/db/churn.names

I'm new to survival analysis. Given the training data, my idea is to build a survival model that estimates survival time and also predicts churn/non-churn on the test data from the independent factors. Could anyone help me with code or pointers on how to approach this problem?

To be precise, say my training data contains customer call usage details, plan details, the tenure of the account, etc., and whether the customer churned or not.

Using general classification models, I can predict churn or no churn on the test data. Now, using survival analysis, I want to predict the survival tenure for customers in the test data.

Thanks, Maddy


There are 2 answers

Andrie

Here is some code to get you started:

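First, read the data:

# Column names come from the .names file: skip its first 4 lines and keep
# only the text before the ":" on each remaining line
nm <- read.csv("http://www.sgi.com/tech/mlc/db/churn.names", 
               skip=4, colClasses=c("character", "NULL"), header=FALSE, sep=":")[[1]]
# The data file has no header row; the last column is the churn label
dat <- read.csv("http://www.sgi.com/tech/mlc/db/churn.data", header=FALSE, col.names=c(nm, "Churn"))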

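Use Surv() to set up a survival object for modeling:

library(survival)

# Event indicator: TRUE if the customer churned (assumes churners are labelled
# with a value containing "True"). Matching on the text works whether Churn was
# read as a factor or as a character vector, which is the default since R 4.0;
# as.numeric() on a character column would return NA.
s <- with(dat, Surv(account.length, grepl("True", Churn)))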
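For readers new to survival objects, a small sketch of what Surv() encodes may help. The tenure and churn values below are made up for illustration and are not taken from the dataset; the point is only that customers who have not churned yet are treated as censored rather than dropped.

```r
library(survival)

# Hypothetical tenures and churn flags, for illustration only
tenure  <- c(128, 107, 137, 84)
churned <- c(FALSE, FALSE, TRUE, FALSE)

toy <- Surv(tenure, churned)

# Printing marks censored customers (still active) with a trailing "+"
print(toy)

# Internally this is a two-column matrix: time and status (1 = event, 0 = censored)
toy_matrix <- unclass(toy)
toy_matrix
```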

Fit a Cox proportional hazards model and plot the result:

# dat[, -4] drops the fourth column (the phone number), which is not a useful predictor
model <- coxph(s ~ total.day.charge + number.customer.service.calls, data=dat[, -4])
summary(model)
plot(survfit(model))


Add a stratum:

model <- coxph(s ~ total.day.charge + strata(number.customer.service.calls <= 3), data=dat[, -4])
summary(model)
plot(survfit(model), col=c("blue", "red"))

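To get at the original question of predicting tenure for customers in a test set, one possible approach is to pass new data to survfit() and predict(). The sketch below is only an illustration under stated assumptions: it uses simulated data that borrows the column names from the churn dataset, and the names sim_train, sim_test and churned are hypothetical. The event is put inside the formula so that the fitted model can be applied to a data frame that contains only covariates.

```r
library(survival)

set.seed(1)
n <- 500

# Simulated stand-in for the training data (values are invented)
sim_train <- data.frame(
  account.length                = rexp(n, rate = 1 / 100) + 1,
  total.day.charge              = rnorm(n, mean = 30, sd = 9),
  number.customer.service.calls = rpois(n, lambda = 1.5)
)
sim_train$churned <- rbinom(
  n, 1, plogis(-2 + 0.5 * sim_train$number.customer.service.calls)
)

fit <- coxph(
  Surv(account.length, churned) ~ total.day.charge + number.customer.service.calls,
  data = sim_train
)

# Two hypothetical "test" customers described only by their covariates
sim_test <- data.frame(
  total.day.charge              = c(20, 45),
  number.customer.service.calls = c(0, 5)
)

# One predicted survival curve per test customer
sf <- survfit(fit, newdata = sim_test)

# Predicted probability of still being a customer at selected tenures
# (rows = times, columns = test customers)
surv_at <- summary(sf, times = c(50, 100, 150))$surv
surv_at

# Median predicted tenure per customer (NA if a curve never drops below 0.5)
quantile(sf, probs = 0.5, conf.int = FALSE)

# Hazard of each test customer relative to the model's reference level
rel_risk <- predict(fit, newdata = sim_test, type = "risk")
rel_risk
```

Note that a Cox model yields a whole survival curve per customer rather than a single predicted tenure, so a summary such as the median (or the survival probability at a fixed horizon) has to be chosen.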

John Chrysostom

If you're still interested (or for the benefit of those coming later), I've written a few guides specifically for conducting survival analysis on customer churn data using R. They cover a bunch of different analytical techniques, all with sample data and R code.

Basic survival analysis: http://daynebatten.com/2015/02/customer-churn-survival-analysis/

Basic Cox regression: http://daynebatten.com/2015/02/customer-churn-cox-regression/

Time-dependent covariates in Cox regression: http://daynebatten.com/2015/12/survival-analysis-customer-churn-time-varying-covariates/

Time-dependent coefficients in Cox regression: http://daynebatten.com/2016/01/customer-churn-time-dependent-coefficients/

Restricted mean survival time (quantify the impact of churn in dollar terms): http://daynebatten.com/2015/03/customer-churn-restricted-mean-survival-time/

Pseudo-observations (quantify dollar gain/loss associated with the churn effects of variables): http://daynebatten.com/2015/03/customer-churn-pseudo-observations/

Please forgive the goofy images.
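As a rough illustration of the restricted mean survival time idea mentioned in the list above, here is a minimal sketch that computes it by hand as the area under a Kaplan-Meier curve up to a chosen horizon. It is not taken from the linked guides; the data, the variable names and the horizon tau are made up for the example.

```r
library(survival)

# Made-up tenures; every customer churned, so nothing is censored
tenure  <- c(1, 2, 3, 4)
churned <- c(1, 1, 1, 1)

km  <- survfit(Surv(tenure, churned) ~ 1)
tau <- 4  # restriction horizon (an arbitrary choice for this example)

# The Kaplan-Meier curve is a step function: it equals 1 until the first
# event time and then takes the value km$surv[i] from km$time[i] onward
keep  <- km$time < tau
grid  <- c(0, km$time[keep], tau)
steps <- c(1, km$surv[keep])

# Area under the step function between 0 and tau
rmst <- sum(diff(grid) * steps)
rmst
```

With no censoring and tau equal to the largest tenure, the result should match the plain mean of the tenures, which makes for a quick sanity check. Multiplying such a figure by revenue per period is one way it might be expressed in dollar terms.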