Using ggsurv function in R version 3.1.0


I am trying to create a survival plot in R for deaths from exposure to a fungal disease over a number of weeks. I have the week of death (continuous), whether each animal was alive (TRUE/FALSE), whether it was exposed, and categorical variables for diet (high/low) and sex (male/female). I have run a coxph model:

library(survival)
# Note: the second argument of Surv() is the event indicator (TRUE = died)
surv1 <- coxph(Surv(week_died, alive) ~ exposed + diet + sex,
               data = surv)

I would like to plot survival curves with individual lines for exposed males on high and low diets, and the same for exposed females (4 survival curves on the same plot). If I use the following, I only get a single curve:

plot(survfit(surv1), ylim=c(), xlab="Weeks") 

I have also tried to use the ggsurv function created by Edwin Thoen (http://www.r-statistics.com/2013/07/creating-good-looking-survival-curves-the-ggsurv-function/) but keep getting an error for "invalid line type". I have tried to work out what could be causing this and think it would be this last ifelse statement - but I am not sure.

pl <- if(strata == 1) {ggsurv.s(s, CI , plot.cens, surv.col ,
                              cens.col, lty.est, lty.ci,
                              cens.shape, back.white, xlab,
                              ylab, main)
} else {ggsurv.m(s, CI, plot.cens, surv.col ,
               cens.col, lty.est, lty.ci,
               cens.shape, back.white, xlab,
               ylab, main)}

Does anyone have any idea what is causing this error and how to fix it, or whether I am going about plotting these curves in completely the wrong way?

Many thanks!


There is 1 answer

Answer by Brian Diggs (best answer)

Calling survfit on a coxph model without any other arguments gives the survival curve for a single hypothetical subject whose covariate values are the means of the data the model was fitted to. From the help for survfit.coxph:

Serious thought has been given to removing the default value for newdata, which is to use a single "psuedo" subject with covariate values equal to the means of the data set, since the resulting curve(s) almost never make sense. ... Two particularly egregious examples are factor variables and interactions. Suppose one were studying interspecies transmission of a virus, and the data set has a factor variable with levels ("pig", "chicken") and about equal numbers of observations for each. The “mean” covariate level will be 1/2 – is this a flying pig? ... Users are strongly advised to use the newdata argument.
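To make the difference concrete, here is a minimal sketch on simulated data. The data frame surv_demo, its columns, and all values are made up for illustration and are not the asker's data; the exposed covariate is left out to keep the example small.

```r
library(survival)

# Simulated stand-in data (hypothetical; not the asker's data set)
set.seed(1)
n <- 200
surv_demo <- data.frame(
  week_died = sample(1:12, n, replace = TRUE),
  died      = rbinom(n, 1, 0.7) == 1,   # TRUE = death observed
  diet      = factor(sample(c("high", "low"), n, replace = TRUE)),
  sex       = factor(sample(c("female", "male"), n, replace = TRUE))
)

fit <- coxph(Surv(week_died, died) ~ diet + sex, data = surv_demo)

# Default: a single curve for a pseudo-subject at the covariate means
sf_default <- survfit(fit)

# With newdata: one predicted curve per row of the data frame
nd <- expand.grid(diet = levels(surv_demo$diet),
                  sex  = levels(surv_demo$sex))
sf_groups <- survfit(fit, newdata = nd)

is.null(dim(sf_default$surv))  # a single curve is stored as a plain vector
ncol(sf_groups$surv)           # one column per row of nd
```

Each column of sf_groups$surv corresponds to the row of nd in the same position, which is what lets the curves be labelled afterwards.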

So after you have computed surv1,

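# newdata must supply every covariate in the model, including exposed;
# this gives one curve per exposed/diet/sex combination
sf <- survfit(surv1,
              newdata = expand.grid(exposed = unique(surv$exposed),
                                    diet = unique(surv$diet),
                                    sex = unique(surv$sex)))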

plot(sf)
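A bare plot(sf) draws every curve in the same style, so it may help to label them. The sketch below assumes simulated data (surv_demo and its columns are hypothetical) and assumes exposed is stored as a logical; it fixes exposed at TRUE in newdata so that only the four exposed sex-by-diet curves are drawn.

```r
library(survival)

# Simulated stand-in data (hypothetical; not the asker's data set)
set.seed(42)
n <- 300
surv_demo <- data.frame(
  week_died = sample(1:12, n, replace = TRUE),
  died      = rbinom(n, 1, 0.7) == 1,   # TRUE = death observed
  exposed   = sample(c(TRUE, FALSE), n, replace = TRUE),
  diet      = factor(sample(c("high", "low"), n, replace = TRUE)),
  sex       = factor(sample(c("female", "male"), n, replace = TRUE))
)

fit <- coxph(Surv(week_died, died) ~ exposed + diet + sex, data = surv_demo)

# One row per curve: exposed animals only, every diet/sex combination
nd <- expand.grid(exposed = TRUE,
                  diet    = levels(surv_demo$diet),
                  sex     = levels(surv_demo$sex))
sf <- survfit(fit, newdata = nd)

# Curves are drawn in the row order of nd, so labels can be built from it
curve_labels <- paste(nd$sex, nd$diet, sep = ", ")
styles <- seq_len(nrow(nd))
plot(sf, col = styles, lty = styles,
     xlab = "Weeks", ylab = "Predicted survival")
legend("bottomleft", legend = curve_labels,
       col = styles, lty = styles, bty = "n")
```

Because this is a proportional hazards model without interactions, the four curves are shifted versions of one baseline; a stratified Kaplan-Meier fit would be needed for curves that can cross.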

sf should also work as an argument to ggsurv, though I've not tested it.