To visualize how survival probability varies across a continuous predictor in my sample, I want to transform the K-M plot as described in the title.

The idea is that this makes the full range of the continuous values easy to see. I sacrifice the view of how survival is distributed over time, but by keeping P(Survival) on the y-axis and using colors, I could easily plot the 1-year survival rate, the 5-year survival rate, and so on.

Note that I am not looking for a Cox visualization. I am interested in the data from a K-M function, i.e.:

km <- survfit(Surv(time, censor) ~ continuous.predictor, data = df)
ggsurvplot(km)

but with the x-axis as the continuous predictor and with lines colored by chosen points in time. Theoretically, this should just be a matter of making factored time the strata and continuous.predictor the new x-value.

However, I am not sure how to do this in R. I have used survfit() from the survival package and ggsurvplot() from survminer, but it is unclear whether either supports such a transformation.
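
For reference, the "time as strata, predictor as x-value" idea can be approximated with K-M alone if the predictor is binned first, since survfit() otherwise treats every distinct value as its own stratum. The following is only a minimal sketch under assumptions that are not part of the question: it uses the lung data from the survival package, age as the predictor, quartile bins, and follow-up times of 182, 365 and 730 days. The names lung_km, km_df and p_km are purely illustrative.

```r
library(survival)
library(ggplot2)

# Assumption: quartile bins of age; the number of bins is an arbitrary choice
lung_km <- lung
breaks <- quantile(lung_km$age, probs = seq(0, 1, by = 0.25))
lung_km$age_grp <- cut(lung_km$age, breaks = breaks, include.lowest = TRUE)

# One K-M curve per age bin
km <- survfit(Surv(time, status) ~ age_grp, data = lung_km)

# Read each curve at the chosen follow-up times (in days);
# extend = TRUE carries the last estimate forward if a bin's follow-up ends early
times <- c(182, 365, 730)
s <- summary(km, times = times, extend = TRUE)

km_df <- data.frame(
  age_grp = levels(lung_km$age_grp)[as.integer(s$strata)],
  time    = s$time,
  surv    = s$surv,
  lower   = s$lower,
  upper   = s$upper
)

# Place each bin at its median age on the x-axis
bin_median <- tapply(lung_km$age, lung_km$age_grp, median)
km_df$age <- as.numeric(bin_median[km_df$age_grp])

# Predictor on the x-axis, one colored line per time point
p_km <- ggplot(km_df, aes(age, surv, color = factor(time))) +
  geom_line() +
  geom_pointrange(aes(ymin = lower, ymax = upper)) +
  labs(x = "Age (bin median)", y = "Survival probability",
       color = "Time (days)")
p_km
```

The trade-off is that the result depends on where the bin edges fall, and narrow bins give noisy estimates with wide intervals.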

There are 2 answers

Allan Cameron On

You could do this by fitting a Cox model and calling the predict function on it, supplying the values of the continuous variable and the times at which you wish to measure survival probability.

Let's use the lung example from the survival package, with age as the continuous variable of interest:

library(survival)

model <- coxph(Surv(time, status) ~ age, data = lung)

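Now we create a data frame of all ages from 30 to 80 at follow-up times of 6 months, 1 year and 5 years. Note that follow-up in lung ends at 1022 days, so the 5-year estimate lies beyond the observed data and should be read with caution: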

newdata <- expand.grid(age = 30:80, time = c(182, 365, 5*365), status = 1)

We can feed this to predict and get the survival probabilities, with 95% confidence intervals:

preds <- predict(model, newdata = newdata, type = 'expected', se.fit = TRUE)

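# type = 'expected' returns the expected number of events (cumulative hazard),
# so survival is exp(-fit)
newdata$pred <- exp(-preds$fit)
# A larger expected event count means lower survival, so the bounds flip
newdata$lower <- exp(-(preds$fit + 1.96 * preds$se.fit))
newdata$upper <- exp(-(preds$fit - 1.96 * preds$se.fit))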

Now we can plot with vanilla ggplot:

library(ggplot2)

ggplot(newdata, aes(age, pred, color = factor(time))) +
  geom_ribbon(aes(ymax = upper, ymin = lower, fill = factor(time)),
              alpha = 0.2, color = NA) +
  geom_line() +
  scale_fill_discrete('Time', labels = c('6 months', '1 year', '5 years')) +
  scale_color_discrete('Time', labels = c('6 months', '1 year', '5 years')) +
  scale_y_continuous('Survival probability', labels = scales::percent) +
  theme_minimal() +
  ggtitle(paste('Survival Probability according to age',
                'at 6 months, 1 year, 5 years'))


Denzo On

I have recently published an article on this topic called "Visualizing the (causal) effect of a continuous variable on a time-to-event outcome" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10392888/). It describes several ways to visualize the effect of a continuous variable on the survival probability. One example is the survival area plot.


You can create this and similar graphs using the contsurvplot R package, available on CRAN: https://robindenz1.github.io/contsurvplot/

If you want the continuous variable on the x-axis, the survival probability on the y-axis, and colored lines for some specific points in time, you can use the plot_surv_at_t function. In this case, the only difference from the answer given by Allan Cameron is that it also allows you to include adjustment variables in the Cox model.
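
A call might look like the sketch below. It is not taken from the answer: it assumes the lung data, age as the variable of interest, sex as an adjustment variable, and the argument names (time, status, variable, data, model, t) as I understand them from the package documentation, so please check ?plot_surv_at_t before relying on it. The package also needs riskRegression installed, and the Cox model is fitted with x = TRUE so that predictions can be made from it. The names lung_cs, cs_model and p_t are illustrative.

```r
library(survival)
library(contsurvplot)

# Keep only the columns used and recode status to 0/1 (assumption: 2 = death)
lung_cs <- lung[, c("time", "status", "age", "sex")]
lung_cs$status <- as.integer(lung_cs$status == 2)

# Cox model with an adjustment variable; x = TRUE stores the design matrix
cs_model <- coxph(Surv(time, status) ~ age + sex, data = lung_cs, x = TRUE)

# Survival probability against age at 6 months, 1 year and 2 years
p_t <- plot_surv_at_t(
  time     = "time",
  status   = "status",
  variable = "age",
  data     = lung_cs,
  model    = cs_model,
  t        = c(182, 365, 730)
)
p_t
```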