I'm trying to find the best variable transformation that yields linearity in the log hazard or log cumulative hazard of a Cox proportional hazards model. As these are research data, I cannot publish them here. To do this, I'm trying to plot the variable against the martingale residuals using this code:
cox_mod_spline = coxph(Surv(timespan_censored,status)~ risk_factor, data = df)
res = residuals(cox_mod_spline, type = "martingale")
df$risk_factor
plot(na.omit(df$risk_factor), res)
However, I get this error message: Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ
Indeed when I enter this code:
length(df$risk_factor)
length(res)
I get
[1] 587
[1] 577
respectively
I also checked that there are no NA values in df$risk_factor.
Why do the residuals and the variable differ in length, given that the residuals are computed from a model fitted on that very variable?
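A possible way to narrow this down, sketched on made-up data since the real data cannot be shared. It assumes the mismatch comes from rows with missing values in timespan_censored or status (not in risk_factor), which coxph drops under the default na.action = na.omit; that is an assumption, not something confirmed for the real data. The data frame toy and the positions of the missing values are hypothetical.

```r
library(survival)

# Hypothetical toy data standing in for the real df (which cannot be shared)
set.seed(1)
n <- 50
toy <- data.frame(
  timespan_censored = rexp(n),
  status            = rbinom(n, 1, 0.7),
  risk_factor       = rnorm(n)
)
# Assumption: the outcome, not the covariate, has missing values
toy$timespan_censored[c(3, 17)] <- NA

# Default na.action (na.omit) silently drops incomplete rows
fit <- coxph(Surv(timespan_censored, status) ~ risk_factor, data = toy)
res <- residuals(fit, type = "martingale")

length(toy$risk_factor)  # all rows
length(res)              # only the rows actually used in the fit
fit$na.action            # indices of the rows that were dropped

# Option 1: plot only the rows that entered the model
used <- setdiff(seq_len(nrow(toy)), fit$na.action)
plot(toy$risk_factor[used], res)

# Option 2: refit with na.exclude so residuals are padded with NA
# and line up with the original rows
fit_padded <- coxph(Surv(timespan_censored, status) ~ risk_factor,
                    data = toy, na.action = na.exclude)
res_padded <- residuals(fit_padded, type = "martingale")
plot(toy$risk_factor, res_padded)
```

On the real data, colSums(is.na(df[, c("timespan_censored", "status", "risk_factor")])) would show which of the model variables, if any, contain the missing values.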