Feature (Covariates) selection in CoxPHFitter, Lifelines Survival Analysis


I am using this model, as implemented in Python, for survival analysis:

from lifelines import CoxPHFitter

Unfortunately, I am not able to (I do not know how to) loop over all covariates (features), run the regression on each one individually for feature selection, and save the results. I am trying the script below:

`def fit_and_score_features2(X):
    y=X[["Status","duration_yrs"]]
    X.drop(["duration_yrs", "Status"], axis=1, inplace=True)
    n_features = X.shape[1]
    scores = np.empty(n_features)
    m = CoxPHFitter()

    for j in range(n_features):
       Xj = X.values[:, j:j+1]
       Xj=pd.merge(X, y,  how='right', left_index=True, right_index=True)
       m.fit(Xj, duration_col="duration_yrs", event_col="Status", show_progress=True)
       scores[j] = m._score_
    return scores`

Unfortunately, it returns this error:

ValueError                                Traceback (most recent call last)
in ()
      1 #Trying the function above
----> 2 scores = fit_and_score_features2(sample)
      3 pd.Series(scores, index=features.columns).sort_values(ascending=False)

in fit_and_score_features2(X)
     15 Xj=pd.merge(X, y, how='right', left_index=True, right_index=True)
     16 m.fit(Xj, duration_col="duration_yrs", event_col="Status", show_progress=True)
---> 17 scores[j] = m.score
     18 return scores

ValueError: setting an array element with a sequence.

Thank you in advance.

2

There are 2 answers

0
Antonio Dichev On

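I think I was able to debug it with your help (@Cam.Davidson.Pilon). Thanks a lot. Compared with the script in the question, the column is now selected with X.iloc so that it stays a DataFrame, only that single column Xj (not the whole X) is merged with y, and the score is read from m.score_. In my opinion, this is the correct script: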

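`import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def fit_and_score_features2(X):
   # Keep the event and duration columns aside; every single-covariate fit needs them.
   y = X[["Status", "duration_yrs"]]
   # Note: inplace=True also removes these two columns from the caller's DataFrame.
   X.drop(["duration_yrs", "Status"], axis=1, inplace=True)
   n_features = X.shape[1]
   scores = np.empty(n_features)
   m = CoxPHFitter()

   for j in range(n_features):
       # Slicing with iloc keeps column j as a one-column DataFrame (with its name).
       Xj = X.iloc[:, j:j+1]
       # Join that single covariate back to the event and duration columns.
       Xj = pd.merge(Xj, y, how='right', left_index=True, right_index=True)
       m.fit(Xj, duration_col="duration_yrs", event_col="Status", show_progress=True)
       # Store the score of the model fitted on covariate j alone.
       scores[j] = m.score_
   return scores`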
1
Gisel Hernandez Chavez On

For lifelines version 0.27.0, replace m.score_ with m.score(Xj) if you want the log-likelihood (the default scoring method, which returns the average partial log-likelihood), or with m.score(Xj, scoring_method='concordance_index') if you want the concordance index.
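
To make concrete what the concordance index measures when it is used to rank single-covariate models, here is a minimal pure-Python sketch. It is not the lifelines implementation: the function name `toy_concordance_index` is hypothetical, it takes risk scores where a higher value means an earlier expected event, and it simply skips pairs with equal durations, which is a simplifying assumption.

```python
from itertools import combinations


def toy_concordance_index(durations, risk_scores, events):
    """Share of comparable pairs in which the higher-risk subject fails first.

    Hypothetical helper for illustration only (not part of lifelines).
    A pair is comparable when the subject with the shorter observed time
    actually had the event (events[...] is truthy), i.e. was not censored.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(durations)), 2):
        # Order the pair so that `a` is the subject with the shorter time.
        a, b = (i, j) if durations[i] <= durations[j] else (j, i)
        # Simplification: skip tied durations and pairs whose earlier
        # subject was censored, since their ordering is unknown.
        if durations[a] == durations[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier failure had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: a perfectly ordered risk score and a constant one.
durations = [1, 2, 3, 4]
events = [1, 1, 1, 1]
perfect = toy_concordance_index(durations, [4, 3, 2, 1], events)
uninformative = toy_concordance_index(durations, [1, 1, 1, 1], events)
```

A value near 1 means the covariate orders subjects well, 0.5 is what an uninformative score gives, and values below 0.5 indicate the ordering is reversed. This is why the concordance index is a convenient single number for comparing covariates one at a time.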