Cox PH Hazard Function in Python

I have the following data:

gasdfhourly[['Unit','Hourcount','Target']].head()
Out[377]: 
                       Unit  Hourcount  Target
Date       hour                               
2014-01-01 0     748.816493          1     0.0
           1     759.759946          2     0.0
           2     756.737007          3     0.0
           3     761.075262          4     0.0
           4     765.142517          5     0.0

I am trying to fit a Cox PH model to it:

from lifelines import CoxPHFitter
cph = CoxPHFitter()
cph.fit(gasdfhourly, duration_col='Hourcount', event_col='Target', show_progress=False)
cph.print_summary() 
X=gasdfhourly['Unit']

However, when I try to derive the survival function:

cph.predict_survival_function(X)

I get the following error:

Traceback (most recent call last):

  File "<ipython-input-378-6bde79cbbb89>", line 1, in <module>
    cph.predict_survival_function(X)

  File "C:\ProgramData\Anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py", line 514, in predict_survival_function
    return exp(-self.predict_cumulative_hazard(X, times=times))

  File "C:\ProgramData\Anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py", line 495, in predict_cumulative_hazard
    v = self.predict_partial_hazard(X)

  File "C:\ProgramData\Anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py", line 437, in predict_partial_hazard
    return exp(self.predict_log_partial_hazard(X))

  File "C:\ProgramData\Anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py", line 459, in predict_log_partial_hazard
    return pd.DataFrame(np.dot(X, self.hazards_.T), index=index)

ValueError: shapes (215,) and (1,1) not aligned: 215 (dim 0) != 1 (dim 0)

Can somebody please point out the error in my code?


There is 1 answer

Cam.Davidson.Pilon

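The problem is X=gasdfhourly['Unit']: single brackets return a Series, not a DataFrame, and the predict functions expect a DataFrame. That is why np.dot in the traceback sees shape (215,) instead of (215, 1) and cannot align it with the (1, 1) coefficient matrix. Instead, use X=gasdfhourly[['Unit']] (note the double brackets).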
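
To see the difference without fitting a model, here is a minimal sketch that reproduces the shape mismatch with pandas and NumPy alone. The DataFrame `df` and the coefficient matrix `coefs` are made-up stand-ins (assumptions, not the question's data or lifelines internals); only the shapes mirror the traceback.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: 215 rows, one covariate.
df = pd.DataFrame({"Unit": np.linspace(748.0, 765.0, 215)})

# Hypothetical coefficient matrix with shape (1, 1), as in the traceback.
coefs = np.array([[0.01]])

as_series = df["Unit"]    # single brackets -> Series, shape (215,)
as_frame = df[["Unit"]]   # double brackets -> DataFrame, shape (215, 1)

# A 1-D input of length 215 cannot be aligned with a (1, 1) matrix.
try:
    np.dot(as_series, coefs.T)
    series_error = None
except ValueError as exc:
    series_error = str(exc)

# A 2-D input of shape (215, 1) aligns and yields one value per row.
log_partial_hazard = np.dot(as_frame, coefs.T)

print(as_series.shape, as_frame.shape)
print(series_error)
```

With the fitted model from the question, the same idea applies: pass `gasdfhourly[['Unit']]` to `cph.predict_survival_function` so the input stays two-dimensional.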