I have two questions. First, I want to plot the predicted survival functions. My code is as follows:
from sksurv.preprocessing import OneHotEncoder
from sksurv.datasets import load_veterans_lung_cancer
from sksurv.linear_model import CoxPHSurvivalAnalysis
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data_x, data_y = load_veterans_lung_cancer()
data_y
data_x_numeric = OneHotEncoder().fit_transform(data_x)
estimator = CoxPHSurvivalAnalysis()
estimator.fit(data_x_numeric, data_y)
x_new = pd.DataFrame.from_dict({
    1: [65, 0, 0, 1, 60, 1, 0, 1],
    2: [65, 0, 0, 1, 60, 1, 0, 0],
    3: [65, 0, 1, 0, 60, 1, 0, 0],
    4: [65, 0, 1, 0, 60, 1, 0, 1]},
    columns=data_x_numeric.columns, orient='index')
pred_surv = estimator.predict_survival_function(x_new)
When I try to plot the result:
time_points = np.arange(1, 1000)
for i, surv_func in enumerate(pred_surv):
    plt.step(time_points, surv_func(time_points), where="post",
             label="Sample %d" % (i + 1))
plt.ylabel(r"est. probability of survival $\hat{S}(t)$")
plt.xlabel("time $t$")
plt.legend(loc="best")
I get the following error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
How can I solve this problem?
My second question: how can the results in the pred_surv object be converted to a pandas DataFrame?
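
To make the second question concrete, here is a minimal sketch of the kind of conversion being asked about. It assumes that each element of pred_surv exposes its time points as `.x` and its survival probabilities as `.y` (as the StepFunction objects in sksurv.functions do), and that all functions share the same time grid. The `StepFn` stand-in, the `step_functions_to_frame` helper and the toy numbers are hypothetical, used only so the sketch runs without fitting a model.

```python
from collections import namedtuple

import numpy as np
import pandas as pd

# Hypothetical stand-in for the objects in pred_surv: anything exposing
# `.x` (time points) and `.y` (survival probabilities) arrays works here.
StepFn = namedtuple("StepFn", ["x", "y"])


def step_functions_to_frame(funcs, index=None):
    """Stack step functions into a DataFrame: one row per sample, one column per time.

    Assumption: every function is defined on the same time grid.
    """
    times = np.asarray(funcs[0].x)
    if not all(np.array_equal(np.asarray(fn.x), times) for fn in funcs):
        raise ValueError("all step functions must share the same time points")
    values = np.vstack([np.asarray(fn.y) for fn in funcs])
    return pd.DataFrame(values, index=index, columns=pd.Index(times, name="time"))


# Toy data in place of a fitted model's output (made-up values).
toy_pred_surv = [
    StepFn(x=np.array([1.0, 5.0, 10.0]), y=np.array([0.9, 0.6, 0.3])),
    StepFn(x=np.array([1.0, 5.0, 10.0]), y=np.array([0.8, 0.5, 0.2])),
]
surv_df = step_functions_to_frame(toy_pred_surv, index=[1, 2])
print(surv_df)
```

With the objects from the code above, the call would presumably be `step_functions_to_frame(pred_surv, index=x_new.index)`, giving one row per new sample. Whether `.x` and `.y` are available depends on the installed scikit-survival version, so this is a sketch rather than a confirmed solution.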