I'm trying to build a sklearn.pipeline.Pipeline for survival analysis with two stages:
- Handling class imbalance using imblearn classes.
- Running the survival analysis using scikit-survival classes.
The problem I'm having is an incompatibility of the target vectors between these two libraries: for imblearn the target is binary, and for scikit-survival it is continuous. Since the pipeline object only takes a single target vector, I'm unable to combine these two steps. Does anyone know a workaround for building a pipeline that uses different target vectors for different steps? Thank you in advance.
Example:
from sklearn.pipeline import make_pipeline
from sksurv.linear_model import CoxPHSurvivalAnalysis, CoxnetSurvivalAnalysis
from imblearn.under_sampling import RandomUnderSampler
# Load data
X_train = data[feats]
y_train = data[target]
# Construct pipe
steps = [RandomUnderSampler(), CoxPHSurvivalAnalysis()]
cph = make_pipeline(*steps)
cph.fit(X_train, y_train)
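
One possible direction, sketched here under assumptions rather than offered as a confirmed solution, is to wrap both steps in a single estimator that derives the binary target for the sampler from the survival target. The class name ResampledSurvivalModel is hypothetical. The sketch assumes that y is a NumPy structured array whose first field is the event indicator (the layout scikit-survival estimators expect), and that the sampler only selects or repeats existing rows, as RandomUnderSampler does. Samplers that synthesize new rows would not work with this index trick.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone


class ResampledSurvivalModel(BaseEstimator):
    """Hypothetical wrapper: resample on the event flag, fit on the survival target."""

    def __init__(self, sampler, estimator):
        self.sampler = sampler      # any object exposing fit_resample(X, y)
        self.estimator = estimator  # any estimator exposing fit(X, y) and predict(X)

    def fit(self, X, y):
        X = np.asarray(X)
        # Assumption: the first field of the structured array is the event indicator.
        event = y[y.dtype.names[0]].astype(int)
        # Resample row indices instead of X, so the same rows can be picked
        # from both X and the structured survival target.
        idx = np.arange(len(y)).reshape(-1, 1)
        idx_res, _ = clone(self.sampler).fit_resample(idx, event)
        idx_res = np.asarray(idx_res).ravel()
        self.estimator_ = clone(self.estimator).fit(X[idx_res], y[idx_res])
        return self

    def predict(self, X):
        # No resampling at prediction time; delegate to the fitted model.
        return self.estimator_.predict(np.asarray(X))
```

With the libraries from the example, this might be used as ResampledSurvivalModel(RandomUnderSampler(), CoxPHSurvivalAnalysis()).fit(X_train, y_train), where y_train is a structured array such as one built with sksurv.util.Surv.from_arrays. The wrapper can then be the final step of an ordinary pipeline, since the resampling happens inside its fit.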