XGBoost - python - fitting a regressor


I'm trying to fit an XGBoost regressor on a really large dataset. I want to stop early if no improvement is made for 50 trees, and to print the evaluation metric every 10 trees (I'm using RMSE as my main metric).

My current code is the following:

# Building a training DMatrix from my training dataset
xgb_tr=xgb.DMatrix(data=x_train[predictors],label=x_train['target'].values,feature_names=predictors)

# Building a testing DMatrix from my testing dataset
xgb_te=xgb.DMatrix(data=x_test[predictors],label=x_test['target'].values,feature_names=predictors)

params_xgb={
                'objective':'reg:linear',
                'eval_metric':'rmse'
            }

best_xgb=xgb.train(params_xgb,
                     xgb_tr,
                     evals=[(xgb_tr,'training'), (xgb_te,'test')],
                     num_boost_round=3000,
                     early_stopping_rounds=50,
                     verbose_eval=10)

What I was expecting was something like this (this is the output from an LGBM model):

Training until validation scores don't improve for 50 rounds
[10]    train's rmse: 1.18004   valid's rmse: 1.10737
[20]    train's rmse: 1.16906   valid's rmse: 1.09693
[30]    train's rmse: 1.15957   valid's rmse: 1.08851
[40]    train's rmse: 1.14905   valid's rmse: 1.07874
[50]    train's rmse: 1.14026   valid's rmse: 1.07104
[60]    train's rmse: 1.13104   valid's rmse: 1.06248
[70]    train's rmse: 1.12265   valid's rmse: 1.05476
[80]    train's rmse: 1.114 valid's rmse: 1.04638
[90]    train's rmse: 1.10739   valid's rmse: 1.04018
[100]   train's rmse: 1.10001   valid's rmse: 1.03354

But instead I got a puzzling error message:

---------------------------------------------------------------------------
XGBoostError                              Traceback (most recent call last)
<ipython-input-26-827da738fc42> in <module>
      1 evals_results = {}
----> 2 best_xgb=xgb.train(params_xgb,
      3                      xgb_tr,
      4                      evals=[(xgb_tr,'training'), (xgb_te,'test')],
      5                      num_boost_round=3000,

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/xgboost/training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks, learning_rates)
    210         callbacks.append(callback.reset_learning_rate(learning_rates))
    211 
--> 212     return _train_internal(params, dtrain,
    213                            num_boost_round=num_boost_round,
    214                            evals=evals,

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/xgboost/training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
     72         # Skip the first update if it is a recovery step.
     73         if version % 2 == 0:
---> 74             bst.update(dtrain, i, obj)
     75             bst.save_rabit_checkpoint()
     76             version += 1

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/xgboost/core.py in update(self, dtrain, iteration, fobj)
   1106 
   1107         if fobj is None:
-> 1108             _check_call(_LIB.XGBoosterUpdateOneIter(self.handle, ctypes.c_int(iteration),
   1109                                                     dtrain.handle))
   1110         else:

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/xgboost/core.py in _check_call(ret)
    174     """
    175     if ret != 0:
--> 176         raise XGBoostError(py_str(_LIB.XGBGetLastError()))
    177 
    178 

XGBoostError: [17:24:56] src/tree/updater_histmaker.cc:311: fv=inf, hist.last=inf
Stack trace:
  [bt] (0) 1   libxgboost.dylib                    0x0000000116ac6319 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  [bt] (1) 2   libxgboost.dylib                    0x0000000116b8bef4 xgboost::tree::CQHistMaker::HistEntry::Add(float, xgboost::detail::GradientPairInternal<float>) + 772
  [bt] (2) 3   libxgboost.dylib                    0x0000000116b8b6b3 xgboost::tree::CQHistMaker::UpdateHistCol(std::__1::vector<xgboost::detail::GradientPairInternal<float>, std::__1::allocator<xgboost::detail::GradientPairInternal<float> > > const&, xgboost::common::Span<xgboost::Entry const, -1ll> const&, xgboost::MetaInfo const&, xgboost::RegTree const&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, unsigned int, std::__1::vector<xgboost::tree::CQHistMaker::HistEntry, std::__1::allocator<xgboost::tree::CQHistMaker::HistEntry> >*) + 643
  [bt] (3) 4   libxgboost.dylib                    0x0000000116b8d639 xgboost::tree::GlobalProposalHistMaker::CreateHist(std::__1::vector<xgboost::detail::GradientPairInternal<float>, std::__1::allocator<xgboost::detail::GradientPairInternal<float> > > const&, xgboost::DMatrix*, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, xgboost::RegTree const&) + 1433
  [bt] (4) 5   libxgboost.dylib                    0x0000000116b834c4 xgboost::tree::HistMaker::Update(std::__1::vector<xgboost::detail::GradientPairInternal<float>, std::__1::allocator<xgboost::detail::GradientPairInternal<float> > > const&, xgboost::DMatrix*, xgboost::RegTree*) + 388
  [bt] (5) 6   libxgboost.dylib                    0x0000000116b82df0 xgboost::tree::HistMaker::Update(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, std::__1::vector<xgboost::RegTree*, std::__1::allocator<xgboost::RegTree*> > const&) + 144
  [bt] (6) 7   libxgboost.dylib                    0x0000000116b26296 xgboost::gbm::GBTree::BoostNewTrees(xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::DMatrix*, int, std::__1::vector<std::__1::unique_ptr<xgboost::RegTree, std::__1::default_delete<xgboost::RegTree> >, std::__1::allocator<std::__1::unique_ptr<xgboost::RegTree, std::__1::default_delete<xgboost::RegTree> > > >*) + 1766
  [bt] (7) 8   libxgboost.dylib                    0x0000000116b22566 xgboost::gbm::GBTree::DoBoost(xgboost::DMatrix*, xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*, xgboost::ObjFunction*) + 310
  [bt] (8) 9   libxgboost.dylib                    0x0000000116ac27cc xgboost::LearnerImpl::UpdateOneIter(int, xgboost::DMatrix*) + 1532

Has anyone come across this error? If not, is there a better way to implement the XGBoost algorithm for a regression with these callbacks?
