I've built several GBM models to tune the parameters (number of trees, shrinkage, and depth) to my data, and the model performs well on the out-of-time sample. The data is credit card transactions (hundreds of millions of records), so I sampled 1% of the goods (non-events) and 100% of the bads.
However, when I increased the sample to 3% of the goods, there was a noticeable improvement in performance. My question is: how do I decide on the optimal sampling rate without running several iterations and picking whichever one fits best? Is there any theory around this?
The 1% sample has about 3 million transactions in total, of which ~380k are bads, and I have ~250 variables.
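For concreteness, the sampling setup described above looks roughly like this. This is only a minimal sketch, not my actual pipeline: the column names (`is_bad`) and the use of scikit-learn's `GradientBoostingClassifier` are placeholders for whatever GBM implementation is used, and the parameter values are illustrative.

```python
# Sketch of the undersampling scheme: keep 100% of bads, a fraction of goods,
# fit a GBM on the sample, and score an out-of-time holdout.
# "is_bad" is a placeholder label column; dev_df / oot_df are assumed to exist.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

def build_training_sample(df, good_rate=0.01, label_col="is_bad", seed=42):
    """Undersample the goods (non-events) at good_rate, keep all bads."""
    bads = df[df[label_col] == 1]
    goods = df[df[label_col] == 0].sample(frac=good_rate, random_state=seed)
    # Concatenate and shuffle so the classes are interleaved
    return pd.concat([bads, goods]).sample(frac=1, random_state=seed)

def fit_and_score(train_df, oot_df, label_col="is_bad"):
    """Fit a GBM on the sampled training data and report out-of-time AUC."""
    features = [c for c in train_df.columns if c != label_col]
    model = GradientBoostingClassifier(
        n_estimators=500,    # trees
        learning_rate=0.05,  # shrinkage
        max_depth=4,         # depth
    )
    model.fit(train_df[features], train_df[label_col])
    oot_scores = model.predict_proba(oot_df[features])[:, 1]
    return roc_auc_score(oot_df[label_col], oot_scores)

# Example usage (with hypothetical development and out-of-time frames):
# auc_1pct = fit_and_score(build_training_sample(dev_df, good_rate=0.01), oot_df)
# auc_3pct = fit_and_score(build_training_sample(dev_df, good_rate=0.03), oot_df)
```

Repeating this loop over a grid of `good_rate` values is exactly the brute-force search I'd like to avoid, which is why I'm asking whether there is a principled way to choose the rate up front.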